Tuesday, December 2, 2014

Week 12 - Halting Problems and Computability

This is hands down the most challenging concept of the course. The level of abstraction in this topic is way beyond anything else we have encountered so far. I have posted A LOT of questions on the forum about the halting function and reduction. (Side note: They should really consider substituting some of the marks for the slog entries with participation on the forum. I think I have posted enough content on Piazza to fill over 10 slogs!  Something to think about.)

I feel like we covered this topic too quickly; moreover, we jumped into overly complex examples of halt without first understanding the simpler ones. The examples in the course notes show functions with one or more parameters, when we really should have started with functions that don't take any inputs at all. Here is a snippet of code I posted on Piazza that I believe should have been the starting proof for the non-existence of halt.

def halt(func):
    """Return True iff func() eventually halts.

    Assume, for the sake of contradiction, that this function exists.
    """


def unhaltable():
    """Loop forever if halt(unhaltable) returns True, i.e. if halt
    predicts that unhaltable() halts; return False otherwise."""
    if halt(unhaltable):
        # halt claims we stop, so loop forever instead.
        while True:
            pass
    else:
        # halt claims we loop forever, so halt immediately.
        return False

The above code provides a much simpler proof than the confused() proof from the course notes: whatever halt(unhaltable) returns, unhaltable() does the opposite, so no correct halt can exist. There are no extraneous parameters to complicate our understanding of the underlying concept. With this proof in hand we can then branch out to functions that actually do take arguments, and gradually develop more complex proofs from there.

These extra parameters, and functions that take two functions as arguments, caused me to focus too much on the Python syntax rather than on the concept of halting. I spent several days deciphering the code, programming functions that take two functions as inputs, and seeing what happens when a function calls itself with another function as its own input argument. I wanted to know whether the function passed in as an argument is ever reached, or whether we just create a never-ending recursive function. I even used a Python visualizer to try to see what was going on.
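Here is a minimal sketch of the kind of experiment I ran (the function names are mine, not from the course notes). The punch line is that passing a function as an argument does not call it, so no recursion happens until someone actually adds the parentheses:

def describe(func):
    """Report on a function object without ever calling it."""
    return "received " + func.__name__


def self_referencer():
    """Pass self_referencer itself, as a value, to another function.

    There is no infinite recursion here: the function object is merely
    handed over; describe never calls it.
    """
    return describe(self_referencer)


print(self_referencer())  # prints: received self_referencer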

Another area of confusion is that the conceptual halt function doesn't necessarily have to run the code of the function it receives. It could instead just scan the source and analyze it to decide whether it halts. Passing in a function along with another object as input distorted this concept, because I kept thinking that the function was just going to recurse forever. The challenges noted above are what prompted me to rewrite the confused() function from the course notes into something simpler, as shown above.
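To convince myself that "analyzing" really is different from "running", I wrote a toy version that only scans the source text. To be clear, this is not a real halting decider (the whole point of the proof above is that no correct one can exist); it is just a sketch, and the single pattern it looks for is my own choice:

import inspect


def naive_halt(func):
    """A toy stand-in for halt: flag func as non-halting only if its source
    text contains an obvious infinite loop. Trivially fooled by anything
    subtler, but it demonstrably never runs func."""
    source = inspect.getsource(func)
    return "while True" not in source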

To sum up the semester, I'm going to include a link to this student's Week 12 slog post. Truth be told, the comic he posted is probably what half the students in all of comp. sci. are thinking when they go through this course. I was sure to let him know in the comments section.
http://andrewgoupilcsc165.blogspot.ca/ 

Week 11 - Big-Theta and General Properties of Asymptotic Notation


So I heard from my CSC148 course that when computer scientists say a program runs in big-O of some function of n, they generally mean that it is also in big-Θ of that function. My question is: once we realize a program runs in, say, O(n^2), do we even bother trying to show that it is also bounded above by n^3 or bounded below by n, when we can make that assessment by inspection? In other words, in practice do we typically worry about the other functions not in the same class, or do we just prove that it is in big-O of something and leave it at that until we can improve upon it?

I assume this is where the general properties of asymptotic notation come into play, such as how adding to or squaring a given function affects its relationship to its bounds. Using these properties we can very quickly say that a function performs better or worse than some other class of function without having to delve too much into the gritty details. A couple of these properties, as I understand them, are sketched below.
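For instance (my own summary, assuming non-negative functions as the course does): if f(n) <= c*g(n) for all n >= B, then with the same breakpoint B,

    f(n) + g(n) <= (c + 1)*g(n)     so f + g is in O(g)
    f(n)^2 <= c^2 * g(n)^2          so f^2 is in O(g^2)

which is why adding a function from a smaller class, or squaring both functions, never changes which class wins the comparison.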

We had a similar question on Assignment 3 that resembles the concept of big-Θ. It asked us to show that there exist functions that do not bound each other above or below. Some people were suggesting that trigonometric functions such as sin(n) and cos(n) might work. After thinking about it, I came up with two flaws with using these functions. First, both sin(n) and cos(n) have ranges that include negative values, which violates the requirement that f(n) must map only onto zero or the positive reals. We could add a constant to raise the entire function above the x-axis, but then why not just use a horizontal line? Second, once you raise one trigonometric function entirely above the x-axis, it behaves like a horizontal line in that you can multiply it by some constant to bound the other trigonometric function above and below. I went with a piecewise function instead, but even then the proof had many cases to show. Hopefully I caught them all.
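For the record, here is a sketch of the kind of piecewise pair I have in mind (not necessarily the exact functions I submitted). Each function takes turns dwarfing the other, so no single constant c and breakpoint B can make either one an upper bound for the other:

def f(n):
    # grows without bound on even n, flat on odd n
    return n if n % 2 == 0 else 1


def g(n):
    # flat on even n, grows without bound on odd n
    return 1 if n % 2 == 0 else n


# Along even n the ratio f(n)/g(n) = n is unbounded; along odd n the
# ratio g(n)/f(n) = n is unbounded. So f is not in O(g) and g is not in O(f).
for n in [10, 11, 1000, 1001]:
    print(n, f(n), g(n))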

On a side note, I was roaming around some of the other slogs and stumbled upon this cool Python implementation of the folding paper problem. This student also implemented something in graphical form, which I thought was nice.
http://courseblog165.blogspot.ca/

Oh, by the way, you can also run the code for my penny piles problem in the interpreter. It's kind of fun to find combinations of inputs that have really long sequences. It's been a while, but I think (8, 99) and (16, 99) produce some neat outputs. If you are really brave you can try (32, 9999)!

Monday, December 1, 2014

Week 10 - Limits and Non-Polynomial Functions

I get the feeling that logical notation has a hidden contempt for calculus and limits. Prof. Heap is always saying that limits don't quite express the true nature of a function as some variable approaches infinity. He's of the opinion that numbers can never really approach infinity; instead they just get "very large" or "move farther away" from some value such as the origin. We also have our own expressions for limits that are always three or four times as long as the calculus version, and I find that this adds unnecessary complexity to a proof.
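For example, the one-line calculus claim "lim as n approaches infinity of f(n) = infinity" unpacks, in the course's notation (my paraphrase of the notes), into something like:

    ∀ c ∈ ℝ⁺, ∃ B ∈ ℕ, ∀ n ∈ ℕ, n ≥ B ⟹ f(n) > c

i.e. no matter how large a target c you pick, f(n) eventually stays above it.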

Take for example the limit as n approaches infinity of 2^n / n^2. By calculus we can easily show that this approaches infinity. Then by inspection we can quickly say that 2^n is guaranteed to surpass n^2 as n gets large, thus n^2 is in O(2^n) and 2^n is in Ω(n^2). The proof using calculus would probably be about two lines, but the proof using logical notation and the structures of this course is approximately five to six times as long, which really impedes our understanding of the underlying relationship between the two functions. The meanings of big-O and big-Ω are hidden behind this excessive notation.
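At least the two witnesses the logical version needs can be checked mechanically. Here is a quick numeric spot-check (not a proof; an induction from n = 4 would make it one) of the pair c = 1 and B = 4 for the claim that n^2 is in O(2^n):

# Spot-check the witnesses c = 1, B = 4 for n^2 in O(2^n):
# verify n^2 <= c * 2^n over a large range of n starting at B.
c, B = 1, 4
assert all(n ** 2 <= c * 2 ** n for n in range(B, 200))
print("n^2 <= 2^n holds for every n checked from", B, "to 199")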

I know limits themselves have a similarly precise definition, but they also have a higher-level notation that quickly expresses the long-term behavior of a function. Personally, I've had difficulty with limit proofs since the beginning of the course. To help, I've reviewed several of the limit chapters from my old calculus textbook, and I've also tried drawing diagrams showing the delta and epsilon boundaries. But when the function changes it can easily throw you off and put you right back at square one, where you need to reinterpret the strict definition of the limit. I plan on spending more time with limits of various types, such as ones that approach infinity, zero, and other values, and with various functions as well. Hopefully there are some good examples in the course notes that I can practice on.

Here is someone who shares the same confusion with limit proofs.
http://slogcsc165.blogspot.ca/2014/11/week-11-recap-from-previous-weeks.html?showComment=1417548152625#c1153904731624398479

Week 9 - Big-O and Big-Omega of Polynomials and Other Functions

Following up from my last post, I'm finally coming around to the idea that the terms and constants that are not of the highest power in a polynomial are irrelevant. In the long run, the highest power always wins, and thus we need not worry about a slightly higher breakpoint.
With so many different types of classifications, it definitely makes sense to use this simplified notation.

One thing I still don't quite see the use for is the lower bound big-Ω. I understand that we are trying to establish a lower bound for the worst case, but if we extend this to the expected runtime for a random set of inputs, shouldn't we also include the input cases where the runtime is better than the worst case, thus rendering big-Θ somewhat meaningless? Or is this another instance where we sweep things under the rug because they don't amount to anything significant, much like the coefficients, constants, and lower-powered terms I mentioned earlier?

It's also interesting to see that the same function can be used to bound another both above and below. I guess this is another result of having such broad definitions for big-O and big-Ω.

We covered similar topics in CSC148, where we very broadly classified various functions using the following asymptotic notations. I have ordered them approximately from smallest to largest:
O(1) = constant time; does not depend on the size of the input
O(k) = linear time based on some input value k
O(h) = linear time in the size or height of an internal data structure
O(log n) = logarithmic runtime, typically encountered when traversing heaps
O(n) = linear runtime, proportional to size n
O(nd) = runtime proportional to both the size of the input and the number of digits of the items in the input (radix sort)
O(n log n) = the runtime of mergesort, typically
O(mn) = runtime proportional to the sizes of two different inputs
O(n^2) = quadratic runtime, such as with insertion sort or selection sort
O(2^n) = the worst of all the ones we've touched on

Overall, I am feeling good about working with asymptotic notation. The concept seems intuitive, at least for polynomial functions. Generally, it helps to have a graphing calculator handy so you can see the odd behavior of each function near the origin and when n gets really large. I find that graphing the functions lets you develop a sixth sense for breakpoints and coefficients. So far the trickiest part is making sure you choose the right coefficient c and keep track of the minimum value for the breakpoint B. If there are constants involved for big-O, then B is usually some value >= 1. Also, c is usually >= 1 for big-O and < 1 for big-Ω. When a graphing calculator isn't handy, a few lines of Python can do the same sanity check, as sketched below.
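Here is the kind of throwaway helper I mean (my own code, and an empirical sanity check only, not a proof):

def find_breakpoint(f, g, c, limit=10 ** 5):
    """Return one past the last n <= limit where f(n) > c * g(n), i.e. an
    empirical candidate for the smallest breakpoint B."""
    last_violation = 0
    for n in range(1, limit + 1):
        if f(n) > c * g(n):
            last_violation = n
    return last_violation + 1


# For example, 3n^2 + 6n + 2 <= 4 * n^2 holds from n = 7 onward:
print(find_breakpoint(lambda n: 3 * n * n + 6 * n + 2, lambda n: n * n, 4))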

Looking at some of the other slogs, here is one that adds a nice touch using some of the lecture slides. I think this student also agrees that picking the right B and the right c for a proof is important. I let them know in the comments section of their blog.
http://ace165.blogspot.ca/2014/11/week-10-big-proof.html

Sunday, November 2, 2014

Week 8 - Assignment 2, Midterm and Algorithms


It's Sunday, and I've just submitted Assignment 2. But, of course, the fun never ends: I have a quiz tomorrow on a new topic, algorithm analysis, and a midterm on Wednesday on proofs.

Overall, I think I spent too much time on the second assignment checking and double-checking everything. There were a lot of inconsistencies between how the proofs were presented in the lecture and how they were shown in the tutorial, not to mention in the course notes package. I'm sure they are all correct, but some just make more sense to me than others.

There are some proof types that I will need to work on before the midterm, such as limit proofs, proofs about sequences, and non-boolean proofs. These are the most challenging, since it is not always obvious whether a claim is true or false, and a lot more time is spent pondering which is the correct way to go before you can put pencil to paper.

The thing with proofs, however, is that there is only one answer to each question, either true or false, despite the many ways of getting there. But with algorithm analysis we seem to be taking something that should have only one answer, such as the worst-case performance of an algorithm, and making it somewhat arbitrary by overestimating or underestimating whenever we choose. This raises the question: how much is too much? We can all grossly overestimate or underestimate our answer to simplify our lives, but where do we draw the line between something that meaningfully represents the performance of an algorithm and something that is not even in the same ballpark? I had this feeling at the end of lecture one day, so I delved a little deeper, only to find that things are not as random as they appear…

At the end of the lecture on WIS (worst-case insertion sort) being in O(n^2), we had a step-count function of 3n^2 + 6n + 2, and we wanted to simplify it to some upper-bound function of the form cn^2. The approach we took in class was just to add additional factors of n (this is the arbitrary part) to the remaining terms 6n and 2, so that we were left with cn^2 = 11n^2 and a breakpoint of n = 1.

So I was curious whether it was possible to do better, in a less random fashion, with relatively little work and with a minimal increase in the breakpoint of n = 1. My approach was to solve for some incremental constant k such that kn^2 = 6n + 2. The new kn^2 term would then be added to 3n^2 to obtain the upper bound cn^2. As noted, there would be some trade-off in the location of the breakpoint, as we will see.

To proceed I used the quadratic formula to solve kn^2 = 6n + 2 for n, and if my math is correct we get B = (6 + (36 + 8k)^0.5) / (2k), where k is the incremental coefficient and B is the new breakpoint for the new upper-bound function (3 + k)n^2.
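This relationship is easy to play with in code. Here is a quick sketch (the function name is my own) that evaluates the formula and reproduces the numbers below:

from math import ceil, sqrt


def breakpoint_for(k):
    """Smallest integer B with k*n^2 >= 6n + 2 for all n >= B, taken from
    the positive root of k*n^2 - 6n - 2 = 0."""
    return ceil((6 + sqrt(36 + 8 * k)) / (2 * k))


print(breakpoint_for(1))  # 7: the tighter bound (3 + 1)n^2 = 4n^2 needs B = 7
print(breakpoint_for(8))  # 1: the in-class bound (3 + 8)n^2 = 11n^2 keeps B = 1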

So for an upper bound of 4n^2 we have a breakpoint of n = 7 (rounded up from 6.32). In other words, we can reduce the upper-bound function by a factor of 2.75 (calculated as 11n^2 / 4n^2) if we accept a slightly higher breakpoint of n = 7.

Interestingly, if we substitute the incremental coefficient k = 8 (from the quick and dirty method used in class), the corresponding breakpoint is in fact n = 1, as if by magic! But, of course, there is no magic here, and the cold hard lesson is that we are actually doing the same thing in different ways. The "quick and dirty" method used in class was to just pick the easiest possible constant c that satisfied the proof and then solve for the corresponding breakpoint B, while the method shown here, although less arbitrary, instead gives a general relationship between the breakpoint and the coefficient.

What a dull conclusion to such a long discussion…I think I better get back to studying for the midterm…

Wednesday, October 29, 2014

Week 7 - Polya’s Approach to Penny Piles


Understanding the Problem:
*Note: The original problem given on the worksheet in class has been generalized here for some initial condition (n, m), where n represents the number of pennies in the left drawer and m represents the number of pennies in the right drawer.

Here is some notation that will be used in our solution below:
(x, y) = a configuration with x number of pennies in the left drawer and y number of pennies in the right drawer
(n, m) = the initial condition
“0” = operation L
“1” = operation R
“0101…” = some sequence of operations L and R

Given some initial condition (n, m) for the numbers of pennies in the left and right drawers respectively, the problem is to determine whether a certain new configuration (x, y) is achievable given only two possible operations:
- L: transfer half of the pennies in the left drawer to the right drawer, only if the left drawer has an even number of pennies
- R: transfer half of the pennies in the right drawer to the left drawer, only if the right drawer has an even number of pennies.
Moreover, we would also like to determine all possible configurations for the number of pennies in each drawer given some initial condition (n, m), as well as the sequence of steps required to achieve each new configuration.

Of course, the set of new configurations depends on the initial condition. For example, if n and m were both odd, then there would be no other possible configurations.

To break this down, our input would be some ordered pair (n, m) and our output would be all possible new ordered pairs (x, y) that can be achieved by operations L and R above.  The sequence of operations needed to achieve each new configuration should also be included in our output.

Devising a Plan:
This problem falls under the category of a binary tree problem, similar to what is currently being taught in another programming course. Since there are only two possible operations, each node of the tree (representing a specific configuration) has at most two subtrees. The leaves of the tree represent the configurations where no further operations can be performed. To simplify the solution further, we can say that the leaves also include configurations that have been previously identified while exploring some other part of the tree. These two types of configurations provide our stopping conditions, or "base cases", for a possible recursive solution.

Here is a small diagram showing what the tree will look like:
[diagram: binary tree of configurations, with edges for operations L and R]

Our plan here is to develop a recursive Python function that computes and prints all possible configurations given some initial condition as input.

Each configuration and its sequence of operations will be stored in a Python dictionary, which is passed into a recursive function along with the current configuration and the sequence of steps used to arrive at the current state. During each call, the program checks whether no further operation can be performed, or whether the current configuration has been encountered before; in either case it stops exploring this branch and passes the information back up the line. If a new operation can be performed, then the new configuration and new sequence are stored in the dictionary and we move down the branch.

Eventually, all possible branches (sequences of operations) will be explored, giving us all possible configurations for the number of pennies in each pile.

Carrying Out the Plan:
To carry out this plan I have implemented the Python function penny_piles() shown below. The output of the function can be verified by performing the sequence of operations on the initial condition. This check can also be implemented in a separate Python function to verify that each output is correct.

def penny_piles(start, steps="", poss_piles=None):
    """ (tuple of (int, int), str, dict of {tuple of (int, int): str}) -> None or dict

    Print all possible configurations for the two piles of pennies given in
    start, each followed by a binary string showing the sequence of operations
    performed on the start condition to obtain that configuration.

    A "0" represents a left operation and a "1" represents a right operation.

    left operation = shift half of the pennies in the left pile to the right
    if the left side is even

    right operation = shift half of the pennies in the right pile to the left
    if the right side is even

    >>> penny_piles((0, 4))
    (0, 4) -
    (1, 3) 10
    (2, 2) 1
    (3, 1) 11
    """
    if poss_piles is None:
        poss_piles = {}

    left = start[0]
    right = start[1]

    # Initial condition is two odd piles: no operation is ever possible.
    if steps == "" and left % 2 != 0 and right % 2 != 0:
        print(start)
        return None

    # Base case: both piles odd, so this configuration is a leaf of the tree.
    if left % 2 != 0 and right % 2 != 0:
        return {}
    else:
        if left % 2 == 0 and left != 0:
            # Operation L: move half of the left pile to the right.
            new_start = (int(left / 2), int(right + left / 2))
            if new_start not in poss_piles:
                new_steps = steps + "0"
                poss_piles[new_start] = new_steps
                poss_piles.update(penny_piles(new_start, new_steps, poss_piles))
        if right % 2 == 0 and right != 0:
            # Operation R: move half of the right pile to the left.
            new_start = (int(left + right / 2), int(right / 2))
            if new_start not in poss_piles:
                new_steps = steps + "1"
                poss_piles[new_start] = new_steps
                poss_piles.update(penny_piles(new_start, new_steps, poss_piles))
        if steps == "":
            # Back at the top level: record the start and print everything.
            poss_piles[start] = "-"
            lst_of_keys = list(poss_piles)
            lst_of_keys.sort()
            for key in lst_of_keys:
                print(key, poss_piles[key])
            return None
        return poss_piles


if __name__ == "__main__":
    import doctest
    doctest.testmod()


Looking Back:
This function has been tested and verified for various inputs, including corner cases such as (0, 0) and odd-only values. The results are easily checked by hand or with another Python function, as sketched below.
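For example, here is a minimal checker along those lines (the name follow_steps is my own). It replays a sequence of operations on a starting configuration, so each line printed by penny_piles can be confirmed independently:

def follow_steps(start, steps):
    """Replay a sequence of operations ("0" = L, "1" = R) on start and
    return the resulting configuration.

    >>> follow_steps((0, 4), "11")
    (3, 1)
    """
    left, right = start
    for op in steps:
        if op == "0":
            # L: move half of the (even) left pile to the right.
            left, right = left // 2, right + left // 2
        else:
            # R: move half of the (even) right pile to the left.
            left, right = left + right // 2, right // 2
    return (left, right)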

This solution can be re-implemented in many other ways, including ones that do not use recursion explicitly, although it does help to think recursively for this problem. I can say, however, that this function is efficient in that it does not explore duplicate branches.

Recursion can be used on a wide variety of problems where the problem can be broken down into smaller problems of the same type (e.g. a linked list or a family tree). In this case we had an initial condition, and the set of configurations reachable from it is built up from the configurations reachable from each new configuration.