Sunday, March 30, 2014

Sorting and Efficiency

Introduction:

Sorting is the process of placing a collection of elements into a certain order. Generally speaking, this "order" can be defined by the user and depends on the collection. For example, we can take a collection of students' final grades and sort them in decreasing order. Likewise, a search engine can sort your search results by different "rules", such as most popular, most relevant, or most recent.

Although there are many means and kinds of sorting, this week we started with the basic one: sorting a range of numbers into increasing order. Sorting other collections, such as strings, is similar to sorting numbers, and the same insights apply. Therefore, the following discussion focuses only on sorting numbers.

Basic steps in sorting numbers:

Sorting numbers depends on two simple steps. First, we compare number A to number B to find which one is smaller. Second, we swap the relative positions of A and B when they are out of order.

These two steps seem really easy, but each step can be implemented by different methods. Therefore, there are many ways to sort numbers, and this raises the question of how to select the "most optimal" method, meaning the one with the smallest run time and complexity.

Quick sort:

Quicksort works by partitioning the list into two parts: the items less than the pivot item, and those greater than or equal to the pivot item. For the list [5, 1, 7, 3, 9, 12] with pivot 5, one possible partition is [3, 1, 5, 7, 9, 12]. After that, we recursively quicksort the left part and the right part until they cannot be divided further. The run time depends on two questions. First, how many times can a list of length n be divided in half? The answer is about log2(n), so the recursion depth is in O(log n). Second, how many comparisons do we make at each level of the recursion? The answer is a linear function of n, so each level costs O(n). The combination of these two steps gives O(n log n).
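The partitioning idea above can be sketched in Python. This is a minimal version, not the in-place partition a production implementation would use: it copies the list on each call, and the choice of the first item as pivot is just for illustration.

```python
def quicksort(lst):
    """Return a new sorted list using the partition-then-recurse idea.

    A sketch only: it allocates new lists on every call and always
    picks the first item as the pivot.
    """
    if len(lst) <= 1:               # base case: cannot be divided further
        return lst
    pivot = lst[0]
    smaller = [x for x in lst[1:] if x < pivot]        # items < pivot
    larger = [x for x in lst[1:] if x >= pivot]        # items >= pivot
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([5, 1, 7, 3, 9, 12]))   # [1, 3, 5, 7, 9, 12]
```

Note how the worst case from the next paragraph shows up here: if `lst` is already sorted, `smaller` is always empty and the recursion depth becomes n instead of log n.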

Note that this algorithm has a worst case, which occurs when the pivot is chosen carelessly. If the pivot we choose happens each time to be the smallest or largest item, all the remaining numbers end up in one part, leaving the other part empty. When this happens, the runtime grows to O(n^2), because we divide the list n times and still do O(n) comparisons at each level.


Selection sort:


Selection sort works by repeatedly selecting the smallest remaining item and putting it where it belongs. Its runtime is rather simple to compute. First, finding the smallest remaining number requires looping over the unsorted part, so each selection costs O(n). Second, "putting it where it belongs" just means swapping the selected minimum to the front of the unsorted part, and this selection-and-swap pass must be repeated about n times, once per position. In all, that is O(n) * O(n) = O(n^2).
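The two nested loops that give the O(n) * O(n) count are easy to see in code. A minimal in-place sketch:

```python
def selection_sort(lst):
    """Sort lst in place by repeatedly selecting the smallest remaining item."""
    n = len(lst)
    for i in range(n):                  # n passes, one per position
        min_index = i
        for j in range(i + 1, n):       # scan the unsorted part: O(n)
            if lst[j] < lst[min_index]:
                min_index = j
        # one swap puts the minimum where it belongs
        lst[i], lst[min_index] = lst[min_index], lst[i]

numbers = [5, 1, 7, 3, 9, 12]
selection_sort(numbers)
print(numbers)   # [1, 3, 5, 7, 9, 12]
```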

Although selection sort also has better and worse inputs, its runtime is O(n^2) in every case, which makes it a rather predictable algorithm.


Insertion sort:

Insertion sort works by repeatedly inserting the next item where it belongs among the sorted items at the front of the list. This algorithm is somewhat similar to selection sort. When we calculate its runtime, we again need to consider two steps. First, we take the "next item" one at a time, which requires O(n) iterations. Second, we need to find the position "where it belongs" in the sorted front part, which can take O(n) shifts per item. Therefore the conclusion is O(n^2).

Note that this algorithm has a best case. On an already sorted list, each item needs only O(1) comparisons and no shifts, so the total runtime shrinks to O(n).
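A sketch of insertion sort where both the O(n^2) general case and the O(n) best case are visible: the inner `while` loop does the shifting, and on a sorted list it exits immediately for every item.

```python
def insertion_sort(lst):
    """Sort lst in place by inserting each item into the sorted front part."""
    for i in range(1, len(lst)):        # take the "next item" n - 1 times
        item = lst[i]
        j = i
        # shift larger items right until item's position is found;
        # on an already sorted list this loop body never runs,
        # which gives the O(n) best case
        while j > 0 and lst[j - 1] > item:
            lst[j] = lst[j - 1]
            j -= 1
        lst[j] = item

numbers = [5, 1, 7, 3, 9, 12]
insertion_sort(numbers)
print(numbers)   # [1, 3, 5, 7, 9, 12]
```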


Bubble sort: 

Bubble sort works by repeatedly scanning the entire list and swapping adjacent items that are out of order. In practice, this algorithm is not preferable to O(n log n) algorithms like quicksort and merge sort. Even compared to algorithms with the same O(n^2) runtime, it is relatively inefficient.

However, this algorithm has a significant benefit if we are not sorting a list but detecting whether it is already sorted. In other words, with an early-exit check it performs well on a sorted list, taking only O(n).
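The O(n) behaviour on a sorted list comes from stopping as soon as a full scan makes no swaps. A sketch with that early exit:

```python
def bubble_sort(lst):
    """Sort lst in place; stop early once a full pass makes no swaps."""
    n = len(lst)
    for i in range(n):
        swapped = False
        for j in range(n - 1 - i):          # compare adjacent pairs
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
                swapped = True
        if not swapped:     # the list is sorted: one O(n) pass sufficed
            break

numbers = [5, 1, 7, 3]
bubble_sort(numbers)
print(numbers)   # [1, 3, 5, 7]
```

On an already sorted input, the first pass performs no swaps and the function returns after a single scan.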

Merge sort:


Merge sort splits the list in half, sorts the two halves, and then merges the two sorted halves. Since it also halves the list, you can expect performance similar to quicksort. In fact, the main difference between the two algorithms is that merge sort does its "sorting" work in the merge step, while quicksort does it in the dividing (partition) step. The rest is essentially the same. Therefore the runtime is also O(n log n).
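A sketch of the split-then-merge structure. The helper `merge` does the actual comparison work, which is why the "sorting" happens on the way back up rather than in the dividing step.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    return result + left[i:] + right[j:]   # append whichever half remains

def merge_sort(lst):
    """Split the list in half, recursively sort both halves, then merge."""
    if len(lst) <= 1:                      # base case: already sorted
        return lst
    mid = len(lst) // 2
    return merge(merge_sort(lst[:mid]), merge_sort(lst[mid:]))

print(merge_sort([5, 1, 7, 3, 9, 12]))   # [1, 3, 5, 7, 9, 12]
```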

Conclusion:

These are among the most common sorting algorithms. However, new sorting methods keep appearing that perform well in different cases. Sorting is a big problem in computer science, and achieving better performance still takes hard work. The discussion above does not cover everything about sorting, and you may explore more if you want.





Sunday, March 16, 2014

LinkedList

Recently we learned a new type of list called LinkedList. Different from the normal list class, it is a chain of nodes, each of which holds a single value and whose only child is a reference to the next node.

An ordinary NestedList object is constructed from recursive layers, where each layer contains a value and another NestedList. The LinkedList provides a new way to store similar information, but in a flatter way. Since each node has only one child, we don't need to consider the size of the next layer. The only thing we need to do is check the value of each node one by one. A LinkedList is just like a train with multiple cars: we put our data, our 'goods', into this linear order.

There are also different ways to build a LinkedList object. One way is just like how we built the Tree object, but with only one child. The second uses a wrapper approach: we first build a LinkedNode class, which is like a single car of the train, and then build the LinkedList on top of the LinkedNodes. The LinkedList is to the LinkedNodes as the train is to its cars. This approach gives us a better way to implement functions like reverse().
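The wrapper approach can be sketched like this. The class and attribute names here are my own, not necessarily the ones from lecture; the point is that the wrapper holds the front node, so reverse() can redirect every node's next reference without any recursion.

```python
class LinkedNode:
    """One 'car' of the train: a value plus a reference to the next node."""
    def __init__(self, value, next_=None):
        self.value = value
        self.next_ = next_

class LinkedList:
    """Wrapper that keeps a reference to the first node of the chain."""
    def __init__(self):
        self.front = None

    def prepend(self, value):
        """Add a new node at the front."""
        self.front = LinkedNode(value, self.front)

    def reverse(self):
        """Reverse in place by redirecting each node's next_ reference."""
        prev = None
        current = self.front
        while current is not None:
            next_node = current.next_   # remember the rest of the train
            current.next_ = prev        # point this car backwards
            prev = current
            current = next_node
        self.front = prev

    def to_list(self):
        """Walk the chain node by node and collect the values."""
        result, node = [], self.front
        while node is not None:
            result.append(node.value)
            node = node.next_
        return result

train = LinkedList()
for value in [3, 2, 1]:
    train.prepend(value)
print(train.to_list())   # [1, 2, 3]
train.reverse()
print(train.to_list())   # [3, 2, 1]
```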

Sunday, March 2, 2014

WHEN CODING MEETS RECURSION

Recently I encountered a new coding method: recursion. It is not only a new 'technique' I have learned for programming, but also a remarkable insight into certain problems.

What is recursion in programming? From my experience, it is a method where a function calls itself. For instance, when you build a function f(x), inside it you call f again. When the computer executes that line, it automatically plugs in the new input and executes f(new_x).
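A tiny concrete example of a function calling itself with a new input, the classic factorial (my own illustration, not from the course material):

```python
def factorial(n):
    """n! computed recursively: the function calls itself on a smaller input."""
    if n == 0:                        # base case stops the recursion
        return 1
    return n * factorial(n - 1)       # recursive call with new_x = n - 1

print(factorial(5))   # 120
```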

Although it resembles looping to some extent, recursion provides benefits beyond that. Namely, it gives us a way to break complex but self-similar problems down. The programmer only needs to code one iteration of the problem and let the computer do the rest. For example, in A1 we need to program a solution to the Hanoi tower puzzle (with three stools). Although the number of cheeses is unknown, we can still find the general 'rule': move the top (n - 1) cheeses onto the 'helper' stool, then the base to the 'destination'. Each time you are asked to move n cheeses, you can recursively think of moving (n - 1) cheeses, down to the base case of just one left. Without recursion this task seems impossible, because you don't even know how many cheeses you have! Generally, recursion is useful whenever a task has an unknown number of sub-tasks that resemble the task itself.
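The three-stool rule above can be sketched as follows. This is my own illustration of the general idea, not the A1 code itself; the stool names and the decision to return a list of moves are assumptions.

```python
def move_tower(n, source, helper, destination):
    """Return the list of moves that shifts n cheeses from source to
    destination using three stools: each move is a (from, to) pair."""
    if n == 1:                                   # base case: one cheese left
        return [(source, destination)]
    return (move_tower(n - 1, source, destination, helper)   # top n-1 to helper
            + [(source, destination)]                        # base to destination
            + move_tower(n - 1, helper, source, destination))  # n-1 onto the base

moves = move_tower(3, 'A', 'B', 'C')
print(len(moves))   # 7, i.e. 2**3 - 1
```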

However, recursion easily raises some problems. The most significant one I met is that the number of calls grows at an exponential rate, which easily causes trouble. Sometimes you may not think everything through and leave 'leaks' that let the recursion run out of range. Also, compared to the same number of loop iterations, recursion costs more time. Believe it or not, when I ran 150 cheeses on my computer, it took around 2 minutes to generate the answer. After that, each cheese I added multiplied that time, and finally my computer could do nothing besides burn!

Sunday, February 16, 2014

Reading Week

I planned to use this reading week to review for my test. It seems there are lots of things I need to catch up on.

For last week's Assignment 1, I want to comment on the function that calculates i. I think this was the hardest part of the whole assignment. I knew that I needed to use the M(n) from the handout, but it was pretty tricky to come up with an optimal solution. I googled a lot for hints, and finally found an article suggesting that the steps M(n) should be tested for each 'i' one by one to find the smallest M(n). That was very helpful, and after several tries I could find the smallest M(n). However, it was not over: I still needed to recover the 'i'. At first I tried using a loop to get the i, but it was too slow, because inside the loop I was calling the recursion thousands of times. After struggling for 2 days, I recalled that I could use a memory to write down each result, so I could look it up instead of doing the same recursion again and again. I chose to use a dictionary. After that, all the problems were solved. My program can run 150 cheeses with the smallest number of steps within 1 minute.
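The dictionary trick can be sketched like this. I am writing the four-stool recurrence from memory (minimize over the split i, moving i cheeses with three stools at a cost of 2**i - 1), so treat the formula and the names as assumptions; the point is the cache, which turns thousands of repeated recursive calls into single lookups.

```python
memo = {1: 1}   # cache: n -> least number of moves for n cheeses, four stools

def M(n):
    """Least moves for n cheeses on four stools, testing every split i
    and caching each answer in a dictionary so it is computed only once."""
    if n not in memo:
        memo[n] = min(2 * M(n - i)      # move n-i cheeses aside and back
                      + 2 ** i - 1      # move i cheeses with three stools
                      for i in range(1, n))
    return memo[n]

print(M(4))   # 9
```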

Sunday, February 9, 2014

LEARN FROM HANOI

This week I spent much time figuring out step 5, namely the recursion question, in the Hanoi puzzle. After the Monday lecture I understood the insight behind solving recursion problems, and finished the three-stool problem. I thought it would be similar for four stools, but found it much harder.

After many tries, I finally finished the program, but I still feel vague about the whole process. There are still many steps that I find hard to trace. However, through this assignment I have gained a basic framework for separating a recursive question into parts and dividing them into base cases and recursive cases. I just need time to understand it better.

In conclusion, a recursive function requires you to focus your mind on solving only one cycle of the problem and leave the rest to the computer, which can auto-run it by itself. Moreover, it's important to add a condition check for each case, which avoids unexpected errors.

Monday, February 3, 2014

Recursion

These days I have been busy with Assignment 1. In the assignment, there is a puzzle called the Tower of Hanoi. The first 4 steps are actually easy: they require following the instructions and comments to fill in the blanks and make the program functional. However, the fifth step was so hard that I struggled for 2 days and was still standing still!

That step requires thinking of a method to automatically run the game and find the least-step solution. I knew that recursion was required, but I couldn't find a clue. After the Monday lecture and some searching on the wiki, I can understand most of the three-stool method. It will take me more time, let's say at least a week, to come up with the method for four stools.

In short, recursion is easy to trace and understand, but hard when you have to generate your own. It requires a good understanding of the puzzle and the ability to separate it into several steps.

Wednesday, January 22, 2014

View about OOP


This week I am going to talk about my understanding of OOP.

OOP, or Object-Oriented Programming, is the main programming paradigm nowadays. Not only Python, but also C++, Java and other programming languages use this paradigm.

In my personal understanding, OOP is an approach that programmers follow when they want to create new code. With OOP, code is organized into objects with certain attributes and behaviours. For instance, the class "Stack" describes, in other words, a pile of stones where items can only be added and removed at the top. In this way, tons of complex code is packaged into objects. Whenever users or other developers want to use them, they don't need to build them again; the only thing they need to do is call the object and the methods inside it.
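The Stack idea can be sketched in a few lines. This is a minimal illustration rather than the exact class from the course; the method names push/pop are the conventional ones.

```python
class Stack:
    """A pile where items may only be added to and removed from the top."""
    def __init__(self):
        self._items = []

    def push(self, item):
        """Add item to the top of the stack."""
        self._items.append(item)

    def pop(self):
        """Remove and return the item at the top of the stack."""
        return self._items.pop()

    def is_empty(self):
        """Return True if the stack contains no items."""
        return self._items == []

s = Stack()
s.push('first stone')
s.push('second stone')
print(s.pop())   # second stone
```

A caller never touches `_items` directly: this packaging of the details behind a small set of methods is exactly the convenience described below.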

This programming paradigm benefits the development of computer programming in many ways. Firstly, new programmers can build their code on previous code, which saves time. Secondly, OOP provides users with great convenience: users do not need to understand the complex code they buy, and can use it just by calling object and method names. Thirdly, OOP programs are closely related to real-world problems. For instance, the "turtle" class is built for turtle robots in reality, so the code itself drives the robots to accomplish movements. By selling robots, society benefits and the code produces value. Generally speaking, OOP is a proper and meaningful paradigm in programming.