You do not need to memorize all of the following, but you should be familiar with them and spend some time writing up your own implementation of them.
Terms
Stability
Sorting algorithms are considered stable if elements with equal keys appear in the same relative order in the output as they do in the input.
Adaptive
An adaptive algorithm takes advantage of any existing order (presorted runs) in the input set and sorts faster when the input is partially sorted. The more disorder in the input set, the longer it takes to sort.
In-Place
A sorting algorithm that sorts within the same storage space as the input set. Typically it requires one or two additional temp variables, but it does not require large chunks of additional memory to store subsets of the input set while sorting.
Online
An online sorting algorithm does not require all of the elements to be present at the beginning of the sorting operation. They can be provided as a stream, one by one, and the algorithm keeps the items it has received sorted as new ones arrive.
Insertion sort
More efficient for a relatively small number of elements. Similar to selection sort. If provided with a nearly sorted input, it is very efficient. Each element is examined and checked to see whether the element before it is greater in value; if so, the two are swapped, and the element continues moving down the array until its correct position is found. It can be more memory read/write intensive than selection sort, so in cases where reads and writes are slow, selection sort may be favored. It has the advantage over selection sort and other methods of being online.
- Worst, Avg: O(n^2)
- Best case: O(n)
- Online, adaptive, stable, in-place
Example: 13, 24, 9, 8, 6
We typically start at the beginning of the array and work towards the end; each time, we sort the value down (towards index 0) until it is in its correct place, such that it is less than or equal to the element to its right.
Assume a for (int i = 0; i < length; i++) loop.
- arr[i] = 13: there is nothing to the left, so i++
- arr[i] = 24: 24 > 13, so there is nothing to do, i++
- arr[i] = 9: 9 < 24, so we swap 9 and 24
- Then we check whether the element before 9 (now 13) is less than 9
- It is not, so we swap those two as well
- The first three elements in the array are now 9, 13, 24, the sorted subset
- arr[i] = 8 . . . we continue comparing the ith element to the elements before it until we have sorted the entire array
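A minimal sketch in Java (the class and method names are my own). It uses the common shift-then-insert variant, which is equivalent to the repeated swapping described above but performs fewer writes:

public class InsertionSort {
    public static void insertionSort(int[] arr) {
        // The first element is a sorted sub-array of size one, so start at index 1.
        for (int i = 1; i < arr.length; i++) {
            int current = arr[i];
            int j = i - 1;
            // Shift larger elements one slot to the right until the correct
            // position for the current value is found.
            while (j >= 0 && arr[j] > current) {
                arr[j + 1] = arr[j];
                j--;
            }
            arr[j + 1] = current;
        }
    }

    public static void main(String[] args) {
        int[] arr = new int[]{13, 24, 9, 8, 6};
        insertionSort(arr);
        for (int i : arr) {
            System.out.printf("%d ", i);
        }
        System.out.println();
    }
}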
Quicksort
A divide and conquer algorithm. Divide the array into two sub-arrays via a partitioning step: pick a pivot (the middle element, a random element, etc.), then move all elements less than the pivot to its left and all elements greater than or equal to it to its right. Then recursively run the same function on the sub-arrays to the left and to the right of the pivot. It can be problematic in that its recursive nature can lead to resource starvation and stack-overflow problems.
- Worst case: O(n^2)
- Avg case: O(n log n)
- Best case: O(n log n)
- Unstable
- Can be implemented in-place
- https://www.algolist.net/Algorithms/Sorting/Quicksort
- http://me.dt.in.th/page/Quicksort/
- http://en.wikipedia.org/wiki/Quicksort
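A minimal in-place sketch in Java (names are my own). It uses Lomuto partitioning with the last element as the pivot purely for brevity; picking the middle or a random element, as described above, works the same way:

public class QuickSort {
    public static void quickSort(int[] arr, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        int p = partition(arr, lo, hi);
        quickSort(arr, lo, p - 1);   // sort the values left of the pivot
        quickSort(arr, p + 1, hi);   // sort the values right of the pivot
    }

    // Lomuto partition: everything less than the pivot ends up to the left
    // of the returned index, everything greater or equal to the right.
    private static int partition(int[] arr, int lo, int hi) {
        int pivot = arr[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (arr[j] < pivot) {
                swap(arr, i, j);
                i++;
            }
        }
        swap(arr, i, hi);
        return i;
    }

    private static void swap(int[] arr, int a, int b) {
        int temp = arr[a];
        arr[a] = arr[b];
        arr[b] = temp;
    }

    public static void main(String[] args) {
        int[] arr = new int[]{6, 6, 2, 9, 7, 15, 1, 31};
        quickSort(arr, 0, arr.length - 1);
        for (int i : arr) {
            System.out.printf("%d ", i);
        }
        System.out.println();
    }
}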
Mergesort
A divide and conquer algorithm. Recursively split the list until you have one-element lists, then merge the sorted results back together.
- Worst case: O(n log n)
- Avg case: O(n log n)
- Best case: O(n log n)
- Stable (most implementations)
- http://en.wikipedia.org/wiki/Merge_sort
- http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Merge_sort.html
- http://www.ee.ryerson.ca/~courses/coe428/sorting/mergesort.html
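A minimal top-down sketch in Java (names are my own). It merges into auxiliary copies of each half, so it is not in-place; using <= in the merge keeps it stable:

import java.util.Arrays;

public class MergeSort {
    public static void mergeSort(int[] arr) {
        if (arr.length <= 1) {
            return;
        }
        // Recursively split until the sub-arrays are one element each.
        int mid = arr.length / 2;
        int[] left = Arrays.copyOfRange(arr, 0, mid);
        int[] right = Arrays.copyOfRange(arr, mid, arr.length);
        mergeSort(left);
        mergeSort(right);
        merge(arr, left, right);
    }

    // Merge the two sorted halves back into arr.
    private static void merge(int[] arr, int[] left, int[] right) {
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            arr[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            arr[k++] = left[i++];
        }
        while (j < right.length) {
            arr[k++] = right[j++];
        }
    }

    public static void main(String[] args) {
        int[] arr = new int[]{6, 6, 2, 9, 7, 15, 1, 31};
        mergeSort(arr);
        System.out.println(Arrays.toString(arr));
    }
}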
Heapsort
Heapsort is an in-place algorithm with a worst, best, and average complexity of O(n log n).
It can be thought of as an improved selection sort. It is an improvement because it uses the semi-sorted heap data structure, so it no longer needs a linear scan comparing all remaining items to find the next one (avoiding the O(n^2) behavior).
It is essentially a two-part algorithm. We will assume in this example that the heap is a max heap, not a min heap:
- Build the heap
- Sort the heap
We capitalize on the fundamental property of the heap that the largest item is always at the root, and that when we remove the largest item and re-form the remaining heap, the next largest item bubbles to the top. We can use an array (arr) to store our heap. We divide the array into an unsorted and a sorted segment. Initially, the entire array is unsorted.
We begin by creating a pointer (sortedIdx) that will keep track of the index in the array that divides the sorted and unsorted elements. Initially, it points to the last element in the array (arr.length - 1). We also create a temporary variable (temp) that we will use to store values that we are in the process of swapping in our array.
In each sort operation we:
- Copy the value in arr[sortedIdx] to temp; this is one of the smallest elements in the heap
- Copy the value at arr[0] (removing the largest element in the heap) to arr[sortedIdx]; this begins creating the sorted segment of the array
- Decrement sortedIdx, moving the pointer towards the beginning of the array to indicate that all elements after it are already sorted
- Set the value of arr[0], the root of the heap, to the value of temp
- Sift the new root value downwards until the heap ordering property is satisfied
- Repeat until sortedIdx == 0
After each sort cycle the next largest element is added at the sortedIdx position in the array, and we slowly grow the sorted section of the array.
Once we have removed all of the elements from the heap and added them back to the array we are left with a completely sorted array.
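A minimal sketch in Java of the two phases (build a max heap, then repeatedly move the root into the sorted segment and sift the replacement down). The swap inside the second loop combines the temp / arr[0] / arr[sortedIdx] steps described above:

public class HeapSort {
    public static void heapSort(int[] arr) {
        // Phase 1: build a max heap by sifting down from the last parent node.
        for (int i = arr.length / 2 - 1; i >= 0; i--) {
            siftDown(arr, i, arr.length);
        }
        // Phase 2: repeatedly swap the root (largest element) with the element
        // at sortedIdx, shrink the heap, and restore the heap property.
        for (int sortedIdx = arr.length - 1; sortedIdx > 0; sortedIdx--) {
            int temp = arr[sortedIdx];
            arr[sortedIdx] = arr[0];
            arr[0] = temp;
            siftDown(arr, 0, sortedIdx);
        }
    }

    // Sift the value at index i down until the max-heap property holds
    // within the first heapSize elements of the array.
    private static void siftDown(int[] arr, int i, int heapSize) {
        while (true) {
            int largest = i;
            int left = 2 * i + 1;
            int right = 2 * i + 2;
            if (left < heapSize && arr[left] > arr[largest]) {
                largest = left;
            }
            if (right < heapSize && arr[right] > arr[largest]) {
                largest = right;
            }
            if (largest == i) {
                return;
            }
            int temp = arr[i];
            arr[i] = arr[largest];
            arr[largest] = temp;
            i = largest;
        }
    }

    public static void main(String[] args) {
        int[] arr = new int[]{6, 6, 2, 9, 7, 15, 1, 31};
        heapSort(arr);
        for (int i : arr) {
            System.out.printf("%d ", i);
        }
        System.out.println();
    }
}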
BubbleSort
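Repeatedly step through the array, comparing adjacent elements and swapping them when they are out of order; after each pass the largest remaining element has bubbled to the end. With an early exit when a pass makes no swaps it is adaptive.
- Worst, Avg: O(n^2)
- Best case: O(n) (with the early-exit check)
- Adaptive, stable, in-place
A minimal sketch in Java with the early-exit check (names are my own):

public class BubbleSort {
    public static void bubbleSort(int[] arr) {
        // After each pass the largest remaining value has bubbled to the end,
        // so the inner loop can shrink by one element each time.
        for (int i = arr.length - 1; i > 0; i--) {
            boolean swapped = false;
            for (int j = 0; j < i; j++) {
                if (arr[j] > arr[j + 1]) {
                    int temp = arr[j];
                    arr[j] = arr[j + 1];
                    arr[j + 1] = temp;
                    swapped = true;
                }
            }
            // If no swaps were made, the array is already sorted.
            if (!swapped) {
                return;
            }
        }
    }
}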
Selection Sort
Similar to insertion sort; however, you must iterate through the entire unsorted portion of the array each time to make sure you have found the next lowest element, whereas in insertion sort, once the a > b criterion is met, you can stop iterating for that element and move on to the next. An advantage is that it is simple to implement.
Worst, Avg, Best: O(n^2)
The basic algorithm:
- Start with two sets, a sorted set on the left and an unsorted set on the right
- Initially, the sorted set is 0 size, and the unsorted set is the rest of the array
- For each element in the array:
- Look for the smallest element in the unsorted set and add it to the end of the sorted set
- Continue until unsorted set is 0 size (reach the end of the array)
public class SelectionSort {
    public static void selectSort(int[] arr) {
        int temp = 0;
        int currentMin = Integer.MAX_VALUE;
        int currentMinPosition = 0;
        // Start at the beginning of the array and look for the smallest value
        // from our current position to the end.
        for (int i = 0; i < arr.length - 1; i++) {
            // Reset the minimum
            currentMin = Integer.MAX_VALUE;
            // Look for the smallest value from i to the end
            for (int j = i; j < arr.length; j++) {
                if (arr[j] < currentMin) {
                    currentMin = arr[j];
                    currentMinPosition = j;
                }
            }
            // Once we have found the current minimum position in the unsorted
            // segment of the array, swap it with the current ith position
            temp = arr[i];
            arr[i] = arr[currentMinPosition];
            arr[currentMinPosition] = temp;
        }
    }

    public static void printArr(int[] arr) {
        for (int i : arr) {
            System.out.printf("%d ", i);
        }
        System.out.println();
    }

    public static void main(String[] args) {
        int[] arr1 = new int[]{6, 6, 2, 9, 7, 15, 1, 31};
        int[][] testData = new int[1][];
        testData[0] = arr1;
        for (int[] t : testData) {
            selectSort(t);
            printArr(t);
        }
    }
}
Topological Sorting
Nodes in a directed acyclic graph are placed in an order consistent with the directed edges of the graph, such that for every directed edge a -> b, a appears before b in the result. The graph must be a DAG to be sorted in this fashion.
A node is visited (added to the result) only after all of the nodes upstream from it in the DAG have been visited. Thus, a node that is visited later must come later in the topological ordering.
The algorithm:
- Start with a node that does not have any incoming edges, or an indegree of zero.
- Add it to the result and remove it from the graph
- Then search for another node with a zero indegree
- Add it to the result and remove it from the graph
- Continue looping until the graph has zero elements remaining OR we cannot find a node with an indegree of zero
There are two possible outcomes:
- If the graph has zero elements remaining, we have completed our ordering and it was indeed a DAG
- If we cannot find an additional node with an indegree of zero, we have a cyclic graph and cannot generate a topological sorting of it
The implementation of the algorithm does not require destroying the original graph. We can either use a hashmap to keep track of the indegree of each node, or a node structure that includes that property.
- https://en.wikipedia.org/wiki/Topological_sorting
- https://www.interviewcake.com/concept/java/topological-sort
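A minimal sketch in Java of the remove-zero-indegree approach described above (Kahn's algorithm), using a hashmap of indegrees rather than mutating the graph. The String-keyed adjacency-list representation is just an assumption for the example:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopologicalSort {
    // graph maps each node to the list of nodes it has edges pointing to.
    // Returns the nodes in topological order, or null if the graph has a cycle.
    public static List<String> topoSort(Map<String, List<String>> graph) {
        // Count incoming edges for every node.
        Map<String, Integer> indegree = new HashMap<>();
        for (String node : graph.keySet()) {
            indegree.putIfAbsent(node, 0);
            for (String neighbor : graph.get(node)) {
                indegree.merge(neighbor, 1, Integer::sum);
            }
        }
        // Start with every node whose indegree is zero.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : indegree.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> result = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            result.add(node);
            // "Remove" the node by decrementing the indegree of its neighbors.
            for (String neighbor : graph.getOrDefault(node, List.of())) {
                if (indegree.merge(neighbor, -1, Integer::sum) == 0) {
                    ready.add(neighbor);
                }
            }
        }
        // If some nodes were never reached, the graph contains a cycle.
        return result.size() == indegree.size() ? result : null;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("a", List.of("b", "c"));
        graph.put("b", List.of("d"));
        graph.put("c", List.of("d"));
        graph.put("d", List.of());
        System.out.println(topoSort(graph)); // e.g. [a, b, c, d] or [a, c, b, d]
    }
}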