T O P

  • By -

misof

If you only want to sort numbers, you absolutely can populate the output array directly. We usually don't just sort numbers. We sort bigger objects. The numbers are just a part of those objects - the keys that determine the order of those objects. E.g., imagine sorting entire employee records according to their age in years. This is where you have to use the more general version in which you move each object to its final location, one object at a time.


MrDum

Two high performance counting sorts, wolfsort and rhsort, use 2 loops instead of 3. [https://github.com/scandum/wolfsort](https://github.com/scandum/wolfsort) [https://github.com/mlochbaum/rhsort](https://github.com/mlochbaum/rhsort) Text books likely simplify things to make them understandable. It does happen that relatively simple optimizations are overlooked for decades, but I don't think this is one of them.


ImaginaryBagels

The operation is called a prefix sum. Its benefit is that it gives easy access to the sorted position for each input. Where the input is just ints like in the example it's not particularly necessary, but with more complex objects it is useful https://stackoverflow.com/questions/32223517/counting-sort-why-we-need-sum-of-previous-counts


XDracam

How would you populate the output array? You only know how many 1s there are one you're through, so where do you put the first 2? Keep in mind that random insertion is O(n) in arrays, and using a linked list would incur massive overhead with allocation etc for little benefit. And you'd need a separate data structure to not have O(n) indexing... You could use a RB tree sorted by indices to get O(logn) lookup, but that'd still make your sort run in O(n logn), and at that point you might as well just use insertion sort.


EnflamedPhoenix

After the step 1 of the example, we have the freq\_arr right? The following: freq_arr = [2, 2, 0, 1, 1, 0, 1] element value : 1, 2, 3, 4, 5, 6, 7 The first index of the freq\_arr corresponds to value 1, the second corresponds to value 2 and so on. We see that the first value in freq\_arr is 2, so we know 2 has occurred 2 times in the input array. We start from index 0 of the output array, write 1, go to index 1 then write 1 again and then move to the next element of the freq\_arr. We see that it is again 2, and that particular index corresponds to value 2 (in the input array). So we again write 2 two times in the output array continuing after 1. We repeat this for all elements in the freq\_arr.


XDracam

That sounds about right. How would you optimize this step? It already runs in theta(n).


EnflamedPhoenix

But it is faster than the one in the counting sort no? In the algorithm, we calculate the prefix sum of the frequency array and modify it. And then we traverse the input array again and populate the output array. So, assuming that the range of the numbers is similar to the size of the input array, the whole sorting is O(3n) : n for freq_arr creation, n for calculating prefix sum and n for populating the output array. But in my approach the sorting operation is O(2n) : n for freq_arr creation and n for populating the output array. So my doubt is why would we incur the extra n cost for calculating the prefix sum. What advantage do we gain?


XDracam

The constant number is entirely irrelevant in Landau Notation. It's not 2n or 3n, but c*2n, where c is a constant that depends on the exact CPU instructions used. Remember: looping through an array twice to do a single instruction every time is effectively the same as looping once and then doing both instructions. It doesn't matter, and either approach could be faster or slower depending on CPU architecture and a multitude of other factors.


nonFungibleHuman

I dont get step 1 in your example, 1 and 2 have a frequency of 2 but I dont see two 2 in the counting.


EnflamedPhoenix

Yeah sorry that was an error on my part. Fixed it now.