On 21-10-09 09:58:12, Mario Pereira wrote:
> Hi Christophe,
>
> Your implementation reminds me very much of TimSort (for an OCaml
> implementation, see for instance https://github.com/LesBoloss-es/sorting/blob/
> master/src/list/timsort.ml). This is also a stable algorithm.
Hi Mário,
Yes the idea is similar, but
- I use a reference to the list for the sorted/reverse-sorted block I build in
first phase (like Timsort on arrays, which actually is stable, in place and as
good complexity, but you have to get the balancing right, see below)
- in the second phase, I use a merge, returning sorted list at even depth and
rev-sorted at odd depth (like OCaml's List.sort). This avoid the Asc/Desc
constructor and the List.rev.
- the balancement is ensured by a simple arithmetic computation, not by a stack
and an invariant on sizes (probably building a balanced binary tree with the
initial block instead of a list could be a bit better, I am not sure).
- Last but not least: the code you point seems broken somewhere, it does not
ends on large lists (probably quadratic because the balancing with size is
probably wrong. cf the paragraph "bug" on wikipedia about Timsort. Anyway, the
code you point is probably optimised for arrays.
> Was TimSort an inspiration?
The inspiration was a paper I read long ago about the O(n ln n) not being the
best we can do for lists, as O(n ln k) can be achieved where k is the number of
change of direction.
My work aims at being a replacement for OCaml sort (for list currently). Here is
the timing I get:
Correctness test passed
Stability test passed
Random lists:
random: tf = 1.44, tg = 1.39, factor = 1.04x, gain = 3.70%
random small: tf = 1.10, tg = 1.19, factor = 0.92x, gain = -8.27%
Worst cases:
worst1: tf = 0.83, tg = 0.71, factor = 1.17x, gain = 14.21%
worst2: tf = 0.65, tg = 0.70, factor = 0.93x, gain = -7.17%
Sorted (partially) lists:
sorted: tf = 0.65, tg = 0.01, factor = 97.86x, gain = 98.98%
reversed: tf = 0.65, tg = 0.09, factor = 7.62x, gain = 86.88%
sorted@rev: tf = 0.65, tg = 0.21, factor = 3.14x, gain = 68.19%
rev@sorted: tf = 0.65, tg = 0.21, factor = 3.13x, gain = 68.03%
Shuffled lists (permute k times 2 elements in a sorted list):
shuffle 10: tf = 0.66, tg = 0.41, factor = 1.61x, gain = 38.01%
shuffle 100: tf = 0.64, tg = 0.51, factor = 1.25x, gain = 20.28%
shuffle 1000: tf = 0.63, tg = 0.56, factor = 1.14x, gain = 11.94%
shuffle 10000: tf = 0.64, tg = 0.61, factor = 1.04x, gain = 4.30%
factor > 1 means Block_sort is faster that List.sort. Remark that in the case of
sorted list, my algorithm returns the original list using constant memory.
Cheers,
Christophe
PS: I probably will try something much more like Timsort on arrays later ...
> Cheers
> --
> Mário