From: Christophe Raffalli <christophe@raffalli.eu> To: Mario Pereira <mjp.pereira@fct.unl.pt> Cc: caml-list@inria.fr Subject: Re: [Caml-list] O(n ln k) sorting for ocaml on github and a challenge Date: Sat, 9 Oct 2021 09:59:10 -1000 Message-ID: <20211009195910.mqivg4tsafiyb3tr@oulala> (raw) In-Reply-To: <CAJ98AgmGGR-JaciOfMB=9wOt5im4qPg--EBrt5AJOvBg7b9Niw@mail.gmail.com> [-- Attachment #1: Type: text/plain, Size: 2907 bytes --] On 21-10-09 09:58:12, Mario Pereira wrote: > Hi Christophe, > > Your implementation reminds me very much of TimSort (for an OCaml > implementation, see for instance https://github.com/LesBoloss-es/sorting/blob/ > master/src/list/timsort.ml). This is also a stable algorithm. Hi Mário, Yes the idea is similar, but - I use a reference to the list for the sorted/reverse-sorted block I build in first phase (like Timsort on arrays, which actually is stable, in place and as good complexity, but you have to get the balancing right, see below) - in the second phase, I use a merge, returning sorted list at even depth and rev-sorted at odd depth (like OCaml's List.sort). This avoid the Asc/Desc constructor and the List.rev. - the balancement is ensured by a simple arithmetic computation, not by a stack and an invariant on sizes (probably building a balanced binary tree with the initial block instead of a list could be a bit better, I am not sure). - Last but not least: the code you point seems broken somewhere, it does not ends on large lists (probably quadratic because the balancing with size is probably wrong. cf the paragraph "bug" on wikipedia about Timsort. Anyway, the code you point is probably optimised for arrays. > Was TimSort an inspiration? The inspiration was a paper I read long ago about the O(n ln n) not being the best we can do for lists, as O(n ln k) can be achieved where k is the number of change of direction. My work aims at being a replacement for OCaml sort (for list currently). Here is the timing I get: Correctness test passed Stability test passed Random lists: random: tf = 1.44, tg = 1.39, factor = 1.04x, gain = 3.70% random small: tf = 1.10, tg = 1.19, factor = 0.92x, gain = -8.27% Worst cases: worst1: tf = 0.83, tg = 0.71, factor = 1.17x, gain = 14.21% worst2: tf = 0.65, tg = 0.70, factor = 0.93x, gain = -7.17% Sorted (partially) lists: sorted: tf = 0.65, tg = 0.01, factor = 97.86x, gain = 98.98% reversed: tf = 0.65, tg = 0.09, factor = 7.62x, gain = 86.88% sorted@rev: tf = 0.65, tg = 0.21, factor = 3.14x, gain = 68.19% rev@sorted: tf = 0.65, tg = 0.21, factor = 3.13x, gain = 68.03% Shuffled lists (permute k times 2 elements in a sorted list): shuffle 10: tf = 0.66, tg = 0.41, factor = 1.61x, gain = 38.01% shuffle 100: tf = 0.64, tg = 0.51, factor = 1.25x, gain = 20.28% shuffle 1000: tf = 0.63, tg = 0.56, factor = 1.14x, gain = 11.94% shuffle 10000: tf = 0.64, tg = 0.61, factor = 1.04x, gain = 4.30% factor > 1 means Block_sort is faster that List.sort. Remark that in the case of sorted list, my algorithm returns the original list using constant memory. Cheers, Christophe PS: I probably will try something much more like Timsort on arrays later ... > Cheers > -- > Mário [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --]

next prev parent reply indexThread overview:6+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-10-09 4:05 Christophe Raffalli 2021-10-09 8:58 ` Mario Pereira2021-10-09 19:59 ` Christophe Raffalli [this message]2021-10-11 11:46 ` Niols 2021-10-11 21:27 ` Mario Pereira 2021-10-22 4:44 ` Kazuhiko Sakaguchi

Reply instructions:You may reply publically to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the--to,--cc, and--in-reply-toswitches of git-send-email(1): git send-email \ --in-reply-to=20211009195910.mqivg4tsafiyb3tr@oulala \ --to=christophe@raffalli.eu \ --cc=caml-list@inria.fr \ --cc=mjp.pereira@fct.unl.pt \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting theIn-Reply-Toheader via mailto: links, try the mailto: link

caml-list - the Caml user's mailing list Archives are clonable: git clone --mirror https://inbox.ocaml.org/caml-list AGPL code for this site: git clone https://public-inbox.org/ public-inbox