Engineering a Sort Function
Bentley and McIlroy, 1993

Jim Royer
CIS 351
February 4, 2019

From: Software Practice and Experience, Vol. 23 (1993) 1249–1265.
http://www.skidmore.edu/~meckmann/2009Spring/cs206/papers/spe862jb.pdf


Engineering a Sort Function
JON L. BENTLEY, M. DOUGLAS McILROY
AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, U.S.A.

SUMMARY
We recount the history of a new qsort function for a C library. Our function is clearer, faster and more robust than existing sorts. It chooses partitioning elements by a new sampling scheme; it partitions by a novel solution to Dijkstra’s Dutch National Flag problem; and it swaps efficiently. Its behavior was assessed with timing and debugging testbeds, and with a program to certify performance. The design techniques apply in domains beyond sorting.


Why rewrite Unix’s quicksort?

In ancient days of yore (≈ 1991):

  The old quicksort (qsort) had been in use for ≈ 20 years and was stable and usually fast.

  A colleague found that qsort ran in Θ(n²) time on inputs with certain structures, e.g., on pipe-organ arrays of 2n integers: 1,2,3,4,...,n,n,...,4,3,2,1.

  They found that all the then competitors of qsort could also be driven to Θ(n²) on certain reasonable inputs.

  “Users complain when easy inputs don’t sort quickly.”

  So it was time for a new systems-level quicksort.


Bentley and McIlroy’s version of CLRS’s Quicksort

    void iqsort0(int *a, int n)
    {
        int i, j;
        if (n <= 1) return;
        for (i = 1, j = 0; i < n; i++)
            if (a[i] < a[0])
                swap(++j, i, a);
        swap(0, j, a);
        iqsort0(a, j);
        iqsort0(a+j+1, n-j-1);
    }

Program 2. A toy Quicksort, unfit for general use

On nearly sorted arrays, the above makes ≈ n²/2 comparisons!
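The n²/2 figure can be checked by instrumenting Program 2. The sketch below is not from the slides: the counter `ncmp`, the helper `swap_int`, and the driver `comparisons_on_sorted` are hypothetical names added for illustration. On an already sorted array the pivot a[0] is always the minimum, so each call peels off one element and the total is (n-1) + (n-2) + ... + 1 = n(n-1)/2 comparisons.

```c
#include <assert.h>

/* Hypothetical global counter of element comparisons (not in the slides). */
static long ncmp = 0;

/* Exchange a[i] and a[j]; same argument order as the slide's swap(i, j, a). */
static void swap_int(int i, int j, int *a)
{
    int t = a[i];
    a[i] = a[j];
    a[j] = t;
}

/* Program 2 with one added line that counts comparisons against the pivot. */
void iqsort0_counted(int *a, int n)
{
    int i, j;
    if (n <= 1) return;
    for (i = 1, j = 0; i < n; i++) {
        ncmp++;
        if (a[i] < a[0])
            swap_int(++j, i, a);
    }
    swap_int(0, j, a);
    iqsort0_counted(a, j);
    iqsort0_counted(a + j + 1, n - j - 1);
}

/* Sort the already sorted array 0, 1, ..., n-1 and report the comparisons used. */
long comparisons_on_sorted(int n)
{
    enum { MAXN = 1000 };   /* small bound: recursion depth is n on this input */
    int a[MAXN];
    assert(n >= 0 && n <= MAXN);
    for (int i = 0; i < n; i++)
        a[i] = i;
    ncmp = 0;
    iqsort0_counted(a, n);
    return ncmp;
}
```

For n = 100 the count is 100·99/2 = 4950, against roughly 1.386 · 100 · log₂ 100 ≈ 921 expected when the pivot is chosen at random.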
The qsort interface (for the moment)

    void qsort(char *a, int n, int es, int (*cmp)());

Parameters

    *a  = the array’s starting location
    n   = the number of elements
    es  = the size (in bytes) of each element
    cmp = the comparison function

For “char*” think “byte addresses.”


Insertion sort using the qsort interface

    void isort(char *a, int n, int es, int (*cmp)())
    {
        char *pi, *pj;
        for (pi = a + es; pi < a + n*es; pi += es)
            for (pj = pi; pj > a && cmp(pj-es, pj) > 0; pj -= es)
                swap(pj, pj-es, es);
    }

The function swap(i, j, n) interchanges the n-byte fields pointed to by i and j.

A simple, straightforward, and troublesome swap

    void swap(char *i, char *j, int n)
    {
        do {
            char c = *i;
            *i++ = *j;
            *j++ = c;
        } while (--n > 0);
    }


Bentley and McIlroy’s starting quicksort

    void qsort1(char *a, int n, int es, int (*cmp)())
    {
        int j;
        char *pi, *pj, *pn;
        if (n <= 1) return;
        pi = a + (rand() % n) * es;
        swap(a, pi, es);
        pi = a;
        pj = pn = a + n * es;
        for (;;) {
            do pi += es; while (pi < pn && cmp(pi, a) < 0);
            do pj -= es; while (cmp(pj, a) > 0);
            if (pj < pi) break;
            swap(pi, pj, es);
        }
        swap(a, pj, es);
        j = (pj - a) / es;
        qsort1(a, j, es, cmp);
        qsort1(a + (j+1)*es, n-j-1, es, cmp);
    }

Program 4. A simple qsort


qsort1’s invariants

Partitioning is around the element a[0], which we abbreviate as T.

As the partition process is running:

    |  T  |  ≤ T  |   ?   |  ≥ T  |
       0           i     j     n-1

When the partitioning is done:

    |  ≤ T  |  T  |  ≥ T  |
       0       j   i    n-1

The sort then recurs on the subarrays a[0..j-1] and a[j+1..n-1].
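To make the byte-address interface concrete, here is a minimal sketch of the insertion sort above written with modern prototypes. It assumes the standard `const void *` comparator signature in place of the slides' unprototyped `int (*cmp)()`; the names `swap_bytes`, `isort_bytes`, and `int_compare` are illustrative and do not come from the paper.

```c
#include <assert.h>
#include <stddef.h>

/* Exchange two n-byte fields one byte at a time, as in the slide's swap. */
void swap_bytes(char *i, char *j, size_t n)
{
    assert(n > 0);
    do {
        char c = *i;
        *i++ = *j;
        *j++ = c;
    } while (--n > 0);
}

/* Assumed comparator for int elements: negative, zero, or positive result. */
int int_compare(const void *p, const void *q)
{
    int x = *(const int *)p;
    int y = *(const int *)q;
    return (x > y) - (x < y);
}

/* Insertion sort of n elements, each es bytes wide, starting at byte address a. */
void isort_bytes(char *a, size_t n, size_t es,
                 int (*cmp)(const void *, const void *))
{
    char *pi, *pj;
    if (n < 2) return;                       /* nothing to do */
    for (pi = a + es; pi < a + n * es; pi += es)
        /* Sift the element at pi left until its predecessor is not larger. */
        for (pj = pi; pj > a && cmp(pj - es, pj) > 0; pj -= es)
            swap_bytes(pj, pj - es, es);
}
```

A caller with an `int v[6]` would write `isort_bytes((char *)v, 6, sizeof v[0], int_compare);`. The element size is known only at run time, which is why every exchange goes through the byte-by-byte loop that the slides call troublesome.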
Cost of basic operations (on a VAX 8550)

Table I

    Operation                           CPU Microseconds
    Int Operations
      i1 = i2 + i3                       0.20
      i1 = i2 - i3                       0.20
    Pointer Operations
      p1 -= es                           0.17
      p1 += es                           0.16
    Control Structures
      if (p1 == p2) i1++                 0.32
      while (p1 < p2) i1++               0.26
    Comparison Functions
      i1 = intcomp(&i2, &i3)             2.37
      i1 = floatcomp(&f2, &f3)           3.67
      i1 = dblcomp(&d2, &d3)             3.90
      i1 = strcmp(s2, s3)                8.74
    Swap Functions
      swap(p1, p2, 4)                   11.50
      t = *i1, *i1 = *i2, *i2 = t        0.84

On a modern CPU the actual times will be vastly smaller. My guess is that the proportions are roughly the same.

Note that under “Swap Functions” both lines are about swapping 4-byte ints.

MIX (standard cost models): overhead ≈ comparisons < swaps

qsort (what is going on here): overhead < swaps* < comparisons
    * done right


Getting rid of the randomness: Median of three

(Systems folk do not care for randomized algorithms.)

C_n = expected number of comparisons for a size-n input

    when pivoting around a random element:                C_n ≈ 1.386 n log₂ n
    when pivoting around the median of 3 random elements: C_n ≈ 1.188 n log₂ n

Program 5 makes 8/3 comparisons (on average).

[Figure: decision tree for the median of three. The root compares a:b, each branch then compares b:c, and a:c is compared where still needed. Leaves: abc, acb, cab, bac, bca, cba.]

    static char *med3(char *a, char *b, char *c, int (*cmp)())
    {   return cmp(a, b) < 0 ?
              (cmp(b, c) < 0 ? b : cmp(a, c) < 0 ? c : a)
            : (cmp(b, c) > 0 ? b : cmp(a, c) > 0 ? c : a);
    }

Program 5. Decision tree and program for median of three


Getting rid of the randomness: Ninther

ninther = a median of three medians, at most 12 comparisons

    pm = a + (n/2)*es;          /* Small arrays, middle element */
    if (n > 7) {
        pl = a;
        pn = a + (n-1)*es;
        if (n > 40) {           /* Big arrays, pseudomedian of 9 */
            s = (n/8)*es;
            pl = med3(pl, pl+s, pl+2*s, cmp);
            pm = med3(pm-s, pm, pm+s, cmp);
            pn = med3(pn-2*s, pn-s, pn, cmp);
        }
        pm = med3(pl, pm, pn, cmp); /* Mid-size, med of 3 */
    }

Results of experiments on a modified Program 2:

    Behaved well on non-random inputs
    On random input arrays C_n ≈ 1.362 n log₂ n − 1.41 n


Take advantage of repeated elements

During partitioning:

    |  =  |  <  |  ?  |  >  |  =  |
           a     b   c     d

After partitioning:

    |  =  |  <  |  >  |  =  |
           a   c b     d

After copying:

    |  <  |  =  |  >  |

    void iqsort2(int *x, int n)
    {
        int a, b, c, d, l, h, s, v;
        if (n <= 1) return;
        v = x[rand() % n];
        a = b = 0;
        c = d = n-1;
        for (;;) {
            while (b <= c && x[b] <= v) {
                if (x[b] == v) iswap(a++, b, x);
                b++;
            }
            while (c >= b && x[c] >= v) {
                if (x[c] == v) iswap(d--, c, x);
                c--;
            }
            if (b > c) break;
            iswap(b++, c--, x);
        }
        s = min(a, b-a);
        for(l = 0, h = b-s; s; s--) iswap(l++, h++, x);
        s = min(d-c, n-1-d);
        for(l = b, h = n-s; s; s--) iswap(l++, h++, x);
        iqsort2(x, b-a);
        iqsort2(x + n-(d-c), d-c);
    }

Program 6. An integer qsort with split-end partitioning
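Program 6 calls two helpers that the slides do not show. The sketch below supplies plausible definitions so the split-end partitioning can be compiled and tried; `iswap` is assumed to exchange x[i] and x[j], and `imin` is a stand-in name for the slide's `min`. The comments spell out the invariant drawn in the pictures above.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed helper: exchange x[i] and x[j]. */
static void iswap(int i, int j, int *x)
{
    int t = x[i];
    x[i] = x[j];
    x[j] = t;
}

/* Assumed helper standing in for the slide's min(). */
static int imin(int p, int q)
{
    return p < q ? p : q;
}

void iqsort2(int *x, int n)
{
    int a, b, c, d, l, h, s, v;
    assert(n >= 0);
    if (n <= 1) return;
    v = x[rand() % n];                 /* random pivot value */
    a = b = 0;
    c = d = n - 1;
    /* Invariant: x[0..a-1] == v, x[a..b-1] < v, x[c+1..d] > v, x[d+1..n-1] == v */
    for (;;) {
        while (b <= c && x[b] <= v) {
            if (x[b] == v) iswap(a++, b, x);   /* park equal keys at the left end */
            b++;
        }
        while (c >= b && x[c] >= v) {
            if (x[c] == v) iswap(d--, c, x);   /* park equal keys at the right end */
            c--;
        }
        if (b > c) break;
        iswap(b++, c--, x);
    }
    /* Copy the equal keys from both ends into the middle: < | = | > */
    s = imin(a, b - a);
    for (l = 0, h = b - s; s; s--) iswap(l++, h++, x);
    s = imin(d - c, n - 1 - d);
    for (l = b, h = n - s; s; s--) iswap(l++, h++, x);
    iqsort2(x, b - a);                 /* the keys less than v */
    iqsort2(x + n - (d - c), d - c);   /* the keys greater than v */
}
```

Because every key equal to the pivot ends up in the middle block and is excluded from both recursive calls, an array of identical keys is finished after a single partitioning pass.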