Harvard Mark I 1940s

Tuesday, March 2, 2021

Variations om mergesort. Part I. MIT Scheme

I'll be demonstrating some sample code for variations on the merge sort algorithm in various computer coding systems. The breakdown into parts will be the following.
Part I. Scheme using MIT Scheme.
Part II. Common Lisp using CLISP
Part III. Python 2.7.18
Part IV. C++ using clang, gnu g++ (or maybe even also cl on Windows)
Part V. Fortran using gnu gfortran.

The approach I shall take in the Scheme version is to break up an unsorted list into its already sequentially ordered sublists. These sublists are often referred to as 'runs'. Given an unsorted list
such as
(-9901 -588 4611 19329 -15665 -2951 -28140 -21067 -30182 26262 12038 18460 11128 17996 19257 -28634 31310 -32161 -17620 -12520)
one can partition such a list into its subsections that are already in ascending, or descending, order.
In this case such a partition into ascending order sublists would yield
(
(-9901 -588 4611 19329) 
(-15665 -2951) 
(-28140 -21067) 
(-30182 26262) 
(12038 18460) 
(11128 17996 19257)
(-28634 31310)
(-32161 -17620 -12520)
)

I do this with the Scheme function extract-runs below.

(define (extract-runs ls fn)
  (define (iter in run out)
    (cond ((null? (cdr in))
           (if (null? run)
             (cons (list (car in)) out)
             (cons (cons (car in) run) out)))
          ((fn (car in) (cadr in))
           (iter (cdr in) (cons (car in) run) out))
          (#t
           (iter (cdr in) '() (cons (cons (car in) run) out)))))
  (reverse! (map reverse! (iter ls '() '()))))

Scheme is very friendly to functions as first class citizens.

And since an operator such < or >= is treated as a function in scheme this makes it easy to write functions that can deal with ascending or descending order. For example, at the command line in GNU MIT scheme one can type 
> (extract-runs ls <=)
given a list ls to generate its partition into ascending runs similar to how the built in sort function works, i.e.
> (sort ls <=)
These 'runs' become the raw material for further merge operations eventually leading to the building up of a sorted list. This is done from the bottom up so to speak as the successive runs become longer, and their quantity less numerous.
Merging sorted runs 2 at a time, and leaving any dangling odd at the end to be dealt with in some successive iteration of merge operations on the raw material above yields.
(
(-15655 -9901 -2951 -588 4611 19329)
(-30182 -28140 -21067 26262)
(11128 12038 17996 18460 19257)
(-32161 -28634 -17620 -12520 31310)
)
And so on until we are left with only one 'run'.
(
(-32161 -30182 -28634 -28140 -21067 -17620 -15665 -12520 -9901 -2951 -588 4611 11128 12038 17996 18460 19257 19329 26262 31310)
)
The car (first list) of our list of runs has become the sorted list we wanted to build and so we return it from the merge-runs function.

(define (merge-runs x y fn)
  (define (iter x y z)
    (cond 
          ((null? x)
           (reverse! (append! (reverse! y) z)))
          ((null? y)
           (reverse! (append! (reverse! x) z)))
          ((fn (car x) (car y))
           (iter (cdr x) y (cons (car x) z)))
          (#t
           (iter x (cdr y) (cons (car y) z)))))
  (iter x y '()))

(define (merge-sort ls fn)
  (define (aux in out)
    (cond ((and (null? in) (equal? (length out) 1))
           (car out))
          ((and (equal? (length in) 1) (null? out))
           (car in))
          ((null? in)
           (aux out '()))
          ((null? (cdr in))
           (aux (cons (merge-runs (car in) (car out) fn) (cdr out)) '()))
          (#t
           (aux (cddr in) (cons (merge-runs (car in) (cadr in) fn) out)))))
  (aux (extract-runs ls fn) '()))

There are some who say that Scheme is an elegant programming language conducive to a functional
paradigm and tail call optimized recursive procedures. I have found it to be so. Lexical scope is another thing that is often clarified as one attempts to program using Scheme. One can use the following to generate random lists to use as the raw material for testing the above procedures.

(define (random-integer) (- (random 65536) 32768))
(define (random-real) (random 1.0))

(define (random-list n f)
  (define (iter k ls)
    (if (equal? k 0)
      ls
      (iter (- k 1) (cons (f) ls))))
  (iter n '()))

Topics touched upon, but not really covered in this essay include the following.
Scheme programming language.

Merge sort algorithm.

Tail call optimization.

Structure and Interpretation of Computer Programming. An all time great computer science text, 

and set of associated lectures.

Variations om mergesort. Part I. MIT Scheme

I'll be demonstrating some sample code for variations on the merge sort algorithm in various computer coding systems. The breakdown into...