Tag Archives: quicksort

Closures in C#: Funcs

Since I’m trying to learn Clojure, and the name of the language is a variant on the term ‘closure’ (with the ‘j’ inserted since it’s based on Java, I assume), it seems appropriate that I should know what, exactly, a closure is. I’ve poked around the web a fair bit, but I couldn’t find a nice, clear (to me, anyway) explanation of what a closure is. I’ve patched together a bit of an understanding from a few places that posted some examples, so I’ll try to produce something that makes sense to me at this point. Hopefully any closure experts who chance on this page will correct me if my understanding is faulty.

Basically, a closure is a function that is created within a program (rather than being hard-coded before the program is compiled), and that contains the functionality and the data that were current when the function was created. In other words, a closure is a function that captures, or closes over, its environment at the time it was created, and preserves that environment throughout its lifetime.

As I haven’t delved deeply enough into Clojure itself yet, I’ll try to illustrate a closure by using some examples from a language with which I am more familiar: C#. Apologies if this isn’t a language that you know, but I had to start somewhere.

C# contains a feature called a delegate. Most C# programmers will be familiar with delegates as things that are used to handle events, but in fact they have many more applications. I’ll give a quick recap on delegates here (I’ll try to put up a more complete post on them soon). A delegate, as its name implies, delegates work to another function or functions. To declare a delegate, we must specify the signature of the type of function that it can deal with.

A typical delegate declaration is something like this:

public delegate int ArithOperation(int num1, int num2);

This declaration is for a delegate with the name ArithOperation and states that this delegate can refer to functions with two int arguments and an int return type. Note that ArithOperation is not the name of a function; it is the name of the delegate, and is a declared data type.

This delegate can be attached to an ordinary function of the required signature. For example, if we had a function:

  public int Plus(int num1, int num2)
  {
    return num1 + num2;
  }
 

we can then initialize the delegate by writing:

ArithOperation arithOp = new ArithOperation(Plus);

Having made this connection, we can then use the delegate by writing something like this:

int sum = arithOp(3, 4);

The arithOp delegate will pass its arguments along to the Plus() function, which will calculate the sum and return it via the delegate, so that sum becomes 7.

Now all of this might look very clumsy, since we could obviously have just called Plus() directly without using the delegate. However, the power of delegates begins to become apparent when we realize that a delegate can be passed as an argument to a function, whereas a raw function name like Plus cannot. Thus the C# delegate provides a way of passing functions around a program just like any other data types.

To illustrate this, we need a program that’s a bit more involved. A common operation is that of sorting data, and a popular sorting algorithm is quicksort. Don’t worry if you don’t know the quicksort algorithm; I won’t be looking in detail at how it works here. All you need to realize is that in any sorting algorithm, we must have a way of ordering the data, so we need a meaning for the ‘less than’ operation. Typically, data come in various forms (not always as bare numbers), and a program cannot anticipate all the types of data it may be required to sort. Thus the ability to define a ‘lessThan’ function on the fly to cope with some new data type that we wish to sort would be quite useful.

First, we’ll give the code for the quicksort algorithm (again, you don’t need to work through the whole thing to follow what I’m going to say):

  public class Quicksort<T>
  {
    public static List<T> Sort(List<T> data, int left, int right, Func<T, T, bool> lessThan)
    {
      int i = left;
      int j = right;
      int pivotValue = (left + right) / 2;
      T x = data[pivotValue];
      T w;
      while (i <= j)
      {
        while (lessThan(data[i], x))
        {
          i++;
        }
        while (lessThan(x, data[j]))
        {
          j--;
        }
        if (i <= j)
        {
          w = data[i];
          data[i++] = data[j];
          data[j--] = w;
        }
      }
      if (left < j)
      {
        Sort(data, left, j, lessThan);
      }
      if (i < right)
      {
        Sort(data, i, right, lessThan);
      }
      return data;
    }
  }

This code uses C#’s generic data types to allow it to sort any type of data. On line 1, the Quicksort class takes a generic data type T, which is the type of data we want to sort. There is only one method in this class, which performs the quicksort algorithm. Line 3 begins the Sort function, which takes a List (declared as a generic type, also containing elements of type T), two ints (left and right) which indicate which portion of the list to sort (these would be set to the beginning and end of the list on the first pass), and a curious object called Func<T, T, bool> lessThan. As you might guess from its name, this is the lessThan function that will be used to determine the order of two data elements in the list.

Func<T, T, bool> is a pre-defined delegate available with later versions of C#. The signature of the functions that it can connect to must have two arguments, both of type T in this case, and a return type of bool. (There are actually 17 varieties of Func allowing for functions with from zero to 16 arguments, so if your program needs a different number of arguments, just use the corresponding version of Func. If you have more than 16 arguments, you need to refactor your code(!))

Having a delegate as a parameter to the Sort() method means we can choose the lessThan function somewhere else and pass it in as a parameter to the program. (If you’re interested in the innards of the quicksort algorithm, you’ll note that the lessThan function is used on lines 12 and 16. Quicksort is recursive, so the same function must be passed along to subsequent calls to Sort() on lines 29 and 33.)

So how do we create a custom lessThan function? Let’s look at the other class in this program.

  class FuncTest
  {
    static Func<string, string, bool> selectFunc(int choice)
    {
      switch (choice)
      {
        case 1:
          return delegate(string s1, string s2)
          {
            if (s1.Length < s2.Length)
              return true;
            else if (s1.Length > s2.Length)
              return false;
            else
            {
              if (s1.CompareTo(s2) < 0)
                return true;
              return false;
            }
          };
        case 2:
          return delegate(string s1, string s2)
          {
            return s1.CompareTo(s2) < 0;
          };
        default:
          return null;
      }
    }

    static void Main(string[] args)
    {
      List<string> wordList = new List<string>();
      string[] test = "the quick brown fox jumped over the lazy dog".Split(new char[] { ' ' });
      for (int word = 0; word < test.Length; word++)
      {
        wordList.Add(test[word]);
      }

      Console.Write("Sort by size (1) or alphabetical order (2): ");
      int choice = int.Parse(Console.ReadLine());

      Func<string, string, bool> lessThan = selectFunc(choice);
      if (lessThan == null)
      {
        Console.WriteLine("Quitting");
        Environment.Exit(0);
      }
      List<string> sortedList = Quicksort<string>.Sort(wordList, 0, wordList.Count - 1, lessThan);
      for (int word = 0; word < sortedList.Count; word++)
      {
        Console.WriteLine(sortedList[word]);
      }
    }
  }

Let’s start with the Main() function on line 31. To test the program, we’ll use a List of strings, so we create this and populate it with some words on lines 33 to 38. Then we offer the user two options for sorting. First, the words can be sorted by size (so shorter words come before longer ones), or they can be sorted alphabetically. When the choice has been made, we call selectFunc() on line 43. This function will create the correct lessThan function and return it as a delegate to the function code.

Switch your attention now to selectFunc() on line 3. If the user wants the words sorted by size, we build the correct lessThan function on lines 8 through 20. We are constructing an anonymous function (it has no name). On line 8, we use the C# keyword delegate to create a delegate attached to the function code which follows. After writing delegate, we write out the function as usual, beginning with its argument list on line 8, then adding the rest of the code on lines 9 through 20 (not forgetting to put a semi-colon afterwards). The return type of the function is defined implicitly by the return statements in its definition.

If the user wanted alphabetical sorting, we build another delegate in the same way, on lines 22 to 25.

In case you’re wondering what all this has to do with closures, functions created in this way are closures. They are created within the running program, and capture the environment in which they were created. They can also (via their delegates) be passed as parameters to other functions. This is done back in Main() on line 49, where a call to the Sort() method is made, passing along the lessThan delegate we built earlier.

This example shows that closures allow the separation of the logic for accomplishing a given task from the rest of the code. If we wanted to sort other types of data (for example, database records consisting of several data fields, each of a different data type) we can write another function that determines the lessThan condition for such data, and pass that via its delegate into the Sort() function without having to rewrite the sorting routine.

One thing this example hasn’t shown, however, is how a closure captures data that is current when the closure (the function) is defined. But this post has already reached a fair size, so I’ll leave that till the next post.

Code for this post available here.

Advertisements