LINQ: SelectMany

The LINQ Select command that we’ve used so far returns a single object for each input object that is passed to it. In some cases, we would like to return several objects (or sometimes no objects at all) for a given input.

As an example, remember the list of Canada’s prime ministers that we’ve been using. Suppose we construct another list, this time of the terms of office of each prime minister. We use the id number of the prime minister to label each term of office, and then give the start and end dates of each term of office. Since some prime ministers served more than one term, there are multiple entries for some ids.

The class defining the terms of office is

  class Terms
  {
    public int id;
    public DateTime start, end;

    public static ArrayList GetTermsArrayList()
    {
      ArrayList terms = new ArrayList();

      terms.Add(new Terms { id = 1, start = DateTime.Parse("1867/7/1"), end = DateTime.Parse("1873/11/5") });
      terms.Add(new Terms { id = 1, start = DateTime.Parse("1878/10/17"), end = DateTime.Parse("1891/6/6") });
      terms.Add(new Terms { id = 2, start = DateTime.Parse("1873/11/7"), end = DateTime.Parse("1878/10/8") });
      terms.Add(new Terms { id = 3, start = DateTime.Parse("1891/6/16"), end = DateTime.Parse("1892/11/24") });
      terms.Add(new Terms { id = 4, start = DateTime.Parse("1892/12/5"), end = DateTime.Parse("1894/12/12") });
      terms.Add(new Terms { id = 5, start = DateTime.Parse("1894/12/21"), end = DateTime.Parse("1896/4/27") });
      terms.Add(new Terms { id = 6, start = DateTime.Parse("1896/5/1"), end = DateTime.Parse("1896/7/8") });
      terms.Add(new Terms { id = 7, start = DateTime.Parse("1896/7/11"), end = DateTime.Parse("1911/10/6") });
      terms.Add(new Terms { id = 8, start = DateTime.Parse("1911/10/10"), end = DateTime.Parse("1920/7/10") });
      terms.Add(new Terms { id = 9, start = DateTime.Parse("1920/7/10"), end = DateTime.Parse("1921/12/29") });
      terms.Add(new Terms { id = 9, start = DateTime.Parse("1926/6/29"), end = DateTime.Parse("1926/9/25") });
      terms.Add(new Terms { id = 10, start = DateTime.Parse("1921/12/29"), end = DateTime.Parse("1926/6/28") });
      terms.Add(new Terms { id = 10, start = DateTime.Parse("1926/9/25"), end = DateTime.Parse("1930/8/7") });
      terms.Add(new Terms { id = 10, start = DateTime.Parse("1935/10/23"), end = DateTime.Parse("1948/11/15") });
      terms.Add(new Terms { id = 11, start = DateTime.Parse("1930/8/7"), end = DateTime.Parse("1935/10/23") });
      terms.Add(new Terms { id = 12, start = DateTime.Parse("1948/11/15"), end = DateTime.Parse("1957/6/21") });
      terms.Add(new Terms { id = 13, start = DateTime.Parse("1957/6/21"), end = DateTime.Parse("1963/4/22") });
      terms.Add(new Terms { id = 14, start = DateTime.Parse("1963/4/22"), end = DateTime.Parse("1968/4/20") });
      terms.Add(new Terms { id = 15, start = DateTime.Parse("1968/4/20"), end = DateTime.Parse("1979/6/3") });
      terms.Add(new Terms { id = 15, start = DateTime.Parse("1980/3/3"), end = DateTime.Parse("1984/6/29") });
      terms.Add(new Terms { id = 16, start = DateTime.Parse("1979/6/4"), end = DateTime.Parse("1980/3/2") });
      terms.Add(new Terms { id = 17, start = DateTime.Parse("1984/6/30"), end = DateTime.Parse("1984/9/16") });
      terms.Add(new Terms { id = 18, start = DateTime.Parse("1984/9/17"), end = DateTime.Parse("1993/6/24") });
      terms.Add(new Terms { id = 19, start = DateTime.Parse("1993/6/25"), end = DateTime.Parse("1993/11/3") });
      terms.Add(new Terms { id = 20, start = DateTime.Parse("1993/11/4"), end = DateTime.Parse("2003/12/11") });
      terms.Add(new Terms { id = 21, start = DateTime.Parse("2003/12/12"), end = DateTime.Parse("2006/2/5") });
      terms.Add(new Terms { id = 22, start = DateTime.Parse("2006/2/6"), end = DateTime.Now });

      return terms;
    }

    public override string ToString()
    {
      return id + ". " + start.ToString("ddd dd MMM yyyy") + " - " + end.ToString("ddd dd MMM yyyy");
    }

    public static Terms[] GetTermsArray()
    {
      return (Terms[])GetTermsArrayList().ToArray(typeof(Terms));
    }
  }

Now, how do we construct a query that will return the terms of office for each prime minister, and label each term with the PM’s name rather than just the id number?

If you’re familiar with using a join in SQL you’ll probably be thinking this is a ‘join’ problem and you’d be right. There is a join clause in LINQ which we’ll get to in due course, but for now, we’ll look at another way of doing it.

The point here is that, since some PMs have more than one term, in some cases we’ll need to return more than one result for a given id. Thus a simple ‘select’ won’t work. LINQ provides the SelectMany() operator for such a purpose.

SelectMany() works like Select() in that it takes a single object as input and returns a result. However, instead of a single object as its output, it returns an IEnumerable<T> sequence which can contain any number (even zero) of objects. It’s easiest to see how it works with an example. Here’s the query that gives the result we wanted above:

      PrimeMinisters[] primeMinisters = PrimeMinisters.GetPrimeMinistersArray();
      Terms[] terms = Terms.GetTermsArray();
      var pmList10 = primeMinisters
        .SelectMany(pm => terms
          .Where(term => term.id == pm.id)
          .Select(term => new
          {
            surname = pm.lastName,
            inOffice = term
          }));
      foreach (var pmTerm in pmList10)
      {
        Console.WriteLine(pmTerm.surname + ": {0:dd MMM yyyy} to {1:dd MMM yyyy}",
          pmTerm.inOffice.start, pmTerm.inOffice.end);
      }

We extract the two data structures from their corresponding classes as usual. Now look at the call to SelectMany(). Since it’s called from primeMinisters, each object fed to it as input is of data type PrimeMinisters. We want to return a list of terms for that PM as output.

To do this, we start with the ‘terms’ sequence and call a Where() clause on it. Since this Where() is inside the SelectMany() call, it has access to the pm object passed to it, as well as the current ‘term’ object from the terms sequence. The Where() tests for the condition that the id field in the term object is equal to the id field in the pm object. This is effectively a join between the two data sources, since we are selecting only those elements from the terms sequence whose id matches the current pm object.

Once we’ve applied this filter, we call Select() (the ordinary Select(), since we need return only a single item for any given pair of pm and term objects) to construct an anonymous object consisting of the current PM’s last name and the entire ‘term’ object, which contains the start and end dates for the term.

Thus for each pm object passed to SelectMany(), a sequence of results (the terms corresponding to that pm) is constructed, so that’s what SelectMany() returns. The results of all the calls to SelectMany() are concatenated together to form the final output which is stored in pmList10.

This time, rather than print out the raw anonymous object, we’ve selected the fields we want and formatted them a bit more nicely than before.

The results of this code are:

Macdonald: 01 Jul 1867 to 05 Nov 1873
Macdonald: 17 Oct 1878 to 06 Jun 1891
Mackenzie: 07 Nov 1873 to 08 Oct 1878
Abbott: 16 Jun 1891 to 24 Nov 1892
Thompson: 05 Dec 1892 to 12 Dec 1894
Bowell: 21 Dec 1894 to 27 Apr 1896
Tupper: 01 May 1896 to 08 Jul 1896
Laurier: 11 Jul 1896 to 06 Oct 1911
Borden: 10 Oct 1911 to 10 Jul 1920
Meighen: 10 Jul 1920 to 29 Dec 1921
Meighen: 29 Jun 1926 to 25 Sep 1926
Mackenzie King: 29 Dec 1921 to 28 Jun 1926
Mackenzie King: 25 Sep 1926 to 07 Aug 1930
Mackenzie King: 23 Oct 1935 to 15 Nov 1948
Bennett: 07 Aug 1930 to 23 Oct 1935
St. Laurent: 15 Nov 1948 to 21 Jun 1957
Diefenbaker: 21 Jun 1957 to 22 Apr 1963
Pearson: 22 Apr 1963 to 20 Apr 1968
Trudeau: 20 Apr 1968 to 03 Jun 1979
Trudeau: 03 Mar 1980 to 29 Jun 1984
Clark: 04 Jun 1979 to 02 Mar 1980
Turner: 30 Jun 1984 to 16 Sep 1984
Mulroney: 17 Sep 1984 to 24 Jun 1993
Campbell: 25 Jun 1993 to 03 Nov 1993
Chrétien: 04 Nov 1993 to 11 Dec 2003
Martin: 12 Dec 2003 to 05 Feb 2006
Harper: 06 Feb 2006 to 18 May 2012

SelectMany() is one of those commands that is available only in standard query operator form; there is no query expression version of it. However, it is still possible to write the same query using query expressions, so here it is:

      var pmList11 = from pm in primeMinisters
                     from term in terms
                     where term.id == pm.id
                     select new
                      {
                        surname = pm.lastName,
                        inOffice = term
                      };
      foreach (var pmTerm in pmList11)
      {
        Console.WriteLine(pmTerm.surname + ": {0:dd MMM yyyy} to {1:dd MMM yyyy}",
          pmTerm.inOffice.start, pmTerm.inOffice.end);
      }

Here, we create the two sequences primeMinisters and terms using two ‘from’ clauses, then apply the ‘where’ filter to them, followed by an ordinary ‘select’. The output is the same as before.

Advertisements
Post a comment or leave a trackback: Trackback URL.

Trackbacks

  • By LINQ Take and Skip « Programming tutorials on May 19, 2012 at 8:49 PM

    […] illustrate Take() with a few examples using our list of Canada’s prime ministers (see last post and links from there). First, a simple example showing how to return the first 10 prime […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: