LINQ: ToDictionary

Up till now, we’ve considered only the deferred standard query operators, which are not evaluated until their result is actually enumerated by, for example, running through the result in a foreach loop.

LINQ also has a number of non-deferred operators, which are evaluated at the point where they are called. The first of these we’ll look at is ToDictionary.
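To make the difference concrete, here is a minimal sketch that changes the source list after calling a deferred operator (Where) and a non-deferred one (ToDictionary). The class name, the method name and the small list of names are made up for illustration and are not part of the PrimeMinisters code used elsewhere in this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredVersusImmediate
{
  // Returns { items seen by the deferred query, entries in the dictionary }.
  public static int[] CountAfterSourceChanges()
  {
    // Hypothetical stand-in data.
    var names = new List<string> { "Macdonald", "Mackenzie" };

    // Deferred: nothing is evaluated until the query is enumerated.
    IEnumerable<string> deferred = names.Where(n => n.Length > 0);

    // Non-deferred: the dictionary is built on this line.
    Dictionary<string, string> immediate = names.ToDictionary(n => n);

    // Change the source after both calls have been made.
    names.Add("Abbott");

    return new[] { deferred.Count(), immediate.Count };
  }

  public static void Main()
  {
    int[] counts = CountAfterSourceChanges();
    Console.WriteLine("Deferred query sees {0} items", counts[0]);   // 3
    Console.WriteLine("Dictionary holds {0} items", counts[1]);      // 2
  }
}
```

The deferred query picks up the name added later because it only runs when Count() enumerates it, while the dictionary is a snapshot taken at the moment ToDictionary was called.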

C# has a built-in Dictionary data type, which is an implementation of a hash table. A hash table is essentially a glorified array, with the main difference being that any data type can be used as the array index or key. For example, if we wanted to store our list of Canadian prime ministers in a dictionary, we could use the integer ID we’ve assigned each prime minister as the key, or we could use the person’s last name, or even define some other data type from the components of a PrimeMinisters object. The one essential property is that each key must be unique, so that only one prime minister is stored for each key.
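The uniqueness requirement is enforced when the dictionary is built: ToDictionary throws an ArgumentException if two elements produce the same key. The sketch below shows this using first names as keys, since two of the early prime ministers are both called John. The anonymous-type array is a hypothetical stand-in for the PrimeMinisters array, and the class and method names are invented for the example.

```csharp
using System;
using System.Linq;

public static class DuplicateKeyDemo
{
  // Returns true if ToDictionary rejects the sequence because a key repeats.
  public static bool ThrowsOnDuplicateKey()
  {
    // Hypothetical stand-in data: both entries share the first name "John".
    var people = new[]
    {
      new { id = 1, firstName = "John", lastName = "Macdonald" },
      new { id = 3, firstName = "John", lastName = "Abbott" }
    };

    try
    {
      // firstName is not unique, so this key selector is a poor choice.
      people.ToDictionary(p => p.firstName);
      return false;
    }
    catch (ArgumentException)
    {
      return true;
    }
  }

  public static void Main()
  {
    Console.WriteLine(ThrowsOnDuplicateKey());  // True
  }
}
```

This is one reason the integer ID is a safer key than a name.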

LINQ allows a dictionary to be constructed from an IEnumerable<T> source, where T is the data type of the objects in the input sequence. The simplest version of ToDictionary allows only the key to be defined for each element in the input sequence. An example is

      PrimeMinisters[] primeMinisters = PrimeMinisters.GetPrimeMinistersArray();
      var pmDictionary01 = primeMinisters.ToDictionary(k => k.id);
      Console.WriteLine("----->pmDictionary01");
      foreach (int key in pmDictionary01.Keys)
      {
        Console.WriteLine("Prime minister with ID {0}: {1} {2}",
          key, pmDictionary01[key].firstName, pmDictionary01[key].lastName);
      }

ToDictionary() here takes a single argument, which is a lambda expression defining the key. The variable k is an element from the input sequence, and we’ve selected the ‘id’ field from that element to use as the key.

Once the dictionary is built, we use a foreach loop to run through its contents by selecting each key from the Keys property of the dictionary. We use array-like notation (square brackets) to reference an element in the dictionary. Each element in the dictionary is an object of type PrimeMinisters.

The output is:

----->pmDictionary01
Prime minister with ID 1: John Macdonald
Prime minister with ID 2: Alexander Mackenzie
Prime minister with ID 3: John Abbott
Prime minister with ID 4: John Thompson
Prime minister with ID 5: Mackenzie Bowell
Prime minister with ID 6: Charles Tupper
Prime minister with ID 7: Wilfrid Laurier
Prime minister with ID 8: Robert Borden
Prime minister with ID 9: Arthur Meighen
Prime minister with ID 10: William Mackenzie King
Prime minister with ID 11: Richard Bennett
Prime minister with ID 12: Louis St. Laurent
Prime minister with ID 13: John Diefenbaker
Prime minister with ID 14: Lester Pearson
Prime minister with ID 15: Pierre Trudeau
Prime minister with ID 16: Joe Clark
Prime minister with ID 17: John Turner
Prime minister with ID 18: Brian Mulroney
Prime minister with ID 19: Kim Campbell
Prime minister with ID 20: Jean Chrétien
Prime minister with ID 21: Paul Martin
Prime minister with ID 22: Stephen Harper
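As an alternative to looping over Keys and indexing back into the dictionary, we can enumerate the dictionary itself, which yields KeyValuePair objects holding both the key and the stored element. The following is a sketch only: the two-element anonymous-type array stands in for the PrimeMinisters array, and the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairIteration
{
  // Builds one line of text per dictionary entry.
  public static List<string> Describe()
  {
    // Hypothetical stand-in data with the same field names as in the text.
    var people = new[]
    {
      new { id = 1, firstName = "John", lastName = "Macdonald" },
      new { id = 2, firstName = "Alexander", lastName = "Mackenzie" }
    };
    var byId = people.ToDictionary(p => p.id);

    var lines = new List<string>();
    // Each pair carries the key and the value, so no second lookup is needed.
    foreach (var pair in byId)
    {
      lines.Add(string.Format("Prime minister with ID {0}: {1} {2}",
        pair.Key, pair.Value.firstName, pair.Value.lastName));
    }
    return lines;
  }

  public static void Main()
  {
    foreach (string line in Describe())
    {
      Console.WriteLine(line);
    }
  }
}
```

Note that Dictionary does not promise any particular enumeration order, so code that needs a sorted listing should sort the keys explicitly.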

There are three more variants of ToDictionary, each offering a bit more flexibility than the basic version.

The second variant allows the specification of a comparer class which can be used for defining the equality of objects used as keys. In the previous example, the default definition of equality was used; since the keys were ints, two keys were equal if they had the same numerical value.

However, it is possible to define keys to be equal based on any criterion we like. For example, if we stored the ID of each prime minister as a string instead of an int, then we could define two keys to be equal if their strings parsed to the same numerical value. This would allow the strings "12" and "00012" to be equal as keys, since the leading zeroes don’t change the numerical value.

To use this feature, we must first define a comparer class, in much the same way as we did when comparing the terms of office. The comparer class here is

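using System;
using System.Collections.Generic;

namespace LinqObjects01
{
  // Treats two string keys as equal when they parse to the same integer,
  // so "12" and "00012" identify the same dictionary entry.
  class IdKeyEqualityComparer : IEqualityComparer<string>
  {
    // Int32.Parse throws a FormatException if either key is not an integer.
    public bool Equals(string x, string y)
    {
      return Int32.Parse(x) == Int32.Parse(y);
    }

    // Hash the parsed value so that equal keys always share a hash code.
    public int GetHashCode(string obj)
    {
      return Int32.Parse(obj).GetHashCode();
    }
  }
}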

Remember that we need to implement IEqualityComparer<string> and provide an Equals() and GetHashCode() method. In Equals() we parse the two strings and define equality to be true if their numerical values are equal. GetHashCode() must return the same code for two objects that are considered equal, so we call GetHashCode() on the parsed int.
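A quick way to convince ourselves that the comparer honours this contract is to check both methods on a pair of keys. The sketch below repeats the comparer class so that it compiles on its own; the ComparerContractCheck class and its KeysMatch method are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Same comparer as in the text, repeated so the example is self-contained.
class IdKeyEqualityComparer : IEqualityComparer<string>
{
  public bool Equals(string x, string y)
  {
    return Int32.Parse(x) == Int32.Parse(y);
  }

  public int GetHashCode(string obj)
  {
    return Int32.Parse(obj).GetHashCode();
  }
}

public static class ComparerContractCheck
{
  // True only if the two keys are equal AND produce the same hash code.
  public static bool KeysMatch(string a, string b)
  {
    var comparer = new IdKeyEqualityComparer();
    return comparer.Equals(a, b)
      && comparer.GetHashCode(a) == comparer.GetHashCode(b);
  }

  public static void Main()
  {
    Console.WriteLine(KeysMatch("12", "00012"));  // True
    Console.WriteLine(KeysMatch("12", "13"));     // False
  }
}
```

If GetHashCode() were computed from the raw string instead, "12" and "00012" would usually land in different hash buckets and the dictionary would never get as far as calling Equals().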

With this class in hand, we can use it in the second form of ToDictionary:

      PrimeMinisters[] primeMinisters = PrimeMinisters.GetPrimeMinistersArray();
      var pmDictionary02 = primeMinisters.ToDictionary(k => k.id.ToString(),
        new IdKeyEqualityComparer());
      Console.WriteLine("----->pmDictionary02");
      foreach (string key in pmDictionary02.Keys)
      {
        string zeroKey = "000" + key;
        Console.WriteLine("Prime minister with ID {0}: {1} {2}",
          key, pmDictionary02[zeroKey].firstName, pmDictionary02[zeroKey].lastName);
      }

This time, we store the key as a string and pass an IdKeyEqualityComparer as the second parameter to ToDictionary. When we print out the results, we create a different string by prepending three zeroes onto the key in the dictionary, then use that zeroKey as the key when looking up entries in the dictionary. The dictionary uses its comparer object to compare zeroKey to the valid keys in the dictionary, and if a match is found, the corresponding object is returned. The output from this code is the same as that above.

If no match is found, the dictionary throws a KeyNotFoundException, so be careful to ensure that all keys used to access the dictionary are valid.
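When we can’t be sure a key is present, the dictionary’s TryGetValue method lets us test for it and fetch the value in one step, without an exception. Here is a minimal sketch; the SafeLookup class, the Describe helper and the two-element stand-in array are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SafeLookup
{
  // Looks up an ID without risking a KeyNotFoundException.
  public static string Describe<TValue>(Dictionary<int, TValue> dictionary, int id)
  {
    TValue value;
    if (dictionary.TryGetValue(id, out value))
    {
      return "Found: " + value;
    }
    return "No entry for ID " + id;
  }

  public static void Main()
  {
    // Hypothetical stand-in data for the PrimeMinisters array.
    var people = new[]
    {
      new { id = 1, lastName = "Macdonald" },
      new { id = 2, lastName = "Mackenzie" }
    };
    var byId = people.ToDictionary(p => p.id);

    Console.WriteLine(Describe(byId, 2));
    Console.WriteLine(Describe(byId, 99));  // No entry for ID 99
  }
}
```

ContainsKey() is another option when we only need to know whether the key exists.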

The third variant of ToDictionary allows us to create our own data type from the sequence element being processed and store that new data type in the dictionary. For example, suppose we wanted to store the string representation of each prime minister in the dictionary instead of the original PrimeMinisters object. We can do that using the following code.

      PrimeMinisters[] primeMinisters = PrimeMinisters.GetPrimeMinistersArray();
      var pmDictionary03 = primeMinisters.ToDictionary(k => k.id,
        k => k.ToString());
      Console.WriteLine("----->pmDictionary03");
      foreach (int key in pmDictionary03.Keys)
      {
        Console.WriteLine(pmDictionary03[key]);
      }

The first argument to ToDictionary specifies the key as usual (we’ve gone back to using the int version of the key). The second argument is another lambda expression, which calls the ToString() method to produce the string that is stored in the dictionary. When we list the elements in the dictionary, we print out the entry directly, since it’s a string and not a compound object.

This time the output is:

----->pmDictionary03
1. John Macdonald (Conservative)
2. Alexander Mackenzie (Liberal)
3. John Abbott (Conservative)
4. John Thompson (Conservative)
5. Mackenzie Bowell (Conservative)
6. Charles Tupper (Conservative)
7. Wilfrid Laurier (Liberal)
8. Robert Borden (Conservative)
9. Arthur Meighen (Conservative)
10. William Mackenzie King (Liberal)
11. Richard Bennett (Conservative)
12. Louis St. Laurent (Liberal)
13. John Diefenbaker (Conservative)
14. Lester Pearson (Liberal)
15. Pierre Trudeau (Liberal)
16. Joe Clark (Conservative)
17. John Turner (Liberal)
18. Brian Mulroney (Conservative)
19. Kim Campbell (Conservative)
20. Jean Chrétien (Liberal)
21. Paul Martin (Liberal)
22. Stephen Harper (Conservative)
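The second lambda doesn’t have to call ToString(); it can build whatever value we want to store. As a sketch, the following stores just the full name for each ID, giving a Dictionary<int, string>. The anonymous-type array is a hypothetical stand-in for the PrimeMinisters array, and the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ElementSelectorDemo
{
  // Key: the ID. Value: a string assembled from two fields of the element.
  public static Dictionary<int, string> FullNamesById()
  {
    // Hypothetical stand-in data with the same field names as in the text.
    var people = new[]
    {
      new { id = 1, firstName = "John", lastName = "Macdonald" },
      new { id = 2, firstName = "Alexander", lastName = "Mackenzie" }
    };
    return people.ToDictionary(p => p.id,
      p => p.firstName + " " + p.lastName);
  }

  public static void Main()
  {
    Dictionary<int, string> names = FullNamesById();
    Console.WriteLine(names[1]);  // John Macdonald
    Console.WriteLine(names[2]);  // Alexander Mackenzie
  }
}
```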

A final version of ToDictionary combines the last two versions, so we can provide both a key comparer and a custom data type. For example, if we wanted to store keys as strings and store the string version of each PrimeMinisters object, we could write:

      PrimeMinisters[] primeMinisters = PrimeMinisters.GetPrimeMinistersArray();
      var pmDictionary04 = primeMinisters.ToDictionary(k => k.id.ToString(),
        k => k.ToString(), new IdKeyEqualityComparer());
      Console.WriteLine("----->pmDictionary04");
      foreach (string key in pmDictionary04.Keys)
      {
        string zeroKey = "000" + key;
        Console.WriteLine(pmDictionary04[zeroKey]);
      }

The output from this is the same as from pmDictionary03.
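We don’t always need to write our own comparer class. Any IEqualityComparer<TKey> will do, and .NET supplies several for strings through the StringComparer class. The sketch below uses StringComparer.OrdinalIgnoreCase with last names as keys so that lookups ignore case. The stand-in data, class name and method name are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInComparerDemo
{
  // Key: last name (compared without regard to case). Value: the ID.
  public static Dictionary<string, int> IdsByLastName()
  {
    // Hypothetical stand-in data for the PrimeMinisters array.
    var people = new[]
    {
      new { id = 1, lastName = "Macdonald" },
      new { id = 2, lastName = "Mackenzie" }
    };
    return people.ToDictionary(p => p.lastName, p => p.id,
      StringComparer.OrdinalIgnoreCase);
  }

  public static void Main()
  {
    Dictionary<string, int> ids = IdsByLastName();
    Console.WriteLine(ids["MACDONALD"]);  // 1
    Console.WriteLine(ids["mackenzie"]);  // 2
  }
}
```

The comparer is kept by the dictionary after it is built, so it applies to every later lookup as well as to the duplicate-key check during construction.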
