Mar 7 2009

Create Dictionaries From Two IEnumerables With Zip In .Net

I've been borrowing some functionality from our Haskell brethren (thanks to Jeff Cutsinger for enlightening me). Haskell has a function called zip; you can see the Haskell documentation here. It takes two lists and pairs their elements up by position, returning a list of tuples. A tuple is basically a generically typed, primitive data structure that Haskell uses, among other things, to return multiple values from a function; in the case of zip, each tuple reads like a key-value pair.

I keep running into instances in .Net where I need to take two lists/collections/arrays of values and turn them into a dictionary. Now, I could write tedious code each and every time to do this, but I finally decided to see what kind of quagmire I could get myself into trying to recreate zip's functionality. May I also plug TDD here? It made this task nearly trivial. Here was my first test:

    [TestMethod]
    public void TestZip()
    {
        var names = new[]
        {
            "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
        };

        var numbers = new int[]
        {
            1, 2, 3, 4, 5, 6, 7, 8, 9
        };

        var lookup = Zip(names, numbers);
        int testValue = 1;
        // MSTest's Assert.AreEqual takes the expected value first, then the actual one.
        Assert.AreEqual(testValue++, lookup["one"]);
        Assert.AreEqual(testValue++, lookup["two"]);
        Assert.AreEqual(testValue++, lookup["three"]);
        Assert.AreEqual(testValue++, lookup["four"]);
        Assert.AreEqual(testValue++, lookup["five"]);
        Assert.AreEqual(testValue++, lookup["six"]);
        Assert.AreEqual(testValue++, lookup["seven"]);
        Assert.AreEqual(testValue++, lookup["eight"]);
        Assert.AreEqual(testValue, lookup["nine"]);
    }


Simple, right? Sure, I could've written a much more succinct lambda to assert, but hey, I'm trying to make the concept easier to grok : ) My first attempt at implementation looked like this:

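    public Dictionary<K, V> Zip<K, V>(IEnumerable<K> keys, IEnumerable<V> values)
    {
        // Collects the keys seen so far; its Count doubles as a running position.
        var keyList = new List<K>();
        return keys.ToDictionary(
            key =>
            {
                keyList.Add(key);
                return key;
            },
            // Relies on ToDictionary calling the key selector before the value selector
            // for each element, so Count - 1 is the ordinal of the current key.
            key => values.ElementAt(keyList.Count - 1));
    }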


Right. Probably not obvious. If you're thinking "uhhhhhhhhh", no worries, I can make this simple (I think). Everyone who <3's functional programming, lambdas and the like will scold you for not minding your closures. In this case, incrementing a captured counter variable inside one of my lambdas felt like a very, very bad idea. But I still needed a way to get the value that sits at the same ordinal in the value collection as each key does in the key collection. How? That's when I realized that, if I didn't mind wasting space on a List, I could simply add each key to a new list as it goes by and use the list's count (minus 1) as the ordinal of the current key at any point during the iteration.
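For comparison, LINQ's Select has an overload that hands the lambda each element's index, which is another way to get at the ordinal without a helper list. What follows is only a minimal sketch: the class name ZipSketch and the method name ZipByIndex are made up for this example, and, like the first attempt above, it assumes there are at least as many values as keys.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ZipSketch
{
    // Hypothetical helper for illustration only (not part of Nvigorate).
    // Pairs each key with the value found at the same position.
    public static Dictionary<K, V> ZipByIndex<K, V>(IEnumerable<K> keys, IEnumerable<V> values)
    {
        return keys
            // The two-argument lambda selects the Select overload that supplies the index.
            .Select((key, index) => new { Key = key, Index = index })
            // ElementAt throws if there are more keys than values.
            .ToDictionary(pair => pair.Key, pair => values.ElementAt(pair.Index));
    }
}
```

Calling ZipSketch.ZipByIndex(names, numbers) with the arrays from the test should give the same lookup as the version above. Note that ElementAt has to walk the sequence from the start when the values are not an indexable list, so this is not free either.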

It wasn't long until I realized that Haskell lets us zip uneven collections (by limiting the size of the result to the smaller of the two collections) and I thought, "Gee, that's nicer than throwing exceptions at our users' heads", and since I'm borrowing Haskell concepts, why not just stay consistent? Fortunately, someone at Microsoft likes programmers, because the answer was simple. Let's look at my additional unit tests first, though (since I did write them first); they will help us prove the code.

    [TestMethod]
    public void TestZipTooManyKeys()
    {
        var names = new[]
        {
            "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
        };

        var numbers = new int[]
        {
            1, 2, 3, 4, 5, 6, 7
        };

        var lookup = Zip(names, numbers);
        int testValue = 1;
        Assert.AreEqual(7, lookup.Count);
        Assert.AreEqual(testValue++, lookup["one"]);
        Assert.AreEqual(testValue++, lookup["two"]);
        Assert.AreEqual(testValue++, lookup["three"]);
        Assert.AreEqual(testValue++, lookup["four"]);
        Assert.AreEqual(testValue++, lookup["five"]);
        Assert.AreEqual(testValue++, lookup["six"]);
        Assert.AreEqual(testValue, lookup["seven"]);
    }

    [TestMethod]
    public void TestZipTooManyValues()
    {
        var names = new[]
        {
            "one", "two", "three", "four", "five", "six", "seven"
        };

        var numbers = new int[]
        {
            1, 2, 3, 4, 5, 6, 7, 8, 9
        };

        var lookup = Zip(names, numbers);
        int testValue = 1;
        Assert.AreEqual(7, lookup.Count);
        Assert.AreEqual(testValue++, lookup["one"]);
        Assert.AreEqual(testValue++, lookup["two"]);
        Assert.AreEqual(testValue++, lookup["three"]);
        Assert.AreEqual(testValue++, lookup["four"]);
        Assert.AreEqual(testValue++, lookup["five"]);
        Assert.AreEqual(testValue++, lookup["six"]);
        Assert.AreEqual(testValue, lookup["seven"]);
    }


Makes sense, right? I have one test where I declare more keys than I have values, and one where it's the other way around. I assert the count is limited to the shorter of the two (in this case 7) and proceed with the assertions. So how'd I do this? It was simpler than I thought it would be when I wrote the tests, and I think you'll see it immediately:

    // Extension methods have to be declared inside a static, non-generic class.
    public static Dictionary<K, V> Zip<K, V>(this IEnumerable<K> keys, IEnumerable<V> values)
    {
        var keyList = new List<K>();
        // Take caps the keys at the number of values; if there are fewer keys, all of them pass through.
        return keys.Take(values.Count()).ToDictionary(
            key =>
            {
                keyList.Add(key);
                return key;
            },
            key => values.ElementAt(keyList.Count - 1));
    }


So for all my Microsoft whining, they obviously do lots of things right. The Take extension method returns at most the number of elements you ask for; if the source has fewer than that, you simply get everything it has, with no exception thrown. This means I don't have to qualify or limit Take when keys has fewer elements than the values collection. Just try to tell me that's not awesome.
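If you want to see Take's behavior in isolation, something like the following would do. It is only an illustrative sketch; the class name TakeSketch and the method name Demo are invented for the example.

```csharp
using System.Linq;

public static class TakeSketch
{
    // Returns the element counts produced by asking Take for fewer and for more
    // elements than a three-element source actually has.
    public static int[] Demo()
    {
        var three = new[] { 1, 2, 3 };
        return new[]
        {
            three.Take(2).Count(),   // asks for fewer than are available
            three.Take(10).Count()   // asks for more than are available; no exception
        };
    }
}
```

The first count is 2 and the second is 3, which is exactly the "shorter of the two" rule the tests above assert.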

You may be thinking, "But, Alex, you jerkface, you told us all about Tuples for nothing!?". Sorry, thought I could pull a fast one there for a minute. While I have also included Tuple support in Nvigorate, I don't think tuples should be used 'willy-nilly' in .Net. In this case, a dictionary feels like the right fit, since we're essentially dealing with pairs, and what's better than being able to quickly access the pair you want by its key?
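That said, if you did want the Haskell-style pairs themselves, a sketch along these lines would work, using the framework's KeyValuePair as a stand-in for a tuple. The names PairSketch, ZipPairs and ZipToDictionary are invented for this example and are not the Nvigorate API. It walks both sequences in lockstep, which is a different technique from the Take/ElementAt approach above: each input is enumerated only once, and the result stops at the shorter of the two.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PairSketch
{
    // Hypothetical helper: yields one pair per position and stops as soon as
    // either sequence runs out, mirroring how Haskell's zip truncates.
    public static IEnumerable<KeyValuePair<K, V>> ZipPairs<K, V>(
        IEnumerable<K> keys, IEnumerable<V> values)
    {
        using (var k = keys.GetEnumerator())
        using (var v = values.GetEnumerator())
        {
            while (k.MoveNext() && v.MoveNext())
            {
                yield return new KeyValuePair<K, V>(k.Current, v.Current);
            }
        }
    }

    // Hypothetical helper: builds a dictionary from the zipped pairs.
    public static Dictionary<K, V> ZipToDictionary<K, V>(
        IEnumerable<K> keys, IEnumerable<V> values)
    {
        return ZipPairs(keys, values).ToDictionary(p => p.Key, p => p.Value);
    }
}
```

With nine names and seven numbers, PairSketch.ZipToDictionary(names, numbers) should produce the same seven-entry lookup the tests expect.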

Right, so turning this into an extension method is trivial (that's what the static and this in the last version are for). It's important too, because by making it an extension method, we can now write:

    var names = new[]
    {
        "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"
    };

    var numbers = new int[]
    {
        1, 2, 3, 4, 5, 6, 7, 8, 9
    };

    var lookup = names.Zip(numbers);


Awesome? C'mon, you know it is. This extension method is now becoming a part of the Nvigorate framework.
