IEnumerable, for a More Elegant C#

I’ve recently had the chance to dig deeper into C# than ever before. There are a lot of things I like about C#, one of them being LINQ (Language-Integrated Query). I’ve started to rely heavily on one aspect of LINQ: enumerables.

What is LINQ?

At first glance, LINQ expressions look like SQL queries. Expressions include keywords like from, where, and select. To filter out all odd numbers in a list, you could construct a query like the following.

var numbers = new List<int> { 1, 3, 5, 2, 0, 8 };

var query = 
  from number in numbers
  where (number % 2 == 0)
  select number;

The above syntax is a step up from a traditional foreach loop. It is more declarative: the query states what we want (the even numbers) rather than how to collect them, which makes the code’s purpose clear. Great, but can it be improved more?
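For comparison, here is a minimal sketch of the same filter written both ways. The class and method names (EvensDemo, EvensWithLoop, EvensWithQuery) are made up for illustration and are not part of LINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EvensDemo
{
    // Imperative version: build the result list by hand.
    public static List<int> EvensWithLoop(List<int> numbers)
    {
        var evens = new List<int>();
        foreach (var number in numbers)
        {
            if (number % 2 == 0)
            {
                evens.Add(number);
            }
        }

        return evens;
    }

    // Declarative version: the same query expression as above.
    public static List<int> EvensWithQuery(List<int> numbers)
    {
        var query =
            from number in numbers
            where number % 2 == 0
            select number;

        return query.ToList();
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 3, 5, 2, 0, 8 };
        Console.WriteLine(string.Join(", ", EvensWithLoop(numbers)));  // 2, 0, 8
        Console.WriteLine(string.Join(", ", EvensWithQuery(numbers))); // 2, 0, 8
    }
}
```

Both methods return the same elements in the same order; only the way the intent is expressed differs.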

Intro to IEnumerable

I discovered a little gem that makes data transformations very composable. As you will see below, the result of one query can often be passed directly into the next.

It all starts with the IEnumerable<T> interface. LINQ adds a bunch of handy extension methods to every class that implements it, including Select, Zip, and Where (they live in the System.Linq namespace). Lucky for us, one class that implements IEnumerable<T> is List<T>. Let’s take a look at the above example written with these extension methods.

var numbers = new List<int> { 1, 3, 5, 2, 0, 8 };
var evens = numbers.Where(num => num % 2 == 0);

A trivial example, but you can already start to see how this can be powerful.
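To make the composability concrete, here is a small sketch that feeds the result of one call into the next. The helper names (ChainingDemo, SquaredEvens, LabelValues) are hypothetical; Where, Select, and Zip are the standard System.Linq extension methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainingDemo
{
    // Keep the even numbers, then square each one.
    public static IEnumerable<int> SquaredEvens(IEnumerable<int> numbers)
    {
        return numbers
            .Where(num => num % 2 == 0)
            .Select(num => num * num);
    }

    // Pair each label with a value; Zip stops at the end of the shorter sequence.
    public static IEnumerable<string> LabelValues(IEnumerable<string> labels, IEnumerable<int> values)
    {
        return labels.Zip(values, (label, value) => label + "=" + value);
    }

    public static void Main()
    {
        var numbers = new List<int> { 1, 3, 5, 2, 0, 8 };

        // The output of one query is passed straight into the next.
        var squared = SquaredEvens(numbers);
        var labeled = LabelValues(new[] { "a", "b", "c" }, squared);

        Console.WriteLine(string.Join(", ", labeled)); // a=4, b=0, c=64
    }
}
```

Because each method takes and returns an IEnumerable<T>, the pieces can be rearranged or extended without touching the others.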

Using IEnumerable

Now let’s take it one step further. Let’s say we have a Table class. It looks like this:

public class Table
{
  public string Title {get; set;}
  public List<Row> Rows {get; set;}
}

It uses a Row class that looks like this.

public class Row 
{
  public string Title {get; set;}
  public bool TitleRow {get; set;}
  public List<Cell> Cells {get; set;}
}

And the Row uses a Cell class that looks like this.

public class Cell 
{
  public string Label {get; set;}
  public bool IsNumber {get; set;}
}

Our goal is to return a list of all cells that are part of a title row (where a table can have multiple title rows). First, the traditional method.

var titleCells = new List<Cell>();
foreach(var row in Table.Rows) 
{
  if(row.TitleRow)
  {
    foreach(var cell in row.Cells)
    {
      titleCells.Add(cell);
    }
  }
}

return titleCells;

Now let’s look at the same example using Enumerables.

return Table.Rows
  .Where(row => row.TitleRow)
  .Select(row => row.Cells)
  .SelectMany(x => x);

Easy Extensions

In this example we introduce SelectMany. If we just returned the result of the Select, we would have an IEnumerable<List<Cell>>, that is, a sequence of sequences. SelectMany allows us to flatten that structure into a single sequence of cells. That sounds like something we might want to do more than once. In order to give it a more meaningful name and allow for reuse, we can break it out into an extension method.

// Extension methods must be declared inside a static class.
public static IEnumerable<T> Flatten<T>(this IEnumerable<IEnumerable<T>> source)
{
  return source.SelectMany(x => x);
}

Now, whenever we want to flatten a list, we can call Flatten().
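As a quick usage sketch, here is a generic Flatten along the lines of the method above, applied to a nested list of integers. The class names EnumerableExtensions and FlattenDemo are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtensions
{
    // Collapse one level of nesting into a single sequence.
    public static IEnumerable<T> Flatten<T>(this IEnumerable<IEnumerable<T>> source)
    {
        return source.SelectMany(x => x);
    }
}

public static class FlattenDemo
{
    public static void Main()
    {
        var nested = new List<IEnumerable<int>>
        {
            new[] { 1, 2 },
            new int[0],      // empty inner sequences simply contribute nothing
            new[] { 3 },
        };

        Console.WriteLine(string.Join(", ", nested.Flatten())); // 1, 2, 3
    }
}
```

Note that Flatten only removes one level of nesting; a sequence nested three deep would need two calls.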

Putting it All Together

Let’s go back to the previous example. Say the specification changed, and we now want cells that are part of a title row and are a number. This is how we’d deal with that using traditional C#.

var titleCells = new List<Cell>();
foreach(var row in Table.Rows) 
{
  if(row.TitleRow)
  {
    foreach(var cell in row.Cells)
    {
      if (cell.IsNumber)
      {
        titleCells.Add(cell);
      }
    }
  }
}

return titleCells;

We had to add an additional if branch, pushing our code further to the right and adding visual complexity. Compare that to the following IEnumerable implementation.

return Table.Rows
  .Where(row => row.TitleRow)
  .Select(row => row.Cells.Where(cell => cell.IsNumber))
  .Flatten();

Inside the Select, we filter each title row’s cells with a Where clause that keeps only the numbers, and then Flatten the result into a single sequence.
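Pulling the pieces together, the following is one way the whole example might look as a runnable program. The sample data, the TableQueries and TableDemo classes, and the NumericTitleCells and SampleTable methods are invented for illustration; the Table, Row, and Cell classes mirror the ones defined earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Cell
{
    public string Label { get; set; }
    public bool IsNumber { get; set; }
}

public class Row
{
    public string Title { get; set; }
    public bool TitleRow { get; set; }
    public List<Cell> Cells { get; set; }
}

public class Table
{
    public string Title { get; set; }
    public List<Row> Rows { get; set; }
}

public static class TableQueries
{
    // Collapse one level of nesting into a single sequence.
    public static IEnumerable<T> Flatten<T>(this IEnumerable<IEnumerable<T>> source)
    {
        return source.SelectMany(x => x);
    }

    // All numeric cells that belong to a title row.
    public static IEnumerable<Cell> NumericTitleCells(Table table)
    {
        return table.Rows
            .Where(row => row.TitleRow)
            .Select(row => row.Cells.Where(cell => cell.IsNumber))
            .Flatten();
    }
}

public static class TableDemo
{
    // Hypothetical sample data: two title rows and one ordinary row.
    public static Table SampleTable()
    {
        return new Table
        {
            Title = "Sales",
            Rows = new List<Row>
            {
                new Row
                {
                    Title = "Header", TitleRow = true,
                    Cells = new List<Cell>
                    {
                        new Cell { Label = "Year", IsNumber = false },
                        new Cell { Label = "2014", IsNumber = true },
                    },
                },
                new Row
                {
                    Title = "Body", TitleRow = false,
                    Cells = new List<Cell> { new Cell { Label = "42", IsNumber = true } },
                },
                new Row
                {
                    Title = "Footer", TitleRow = true,
                    Cells = new List<Cell>
                    {
                        new Cell { Label = "7", IsNumber = true },
                        new Cell { Label = "Total", IsNumber = false },
                    },
                },
            },
        };
    }

    public static void Main()
    {
        var labels = TableQueries.NumericTitleCells(SampleTable()).Select(cell => cell.Label);
        Console.WriteLine(string.Join(", ", labels)); // 2014, 7
    }
}
```

The "42" cell is numeric but sits in a non-title row, so it is filtered out by the first Where before the inner Where ever sees it.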

Conclusion

You can start to see how IEnumerable keeps code elegant, easy to read, and easy to maintain. Push it further and I think you’ll see why LINQ and IEnumerable are the way to go in C#!
 

Conversation
  • Romoku says:

    A couple corrections.

    It is the generic IEnumerable(T) that has linq extension methods not the non-generic IEnumerable.

    The Select -> SelectMany is redundant and will decrease performance.

    return Table.Rows
    .Where(row => row.TitleRow)
    .SelectMany(row => row.Cells);

    It’s also worth noting that linq is lazy, so the results will not be produced until you enumerate them. One fun fact is that every linq extension method that returns a sequence can be reimplemented using SelectMany.

    It’s nice reading C# articles from Atomic Object. Keep up the good work.

  • Dane says:

    IEnumerable is great, but you’re mucking up the code to make ‘foreach’ look worse than it actually is. You’re putting every “{” on its own line in the “foreach” example (which I’ve never seen anyone do in real code), but not putting every “(” on its own line in the IEnumerable example. You’ve even got an extra blank line in the “foreach” example for no apparent reason.

    The examples here are 4 vs 16 lines, but with common formatting practice it would be 4 vs 11. Half of the improvement you’re showing is not the power of IEnumerable, but that it happens to use syntactic constructs that you don’t feel like putting on their own line. IEnumerable helps this, but so would simply not using so much whitespace.

    Further, if you write helper methods on these classes, like Table.TitleRows() — which you will typically end up with, regardless of whether you’re using “foreach” or “Enumerable” — the difference is even smaller. You wrote your own Flatten() method for IEnumerable, but didn’t write any helper methods for the “foreach” case.

  • Isaac says:

    I just so happened to be looking at IEnumerables recently as well. One of the things I recently learned is that it’s possible to leverage IEnumerable’s lazy-evaluation to create something similar to a Haskell infinite list. As a (poorly contrived) example, let’s say you wanted an array of 20 ints, each with the value 1. You could create this method (note the “while true”):


    public static IEnumerable<T> Repeat<T>(T some)
    {
        // Never terminates on its own; callers must limit it, e.g. with Take.
        while (true)
        {
            yield return some;
        }
    }

    Then when you want the array, simply call:


    Repeat(1).Take(20).ToArray();

    Like I said, it’s a poorly contrived example, but it’s easy to imagine more powerful uses.

  • Jason says:

    Dane,

    By putting each { on its own separate line, Ryan is following the recommended C# coding conventions to the letter. Microsoft’s coding conventions recommend that each { and } be on its own separate line. There are a few exceptions, such as how he declares his initial list at the beginning, but in most other cases these brackets should be on their own separate lines. This is similar to how C# also uses Pascal casing for function calls and property names. It might not be the Java conventions you’re used to, but then, C# isn’t Java.

    As for the blank line, I assume you are referring to the blank line right above return titleCells; Once again, this is a Microsoft standard to leave a blank line after closing brackets. It also enhances readability of code.

    The goal behind this format is to enhance readability of the code. You are not harming performance in any way by adding extra spaces or placing brackets on their own line; however, you do increase readability drastically. In fact, this formatting was used even back in the beginning days of C programming. People started using the if(statement) { convention due to saving space when writing code for books.

    Here’s a reference from MSDN to clarify:

    http://msdn.microsoft.com/en-us/library/vstudio/ff926074.aspx

    (BTW, most development teams at Microsoft follow this convention when writing C#… but maybe what you’re saying is Microsoft doesn’t write real code?)
