Friday, October 9, 2009

Closures in C#

LINQ has a lot of "gotchas". Take the following code, for example:

using System;

namespace ConsoleApplication50
{
    class Program
    {
        static void Main(string[] args)
        {
            int value = 100;
            Action<int> curriedAction =
                x => Console.WriteLine(x * value);
            curriedAction(5);
        }
    }
}

How does C# handle closures? Let's take a closer look. Here's the IL for the Main method:

.method private hidebysig static void Main(string[] args) cil managed
{
.entrypoint
.maxstack 3
.locals init (
[0] class ConsoleApplication50.Program/<>c__DisplayClass1 CS$<>8__locals2)
L_0000: newobj instance void ConsoleApplication50.Program/<>c__DisplayClass1::.ctor()
L_0005: stloc.0
L_0006: ldloc.0
L_0007: ldc.i4.s 100
L_0009: stfld int32 ConsoleApplication50.Program/<>c__DisplayClass1::value
L_000e: ldloc.0
L_000f: ldftn instance void ConsoleApplication50.Program/<>c__DisplayClass1::<Main>b__0(int32)
L_0015: newobj instance void [mscorlib]System.Action`1<int32>::.ctor(object, native int)
L_001a: pop
L_001b: ldloc.0
L_001c: ldc.i4 200
L_0021: stfld int32 ConsoleApplication50.Program/<>c__DisplayClass1::value
L_0026: ret
}

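Note that this listing discards the delegate and then stores 200 into value instead of invoking it, so it doesn't match the C# above line for line, but the closure mechanics are the same. Look at the locals declaration: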
 .locals init (
[0] class ConsoleApplication50.Program/<>c__DisplayClass1 CS$<>8__locals2)

Looks like there's a whole lot more happening here than I originally intended. Instead of a local integer variable, the compiler spins up a whole class just to hold my single integer. In fact, there is no local integer variable in this method at all. It's the local <>c__DisplayClass1 instance that actually hosts the integer, as a field.

Also contained within this <>c__DisplayClass1 class is the method my delegate points to, which is no longer a static method; it's an instance method:

public void <Main>b__0(int x)
{
    Console.WriteLine((int) (x * this.value));
}
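Written out by hand, the transformation looks roughly like the sketch below. The names DisplayClass, Lambda, HandWrittenClosure and Build are stand-ins I made up, since the real compiler-generated names aren't legal C# identifiers, and the comments map each step to the IL only approximately.

```csharp
using System;

// Hand-written stand-in for the compiler-generated <>c__DisplayClass1.
// "DisplayClass" and "Lambda" are hypothetical names for illustration.
public sealed class DisplayClass
{
    // The captured local lives here as a field, not on Main's stack.
    public int value;

    // The lambda body becomes an instance method that reads the field.
    public void Lambda(int x)
    {
        Console.WriteLine(x * this.value);
    }
}

public static class HandWrittenClosure
{
    public static Action<int> Build()
    {
        var locals = new DisplayClass();        // newobj ...::.ctor()
        locals.value = 100;                     // stfld ...::value
        return new Action<int>(locals.Lambda);  // ldftn + newobj Action`1
    }

    public static void Main()
    {
        Action<int> curriedAction = Build();
        curriedAction(5); // prints 500
    }
}
```

The delegate holds a reference to the DisplayClass instance, which is why the "local" can outlive the method that declared it.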

Interestingly, you can even inspect the closure object through the Target property of the curriedAction variable. If I add the following three lines to this method:

var target = curriedAction.Target;
Console.WriteLine(curriedAction.Target);
Console.WriteLine(target.GetType().GetField("value").GetValue(target));

I get this as part of the output:
ConsoleApplication50.Program+<>c__DisplayClass1
100

So it appears that when a lambda captures nothing (no closure), the compiler favors a static method, and when it captures a local (a closure), the compiler favors an instance method on a compiler-generated class.
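One way to check this yourself is to ask each delegate about its Method and Target. This is a minimal sketch; LambdaShapes and its members are names I picked for illustration. I've left out expected output on purpose: the capturing case should always report an instance method with a display-class target, but what you see for the non-capturing case depends on the compiler version, and newer compilers may also emit it as an instance method on a cached helper object.

```csharp
using System;

public static class LambdaShapes
{
    // Captures nothing: no local state is needed.
    public static Action<int> NonCapturing()
    {
        return x => Console.WriteLine(x * 2);
    }

    // Captures the local "value", so a closure object is required.
    public static Action<int> Capturing()
    {
        int value = 100;
        return x => Console.WriteLine(x * value);
    }

    // Reports whether the delegate's method is static and what its target is.
    public static string Describe(Delegate d)
    {
        string target = d.Target == null ? "(none)" : d.Target.GetType().Name;
        return string.Format("static={0}, target={1}", d.Method.IsStatic, target);
    }

    public static void Main()
    {
        Console.WriteLine("non-capturing: " + Describe(NonCapturing()));
        Console.WriteLine("capturing:     " + Describe(Capturing()));
    }
}
```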

I've heard people complain about LINQ's performance versus a for or a foreach loop, and I wouldn't be surprised if this has something to do with it. Remember, every closure object that gets instantiated has to be garbage collected at some point. All in all, I'd say to avoid closures where you can when performance is an issue.
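To make the allocation cost concrete, here is a small sketch (ClosureAllocation and MakeMultiplier are hypothetical names) showing that every call to a method that builds a capturing lambda produces its own closure object. It doesn't measure anything; it only shows that the objects are distinct.

```csharp
using System;

public static class ClosureAllocation
{
    // Each call captures its own copy of "value", so each call
    // allocates a fresh closure object plus a fresh delegate.
    public static Func<int, int> MakeMultiplier(int value)
    {
        return x => x * value;
    }

    public static void Main()
    {
        Func<int, int> first = MakeMultiplier(100);
        Func<int, int> second = MakeMultiplier(100);

        // Same captured value, but two separate closure instances.
        Console.WriteLine(object.ReferenceEquals(first.Target, second.Target)); // False
        Console.WriteLine(first(5)); // 500
    }
}
```

Inside a tight loop, that's one extra heap object per iteration that a plain for loop would never create.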

2 comments:

Ryan Riley said...

Your last paragraph is an interesting perspective. Following that train of logic, you should never use LINQ whenever performance is an issue, as it 1) is an extra layer of abstraction over a great many things and 2) like functional programming, treats objects as immutable values for most of its operations.

If you are single-threaded and care about performance, you are probably right. If, however, you are running in a multi-threaded environment, then these functional programming constructs may serve you even better than the alternative. Yes, you'll have lots of anonymous types and looping constructs, but you'll gain a level of thread safety in a much simpler programming model.

There are always trade-offs. :)

David Morton said...

Well, there's another tradeoff: modified closures. In C#, since "value" is mutable, it could be changed after the closure is created, causing the closure to misbehave and possibly produce odd results.

The issue in C# so often is that functional programming gets mixed with mutable variables, which can cause problems. F# has the big advantage of encouraging you not to use mutables in your code.
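A minimal sketch of a modified closure, with hypothetical names (ModifiedClosure, Run, multiply): because the lambda reads the shared field rather than a snapshot, a later assignment changes what the delegate computes.

```csharp
using System;

public static class ModifiedClosure
{
    public static int Run()
    {
        int value = 100;
        Func<int, int> multiply = x => x * value;

        // Mutating the captured variable after the delegate is created:
        // the lambda and this method share the same storage.
        value = 200;

        return multiply(5); // uses 200, not 100
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 1000
    }
}
```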