New features in C# 2.0
There are dozens (hundreds, probably) of pages listing the new features of C# 2.0. However,
I never know where to find a good one quickly, and they don't always tell me what I need to
know at the time. I figured if I added my own set of pages, I could update them whenever
I wanted to, and point other people at them when answering questions. Without further ado then,
here are the new features of C# 2.0:
- Various "bits and bobs" described below
- Nullable types and the null coalescing operator
- Delegate changes
- Implementing iterators with yield statements
- Generics
Partial types
Code generators have existed for a long time. In the past, they usually (depending on the
language) either "owned" a whole type/module (creating a whole file which shouldn't or couldn't
be edited) or reserved sections of files which shouldn't be edited manually. In some cases, where
code was generated by a separate tool from something like a database schema, it could be very hard to
make changes to the schema and regenerate the code without losing any additions made by hand.
C# 2.0 introduces the concept of a partial type declaration. This is quite simply a single type
which spans multiple files, where each file declares the same type using the partial modifier.
The files may refer to members declared within one another without problem (just as forward references within
C# are already not a problem). This allows all the auto-generated code, which either mustn't be touched on pain of
brokenness or shouldn't be touched because you'll lose all your changes anyway, to live in a completely separate
file from the code you wish to add. It doesn't help much if you want to tweak the generated code, of course, but
that's a less common issue. Here's an example (which in itself is a complete program).
Test1.cs:
|
Test2.cs:
|
Compile with:
|
Results:
|
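The original listings aren't reproduced here, so the following is only a minimal sketch of the idea. The type name Test and its members are hypothetical, and both parts are shown in one block purely for convenience; in practice the first part would live in Test1.cs and the second in Test2.cs, compiled together with something like csc Test1.cs Test2.cs.

```csharp
using System;

// Part one: imagine this living in Test1.cs (for example, generated code).
partial class Test
{
    string name;

    static void Main()
    {
        // Uses a constructor and a method declared in the other part.
        Test t = new Test("C# 2.0");
        t.SayHello();
    }
}

// Part two: imagine this living in Test2.cs (hand-written code).
partial class Test
{
    Test(string name)
    {
        // Uses a field declared in the other part.
        this.name = name;
    }

    void SayHello()
    {
        Console.WriteLine("Hello " + name);
    }
}
```

Each part refers freely to members declared in the other, and the compiler treats the two declarations as one type.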
A few little things to be aware of:
- using directives are only applied to the source file they occur in.
- Variable initializers (both instance and static) are executed in textual order, but there's no more guarantee
  made than that. Given four fields (a1, a2, b1, b2) appearing in files A.cs and B.cs (in the obvious files, with
  the obvious order), all that is guaranteed is that the initializer for a1 will be executed before the initializer
  for a2, and b1 before b2. A sequence of a1 b1 b2 a2 is acceptable, although I'd imagine that either a1 a2 b1 b2
  or b1 b2 a1 a2 would be more likely.
- If a type has a modifier (abstract, public, static etc) applied to it in one place, it has effectively been
  applied everywhere. In particular, the modifiers on each part of the type definition must not clash - a class
  cannot be declared as protected in one place and public in another, for example.
Aliases
In previous versions of C#, it was impossible to use two different types which had the same name (including namespace)
within the same assembly. (The types themselves would have to be defined in different assemblies anyway, of course, but
you might want to use them both from the same assembly.)
C# 2.0 introduces the concept of an "alias". This allows you to effectively name an assembly reference when you compile
the code, and use that name to disambiguate between names. As well as disambiguating between identical namespace-qualified
names, aliases allow you to disambiguate between names which have been declared within an already used namespace and
names which belong to the "root" namespace. This is achieved with the predefined alias of global. Here's an
example of aliases at work:
Lib.cs:
|
Baz.cs:
|
Test.cs:
|
Compile:
|
(The results are exactly what you'd expect.)
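Assembly aliases need more than one assembly, so they don't fit in a single runnable listing. As far as I know they are set up with a compiler option of the form /r:SomeAlias=Library.dll together with an extern alias SomeAlias; directive in the source. The predefined global alias can be shown on its own, though. This is a minimal sketch with hypothetical type names, not the original Lib.cs/Baz.cs/Test.cs listings:

```csharp
using System;

// A type in the "root" (global) namespace. The name is hypothetical.
class Configuration
{
    public static string Describe() { return "root Configuration"; }
}

namespace Demo
{
    // A type with the same simple name, declared inside a namespace.
    class Configuration
    {
        public static string Describe() { return "Demo.Configuration"; }
    }

    class Program
    {
        static void Main()
        {
            // Inside namespace Demo, the simple name finds Demo.Configuration first.
            Console.WriteLine(Configuration.Describe());          // Demo.Configuration

            // global:: starts the lookup at the root namespace instead.
            Console.WriteLine(global::Configuration.Describe());  // root Configuration
        }
    }
}
```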
There are, I suspect, various subtleties to do with aliases. However, I don't know them and I don't want to
know them at the moment - because I think aliases should be avoided wherever possible. In a very few cases they'll
be absolutely invaluable, but you should really try to prevent the situation where they're needed from cropping up
in the first place.
Static classes
Prior to version 2.0, it was impossible to create a class with no instance constructors in C#. If you didn't declare
one, the compiler provided a default constructor for you (a public parameterless constructor which called the
parameterless constructor of the base type). For classes which were never meant to be instantiated (usually utility
classes such as System.Math), this meant you needed to include a private constructor which you didn't
call yourself in order to prevent instantiation.
In C# 2.0, there are static classes. These are simply declared using the static modifier. They
cannot be derived from or instantiated, and they have no instance constructors (it is a compile-time error to
provide any yourself, and the compiler won't add one for you). Their members must all be static. Here's an example:
|
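The original listing is missing, so here is a minimal sketch with a hypothetical utility class called StringUtil:

```csharp
using System;

// A static class: no instances, no derivation, static members only.
static class StringUtil
{
    public static string Reverse(string text)
    {
        char[] chars = text.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(StringUtil.Reverse("hello"));  // olleh

        // Neither of these would compile:
        // StringUtil util = new StringUtil();
        // class Derived : StringUtil {}
    }
}
```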
Property access modifiers
This is a feature which is long, long overdue. Prior to 2.0, it was impossible to declare
a property with one access level for the "getter" and a different access level for the "setter".
This has meant that people have written separate SetXXX methods if they wanted a
public getter but a more limited setter. Fortunately, this glaring omission has been fixed in
2.0. It's very straightforward - you just add the access modifier to the get or set as desired:
|
The basic rules are that you can't specify an access modifier for both the getter and the setter,
and you have to use the "extra" modifier to make access more restrictive than the rest
of the property.
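A minimal sketch of those rules, using a hypothetical Person class with a public getter and a private setter:

```csharp
using System;

class Person
{
    string name;

    public Person(string name)
    {
        this.name = name;
    }

    // The property as a whole is public; only the setter is restricted.
    public string Name
    {
        get { return name; }
        private set { name = value; }
    }

    public void Rename(string newName)
    {
        Name = newName;  // fine: we're inside the class
    }
}

class Program
{
    static void Main()
    {
        Person p = new Person("Jon");
        p.Rename("Jonathan");
        Console.WriteLine(p.Name);  // Jonathan

        // p.Name = "Someone else";  // would not compile: the setter is private
    }
}
```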
Nullable types and the null coalescing operator
For as long as I can remember, people have been asking why they can't set an int
variable to null, or why they can't return null from a method declared to return DateTime.
Many who understand why they couldn't do so still wished they could, particularly when working
with databases.
.NET 2.0 provides the generic struct System.Nullable<T> with the constraint that T
must be a value type. (If you know absolutely nothing about generics, now might
be a good time to learn about the basics before reading further. You don't need to know a lot of the details
however, and the basic concept is a lot simpler than full-blown generics sometimes gets, which is why I've
put this page before the one on generics.) Nullable<T> itself is still a value type, but
it represents the same set of values as T plus the "null" value. It maintains a separate
member in memory, which is exposed through the HasValue property. When this is true, the Value
property represents the overall value. When it's false, the overall value is null.
C# provides language support for nullable types using a question mark as a suffix. For example, int?
is the same type as Nullable<int> (which is also the same type as Nullable<System.Int32>
in the normal way). C# then allows you to compare
a nullable value with null, or set it to null, and these work in the obvious way. There's an implicit
conversion (no cast required) from a non-nullable type to its equivalent nullable type, and there's
an explicit conversion (cast required) from a nullable type to its equivalent non-nullable type.
The cast is compiled into a call to the Value property, and an InvalidOperationException
is thrown if the value is null at that point. A nullable type can also be used as the right hand side of
the as operator, with the natural consequences.
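The conversions described above can be sketched as follows. The variable names are arbitrary; everything else is the standard Nullable<T> behaviour.

```csharp
using System;

class Program
{
    static void Main()
    {
        int? x = 5;              // implicit conversion from int to int?
        Nullable<int> y = null;  // exactly the same type as int?

        Console.WriteLine(x.HasValue);  // True
        Console.WriteLine(y.HasValue);  // False
        Console.WriteLine(y == null);   // True

        int plain = (int) x;            // explicit conversion, uses the Value property
        Console.WriteLine(plain);       // 5

        try
        {
            int oops = (int) y;         // y has no value, so this throws
            Console.WriteLine(oops);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("No value");  // No value
        }

        // "as" with a nullable type gives null when the object isn't a boxed int.
        object o = "not a number";
        int? z = o as int?;
        Console.WriteLine(z.HasValue);  // False
    }
}
```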
The boxed type of a nullable value is the boxed type of the equivalent non-nullable value. If you box a value
which is already null (i.e. HasValue is false), the result is null. This was a late change
to the behaviour, as it required CLR changes which Microsoft were hoping to avoid - you may therefore see
some beta documentation which disagrees with this.
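A short sketch of the boxing behaviour (variable names are arbitrary):

```csharp
using System;

class Program
{
    static void Main()
    {
        int? five = 5;
        int? none = null;

        object boxedFive = five;  // boxes as a plain System.Int32
        object boxedNone = none;  // boxing a null value gives a null reference

        Console.WriteLine(boxedFive.GetType());  // System.Int32
        Console.WriteLine(boxedNone == null);    // True

        // Unboxing a null reference to a nullable type gives the null value back.
        int? roundTrip = (int?) boxedNone;
        Console.WriteLine(roundTrip.HasValue);   // False
    }
}
```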
Note that unlike in SQL, two null values of the same type are equal. In other words, the following:
|
prints "True".
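The original snippet is missing; a minimal sketch of the comparison it describes might be:

```csharp
using System;

class Program
{
    static void Main()
    {
        int? x = null;
        int? y = null;

        // Unlike SQL's NULL, two null values of the same nullable type compare as equal.
        Console.WriteLine(x == y);  // True
    }
}
```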
As well as the System.Nullable<T> struct, there's the non-generic
static class System.Nullable. This merely provides
support for the System.Nullable<T> struct, in terms of finding out the non-nullable
type of a nullable type and performing comparisons.
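As a rough illustration of that support class, here is a sketch using its GetUnderlyingType, Equals and Compare methods:

```csharp
using System;

class Program
{
    static void Main()
    {
        // Finding the non-nullable type behind a nullable type.
        Console.WriteLine(Nullable.GetUnderlyingType(typeof(int?)));         // System.Int32
        // For a type which isn't nullable, the result is a null reference.
        Console.WriteLine(Nullable.GetUnderlyingType(typeof(int)) == null);  // True

        int? none = null;
        int? one = 1;

        // Comparisons: a null value sorts before any real value.
        Console.WriteLine(Nullable.Equals<int>(none, one));       // False
        Console.WriteLine(Nullable.Compare<int>(none, one) < 0);  // True
    }
}
```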
Nullable logic
bool? has various binary logic operators, but not all of the ones available on bool.
Importantly, the "shortcut" operators (&& and ||) aren't defined for bool?. A null value
represents a sort of "don't know" value - so for instance, null | true is true, but null | false
is null; similarly null & false is false but null & true is null.
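The four cases above can be checked with a small sketch (the variable name unknown is arbitrary):

```csharp
using System;

class Program
{
    static void Main()
    {
        bool? unknown = null;

        // "Don't know" OR true is definitely true; OR false is still unknown.
        Console.WriteLine((unknown | true) == true);    // True
        Console.WriteLine((unknown | false) == null);   // True

        // "Don't know" AND false is definitely false; AND true is still unknown.
        Console.WriteLine((unknown & false) == false);  // True
        Console.WriteLine((unknown & true) == null);    // True
    }
}
```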
The null coalescing operator
This is a really simple little operator which I suspect will come in quite handy -
if it's widely known about. (It was a long time before I saw anything about it.)
Basically, a ?? b is similar to a==null ? b : a. The type of a
has to be a nullable type or a reference type, and the type of b
has to be a suitable type, the details of which are best left to the spec.
The result is the value of a if that's non-null, otherwise it takes the value of b.
a is only evaluated once (contrary to the version presented
above using the conditional operator), and b is only evaluated
at all if a evaluates to null.
The operator is right-associative, so a ?? b ?? c is equivalent to a ?? (b ?? c)
- in other words, if you provide a string of a1 ?? a2 ?? ... ?? an
then the result is the first non-null one (or null if
all of them are null).
One nice feature is that if the type of a is a nullable type and the type of b
is the equivalent "non-nullable" type, the result of the expression is
that non-nullable type. For example, you can do:
|
This is possible because the compiler knows that if the first expression evaluates to null it
will use the second expression. In this case, we're effectively using GetSomeValueMaybe()
with a kind of "default value" of 5.
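The original snippet is missing, so this is only a sketch. The body of GetSomeValueMaybe() is a hypothetical stand-in which always returns null and counts its calls, to show that the left operand is evaluated once.

```csharp
using System;

class Program
{
    static int calls = 0;

    // Hypothetical stand-in for the method mentioned in the text.
    static int? GetSomeValueMaybe()
    {
        calls++;
        return null;
    }

    static void Main()
    {
        // int? on the left, int on the right: the result is a plain int.
        int x = GetSomeValueMaybe() ?? 5;
        Console.WriteLine(x);      // 5
        Console.WriteLine(calls);  // 1

        // Chaining: the result is the first non-null operand.
        string a = null, b = null, c = "third";
        Console.WriteLine(a ?? b ?? c);  // third
    }
}
```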
The usual details about conversions and the precise rules which are applied can be found
in the spec.
Delegate changes
C# 2.0 has made delegates even easier to work with, using delegate inference,
covariance/contravariance and anonymous methods.
All of these changes are useful today, but many will become even more important in C# 3.0.
This article assumes a reasonable knowledge of delegates. If you are new to the subject, please read
my article on delegates and events first.
Delegate inference
This is actually a smaller topic than anonymous methods, but it'll make the topic of
anonymous methods easier to understand, by keeping the code smaller. Basically,
delegate inference lets you create a delegate instance without the new DelegateType
part wherever the compiler can work out which delegate you mean. Here's an example:
|
This can be used with events (e.g. button.Click += ClickHandler;) and
anonymous methods. Occasionally you will need to tell the compiler
what kind of delegate you're trying to build, in which case the old syntax is the way to go.
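A minimal sketch of the two syntaxes side by side, using a hypothetical Printer delegate type:

```csharp
using System;

// Hypothetical delegate type for illustration.
delegate void Printer(string text);

class Program
{
    static void Print(string text)
    {
        Console.WriteLine("Printing: " + text);
    }

    static void Main()
    {
        Printer oldStyle = new Printer(Print);  // C# 1 syntax, still valid
        Printer newStyle = Print;               // C# 2.0: the compiler infers the delegate type

        oldStyle("hello");  // Printing: hello
        newStyle("world");  // Printing: world
    }
}
```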
Covariance and contra-variance
In earlier versions of C#, the signature of the method used to implement a delegate
had to match the delegate signature exactly, including return type. In C# 2.0,
return types can be covariant and parameter types can be contra-variant. Now, these
are big words which basically mean "you're allowed to use any method which is guaranteed
not to muck things up". To be specific:
- Return type covariance means that if a delegate is declared to return a base type,
  an instance can be created using a method which is declared to return a type derived from it.
- Parameter type contra-variance means that if a delegate parameter is declared to be
  a derived type, an instance can be created using a method which has a base type for that
  parameter.
The reason this works is that any caller of the delegate must provide arguments matching
the delegate's signature (and those arguments will always be compatible with base types of the
parameters), and the implementation just has to guarantee that it will return something which
can be treated as a value of the declared return type of the delegate.
Here's an example of both covariance and contra-variance:
|
Any string argument which could be passed into the delegate is fine to be treated
as an object reference by SomeMethod, and any UTF8Encoding returned by SomeMethod
can be treated as an Encoding reference by the caller.
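The original listing is missing. The sketch below is reconstructed from the description: SomeMethod takes an object and returns a UTF8Encoding, while the delegate takes a string and returns an Encoding. The delegate type name EncodingFinder is hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical delegate type: takes a string, returns an Encoding.
delegate Encoding EncodingFinder(string name);

class Program
{
    // Broader parameter type (object) and narrower return type (UTF8Encoding)
    // than the delegate declares.
    static UTF8Encoding SomeMethod(object ignored)
    {
        return new UTF8Encoding();
    }

    static void Main()
    {
        // Allowed in C# 2.0 thanks to parameter contra-variance and return covariance.
        EncodingFinder finder = SomeMethod;

        Encoding encoding = finder("anything");
        Console.WriteLine(encoding.GetType());  // System.Text.UTF8Encoding
    }
}
```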
Anonymous methods
Delegates are a wonderful idea. They make eventing models simpler, and get away
from a lot of single method interfaces (or worse, multiple method interfaces with adapters
which do nothing until they're overridden) which afflict Java. Java came up with a way to
make it slightly less painful to override a single method than it would be otherwise:
anonymous inner classes. These allow you to derive from a class or implement an interface
"inline" in the middle of a method. They allow you to use local variables from the method
you're running in (with restrictions) and even private variables and methods from the class
you're running in. As you're deriving from one class and have implicit access to an instance
of another, it's almost as if you get multiple inheritance. Almost. Unfortunately, they
look pretty ugly. Here's an example:
|
This creates an implementation of the SomeActionable interface inline - it just prints out the
first command line parameter specified. It's often better than having a whole class just for the sake
of doing something (especially if you have lots of different implementations, each of which is really
just a line of code), but it's ugly to look at and gets worse if you need to distinguish between
accessing the class you're actually deriving from and the class you're declaring the anonymous inner class in.
Now, back to C#. In versions prior to 2.0, you couldn't create a delegate instance unless you had a normal method
which had the code you wanted to run. When the delegate only needs to do one thing (e.g. pass on the call to
another component, with some different parameters) this can be a bit of a pain - it's often much more readable to
see the code you're going to execute at the point you use the delegate, and all those extra methods usually do
nothing outside their use for constructing delegates. In addition, if you need to use some context which isn't already
in your object within a delegate implementation (e.g. a local variable), you sometimes need to create a whole extra class to
encapsulate that context. Yuk. Fortunately C# 2.0 has a delegate equivalent to Java's anonymous inner classes - but without
as much mess.
Here's a rough equivalent of the Java code above, this time in C# 2.0:
|
Now, admittedly part of the reason this is so much shorter is the lack of comments and the way I've included the inline
code without putting line breaks before/after the braces. However, because there's so little to the delegate
definition, I could afford to do that without losing readability.
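The original listing is missing; a minimal sketch of an anonymous method which prints the first command line parameter might look like this. The delegate type name SomeAction is hypothetical, and the fallback for a missing argument is added just so the sketch runs without one.

```csharp
using System;

// Hypothetical delegate type with no parameters and no return value.
delegate void SomeAction();

class Program
{
    static void Main(string[] args)
    {
        // Fallback so the sketch also runs with no command line arguments.
        if (args.Length == 0)
        {
            args = new string[] { "(no argument given)" };
        }

        // The code to run is written inline, right where the delegate is created.
        SomeAction action = delegate { Console.WriteLine(args[0]); };

        action();
    }
}
```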
Similar to Java and anonymous classes, when the C# compiler encounters an anonymous method, it either creates an extra (nested) type
or an extra method within the current type behind the scenes to contain the delegate implementation. It's easy to see
why an extra method would be required - but why would an extra type be needed? It's all to do with captured variables...
Captured variables
This is where things can become slightly difficult. Like many things, captured variables are fabulously useful,
but to start with they can be hard to understand, and they can change some of what you may think you already know about C#.
The purpose of them is very simple though - they're there to make local variables from the method declaring an anonymous method
available within the delegate itself.
Static and instance variables of the class don't need to be captured - normal instance and static methods can be created by the
compiler to access those. However, look at the example given above. How can the delegate "see" the value of args
to print out args[0]? In the code above, it's fairly simple - the delegate is only invoked within the method - but
the delegate could be returned from the method, or passed to another method, or stored in a variable somewhere. Just to
make things concrete, here's an example which does just that:
|
Here, a local variable is available in MakeDelegate, and the delegate uses that local variable
to print a new random number every time it's invoked. But what's the scope of that variable? Does the variable even
exist outside MakeDelegate? Well, the answer is yes - because the compiler compiles it into an instance
variable in a new type behind the scenes. If you compile the above and look at it with ildasm, you'll see
that there actually isn't a Random local variable in MakeDelegate at all, as far as
the runtime is concerned! Instead, there's a local variable of a nested type with some compiler-generated name (the time I
compiled it, the name was <>c__DisplayClass1). That type has a Random instance variable,
and when rng is assigned or used within MakeDelegate, it's that variable which is actually
used. The method used to implement the delegate signature is a member of the nested type, so it is able to get at
the Random instance even after MakeDelegate has finished executing.
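The original listing is missing. A minimal sketch matching the description (a MakeDelegate method whose local variable rng is captured by the returned delegate) could be the following; the delegate type name SomeAction is hypothetical.

```csharp
using System;

// Hypothetical delegate type with no parameters and no return value.
delegate void SomeAction();

class Program
{
    static SomeAction MakeDelegate()
    {
        // A local variable, captured by the anonymous method below.
        Random rng = new Random();

        return delegate { Console.WriteLine(rng.Next()); };
    }

    static void Main()
    {
        SomeAction action = MakeDelegate();

        // MakeDelegate has returned, but rng is still alive inside the delegate.
        action();
        action();
    }
}
```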
Things become really tricky when there are multiple local variables being used in the delegate, in particular when
some of them are effectively created several times. This is best demonstrated with an example:
|
Don't worry if it takes you a while to understand the output - it certainly threw me! Look carefully at
where x and y are declared. There's only one x variable for the whole method,
but there's a "new" y variable each time we go through the loop. That means that all the
delegate instances share the same x variable, but each one has a different y
variable. That's why the first number printed keeps going up, but the second number printed only goes up when the same
delegate instance is called again. (If another instance of the delegate was created inside the loop, then each pair
of delegate instances would share the same y variable.) Any access to the variable within the main body
of the method uses the "current" set of variables - so any change to y within the loop itself would
affect just the instance created in that iteration of the loop.
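The sharing described above can be sketched as follows. This is a reconstruction from the description rather than the original listing, and the delegate type name SomeAction is hypothetical.

```csharp
using System;

// Hypothetical delegate type with no parameters and no return value.
delegate void SomeAction();

class Program
{
    static void Main()
    {
        SomeAction[] actions = new SomeAction[3];

        int x = 0;  // declared once: shared by every delegate instance

        for (int i = 0; i < actions.Length; i++)
        {
            int y = 0;  // declared inside the loop: a "new" y per iteration

            actions[i] = delegate
            {
                Console.WriteLine("{0} {1}", x, y);
                x++;
                y++;
            };
        }

        actions[0]();  // 0 0
        actions[0]();  // 1 1
        actions[1]();  // 2 0
        actions[2]();  // 3 0
        actions[0]();  // 4 2
    }
}
```

The first number keeps rising whichever delegate is called, while the second only rises for the delegate instance that is called again.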
The reason this is contrary to intuition is that usually it doesn't really matter where a variable is declared
within a method (assuming you don't want to use the same name elsewhere in the method, and that you've made the scope
large enough to access it everywhere you want to). The location of the assignments matters, of course, but the actual
declarations have previously only affected the scope of the variable. They now affect the behaviour, as seen above -
the values of x and y behave very differently in the delegate instances.
In case you're wondering how some variables are shared and some aren't, in the above code two extra types
are created - one with an x variable, and one with a y variable and another variable
referring to an instance of the first extra type. The delegate implementation is a member of the second type,
and each instance of the second type has a reference to the same instance of the first type. You can imagine
how complicated things must get when even more variables are involved!
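To make that arrangement concrete, here is a hand-written approximation of the two types. The names OuterScope and LoopScope are hypothetical (the compiler picks unspeakable names of its own), and the real generated code will differ in detail; this is a sketch of the shape, not of the actual output.

```csharp
using System;

// Hypothetical delegate type with no parameters and no return value.
delegate void SomeAction();

// Stand-in for the first extra type: holds the single shared x.
class OuterScope
{
    public int x;
}

// Stand-in for the second extra type: its own y, plus a reference to the shared scope.
class LoopScope
{
    public OuterScope outer;
    public int y;

    // The delegate implementation is a member of this type.
    public void Run()
    {
        Console.WriteLine("{0} {1}", outer.x, y);
        outer.x++;
        y++;
    }
}

class Program
{
    static void Main()
    {
        SomeAction[] actions = new SomeAction[3];

        OuterScope outer = new OuterScope();  // created once for the method
        outer.x = 0;

        for (int i = 0; i < actions.Length; i++)
        {
            LoopScope scope = new LoopScope();  // created once per iteration
            scope.outer = outer;
            scope.y = 0;
            actions[i] = scope.Run;
        }

        actions[0]();  // 0 0
        actions[1]();  // 1 0
        actions[0]();  // 2 1
    }
}
```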
As I said before, captured variables are very useful - but they come at the price of readability. Examples like
the code above, where different variables are shared or not shared, should be rare in real code. Anything where
the reader has to work out just where a variable came from and which variables will be associated with which
delegates is going to be a pain at maintenance time.