28
©2010 Microsoft Corporation. All rights reserved. Building a LINQ Provider By Pedram Rezaei, Microsoft Corporation You can download the code for this article from here [ http://code.msdn.microsoft.com/SimpleLingToDatabase.aspx ] . Throughout this article, we will build a simple IQueryable provider similar to the LINQ to SQL provider demonstrating what is needed to build a custom provider that is capable of executing LINQ queries and performing create, update and delete operations. Language-Integrated Query (LINQ) provides a unified querying model for accessing data regardless of where that data is stored. This groundbreaking innovation bridges the gap between the world of objects and the world of data. If you are new to LINQ, Don Box and Anders Hejlsberg have a great overview here [ http://msdn.microsoft.com/en-us/library/bb308959.aspx ] . There are a few articles either on MSDN or on other websites describing the steps required to build a custom LINQ provider – most notably this one [ http://blogs.msdn.com/mattwar/archive/2007/07/30/linq-building-an-iqueryable-provider-part- i.aspx ] . In here, you will get an end-to-end and in-depth understanding of most of the elements involved in building a mini provider that is capable of performing simple queries and managing write operations (i.e. create, update and delete) as well as implementing a simple object tracker that helps with those operations. One of the more popular LINQ providers is the LINQ to SQL [ http://msdn.microsoft.com/en- us/library/bb386976.aspx ] provider that translates LINQ queries to T-SQL queries for Microsoft SQL Server. This provider also manages updates, deletes and inserts. In here, I will create a mini LINQ to SQL provider that with a few tweaks should also work with other relational databases such as Oracle and MySql. The techniques used by LINQ for querying data differ based on whether the data containers are in- memory objects or the data source is a separate system such as Microsoft SQL Server or Active Directory. In simple terms, say one is performing a query on a collection of in-memory Customer objects to find all those that live in London. This is usually achieved by enumerating all customer objects and selecting the ones that satisfy this predicate. This technique however is not efficient when dealing with data that is stored in a durable data store such as SQL Server. In such a case, we need a way of converting the original query to T-SQL (the domain-specific language for SQL Server) and benefit from SQL Server’s powerful querying engine. After all, it does not make sense to load all customers as objects into the managed heap and then loop through them to find the ones that live in London – what if we had a few million customers in the database? Based on the above challenges, LINQ takes two different approaches to executing queries so that it is possible to target almost any source of data: LINQ to Objects If the source of data is a collection of in-memory objects, as long as the source collection implements the IEnumerable<T> [ http://msdn.microsoft.com/en-us/library/9eekhta0.aspx ] interface, LINQ uses a set of extension methods [ http://blogs.msdn.com/pedram/archive/2007/06/02/extension-methods- in-c-3-0.aspx ] (known as query operators) defined in the static System.Linq.Enumerable [ http://msdn.microsoft.com/en-us/system.linq.enumerable.aspx ] class to perform the query. This technique is known as “LINQ to Objects”. With LINQ to Objects, the execution of queries is performed in managed code (IL) which makes this technique inefficient when querying data that is not stored as objects in the managed memory. For instance when querying data stored in a relational database, it might not be reasonable to load all the entities into the managed memory just to perform some simple operations that could have been better performed by the database engine. Obviously a better approach is to take the Query Expression [ http://msdn.microsoft.com/en-us/library/bb384065.aspx ] and translate it to a SQL statement with a proper WHERE clause and let the relational database engine to perform the query. That is where the IQueryable [ http://msdn.microsoft.com/en-us/library/system.linq.iqueryable.aspx ] interface plays a big part: Page 1 of 28 Building a LINQ Provider 5/14/2010 http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Custom LINQ Provider

Embed Size (px)

DESCRIPTION

Custom LINQ Provider

Citation preview

Page 1: Custom LINQ Provider

©2010 Microsoft Corporation. All rights reserved.

Building a LINQ Provider

By Pedram Rezaei, Microsoft Corporation

You can download the code for this article from here

[ http://code.msdn.microsoft.com/SimpleLingToDatabase.aspx ] .

Throughout this article, we will build a simple IQueryable provider similar to the LINQ to SQL provider

demonstrating what is needed to build a custom provider that is capable of executing LINQ queries and

performing create, update and delete operations.

Language-Integrated Query (LINQ) provides a unified querying model for accessing data regardless of

where that data is stored. This groundbreaking innovation bridges the gap between the world of objects

and the world of data. If you are new to LINQ, Don Box and Anders Hejlsberg have a great overview

here [ http://msdn.microsoft.com/en-us/library/bb308959.aspx ] .

There are a few articles either on MSDN or on other websites describing the steps required to build a

custom LINQ provider – most notably this one

[ http://blogs.msdn.com/mattwar/archive/2007/07/30/linq-building-an-iqueryable-provider-part-

i.aspx ] . In here, you will get an end-to-end and in-depth understanding of most of the elements

involved in building a mini provider that is capable of performing simple queries and managing write

operations (i.e. create, update and delete) as well as implementing a simple object tracker that helps

with those operations.

One of the more popular LINQ providers is the LINQ to SQL [ http://msdn.microsoft.com/en-

us/library/bb386976.aspx ] provider that translates LINQ queries to T-SQL queries for Microsoft SQL

Server. This provider also manages updates, deletes and inserts. In here, I will create a mini LINQ to

SQL provider that with a few tweaks should also work with other relational databases such as Oracle

and MySql.

The techniques used by LINQ for querying data differ based on whether the data containers are in-

memory objects or the data source is a separate system such as Microsoft SQL Server or Active

Directory.

In simple terms, say one is performing a query on a collection of in-memory Customer objects to find

all those that live in London. This is usually achieved by enumerating all customer objects and selecting

the ones that satisfy this predicate. This technique however is not efficient when dealing with data that

is stored in a durable data store such as SQL Server. In such a case, we need a way of converting the

original query to T-SQL (the domain-specific language for SQL Server) and benefit from SQL Server’s

powerful querying engine. After all, it does not make sense to load all customers as objects into the

managed heap and then loop through them to find the ones that live in London – what if we had a few

million customers in the database?

Based on the above challenges, LINQ takes two different approaches to executing queries so that it is

possible to target almost any source of data:

LINQ to Objects

If the source of data is a collection of in-memory objects, as long as the source collection implements

the IEnumerable<T> [ http://msdn.microsoft.com/en-us/library/9eekhta0.aspx ] interface, LINQ uses

a set of extension methods [ http://blogs.msdn.com/pedram/archive/2007/06/02/extension-methods-

in-c-3-0.aspx ] (known as query operators) defined in the static System.Linq.Enumerable

[ http://msdn.microsoft.com/en-us/system.linq.enumerable.aspx ] class to perform the query. This

technique is known as “LINQ to Objects”.

With LINQ to Objects, the execution of queries is performed in managed code (IL) which makes this

technique inefficient when querying data that is not stored as objects in the managed memory. For

instance when querying data stored in a relational database, it might not be reasonable to load all the

entities into the managed memory just to perform some simple operations that could have been better

performed by the database engine. Obviously a better approach is to take the Query Expression

[ http://msdn.microsoft.com/en-us/library/bb384065.aspx ] and translate it to a SQL statement with a

proper WHERE clause and let the relational database engine to perform the query. That is where the

IQueryable [ http://msdn.microsoft.com/en-us/library/system.linq.iqueryable.aspx ] interface plays a

big part:

Page 1 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 2: Custom LINQ Provider

C# Copy Code

LINQ with IQueryable

If the source of a query expression is an IQueryable<T> [ http://msdn.microsoft.com/en-

us/library/bb351562.aspx ] instead of the raw IEnumerable<T> [ http://msdn.microsoft.com/en-

us/library/9eekhta0.aspx ] then the query is represented by a set of objects that preserves the

structure of the actual query. This tree of objects is called an Expression Tree.

An expression tree can be given to a LINQ provider where it is usually translated to a domain-specific

query language – or anything else for that matter. For example in the case of SQL Server, it can be

translated to T-SQL, whereas in the case of Active Directory the same expression tree can be converted

to an LDAP filter. With this approach, it is possible to develop all sorts of providers such as the ones

shown here [ http://blogs.msdn.com/charlie/archive/2008/02/28/link-to-everything-a-list-of-linq-

providers.aspx ] .

This article demonstrates how to create a mini IQueryable LINQ provider.

Before moving on, it is imperative to have a deep understanding of expression trees. If you are puzzled

by what they are, let me show you a simple example. Imagine the following simple expression (where

‘i’ is an integer):

i <= 10

The expression tree representing this expression is a tree of objects that is shown below:

1 : Expression Tree for “ i <= 10”

At its root, this expression is captured by an instance of the BinaryExpression class. This object has a

Left and a Right property of type Expression. The Left expression is a ParameterExpression called ‘i’ of

type ‘Int32’ and the Right expression is a ConstantExpression representing the constant integer ‘10’. Now given this tree of objects, different providers can translate them into different domain-specific

languages.

If the source of the LINQ Query is of type IQueryable<T> then the C# compiler automatically – at

compile time – converts the query to an expression tree. Reflecting on the generated IL will show the

expression tree representing the original expression. For example the query expression below is

automatically converted to its expression tree representation at compile time:

var q = from c in dc.Customers

where c.City == "London"

select c;

Page 2 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 3: Custom LINQ Provider

C# Copy Code

C# Copy Code

ParameterExpression pc = Expression.Parameter(typeof(Customer), "c");

IQueryable<Customer> q3 =

dc.Customers.Where<Customer>(

Expression.Lambda<Func<Customer, bool>>(

Expression.Equal(

Expression.Property(pc, typeof(Customer).GetProperty("City")),

Expression.Constant("London", typeof(string))

), new ParameterExpression[] { pc }));

Nothing stops you of building an expression tree manually, however it is simpler to let the compiler to

manage this for you. The following shows how to manually build an expression tree that represents the

‘ i <= 10 ’ expression:

BinaryExpression e = Expression.LessThanOrEqual(

Expression.Parameter(typeof(int), "i"),

Expression.Constant(10, typeof(int)));

Before moving on, there is a simple Visual Studio debug-time Visualizer (shown in Figure 1 ) that helps

to visualize expression trees. I found this to be extremely useful when developing this custom LINQ provider. Please see here [ http://msdn.microsoft.com/en-us/library/bb397975.aspx ] for how to

install and enable this for Visual Studio 2008.

As Figure 2 shows, our custom provider will translate LINQ queries (expression trees) to T-SQL and the

result-sets returned from the database are mapped to objects using an object-relational mapper (ORM)

that is discussed next. Write operations on the other hand, are tracked in memory and only executed

when the application explicitly requests to submit the changes.

Page 3 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 4: Custom LINQ Provider

C# Copy Code

C# Copy Code

2 : Custom provider in action

As this provider is also an Object-Relational Mapper (ORM), it must provide the set of functionality

described in this section.

The mapping information between objects and database entities (tables, views, rows and associations)

must be captured. The technique used here is the same as the one used by LINQ to SQL in which

TableAttribute [ http://msdn.microsoft.com/en-

us/library/system.data.linq.mapping.tableattribute.aspx ] and ColumnAttribute

[ http://msdn.microsoft.com/en-us/library/system.data.linq.mapping.columnattribute.aspx ] custom

attributes are used to decorate classes and their members that correspond to database tables and

columns respectively. Here is an example of such a mapper:

[Table(Name = "dbo.Products")] public partial class Product { // ... [Column(Storage = "_ProductID", DbType = "Int NOT NULL IDENTITY", IsPrimaryKey = true, IsDbGenerated = true)] public string ProductID { get; set; }

This class is mapped to a database table called ‘Products’ in the ‘dbo’ schema. The ‘ProductID’ property

is associated to the ‘ProductID’ column of the ‘Products’ table. Throughout the rest of this article I use

the terms “Entity Classes” to refer to mapping classes and “Entities” to refer to their instances.

LINQ to SQL uses SqlMetal.exe [ http://msdn.microsoft.com/en-us/library/bb386987.aspx ] that comes

as part of the Windows SDK [ http://msdn.microsoft.com/en-us/windowsvista/bb980924.aspx ] or the

“LINQ to SQL Classes” item in Visual Studio to automatically generate the entity classes. As our

provider uses a very similar mapping technique to LINQ to SQL, we can still use these tools to auto-

generate the entity classes.

An ORM also requires a mechanism for converting database result-sets to collections of objects. A basic

implementation of such a component is also discussed later in this article.

Other components necessary for this custom provider are:

An implementation of IQueryable<T> that can be used as the source of LINQ queries

An implementation of IQueryProvider to execute expression trees. This class should

eventually achieve the following:

Convert expression trees to T-SQL (domain-specific language)

Execute T-SQL

Map the results of the query to a single object or a sequence of objects

One last element to add is an object tracker. This is required to track entities instantiated by the

mapper. Without it, how would the provider know that an entity has been changed by the application so

that it can generate a SQL UPDATE statement?

The first step of building a custom provider is to define a source, and the source has to implement

IQueryable. The generic IQueryable<T> is identical to the non-generic IQueryable with the only

difference that it also implements IEnumerable<T>. The non-generic version of IQueryable<T> interface is defined as:

public interface IQueryable : IEnumerable { Type ElementType { get; } Expression Expression { get; } IQueryProvider Provider { get; } }

The Expression property gives the expression tree that represents the current user query.

System.Linq.Queryable [ http://msdn.microsoft.com/en-us/library/system.linq.queryable.aspx ] is the

equivalent class to System.Linq.Enumerable [ http://msdn.microsoft.com/en-

Page 4 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 5: Custom LINQ Provider

C# Copy Code

C# Copy Code

C# Copy Code

us/library/system.linq.enumerable.aspx ] used for LINQ to Objects and provides a set of LINQ query

operators such as Where and Select for IQueryable LINQ providers. Each query operator defined in this

class represents the requested operation as a node of within the expression tree. For example the

Where query operator looks similar to the following:

public static IQueryable<T> Where<T>(

this IQueryable<T> source,

Expression<Func<T, bool>> predicate)

{

// Use reflection and find the current method that we are in

MethodInfo currentMethod = (MethodInfo)MethodBase.GetCurrentMethod();

// Build the expression tree node representing a method call

// to this Where method

MethodCallExpression e =

Expression.Call(

null,

currentMethod,

new Expression[] {

source.Expression,

Expression.Quote(predicate) });

// ...

When this Where query operator is used as a filter to an IQueryable<T>, it simply builds a new

IQueryable<T> adding the above MethodCallExpression as its root node. This is how expression trees

are built. For example the following simple query builds a pipeline of IQueryables each adding new

nodes to the expression tree:

var q =

dc.Customers

.Where(c => c.City == "London")

.Select(c => c.ContactName);

So far, we established that we need an IQueryable source for our provider. As database queries usually

start from a database table, it is only logical to define a source representing a database table.

The only query source provided by this mini provider is the DatabaseTable<TEntity> class that directly

represents a database table. An instance of this class can be used as the source for LINQ queries:

Figure 3 : DatabaseTable<TEntity> and its interfaces

The DatabaseTable<TEntity> sits at the heart of our expression trees and is represented by a simple

constant expression:

Page 5 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 6: Custom LINQ Provider

C# Copy Code

public sealed class DatabaseTable<TEntity> : IQueryable<TEntity>, ...

{

...

public Expression Expression

{

get { return Expression.Constant(this); }

}

DatabaseTable<TEntity> is an internal class meaning that an instance of this class cannot be created

by others. The only way to use it is through another class called DatabaseDataContext which is the only

entry point to this custom provider. When the DatabaseDataContext is instantiated, the user should

pass in a connection string to the database. They can then call the GetTable<T>() method to access an

instance of the DatabaseTable<T> which is the IQueryable<T> source for any subsequent LINQ queries:

SqlConnection cnn = new SqlConnection(Settings.Default.NorthwindCnnString);

DatabaseDataContext dc = new DatabaseDataContext(cnn); IQueryable<Customer>

customers = dc.GetTable<Customer>();

The main job of the DatabaseDataContext is to be the only facade that users would need in order to

consume this LINQ provider.

What happens when GetTable<TEntity>() is called?

If this is the first time that an IQueryable of type TEntity is asked for, reflection is used on TEntity to

create a tree of objects that represent the mapping metadata declared through the use of

TableAttribute [ http://msdn.microsoft.com/en-

us/library/system.data.linq.mapping.tableattribute.aspx ] and ColumnAttribute[ http://msdn.microsoft.com/en-us/library/system.data.linq.mapping.columnattribute.aspx ] custom

attributes. This metadata is represented by the following three classes:

Figure 4 : Classes representing mapping metadata

A DatabaseMetaTable represents a mapper entity class such as the Product class that was shown

earlier. It also includes a collection of type DatabaseMetaDataMember that are objects representing all

Page 6 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 7: Custom LINQ Provider

C# Copy Code

fields and properties defined in the TEntity mapper class (e.g. the ProductID property in the Product

class).

The PersistentDataMembers property of the DatabaseMetaTable is a subset of DataMembers

representing a collection of properties or fields that are directly mapped to a database column

(remember, not all data members necessarily have the ColumnAttribute

[ http://msdn.microsoft.com/en-us/library/system.data.linq.mapping.columnattribute.aspx ] meaning

that it is possible to have data members that are not associated to a database column). The

IdentityMembers property contains metadata for columns that are used as primary keys by the entity.

The DatabaseMetaModel holds an in-memory cache of DatabaseMetaTables to avoid the expensive cost

of reflection every time that GetTable<TEntity>() is called (trading memory consumption for speed).

DatabaseMetaDataMember has some important properties such as IsDbGenerated that tells the

provider whether the value of the column is automatically generated by the database engine and

therefore should be ignored when performing write operations. These properties are usually mapped directly to properties of the ColumnAttribute.

Here is a description of some of the other interesting properties of the DatabaseMetaDataMember:

Property Description

CanBeNull Is Null an acceptable value for this item? For instance, if the data member is mapped to a database column then is this a null-able column?

DbType If a database column then the DbType represents the real SQL data type for this column.

DeclaringType The actual entity type (class) that contains this field or property

IsPersistent Is this member associated to a database column?

MappedName The actual name of the column in the database. For example the column

name could be “ContactName” whereas the property representing this column might have been called “CustomerName”.

Member The MemberInfo [ http://msdn.microsoft.com/en-us/library/system.reflection.memberinfo.aspx ] object referencing the actual property or field of the entity class. This is used to gain access to the value of this member.

StorageMember Similar to the Member property, with the difference that it represents the underlying member that should be used by the mapper to set the value of a cell read from the database – should the setter of a property be called when the value of the column is read from the database or should the mapper find the underlying field and set its value directly?

Based on the mapping metadata for TEntity, GetTable<TEntity>() creates and returns an instance of

DatabaseTable<TEntity>:

public DatabaseTable<TEntity> GetTable<TEntity>() where TEntity : class

{

CheckDispose();

DatabaseMetaTable metaTable = _metaModel.GetTable(typeof(TEntity));

if (metaTable == null)

throw new Exception(

string.Format("{0} is not decorated with the TableAttribute.",

typeof(TEntity).Name));

IDatabaseTable table = GetTable(metaTable);

if (table.ElementType != typeof(TEntity))

throw new Exception(

string.Format("It was not possible to find a table for type {0}",

Page 7 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 8: Custom LINQ Provider

C# Copy Code

C# Copy Code

typeof(TEntity).Name));

return (DatabaseTable<TEntity>)table;

}

private IDatabaseTable GetTable(DatabaseMetaTable metaTable)

{

IDatabaseTable table;

if (!_tables.TryGetValue(metaTable, out table))

{

table = (IDatabaseTable)Activator.CreateInstance(

typeof(DatabaseTable<>).MakeGenericType(

new Type[] { metaTable.EntityType }),

BindingFlags.NonPublic | BindingFlags.Public |

BindingFlags.Instance, null,

new object[] { this, metaTable }, null);

_tables.Add(metaTable, table);

} return table;

}

Phew! We now have an instance of IQueryable<T> that gives access to the expression tree

representing the query. The next step is to define an IQueryProvider to execute the query.

The IQueryable interface has a property called Provider that gives access to an object implementing the

IQueryProvider interface. The IQueryProvider has 2 methods (with a couple of overloads) one that

takes an expression tree and creates an IQueryable for it and the other which executes an expression

tree:

public interface IQueryProvider

{

IQueryable<TElement> CreateQuery<TElement>(Expression expression);

TResult Execute<TResult>(Expression expression); ...

}

For cleanness of design, our IQueryable<T> implementer (DatabaseTable<T>) is also an

IQueryProvider, so the implementation of the properties required by IQueryable<T> should look similar

to the following:

public sealed class DatabaseTable<TEntity> : IQueryable<TEntity>, ...

{

... public Type ElementType

{

get { return typeof(TEntity); }

}

public Expression Expression

{ get { return Expression.Constant(this); }

}

public IQueryProvider Provider

{

get { return this; }

}

Page 8 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 9: Custom LINQ Provider

C# Copy Code

C# Copy Code

Query Execution

There are 2 ways that a LINQ query can start executing. You can explicitly call the Execute method of

the provider, or rely on the call to IEnumerable.GetEnumerator() when enumerating over the

IQueryable – remember that an IQueryable is also an IEnumerable.

Not all queries result in a sequence of entities. For those query operators such as Sum, Max, Min, Count

and First, a call is explicitly made to the Execute method of the provider. However for other queries,

the enumerator implicitly executes the query.

Because both Execute() and GetEnumerator() attempt to execute the query, it is only logical to factor

the execution logic into one central place. For this purpose, we will create a class called

DatabaseProvider which DatabaseTable<T> will delegate the calls to:

public sealed class DatabaseTable<TEntity> : IQueryProvider ...

{

public IEnumerator<T> GetEnumerator()

{

return ((IEnumerable<T>)provider.

Execute(Expression.Constant(this))).GetEnumerator();

}

object IQueryProvider.Execute(Expression expression)

{

return provider.Execute(expression);

}

As you can see, most of the work is delegated to the DatabaseProvider object accessible through the

provider member. The provider model now looks like Figure 5 .

5 : Providers

The task of the Execute method is well defined. It receives an expression tree representing the user

query and returns a sequence of entities or a single entity whichever the query has asked for:

public object Execute(Expression query)

{

Page 9 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 10: Custom LINQ Provider

C# Copy Code

CheckDispose(); // translate QueryTranslator translator = new QueryTranslator(_dataContext.Model); QueryInfo info = translator.Translate(query); // execute return Execute(info);

}

As you can see, it first has to translate the expression tree to an executable T-SQL statement that is

returned as plain text (accessible through info.QueryText ). The call to the second Execute method will

finally execute the plain T-SQL statement. Here is a simplified version of this Execute method:

private object Execute(QueryInfo info) { // get a database connection DbConnection connection = _cnnManager.GetOpenConnection(); try

{ // execute the query DbCommand command = connection.CreateCommand(); command.CommandText = info.QueryText;

DbDataReader reader = command.ExecuteReader(CommandBehavior.SingleResult);

// map the database rows into objects (entities) // ...

// return the mapped entities // ... } finally

{ connection.Close(); } }

Query Translation

Before looking at how the mapping is done, let’s focus on the query translation process. This is where

expression trees are converted to T-SQL queries. There are a number of classes that are involved in

this process:

Page 10 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 11: Custom LINQ Provider

6 : Query Translation Classes

The QueryTranslator is responsible for walking the expression tree and find any method calls to Where

[ http://msdn.microsoft.com/en-us/library/system.linq.queryable.where.aspx ] , Select

[ http://msdn.microsoft.com/en-us/library/system.linq.queryable.select.aspx ] , OrderBy

[ http://msdn.microsoft.com/en-us/library/system.linq.queryable.orderby.aspx ] , ThenBy

[ http://msdn.microsoft.com/en-us/library/system.linq.queryable.thenby.aspx ] , OrderByDescending

[ http://msdn.microsoft.com/en-us/library/system.linq.queryable.orderbydescending.aspx ] ,

ThenByDescending [ http://msdn.microsoft.com/en-

us/library/system.linq.queryable.thenbydescending.aspx ] , Take [ http://msdn.microsoft.com/en-

us/library/bb300906.aspx ] , First [ http://msdn.microsoft.com/en-

us/library/system.linq.queryable.first.aspx ] and FirstOrDefault [ http://msdn.microsoft.com/en-

us/library/system.linq.queryable.firstordefault.aspx ] extension methods (LINQ query operators)

defined in the System.Linq.Queryable [ http://msdn.microsoft.com/en-

us/library/system.linq.queryable.aspx ] class. You may have noticed that this mini LINQ provider does

not support other query operators such as Skip [ http://msdn.microsoft.com/en-us/library/bb357513.aspx ] and Distinct [ http://msdn.microsoft.com/en-

us/library/system.linq.queryable.distinct.aspx ] . However you should be able to add many of those

relatively simply.

At this point, it is worth mentioning some of the limitations of this mini custom LINQ Provider:

It does not support sub-queries or queries that require references to more than one

database table

It does not support queries that may return more than one result-set

For a full list of query operators offered by System.Linq.Queryable see here

[ http://msdn.microsoft.com/en-us/library/system.linq.queryable_members.aspx ] . Any

operator that is not mentioned above is not supported.

No more than one Where, Select, Take, First or FirstOrDefault operator can exist in an

expression tree.

Page 11 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 12: Custom LINQ Provider

C# Copy Code

Take, First or FirstOrDefault operators can only appear at the bottom of an expression (the

root of the expression tree)

Not all SQL Server and CLR data types are supported. For instance conversion from some

SQL data types to CLR types may fail.

Partial evaluation of expression trees is not carried out meaning that, for example, access

to local variables can cause exceptions. See here

[ http://blogs.msdn.com/mattwar/archive/2007/08/01/linq-building-an-iqueryable-provider

-part-iii.aspx ] for a generic implementation of a partial evaluator.

Earlier in the Expression Trees section, it became apparent that expression trees can represent any

type of expressions. In fact, they can even include method calls (MethodCallExpression) and lambda

expressions (LambdaExpression). Although this level of flexibility makes expression trees an incredibly

powerful tool to describe expressions, it adds to the complexity of LINQ providers as they need to

analyse the expression tree and usually convert it to a domain-specific language.

In order to reduce this complexity, many LINQ providers only support a subset of the query operators

and expression tree combinations. For example the first version of the LINQ provider for ADO.NET Data

Services released with .NET Framework 3.5 SP1 does not support the Sum query operator:

dm.Books.Sum(b => b.Price);

To make this process even easier, it is possible to codify a common pattern for visiting all nodes in an

expression tree. Visiting a tree is a well-known topic and there are many algorithms that make it trivial

to visit all the nodes and receive a notification when a node of a given type is found.

A very interesting approach used by the .NET Framework 3.5 implementation of LINQ to SQL is to

define a generic expression visitor that can be inherited by all interested parties (the abstract

ExpressionVisitor class shown in Figure 6 ). It consists of many virtual methods such as VisitBinary and

VisitConstant. If you are looking for a ConstantExpression then the VisitConstant is called when the

expression tree is being visited by the ExpressionVisitor, so all you need to do is to override the

VisitConstant method of the base class (ExpressionVisitor).

Here is the code listing of one of main entry method of this class: the Visit method that starts the whole

process:

protected virtual Expression Visit(Expression exp)

{

if (exp == null)

{

return exp;

}

switch (exp.NodeType)

{

case ExpressionType.Add:

case ExpressionType.AddChecked:

case ExpressionType.And:

case ExpressionType.AndAlso:

case ExpressionType.ArrayIndex:

case ExpressionType.Coalesce:

case ExpressionType.Divide:

case ExpressionType.Equal:

case ExpressionType.ExclusiveOr:

case ExpressionType.GreaterThan:

case ExpressionType.GreaterThanOrEqual:

case ExpressionType.LeftShift:

case ExpressionType.LessThan:

case ExpressionType.LessThanOrEqual:

case ExpressionType.Modulo:

case ExpressionType.Multiply:

case ExpressionType.MultiplyChecked:

case ExpressionType.NotEqual:

case ExpressionType.Or:

case ExpressionType.OrElse:

case ExpressionType.Power:

case ExpressionType.RightShift:

case ExpressionType.Subtract:

case ExpressionType.SubtractChecked:

Page 12 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 13: Custom LINQ Provider

C# Copy Code

return this.VisitBinary((BinaryExpression)exp);

case ExpressionType.ArrayLength:

case ExpressionType.Convert:

case ExpressionType.ConvertChecked:

case ExpressionType.Negate:

case ExpressionType.NegateChecked:

case ExpressionType.Not:

case ExpressionType.Quote:

case ExpressionType.TypeAs:

return this.VisitUnary((UnaryExpression)exp);

case ExpressionType.Call:

return this.VisitMethodCall((MethodCallExpression)exp);

case ExpressionType.Conditional:

return this.VisitConditional((ConditionalExpression)exp);

case ExpressionType.Constant:

return this.VisitConstant((ConstantExpression)exp);

case ExpressionType.Invoke:

return this.VisitInvocation((InvocationExpression)exp);

case ExpressionType.Lambda:

return this.VisitLambda((LambdaExpression)exp);

case ExpressionType.ListInit:

return this.VisitListInit((ListInitExpression)exp);

case ExpressionType.MemberAccess:

return this.VisitMemberAccess((MemberExpression)exp);

case ExpressionType.MemberInit:

return this.VisitMemberInit((MemberInitExpression)exp);

case ExpressionType.New:

return this.VisitNew((NewExpression)exp);

case ExpressionType.NewArrayInit:

case ExpressionType.NewArrayBounds:

return this.VisitNewArray((NewArrayExpression)exp);

case ExpressionType.Parameter:

return this.VisitParameter((ParameterExpression)exp);

case ExpressionType.TypeIs:

return this.VisitTypeIs((TypeBinaryExpression)exp);

default:

throw new Exception(

string.Format("Unhandled expression type: '{0}'", exp.NodeType));

}

}

All of the translators shown in Figure 6 , including both the QueryTranslator and the WhereTranslator

classes, inherit from the abstract ExpressionVisitor. The QueryTranslator class shown in the same figure

is only interested in finding calls to Where, Select and other similar query operators. So the only

overridden method from its base class (ExpressionVisitor) is the VisitMethodCall method:

protected override Expression VisitMethodCall(MethodCallExpression mc)

{

Type declaringType = mc.Method.DeclaringType;

if (declaringType != typeof(Queryable))

throw new NotSupportedException(

"The type for the query operator is not Queryable!");

switch (mc.Method.Name)

{

case "Where":

// is this really a proper Where?

var whereLambda = GetLambdaWithParamCheck(mc);

if (whereLambda == null)

break;

VisitWhere(mc.Arguments[0], whereLambda);

break;

case "OrderBy":

case "ThenBy":

// is this really a proper Order By?

var orderLambda = GetLambdaWithParamCheck(mc);

if (orderLambda == null)

break;

Page 13 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 14: Custom LINQ Provider

C# Copy Code

VisitOrderBy(mc.Arguments[0], orderLambda, OrderDirection.Ascending);

break; case "OrderByDescending": case "ThenByDescending":

// is this really a proper Order By Descending? var orderDescLambda = GetLambdaWithParamCheck(mc);

if (orderDescLambda == null) break;

VisitOrderBy(mc.Arguments[0], orderDescLambda, OrderDirection.Descending);

break; case "Select":

// is this really a proper Select? var selectLambda = GetLambdaWithParamCheck(mc);

if (selectLambda == null) break;

VisitSelect(mc.Arguments[0], selectLambda); break;

case "Take": if (mc.Arguments.Count != 2) break;

VisitTake(mc.Arguments[0], mc.Arguments[1]); break;

case "First": // This custom provider does not support the use of a First operator // that takes a predicate. Therefore we check to ensure that no more // than one argument is provided.

if (mc.Arguments.Count != 1) break;

VisitFirst(mc.Arguments[0], false); break;

case "FirstOrDefault": // This custom provider does not support the use of a FirstOrDefault // operator that takes a predicate. Therefore we check to ensure that // no more than one argument is provided.

if (mc.Arguments.Count != 1) break;

VisitFirst(mc.Arguments[0], true); break;

default: return base.VisitMethodCall(mc);

} Visit(mc.Arguments[0]); return mc;

}

This article will not expand on the implementation of the supported query operators except for the

more interesting WhereTranslator. When a call to Queryable Where operator is discovered, an instance

of the WhereTranslator class is created and its Translate method is called passing in the lambda

expression representing the where condition (predicate):

private void VisitWhere(Expression queryable, LambdaExpression predicate) { // this custom provider cannot support more // than one Where query operator in a LINQ query if (_whereTranslator != null)

throw new NotSupportedException( "You cannot have more than one Where operator in this expression");

_whereTranslator = new WhereTranslator(_model); _whereTranslator.Translate(predicate); }

Page 14 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 15: Custom LINQ Provider

C# Copy Code

C# Copy Code

C# Copy Code

The WhereTranslator also inherits from the ExpressionVisitor and is only interested in Binary, Constant,

MemberAccess and Unary expressions. When its Translate method is called, it creates a StringBuilder

that is then used to build the WHERE clause of the T-SQL query. Here is an example of how binary

operations can be converted to T-SQL:

StringBuilder _sb; protected override Expression VisitBinary(BinaryExpression b) { _sb.Append("("); Visit(b.Left); switch (b.NodeType)

{ case ExpressionType.And: case ExpressionType.AndAlso:

_sb.Append(" AND "); break;

case ExpressionType.Or: case ExpressionType.OrElse:

_sb.Append(" OR "); break;

case ExpressionType.Equal: if (IsComparingWithNull(b))

_sb.Append(" IS "); else

_sb.Append(" = "); break;

case ExpressionType.GreaterThan: _sb.Append(" > ");

break; ... } Visit(b.Right); _sb.Append(")"); return b;

}

In this method, the Left expression is first visited to identify any constant or member access

expressions that could fill the left part of a binary operator. The binary operator comes next followed by

a visit to the Right expression. Note that in SQL Server, due to its three-valued

[ http://msdn.microsoft.com/en-us/library/ms191504.aspx ] logic, “IS” or “IS NOT” operators should

be used when testing for null values (“ City = NULL ” has a different meaning to “ City IS NULL ”,

namely will always be false).

Once translated, you can access the WhereClause property of the WhereTranslator to get the string

that represents the actual WHERE clause:

internal string WhereClause { get { return _sb.ToString(); }

}

The same technique is also used to construct ORDER BY, projection and other parts of a SELECT

statement. Once all clauses are built, they need to be assembled into a SELECT statement representing

the original query (the following method is defined in QueryTranslator):

private QueryInfo ConvertToExecutableQuery(Expression query) { // Find the query source DatabaseMetaTable source;

Page 15 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 16: Custom LINQ Provider

if (!GetSourceTable(query, out source)) throw new NotSupportedException( "This query expression is not supported!"); StringBuilder sb = new StringBuilder(); bool useDefault = false;

// SELECT sb.Append("SELECT "); // TOP if (_takeTranslator != null && _takeTranslator.Count.HasValue)

{ useDefault = _takeTranslator.UseDefault; sb.Append("TOP "); sb.Append(_takeTranslator.Count); sb.Append(" "); } // PROJECTION if (_selectTranslator == null || !_selectTranslator.DataMembers.Any())

{ // project on all the mapped columns _selectTranslator = new ProjectionTranslator( _model, source.PersistentDataMembers); } if (!_selectTranslator.DataMembers.Any())

throw new Exception( "There are no items for projection in this query!"); sb.Append(_selectTranslator.ProjectionClause); // FROM sb.Append(" FROM "); sb.AppendLine(source.TableName); // WHERE if (_whereTranslator != null)

{ string where = _whereTranslator.WhereClause; if (!string.IsNullOrEmpty(where))

{ sb.Append("WHERE "); sb.AppendLine(where); } } // ORDER BY if (_orderByTranslator != null)

{ string orderby = _orderByTranslator.OrderByClause; if (!string.IsNullOrEmpty(orderby))

{ sb.Append("ORDER BY "); sb.AppendLine(orderby); } } return new QueryInfo

{ QueryText = sb.ToString(), // The actual SQL query SourceMetadata = source, LambdaExpression = _selectTranslator.ProjectionLambda, ResultShape = GetResultShape(query), UseDefault = useDefault }; }

The query is now ready to be executed. See the Query Execution section for more information on how

this query is executed.

Once executed, the result-set returned from the database needs to be converted or mapped to a single

instance of an entity or a sequence of entities. This is the job of the DatabaseResultMapper<T> class.

DatabaseResultMapper<T> is an IEnumerable<T> that takes a DbDataReader

[ http://msdn.microsoft.com/en-us/library/system.data.common.dbdatareader.aspx ] instance and

maps database rows to objects. This conversion can be performed one row at a time when MoveNext is

called on the enumerator created by the source of the query. It could perhaps use the deferred

Page 16 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 17: Custom LINQ Provider

C# Copy Code

C# Copy Code

execution [ http://blogs.msdn.com/pedram/archive/2007/06/02/lazy-evaluation-in-c.aspx ] model that

is commonly used by LINQ to Objects, for mapping rows to entities (objects).

Figure 7 : The result-set mapper class

Before demoing the code of the mapper, it is important to explain how projection is implemented in this

custom provider. Imagine the following simple query:

IEnumerable<string> phoneNumbers =

from c in dc.GetTable<Customer>()

select c.Phone;

This query is actually equivalent to the following:

IEnumerable<string> phoneNumbers = dc.GetTable<Customer>() .Select(c =>

c.Phone);

As you can see, the call to the Select operator is converted to a lambda expression

[ http://blogs.msdn.com/pedram/archive/2007/06/02/lambda-expressions.aspx ] that takes an

instance of a Customer and returns a string which happens to be the phone number of the customer.

This is called projection because we are returning only a few of the cells for a given customer row

(projecting on customer). Here are some more examples:

1 Selecting the name of all customers IEnumerable <string> names = dc.GetTable<Customer>() .Select(c => c.CustomerName);

2 Projecting all customers to instances of the already existing Contact type

IEnumerable <Contact> contacts =

dc.GetTable<Customer>().Select(

c => newContact

{

Name = c.CustomerName,

Page 17 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 18: Custom LINQ Provider

C# Copy Code

Phone = c.Phone

});

3 Projecting all customers to instances of a newly created C# anonymous type [ http://msdn.microsoft.com/en-us/library/bb397696.aspx ] (a type that is created by the compiler at the compile type)

var customerInfos =

dc.GetTable<Customer>().Select( c => new

{

ID = c.CustomerID,

Name = c.CustomerName

});

4 Projecting and changing the return value at the same time by performing a string concatenation

IEnumerable <string> phoneNumbers =

dc.GetTable<Customer>()

.Select(c => "+44 " +

c.Phone);

One thing that all of the lambda expressions that are used by the above Select operators have in

common is that they only take a single parameter of type Customer. Therefore in order to perform the

operations defined by these lambda expressions, we need to execute the lambdas passing in an

instance of Customer and remembering the result of the call. Here is a list of all the 4 Select lambda

expressions shown above:

c => c.CustomerName

c => newContact {Name = c.CustomerName,Phone = c.Phone }

c => new { ID = c.CustomerID, Name = c.CustomerName }

c => "+44 " + c.Phone

In order to be able to perform the specifics defined in these lambda expressions, we first need to

compile their expression tree representations into executables (IL). Fortunately, the expression tree API

provides an efficient mechanism for compiling a LambdaExpression into IL using Reflection.Emit

[ http://msdn.microsoft.com/en-us/library/3y322t50.aspx ] . You can simply call the Compile method

on the expression tree:

Delegate f = lambdaExpression.Compile();

Now that the lambda expression is compiled, we can invoke it dynamically passing in an instance of a

Customer:

f.DynamicInvoke(customerEntity);

The lambda expressions above take an instance of Customer. Therefore our result mapper should

always create an instance of Customer and pass that to the compiled code to perform the projection.

The final step is to read each result from the returned result-set and convert it to an instance of

Customer (entity) and if needed perform a projection. The code below illustrates how this is done by

the DatabaseResultMapper<TEntity> class. Please note the use of reflection APIs to dynamically create

instances of TEntity:

public IEnumerator<TEntity> GetEnumerator()

{

// compile the lambda expression into a function

Delegate projectionFunction = null;

Page 18 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 19: Custom LINQ Provider

if (_info.LambdaExpression != null) { projectionFunction = _info.LambdaExpression.Compile(); } bool isFirst = true;

MemberInfo[] members = null; BindingFlags bindingFlags = BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance; while (_reader.Read())

{ if (isFirst)

{ // find the order of the columns returned by the database members = new MemberInfo[_reader.FieldCount]; var persistentDataMembers = _info.SourceMetadata.PersistentDataMembers;

for (int i = 0; i < _reader.FieldCount; i++) {

string colName = _reader.GetName(i); DatabaseMetaDataMember mem = persistentDataMembers.FirstOrDefault( p => string.Compare(p.MappedName, colName, true, CultureInfo.InvariantCulture) == 0);

if (mem == null) throw new Exception(string.Format(

"It was not possible to find a mapping column for {0}", colName)); members[i] = mem.StorageMember ?? mem.Member; } isFirst = false; } // create a single instanace of the Entity (the mapper object) object entity = Activator.CreateInstance( _info.SourceMetadata.EntityType, bindingFlags, null, null, null); // populate its members with values from the result-set

for (int i = 0; i < members.Length; i++) { // Do magic conversion from SQL type to CLR type! // NOTE: I am using a very simplied conversion technique here. // You may want to use a more complex one... Type memberType = TypeHelper.GetMemberType(members[i]); // is this a Nullable type? if yes, then get // its generic type argument for conversion

if (TypeHelper.IsNullableType(memberType)) { memberType = memberType.GetGenericArguments()[0]; } object value = Convert.ChangeType(_reader.GetValue(i), memberType); // set the value of the member on the entity instance to 'value' TypeHelper.SetMemberValue(entity, members[i], value); } // project on the entity by calling the // compiled projection funcation

if (projectionFunction == null) { // this entity needs to be tracked // some tracking code to go here ... yield return (TEntity)trackedEntity; }

else { yield return (TEntity)projectionFunction.DynamicInvoke(entity); } } }

The steps taken by this mapper are:

Page 19 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 20: Custom LINQ Provider

C# Copy Code

C# Copy Code

C# Copy Code

C# Copy Code

If the query expression has projection, compile the projection’s lambda expression to IL 1.

Find the order of the columns in the result-set returned from the database 2.

For each row of the result-set: 3.

Create an instance of the entity

Populate its members with values from the current row

If the query expression has projection, invoke the compiled projection delegate

Some queries may return a sequence of objects and other may only be able to return a single object.

For instance, as shown below, the first expression returns a sequence of customers

(IEnumerable<Customer>) whereas the second one returns a single Customer object:

(1)

IEnumerable<Customer> customers =

from c in dc.GetTable<Customer>()

where c.City == "London"

select c;

(2)

Customer firstCustomer = dc.GetTable<Customer>().First();

Therefore it is necessary to find a way of distinguishing between the shapes of the results. For that

reason, you will find an enum that declares the shape of the result of a query. This is determined when the query expression is examined by the QueryTranslator:

internal enum ResultShape

{

None, // The query is not expected to have a return value

Singleton, // It returns a single entity

Sequence // It returns a sequence of entities

}

Here is a simplistic implementation of a function that identifies the shape of the result of an expression

tree:

private ResultShape GetResultShape(Expression query)

{

LambdaExpression lambda = query as LambdaExpression;

if (lambda != null)

query = lambda.Body;

if (query.Type == typeof(void))

return ResultShape.None;

if (query.Type == typeof(IMultipleResults))

throw new NotSupportedException( "Multiple result shape is not

supported");

MethodCallExpression methodExp = query as MethodCallExpression;

if (methodExp != null && ((methodExp.Method.DeclaringType == typeof

(Queryable)) ||

(methodExp.Method.DeclaringType == typeof(Enumerable))))

{

string str = methodExp.Method.Name;

Page 20 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 21: Custom LINQ Provider

C# Copy Code

if (str != null && (str == "First" || str == "FirstOrDefault" || str == "Single" || str == "SingleOrDefault"))

return ResultShape.Singleton; } return ResultShape.Sequence;

}

As you can see, if the expression tree includes a First, FirstOrDefault, Single or SingleOrDefault query

operator as its root element, then it is assumed that only a single object should be returned. Based on

this, the implementation of the second Execute method of the DatabaseProvider class must have

different logic for each possible ResultShape:

private object Execute(QueryInfo info) { // Get an open connection to the database DbConnection connection = _cnnManager.GetOpenConnection(); try

{ // Build a SQL Command DbCommand command = connection.CreateCommand(); command.CommandText = info.QueryText; // Attempt to excute the query if no result was expected from the query

if (info.ResultShape == ResultShape.None) command.ExecuteNonQuery();

else { DbDataReader reader = command.ExecuteReader(CommandBehavior.SingleResult); ... // What is the CLR type of the returned result-set rows? Type resultEntityType = (info.LambdaExpression == null) ? info.SourceMetadata.EntityType : info.LambdaExpression.Body.Type; // Build a pipeline so that we can // read and map the results returned from the DB. // Do this by dynamically creating an instance // of DatabaseResultMapper<> class. // The followig use of reflection is equivalent to: // IEnumerable mappedResults = // new DatabaseResultMapper<T>( // info, reader, _dataContext.ChangeTracker); IEnumerable mappedResults = (IEnumerable)Activator.CreateInstance(typeof(DatabaseResultMapper<>) .MakeGenericType(new Type[] { resultEntityType }), BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Instance, null, new object[] { info, reader, _dataContext.ChangeTracker }, null); // Are we expecting a single entity or a sequence of entities?

if (info.ResultShape == ResultShape.Sequence) { // Read the results by enumerating through all mappedResults // This is not perfect but as I am sharing the database // connection between all queries and even write operations, // I need to load all entities into memory by fully reading // them using the DbDataReader (mappedResults) – that is exactly // what the constructor of List<> does here. // The followig use of reflection is equivalent to: // return new List<T>(mappedResults);

return Activator.CreateInstance(typeof(List<>).MakeGenericType( new Type[] { resultEntityType }), new object[]

{ mappedResults }); }

else if (info.ResultShape == ResultShape.Singleton) { IEnumerator enumerator = mappedResults.GetEnumerator();

if (enumerator.MoveNext()) {

Page 21 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 22: Custom LINQ Provider

C# Copy Code

return enumerator.Current;

}

// If First or Single query operators are used // throw an

excpetion because no rows where returned.

// You should always expect at least one element

// when First or Single are used

else if (!info.UseDefault)

{

throw new Exception( "No entity was found meeting the

specified criteria.");

}

}

}

return null;

}

finally

{

connection.Close();

}

}

As you can see, the implementation of this method is extremely simple. It basically decides how many

times the result mapper should be enumerated (none, once, or multiple times) based on the shape of

the result that was discovered earlier.

This provider is also capable of executing dynamic SQL or stored procedures that return a sequence of

entities. As long as a mapper class exists for the returned entity then the provider automatically uses

the result mapper to map the returned result-set to a sequence of objects. Here is an example of

executing a dynamic query:

SqlConnection cnn = new SqlConnection(Settings.Default.NorthwindCnnString);

DatabaseDataContext dc = new DatabaseDataContext(cnn);

var ps = dc.ExecuteQuery<Product>(

@"SELECT p.ProductID, p.ProductName, p.UnitPrice FROM dbo.Products p

JOIN dbo.Categories c on c.CategoryID = p.CategoryID

WHERE c.CategoryName = 'Beverages'");

foreach (var pr in ps)

Console.Write(pr.ProductName + ",");

The query part of the provider is now complete. Let’s see a few examples:

Purpose Implementation Generated SQL

Display a list of customer phone numbers

DatabaseDataContext dc = newDatabaseDataContext(connection);

var customers = dc.GetTable<Customer>().Select(c => c.Phone);

foreach (var c in customers)

Console.Write(c + ",");

SELECT [Phone] FROM dbo.Customers

Display a list of customers living in London with no phone number

var q =

from c in dc.GetTable<Customer>()

SELECT [ContactName] FROM dbo.Customers

WHERE (( [City] ='London')AND([Phone] ISNOTNULL))

Page 22 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 23: Custom LINQ Provider

where c.City == "London" && c.Phone != null

orderby c.CustomerName, c.Country

select c.CustomerName;

ORDER BY [ContactName] ASC, [Country] ASC

Take the first 5

customers

var q = dc.GetTable<Customer>().Take(5);

SELECT TOP 5

[CustomerID], [CompanyName], [ContactName],

[ContactTitle], [Address], [City], [Region],

[PostalCode], [Country], [Phone], [Fax]

FROM dbo.Customers

So far this custom provider has only been able to build and execute T-SQL queries. The next step is to

introduce a mechanism to handle write operations.

Entity tracking (also known as object tracking) is a technique used to track all entity instances created

or known to a provider. This is heavily used to organise and manage write operations. It is necessary to

keep track of all entity instances known to this custom provider because:

Most changes to an entity object representing a row in the database should be tracked so

that they can be applied to the database.

Write requests (update, insert and delete) are all kept in memory until the user explicitly

asks the provider to submit the changes.

Entity objects representing the same row in the database should also be the same object in

the managed heap. With this approach, any changes to one entity is reflected by all other

references so that when changes are being submitted to the database only a single

UPDATE operation incorporating all changes is executed.

One may argue that when a change occurs, it should be possible to generate the DML statement for the

change at that point in time. Unfortunately that is not a workable solution because of the performance

overhead it would impose on applications (every write would require execution of a DML statement).

The entity tracking technology used by our provider applies a batch of all write operations in a

transaction when the SubmitChanges method of the provider’s entry point (DatabaseDataContext) is

manually called.

There are many states that an entity can participate in. LINQ to SQL has a comprehensive list of states

[ http://msdn.microsoft.com/en-us/library/bb386982.aspx ] that its objects can be in. The only

tracking states used by this simple custom provider are ToBeInserted, ToBeDeleted and ToBeUpdated.

As was mentioned earlier, every single object created by the result mapper

(DatabaseResultMapper<T>) must be tracked by the change tracking system. This means keeping a

reference to those objects. Over time this list of objects will grow and as our change tracker is keeping

a strong reference to all of those objects, they are never garbage collected even if they are no longer

referenced anywhere else.

In order to avoid this possible memory issue, the implementation of the tracker uses a collection of

WeakReference [ http://msdn.microsoft.com/en-us/library/system.weakreference.aspx ] objects. In

one sentence, a WeakReference references an object while still allowing that object to be reclaimed by

the garbage collector:

List <WeakReference> _trackedItems;

If an object has been reclaimed by the garbage collector, the WeakReference’s IsAlive property would

be set to false. However if the original object is still in memory, you can use the Target property to

Page 23 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 24: Custom LINQ Provider

C# Copy Code

C# Copy Code

access that original object. In the sample below, assuming that obj1 is still alive in memory, both obj1

and obj2 variables should be referencing the same object:

WeakReference wr = new WeakReference(obj1);

...

if(wr.IsAlive)

object obj2 = wr.Target;

Change Tracking

The entity tracking and change processing is managed by two classes, the ChangeTracker and the ChangeProcessor classes:

Figure 8 : The main classes used to manage write requests

When an entity instance is created by the result mapper, or is newly inserted by the user, it is attached

to the ChangeTracker by calling the Attached method.

In order to receive change notifications from entity instances, the entity classes must implement the

INotifyPropertyChanging [ http://msdn.microsoft.com/en-

us/library/system.componentmodel.inotifypropertychanging.aspx ] interface, and also any changes to

their properties should raise the PropertyChanging event defined by this interface:

[Table(Name = "dbo.Products")]

public partial class Product : INotifyPropertyChanging { public event

PropertyChangingEventHandler PropertyChanging; protected virtual void SendPropertyChanging()

{

if ((this.PropertyChanging != null))

{

this.PropertyChanging(this, emptyChangingEventArgs);

}

}

[Column]

public string ProductName

{

get { return this._ProductName; }

set {

if ((this._ProductName != value))

{

this.SendPropertyChanging();

this._ProductName = value;

Page 24 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 25: Custom LINQ Provider

C# Copy Code

} } }

The change tracking is performed by the ChangeTracker class that subscribes to the PropertyChanging

event of all mapped entity instances. When the provider is notified of the first change to an object, it

creates an instance of the TrackedObject and adds it to a collection of objects that require processing.

This collection includes a list of newly created, updated and deleted objects.

Figure 9 : Tracked objects and their states

The code listing below demonstrates the logic used by the ChangeTracker’s event handler for all

PropertyChanging events:

private Dictionary<object, TrackedObject> _items; ... private voidOnPropertyChanging( object sender, PropertyChangingEventArgs args) { // Is this object already being tracked? TrackedObject trackedObj; if (!_items.TryGetValue(sender, out trackedObj))

{ DatabaseMetaTable metaTable = CheckAndGetMetaTable(sender.GetType()); // This has not been tracked in the past so create a new tracked object trackedObj = ...; _items.Add(sender, trackedObj); } }

This custom provider is unable to keep track of changes to objects that do not implement

INotifyPropertyChanging.

It is also worth mentioning that the ChangeTracker should often compress its in-memory collection of

WeakReferences and remove the ones with garbage collected targets.

At this point, I would like to define the scope in which changes are tracked by the ChangeTracker. All

changes and in fact all write operations such as delete and insert, are cached by the ChangeTracker

until the AcceptChanges method of this class is called. That method in turn can only be called by the

SubmitChanges method of the DatabaseDataContext.

When SubmitChanges is called, all changes, newly created entities and delete requests are converted to

database commands and are executed inside a transaction. The transaction is only committed when all

commands succeed.

The conversion of TrackedObjects to database commands is done by the ChangeProcessor class:

Page 25 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 26: Custom LINQ Provider

C# Copy Code

C# Copy Code

internal class ChangeProcessor { internal void SubmitChanges() {

foreach (var obj in _tracker.GetInterestingObjects()) { DbCommand command = null;

switch (obj.TrackingState) {

case TrackingState.ToBeInserted: command = BuildInsertCommand(obj);

break; case TrackingState.ToBeUpdated:

command = BuildUpdateCommand(obj); break;

case TrackingState.ToBeDeleted: command = BuildDeleteCommand(obj);

break; }

if(command != null) ExecuteCommand(command); } } ...

And here is how the ChangeProcessor converts an insert request to a SQL INSERT command:

private DbCommand BuildInsertCommand(TrackedObject obj) { if (obj == null)

throw new ArgumentNullException("obj"); DbCommand command = GetNewDbCommand(); StringBuilder sb = new StringBuilder("INSERT INTO "); sb.Append(obj.MetaTable.TableName); sb.Append(" ("); // Find all the insertable columns (excluding // all the auto generated data members) var columns = obj.MetaTable.PersistentDataMembers .Where(c => !c.IsDbGenerated); StringBuilder columnValuesSb = new StringBuilder(); bool isFirst = true;

// Include all the columns in the INSERT statement foreach (var col in columns)

{ if (!isFirst)

{ sb.Append(", "); columnValuesSb.Append(", "); }

else isFirst = false; // What if the name of the column has a space character in it? sb.Append(FormatHelper.WrapInBrackets(col.MappedName)); // Get the value from the entity for this column MemberInfo memberInfo = col.StorageMember ?? col.Member; // Format the value of the column to // its acceptable SQL representation object value = FormatHelper.FormatDbValue( TypeHelper.GetMemberValue(obj.Entity, memberInfo)); columnValuesSb.Append(value); } sb.Append(") VALUES ("); sb.Append(columnValuesSb.ToString()); sb.Append(")"); command.CommandText = sb.ToString();

Page 26 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 27: Custom LINQ Provider

C# Copy Code

return command; }

DatabaseDataContext.SubmitChanges() now looks similar to the following – it effectively delegates the

actual translation and execution of the database commands to the ChangeProcessor and has the

responsibility of creating a new transaction or reusing an existing one:

public void SubmitChanges() { this.CheckDispose();

// Get an open database connection DbConnection connection = _provider.Connection.GetOpenConnection(); try

{ // Create or use a transaction

using (TransactionScope ts = new TransactionScope()) { // Enlist in this transaction connection.EnlistTransaction(Transaction.Current); // Process all changes and apply them to the database ChangeProcessor processor =

new ChangeProcessor(_changeTracker, this); processor.SubmitChanges(); // Vote to commit the transaction ts.Complete(); // If all changes were committed successfuly // to the DB then accept all the changes _changeTracker.AcceptChanges(); } } finally

{ connection.Close(); } }

ChangeTracker.AcceptChanges() is called only when all SQL commands are executed successfully.

When called, deleted objects are no longer tracked and the ChangeTracker subscribes to the

PropertyChanging event of all the newly created objects.

Both delete and update operations rely heavily on the primary keys of the entity. In fact, if you look at

the database commands generated for these operations, you will notice that the generated WHERE

clause only specifies the primary key columns. It is worth pointing out that no delete or update

operations can be performed by this mini provider if the entity does not have a primary key defined (an

exception is thrown).

Also, it is not possible to modify a primary key using this provider as it does not keep a copy of the original primary key values so that they could be used by the UPDATE statement to locate the original

record.

Here are some examples of write operations against the Northwind database:

Purpose Implementation Generated SQL

Inserting a new

customer

DatabaseDataContext dc = newDatabaseDataContext(dbConnection);

p = newCustomer

{

INSERT INTO dbo.Customers

( [CustomerID], [CompanyName], [ContactName],

Page 27 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx

Page 28: Custom LINQ Provider

CustomerID = "12345",

CompanyName = "Microsoft",

City = "London",

CustomerName = "Pedram Rezaei"

};

dc.GetTable<Customer>().InsertOnSubmit(p);

dc.SubmitChanges();

[ContactTitle], [Address], [City], [Region], [PostalCode], [Country], [Phone], [Fax])

VALUES ('12345','Microsoft', 'Pedram Rezaei',NULL,

NULL,'London',NULL, NULL,NULL,NULL,NULL)

Updating a

customer entity

DatabaseDataContext dc = newDatabaseDataContext(dbConnection);

// get a customer from the database

Customer p = ...

p.City = "Southampton";

dc.SubmitChanges();

UPDATE dbo.Customers

SET

[CustomerID] ='12345', [CompanyName] ='Microsoft',

[ContactName] ='Pedram Rezaei', [ContactTitle] ='',

[Address] ='', [City] ='Southampton',

[Region] ='', [PostalCode] ='',

[Country] ='',

[Phone] ='',

[Fax] =''

WHERE ([CustomerID] ='12345')

LINQ is an amazing technology and its design makes it an incredibly flexible and extensible platform. As

we saw, the LINQ’s provider model supports different types of providers. This article showed how it is

possible to write a simple IQueryable LINQ provider that also supported write operations.

You can download the code for this article from here

[ http://code.msdn.microsoft.com/SimpleLingToDatabase.aspx ] .

I would like to thank Stuart Leaks, Carl Nolan and Charlie Calvert for their help in reviewing this article.

Disclaimer : Any code or technique shown here is for illustrative purposes only and the referenced

solution should only be seen as a throw-away prototype. The code has not been fully tested therefore it should not be assumed as production ready. You may find issues and bugs so please understand that

the purpose of this exercise was not to write a fully working and bulletproof provider. The aim is to

demonstrate the steps required to write a custom IQueryable LINQ provider.

Page 28 of 28Building a LINQ Provider

5/14/2010http://msdn.microsoft.com/en-us/vcsharp/ee672195(printer).aspx