Dapper performance

Performance Considerations - Dapper .NET

Agenda

• Query Caching

• Buffering

• Query Vs QueryMultiple

• Dirty Tracking

• Q & A

Query Caching

• Dapper caches information about every query it runs. This allows it to materialize objects and process parameters quickly.

• The current implementation caches this information in a ConcurrentDictionary object.

static readonly ConcurrentDictionary<Identity, CacheInfo> _queryCache =
    new ConcurrentDictionary<Identity, CacheInfo>();

• The objects it stores are never flushed.

• If you are generating SQL strings on the fly without using parameters it is possible you will hit memory issues.

• Each query you issue will create an Identity, depending on the SQL query, its command type and its parameters.

• The CacheInfo object contains the IDataReader and IDbCommand functions (the deserializers and the parameter reader) and a hit counter.

Cont…

• The constructor of the Identity class, which is used as the cache key, looks like this:

private Identity(string sql, CommandType? commandType, string connectionString, Type type, Type parametersType, Type[] otherTypes, int gridIndex)
{
    this.sql = sql;
    this.commandType = commandType;
    this.connectionString = connectionString;
    this.type = type;
    this.parametersType = parametersType;
    this.gridIndex = gridIndex;
    unchecked
    {
        hashCode = 17; // we *know* we are using this in a dictionary, so pre-compute this
        hashCode = hashCode * 23 + commandType.GetHashCode();
        hashCode = hashCode * 23 + gridIndex.GetHashCode();
        hashCode = hashCode * 23 + (sql == null ? 0 : sql.GetHashCode());
        hashCode = hashCode * 23 + (type == null ? 0 : type.GetHashCode());
        if (otherTypes != null)
        {
            foreach (var t in otherTypes)
            {
                hashCode = hashCode * 23 + (t == null ? 0 : t.GetHashCode());
            }
        }
        hashCode = hashCode * 23 + (connectionString == null ? 0 : connectionString.GetHashCode());
        hashCode = hashCode * 23 + (parametersType == null ? 0 : parametersType.GetHashCode());
    }
}

CacheInfo Class

class CacheInfo
{
    public Func<IDataReader, object> Deserializer { get; set; }
    public Func<IDataReader, object>[] OtherDeserializers { get; set; }
    public Action<IDbCommand, object> ParamReader { get; set; }
    private int hitCount;
    public int GetHitCount() { return Interlocked.CompareExchange(ref hitCount, 0, 0); }
    public void RecordHit() { Interlocked.Increment(ref hitCount); }
}

Note on Caching

• Use this:

string s = "SELECT email, passwd, login_id, full_name " + "FROM members WHERE " + "email = @email";
SqlCommand cmd = new SqlCommand(s);
cmd.Parameters.AddWithValue("@email", email);

• Instead of:

cmd.CommandText = "SELECT email, passwd, login_id, full_name " + "FROM members " + "WHERE email = '" + email + "'";

• The first one is parameterized, so it is cached once. The second one is not parameterized, so a new entry is cached for every distinct value of email. This can exhaust your memory.

• The first one is vastly superior. It avoids injection attacks. Dapper can cache it once. SQL Server will compile the execution plan once and cache it. (+)
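The effect of the two styles on a query cache can be sketched with a toy model. This is not Dapper's code: QueryCacheModel, Prepare and CacheDemo are hypothetical names, and the key is simplified to the SQL text plus the parameter type, which are two of the inputs to Identity shown earlier.

```csharp
using System;
using System.Collections.Concurrent;

// Toy model of a query cache keyed by SQL text and parameter type.
// Hypothetical illustration only; Dapper's real key is the Identity class.
class QueryCacheModel
{
    private readonly ConcurrentDictionary<(string Sql, Type ParametersType), object> cache =
        new ConcurrentDictionary<(string Sql, Type ParametersType), object>();

    public int Count => cache.Count;

    // Stand-in for "build the deserializer and parameter reader once per distinct key".
    public void Prepare(string sql, object parameters)
    {
        cache.GetOrAdd((sql, parameters?.GetType()), _ => new object());
    }
}

static class CacheDemo
{
    static void Main()
    {
        var model = new QueryCacheModel();
        string[] emails = { "a@example.com", "b@example.com", "c@example.com" };

        // Parameterized: the SQL text and the parameter type are identical for every call.
        foreach (var email in emails)
            model.Prepare("SELECT full_name FROM members WHERE email = @email", new { email });
        Console.WriteLine(model.Count); // 1

        // Concatenated: every distinct value produces a new SQL string, hence a new entry.
        foreach (var email in emails)
            model.Prepare("SELECT full_name FROM members WHERE email = '" + email + "'", null);
        Console.WriteLine(model.Count); // 4
    }
}
```

In the model, the parameterized form stays at one entry no matter how many emails are looked up, while the concatenated form adds one entry per distinct value and is never trimmed.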

Buffering

• The buffer is unrelated to cache.

• Dapper does not include any kind of data-cache (although it does have a cache related to the way it processes commands, i.e. "this command string, with this type of parameter, and this type of entity - has these associated dynamically generated methods to configure the command and populate the objects").

• Buffering is a bool value supplied with each command. By default it is set to true.

Buffer = true

• In a buffered API all the rows are read before anything is yielded.

• Suitable when dealing with a limited number of rows, for example 100 or 200, so that the buffered list consumes little memory.

• Once you get the data, the command is complete - so there is no conflict between that and subsequent operations. (+)

• It does not hold the connection active for a long time. (+)

• As soon as you get the data, the command has already released any resources (locks etc), so you're having minimal impact on the server. (+)

• If the result set is immense, loading all of it into memory (in a list) could be expensive / impossible. (-)

• Higher latency before the first row can be processed, since all rows are read first. (-)

Buffer = false

• Suitable when dealing with large amounts of data (thousands to millions of rows), where storing the buffered data would consume a lot of memory.

• You can iterate over immense queries (many millions of rows), without needing them all in-memory at once - since you're only ever really looking at the current row being yielded. (+)

• In a streaming API each element is yielded individually. This is very memory efficient, but if you do lots of subsequent processing per item, it means that your connection / command could be "active" for an extended time. (+)

• You don't need to wait for the end of the data to start iterating - you can start as soon as at least one row is available. (+)

Cont…

• The connection is in-use while you're iterating, which can lead to "there is already an open reader on the connection" (or whatever the exact wording is) errors if you try to invoke other commands on a per-row basis (this can be mitigated by MARS). (-)

• It holds the connection active for a long time when dealing with large amounts of data. (-)
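The difference in ordering between the two modes can be sketched without a database. In the model below, ReadRows is a hypothetical stand-in for the data reader and Run for the consuming code; neither is part of Dapper. In Dapper itself the switch is the buffered argument of Query, for example connection.Query<Order>(sql, buffered: false), where Order is an assumed entity type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class BufferingModel
{
    // Hypothetical stand-in for a data reader: rows are produced lazily, one at a time.
    public static IEnumerable<int> ReadRows(int count, List<string> log)
    {
        for (int i = 1; i <= count; i++)
        {
            log.Add("read " + i);
            yield return i;
        }
    }

    // Consumes rows either from a fully materialized list (buffered)
    // or directly from the lazy sequence (unbuffered), recording the order of events.
    public static List<string> Run(bool buffered)
    {
        var log = new List<string>();
        IEnumerable<int> rows = ReadRows(3, log);
        if (buffered)
            rows = rows.ToList(); // every row is read here, before any processing starts

        foreach (var row in rows)
            log.Add("process " + row);
        return log;
    }
}

static class BufferingDemo
{
    static void Main()
    {
        Console.WriteLine(string.Join(", ", BufferingModel.Run(buffered: true)));
        // read 1, read 2, read 3, process 1, process 2, process 3
        Console.WriteLine(string.Join(", ", BufferingModel.Run(buffered: false)));
        // read 1, process 1, read 2, process 2, read 3, process 3
    }
}
```

In the buffered run the "reader" is finished before the first row is processed, which is why the connection is free early. In the unbuffered run reading and processing interleave, so the reader stays open for as long as the processing takes.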

Query Vs QueryMultiple

• We need to choose the right one: Query or QueryMultiple.

• The choice depends entirely on the number of result sets expected from the command. If more than one result set is expected, we must use QueryMultiple. If not, we should use Query.

• QueryMultiple carries additional logic and adds some complexity, so use it only when it is needed.

QueryMultiple Example

var sql =
@"
select * from Customers where CustomerId = @id
select * from Orders where CustomerId = @id
select * from Returns where CustomerId = @id";

using (var multi = connection.QueryMultiple(sql, new { id = value }))
{
    var customer = multi.Read<Customer>().Single();
    var orders = multi.Read<Order>().ToList();
    var returns = multi.Read<Return>().ToList();
}

Dirty Tracking

• Dapper .NET (through the Dapper.Contrib extensions) provides a nice feature to determine whether an update statement is really required. If no value has changed, it won't execute the SQL statement, which is a very handy performance optimization.

• The only requirement is that we declare an interface for the object and load the object through that interface.

Update - Without Tracking

using (var sqlConnection = new SqlConnection(Constant.DatabaseConnection))
{
    sqlConnection.Open();
    var entity = sqlConnection.Get<Supplier>(9);
    entity.ContactName = "John Smith";
    sqlConnection.Update(entity);
    var result = sqlConnection.Get<Supplier>(9);
}

Update - With Tracking

public interface ISupplier
{
    int Id { get; set; }
    string CompanyName { get; set; }
    string ContactName { get; set; }
    string ContactTitle { get; set; }
}

public class Supplier : ISupplier
{
    public int Id { get; set; }
    public string CompanyName { get; set; }
    public string ContactName { get; set; }
    public string ContactTitle { get; set; }
}

Cont…

using (var sqlConnection = new SqlConnection(conString))
{
    sqlConnection.Open();
    var supplier = sqlConnection.Get<ISupplier>(9);
    Console.WriteLine(string.Format("IsUpdated {0}", sqlConnection.Update(supplier)));
    supplier.CompanyName = "NewManning";
    Console.WriteLine(string.Format("IsUpdated {0}", sqlConnection.Update(supplier)));
}
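The idea behind the tracked update can be sketched by hand. This is only an illustration of the principle, not the library's implementation: TrackedSupplier, MarkClean and FakeUpdate are hypothetical names, and the change check in the setter is an assumption of this sketch.

```csharp
using System;

// Hand-written illustration of dirty tracking; all names here are hypothetical.
class TrackedSupplier
{
    private string companyName;

    public int Id { get; set; }
    public bool IsDirty { get; private set; }

    public string CompanyName
    {
        get { return companyName; }
        set
        {
            // Only an actual change marks the object as dirty.
            if (value != companyName)
            {
                companyName = value;
                IsDirty = true;
            }
        }
    }

    public void MarkClean() { IsDirty = false; }
}

static class TrackingDemo
{
    // Stand-in for an Update call: returns false and does no work when nothing changed.
    public static bool FakeUpdate(TrackedSupplier supplier)
    {
        if (!supplier.IsDirty) return false;
        // A real implementation would execute the UPDATE statement here.
        supplier.MarkClean();
        return true;
    }

    static void Main()
    {
        // Simulate an object freshly loaded from the database.
        var supplier = new TrackedSupplier { Id = 9, CompanyName = "Manning" };
        supplier.MarkClean();

        Console.WriteLine("IsUpdated {0}", FakeUpdate(supplier)); // IsUpdated False
        supplier.CompanyName = "NewManning";
        Console.WriteLine("IsUpdated {0}", FakeUpdate(supplier)); // IsUpdated True
    }
}
```

The first call is skipped because nothing changed since loading. Only the second call, made after a property was modified, would reach the database.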

Q & A
