C# and LINQ
Yuan Yu
Microsoft Research Silicon Valley
Collections and Iterators
IEnumerable<T>: elements of type T, visited through an iterator (current element)
• Very easy to use:
    foreach (string name in persons)
    {
        Console.WriteLine(name);
    }
More on IEnumerable<T>
• IEnumerable<T> is a generic collection interface
– C# generics are very similar to C++ templates
– T is a type parameter representing the element type
• IEnumerable<double>
• IEnumerable<MyClass>
• .NET collections implement IEnumerable<T>
– T[], List<T>, HashSet<T>, Stack<T>, Queue<T>, Dictionary<K, T> (as IEnumerable<KeyValuePair<K, T>>), …
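Because these collection types all implement IEnumerable<T>, a single method written against the interface can consume any of them. The following is a minimal sketch; the class name EnumerableDemo and the method Total are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableDemo
{
    // Accepts any collection that implements IEnumerable<int>.
    public static int Total(IEnumerable<int> source)
    {
        int sum = 0;
        foreach (int x in source)
        {
            sum += x;
        }
        return sum;
    }

    public static void Main()
    {
        int[] array = { 1, 2, 3 };
        var list = new List<int> { 1, 2, 3 };
        var set = new HashSet<int> { 1, 2, 3 };
        var queue = new Queue<int>(array);

        Console.WriteLine(Total(array)); // 6
        Console.WriteLine(Total(list));  // 6
        Console.WriteLine(Total(set));   // 6
        Console.WriteLine(Total(queue)); // 6
    }
}
```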
IEnumerable<T> Examples
• Example 1
    int[] numbers = { 1, 5, 2, 12, 4, 5 };
    int sum = 0;
    foreach (int x in numbers)
    {
        sum += x;
    }
• Example 2
    string[] persons = { "Frank", "Bob", "Chandu", "Mike", "Dennis" };
    foreach (string name in persons)
    {
        Console.WriteLine(name);
    }
LINQ: Operators on Collection
Collection<T> collection;
bool IsLegal(Key key);
string Hash(Key key);

var results = from c in collection
              where IsLegal(c.key)
              // a member initialized from a method call needs an explicit name
              select new { hash = Hash(c.key), c.value };
LINQ Operators
• Where (filter)
• Select (map)
• GroupBy
• OrderBy (sort)
• Aggregate (fold)
• Join
Lambda Expression
• A concise way to write anonymous functions
• Examples:
– Func<int, int> inc = x => x + 1;
– Func<int, double> sqrt = x => Math.Sqrt(x);
– Func<int, int, int> mul = (x, y) => x * y;
• Func<T, R> represents any method that takes an argument of type T and returns a value of type R
– Similar to a C++ function pointer
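A Func value can be invoked like a method and passed to other methods as an argument. The sketch below illustrates both; LambdaDemo and ApplyTwice are hypothetical names introduced only for this example.

```csharp
using System;

public static class LambdaDemo
{
    // Accepts any int -> int function and applies it twice.
    public static int ApplyTwice(Func<int, int> f, int x)
    {
        return f(f(x));
    }

    public static void Main()
    {
        Func<int, int> inc = x => x + 1;
        Func<int, int, int> mul = (x, y) => x * y;

        Console.WriteLine(inc(41));                   // 42
        Console.WriteLine(mul(6, 7));                 // 42
        Console.WriteLine(ApplyTwice(inc, 40));       // 42
        Console.WriteLine(ApplyTwice(x => x * x, 3)); // 81
    }
}
```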
Where
• Filters the elements in a collection based on a predicate
    IEnumerable<T> Where<T>(IEnumerable<T> source, Func<T, bool> pred)
• Example:
    int[] a = { 1, 5, 2, 12, 4, 5 };
    IEnumerable<int> result = a.Where(x => x > 4);   // 5, 12, 5
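To make the behavior of Where concrete, here is one way such an operator could be written with an iterator method. This is a hypothetical re-implementation for illustration (MyWhere and WhereSketch are invented names), not the library's actual source.

```csharp
using System;
using System.Collections.Generic;

public static class WhereSketch
{
    // Hypothetical re-implementation for illustration; not the library source.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> pred)
    {
        foreach (T item in source)
        {
            if (pred(item))
            {
                yield return item; // emit only elements that satisfy the predicate
            }
        }
    }

    public static void Main()
    {
        int[] a = { 1, 5, 2, 12, 4, 5 };
        Console.WriteLine(string.Join(", ", a.MyWhere(x => x > 4))); // 5, 12, 5
    }
}
```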
Select
• Transforms each element of a collection into a new form
    IEnumerable<R> Select<T, R>(IEnumerable<T> source, Func<T, R> selector)
• Example:
    int[] a = { 1, 5, 2, 12, 4, 5 };
    IEnumerable<int> result = a.Select(x => x * x);   // 1, 25, 4, 144, 16, 25
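The two type parameters in the signature mean the selector may also change the element type. A small sketch, assuming a hypothetical helper named Describe that maps each int to a string:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectDemo
{
    // T = int, R = string: the result sequence has a different element type.
    public static IEnumerable<string> Describe(IEnumerable<int> source)
    {
        return source.Select(x => x + " squared is " + (x * x));
    }

    public static void Main()
    {
        foreach (string line in Describe(new[] { 1, 5, 2 }))
        {
            Console.WriteLine(line);
        }
        // 1 squared is 1
        // 5 squared is 25
        // 2 squared is 4
    }
}
```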
Composing Operators
• Composing computations
    int[] a = { 1, 5, 2, 12, 4, 5 };
    IEnumerable<int> r1 = a.Where(x => x > 4);
    IEnumerable<int> r2 = r1.Select(x => x * x);
• Or simply
    int[] a = { 1, 5, 2, 12, 4, 5 };
    IEnumerable<int> r2 = a.Where(x => x > 4).Select(x => x * x);
• You can use "var" to let the compiler infer the type
    int[] a = { 1, 5, 2, 12, 4, 5 };
    var r2 = a.Where(x => x > 4).Select(x => x * x);
Query Comprehension
• If you dislike lambda expressions, you can use the alternative SQL-like syntax
    int[] a = { 1, 5, 2, 12, 4, 5 };
    var r2 = from x in a
             where x > 4
             select x * x;
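The query syntax and the method-call syntax describe the same computation. The sketch below runs both forms on the same array and compares the results; QuerySyntaxDemo, WithMethods, and WithQuery are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxDemo
{
    // Method-call form with lambda expressions.
    public static IEnumerable<int> WithMethods(int[] a)
    {
        return a.Where(x => x > 4).Select(x => x * x);
    }

    // Equivalent query-comprehension form.
    public static IEnumerable<int> WithQuery(int[] a)
    {
        return from x in a
               where x > 4
               select x * x;
    }

    public static void Main()
    {
        int[] a = { 1, 5, 2, 12, 4, 5 };
        Console.WriteLine(string.Join(", ", WithMethods(a)));          // 25, 144, 25
        Console.WriteLine(string.Join(", ", WithQuery(a)));            // 25, 144, 25
        Console.WriteLine(WithMethods(a).SequenceEqual(WithQuery(a))); // True
    }
}
```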
Invoking User-Defined Functions
int[] a = { 1, 5, 2, 12, 4, 5 };
var r2 = from x in a
         where MyPredicate(x)
         select Math.Sqrt(x);

public static bool MyPredicate(int x)
{
    // User code here
}
SelectMany
• Transforms each element of a sequence into an IEnumerable<R> and flattens the resulting sequences into one sequence
    IEnumerable<R> SelectMany<T, R>(IEnumerable<T> source, Func<T, IEnumerable<R>> selector)
• Example:
    string[] lines = { "A line of words of wisdom", "Dryad and DryadLINQ are great" };
    var result = lines.SelectMany(x => x.Split(' '));
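The difference between Select and SelectMany is easiest to see by counting elements: Select keeps one array per line, while SelectMany concatenates the arrays into a single sequence of words. A minimal sketch, with SelectManyDemo and Words as hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectManyDemo
{
    // Each line yields an array of words; SelectMany concatenates the arrays.
    public static List<string> Words(IEnumerable<string> lines)
    {
        return lines.SelectMany(x => x.Split(' ')).ToList();
    }

    public static void Main()
    {
        string[] lines = { "A line of words of wisdom", "Dryad and DryadLINQ are great" };

        // Select produces one string[] per input line.
        Console.WriteLine(lines.Select(x => x.Split(' ')).Count()); // 2

        // SelectMany flattens them: 6 words + 5 words.
        List<string> words = Words(lines);
        Console.WriteLine(words.Count); // 11
        Console.WriteLine(words[6]);    // Dryad
    }
}
```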
OrderBy
• Sorts the elements of a sequence in ascending order according to a key
    IEnumerable<T> OrderBy<T, K>(IEnumerable<T> source, Func<T, K> keySelector)
• Example:
    IEnumerable<Employee> employees;
    var result = employees.OrderBy(x => x.Name);
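The slide does not define Employee, so the sketch below assumes a hypothetical class with Name and Age fields and sorts on Age to show how the key selector determines the order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type; the slide does not define Employee.
public class Employee
{
    public string Name = "";
    public int Age;
}

public static class OrderByDemo
{
    // The key selector picks the value to sort on; the order is ascending.
    public static List<string> NamesByAge(IEnumerable<Employee> employees)
    {
        return employees.OrderBy(x => x.Age).Select(x => x.Name).ToList();
    }

    public static void Main()
    {
        var employees = new List<Employee>
        {
            new Employee { Name = "Mike", Age = 41 },
            new Employee { Name = "Bob", Age = 29 },
            new Employee { Name = "Frank", Age = 35 },
        };
        Console.WriteLine(string.Join(", ", NamesByAge(employees))); // Bob, Frank, Mike
    }
}
```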
GroupBy
• Groups the elements of a sequence according to a specified key selector function
    IEnumerable<IGrouping<K, T>> GroupBy<T, K>(IEnumerable<T> source, Func<T, K> keySelector)
• IGrouping<K, T> represents a group of elements of type T with key K
– g.Key returns the key of the group
– g is an IEnumerable<T>
GroupBy Examples
• Example 1:
    string[] items = { "carrots", "cabbage", "broccoli", "beans", "barley" };
    IEnumerable<IGrouping<char, string>> foodGroups = items.GroupBy(x => x[0]);
• Example 2:
    int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
    var groups = numbers.GroupBy(x => x % 5);
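To see what the groups contain, each IGrouping can be enumerated like any other sequence. The sketch below prints the key and members of every group for the numbers example; GroupByDemo and Describe are hypothetical names. With in-memory collections, groups appear in the order in which their keys are first encountered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // Each group g exposes g.Key and can be enumerated as an IEnumerable<int>.
    public static List<string> Describe(IEnumerable<int> numbers)
    {
        return numbers.GroupBy(x => x % 5)
                      .Select(g => g.Key + ": " + string.Join(", ", g))
                      .ToList();
    }

    public static void Main()
    {
        int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
        foreach (string line in Describe(numbers))
        {
            Console.WriteLine(line);
        }
        // 0: 5, 0
        // 4: 4, 9
        // 1: 1, 6
        // 3: 3, 8
        // 2: 7, 2
    }
}
```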
Distinct
• Returns distinct elements from a sequence
    IEnumerable<T> Distinct<T>(IEnumerable<T> source)
• Example:
    int[] numbers = { 1, 5, 2, 11, 5, 30, 2, 2, 7 };
    var result = numbers.Distinct();
Aggregate
• Applies an accumulator function over a sequence
    R Aggregate<T, R>(IEnumerable<T> source, R seed, Func<R, T, R> func)
• Example:
    double[] doubles = { 1.7, 2.3, 1.9, 4.1, 2.9 };
    double result = doubles.Aggregate(1.0, (r, n) => r * n);
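The accumulator type R need not match the element type T. The sketch below shows one fold where both are int and one where strings are folded into an int; AggregateDemo, Product, and TotalLength are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateDemo
{
    // T = int, R = int: multiply all elements, starting from the seed 1.
    public static int Product(IEnumerable<int> xs)
    {
        return xs.Aggregate(1, (r, n) => r * n);
    }

    // T = string, R = int: the accumulator type differs from the element type.
    public static int TotalLength(IEnumerable<string> words)
    {
        return words.Aggregate(0, (r, w) => r + w.Length);
    }

    public static void Main()
    {
        Console.WriteLine(Product(new[] { 1, 2, 3, 4 }));                            // 24
        Console.WriteLine(TotalLength(new[] { "carrots", "cabbage", "broccoli" })); // 22
    }
}
```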
Pre-defined Aggregators
• Useful predefined operators
– Count, LongCount, Sum, Max, Min, Average, All, Any, Contains, …
• Examples:
    IEnumerable<int> numbers;
    long result = numbers.LongCount();
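A short sketch exercising several of these operators on one array. The sample data and the helper Summarize are hypothetical and chosen so the average is a whole number.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregatorsDemo
{
    // Combines a few predefined aggregators into one summary string.
    public static string Summarize(IEnumerable<int> xs)
    {
        return $"count={xs.Count()} sum={xs.Sum()} min={xs.Min()} max={xs.Max()}";
    }

    public static void Main()
    {
        int[] numbers = { 1, 5, 2, 12, 4, 6 };

        Console.WriteLine(Summarize(numbers));         // count=6 sum=30 min=1 max=12
        Console.WriteLine(numbers.LongCount());        // 6
        Console.WriteLine(numbers.Average());          // 5
        Console.WriteLine(numbers.All(x => x > 0));    // True
        Console.WriteLine(numbers.Any(x => x > 10));   // True
        Console.WriteLine(numbers.Contains(7));        // False
    }
}
```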
Exercises (1)
• Keep all numbers divisible by 5
    var div = x.Where(v => v % 5 == 0);
• The average value
    var avg = (double)x.Sum() / x.Count();   // the cast avoids integer division when x holds ints
• Normalize the numbers to have a mean value of 0
    var avg = (double)x.Sum() / x.Count();
    var norm = x.Select(v => v - avg);
• Keep each number only once
    var uniq = x.GroupBy(v => v)
                .Select(g => g.Key);
  or, more directly:
    var uniq = x.Distinct();
Exercises (2)
• Flatten lists
    var flatten = x.SelectMany(v => v);
• The average of all even numbers and of all odd numbers
    var avgs = x.GroupBy(v => v % 2 == 0)   // a bool key keeps negative odd numbers in one group
                .Select(g => (double)g.Sum() / g.Count());
• The most frequent value
    var freq = x.GroupBy(v => v)
                .OrderByDescending(g => g.Count())   // largest group first
                .Select(g => g.Key)
                .First();
• The number of distinct positive values
    var pos = x.Where(v => v > 0)
               .Distinct()
               .Count();
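The exercises leave x unspecified. The sketch below assumes x is an int[] with hypothetical sample data and runs several of the queries so the results can be checked by hand; ExercisesDemo, MostFrequent, and DistinctPositiveCount are invented names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExercisesDemo
{
    // Sort groups by size, largest first, and take the key of the first group.
    public static int MostFrequent(IEnumerable<int> x)
    {
        return x.GroupBy(v => v)
                .OrderByDescending(g => g.Count())
                .Select(g => g.Key)
                .First();
    }

    // Strictly positive values only, each counted once.
    public static int DistinctPositiveCount(IEnumerable<int> x)
    {
        return x.Where(v => v > 0).Distinct().Count();
    }

    public static void Main()
    {
        int[] x = { 4, 10, 5, 10, -5, 0 }; // hypothetical sample data

        Console.WriteLine(string.Join(", ", x.Where(v => v % 5 == 0))); // 10, 5, 10, -5, 0
        double avg = x.Average();
        Console.WriteLine(avg);                                         // 4
        Console.WriteLine(string.Join(", ", x.Distinct()));             // 4, 10, 5, -5, 0
        Console.WriteLine(MostFrequent(x));                             // 10
        Console.WriteLine(DistinctPositiveCount(x));                    // 3
    }
}
```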
Putting them together: Histogram
public static IEnumerable<Pair> Histogram(IEnumerable<LineRecord> input, int k)
{
    var words = input.SelectMany(x => x.line.Split(' '));
    var groups = words.GroupBy(x => x);
    var counts = groups.Select(x => new Pair(x.Key, x.Count()));
    var ordered = counts.OrderByDescending(x => x.count);
    var top = ordered.Take(k);
    return top;
}
Intermediate values for k = 3:
input:   "A line of words of wisdom"
words:   ["A", "line", "of", "words", "of", "wisdom"]
groups:  [["A"], ["line"], ["of", "of"], ["words"], ["wisdom"]]
counts:  [{"A", 1}, {"line", 1}, {"of", 2}, {"words", 1}, {"wisdom", 1}]
ordered: [{"of", 2}, {"A", 1}, {"line", 1}, {"words", 1}, {"wisdom", 1}]
top:     [{"of", 2}, {"A", 1}, {"line", 1}]
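The slide does not show the Pair and LineRecord types. The sketch below fills them in with hypothetical minimal definitions (fields named word, count, and line, matching how the slide's code uses them) so that Histogram can be run on the example line.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal types; the slide uses them without defining them.
public class LineRecord
{
    public string line;
    public LineRecord(string line) { this.line = line; }
}

public class Pair
{
    public string word;
    public int count;
    public Pair(string word, int count) { this.word = word; this.count = count; }
}

public static class HistogramDemo
{
    public static IEnumerable<Pair> Histogram(IEnumerable<LineRecord> input, int k)
    {
        var words = input.SelectMany(x => x.line.Split(' '));         // split lines into words
        var groups = words.GroupBy(x => x);                           // one group per distinct word
        var counts = groups.Select(x => new Pair(x.Key, x.Count()));  // word with its frequency
        var ordered = counts.OrderByDescending(x => x.count);         // most frequent first
        return ordered.Take(k);                                       // keep the top k
    }

    public static void Main()
    {
        var input = new[] { new LineRecord("A line of words of wisdom") };
        foreach (Pair p in Histogram(input, 3))
        {
            Console.WriteLine(p.word + " " + p.count);
        }
        // of 2
        // A 1
        // line 1
    }
}
```

OrderByDescending is a stable sort, which is why the words with count 1 keep their original relative order.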
MapReduce in LINQ
public static IEnumerable<R> MapReduce<T, M, K, R>(
    IEnumerable<T> input,
    Func<T, IEnumerable<M>> mapper,
    Func<M, K> keySelector,
    Func<IGrouping<K, M>, IEnumerable<R>> reducer)
{
    var map = input.SelectMany(mapper);
    var group = map.GroupBy(keySelector);
    var result = group.SelectMany(reducer);
    return result;
}
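As a usage sketch, word counting can be expressed with this MapReduce function: the mapper splits a line into words, the key selector is the word itself, and the reducer emits one "word:count" string per group. WordCount and MapReduceDemo are hypothetical names, and the type arguments are written out explicitly for clarity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MapReduceDemo
{
    public static IEnumerable<R> MapReduce<T, M, K, R>(
        IEnumerable<T> input,
        Func<T, IEnumerable<M>> mapper,
        Func<M, K> keySelector,
        Func<IGrouping<K, M>, IEnumerable<R>> reducer)
    {
        var map = input.SelectMany(mapper);     // map: each input yields zero or more records
        var group = map.GroupBy(keySelector);   // shuffle: collect records by key
        return group.SelectMany(reducer);       // reduce: each group yields zero or more results
    }

    // Hypothetical example job: count word occurrences across lines.
    public static IEnumerable<string> WordCount(IEnumerable<string> lines)
    {
        return MapReduce<string, string, string, string>(
            lines,
            line => line.Split(' '),
            word => word,
            g => new[] { g.Key + ":" + g.Count() });
    }

    public static void Main()
    {
        string[] lines = { "a b a", "b a" };
        Console.WriteLine(string.Join(", ", WordCount(lines))); // a:3, b:2
    }
}
```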