LINQ makes it very easy to post-process collections (List, DataTable, and so on). This post covers several ways to perform a Distinct operation on a collection with LINQ.
0. Prepare the data:
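The data-preparation listing is not included in this copy of the post. The sketch below shows roughly what the later examples assume: a `User` class with four string fields `A` to `D`, a static `GetData()` method, and a `ToString()` that joins the fields with commas. The class shape is inferred from how the later code uses it, the row values come from the output section, and the row order is an assumption chosen to match the order of the query results; none of this is the author's original listing.

```csharp
using System.Collections.Generic;

// Hypothetical reconstruction: the members below are inferred from the
// later examples, not copied from the original post.
public class User
{
    public string A { get; set; }
    public string B { get; set; }
    public string C { get; set; }
    public string D { get; set; }

    public User(string a, string b, string c, string d)
    {
        A = a; B = b; C = c; D = d;
    }

    // Sample rows: values taken from the output section, order assumed.
    public static List<User> GetData()
    {
        return new List<User>
        {
            new User("a1", "b1", "c1", "d1"),
            new User("a2", "b1", "c1", "d1"),
            new User("a1", "b2", "c1", "d1"),
            new User("a1", "b1", "c2", "d1"),
            new User("a1", "b1", "c1", "d2"),
        };
    }

    // Prints the four fields separated by commas, e.g. "a1,b1,c1,d1".
    public override string ToString()
    {
        return string.Join(",", A, B, C, D);
    }
}
```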
1. Use GroupBy: group by the field(s) you want to be distinct and take the first record of each group. The result is the distinct data.
    Console.WriteLine("Distinct1 By: A");
    var query1 = from e in User.GetData()
                 group e by new { e.A } into g
                 select g.FirstOrDefault();
    foreach (var u in query1)
        Console.WriteLine(u.ToString());
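The same grouping can also be written in method syntax. The sketch below wraps it in a hypothetical helper, `FirstPerKey` (a name made up for this illustration), which keeps the first element seen for each key. `GroupBy` yields groups in the order their keys first appear, so the result preserves the source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the original post.
public static class GroupByDistinctSketch
{
    // Groups by the selected key and keeps the first element of each group.
    public static List<T> FirstPerKey<T, TKey>(
        IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        return source
            .GroupBy(keySelector)
            .Select(g => g.First()) // a group is never empty, so First() is safe
            .ToList();
    }
}
```

With the `User` class from this post, the call would look like `GroupByDistinctSketch.FirstPerKey(User.GetData(), u => u.A)`.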
2. Use the Distinct() extension method: this requires implementing the IEqualityComparer<T> interface.
    class UserCompare : IEqualityComparer<User>
    {
        public bool Equals(User x, User y)
        {
            return (x.A == y.A && x.B == y.B);
        }

        public int GetHashCode(User obj)
        {
            // return obj.GetHashCode();
            // Note: ToString() prints all four fields (see the output),
            // while Equals() looks only at A and B.
            return obj.ToString().ToLower().GetHashCode();
        }
    }

    Console.WriteLine("Distinct2 By: A,B");
    var compare = new UserCompare();
    var query2 = User.GetData().Distinct(compare);
    foreach (var u in query2)
        Console.WriteLine(u.ToString());
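Pay attention to GetHashCode() in the implementation above. Returning obj.GetHashCode() directly does not work: unless User overrides GetHashCode(), every instance gets its own reference-based hash, so Distinct() almost never reaches Equals() and removes nothing. The hash also has to agree with Equals(): any two objects that Equals() treats as equal must return the same hash code. The ToString()-based hash used here includes C and D, so rows that match on A and B but differ in C or D still hash differently, which is why the Distinct2 output lists all five rows. To deduplicate by A and B only, compute the hash from A and B alone.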
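One way to keep Equals() and GetHashCode() consistent is to derive both from the same key. The sketch below is a hypothetical generic comparer, `KeyEqualityComparer<T, TKey>` (not part of the original post or of the framework), that compares and hashes objects through a key selector, so the two methods cannot drift apart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration: equality and hashing both go
// through the same key, so they always agree with each other.
public sealed class KeyEqualityComparer<T, TKey> : IEqualityComparer<T>
{
    private readonly Func<T, TKey> _keySelector;
    private readonly IEqualityComparer<TKey> _keyComparer =
        EqualityComparer<TKey>.Default;

    public KeyEqualityComparer(Func<T, TKey> keySelector)
    {
        _keySelector = keySelector
            ?? throw new ArgumentNullException(nameof(keySelector));
    }

    public bool Equals(T x, T y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return _keyComparer.Equals(_keySelector(x), _keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        if (obj == null) return 0;
        TKey key = _keySelector(obj);
        // Hash only the key, never the fields that Equals() ignores.
        return key == null ? 0 : _keyComparer.GetHashCode(key);
    }
}
```

Assuming a compiler with value-tuple support (C# 7 or later), the `User` example could then be written as `User.GetData().Distinct(new KeyEqualityComparer<User, (string, string)>(u => (u.A, u.B)))`. With the five rows listed in the output section, this should return three rows, one per distinct (A, B) pair.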
3. Define a custom extension method DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector):
    public static class MyEnumerableExtensions
    {
        public static IEnumerable<TSource> DistinctBy<TSource, TKey>
            (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
        {
            HashSet<TKey> seenKeys = new HashSet<TKey>();
            foreach (TSource element in source)
            {
                // HashSet<T>.Add returns false when the key has already been seen.
                if (seenKeys.Add(keySelector(element))) { yield return element; }
            }
        }
    }

    Console.WriteLine("Distinct3 By: A,B,C");
    var query3 = User.GetData().DistinctBy(x => new { x.A, x.B, x.C });
    foreach (var u in query3)
        Console.WriteLine(u.ToString());
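The anonymous-type key works here because C# anonymous types get compiler-generated Equals() and GetHashCode() based on their property values. The short sketch below illustrates that behaviour on its own; the class and method names are made up for the example.

```csharp
using System.Collections.Generic;

// Illustration only: shows why an anonymous type is usable as a HashSet key.
public static class AnonymousKeySketch
{
    public static int CountDistinctKeys()
    {
        var seen = new HashSet<object>();
        seen.Add(new { A = "a1", B = "b1", C = "c1" });
        seen.Add(new { A = "a1", B = "b1", C = "c1" }); // same values: rejected
        seen.Add(new { A = "a1", B = "b1", C = "c2" }); // different C: accepted
        return seen.Count; // 2
    }
}
```

This is also why no custom comparer is needed for approach 3: the key itself already carries value-based equality.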
Output:
A B C D
a2,b1,c1,d1
a1,b2,c1,d1
a1,b1,c1,d1
a1,b1,c2,d1
a1,b1,c1,d2
----------------
Distinct1 By: A
a1,b1,c1,d1
a2,b1,c1,d1
----------------
Distinct2 By: A,B
a1,b1,c1,d1
a2,b1,c1,d1
a1,b2,c1,d1
a1,b1,c2,d1
a1,b1,c1,d2
----------------
Distinct3 By: A,B,C
a1,b1,c1,d1
a2,b1,c1,d1
a1,b2,c1,d1
a1,b1,c2,d1
----------------