LINQ's Distinct() on a particular property

I am playing with LINQ to learn about it, but I can't figure out how to use Distinct when I do not have a simple list (a simple list of integers is pretty easy to do; that is not the question). What if I want to use Distinct on a list of objects, based on one or more properties of the object?

Example: Suppose the object is a Person with a property Id. How can I get all Person objects and apply Distinct to them using the Id property?

Person1: Id=1, Name="Test1"
Person2: Id=1, Name="Test1"
Person3: Id=2, Name="Test2"

How can I get just Person1 and Person3? Is that possible?

If it's not possible with LINQ, what would be the best way to have a list of Person depending on some of its properties in .NET 3.5?

EDIT: This is now part of MoreLINQ.

What you need is a "distinct-by" effectively. I don't believe it's part of LINQ as it stands, although it's fairly easy to write:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
    (this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    HashSet<TKey> seenKeys = new HashSet<TKey>();
    foreach (TSource element in source)
    {
        if (seenKeys.Add(keySelector(element)))
        {
            yield return element;
        }
    }
}

So to find the distinct values using just the Id property, you could use:

var query = people.DistinctBy(p => p.Id);

And to use multiple properties, you can use anonymous types, which implement equality appropriately:

var query = people.DistinctBy(p => new { p.Id, p.Name });

Untested, but it should work (and it now at least compiles).

It assumes the default comparer for the keys though - if you want to pass in an equality comparer, just pass it on to the HashSet constructor.
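As a rough sketch of that comparer variant (not necessarily how MoreLINQ implements it), the overload below forwards an IEqualityComparer<TKey> to the HashSet constructor. The name DistinctByKey and the containing class are hypothetical; the different name is chosen only so the sketch does not clash with the Enumerable.DistinctBy method that ships with .NET 6 and later.

```csharp
using System;
using System.Collections.Generic;

public static class DistinctByKeyExtensions
{
    // Hypothetical name: DistinctByKey avoids a clash with Enumerable.DistinctBy (.NET 6+).
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer)
    {
        // Passing null here makes HashSet fall back to EqualityComparer<TKey>.Default.
        var seenKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            // Add returns false when the key has already been seen, so duplicates are skipped.
            if (seenKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }
}
```

Assuming Person has a string Name property, a case-insensitive distinct on the name would then look like people.DistinctByKey(p => p.Name, StringComparer.OrdinalIgnoreCase).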


What if I want to obtain a distinct list based on one or more properties?

Simple! You want to group them and pick a winner out of the group.

List<Person> distinctPeople = allPeople
  .GroupBy(p => p.PersonId)
  .Select(g => g.First())
  .ToList();

If you want to define groups on multiple properties, here's how:

List<Person> distinctPeople = allPeople
  .GroupBy(p => new {p.PersonId, p.FavoriteColor} )
  .Select(g => g.First())
  .ToList();
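For illustration, here is the same GroupBy approach applied to the sample data from the question. The Person class below is an assumption (the question only names the Id and Name properties), and the key is Id rather than PersonId to match the question. In LINQ to Objects, GroupBy yields groups in the order their keys first appear, so First() picks the earliest Person for each Id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Person, based on the properties named in the question.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class GroupByDemo
{
    // Keeps the first Person encountered for each Id.
    public static List<Person> DistinctById(IEnumerable<Person> people)
    {
        return people
            .GroupBy(p => p.Id)
            .Select(g => g.First())
            .ToList();
    }

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person { Id = 1, Name = "Test1" },
            new Person { Id = 1, Name = "Test1" },
            new Person { Id = 2, Name = "Test2" },
        };

        foreach (var p in DistinctById(people))
        {
            Console.WriteLine($"{p.Id}: {p.Name}");
        }
        // Prints:
        // 1: Test1
        // 2: Test2
    }
}
```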


Use:

List<Person> pList = new List<Person>();
/* Fill list */

var result = pList.Where(p => p.Name != null).GroupBy(p => p.Id).Select(grp => grp.FirstOrDefault());

The Where filters the entries (the condition could be more complex), and the GroupBy and Select together perform the distinct operation.


You could also use query syntax if you want it to look all LINQ-like:

var uniquePeople = from p in people
                   group p by new {p.ID} //or group by new {p.ID, p.Name, p.Whatever}
                   into mygroup
                   select mygroup.FirstOrDefault();


If you only need the distinct values of the property, and not the objects themselves, I think this is enough:

list.Select(s => s.MyField).Distinct();
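A small sketch of what this projection returns on the question's sample data. The helper name DistinctIds and the anonymous-type stand-ins for Person are assumptions for illustration; the result is a list of Id values, with no Person attached to them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectDistinctDemo
{
    // Hypothetical helper: projects each item to its id, then removes duplicate ids.
    public static List<int> DistinctIds<T>(IEnumerable<T> items, Func<T, int> idSelector)
    {
        return items.Select(idSelector).Distinct().ToList();
    }

    public static void Main()
    {
        // Anonymous objects stand in for the Person instances from the question.
        var people = new[]
        {
            new { Id = 1, Name = "Test1" },
            new { Id = 1, Name = "Test1" },
            new { Id = 2, Name = "Test2" },
        };

        List<int> ids = DistinctIds(people, p => p.Id);
        Console.WriteLine(string.Join(", ", ids)); // prints "1, 2"
    }
}
```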


Comments
  • Source to DistinctBy: code.google.com/p/morelinq/source/browse/MoreLinq/DistinctBy.cs
  • @ashes999: I'm not sure what you mean. The code is present in the answer and in the library - depending on whether you're happy to take on a dependency.
  • @ashes999: If you're only doing this in a single place, ever, then sure, using GroupBy is simpler. If you need it in more than one place, it's much cleaner (IMO) to encapsulate the intention.
  • @MatthewWhited: Given that there's no mention of IQueryable<T> here, I don't see how it's relevant. I agree that this wouldn't be suitable for EF etc, but within LINQ to Objects I think it's more suitable than GroupBy. The context of the question is always important.
  • The project moved to GitHub; here's the code for DistinctBy: github.com/morelinq/MoreLINQ/blob/master/MoreLinq/DistinctBy.cs
  • @ErenErsonmez sure. With my posted code, if deferred execution is desired, leave off the ToList call.
  • Very nice answer! Realllllly helped me in Linq-to-Entities driven from a sql view where I couldn't modify the view. I needed to use FirstOrDefault() rather than First() - all is good.
  • I tried it and it should change to Select(g => g.FirstOrDefault())
  • @ChocapicSz Nope. Single() and SingleOrDefault() both throw when the source has more than one item. In this operation, we expect that each group may have more than one item. For that matter, First() is preferred over FirstOrDefault() because each group must have at least one member... unless you're using Entity Framework, which can't figure out that each group has at least one member and demands FirstOrDefault().
  • Seems to not be currently supported in EF Core, even using FirstOrDefault() github.com/dotnet/efcore/issues/12088 I am on 3.1, and I get "unable to translate" errors.