Substring comparer on Intersect

I need to intersect two lists of strings, but comparing by substring instead of by equality:

public class MiNumeroEqualityComparer : IEqualityComparer<string> {
    public bool Equals(string x, string y) => x.Contains(y);
    public int GetHashCode(string obj) => obj.GetHashCode();
}

List<string> lst = new List<string> { "abcXdef", "abcXdef", "abcede", "aYcde" };
List<string> num = new List<string> { "X", "Y", "Z" };

var fin = lst.Intersect(num, new MiNumeroEqualityComparer());

I expect fin to contain: "abcXdef", "abcXdef", "aYcde"

But it's empty, why?

I first tried a case-insensitive substring comparison, also without success:

public bool Equals(string x, string y) => x.IndexOf(y, StringComparison.InvariantCultureIgnoreCase) >= 0;

But the result is empty too.

You're doing an intersection between two lists, which will give you the common items between them. Since neither list contains an identical item, you are getting no results.

If you want to get all the items from lst that contain an item from num, then you can do something like the code below, which uses the string.Contains method to filter the items from lst:

var fin = lst.Where(item => num.Any(item.Contains));

Result:

{ "abcXdef", "abcXdef", "aYcde" }

Alternatively, if you do want to do a case-insensitive query, you can use the IndexOf method instead:

var fin = lst.Where(item => num.Any(n => 
    item.IndexOf(n, StringComparison.OrdinalIgnoreCase) >= 0));

If that's hard to understand (sometimes LINQ is), the first code snippet above is a shorthand way of writing the following:

var fin = new List<string>();

foreach (var item in lst)
{
    foreach (var n in num)
    {
        if (item.Contains(n))
        {
            fin.Add(item);
            break;
        }
    }
}
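
For reference, the two queries above can be folded into one runnable program. This is only a sketch: the helper name ContainingAny and its comparison parameter are made up for illustration and are not part of LINQ. It assumes none of the fragments is an empty string, since an empty fragment would match every item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubstringFilter
{
    // Hypothetical helper (not a framework API): returns every item that
    // contains at least one fragment, compared with the given StringComparison.
    public static List<string> ContainingAny(
        IEnumerable<string> items,
        IEnumerable<string> fragments,
        StringComparison comparison = StringComparison.Ordinal)
    {
        // Materialize once so the fragments are not re-enumerated per item.
        var frags = fragments.ToList();
        return items
            .Where(item => frags.Any(f => item.IndexOf(f, comparison) >= 0))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var lst = new List<string> { "abcXdef", "abcXdef", "abcede", "aYcde" };
        var num = new List<string> { "X", "Y", "Z" };

        // Case-sensitive: prints "abcXdef, abcXdef, aYcde"
        Console.WriteLine(string.Join(", ", SubstringFilter.ContainingAny(lst, num)));

        // Case-insensitive with lower-case fragments: prints the same three items
        var lower = new List<string> { "x", "y", "z" };
        Console.WriteLine(string.Join(", ",
            SubstringFilter.ContainingAny(lst, lower, StringComparison.OrdinalIgnoreCase)));
    }
}
```

Unlike Intersect, this keeps duplicates from lst, which matches the expected result in the question.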

Rufus's answer solves your problem, but let me explain why your approach was not working.

The result is empty because Equals(string x, string y) is never called. Intersect uses a hash-based lookup: it first compares the values returned by GetHashCode, and it only calls Equals when two hash codes are the same. Since "abcXdef" and "X" produce different hash codes, they are treated as unequal right away and your logic in Equals is never executed.

Here is some code so you can see what is going on.

class Program
{
    static void Main(string[] args)
    {
        // See I added an item at the end here to show when Equals is called
        List<string> lst = new List<string> { "abcXdef", "abcXdef", "abcede", "aYcde", "X" };
        List<string> num = new List<string> { "X", "Y", "Z" };

        var fin = lst.Intersect(num, new MiNumeroEqualityComparer()).ToList();
        Console.ReadLine();
    }
}

public class MiNumeroEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        Console.WriteLine("Equals called for {0} and {1}.", x, y);
        return x.Contains(y);
    }

    public int GetHashCode(string obj)
    {
        Console.WriteLine("GetHashCode called for {0}.", obj);
        return obj.GetHashCode();
    }
}

If you run the above code, it will only call Equals for items which produce the same hash; so for "X" only.

See the output in this fiddle.
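
To see why the hash code matters, here is a rough sketch of what Intersect does internally. It is an approximation written for illustration, not the framework's actual source, and the name IntersectLike is made up. The second sequence goes into a hash set built with the comparer, and each item of the first sequence is yielded only if it can be removed from that set. The set looks items up by hash code first, so Equals is only reached when the hash codes match.

```csharp
using System;
using System.Collections.Generic;

public static class IntersectSketch
{
    // Approximation of Enumerable.Intersect, for illustration only.
    public static IEnumerable<T> IntersectLike<T>(
        IEnumerable<T> first,
        IEnumerable<T> second,
        IEqualityComparer<T> comparer)
    {
        // The set buckets items by comparer.GetHashCode and only falls back
        // to comparer.Equals for items whose hash codes are equal.
        var set = new HashSet<T>(second, comparer);

        foreach (var item in first)
        {
            // Remove succeeds at most once per distinct value,
            // which is why the result never contains duplicates.
            if (set.Remove(item))
            {
                yield return item;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var lst = new List<string> { "abcXdef", "abcXdef", "abcede", "aYcde", "X" };
        var num = new List<string> { "X", "Y", "Z" };

        // With ordinary string equality only "X" is common: prints "X"
        Console.WriteLine(string.Join(", ",
            IntersectSketch.IntersectLike(lst, num, StringComparer.Ordinal)));
    }
}
```

The same sketch also shows that Intersect could not produce the expected result from the question even with a working comparer, because "abcXdef" would be returned once, not twice.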

Intersect gets common elements from 2 collections. The Intersect method here is elegant. It can be used on many types of elements.

Your result is empty because the two lists have no value in common. If num also contains some of the items in lst, Intersect returns them (each one only once):

List<string> lst = new List<string> { "abcXdef", "abcXdef", "abcede", "aYcde" };
List<string> num = new List<string> { "X", "Y", "abcXdef", "Z", "aYcde" };

var fin = lst.Intersect(num);

Result:

{ "abcXdef", "aYcde" }

Comments
  • X, Y, Z are strings not char, could be: XXX, Y, ZZZZZ
  • Your equality comparer is broken for a start. You seem to be trying to use it to make "abcXdef" equal "X", but you have many problems. For a start, your Equals is not symmetric, so Equals(x,y) is not going to give the same result as Equals(y,x). Also, your GetHashCode is wrong, which is likely why nothing is happening. "abcXdef" has a different hash code to "X", so there is no way they can be treated as the same, and it probably isn't even running your Equals method...
  • Thanks for the comment. I've tried making GetHashCode return 1 for every string; some strings then end up in the result, but not all that should. I suppose that LINQ internally has several optimizations and not all elements go through the Equals function.
  • The point of my comment was not just to point out flaws in your equality implementation but also to question whether it's the right way to do things. "x" and "abcXdef" are not equal in any practical sense, so making an equality comparer that says they are is silly. For an equality comparer, if a=b and b=c then a=c should be true. With your logic that two strings are equal if one is a substring of the other, you would eventually end up having to treat more or less all strings as equal to each other.
  • You're completely right @Chris, thanks! Now I understand quite well.
  • Thanks a lot. I'm a newbie with LINQ; with foreach loops it's quite easy for me, but I want to learn and understand LINQ.
  • Thank you very much for the explanation, I understand now. I didn't know dotnetfiddle.net, it's very useful, thanks!
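
The two problems raised in the comments can be checked directly. This is a small sketch: SubstringEqualityComparer mirrors the comparer from the question but returns a constant hash code as tried in the comments, and EitherContains is a hypothetical symmetric variant added only to show that transitivity still fails.

```csharp
using System;
using System.Collections.Generic;

// Same Equals logic as the question's comparer, with the constant hash code
// that was tried in the comments.
public class SubstringEqualityComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) => x.Contains(y);
    public int GetHashCode(string obj) => 1;
}

public static class SubstringRelation
{
    // Hypothetical symmetric variant: "equal" if either string contains the other.
    public static bool EitherContains(string a, string b) =>
        a.Contains(b) || b.Contains(a);
}

public static class Program
{
    public static void Main()
    {
        var cmp = new SubstringEqualityComparer();

        // Not symmetric: the answer depends on the argument order.
        Console.WriteLine(cmp.Equals("abcXdef", "X")); // True
        Console.WriteLine(cmp.Equals("X", "abcXdef")); // False

        // Symmetric variant is still not transitive:
        // "abcXdef" ~ "X" and "X" ~ "aXc", but "abcXdef" and "aXc" do not match.
        Console.WriteLine(SubstringRelation.EitherContains("abcXdef", "X"));   // True
        Console.WriteLine(SubstringRelation.EitherContains("X", "aXc"));       // True
        Console.WriteLine(SubstringRelation.EitherContains("abcXdef", "aXc")); // False
    }
}
```

Because a hash-based collection is free to call Equals with its arguments in either order, a comparer like this gives results that depend on implementation details, which fits the inconsistent output described in the comments.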