Finding longest common matching substring from given List<string> in a string

longest common substring recursive
longest common substring in an array of strings
longest common substring python
longest common substring - leetcode
longest common substring javascript
longest common substring - youtube
longest common subsequence
longest palindromic substring

I have a list of strings, which I want to find the start and end indices of their occurrence in a given string.

I want to find the longest common substring that is present in the original string and only print it.

Here is my code:

public static void Main(string[] args)
{
    //This is my original string where I want to find the occurance of the longest common substring
    string str = "will you consider the lic premium of my in-laws for tax exemption";

    //Here are the substrings which I want to compare   
    List<string> subStringsToCompare = new List<string>
    {
        "Life Insurance Premium",
        "lic",
        "life insurance",
        "life insurance policy",
        "lic premium",
        "insurance premium",
        "insurance premium",
        "premium"
    };

    foreach(var item in subStringsToCompare)
    {
        int start = str.IndexOf(item);

        if(start != -1)
        {
            Console.WriteLine("Match found: '{0}' at {1} till {2} character position", item, start, start + item.Length);
        }
    }
}

The problem is I am getting 3 occurrences instead of one. I can't seem to figure out the condition where it gets the longest common matched substring from all substring to compare.

Output I am getting:

  • Match found: 'lic' at 22 till 25 character position
  • Match found: 'lic premium' at 22 till 33 character position
  • Match found: 'premium' at 26 till 33 character position

Output expected:

  • Match found: 'lic premium' at 22 till 33 character position

.NET Fiddle

Here's what I suggested in my comment

public static void Main(string[] args)
{
    //This is my original string where I want to find the occurance of the longest common substring
    string str = "will you consider the lic premium of my in-laws for tax exemption";

    // Here are the substrings which I want to compare
    // (Sorted by length descending) 
    List<string> subStringsToCompare = new List<string>
    {
        "Life Insurance Premium",
        "life insurance policy",
        "insurance premium",
        "life insurance",
        "lic premium",
        "premium",
        "lic"
    };

    foreach(var item in subStringsToCompare)
    {
        int start = str.IndexOf(item);

        if(start != -1)
        {
            Console.WriteLine("Match found: '{0}' at {1} till {2} character position", item, start, start + item.Length);

            break; // Stop at the first match found
        }
    }
}

Longest Common Substring in an Array of Strings, C++ program to find the stem of given list of. // words. #include <bits/stdc++.h>. using namespace std;. // function to find the stem (longest common. // substring)  Given two strings ‘X’ and ‘Y’, find the length of the longest common substring. Examples : The longest common substring is “Geeks” and is of length 5. The longest common substring is “abcd” and is of length 4. The longest common substring is “abcdez” and is of length 6.

if you only needs exact match from the list of strings (not substring of the strings within the list) then you are pretty close

string longest = null;
int longestStart = 0;
foreach(var item in subStringsToCompare)
{
    int start = str.IndexOf(item);

    if(start != -1 && (longest == null || item.Length > longest.Length))
    {
        longest = item;
        longestStart = start
    }
}

if (longest != null)
{
    Console.WriteLine("Match found: '{0}' at {1} till {2} character position", longest, longestStart, longestStart + longest.Length);
}

Dynamic Programming, Given two string sequences write an algorithm to find, find the length of longest substring present in both of them. Recommended Posts: Longest Common Prefix Matching | Set-6. Longest Common Prefix using Character by Character Matching. Find shortest unique prefix for every word in a given list | Set 2 (Using Sorting) C program to Replace a word in a text by another given word. Print longest palindrome word in a sentence.

I haven't tried but try the following:

List<string> matches = new List<string>();
for( int i = 0; i < str.Length; i++ )
{
    foreach ( string toCompare in subStringsToCompare )
    {
        if ( str.SubString( i, toCompare.Length ) == toCompare )
            matches.Add( toCompare );
    }
}

string longest = "";
foreach ( string match in matches )
{
    if ( match.Length > longest.Length )
        longest = match;
}

Finding longest common matching substring from given List<string , Here's what I suggested in my comment public static void Main(string[] args) { //​This is my original string where I want to find the occurance of  Given two strings text1 and text2, return the length of their longest common subsequence. A subsequence of a string is a new string generated from the original string with some characters(can be none) deleted without changing the relative order of the remaining characters. (eg, "ace" is a subsequence of "abcde" while "aec" is not).

Longest Common Substring problem, The longest common substring problem is the problem of finding the longest string(s) Linked List · DP · Graph · Backtracking · Matrix · Heap · D&C · String · Sorting of finding the longest string (or strings) that is a substring (or are substrings) of decreasing lengths and return as soon any substring matches the first string. Remove ‘b’ and ‘ac’ from a given string; wildcard character matching; Longest Palindromic Substring; Given string is interleaving of two other strings or not; Print all permutations with repetition; Run length encoding; List items containing all characters of a given word; Write a program to print all permutations of a given string

Longest Common Substring, In computer science, the longest common substring problem is to find the longest string (or strings) that is a substring (or are substrings) of two or more strings. Given a list of k (k≤100) strings of length at most 1000, find a longest common substring of all strings in the given list. This is based on a problem from Rosalind. I tried writing a straight forward program to do this.

Longest common substring problem, In computer science, the longest common substring problem is to find the longest string that is a substring of two or more strings. Analysis Given. What is Longest Common Substring: A longest substring is a sequence that appears in the same order and necessarily contiguous in both the strings. Example : String A = "tutorialhorizon"; String B = "dynamictutorialProgramming"; Output: Length of Longest Common Substring: 8 ("tutorial").

Comments
  • Can you not sort subStringsToCompare by length (descending) and bail out at the first match found? That way "lic premium" will be found first and displayed before "lic" and "premium" get in the foreach loop.
  • @vc that in my opinion is not fool-proof
  • what do you mean?
  • @vc74 I mean it stills looks for lic and premium later in the loop. How to bail out if I get the match? Here's the updated fiddle
  • I've added an answer to show what I mean
  • This would further increase the complexity resulting in unnecessary computations.
  • Could you perhaps explain why you think your changes are better? As it is we are left guessing why you think this is an improvement. For example I can't see what advantage there is in going character by character through the string, finding a substring and comparing them rather than just using .indexof as the OP did originally.
  • "Changes"? My answer was no change of some code there is not existing...