Longest common prefix (LCP) of a list of strings

lcs([ H|L1],[ H|L2],[H|Lcs]) :-
    !,
    lcs(L1,L2,Lcs).
lcs([H1|L1],[H2|L2],Lcs):-
    lcs(    L1 ,[H2|L2],Lcs1),
    lcs([H1|L1],    L2 ,Lcs2),
    longest(Lcs1,Lcs2,Lcs),
    !.
lcs(_,_,[]).

longest(L1,L2,Longest) :-
    length(L1,Length1),
    length(L2,Length2),
    (  Length1 > Length2
    -> Longest = L1
    ;  Longest = L2
    ).

This is my code so far. How could I optimize it so that it prints the prefix, e.g.:

["interview", "interrupt", "integrate", "intermediate"]

should return "inte"

A bit rusty with Prolog, haven't done it in a while :)

First, let's start with something related, but much simpler.

:- set_prolog_flag(double_quotes, chars).  % "abc" = [a,b,c]

prefix_of(Prefix, List) :-
   append(Prefix, _, List).

commonprefix(Prefix, Lists) :-
   maplist(prefix_of(Prefix), Lists).

?- commonprefix(Prefix, ["interview", "integrate", "intermediate"]).
   Prefix = []
;  Prefix = "i"
;  Prefix = "in"
;  Prefix = "int"
;  Prefix = "inte"
;  false.

(See this answer for how to print character lists with double quotes.)

This is the part that is fairly easy in Prolog. The only drawback is that it doesn't give us the maximum, but rather all possible solutions, including the maximum. Note that not all strings need to be known, for example:

?- commonprefix(Prefix, ["interview", "integrate", Xs]).
   Prefix = []
;  Prefix = "i", Xs = [i|_A]
;  Prefix = "in", Xs = [i, n|_A]
;  Prefix = "int", Xs = [i, n, t|_A]
;  Prefix = "inte", Xs = [i, n, t, e|_A]
;  false.

So we get as response a partial description of the last unknown word. Now imagine, later on we realize that Xs = "induce". No problem for Prolog:

?- commonprefix(Prefix, ["interview", "integrate", Xs]), Xs = "induce".
   Prefix = [], Xs = "induce"
;  Prefix = "i", Xs = "induce"
;  Prefix = "in", Xs = "induce"
;  false.

In fact, it does not make a difference whether we state this in hindsight or just before the actual query:

?- Xs = "induce", commonprefix(Prefix, ["interview", "integrate", Xs]).
   Xs = "induce", Prefix = []
;  Xs = "induce", Prefix = "i"
;  Xs = "induce", Prefix = "in"
;  false.

Can we now formulate the maximum based on this? Note that this effectively requires some form of extra quantifier, for which Prolog has no direct provision. For this reason we have to limit ourselves to certain cases that we know will be safe. The easiest way out is to insist that the list of words does not contain any variables. I will use iwhen/2 for this purpose.

maxprefix(Prefix, Lists) :-
   iwhen(ground(Lists), maxprefix_g(Prefix, Lists)).

maxprefix_g(Prefix, Lists_g) :-
   setof(N-IPrefix, ( commonprefix(IPrefix, Lists_g), length(IPrefix, N ) ), Ns),
   append(_,[N-Prefix], Ns).   % the longest one

The downside of this approach is that we get instantiation errors should the list of words not be known.
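
iwhen/2 is not a built-in predicate, so the code above will not load on its own. As a minimal sketch, and assuming only the ground/1 condition used here needs to be supported, a stand-in could look as follows. This is an illustrative simplification, not the definition the answer refers to, which may be more general.

```prolog
% Stand-in for iwhen/2 (an assumption, not the referenced definition).
% Only the condition ground(T) is handled: the goal is called if T is
% ground, otherwise an instantiation error is raised.
iwhen(ground(T), G_0) :-
   (  ground(T)
   -> call(G_0)
   ;  throw(error(instantiation_error, _))
   ).
```

With this in place, maxprefix(P, ["interview", Xs]) raises an instantiation error instead of silently giving incomplete answers.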

Note that we made quite a few assumptions (which I hope really hold). In particular, we assumed that there is exactly one maximum. In this case this holds, but in general there could be several independent values for Prefix. We also assumed that IPrefix will always be ground. We could check that too, just to be sure. Alternatively:

maxprefix_g(Prefix, Lists_g) :-
   setof(N, IPrefix^ ( commonprefix(IPrefix, Lists_g), length(IPrefix, N ) ), Ns),
   append(_,[N], Ns),
   length(Prefix, N),
   commonprefix(Prefix, Lists_g).

Here, the prefix does not have to be one single prefix (which it is in our situation).

The best, however, would be a purer version that does not need to resort to instantiation errors at all.

Here is how I would implement this:

:- set_prolog_flag(double_quotes, chars).

longest_common_prefix([], []).
longest_common_prefix([H], H).
longest_common_prefix([H1,H2|T], P) :-
    maplist(append(P), L, [H1,H2|T]),
    (   one_empty_head(L)
    ;   maplist(head, L, Hs),
        not_all_equal(Hs)
    ).

one_empty_head([[]|_]).
one_empty_head([[_|_]|T]) :-
    one_empty_head(T).

head([H|_], H).

not_all_equal([E|Es]) :-
    some_dif(Es, E).

some_dif([X|Xs], E) :-
    if_(diffirst(X,E), true, some_dif(Xs,E)).

diffirst(X, Y, T) :-
    (   X == Y -> T = false
    ;   X \= Y -> T = true
    ;   T = true,  dif(X, Y)
    ;   T = false, X = Y
    ).

The implementation of not_all_equal/1 is from this answer by @repeat (you can find my implementation in the edit history).

We use append and maplist to split each string in the list into a prefix and a suffix, such that the prefix is the same for all strings. For this prefix to be the longest, we need to state that the first characters of at least two of the suffixes are different.

This is why we use head/2, one_empty_head/1 and not_all_equal/1. head/2 is used to retrieve the first char of a string; one_empty_head/1 is used to state that if one of the suffixes is empty, then automatically this is the longest prefix. not_all_equal/1 is used to then check or state that at least two characters are different.
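
To look at the splitting step in isolation, here is a small sketch. The helper name split_all/3 is hypothetical and does not appear in the answer; it only wraps the maplist/append call so that the enumerated splits can be inspected.

```prolog
:- set_prolog_flag(double_quotes, chars).

% split_all(Prefix, Suffixes, Words): hypothetical helper for illustration.
% Every word in Words is Prefix followed by the corresponding suffix.
split_all(Prefix, Suffixes, Words) :-
   maplist(append(Prefix), Suffixes, Words).

% ?- split_all(P, Ss, ["ab", "ac"]).
%    P = [],  Ss = [[a,b],[a,c]]
% ;  P = [a], Ss = [[b],[c]]
% ;  false.
```

Both answers are common prefixes, but only the second leaves suffixes whose first characters differ, which is the condition that head/2 and not_all_equal/1 express.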

Examples
?- longest_common_prefix(["interview", "integrate", "intermediate"], Z).
Z = [i, n, t, e] ;
false.

?- longest_common_prefix(["interview", X, "intermediate"], "inte").
X = [i, n, t, e] ;
X = [i, n, t, e, _156|_158],
dif(_156, r) ;
false.

?- longest_common_prefix(["interview", "integrate", X], Z).
X = Z, Z = [] ;
X = [_246|_248],
Z = [],
dif(_246, i) ;
X = Z, Z = [i] ;
X = [i, _260|_262],
Z = [i],
dif(_260, n) ;
X = Z, Z = [i, n] ;
X = [i, n, _272|_274],
Z = [i, n],
dif(_272, t) ;
X = Z, Z = [i, n, t] ;
X = [i, n, t, _284|_286],
Z = [i, n, t],
dif(_284, e) ;
X = Z, Z = [i, n, t, e] ;
X = [i, n, t, e, _216|_224],
Z = [i, n, t, e] ;
false.

?- longest_common_prefix([X,Y], "abc").
X = [a, b, c],
Y = [a, b, c|_60] ;
X = [a, b, c, _84|_86],
Y = [a, b, c] ;
X = [a, b, c, _218|_220],
Y = [a, b, c, _242|_244],
dif(_218, _242) ;
false.

?- longest_common_prefix(L, "abc").
L = [[a, b, c]] ;
L = [[a, b, c], [a, b, c|_88]] ;
L = [[a, b, c, _112|_114], [a, b, c]] ;
L = [[a, b, c, _248|_250], [a, b, c, _278|_280]],
dif(_248, _278) ;
L = [[a, b, c], [a, b, c|_76], [a, b, c|_100]] ;
L = [[a, b, c, _130|_132], [a, b, c], [a, b, c|_100]];
…

Here's the purified variant of the code proposed (and subsequently retracted) by @CapelliC:

:- set_prolog_flag(double_quotes, chars).

:- use_module(library(reif)).

lists_lcp([], []).
lists_lcp([Es|Ess], Ls) :-
   if_((maplist_t(list_first_rest_t, [Es|Ess], [X|Xs], Ess0),
        maplist_t(=(X), Xs))
       , (Ls = [X|Ls0], lists_lcp(Ess0, Ls0))
       , Ls = []).

list_first_rest_t([], _, _, false).
list_first_rest_t([X|Xs], X, Xs, true).

The meta-predicate maplist_t/3 above is a variant of maplist/2 that works with reified term equality/inequality; maplist_t/5 is the same with higher arity:

maplist_t(P_2, Xs, T) :-
   i_maplist_t(Xs, P_2, T).

i_maplist_t([], _P_2, true).
i_maplist_t([X|Xs], P_2, T) :-
   if_(call(P_2, X), i_maplist_t(Xs, P_2, T), T = false).

maplist_t(P_4, Xs, Ys, Zs, T) :-
   i_maplist_t(Xs, Ys, Zs, P_4, T).

i_maplist_t([], [], [], _P_4, true).
i_maplist_t([X|Xs], [Y|Ys], [Z|Zs], P_4, T) :-
   if_(call(P_4, X, Y, Z), i_maplist_t(Xs, Ys, Zs, P_4, T), T = false).

First, here's a ground query:

?- lists_lcp(["a","ab"], []).
false.                                % fails (as expected)

Here are the queries presented in @Fatalize's fine answer.

?- lists_lcp(["interview",X,"intermediate"], "inte").
   X = [i,n,t,e]
;  X = [i,n,t,e,_A|_B], dif(_A,r)
;  false.

?- lists_lcp(["interview","integrate",X], Z).
   X = Z, Z = []
;  X = Z, Z = [i]
;  X = Z, Z = [i,n]
;  X = Z, Z = [i,n,t]
;  X = Z, Z = [i,n,t,e]
;  X = [i,n,t,e,_A|_B], Z = [i,n,t,e]
;  X = [i,n,t,_A|_B]  , Z = [i,n,t]  , dif(_A,e)
;  X = [i,n,_A|_B]    , Z = [i,n]    , dif(_A,t)
;  X = [i,_A|_B]      , Z = [i]      , dif(_A,n)
;  X = [_A|_B]        , Z = []       , dif(_A,i).

?- lists_lcp([X,Y], "abc").
   X = [a,b,c]      , Y = [a,b,c|_A]
;  X = [a,b,c,_A|_B], Y = [a,b,c]
;  X = [a,b,c,_A|_B], Y = [a,b,c,_C|_D], dif(_A,_C)
;  false.

?- lists_lcp(L, "abc").
   L = [[a,b,c]]
;  L = [[a,b,c],[a,b,c|_A]]
;  L = [[a,b,c,_A|_B],[a,b,c]]
;  L = [[a,b,c,_A|_B],[a,b,c,_C|_D]], dif(_A,_C)
;  L = [[a,b,c],[a,b,c|_A],[a,b,c|_B]]
;  L = [[a,b,c,_A|_B],[a,b,c],[a,b,c|_C]]
;  L = [[a,b,c,_A|_B],[a,b,c,_C|_D],[a,b,c]]
;  L = [[a,b,c,_A|_B],[a,b,c,_C|_D],[a,b,c,_E|_F]], dif(_A,_E) 
…

Last, here's the query showing improved determinism:

?- lists_lcp(["interview","integrate","intermediate"], Z).
Z = [i,n,t,e].                              % succeeds deterministically

This previous answer presented an implementation based on if_/3.

:- use_module(library(reif)).

Here comes a somewhat different take on it:

lists_lcp([], []).
lists_lcp([Es|Ess], Xs) :-
   foldl(list_list_lcp, Ess, Es, Xs).                % foldl/4

list_list_lcp([], _, []).
list_list_lcp([X|Xs], Ys0, Zs0) :-
   if_(list_first_rest_t(Ys0, Y, Ys)                 % if_/3
      , ( Zs0 = [X|Zs], list_list_lcp(Xs, Ys, Zs) )
      ,   Zs0 = []
      ).

list_first_rest_t([], _, _, false).
list_first_rest_t([X|Xs], Y, Xs, T) :-
   =(X, Y, T).                                       % =/3

Almost all queries in my previous answer give the same answers, so I do not show them here.

lists_lcp([X,Y], "abc"), however, does not terminate universally anymore with the new code.

A simple version:

:- set_prolog_flag(double_quotes, chars).
pref([],_,[]).
pref(_,[],[]).
pref([H|T1],[H|T2],[H|Tr]):-
    pref(T1,T2,Tr).
pref([H|_],[H|_],[]).
pref([H1|_],[H2|_],[]):-
    dif(H1,H2).
lcf([],[]).
lcf([W],R):-
    pref(W,W,R).
lcf([W1,W2|L],R):-
    pref(W1,W2,R),
    lcf([W2|L],R).

Examples:

?- pref("interview","integrate",R).
R = [i, n, t, e] ;
R = [i, n, t] ;
R = [i, n] ;
R = [i] ;
R = [] ;
false.

?- lcf(["interview", "interrupt", "integrate", "intermediate"],R).
R = [i, n, t, e]

?- lcf(["interview", "interrupt", X, "intermediate"],R).
R = X, X = [i, n, t, e, r]

Comments
  • Is there anything wrong with the code? Does it provide a solution? Or an incorrect solution? Having it "print" the prefix when it doesn't currently is a feature addition, not an "optimization".
  • longest_common_prefix([[A],[B],[b]], []), A=a,B=b. gives two identical solutions?
  • @false My implementation introduces redundant dif constraints that I don't see how to avoid.
  • Up to not_all_equal_/1 this is a highly Prologish approach!
  • @false I am in the process of writing a question on the implementation of not_all_equal, because it seems like a useful predicate but a difficult one to implement properly…
  • Please note that with the Prolog flag set as above, [i,n,t,e] = "inte"! So they are the same. See my answer for how to get "inte" written as shown above!
  • Nice, but it seems we can take it a step further!
  • @false. Further? More (&better) queries showing that answers do not "overlap"?
  • Yes, indeed. Different approach! Imagine you have lcp/3. That is, the lcp for two lists. And now ...
  • My reasoning was flawed. It's not just for [X,Y] but also for larger lists!