Find longest unique substring in string python

print longest substring without repeating characters python
maximum substring hackerrank solution
longest substring without repeating characters - youtube
longest substring with k distinct characters - leetcode
longest substring with repeating characters java
longest repeating substring in a string
find the length of the longest substring without repeating characters c#
longest even length substring leetcode

I am trying that age old question (there are multitudes of versions around) of finding the longest substring of a string which doesn't contain repeated characters. I can't work out why my attempt doesn't work properly:

def findLongest(inputStr):
    resultSet = []
    substr = []

    for c in inputStr:
        print ("c: ", c)
        if substr == []:
            substr.append([c])
            continue

        print(substr)
        for str in substr:
            print ("c: ",c," - str: ",str,"\n")
            if c in str:
                resultSet.append(str)
                substr.remove(str)
            else:
                str.append(c)
        substr.append([c])



    print("Result set:")
    print(resultSet)
    return max(resultSet, key=len)

print (findLongest("pwwkewambb"))

When my output gets to the second 'w', it doesn't iterate over all the substr elements. I think I've done something silly, but I can't see what it is so some guidance would be appreciated! I feel like I'm going to kick myself at the answer...

The beginning of my output:

c:  p
c:  w
[['p']]
c:  w  - str:  ['p']

c:  w
[['p', 'w'], ['w']]
c:  w  - str:  ['p', 'w'] # I expect the next line to say c: w - str: ['w']

c:  k
[['w'], ['w']] # it is like the w was ignored as it is here
c:  k  - str:  ['w']

c:  k  - str:  ['w']
...

EDIT:

I replaced the for loop with

for idx, str in enumerate(substr):
    print ("c: ",c," - str: ",str,"\n")
    if c in str:
        resultSet.append(str)
        substr[idx] = []
    else:
        str.append(c)

and it produces the correct result. The only thing is that the empty element arrays get set with the next character. It seems a bit pointless; there must be a better way.

My expected output is kewamb.

e.g.

c:  p
c:  w
[['p']]
c:  w  - str:  ['p']

c:  w
[['p', 'w'], ['w']]
c:  w  - str:  ['p', 'w']

c:  w  - str:  ['w']

c:  k
[[], [], ['w']]
c:  k  - str:  []

c:  k  - str:  []

c:  k  - str:  ['w']

c:  e
[['k'], ['k'], ['w', 'k'], ['k']]
c:  e  - str:  ['k']

c:  e  - str:  ['k']

c:  e  - str:  ['w', 'k']

c:  e  - str:  ['k']
...

Edit, per comment by @seymour on incorrect responses:

def find_longest(s):
    _longest = set()
    def longest(x):
         if x in _longest:
             _longest.clear()
             return False
         _longest.add(x)
         return True
    return ''.join(max((list(g) for _, g in groupby(s, key=longest)), key=len))

And test:

In [101]: assert find_longest('pwwkewambb') == 'kewamb'

In [102]: assert find_longest('abcabcbb') == 'abc'

In [103]: assert find_longest('abczxyabczxya') == 'abczxy'

Old answer:

from itertools import groupby

s = set() ## for mutable access

''.join(max((list(g) for _, g in groupby('pwwkewambb', key=lambda x: not ((s and x == s.pop()) or s.add(x)))), key=len))
'kewamb'

groupby returns an iterator grouped based on the function provided in the key argument, which by default is lambda x: x. Instead of the default we are utilizing some state by using a mutable structure (which could have been done a more intuitive way if using a normal function)

lambda x: not ((s and x == s.pop()) or s.add(x))

What is happening here is since I can't reassign a global assignment in a lambda (again I can do this, using a proper function), I just created a global mutable structure that I can add/remove. The key (no pun) is that I only keep elements that I need by using a short circuit to add/remove items as needed.

max and len are fairly self explanatory, to get the longest list produced by groupby

Another version without the mutable global structure business:

def longest(x):
     if hasattr(longest, 'last'):
         result = not (longest.last == x)
         longest.last = x
         return result
     longest.last = x
     return True


''.join(max((list(g) for _, g in groupby('pwwkewambb', key=longest)), key=len))
'kewamb'

How To Find Longest Substring Without Repeating Characters In , How do you find the longest substring in a string? Find the longest substring with k unique characters in a given string; Check if sum of Fibonacci elements in an Array is a Fibonacci number or not; Word Ladder - Set 2 ( Bi-directional BFS ) Remove odd frequency characters from the string; Longest permutation subsequence in a given array; Count the elements having frequency equals to its value

Print Longest substring without repeating characters, How do I find all the substrings in a string? In order to traverse the string in linear time, we can use a common technique “sliding window” - use a start pointer and an end pointer to represent a substring in a sliding window, if the current substring doesn't have duplicate characters, it means the string can potentially be longer and you just need to move forward the end pointer by one character.

Depends on your definition of repeated characters: if you mean consecutive, then the approved solution is slick, but not of characters appearing more than once (e.g.: pwwkewabmb -> 'kewabmb' ).

Here's what I came up with (Python 2):

def longest(word):
    begin = 0
    end = 0
    longest = (0,0)
    for i in xrange(len(word)):
        try:
            j = word.index(word[i],begin,end)
            # longest?
            if end-begin >= longest[1]-longest[0]:
                longest = (begin,end)
            begin = j+1
            if begin==end:
                end += 1
        except:
            end = i+1
    end=i+1
    if end-begin >= longest[1]-longest[0]:
        longest = (begin,end)
    return word[slice(*longest)]

Thus

>>> print longest('pwwkewabmb')
kewabm
>>> print longest('pwwkewambb')
kewamb
>>> print longest('bbbb')
b

Find longest unique substring in string python, How do you find the length of the longest substring without repeating characters? Given a list of k (k≤100) strings of length at most 1000, find a longest common substring of all strings in the given list. This is based on a problem from Rosalind. I tried writing a straight forward program to do this.

Longest Substring Without Repeating Characters, Find the longest substring with k unique characters in a given string · Print the longest common substring · Strings formed from given characters without any  Longest repeating and non-overlapping substring; Longest substring such that no three consecutive characters are same; Longest substring of only 4's from the first N characters of the infinite string; Find the longest substring with k unique characters in a given string; Longest substring with K unique characters using Binary Search

Given a string, find the length of the longest substring without , Edit, per comment by @seymour on incorrect responses: def find_longest(s): _longest = set() def longest(x): if x in _longest: _longest.clear()  Meaning that the substring was not found in the parent string. If the substring is found, the find method will return the starting index of the first character of the substring. In this case the starting index is 6 because the first character of the substring is found at character 6 of the parent string.

Longest substring of given string containing distinct characters , Given a string, find the length of the longest substring without repeating characters. Example Longest Substring with At Most Two Distinct Characters. Medium. Python string method find () determines if string str occurs in string, or in a substring of string if starting index beg and ending index end are given. str − This specifies the string to be searched. beg − This is the starting index, by default its 0. end − This is the ending index, by default its equal to the length of the string.

Comments
  • substr.remove(str): doing that while iterating is bad
  • ah really? didn't know that. I tried using str = [] before and that didn't work so starting using remove
  • am I thinking about this the wrong way - is there a more intuitive solution?
  • you should mention the expected output. Is that "kewamb" ?
  • Ah, yes. I'll edit it. I tried out something which got the expected output, but isn't a perfect solution.
  • I wouldn't call that intuitive but that's very good. Care to explain to the world how it works?
  • This is some kind of genius. An explanation would make my life easier, too!
  • It's using groupby to create groups with characters not similar, using side effect in the key to add or remove from set, and use max to compute max string. Same principle as my solution but one-liner & comprehensions. Really nice.
  • I admit I had to google the question to see if it wasn't plagiarism because 1) it's very good 2) I don't know that user whereas I'm here like a lot and 3) no explanation. Didn't find that anywhere else. Please explain more, you'll get more upvotes.
  • @Jean-FrançoisFabre explanation coming, I was bored at work. I would admit plagiarism but I can claim this one as my own, borrowing heavily from things I seen of course.
  • Good answer, but the case "bbbb" doesn't produce the expected "b"
  • yes, fixed (edge cases / index values to tune): now it works.
  • I just realised this has a "for else" which I don't remember ever coming across. Learnt something else new.
  • that's a nice feature to avoid the use of a cumbersome flag.