Basically, I want to keep double letters like the 'ss' in 'guess' but collapse long runs like the 'iiiiiiiiiiiiiiii'.

I'm cleaning data for text analytics.

import itertools

s_input = "guess who just got shoes boiiiiiiiiiiiiii"

print(''.join(i for i, _ in itertools.groupby(s_input)))  # this also takes out the 'ss' in guess

>"gues who just got shoes boi"

The intent is to get the following: "guess who just got shoes boi"

Notice that 'guess' keeps the 'ss'.

You could use re.sub to get this done. The documentation describes it as follows:

Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn’t found, string is returned unchanged. repl can be a string or a function; if it is a string, any backslash escapes in it are processed.

import re

s_input = "guess who just got shoes boiiiiiiiiiiiiii"


guess who just got shoes boi
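The re.sub call itself does not appear in the snippet above, so here is a minimal sketch of one pattern that produces the output shown. The pattern and the helper name collapse_long_runs are assumptions for illustration, not necessarily what the answer originally used. It treats a "long run" as three or more of the same character, which leaves ordinary double letters alone.

```python
import re

def collapse_long_runs(text):
    # Hypothetical helper: (.) captures one character and \1{2,} requires
    # at least two more copies of it, so only runs of three or more
    # are replaced by a single copy of that character.
    return re.sub(r'(.)\1{2,}', r'\1', text)

s_input = "guess who just got shoes boiiiiiiiiiiiiii"
print(collapse_long_runs(s_input))  # guess who just got shoes boi
```

Note that (.) matches any character except a newline, so runs of three or more spaces or punctuation marks are collapsed too. Swapping it for (\w) would restrict the rule to letters, digits and underscores.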

You could do:

print(''.join(i if len(g) > 2 else ''.join(g) 
              for i, g in itertools.groupby(s_input)
              for g in [list(g)]))

But that would be rather hard to read.
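The odd-looking "for g in [list(g)]" clause is there because each group produced by itertools.groupby is a lazy iterator: it has no len() and can only be consumed once. A small sketch of what the materialized groups look like (the sample string "boiii" is just an illustration):

```python
import itertools

# Each group yielded by groupby is a one-shot iterator, so it is turned
# into a list before its length can be checked or it can be joined.
runs = [(key, list(group)) for key, group in itertools.groupby("boiii")]
print(runs)  # [('b', ['b']), ('o', ['o']), ('i', ['i', 'i', 'i'])]
```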

For complicated generators, I like to write a generator function:

import itertools

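def kill_long_dups(s):
    for key, group in itertools.groupby(s):
        group = list(group)
        if len(group) > 2:
            # Run of three or more: keep a single copy.
            yield key
        else:
            # Run of one or two: keep it unchanged.
            yield from group

s_input = "guess who just got shoes boiiiiiiiiiiiiii"
print(''.join(kill_long_dups(s_input)))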
