Python: strip function definition using regex

I am a complete beginner at programming and am reading the book "Automate the Boring Stuff with Python". In Chapter 7, there is a practice project: the regex version of strip(). My code below does not work (I use Python 3.6.1). Could anyone help?

import re

string = input("Enter a string to strip: ")
strip_chars = input("Enter the characters you want to be stripped: ")

def strip_fn(string, strip_chars):
    if strip_chars == '':
        blank_start_end_regex = re.compile(r'^(\s)+|(\s)+$')
        stripped_string = blank_start_end_regex.sub('', string)
        print(stripped_string)
    else:
        strip_chars_start_end_regex = re.compile(r'^(strip_chars)*|(strip_chars)*$')
        stripped_string = strip_chars_start_end_regex.sub('', string)
        print(stripped_string)

When you use the r'^(strip_chars)*|(strip_chars)*$' string literal, strip_chars is not interpolated, i.e. it is treated as literal text in the pattern. You need to pass it to the regex as a variable. However, passing it in its current form would still produce a wrong regex, because (...) in a regex is a grouping construct, while you want to match a single character from the set of characters stored in the strip_chars variable.

You could just wrap the string in a pair of [ and ] to create a character class, but if the variable contains, say, z-a, the resulting pattern would be invalid. You also need to escape each character to play it safe.
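A small sketch of why the bare character class is risky, assuming the z-a input mentioned above. The helper name build_class is made up for illustration and is not part of the original code.

```python
import re

def build_class(chars):
    # Hypothetical helper: escape each character, then wrap the result in [ ].
    return "[{}]".format("".join(re.escape(c) for c in chars))

# Unescaped, "z-a" is read as a reversed range, so the pattern does not compile.
try:
    re.compile("[z-a]")
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# Escaped, the same three characters (z, -, a) are matched literally.
escaped = re.compile(build_class("z-a"))
result = escaped.sub("", "a-z-b")
print(unescaped_ok, result)
```

With the escaping in place, the hyphen is just another character in the set rather than a range operator.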

Replace

r'^(strip_chars)*|(strip_chars)*$'

with

r'^[{0}]+|[{0}]+$'.format("".join([re.escape(x) for x in strip_chars]))

I advise replacing the * quantifier (zero or more occurrences) with + (one or more occurrences) because, in most cases, when we want to remove something, we need to match at least one occurrence of the unwanted string(s).

Also, you may replace r'^(\s)+|(\s)+$' with r'^\s+|\s+$', since a repeated capturing group rewrites its group value on each iteration, which slightly slows down the regex.
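Putting these suggestions together, the function might look roughly like the sketch below. The name regex_strip and the choice to return the result instead of printing it are my own assumptions, not part of the original script. Note also that the script as posted defines strip_fn but never calls it.

```python
import re

def regex_strip(text, strip_chars=""):
    """Return text with leading and trailing strip_chars (or whitespace) removed."""
    if strip_chars == "":
        # No characters given: strip whitespace from both ends.
        pattern = r"^\s+|\s+$"
    else:
        # Escape every character so it is literal inside the character class.
        char_class = "".join(re.escape(c) for c in strip_chars)
        pattern = r"^[{0}]+|[{0}]+$".format(char_class)
    return re.sub(pattern, "", text)

print(regex_strip("  hello  "))
print(regex_strip("xxhixyx", "xy"))
```

Characters in the middle of the string are left alone because both alternatives are anchored with ^ or $.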

You can also use re.sub to remove the characters at the start or end. Say the character is 'x':

re.sub(r'^x+', "", string)
re.sub(r'x+$', "", string)

The first line acts as lstrip and the second as rstrip. This just looks simpler.
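As a rough sketch, the two calls can be wrapped so the character is a parameter. The function names regex_lstrip and regex_rstrip are hypothetical, and this assumes a single character is passed in.

```python
import re

def regex_lstrip(text, char="x"):
    # Remove one or more copies of char from the start of text.
    return re.sub(r"^{}+".format(re.escape(char)), "", text)

def regex_rstrip(text, char="x"):
    # Remove one or more copies of char from the end of text.
    return re.sub(r"{}+$".format(re.escape(char)), "", text)

sample = "xxaxbxx"
left = regex_lstrip(sample)
right = regex_rstrip(sample)
both = regex_rstrip(regex_lstrip(sample))
print(left, right, both)
```

Chaining the two calls gives the same effect as a full strip for that one character.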

#! python
# Regex version of strip()
import re
def RegexStrip(mainString, charsToBeRemoved=None):
    if charsToBeRemoved:
        # Escape the characters so they are literal inside the character class,
        # and anchor the pattern so only the two ends are stripped.
        chars = re.escape(charsToBeRemoved)
        regex = re.compile(r'^[%s]+|[%s]+$' % (chars, chars))
    else:
        # Default: strip leading and trailing whitespace.
        regex = re.compile(r'^\s+|\s+$')
    return regex.sub('', mainString)

Str='   hello3123my43name is antony    '
print(RegexStrip(Str))

Maybe this could help; it can be simplified further, of course.

Comments
  • Does not work how, and for what input?
  • r'^(strip_chars)*|(strip_chars)*$' -> r'^[{0}]+|[{0}]+$'.format("".join([re.escape(x) for x in strip_chars])). Also, remove the unnecessary ( and ) in r'^(\s)+|(\s)+$'.