Replace a word in a string by a tag


Let's consider the following HTML snippet:

html = '''
 <p>
  The chairman of European Union leaders, Donald Tusk, will meet May in London on Thursday, a day after the bloc’s Brexit negotiator weakened sterling by issuing another warning to Britain, which is due to leave the bloc in March 2019.
 </p>
'''

Let's turn it into a BeautifulSoup object:

from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

I would like to transform that soup object so that its HTML output is:

'''
    <p>
      The chairman of European Union leaders, <span style="color : red"> Donald Tusk </span>, will meet May in London on Thursday, a day after the bloc’s Brexit negotiator weakened sterling by issuing another warning to Britain, which is due to leave the bloc in March 2019.
     </p>
'''

On the BeautifulSoup documentation page I found a couple of examples of how to replace a string, create a new tag, or insert a new tag at a specific location in the tree, but nothing on how to add a new tag in the middle of a string, as in my use case.

Any help very welcome.

First let me say, thanks for posting this question, because it was a very interesting coding problem.

I spent some time looking at this problem and finally decided to throw an answer into the ring.

I attempted to use insert_before() and insert_after() from BeautifulSoup to modify the <p> tag in your example HTML. I also looked at using extend() and append() from BeautifulSoup. After dozens of attempts, I just could not get the results you requested.

The code below seems to accomplish the requested HTML modification based on a keyword (e.g., Donald Tusk). I used replace_with() from BeautifulSoup to replace the original tag in the HTML with new_tag() from BeautifulSoup.

The code works, but I'm sure that it can be refined.

import re

from bs4 import BeautifulSoup

raw_html = """
<p> This is a test. </p>
<p>The chairman of European Union leaders, Donald Tusk, will meet May in London on Thursday, a day after the bloc’s Brexit negotiator weakened sterling by issuing another warning to Britain, which is due to leave the bloc in March 2019.</p>
<p> This is also a test. </p>
"""

soup = BeautifulSoup(raw_html, 'lxml')

# find the tag that contains the keyword Donald Tusk
original_tag = soup.find('p', string=re.compile(r'Donald Tusk'))

if original_tag:
  # modify text in the tag that was found in the HTML
  tag_to_modify = str(original_tag.get_text()).replace('Donald Tusk,', '<span style="color:red">Donald Tusk</span>,')

  print(tag_to_modify)
  # outputs:
  # The chairman of European Union leaders, <span style="color:red">Donald Tusk</span>, will meet May in London on Thursday, a day after the bloc’s Brexit negotiator weakened sterling by issuing another warning to Britain, which is due to leave the bloc in March 2019.

  # create a new <p> tag in the soup
  new_tag = soup.new_tag('p')

  # add the modified text to the new tag
  # setting a tag’s .string attribute replaces the contents with the new string
  new_tag.string = tag_to_modify

  # replace the original tag with the new tag
  old_tag = original_tag.replace_with(new_tag)

  # formatter=None, BeautifulSoup will not modify strings on output
  # without this the angle brackets will get turned into "&lt;", and "&gt;"
  print(soup.prettify(formatter=None))
  # outputs:
  # <html>
  #   <body>
  #     <p>
  #       This is a test.
  #     </p>
  #     <p>
  #       The chairman of European Union leaders, <span style="color:red">Donald Tusk</span>, will meet May in London on Thursday, a day after the bloc’s Brexit negotiator weakened sterling by issuing another warning to Britain, which is due to leave the bloc in March 2019.
  #     </p>
  #     <p>
  #       This is also a test.
  #     </p>
  #   </body>
  # </html>


Try using a loop: go through each word in the string, and once you find the string you're looking for (by whatever method works; regular expressions would be useful), use Tag.insert(position, "found_word").
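
One way to read this suggestion, as a minimal sketch: instead of building the markup as text, split the matching text node around the keyword and place a real span Tag between the two halves. The helper name wrap_keyword is hypothetical, the shortened sentence stands in for the question's HTML, and only the first occurrence in each text node is handled.

```python
import re

from bs4 import BeautifulSoup, NavigableString


def wrap_keyword(soup, keyword):
    """Wrap the first occurrence of keyword in each matching text node in a red <span>."""
    # find_all() returns a list up front, so the tree can be edited inside the loop
    for text_node in soup.find_all(string=re.compile(re.escape(keyword))):
        before, _, after = str(text_node).partition(keyword)

        # build the new tag as a bs4 object, not as a string of markup
        span = soup.new_tag("span", style="color: red")
        span.string = keyword

        # replace the original text with the part before the keyword,
        # then chain the span and the remaining text after it
        before_node = NavigableString(before)
        text_node.replace_with(before_node)
        before_node.insert_after(span)
        span.insert_after(NavigableString(after))
    return soup


html = "<p>The chairman of European Union leaders, Donald Tusk, will meet May in London on Thursday.</p>"
soup = wrap_keyword(BeautifulSoup(html, "html.parser"), "Donald Tusk")
print(soup)
```

Because the span is a Tag in the tree, soup.find("span") returns it afterwards and no formatter=None workaround is needed when printing.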


You need to use Regular Expressions. Hope this snippet helps.

import re

def highlight_matches(query, text):
    def span_matches(match):
        html = '<span style="color : red">{0}</span>'
        return html.format(match.group(0))
    return re.sub(query, span_matches, text, flags=re.I)
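
A short usage sketch for the function above, repeated here so the example runs on its own. Since query is passed to re.sub as a pattern, a literal keyword is escaped with re.escape first; the shortened sentence is an assumption standing in for the question's text.

```python
import re


def highlight_matches(query, text):
    def span_matches(match):
        html = '<span style="color : red">{0}</span>'
        return html.format(match.group(0))
    return re.sub(query, span_matches, text, flags=re.I)


sentence = "The chairman of European Union leaders, Donald Tusk, will meet May in London."

# re.escape keeps characters such as "." or "+" in a keyword from being read as regex syntax
highlighted = highlight_matches(re.escape("donald tusk"), sentence)
print(highlighted)
```

The match is case-insensitive, and match.group(0) keeps the capitalization found in the text rather than that of the query.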


Comments
  • I'm curious, how will the modified string be used?
  • @Life is complex I'm trying to mark certain news articles that I downloaded for an NLP project.
  • Thanks for the info. I was asking because I was able to do this with regex and string replace on your example, but I don't know how it works with full HTML from BS.
  • Why not first modify the html code and then turn it into a BeautifulSoup object?
  • Thanks for your reply. My understanding is that the insert position refers to other tags; see the example in the docs: crummy.com/software/BeautifulSoup/bs4/doc/#insert. replace_with (crummy.com/software/BeautifulSoup/bs4/doc/#replace-with) seems to be the only way to edit a string in place, but it's puzzling how to edit it and add a tag in the middle of the string.
  • It seems to me that the html graph (soup) needs to be modified with bs4 objects rather than with html as a string. But I'll try that.
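
The suggestion in the comments to modify the HTML before parsing could look roughly like this. It is a sketch that assumes the raw HTML string is still available and that the keyword does not also appear inside tag names or attribute values, where a plain string replace would corrupt the markup.

```python
from bs4 import BeautifulSoup

html = "<p>The chairman of European Union leaders, Donald Tusk, will meet May in London on Thursday.</p>"
keyword = "Donald Tusk"

# mark up the keyword in the raw string first...
marked_html = html.replace(keyword, '<span style="color: red">{}</span>'.format(keyword))

# ...then parse, so the span becomes a regular Tag in the tree
soup = BeautifulSoup(marked_html, "html.parser")
span = soup.find("span")
print(span)
```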