How do I replace matched groups in a regex?

Given some input data:

<somexml>
    <User Name="MrFlibble">
        <Option Name="Pass">SomeSaltedPassword</Option>
        <Option Name="Salt">Salt</Option>
        <tag1></tag1>
        <Permissions>
            <Permission Dir="E:"></Permission>
        </Permissions>
    </User>
    <User Name="MrFlobble">
        <Option Name="Pass">SomeOtherSaltedPassword</Option>
        <Option Name="Salt">Salt</Option>
        <tag1></tag1>
        <Permissions>
            <Permission Dir="C:"></Permission>
        </Permissions>
    </User>
</somexml>

I'd like to rename the first user that doesn't have a C: permission (in this case MrFlibble) to Jon, and replace SomeSaltedPassword with MyNewSaltedPassword, using a .NET Framework regex, to give the following result:

<somexml>
    <User Name="Jon">
        <Option Name="Pass">MyNewSaltedPassword</Option>
        <Option Name="Salt">Salt</Option>
        <tag1></tag1>
        <Permissions>
            <Permission Dir="E:"></Permission>
        </Permissions>
    </User>
    <User Name="MrFlobble">
        <Option Name="Pass">SomeOtherSaltedPassword</Option>
        <Option Name="Salt">Salt</Option>
        <tag1></tag1>
        <Permissions>
            <Permission Dir="C:"></Permission>
        </Permissions>
    </User>
</somexml>

I think something like this regex would capture the users and group the sections I want to replace:

<User Name="(.*)">.*<Option Name="Pass">(.*)<\/Option>.*<Option Name="Salt">(.*)<\/Option>.*<\/User>

...but I'm struggling to see how I would substitute the three groups while keeping the rest of the text intact. The docs all seem to cover replacing a match with a modified version of the original text, rather than replacing several specific groups with specific new text.

Is there a standard way to do this or am I barking up the wrong tree?

Do not under any circumstances try to parse XML with a regex unless you wish to invoke rite 666 Ph'nglui mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn.

Use an XML parsing library; see this page for some ways to do it.
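
To make the parser suggestion concrete, here is a minimal sketch. It is written in Python with the standard xml.etree.ElementTree module purely for illustration (the question targets .NET, so treat the language as a swapped-in assumption). The function name rename_first_user and its parameters are hypothetical, and the sketch assumes each User element holds its Permission and Option elements as in the sample data.

```python
import xml.etree.ElementTree as ET


def rename_first_user(xml_text, new_name, new_pass, skip_dir="C:"):
    """Rename the first User without a `skip_dir` permission and set its Pass option."""
    root = ET.fromstring(xml_text)
    for user in root.iter("User"):
        # Collect every Dir attribute under this user.
        dirs = [p.get("Dir") for p in user.iter("Permission")]
        if skip_dir in dirs:
            continue  # this user has the excluded permission, leave it alone
        user.set("Name", new_name)
        for option in user.iter("Option"):
            if option.get("Name") == "Pass":
                option.text = new_pass
        break  # only the first qualifying user is changed
    # short_empty_elements=False keeps <tag1></tag1> instead of <tag1 />
    return ET.tostring(root, encoding="unicode", short_empty_elements=False)
```

Calling rename_first_user(xml_text, "Jon", "MyNewSaltedPassword") on the sample input should rename MrFlibble and leave MrFlobble untouched, because the condition is an ordinary if statement rather than something squeezed into a pattern.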

This is quite difficult to do with regular expressions because the replacement is conditional.

You wrote in a comment that the input is well-formed XML, so I'll offer a solution using an XML parser.

Add a reference to the System.Xml.Linq library to the project and import the following namespaces:

using namespace System;
using namespace System::IO;
using namespace System::Xml::Linq;

The code is simple and concise:

//auto xml = XElement::Parse(input); // input - string containing your xml
auto xml = XElement::Load(L"test.xml");

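// Walk the User elements in document order
for each (auto user in xml->Elements(L"User"))
{
    // Only the first Permission element of each user is checked
    if (user->Element(L"Permissions")->Element(L"Permission")->Attribute(L"Dir")->Value != L"C:")
    {
        user->Attribute(L"Name")->Value = L"Jon";

        for each (auto option in user->Elements(L"Option"))
        {
            if (option->Attribute(L"Name")->Value == L"Pass")
            {
                option->Value = L"MyNewSaltedPassword";
            }
        }

        break; // stop after the first matching user, as the question asks
    }
}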

Console::WriteLine(xml);
//xml->Save(L"result.xml");

Here is an option with regular expressions. The expression itself looks obscure and is therefore difficult to maintain, so the XML parser approach is the better choice.

using namespace System;
using namespace System::IO;
using namespace System::Text::RegularExpressions;

MatchEvaluator method:

String^ Evaluate(Match^ m)
{
    if (m->Groups[L"dir"]->Value != L"C:")
        return L"Jon" + m->Groups[L"mid1"] + L"MyNewSaltedPassword" + m->Groups[L"mid2"] + m->Groups[L"dir"];
    else
        return m->Groups[L"name"]->Value + m->Groups[L"mid1"] + m->Groups[L"pass"] + m->Groups[L"mid2"] + m->Groups[L"dir"];
}

Code:

auto input = File::ReadAllText(L"test.xml");

auto pattern = gcnew String(R"(
(?<= <User \s Name = " )
(?'name' .+? )
(?= "> )

(?'mid1' .+? )

(?<= <Option \s Name = "Pass"> )
(?'pass' .+? )
(?= </Option> )

(?'mid2' .+? )

(?<= <Permission \s Dir = " )
(?'dir' .+? )
(?= "> )
)");

auto options = RegexOptions::IgnorePatternWhitespace | RegexOptions::Singleline;

auto evaluator = gcnew MatchEvaluator(Evaluate);
auto result = Regex::Replace(input, pattern, evaluator, options);

Console::WriteLine(result);
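
The same idea can be sketched without lookarounds: capture the text you want to keep as groups of its own, then let a callback stitch the kept groups and the new values back together. The sketch below uses Python's re module rather than the .NET engine (an assumption made for illustration; re.sub with a function plays the role of MatchEvaluator here). The names USER_BLOCK and replace_first_user are hypothetical, and the pattern assumes every User element contains a Pass option, as in the sample data.

```python
import re

# "head", "mid" and "tail" hold the text to keep; "name" and "password"
# hold the values to swap. VERBOSE ignores pattern whitespace, DOTALL lets
# "." cross line breaks (like IgnorePatternWhitespace and Singleline).
USER_BLOCK = re.compile(
    r'''
    (?P<head>     <User \s Name=" )
    (?P<name>     [^"]* )
    (?P<mid>      "> .*? <Option \s Name="Pass"> )
    (?P<password> [^<]* )
    (?P<tail>     </Option> .*? </User> )
    ''',
    re.VERBOSE | re.DOTALL,
)


def replace_first_user(xml, new_name, new_pass, skip_dir="C:"):
    """Rewrite the first User block that has no `skip_dir` permission."""
    done = False  # becomes True once one user has been rewritten

    def evaluate(m):
        nonlocal done
        # Return the block unchanged if a user was already replaced or
        # this block mentions the excluded drive.
        if done or f'Dir="{skip_dir}"' in m.group(0):
            return m.group(0)
        done = True
        return (m.group("head") + new_name + m.group("mid")
                + new_pass + m.group("tail"))

    return USER_BLOCK.sub(evaluate, xml)
```

Because the callback builds the result by plain string concatenation, the new name and password are inserted literally, so characters such as $ or \ in them need no escaping. The condition also lives in ordinary code, which keeps the pattern itself free of conditional logic.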

Comments
  • Parsing HTML with regex is evil, and is equivalent to killing a kitten.
  • Thanks. Very helpful. How would you solve it instead then?
  • What prevents you from doing a simple substitution? i.e. finding Name="MrFlibble" and replacing with Name="Jon"
  • It's not guaranteed to be MrFlibble originally and I don't want to match the entry with C: in it. I should have made that clear in the question. Will do so now.
  • Second answer down in that first link more closely describes my use case I think. This is some well-formed XML in a very closed environment, not scraping some possibly horribly formed XHTML from a random website.
  • Took your advice and narrowly closed the gate before it had fully formed. Took -4 sanity in the process and woke up in Arkham Asylum for my trouble though... ...Dr Dobbs got me through it!
  • Unfortunately, my platform is limited to .net 2.0 so I was unable to use Linq