Reverse of Python string formatting: generating a dict from a string with named parameters

I have a string like this, where symbol and property vary:

a = '/stock/%(symbol)s/%(property)s'

I have another string like this, where AAPL and price vary:

b = '/stock/AAPL/price'

I'm trying to generate a dict like this:

c = {
    'symbol': 'AAPL',
    'property': 'price'
}

With string formatting, I could do this:

> a % c == b
True

But I'm trying to go the other direction. Time for some regex magic?

This is similar to @moliware's solution, but there's no hard-coding of keys required in this solution:

import re

class mydict(dict):
    # Record every key the format string asks for, defaulting it to ''.
    def __missing__(self, key):
        self.setdefault(key, '')
        return ''

def solve(a, b):
    dic = mydict()
    a % dic  # result is discarded; this only fills dic with the keys used in a
    strs = a
    for x in dic:
        esc = re.escape(x)
        # Turn each %(key)s placeholder into a named capture group.
        strs = re.sub(r'(%\({}\).)'.format(esc), '(?P<{}>.*)'.format(esc), strs)
    return re.search(strs, b).groupdict()

if __name__ == '__main__':
    a = '/stock/%(symbol)s/%(property)s'
    b = '/stock/AAPL/price'
    print(solve(a, b))
    a = "Foo %(bar)s spam %(eggs)s %(python)s"
    b = 'Foo BAR spam 10 3.x'
    print(solve(a, b))

Output:

{'symbol': 'AAPL', 'property': 'price'}
{'bar': 'BAR', 'eggs': '10', 'python': '3.x'}

As @torek pointed out, the answer can be wrong for cases with ambiguous output (no delimiter between keys).

For example:

a = 'leading/%(A)s%(B)s/trailing'
b = 'leading/helloworld/trailing'

Here, looking at just b, it's hard to tell the actual value of either A or B.
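
To make the ambiguity concrete, here is a small sketch with the two patterns written out by hand (they are not produced by solve above). It only illustrates that the split between A and B is decided by the regex quantifiers, not by anything in b:

```python
import re

b = 'leading/helloworld/trailing'

# Greedy groups: the first group grabs as much as it can.
greedy = re.fullmatch(r'leading/(?P<A>.*)(?P<B>.*)/trailing', b).groupdict()

# Lazy groups: the first group takes as little as it can.
lazy = re.fullmatch(r'leading/(?P<A>.*?)(?P<B>.*?)/trailing', b).groupdict()

print(greedy)  # {'A': 'helloworld', 'B': ''}
print(lazy)    # {'A': '', 'B': 'helloworld'}
```

Both results reproduce b when substituted back into a, so neither is "the" right answer without a delimiter between the keys.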

A solution with regular expressions:

>>> import re
>>> b = '/stock/AAPL/price'
>>> result = re.match('/.*?/(?P<symbol>.*?)/(?P<property>.*)', b)
>>> result.groupdict()
{'symbol': 'AAPL', 'property': 'price'}

The regular expression can be refined further, but this is the basic idea.
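
One possible refinement, as a rough sketch: build the pattern from the format string instead of writing it by hand. The helper names (template_to_regex, unformat) and the default field pattern [^/]+ are assumptions made for this example; it handles only %(name)s placeholders, and each name may appear only once.

```python
import re

# Matches %(name)s placeholders only (an assumption; %d, %f etc. are not handled).
FIELD = re.compile(r'%\(([A-Za-z_]\w*)\)s')

def template_to_regex(template, field_pattern=r'[^/]+'):
    """Escape the literal text and turn each placeholder into a named group."""
    parts = []
    pos = 0
    for m in FIELD.finditer(template):
        parts.append(re.escape(template[pos:m.start()]))  # literal text before the field
        parts.append('(?P<{}>{})'.format(m.group(1), field_pattern))
        pos = m.end()
    parts.append(re.escape(template[pos:]))  # trailing literal text
    return re.compile(''.join(parts))

def unformat(template, text):
    """Return the dict of captured fields, or None if text does not fit template."""
    m = template_to_regex(template).fullmatch(text)
    return m.groupdict() if m else None

print(unformat('/stock/%(symbol)s/%(property)s', '/stock/AAPL/price'))
print(unformat('/stock/%(symbol)s/%(property)s', '/bond/AAPL/price'))
```

Restricting each field to [^/]+ keeps a value from swallowing a slash, and fullmatch rejects strings that the format could not have produced.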

Assuming well-behaved input, you could just split the string and zip the pieces with the keys into a dict:

keys = ('symbol', 'property')
b = '/stock/AAPL/price'
dict(zip(keys, b.split('/')[2:4]))
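
If the keys should come from the format string rather than being listed by hand, the same split-and-zip idea can be sketched like this. The function name unformat_by_split is made up for the example, and it assumes '/' is the only delimiter and that each path segment is either literal text or exactly one %(name)s placeholder:

```python
import re

def unformat_by_split(template, path, sep='/'):
    """Pair up template segments with path segments; None means no match."""
    t_parts = template.split(sep)
    p_parts = path.split(sep)
    if len(t_parts) != len(p_parts):
        return None
    result = {}
    for t, p in zip(t_parts, p_parts):
        m = re.fullmatch(r'%\((\w+)\)s', t)
        if m:
            result[m.group(1)] = p   # placeholder segment: capture the value
        elif t != p:
            return None              # literal segment that does not match
    return result

print(unformat_by_split('/stock/%(symbol)s/%(property)s', '/stock/AAPL/price'))
```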

Comments
  • Are you sure you don't want your dictionary to be D = {'AAPL' : price} so you can look up price by symbol? Otherwise you will need a new dictionary for each stock.
  • I'm assuming (unlike other answers so far) that your first-string doesn't necessarily say symbol and/or property, e.g., it might read /zog/%(evil)s=%(level)s,%(flavor)s. Is that the case?
  • Do you have control of the format of a? If you use a more modern interpolation style, certain things become easier.
  • @DSM I might be able to control it. What format would be easier?
  • By control, I mean the %(symbol)s part. The slashes aren't changeable.
  • Note: you'll need dic=mydict() and I get '/stock/None/None' as the value in the call used to populate dic. I'd just do dic = collections.defaultdict(str) though. (Oops, you fixed the missing dic= part while I was typing the comment.)
  • BTW this works really well (it's the way to go here) but there are indistinguishable variations for which this just picks "any solution that works", e.g., a = 'leading/%(A)s%(B)s/trailing' and b = 'leading/helloworld/trailing'. This chooses A='helloworld' and B=''. (And if string b cannot be generated by format a regardless of dictionary values, the re.search() returns None.)
  • @torek Good test case. I think this one can be considered ambiguous too, because B can be either '' or 'helloworld', and A can be either '' or 'helloworld'. (So a space or some other character is required between two keys to get the correct answer.) Another issue is that returning a str for missing keys would raise an error for %d or other directives; I am not sure how to fix that.
  • @torek I think I could use a couple of try-except blocks to catch those type mismatch errors and pass some other default value.
  • The last example won't be a problem, there will always be some sort of delimiter between keys.
  • I came up with the letter-for-letter same solution. str.split is almost always going to be many times more time-efficient than the re-based equivalent.
  • @KirkStrauser - yeah, there's a hundred ways to parse strings, but I like the simple solutions.
  • As long as it's always slashes, and slashes don't appear in the output from some key(s). If the output might begin with, e.g., /nyse/stock/ (vs say /ftse/stock/ and just /stock/) sometimes, you'd need to adjust the indices too. In short, much depends on input constraints.
  • @torek - agreed. As more details of the input are learned, the script could be updated. But split and the dict constructor are fast, so it's a good start.
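
The collections.defaultdict(str) suggestion from the comments can be sketched as follows. The function name template_keys is hypothetical, and, as noted in the comments, this only works when every placeholder accepts a string (a %(n)d directive would raise TypeError on the '' default):

```python
from collections import defaultdict

def template_keys(template):
    """Collect the names used by %(name)s placeholders in template."""
    seen = defaultdict(str)
    template % seen  # each missing-key lookup inserts the key with a '' value
    return list(seen)

print(template_keys('/stock/%(symbol)s/%(property)s'))
```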