How to find numbers within a string in a DataFrame and re-format them using a thousands separator?


I have the example below:

import pandas as pd

df = pd.DataFrame({'City': ['Houston', 'Austin', 'Hoover', 'NY', 'LA'],
                   'Rules': ['ACH_CM > 28419581.51 and AMT_PM > 30572998.00 and AMT_PPM > 30572998.00 and AMT_CM > 30572998.00',
                             'MAA_PM and _AMT_PPM > 30572998.00 and _AMT_PM > 16484703.01 and AMT_CM > 28419581.51',
                             'MAA_PM and AMT_CM > 284 and AMT_PM > 30572998.00 and AMT_PPM > 30572998.00 and AMT_PPPM > 30572998.00 and ACH_AMT_PPM > 16484703.01',
                             'MAA_CM',
                             '_AMT_PPM > 30572.00']},
                  columns=['City', 'Rules'])

Desired output:

City    Rules
Houston ACH_CM > 28,419,581.51 and AMT_PM > 30,572,998.00 and AMT_PPM > 30,572,998.00 and AMT_CM > 30,572,998.00
Austin  MAA_PM and _AMT_PPM > 30,572,998.00 and _AMT_PM > 16,484,703.01 and AMT_CM > 28,419,581.51
Hoover  MAA_PM and AMT_CM > 284 and AMT_PM > 30,572,998.00 and AMT_PPM > 30,572,998.00 and AMT_PPPM > 30,572,998.00 and ACH_AMT_PPM > 16,484,703.01
NY      MAA_CM
LA      _AMT_PPM > 30,572.00

I believe I should be using "{0:,.0f}".format, but I'm not sure how to apply it to the numbers inside each string.


This might be useful:

floating_number = 30572998.00  # example value
if len("%.0f" % floating_number) >= 5:  # rounded value has five or more digits
    print('do something')

Use `f"{number:,}"` to format an integer as a string with thousands separators. Details can be found in PEP 378 -- Format Specifier for Thousands Separator.
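As a quick illustration of that format specifier, here is a minimal sketch using sample values from the question. The variable names are made up for the example. It also shows that the "{0:,.0f}" spec mentioned in the question rounds away the decimals, so `,.2f` is probably the better fit for the desired output.

```python
# Sample values taken from the question; the names are illustrative only.
count = 30572998
amount = 28419581.51

int_text = f"{count:,}"                    # integer with thousands separators
float_text = f"{amount:,.2f}"              # separators plus two decimal places
rounded_text = "{0:,.0f}".format(amount)   # spec from the question: no decimals

print(int_text)      # 30,572,998
print(float_text)    # 28,419,581.51
print(rounded_text)  # 28,419,582
```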


This should work.

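import re

def _format(x):
    # Search the current row's string, not only the first row of the frame.
    unformatted = re.findall(r"\d+\.\d+", x)
    # Two decimals are kept so that 30572998.00 becomes 30,572,998.00.
    formatted = ['{:,.2f}'.format(float(n)) for n in unformatted]
    for old, new in zip(unformatted, formatted):
        # Replace one occurrence per match, working left to right.
        x = x.replace(old, new, 1)
    return x

df['Rules'] = df['Rules'].map(_format)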



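Try this:

import re

df['Rules'] = df.Rules.apply(lambda x: re.sub(r"\d+\.\d+", my_func, x))

where my_func is defined as follows (define it before running the line above):

def my_func(matchobj):
    # Convert the matched text to a float, then format it with commas and two decimals.
    f = float(matchobj.group(0))
    return "{0:,.2f}".format(f)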
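To see the callback approach end to end without a DataFrame, here is a minimal sketch that runs the same substitution over a few rule strings shortened from the question. The names `NUMBER`, `add_commas`, `rules` and `formatted` are hypothetical and used only for this illustration.

```python
import re

# Matches decimal numbers such as 30572998.00; bare integers like 284 are left alone.
NUMBER = re.compile(r"\d+\.\d+")

def add_commas(match):
    """Return the matched number with thousands separators and two decimals."""
    return "{:,.2f}".format(float(match.group(0)))

# Shortened sample strings based on the question's 'Rules' column.
rules = [
    "MAA_PM and AMT_CM > 284 and AMT_PM > 30572998.00",
    "MAA_CM",
    "_AMT_PPM > 30572.00",
]

# re.sub calls add_commas once per match and splices in the returned text.
formatted = [NUMBER.sub(add_commas, rule) for rule in rules]
for line in formatted:
    print(line)
```

On the DataFrame itself, the same callback should also work with `df['Rules'].str.replace(NUMBER, add_commas, regex=True)`, since pandas accepts a compiled pattern and a callable replacement there. Treat that as a suggestion to try, not as something taken from the answers above.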
