Reformatting numbers such as data0000172 to 172 in Python
I have a list of strings. Written as a regular expression, each string has the form
data0*(\d*)
The following is an example of the strings:
data000000, data000003, data0172, data2312, data008212312
I would like to take only the meaningful number portion. All numbers are integers. For example, in the above case, I would like to get another list containing:
0, 3, 172, 2312, 8212312
What would be the best way in the above case?
The following is the solution that I thought of:
    import re

    string_list = ["data0000172", ..... ]
    number_list = []
    for string in string_list:
        # 0* skips the zero padding; \d+ still captures one digit when the number is all zeros
        match = re.search(r"data0*(\d+)", string)
        if match:
            number_list.append(match.group(1))
        else:
            raise Exception("Wrong format.")
However, the above might be inefficient. Could you suggest a better way for doing this?
If you are sure that the strings start with "data", you can just slice off the first four characters and convert the rest to an integer. Leading zeroes aren't an issue there: building an integer from a zero-padded digit string works.
    lst = ["data000000", "data000003", "data0172", "data2312", "data008212312"]
    result = [int(x[4:]) for x in lst]
[0, 3, 172, 2312, 8212312]
Or use good old replace, in case the prefix may be missing from some strings (it will be slightly slower):
    result = [int(x.replace("data", "")) for x in lst]
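Slicing accepts any four-character prefix without complaint, so a malformed entry can slip through unnoticed. If the format check from the question matters, a minimal sketch along these lines may help. The helper name parse_data_number and the choice of ValueError are illustrative assumptions, not part of the original answer:

```python
import re

# Hypothetical helper: the name and the error type are illustrative choices.
_DATA_PATTERN = re.compile(r"data(\d+)")

def parse_data_number(s):
    """Return the integer that follows the 'data' prefix, or raise ValueError."""
    match = _DATA_PATTERN.fullmatch(s)  # the whole string must match
    if match is None:
        raise ValueError(f"Wrong format: {s!r}")
    # int() ignores leading zeros, so "0000172" becomes 172.
    return int(match.group(1))

lst = ["data000000", "data000003", "data0172", "data2312", "data008212312"]
result = [parse_data_number(x) for x in lst]
print(result)  # [0, 3, 172, 2312, 8212312]
```

Compiling the pattern once avoids repeating the pattern lookup on every call, which is a small saving for long lists.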
    import re

    st = 'data0000172'
    a = float(re.search(r'data(\d+)', st).group(1))
    print(a)
This extracts the number, i.e. the useful part (printed here as the float 172.0). Apply this to each string in your list.
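Applied to the whole list, the same search might look like the sketch below. It swaps float for int, since the question says all numbers are integers; the variable names are illustrative. Note that it assumes every string matches: a non-matching string would make re.search return None and the .group call fail with AttributeError.

```python
import re

pattern = re.compile(r"data(\d+)")
string_list = ["data000000", "data000003", "data0172", "data2312", "data008212312"]

# float() keeps the answer's original conversion; int() yields plain integers.
as_floats = [float(pattern.search(s).group(1)) for s in string_list]
as_ints = [int(pattern.search(s).group(1)) for s in string_list]

print(as_floats)  # [0.0, 3.0, 172.0, 2312.0, 8212312.0]
print(as_ints)    # [0, 3, 172, 2312, 8212312]
```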
If the strings might not all be of the form
data<num>, or some of the entries are broken for some reason, and you want the solution to still work, you can do the following:
    import re

    ll = ['data000000', 'data000003', 'data0172', 'data2312', 'data008212312']
    ss = ''.join(ll)
    res = [int(s) for s in re.findall(r'\d+', ss)]
    print(res)
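re.findall is applied to the single string produced by joining the whole list. Because the pattern has no capture groups, it returns a list of the matched digit strings (not tuples), and each one is converted with int to give the desired result: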
[0, 3, 172, 2312, 8212312]
Note: applying re.findall to the list directly, without the join, will raise a TypeError, because it expects a string.
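One caveat with joining on an empty separator: if an entry lacks a non-digit prefix, its digits fuse with those of the previous entry. A sketch of a per-string alternative follows; the entries list is hypothetical input made up to show the difference, with "2312" and "broken" deliberately malformed.

```python
import re

# Hypothetical input: "2312" has no prefix and "broken" has no digits.
entries = ["data0172", "2312", "broken", "data42"]

# Joining first fuses neighbouring digit runs: "0172" + "2312" -> "01722312".
joined = [int(s) for s in re.findall(r"\d+", "".join(entries))]

# Searching each string separately keeps entries apart and skips non-matches.
per_string = []
for entry in entries:
    match = re.search(r"\d+", entry)
    if match:
        per_string.append(int(match.group()))

print(joined)      # [1722312, 42]
print(per_string)  # [172, 2312, 42]
```

The per-string loop also works on the list directly, so no join is needed.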
To me, this kind of seemingly simple problem is what Python is all about. Especially if you're coming from a language like C++, where simple text parsing can be a pain in the butt, you'll really appreciate the compact, functional-style solution that Python can give you.
- Why not use int()?
- You can use it. That's not a problem. It depends on your needs.