Convert string with NaNs to int in pandas

I have a pandas DataFrame in which every value is a string. Some are the string 'None'; the rest are integers in string form, such as '123456'. How can I convert every 'None' to np.nan and the others to integers, e.g. 123456?

df = {'col1': ['1', 'None'], 'col2': ['None', '123']}

Convert df to:

df = {'col1': [1, NaN], 'col2': [NaN, 123]}

Use the code below:

import numpy as np

print(df.replace('None', np.nan).astype(float))

Output:

   col1   col2
0   1.0    NaN
1   NaN  123.0

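You have to use replace because astype(float) cannot parse the string 'None'; once it is swapped for np.nan, the cast succeeds. Note that the result is float64 rather than int, because NaN is itself a float.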

P.S. if df is a dictionary, convert it first:

df = pd.DataFrame(df)
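For reference, the steps above can be run end to end as follows. This is a minimal sketch that assumes the input starts out as the dictionary from the question; the name as_float is only illustrative.

```python
import numpy as np
import pandas as pd

# Input from the question: every value is a string.
d = {'col1': ['1', 'None'], 'col2': ['None', '123']}
df = pd.DataFrame(d)

# Replace the literal string 'None' with a real missing value, then cast.
as_float = df.replace('None', np.nan).astype(float)

print(as_float)
print(as_float.dtypes)  # both columns end up as float64, not int
```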

How to Convert String to Integer in Pandas DataFrame: that guide shows two methods for converting a string column to integers. (1) The astype(int) method: df['DataFrame Column'] = df['DataFrame Column'].astype(int). (2) The to_numeric method: df['DataFrame Column'] = pd.to_numeric(df['DataFrame Column']). With to_numeric, setting errors='coerce' transforms the non-numeric values into NaN.
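To make the difference between the two methods concrete, here is a small sketch on a Series that contains the string 'None', as in the question. The Series s and the flag astype_failed are illustrative names, not part of the original guide.

```python
import pandas as pd

s = pd.Series(['1', 'None', '123'])

# astype(int) has no way to interpret 'None', so it raises.
astype_failed = False
try:
    s.astype(int)
except ValueError:
    astype_failed = True

# to_numeric with errors='coerce' turns unparseable entries into NaN instead.
coerced = pd.to_numeric(s, errors='coerce')

print(astype_failed)
print(coerced)
```

Because the coerced result contains NaN, it is a float column; getting integers back needs one of the nullable-integer approaches discussed in this thread.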

You can convert your columns to Nullable Integer type (new in 0.24+):

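import pandas as pd

d = {'col1': ['1', 'None'], 'col2': ['None', '123']}
# Coerce each column to numbers ('None' becomes NaN), then store as nullable Int32.
res = pd.DataFrame({
    k: pd.to_numeric(v, errors='coerce') for k, v in d.items()}, dtype='Int32')
res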

   col1  col2
0     1   NaN
1   NaN   123

With this solution, numeric data is converted to integers, while missing data stays missing (displayed as NaN in 0.24 and as <NA> from pandas 1.0 on):

res.to_dict('list')
# {'col1': [1, nan], 'col2': [nan, 123]}

On older versions, convert to object when initialising the DataFrame:

res = pd.DataFrame({
    k: pd.to_numeric(v, errors='coerce') for k, v in d.items()}, dtype=object)
res

  col1 col2
0    1  NaN
1  NaN  123

This differs from the nullable-type solution above: only the representation changes, not the actual data, so the numbers are still stored as floats.

res.to_dict('list')
#  {'col1': [1.0, nan], 'col2': [nan, 123.0]}
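The contrast between the two variants can be checked by inspecting a stored element. The sketch below is an assumption-laden variation on the answer: it builds the frame with DataFrame.apply and then uses astype rather than the constructor's dtype argument, and the names numeric, as_object and as_nullable are illustrative.

```python
import pandas as pd

d = {'col1': ['1', 'None'], 'col2': ['None', '123']}

# Parse every column; 'None' cannot be parsed and becomes NaN.
numeric = pd.DataFrame(d).apply(pd.to_numeric, errors='coerce')

# Object dtype: prints like integers in older versions, but elements stay floats.
as_object = numeric.astype(object)

# Nullable integer dtype: elements are real integers, gaps are missing values.
as_nullable = numeric.astype('Int64')

print(type(as_object.loc[0, 'col1']))
print(as_nullable.dtypes)
```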

pandas.to_numeric (pandas 1.1.0 documentation): if errors='coerce', invalid parsing is set to NaN. If downcast is not None, and the data has been successfully cast to a numerical dtype (or was numeric to begin with), the result is downcast to the smallest suitable numerical dtype; 'integer' or 'signed' selects the smallest signed int dtype. Separately, df.astype(int) converts pandas floats to ints by discarding the fractional part (truncating toward zero), whereas df.round(0).astype(int) rounds to the nearest integer first. The to_numeric() method provides a way to safely convert non-numeric types (e.g. strings) to a suitable numeric type.
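A short sketch of the behaviours described in that snippet: truncation versus rounding when casting floats, and the downcast option of to_numeric. The sample values are made up for illustration.

```python
import pandas as pd

floats = pd.Series([1.7, -1.7, 2.2])

# astype(int) drops the fractional part (truncates toward zero).
truncated = floats.astype(int)

# Rounding first gives the nearest integer instead.
rounded = floats.round(0).astype(int)

# downcast='integer' picks the smallest signed integer dtype that fits.
small = pd.to_numeric(pd.Series(['1', '123']), downcast='integer')

print(truncated.tolist())
print(rounded.tolist())
print(small.dtype)
```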

You can also use:

import numpy as np
import pandas as pd
d = {'col1': ['1', 'None'], 'col2': ['None', '123']}
df = pd.DataFrame.from_dict(d).replace("None", value=np.nan).astype(float)

   col1   col2
0   1.0    NaN
1   NaN  123.0

col1    1 non-null float64
col2    1 non-null float64
dtypes: float64(2)

pandas.DataFrame.to_string (pandas 1.1.0 documentation): options include the string representation of NaN to use and whether to display the DataFrame dimensions (number of rows by number of columns). See also: convert DataFrame to HTML.

pandas.DataFrame.convert_dtypes, Notes: by default, convert_dtypes will attempt to convert a Series (or each Series in a DataFrame) to dtypes that support pd.NA. By using the options convert_string, convert_integer, and convert_boolean, it is possible to turn off individual conversions to StringDtype, the integer extension types, or BooleanDtype, respectively.
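As a rough illustration of convert_dtypes on data like the question's, the sketch below starts from the float result produced by the answers in this thread and assumes pandas 1.0 or later. The frame is typed in directly so the example is self-contained.

```python
import numpy as np
import pandas as pd

# Float columns with NaN, as produced by replace(...).astype(float).
as_float = pd.DataFrame({'col1': [1.0, np.nan], 'col2': [np.nan, 123.0]})

# Whole-number float columns are converted to the nullable Int64 dtype.
converted = as_float.convert_dtypes()

# Turning off convert_integer keeps the columns as floating point.
kept_float = as_float.convert_dtypes(convert_integer=False)

print(converted.dtypes)
print(kept_float.dtypes)
```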

Working with missing data (pandas 1.1.0 documentation): because NaN is a float, a column of integers with even one missing value is cast to floating-point dtype (see Support for integer NA for more). pandas provides a nullable integer dtype, which can be requested with the string alias dtype='Int64' (note the capital "I"). Values considered “missing”: as data comes in many shapes and forms, pandas aims to be flexible with regard to handling missing data. While NaN is the default missing value marker for reasons of computational speed and convenience, we need to be able to easily detect this value with data of different types: floating point, integer, boolean, and general object.
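The casting rule quoted from the documentation is easy to observe directly. A minimal sketch, with illustrative variable names:

```python
import numpy as np
import pandas as pd

# All integers: stored as int64.
complete = pd.Series([1, 2])

# One missing value forces the whole column to float64.
with_gap = pd.Series([1, np.nan])

# Requesting the nullable dtype keeps the integers as integers.
nullable = pd.Series([1, None], dtype='Int64')

print(complete.dtype, with_gap.dtype, nullable.dtype)
```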

Data Types and Formats – Data Analysis and Visualization in , Define, manipulate, and interconvert integers and floats in Python. Analyze datasets having missing/null values (NaN values). Write manipulated data to Text data type is known as Strings in Python, or Objects in Pandas. Strings can contain…

pandas.to_numeric(arg, errors='raise', downcast=None): Convert argument to a numeric type. The default return dtype is float64 or int64 depending on the data supplied. Use the downcast parameter to obtain other dtypes.

Comments
  • Is df a dataframe or a dictionary?
  • Thank you, but how about the integer string? Will they be automatically converted to float numbers?
  • @TingWang Edited mine, now they will :-)
  • Just a minor note, since this is now the accepted answer: it converts the numeric data to floats, not integers (as requested in the OP).
  • The data type is still object, with the numbers being strings.
  • They are still strings after your edit. Run result.values.tolist() to take a look...