Python: Pandas filter string data based on its string length

I would like to filter out rows whose string length is not equal to 10.

To drop any row where the string in column A or column B is not exactly 10 characters long, I tried this:

df=pd.read_csv('filex.csv')
df.A=df.A.apply(lambda x: x if len(x)== 10 else np.nan)
df.B=df.B.apply(lambda x: x if len(x)== 10 else np.nan)
df=df.dropna(subset=['A','B'], how='any')

This is slow, but it works.

However, it sometimes produces an error when the data in A is not a string but a number (read_csv interprets the value as a number when it reads the input file).

  File "<stdin>", line 1, in <lambda>
TypeError: object of type 'float' has no len()

I believe there should be a more efficient and elegant way to do this.


Based on the answers and comments below, the simplest solutions I found are:

df=df[df.A.apply(lambda x: len(str(x))==10)]
df=df[df.B.apply(lambda x: len(str(x))==10)]

or

df=df[(df.A.apply(lambda x: len(str(x))==10)) & (df.B.apply(lambda x: len(str(x))==10))]

or

df=df[(df.A.astype(str).str.len()==10) & (df.B.astype(str).str.len()==10)]

import pandas as pd

df = pd.read_csv('filex.csv')
df['A'] = df['A'].astype('str')
df['B'] = df['B'].astype('str')
mask = (df['A'].str.len() == 10) & (df['B'].str.len() == 10)
df = df.loc[mask]
print(df)

Applied to filex.csv:

A,B
123,abc
1234,abcd
1234567890,abcdefghij

the code above prints

            A           B
2  1234567890  abcdefghij
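
The same mask can be built for any number of columns. The following is only a minimal sketch: filter_by_length is a hypothetical helper (it is not part of pandas), and a small in-memory DataFrame stands in for filex.csv. The columns are converted to strings only while building the mask, so the original dtypes are left untouched.

```python
import pandas as pd


def filter_by_length(df, columns, length):
    """Hypothetical helper: keep rows where every listed column,
    rendered as a string, is exactly `length` characters long."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        # astype(str) is used only for the comparison, not stored back
        mask = mask & (df[col].astype(str).str.len() == length)
    return df.loc[mask]


# In-memory stand-in for filex.csv (column A is numeric on purpose)
df = pd.DataFrame({
    "A": [123, 1234, 1234567890],
    "B": ["abc", "abcd", "abcdefghij"],
})

result = filter_by_length(df, ["A", "B"], 10)
print(result)
```

Only the last row survives, matching the output shown above.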

A more Pythonic way of filtering out rows based on given conditions of other columns and their values:

Assuming a df of:

import pandas as pd

data = {"names": ["Alice", "Zac", "Anna", "O"],
        "cars": ["Civic", "BMW", "Mitsubishi", "Benz"],
        "age": ["1", "4", "2", "0"]}

df = pd.DataFrame(data)
df:
  age        cars  names
0   1       Civic  Alice
1   4         BMW    Zac
2   2  Mitsubishi   Anna
3   0        Benz      O

Then:

df[
df['names'].apply(lambda x: len(x)>1) &
df['cars'].apply(lambda x: "i" in x) &
df['age'].apply(lambda x: int(x)<2)
  ]

We will have :

  age   cars  names
0   1  Civic  Alice

The conditions above first check the length of the strings in names, then check whether the letter "i" occurs in cars, and finally check the integer value of age.
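
The same three conditions can also be written with pandas' vectorized string methods instead of apply with lambdas. This is a sketch under the assumption that every value in age can be parsed as an integer; the variable name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "names": ["Alice", "Zac", "Anna", "O"],
    "cars": ["Civic", "BMW", "Mitsubishi", "Benz"],
    "age": ["1", "4", "2", "0"],
})

mask = (
    (df["names"].str.len() > 1)                  # name longer than one character
    & df["cars"].str.contains("i", regex=False)  # literal "i" somewhere in cars
    & (df["age"].astype(int) < 2)                # age stored as text, compared as int
)
result = df.loc[mask]
print(result)
```

As with the apply version, only the row for Alice remains.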

If you have numbers in the rows, read_csv may convert them to floats.

Convert all the values to strings after importing from the CSV file. For better performance, split the lambdas across multiple threads.
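
An alternative to converting after the import is to ask read_csv for strings up front with dtype=str. The sketch below makes two assumptions: the CSV text is inlined through io.StringIO instead of read from filex.csv, and two extra rows (a value with a leading zero and a missing value) are added for illustration.

```python
import io

import pandas as pd

# Inline stand-in for filex.csv, with two extra illustrative rows
csv_text = (
    "A,B\n"
    "123,abc\n"
    "1234,abcd\n"
    "1234567890,abcdefghij\n"
    "0123456789,klmnopqrst\n"
    ",abcdefghij\n"
)

# dtype=str keeps every non-missing cell as text, so "0123456789"
# keeps its leading zero and nothing is parsed as a float
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Missing cells have no length, so their comparison is not True
mask = (df["A"].str.len() == 10) & (df["B"].str.len() == 10)
filtered = df.loc[mask]
print(filtered)
```

Rows with a missing value in A or B are dropped by the mask, so no separate dropna step is needed.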

You can use df.A.apply(len) on a single column; it gives you the length of each string in that column.
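
A small sketch of what apply(len) computes, using an illustrative one-column DataFrame: on a single column it returns the length of each value, while on the whole DataFrame it calls len on each column and therefore returns the number of rows.

```python
import pandas as pd

df = pd.DataFrame({"A": ["abc", "abcd", "abcdefghij"]})

# Per-value string lengths for one column
lengths = df["A"].apply(len)

# len() of each column Series, i.e. the row count per column
rows_per_column = df.apply(len)

print(lengths.tolist())
print(rows_per_column["A"])
```

Note that apply(len) raises a TypeError on values without a length, such as floats, so it only suits columns that contain strings throughout.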
