How to split a column value in a dataframe into multiple columns


I need to split a dataframe column into multiple columns so that each cell contains only a two-digit value. The current dataframe looks like:

     Name     |  Number  |   Code   |
     ................................
     Tom      | 78797071 |        0 |
     Nick     |          | 89797071 |
     Juli     |          | 57797074 |
     June     | 39797571 |        0 |
     Junw     |          | 23000000 |

If Code contains an 8-digit number, split it into two-digit groups, one per column (DIV, DIV2, DIV3, DIV4). If 00 appears in any of the DIV columns, the row should be marked as 'incomplete'.

The new dataframe should look like:

     Name     |  Number  |   Code   | DIV | DIV2 | DIV3 | DIV4 | Incomplete |
     ........................................................................
     Tom      | 78797071 |        0 |   0 |    0 |    0 |    0 | incomplete |
     Nick     |          | 89797071 |  89 |   79 |   70 |   71 | complete   |
     Juli     |          | 57797074 |  57 |   79 |   70 |   74 | complete   |
     June     | 39797571 |        0 |   0 |    0 |    0 |    0 | complete   |
     Junw     |          | 23000000 |  23 |   00 |   00 |   00 | incomplete |

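You can use str.findall("..") to split each value into two-character chunks, build a DataFrame from the resulting lists, and join it to the original df. A code that is shorter than two characters (such as 0) produces no chunks, so fillna("00") fills its DIV columns. Then use apply to get the complete/incomplete status.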

import pandas as pd

df = pd.DataFrame({"Name":["Tom","Nick","Juli","June","Junw"],
                   "Number":[78797071, 0, 0, 39797571, 0],
                   "Code":[0, 89797071, 57797074, 0, 23000000]})

df = df.join(pd.DataFrame(df["Code"].astype(str).str.findall("..").values.tolist()).add_prefix('DIV')).fillna("00")
df["Incomplete"] = df.iloc[:,3:7].apply(lambda row: "incomplete" if row.str.contains('00').any() else "complete", axis=1)

print(df)

# Output:
   Name    Number      Code DIV0 DIV1 DIV2 DIV3  Incomplete
0   Tom  78797071         0   00   00   00   00  incomplete
1  Nick         0  89797071   89   79   70   71    complete
2  Juli         0  57797074   57   79   70   74    complete
3  June  39797571         0   00   00   00   00  incomplete
4  Junw         0  23000000   23   00   00   00  incomplete

How to Split a Column into Two Columns in Pandas? Often you have a column in your pandas data frame that you want to split into two columns. In older pandas versions this was written as df['A'], df['B'] = df['AB'].str.split(' ', 1).str. You can also create a DataFrame with one column for each entry of the split automatically with df['AB'].str.split(' ', n=1, expand=True). Use expand=True if your strings have a non-uniform number of splits and you want None to replace the missing values.
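A minimal sketch of the expand=True behaviour described above, assuming a hypothetical column 'AB' holding one or two space-separated words. The data and the new column names 'A' and 'B' are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the last value has no space, so it yields a single piece
df = pd.DataFrame({"AB": ["a1 b1", "a2 b2", "a3"]})

# n=1 limits the split to the first space; expand=True returns a DataFrame
parts = df["AB"].str.split(" ", n=1, expand=True)
parts.columns = ["A", "B"]

# Attach the new columns to the original frame
df = df.join(parts)
print(df)
```

The row without a space gets a missing value in column B, which is the non-uniform case mentioned above.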


Try this quick fix.

import pandas as pd
import re

#data-preprocessing
data = {'Name': ['Tom','Nick','Juli','June','Junw'],'Code': ['0', '89797071', '57797074', '0', '23000000']}

#I omitted Number key in data

df = pd.DataFrame(data)

print(df)

#find patterns

# an 8-digit code is split into four 2-digit groups
pattern = re.compile(r'(\d{2})(\d{2})(\d{2})(\d{2})')

split_data = []

for code in df['Code']:
  match = pattern.fullmatch(code)
  if match:
    parts = list(match.groups())
    # a '00' group marks the row as incomplete
    parts.append('incomplete' if '00' in parts else 'complete')
  else:
    # anything that is not exactly 8 digits (e.g. '0') gets zeros
    parts = ['0', '0', '0', '0', 'incomplete']
  # exactly one entry per row keeps split_data aligned with df
  split_data.append(parts)

#make right dataframe

col_name = ['DIV1','DIV2','DIV3','DIV4','Incomplete']

df2 = pd.DataFrame(split_data, columns=col_name)

df[col_name] = df2

print(df)

Output

   Name      Code
0   Tom         0
1  Nick  89797071
2  Juli  57797074
3  June         0
4  Junw  23000000
   Name      Code DIV1 DIV2 DIV3 DIV4  Incomplete
0   Tom         0    0    0    0    0  incomplete
1  Nick  89797071   89   79   70   71    complete
2  Juli  57797074   57   79   70   74    complete
3  June         0    0    0    0    0  incomplete
4  Junw  23000000   23   00   00   00  incomplete

Pandas Dataframe: split column into multiple columns, right-align. I'd do something like the following:

foo = lambda x: pd.Series([i for i in reversed(x.split(','))])
rev = df['City, State, Country'].apply(foo)
print(rev)

We can also use pandas' str.split function to split the column of interest. For example, to split the column "Name", select it and call str.split with expand=True. With expand=True, str.split() returns a DataFrame; without it, the result is a pandas Series.


You can do it using the string functions zfill and findall, as shown below.

import numpy as np
import pandas as pd

## df is the dataframe from the question
df.Code = df.Code.astype(str)

## zfill pads the string with 0 to make its length 8; findall finds each pair of digits
## explode splits the lists into rows (explode works with pandas 0.25 and above)
## reshape turns the result into 4 columns
arr = df.Code.str.zfill(8).str.findall(r"(\d\d)").explode().values.reshape(-1, 4)

## create a new dataframe from arr with the given column names
df2 = pd.DataFrame(arr, columns=[f"Div{i+1}" for i in range(arr.shape[1])])

## set the "Incomplete" column to incomplete if any column of the row contains "00"
df2["Incomplete"] = np.where(np.any(arr == "00", axis=1), "incomplete", "complete")

pd.concat([df,df2], axis=1)


Result

   Name    Number      Code Div1 Div2 Div3 Div4  Incomplete
0   Tom  78797071         0   00   00   00   00  incomplete
1  Nick            89797071   89   79   70   71    complete
2  Juli            57797074   57   79   70   74    complete
3  June  39797571         0   00   00   00   00  incomplete
4  Junw            23000000   23   00   00   00  incomplete



Split Column into Unknown Number of Columns by Delimiter. I am trying to split a column into multiple columns based on comma/space separation. Splitting the column into three columns is trivial enough:

location_df = df['City, State, Country'].apply(lambda x: pd.Series(x.split(',')))

However, this creates left-aligned data: rows with fewer fields leave the trailing columns empty.
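The alignment difference can be sketched as follows, assuming a hypothetical 'City, State, Country' column in which some rows lack the leading fields. Reversing each split list before building the frame is one way to right-align the pieces. The sample data and the output column names are illustrative assumptions.

```python
import pandas as pd

# Hypothetical data: rows carry three, two, or one comma-separated field(s)
df = pd.DataFrame({"City, State, Country": ["Portland,OR,USA", "OR,USA", "USA"]})

pieces = df["City, State, Country"].str.split(",")

# Left-aligned: missing fields end up in the trailing columns
left = pd.DataFrame(pieces.tolist())

# Right-aligned: reverse each list so the last field always lands in column 0,
# then flip the column order back
right = pd.DataFrame([p[::-1] for p in pieces])
right = right[right.columns[::-1]]
right.columns = ["City", "State", "Country"]

print(left)
print(right)
```

In the right-aligned frame the country always sits in the last column, and the missing city or state values appear on the left.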




pandas.Series.str.split, expand parameter: expand the split strings into separate columns. If True, return DataFrame/MultiIndex expanding dimensionality. If False, return Series/Index, containing lists of strings.