How to split characters and numbers into separate columns in R

I have a dataframe which looks like this:

df= data.frame(name= c("1Alex100.00","12Rina Faso92.31","113john00.00"))

And I want to split this into a data frame with 3 columns so that the output looks like:

name1 name2      name3
1     Alex       100.00
12    Rina Faso  92.31
113   john       00.00

I have tried stringr and grep() with limited success. The lack of a delimiter makes it a lot more difficult.


You could try

library(tidyr)
res <- extract(df, name, into=c('name1', 'name2', 'name3'),
                  '(\\d+)([^0-9]+)([0-9.]+)', convert=TRUE)
res
# Output for an earlier version of the data,
# c("1Alex100.00", "2Rina Faso92.31", "3john50.00"):
#   name1     name2  name3
# 1     1      Alex 100.00
# 2     2 Rina Faso  92.31
# 3     3      john  50.00

str(res)
# 'data.frame': 3 obs. of  3 variables:
#  $ name1: int  1 2 3
#  $ name2: Factor w/ 3 levels "Alex","john",..: 1 3 2
#  $ name3: num  100 92.3 50

Update

Based on 'df' from @DavidArenburg's post

 res <- extract(df, name, into=c('name1', 'name2', 'name3'),
                   '(\\d+)([^0-9]+)([0-9.]+)', convert=TRUE)
 res
 #    name1         name2 name3
 #1   121       Réunion 13.76
 #2     2 Côte d'Ivoire 22.40
 #3     3          john 50.00


Try str_match from stringr:

library(stringr)
str_match(df$name, "^([0-9]*)([A-Za-z ]*)([0-9\\.]*)")
#      [,1]              [,2] [,3]        [,4]    
# [1,] "1Alex100.00"     "1"  "Alex"      "100.00"
# [2,] "2Rina Faso92.31" "2"  "Rina Faso" "92.31" 
# [3,] "3john50.00"      "3"  "john"      "50.00" 

So as.data.frame(str_match(df$name, "^([0-9]*)([A-Za-z ]*)([0-9\\.]*)")[,-1]) should give you the desired result.
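
Note that the matrix returned above is all character, so the number columns still need converting. A minimal base R sketch of the same capture-group idea, assuming the data from the question and using regexec() with regmatches() in place of str_match() (the names m, parts and res are only illustrative):

```r
# Same data as in the question
df <- data.frame(name = c("1Alex100.00", "12Rina Faso92.31", "113john00.00"),
                 stringsAsFactors = FALSE)

# For each string: the full match followed by the three capture groups
m <- regmatches(df$name, regexec("^([0-9]+)([^0-9]+)([0-9.]+)$", df$name))

# Stack into a character matrix and drop the full-match column
parts <- do.call(rbind, m)[, -1, drop = FALSE]

# Convert each piece to the type we want
res <- data.frame(name1 = as.integer(parts[, 1]),
                  name2 = parts[, 2],
                  name3 = as.numeric(parts[, 3]),
                  stringsAsFactors = FALSE)
res
```

Using [^0-9]+ for the middle group, rather than [A-Za-z ]*, is a deliberate choice here so that names containing apostrophes or accented letters are still captured.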


You could also do it like this:

> df <- data.frame(name= c("1Alex100.00","12Rina Faso92.31","113john00.00"))
> x <- do.call(rbind.data.frame, strsplit(as.character(df$name), "(?<=[A-Za-z])(?=\\d)|(?<=\\d)(?=[A-Za-z])", perl=TRUE))
> colnames(x) <- c("name1", "name2", "name3")
> print(x, row.names=FALSE)
 name1     name2  name3
     1      Alex 100.00
    12 Rina Faso  92.31
   113      john  00.00


With base R it can be done too. It is a bit uglier, but it also works with special characters:

with(df, cbind(sub("\\D.*", "", name), 
               gsub("[0-9.]", "", name), 
               gsub(".*[A-Za-z]", "", name)))

# Output for an earlier version of df (ids 1, 2, 3):
#     [,1]  [,2]        [,3]    
# [1,] "1"  "Alex"      "100.00"
# [2,] "2"  "Rina Faso" "92.31" 
# [3,] "3"  "john"      "50.00" 

An example with special characters:

df = data.frame(name= c("121Réunion13.76","2Côte d'Ivoire22.40","3john50.00"))
with(df, cbind(sub("\\D.*", "", name), 
         gsub("[0-9.]", "", name), 
         gsub(".*[A-Za-z]", "", name)))

#     [,1]  [,2]            [,3]   
# [1,] "121" "Réunion"       "13.76"
# [2,] "2"   "Côte d'Ivoire" "22.40"
# [3,] "3"   "john"          "50.00"
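
One caveat: the third pattern, ".*[A-Za-z]", relies on the last letter of each name being an ASCII letter, which happens to hold for "Réunion" and "Côte d'Ivoire". A hedged variant, assuming the numeric part contains only digits and dots, anchors on the last character that is neither a digit nor a dot instead. The data below are made up for illustration:

```r
# Hypothetical data: the second name ends in a non-ASCII letter
x <- c("121R\u00e9union13.76", "7Jos\u00e950.00", "3john50.00")

id    <- sub("\\D.*", "", x)       # leading digits only
name  <- gsub("[0-9.]", "", x)     # drop every digit and dot
value <- sub(".*[^0-9.]", "", x)   # keep what follows the last non-numeric character

res <- data.frame(name1 = as.integer(id),
                  name2 = name,
                  name3 = as.numeric(value),
                  stringsAsFactors = FALSE)
res
```

Like the original, this still assumes the names themselves contain no digits or dots.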


Base R solutions that are not ugly:

proto <- data.frame(name1=numeric(), name2=character(), name3=numeric())
strcapture("(\\d+)(\\D+)(.*)", as.character(df$name), proto)
#   name1     name2  name3
# 1     1      Alex 100.00
# 2    12 Rina Faso  92.31
# 3   113      john   0.00

read.table(text=gsub("(\\d+)(\\D+)(.*)", "\\1|\\2|\\3", df$name), sep="|")
#    V1        V2     V3
# 1   1      Alex 100.00
# 2  12 Rina Faso  92.31
# 3 113      john   0.00
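
If the names can contain an apostrophe, as in the "d'Ivoire" example from another answer, the read.table() approach needs one adjustment, because read.table() treats ' as a quote character by default. A small sketch, assuming quoting can simply be switched off with quote = "" and using an unaccented, made-up variant of that data:

```r
# Hypothetical data with an apostrophe in one name
x <- c("121Reunion13.76", "2Cote d'Ivoire22.40", "3john50.00")

# Insert "|" between the three groups, then parse with quoting disabled
piped <- gsub("(\\d+)(\\D+)(.*)", "\\1|\\2|\\3", x)
res <- read.table(text = piped, sep = "|", quote = "",
                  stringsAsFactors = FALSE)
res
```

This also assumes that "|" never occurs in the names, since it is used as the temporary separator.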
