One long character to multiple columns


I have this data frame:

df1 <- data.frame(Note = c("Profit before tax 240 tSEK",
                           "Earnings per share 0.240 " ,
                           "Ali de Margin 37 %"),
                  Line = c(6, 2, 2))

I want to split Note into separate columns, something like this:

Note                 Val    Unit    Line
Profit before tax    240    tSEK    6
Earnings per share   0.240          2
Ali de Margin        37      %      2

How can I do it?

You can use the data.table function tstrsplit to split Note on a space that either comes right before a number (digits, with or without a dot) or right after a digit, using a regex with lookarounds:

library(data.table)
setDT(df1)[, c("Note", "Val", "Unit"):=tstrsplit(Note, "( (?=[0-9.]+))|((?<=\\d) )", perl=TRUE)]
df1
#                 Note Line   Val Unit
#1:  Profit before tax    6   240 tSEK
#2: Earnings per share    2 0.240   NA
#3:      Ali de Margin    2    37    %
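
To see what the pattern does, it can help to run it on single strings with base R's strsplit, which tstrsplit wraps. This is only a minimal sketch; the names pattern, full and short are made up for illustration:

```r
# Same pattern as in the tstrsplit() call above.
# " (?=[0-9.]+)" : a space followed by a digit or dot (start of the value)
# "(?<=\\d) "    : a space preceded by a digit (end of the value)
pattern <- "( (?=[0-9.]+))|((?<=\\d) )"

full <- strsplit("Profit before tax 240 tSEK", pattern, perl = TRUE)[[1]]
full
# [1] "Profit before tax" "240"               "tSEK"

# With no unit after the value, strsplit() returns only two pieces.
short <- strsplit("Earnings per share 0.240 ", pattern, perl = TRUE)[[1]]
short
# [1] "Earnings per share" "0.240"
```

The second call yields two pieces instead of three, which is why tstrsplit fills Unit with NA for that row.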


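You could also use the regexpr and regmatches functions:

# Locate the numeric value in each note (start position and match length)
pattern <- regexpr("[[:digit:]]+(\\.[[:digit:]]+)?", df1$Note)
# Text before the value, minus the separating space
note <- substr(df1$Note, 1, pattern - 2)
value <- regmatches(df1$Note, pattern)
# Text after the value, skipping the separating space
unit <- substr(df1$Note,
               pattern + attr(pattern, "match.length") + 1,
               nchar(as.character(df1$Note)))

result <- data.frame(note = note, value = value, unit = unit, line = df1$Line)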

#                note value unit line
#1  Profit before tax   240 tSEK    6
#2 Earnings per share 0.240         2
#3      Ali de Margin    37    %    2


Another solution is tidyr::extract. It lets you define a regex with capture groups and splits the column into one new column per group.

library(tidyr)

extract(df1, Note, into = c("Note", "Val", "Unit"),
        regex = "^([[:alpha:][:blank:]]+)\\s([[:digit:].]+)\\s*(.*)")

#                 Note   Val Unit Line
# 1  Profit before tax   240 tSEK    6
# 2 Earnings per share 0.240         2
# 3      Ali de Margin    37    %    2

**Regex explanation:**

^([[:alpha:][:blank:]]+)  -- Group 1 => One or more letters/spaces
\\s                       -- A single whitespace between Group 1 and Group 2
([[:digit:].]+)           -- Group 2 => One or more digits/dots
\\s*                      -- Optional whitespace, kept out of Group 3
(.*)                      -- Group 3 => Anything after Group 2 up to the end
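
If you would rather not depend on tidyr, the same capture-group idea can be sketched in base R with utils::strcapture (R >= 3.4.0). The pattern below is my own variant, not the one from the answer above, and the names proto, parts and result are only illustrative. It assumes every Note contains exactly one number and no digits in the descriptive text:

```r
df1 <- data.frame(Note = c("Profit before tax 240 tSEK",
                           "Earnings per share 0.240 ",
                           "Ali de Margin 37 %"),
                  Line = c(6, 2, 2),
                  stringsAsFactors = FALSE)

# Prototype: one column per capture group; each column's type
# decides how the captured text is converted.
proto <- data.frame(Note = character(), Val = numeric(), Unit = character(),
                    stringsAsFactors = FALSE)

# Group 1: text up to the last non-space character before the number
# Group 2: digits and dots
# Group 3: whatever follows, without the leading space
parts <- strcapture("^([^0-9]*[^0-9 ]) +([0-9.]+) *(.*)$",
                    df1$Note, proto, perl = TRUE)

result <- cbind(parts, Line = df1$Line)
result
```

Because the prototype declares Val as numeric, the values arrive as numbers rather than strings, and a missing unit becomes an empty string.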


Comments
  • Write an appropriate regex. See help("regmatches") and help("regex") as well as numerous regex tutorials.
  • So, Val is the last(/only?) numeric field, and Unit is the optional string after it?