Extract last 2 chars from a column in a data.frame

I am new to R programming and have searched SO for many hours. I would appreciate your help.

I have a data frame with 3 columns (Date, Description, Debit):

      Date         Description   Debit
2014-01-01      "abcdef    VA"      15
2014-01-01      "ghijkl    NY"      56

I am trying to extract the last 2 characters of the second (Description) column, i.e. the two-letter state abbreviation. I am not very comfortable with apply-type functions.

I have tried using

 l <- lapply(a$Description, function(x) {substr(x, nchar(x)-2+1, nchar(x))})

but get the following error message

Error in nchar(x) : invalid multibyte string, element 1 

I have tried multiple other approaches, but with the same error.

I am quite sure that I am missing something very basic, so I would appreciate your help.

Thanks

library(stringr)
str_sub(a$Description,-2,-1)

df <- data.frame(date = c("2015-01-01", "2015-02-01", "2015-01-15"),
             jumble = c("12345 VA", "123 FL", "12354567732 GA"),
             debit = c(15, 36, 20))

df$jumble <- as.character(df$jumble)

df$state <- substr(df$jumble, nchar(df$jumble)-1, nchar(df$jumble))

df
        date         jumble debit state
1 2015-01-01       12345 VA    15    VA
2 2015-02-01         123 FL    36    FL
3 2015-01-15 12354567732 GA    20    GA
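
The substr() approach can still run into the "invalid multibyte string" error from the question if the column holds bytes that are not valid in the session's encoding. The question does not say what encoding the data uses, so the following is only a minimal sketch that assumes Latin-1 input; the sample strings and the `from = "latin1"` argument are assumptions for illustration.

```r
# Hypothetical data: "\xe9" is a Latin-1 byte that is not valid UTF-8 on its own
desc <- c("abcdef    VA", "caf\xe9 NY")

# Assumption: the source is Latin-1; change `from` to match the real data
desc_utf8 <- iconv(desc, from = "latin1", to = "UTF-8")

# With a valid encoding, nchar() and substr() work on the whole vector at once
state <- substr(desc_utf8, nchar(desc_utf8) - 1, nchar(desc_utf8))
state
```

If the encoding is unknown, Encoding(desc) and validUTF8(desc) may help to find the offending elements before converting.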

Here's a regex version, using Brandon S's sample data. The regex captures everything after the last whitespace character to the end of the string.

df <- data.frame(date = c("2015-01-01", "2015-02-01", "2015-01-15"),
                 jumble = c("12345 VA", "123 FL", "12354567732 GA"),
                 debit = c(15, 36, 20))

df$state <- gsub(".+\\s(.+)$", "\\1", df$jumble)

df

        date         jumble debit state
1 2015-01-01       12345 VA    15    VA
2 2015-02-01         123 FL    36    FL
3 2015-01-15 12354567732 GA    20    GA
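
To make the behaviour of that pattern a bit more concrete, here is a small sketch on a plain character vector. The sample strings are hypothetical, and the NA handling at the end is an optional addition, not part of the answer above.

```r
# Hypothetical sample strings (not from the original data)
x <- c("12345 VA", "abcdef    NY", "nospace")

# ".+"     greedy: takes as much as it can while the rest can still match
# "\\s"    therefore ends up on the last whitespace character
# "(.+)$"  group 1: everything after that whitespace up to the end
state <- gsub(".+\\s(.+)$", "\\1", x)

# A string without whitespace does not match and comes back unchanged,
# so it may be worth marking such rows as missing instead
has_space <- grepl("\\s", x)
state[!has_space] <- NA_character_
state
```

Unlike substr(), this does not assume the abbreviation is exactly two characters long; it returns whatever follows the last whitespace.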

We can use sub to strip everything up to and including the last run of whitespace:

df$State <- sub(".*\\s+", "", df[,2])
df$State
#[1] "VA" "FL" "GA"

A more concise way, if you are working in pandas (Python) rather than R:

df['Description'].str[-2:]

This assumes that your Description column holds strings (object dtype).

Comments
  • substr(df$Description, nchar(df$Description)-1, nchar(df$Description))
  • Thank you all for your suggestions. I noticed that your suggestion works, but only if I assign the values in a statement. It does not seem to work when I get the df from a function. Any thoughts on this? Thanks