tidyverse: matching specific dates to event periods

I've got dates that I want to match with events for which I only have the start date. As a simplified reprex, say I'd like to figure out who was president during certain events, but I only have inauguration dates.

pres <- data.frame(pres = c("Ronald Reagan", "George H. W. Bush", 
                            "Bill Clinton", "George W. Bush",
                            "Barack Obama", "Donald Trump"), 
                   inaugdate = structure(c(4037, 6959, 8420, 11342, 14264, 
                                           17186), class = "Date"))

events <- data.frame(event = c("Challenger explosion", "Chernobyl explosion",
                               "Hurricane Katrina", "9-11"), 
                     date = structure(c(5871, 5959, 13024, 11576), class = "Date"))

Obviously, a simple left_join won't work because the events didn't happen on inauguration days.

events %>%
      left_join(pres, by = c("date" = "inaugdate"))

In Excel, vlookup used to give you an option of true (match closest previous) or false (match exact). Is there something similar in the tidyverse?

Here is one way to get the desired result, although it could probably be prettied up a bit. lubridate provides intervals, a class for time spans with a specific start and end, along with the %within% operator, which tests whether a date falls inside an interval. First, build an interval for each presidency, running from one inauguration date to the next (or to today for the last row), and convert the pres column to character so it can be indexed properly. Then iterate over the event dates with map_chr, using a function that checks the date against every interval, gets the index of the interval it falls in (with which), and returns the corresponding president.

This requires each date to fall in exactly one interval, otherwise map_chr will fail. That happens for a date before the first inauguration (no match) and, because %within% includes both endpoints, for an event that falls exactly on an inauguration day (two matches).

library(tidyverse)
library(lubridate)

pres <- data.frame(pres = c("Ronald Reagan", "George H. W. Bush", 
                            "Bill Clinton", "George W. Bush",
                            "Barack Obama", "Donald Trump"), 
                   inaugdate = structure(c(4037, 6959, 8420, 11342, 14264, 
                                           17186), class = "Date"))

events <- data.frame(event = c("Challenger explosion", "Chernobyl explosion",
                               "Hurricane Katrina", "9-11"), 
                     date = structure(c(5871, 5959, 13024, 11576), class = "Date"))

pres2 <- pres %>%
  mutate(
    presidency = interval(inaugdate, lead(inaugdate, default = today())),
    pres = as.character(pres)
  )
events %>%
  mutate(pres = map_chr(date, ~ pres2$pres[which(. %within% pres2$presidency)]))
#>                  event       date           pres
#> 1 Challenger explosion 1986-01-28  Ronald Reagan
#> 2  Chernobyl explosion 1986-04-26  Ronald Reagan
#> 3    Hurricane Katrina 2005-08-29 George W. Bush
#> 4                 9-11 2001-09-11 George W. Bush

Created on 2019-02-04 by the reprex package (v0.2.1)
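
For the "match closest previous" behaviour of vlookup that the question asks about, base R's findInterval() may be the closest equivalent. The following is a minimal sketch, not part of the original answer. It assumes the lookup table can be sorted by inaugdate and that an event before the first inauguration should get NA. The idx variable is introduced here only for illustration.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps names as character.
pres <- data.frame(pres = c("Ronald Reagan", "George H. W. Bush",
                            "Bill Clinton", "George W. Bush",
                            "Barack Obama", "Donald Trump"),
                   inaugdate = structure(c(4037, 6959, 8420, 11342, 14264,
                                           17186), class = "Date"),
                   stringsAsFactors = FALSE)

events <- data.frame(event = c("Challenger explosion", "Chernobyl explosion",
                               "Hurricane Katrina", "9-11"),
                     date = structure(c(5871, 5959, 13024, 11576), class = "Date"),
                     stringsAsFactors = FALSE)

# findInterval() needs its break points in increasing order.
pres <- pres[order(pres$inaugdate), ]

# For each event date: index of the last inauguration on or before it (0 if none).
idx <- findInterval(as.numeric(events$date), as.numeric(pres$inaugdate))
idx[idx == 0] <- NA  # events before the first inauguration get NA

events$pres <- pres$pres[idx]
events
```

Because each interval is closed on the left and open on the right, an event on an inauguration day is assigned to the incoming president, so every date matches at most one row.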


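Probably not the most efficient, but we can use an inequality join with sqldf. The join keeps every inauguration on or before the event date, and the group by / having min(...) step then keeps the closest one for each event: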

library(sqldf)

sqldf('select a.event, a.date, b.pres
      from events a 
      left join pres b
      on a.date >= b.inaugdate
      group by a.event 
      having min(a.date - b.inaugdate)
      order by date, event')

Output:

                 event       date           pres
1 Challenger explosion 1986-01-28  Ronald Reagan
2  Chernobyl explosion 1986-04-26  Ronald Reagan
3                 9-11 2001-09-11 George W. Bush
4    Hurricane Katrina 2005-08-29 George W. Bush
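
The same "inequality join, then keep the closest match" idea can probably be written without SQL in newer versions of dplyr. This is a sketch only, assuming dplyr 1.1.0 or later (which added join_by() and its closest() helper, after this thread was written). The name matched is made up for the example.

```r
library(dplyr)

pres <- data.frame(pres = c("Ronald Reagan", "George H. W. Bush",
                            "Bill Clinton", "George W. Bush",
                            "Barack Obama", "Donald Trump"),
                   inaugdate = structure(c(4037, 6959, 8420, 11342, 14264,
                                           17186), class = "Date"),
                   stringsAsFactors = FALSE)

events <- data.frame(event = c("Challenger explosion", "Chernobyl explosion",
                               "Hurricane Katrina", "9-11"),
                     date = structure(c(5871, 5959, 13024, 11576), class = "Date"),
                     stringsAsFactors = FALSE)

# Rolling join: for each event, keep only the latest inauguration
# that is on or before the event date.
matched <- events %>%
  left_join(pres, by = join_by(closest(date >= inaugdate)))

matched
```

Since this is a left join, an event earlier than every inauguration date would be kept with NA in the pres column rather than dropped.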


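Maybe not the most efficient (it builds every combination of rows first, so the cost depends on the number of rows and columns), but here is another way to solve the problem: add an end date to each presidency, cross join the two tables, and keep the rows where the event date falls inside the term.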

library(dplyr)

pres <- data.frame(pres = c("Ronald Reagan", "George H. W. Bush", 
                            "Bill Clinton", "George W. Bush",
                            "Barack Obama", "Donald Trump"), 
                   inaugdate = structure(c(4037, 6959, 8420, 11342, 14264, 
                                           17186), class = "Date")) %>% 
  # last day of each term: the day before the next inauguration
  mutate(enddt = lead(inaugdate, default = Sys.Date()) - 1)

events <- data.frame(event = c("Challenger explosion", "Chernobyl explosion",
                               "Hurricane Katrina", "9-11"), 
                     date = structure(c(5871, 5959, 13024, 11576), class = "Date"))

# merge() on tables with no shared column names returns every combination of rows
newdf <- merge(pres, events, all = TRUE) %>% 
  # enddt is already the last day of the term, so the upper bound is inclusive
  filter(date >= inaugdate, date <= enddt)
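
If building every combination of rows is a concern, the start/end filter above might instead be expressed as a range join. This is a hedged sketch rather than the answer's own method. It assumes dplyr 1.1.0 or later for join_by() and its between() helper, and it assumes the last term should run through today (hence the + 1 in the default). The name ranged is made up for the example.

```r
library(dplyr)

pres <- data.frame(pres = c("Ronald Reagan", "George H. W. Bush",
                            "Bill Clinton", "George W. Bush",
                            "Barack Obama", "Donald Trump"),
                   inaugdate = structure(c(4037, 6959, 8420, 11342, 14264,
                                           17186), class = "Date"),
                   stringsAsFactors = FALSE) %>%
  # Last day of each term; the final term is assumed to end today.
  mutate(enddt = lead(inaugdate, default = Sys.Date() + 1) - 1)

events <- data.frame(event = c("Challenger explosion", "Chernobyl explosion",
                               "Hurricane Katrina", "9-11"),
                     date = structure(c(5871, 5959, 13024, 11576), class = "Date"),
                     stringsAsFactors = FALSE)

# Range join: match each event to the term whose [inaugdate, enddt]
# range contains the event date (both bounds inclusive by default).
ranged <- events %>%
  left_join(pres, by = join_by(between(date, inaugdate, enddt)))

ranged
```

The terms do not overlap because each enddt is the day before the next inaugdate, so each event matches at most one president.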
