How to apply several functions to every possible row combination within a dataframe in R?

I've got a dataframe with coordinates (lon, lat)

    lon <- list(505997.627175236, 505997.627175236, 505997.627175236, 505997.627175236)   
    lon <- do.call(rbind.data.frame, lon)

    lat <- list(7941821.025438220, 7941821.025438220, 7941821.025438220, 7941821.025438220)
    lat <- do.call(rbind.data.frame, lat)

    coord <- cbind(lon, lat)
    colnames(coord) <- c("lon", "lat")

I'm trying to calculate the Euclidean distance and the angle between all possible pairs of rows in the dataframe.

     lon   lat       apply the functions to every possible combination, such as v1-v2, v1-v3, v1-v4,
v1   x1    y1        v2-v3 and so on...
v2   x2    y2         
v3   x3    y3        here are the two functions applied between v1 and v2:
v4   x4    y4        **Euclidean distance**    sqrt((x1-x2)^2 + (y1-y2)^2)
                     **angle**                 atan2((y1-y2),(x1-x2))*(180/pi)

How can I apply several functions to every possible row combination and get the results in separate lists? My goal is to run these calculations at every iteration, whatever the number of rows in the input.

Thanks in advance for your answers and sorry if the question seems silly. I've looked at so many posts but couldn't find a solution that I could understand and replicate.


Base R function combn generates the combinations of a vector's elements taken m at a time and it can, optionally, apply a function FUN to those combinations. Since the input data is a "data.frame", I will combine the rownames 2 by 2.

euclidean <- function(k){
  f <- function(x, y) sqrt((x[1] - y[1])^2 + (x[2] - y[2])^2)
  x <- unlist(coord[k[1], 1:2])
  y <- unlist(coord[k[2], 1:2])
  f(x, y)
}

angle <- function(k){ 
  f <- function(x, y) atan2(x[2] - y[2], x[1] - y[1])*(180/pi)
  x <- unlist(coord[k[1], 1:2])
  y <- unlist(coord[k[2], 1:2])
  f(x, y)
}

combn(rownames(coord), 2, euclidean)
#[1]   4019.95 800062.50  20012.25 804067.26  24001.87 780073.39

combn(rownames(coord), 2, angle)
#[1] -84.28941  90.71616  87.99547  90.74110  89.28384 -89.21407

Data.

This is the data in the OP's answer but without the id column.

lon <- c(505997.627175236, 505597.627175236,
         515997.627175236, 505297.627175236)   
lat <- c(7941821.025438220, 7945821.025438220,
         7141821.025438220, 7921821.025438220)
coord <- data.frame(lon, lat)
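
If both measures are wanted from a single pass over the pairs, one possible variation is to let the function given to combn return both values at once. This is only a sketch: the helper name pair_stats and the output columns row1 and row2 are made up for illustration, and it combines row indices rather than row names. It needs at least two rows, otherwise combn stops with an error.

```r
# Same sample data as in the answer above.
lon <- c(505997.627175236, 505597.627175236,
         515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220,
         7141821.025438220, 7921821.025438220)
coord <- data.frame(lon, lat)

# Hypothetical helper: k holds the two row indices of one pair.
pair_stats <- function(k) {
  dx <- coord$lon[k[1]] - coord$lon[k[2]]
  dy <- coord$lat[k[1]] - coord$lat[k[2]]
  c(sqrt(dx^2 + dy^2),          # Euclidean distance
    atan2(dy, dx) * 180 / pi)   # angle in degrees
}

# Without FUN, combn() returns the index pairs themselves (one column per pair).
pairs <- combn(nrow(coord), 2)
# With FUN returning two values, the result is a 2 x choose(n, 2) matrix.
stats <- combn(nrow(coord), 2, pair_stats)

result <- data.frame(row1     = pairs[1, ],
                     row2     = pairs[2, ],
                     distance = stats[1, ],
                     angle    = stats[2, ])
result
```

The pairs come out in the same order as in the two separate combn calls, so result$distance and result$angle can be used directly as the two result vectors.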



library(dplyr)  # provides %>%, mutate() and left_join()

# two vectors (I changed them a little bit)
lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)   
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)

# a function for the euclidean distance
eDistance <- function(x1, x2, y1, y2) sqrt((x1-x2)^2 + (y1-y2)^2)

# now we create a dataframe...
df <- data.frame(lon, lat) %>%
    mutate(joinIndex = 1:nrow(.)) # and we add an index column

# ...that looks like this
#        lon     lat joinIndex
# 1 505997.6 7941821         1
# 2 505597.6 7945821         2
# 3 515997.6 7141821         3
# 4 505297.6 7921821         4

# create all combinations of the join indices
df_combinations <- expand.grid(1:nrow(df), 1:nrow(df))

#    Var1 Var2
# 1     1    1
# 2     2    1
# 3     3    1
# 4     4    1
# 5     1    2
# 6     2    2
# 7     3    2
# 8     4    2
# 9     1    3
# 10    2    3
# 11    3    3
# 12    4    3
# 13    1    4
# 14    2    4
# 15    3    4
# 16    4    4

# and join our dataframe first on one index then on the other
df_final <- df_combinations %>%
    left_join(df, by = c("Var1" = "joinIndex")) %>%
    left_join(df, by = c("Var2" = "joinIndex"))

# and then finally calculate the euclidean distance
df_final %>%
    mutate(distance = eDistance(lon.x, lon.y, lat.x, lat.y))

   Var1 Var2    lon.x   lat.x    lon.y   lat.y  distance
1     1    1 505997.6 7941821 505997.6 7941821      0.00
2     2    1 505597.6 7945821 505997.6 7941821   4019.95
3     3    1 515997.6 7141821 505997.6 7941821 800062.50
4     4    1 505297.6 7921821 505997.6 7941821  20012.25
5     1    2 505997.6 7941821 505597.6 7945821   4019.95
6     2    2 505597.6 7945821 505597.6 7945821      0.00
7     3    2 515997.6 7141821 505597.6 7945821 804067.26
8     4    2 505297.6 7921821 505597.6 7945821  24001.87
9     1    3 505997.6 7941821 515997.6 7141821 800062.50
10    2    3 505597.6 7945821 515997.6 7141821 804067.26
11    3    3 515997.6 7141821 515997.6 7141821      0.00
12    4    3 505297.6 7921821 515997.6 7141821 780073.39
13    1    4 505997.6 7941821 505297.6 7921821  20012.25
14    2    4 505597.6 7945821 505297.6 7921821  24001.87
15    3    4 515997.6 7141821 505297.6 7921821 780073.39
16    4    4 505297.6 7921821 505297.6 7921821      0.00



For fast Euclidean calculations, you can look at this

For the other function, you can do something like

atan2(outer(coord$lat, coord$lat, `-`), outer(coord$lon, coord$lon, `-`))*180/pi
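
The same outer() idea can also be used for the distance. The sketch below is not the linked method from the answer above (that link is not reproduced here); it simply builds both full n x n matrices in base R and checks the distances against dist(), which uses the Euclidean method by default. The variable names are chosen for illustration only.

```r
lon <- c(505997.627175236, 505597.627175236,
         515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220,
         7141821.025438220, 7921821.025438220)
coord <- data.frame(lon, lat)

# dx[i, j] = lon[i] - lon[j], dy[i, j] = lat[i] - lat[j]
dx <- outer(coord$lon, coord$lon, `-`)
dy <- outer(coord$lat, coord$lat, `-`)

dist_mat  <- sqrt(dx^2 + dy^2)          # symmetric, zeros on the diagonal
angle_mat <- atan2(dy, dx) * 180 / pi   # angle_mat[i, j] is the angle for row i minus row j

# Cross-check against base R's dist(); attributes (dimnames) are ignored.
all.equal(dist_mat, as.matrix(dist(coord)), check.attributes = FALSE)

# Keep each unordered pair (i < j) once.
keep  <- upper.tri(dist_mat)
pairs <- which(keep, arr.ind = TRUE)
data.frame(pairs, distance = dist_mat[keep], angle = angle_mat[keep])
```

Note that the matrices hold every ordered pair, so memory grows with n^2, and that which() lists the pairs column by column, which is a different order from combn().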



In the end, I adapted the code that Georgery provided, but I used "combn" instead of "expand.grid" to avoid repeated row combinations when applying the functions to the final dataframe. I also had to use the function "convert" from the package "hablar" to properly convert the factors of my dataframe "coord_combn" to numeric values.

Here's the code:

library(dplyr)   # %>%, mutate(), left_join()
library(hablar)  # convert(), num()

lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)

# dataframe creation + adding of an id column
coord <- data.frame(lon, lat) %>% 
                 mutate(id = 1:nrow(.))

coord_combn <- combn(rownames(coord), 2) # all the possible row combinations
coord_combn <- as.data.frame(t(coord_combn)) # transpose columns into rows
coord_combn <- coord_combn %>% 
                 convert(num(V1, V2)) # factor to numeric

#join our dataframe first on one index then on the other
coord_final <- coord_combn %>%
  left_join(coord, by = c("V1" = "id")) %>%
  left_join(coord, by = c("V2" = "id"))

eDistance <- function(x1, x2, y1, y2) sqrt((x1-x2)^2 + (y1-y2)^2)
eAngle <- function(x1, x2, y1, y2) atan2((y1-y2),(x1-x2))*(180/pi)

# euclidean distance calculation
coord_final <- coord_final %>% 
                 mutate(distance = eDistance(lon.x, lon.y, lat.x, lat.y)) 
# angle calculation
coord_final <- coord_final %>% 
                 mutate(angle = eAngle(lon.x, lon.y, lat.x, lat.y)) 

Thank you everyone, you've been a great help.
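
For reuse at every iteration with any number of rows, the same calculation could be wrapped in a function that needs no joins and no extra packages. This is a hedged sketch in base R: the name pairwise_stats is hypothetical, and it assumes the input has numeric columns called lon and lat and at least two rows.

```r
# Hypothetical helper: returns the row pairs plus one vector per measure.
pairwise_stats <- function(coord) {
  stopifnot(nrow(coord) >= 2)
  idx <- combn(nrow(coord), 2)   # 2 x choose(n, 2) matrix of row indices
  i <- idx[1, ]
  j <- idx[2, ]
  # Vectorised differences over all pairs at once.
  dx <- coord$lon[i] - coord$lon[j]
  dy <- coord$lat[i] - coord$lat[j]
  list(pairs    = data.frame(V1 = i, V2 = j),
       distance = sqrt(dx^2 + dy^2),
       angle    = atan2(dy, dx) * 180 / pi)
}

lon <- c(505997.627175236, 505597.627175236, 515997.627175236, 505297.627175236)
lat <- c(7941821.025438220, 7945821.025438220, 7141821.025438220, 7921821.025438220)
coord <- data.frame(lon, lat)

res  <- pairwise_stats(coord)          # 4 rows give 6 pairs
res3 <- pairwise_stats(coord[1:3, ])   # 3 rows give 3 pairs
res$distance
res$angle
```

Because the indices are integers from the start, no factor-to-numeric conversion is involved, and the distance and angle come back as separate elements of the returned list.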
