## Hot questions for using ggplot2 in ggpairs

Question:

I am using `ggpairs` to make a pairs plot, but I only want to display the lower triangle. I can make the diagonal and upper triangle blank, but cannot make them go away, which leaves an empty row and an empty column that I don't want.

Any suggestions?

    library("GGally")
    ggpairs(iris[, 1:4],
            lower = list(continuous = "points"),
            upper = list(continuous = "blank"),
            diag = list(continuous = "blankDiag"))

Answer:

The `ggpairs` object can be edited. The bulk of the object is a `list` of plots. The unwanted plots can be removed from this list and the other elements of the `ggpairs` object modified to match.

Here is a function that will do this:

    gpairs_lower <- function(g){
      # drop the first row of panels and its y-axis label
      g$plots <- g$plots[-(1:g$nrow)]
      g$yAxisLabels <- g$yAxisLabels[-1]
      g$nrow <- g$nrow -1

      # drop the last column of panels and its x-axis label
      g$plots <- g$plots[-(seq(g$ncol, length(g$plots), by = g$ncol))]
      g$xAxisLabels <- g$xAxisLabels[-g$ncol]
      g$ncol <- g$ncol - 1

      g
    }

    library("GGally")
    g <- ggpairs(iris[, 1:4],
                 lower = list(continuous = "points"),
                 upper = list(continuous = "blank"),
                 diag = list(continuous = "blankDiag"))

    gpairs_lower(g)

Question:

How can I save a `ggpairs` plot? The current `ggsave` does not work with it.

Script:

    library(GGally)
    library(ggplot2)
    data(diamonds, package="ggplot2")
    diamonds.samp <- diamonds[sample(1:dim(diamonds)[1],200),]

    pf <- ggpairs(diamonds.samp[,1:3], mapping = ggplot2::aes(color = cut))

    ggsave("C:/Users/top/Desktop/ggpairs.jpg", pf, dpi=500)

Answer:

If you try to use `ggsave` you get an error:

    ggsave("ggpairs.jpg", pf, dpi=500)

    Saving 7 x 7 in image
    Error in UseMethod("grid.draw") :
      no applicable method for 'grid.draw' applied to an object of class "c('gg', 'ggmatrix')"

So you can write your own `grid.draw` method for the `ggpairs` object class:

    grid.draw.gg <- function(x){
      print(x)
    }

    ggsave("ggpairs.jpg", pf, dpi=500)
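If you would rather not define an S3 method, another option is to open a graphics device yourself and print the plot matrix into it. This is only a sketch: the output path, dimensions and resolution are placeholder values, and `iris` stands in for the sampled diamonds data from the question.

```r
library(GGally)

# Stand-in data; Species must be present in the data for the colour mapping.
pf <- ggpairs(iris, columns = 1:3, mapping = ggplot2::aes(color = Species))

# Placeholder output path; replace with wherever the file should go.
out_file <- file.path(tempdir(), "ggpairs.png")

# Open a PNG device, draw the plot matrix into it, then close the device.
png(out_file, width = 7, height = 7, units = "in", res = 150)
print(pf)
dev.off()
```

`jpeg()` and `pdf()` from grDevices can be used the same way if a different file format is needed.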

Question:

Can I use `pcor` (from `ppcor`), or actually put any correlation matrix I make in advance, into the code of `ggpairs` (in the `upper =` argument) instead of `cor`?

I want to integrate within ggpairs a partial correlation matrix or the pcor.

    library(GGally)

    a <- as.numeric(1:10)
    b <- as.numeric(a*a)
    c <- as.numeric(a/b)
    D <- as.factor(c("A", "B", "C", "A", "B", "C","A", "B", "C","A"))
    abcd <- data.frame(a,b,c, D)

    p <- ggpairs(abcd, columns = c("a", "b", "c"), title = "All Bivariate analysis",
                 upper = list(continuous = wrap("cor", size = 6)),
                 lower = list(continuous = wrap("smooth", alpha = 0.6, size = 0.1)),
                 mapping = aes(color = D))

    for (i in 1:p$nrow) {
      for (j in 1:p$ncol) {
        p[i,j] <- p[i,j] +
          scale_fill_manual(values=c("grey25", "slategrey", "grey85")) +
          scale_color_manual(values=c("grey37", "slategrey", "grey75"))
      }
    }

    d <- p + theme(axis.text.x = element_text(face = "bold", size = 10 ),
                   axis.text.y = element_text(face = "bold", size = 10),
                   strip.text = element_text(size = 20))
    d

I would like to use the fantastic `ggpairs`, but with a partial correlation matrix. Is it possible?
I guess I should do this in this part:

    upper = list(continuous = wrap("cor", size = 6))

Answer:

Looking at the code of `GGally::ggpairs`, you can see that you can provide a function to `upper`, which needs to produce a `ggplot`. When providing a function stub like this:

    upper = list(continuous = function(data, mapping) {
      print(list(data, mapping))
    })

You will see that for each panel you get the whole `data.frame` and an `aes` mapping describing what should be on the x- and y-axis and which other aesthetics you may have set, for instance:

    [[1]]
        a   b         c D
    1   1   1 1.0000000 A
    2   2   4 0.5000000 B
    3   3   9 0.3333333 C
    4   4  16 0.2500000 A
    5   5  25 0.2000000 B
    6   6  36 0.1666667 C
    7   7  49 0.1428571 A
    8   8  64 0.1250000 B
    9   9  81 0.1111111 C
    10 10 100 0.1000000 A

    [[2]]
    Aesthetic mapping:
    * `x`      -> `b`
    * `y`      -> `a`
    * `colour` -> `D`

Out of this information we need to

- Calculate the `pcor`
- Extract the relevant coefficients

This is a bit tricky, because we need to calculate a grouped `pcor` (one coefficient for each level of `colour -> D`, plus potentially other groupings which you may want to include later), and we would need to get the grouping structure from the mapping, which is also not that straightforward.

To make a long story short, the following stub shows you the direction and you can take it from there to further fine-tune the appearance of the upper plot:

    library(tidyverse)
    library(ppcor)  # provides pcor()

    pcor_panel <- function(data, mapping, ...) {
      ## remove x, y mapping
      grp_aes <- mapping[setdiff(names(mapping), c("x", "y"))]
      ## extract the columns to which x and y is mapped
      xy <- sapply(mapping[c("x", "y")], rlang::as_name)
      ## calculate pcor per group
      stats <- data %>%
        group_by(!!!unname(unclass(grp_aes))) %>%
        group_modify(function(dat, grp) {
          res <- pcor(dat)$estimate %>%
            as_tibble() %>%
            setNames(names(dat)) ## needed b/c in pcor names are sometimes messed up
          res <- res %>%
            mutate(x = names(res)) %>%
            gather(y, pcor, -x)
          res %>%
            filter(x == xy[1], y == xy[2]) ## look only at the pcors of this panel
        }) %>%
        ungroup() %>%
        mutate(x = 1, y = seq_along(y))
      ggplot(stats, aes(x, y, label = round(pcor, 3))) +
        geom_text(grp_aes) +
        ylim(range(stats$y) + c(-2, 2))
    }

    ggpairs(abcd, columns = c("a", "b", "c"),
            title = "All Bivariate analysis",
            upper = list(continuous = pcor_panel),
            lower = list(continuous = wrap("smooth", alpha = 0.6, size = 0.1)),
            mapping = aes(color = D))
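For the simpler, ungrouped case where the matrix is computed in advance, a panel function only has to look up the right cell. The following is a rough sketch rather than a finished solution: `partial_cor()` and `matrix_panel()` are hypothetical helper names, the partial correlations are derived in base R from the inverse of the correlation matrix instead of calling `ppcor::pcor`, and `iris` is used as stand-in data.

```r
library(ggplot2)
library(GGally)

# Partial correlation of each pair given all other columns, from the
# inverse correlation matrix P: pcor_ij = -P_ij / sqrt(P_ii * P_jj).
partial_cor <- function(x) {
  p <- solve(cor(x))
  pc <- -p / sqrt(outer(diag(p), diag(p)))
  diag(pc) <- 1
  dimnames(pc) <- list(colnames(x), colnames(x))
  pc
}

# Hypothetical helper: builds a panel function that prints the entry of a
# precomputed matrix `m` matching the panel's x and y columns.
matrix_panel <- function(m, digits = 3) {
  function(data, mapping, ...) {
    xv <- rlang::as_name(mapping$x)
    yv <- rlang::as_name(mapping$y)
    lab <- data.frame(x = 0, y = 0,
                      label = format(round(m[yv, xv], digits), nsmall = digits))
    ggplot(lab, aes(x, y, label = label)) +
      geom_text() +
      theme_void()
  }
}

pc <- partial_cor(iris[, 1:4])
p <- ggpairs(iris, columns = 1:4,
             upper = list(continuous = matrix_panel(pc)))
p
```

Any matrix with row and column names matching the plotted columns could be passed to `matrix_panel()` in place of `pc`, including the `$estimate` element returned by `pcor` once its names are set.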

Question:

I have three variables `a`, `b`, `c`. I want to make a `ggpairs` plot of `a` and `b` with each variable (in all of the panels) colored by `c`. How can I do this?

##### Code example

    library(ggplot2)
    library(GGally)

    N <- 100
    a <- rnorm(N, 0, 1)
    b <- rnorm(N, 0, 1)
    point.colors <- runif(N, 0, 1)
    ggpairs(data=data.frame(a, b)) # How to add point.colors here?

I can do this using base R pretty easily:

    plot(a, b, col=colorRampPalette(c('red', 'blue'))(N)[1+floor(N*point.colors)])

How to do it with `ggpairs`?

(edit: off-by-one)

Answer:

Why not change the plot within the ggpairs object?

    p = ggpairs(data = data.frame(a,b))
    p21 = qplot(a,b,colour = point.colors)
    #next line didn't work for user
    #p[2,1] = p21
    p$plots[[3]] = p21
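Another route, sketched here under a few assumptions, is to keep the colour variable as a column of the data and supply a custom panel function for the lower triangle. `coloured_points` is a hypothetical name, and the red-to-blue gradient simply mirrors the base R example from the question. Only the scatter panel is coloured; the diagonal and upper panels keep their defaults.

```r
library(ggplot2)
library(GGally)

set.seed(1)
N <- 100
df <- data.frame(a = rnorm(N), b = rnorm(N), point.colors = runif(N))

# Hypothetical panel function: a scatter plot coloured by the extra column.
# ggpairs passes the whole data frame, so point.colors is available here
# even though it is not one of the plotted columns.
coloured_points <- function(data, mapping, ...) {
  ggplot(data, mapping) +
    geom_point(aes(colour = point.colors), ...) +
    scale_colour_gradient(low = "red", high = "blue")
}

p <- ggpairs(df, columns = c("a", "b"),
             lower = list(continuous = coloured_points))
p
```

Note that `ggpairs` does not draw a legend unless asked to, so the colour scale is not labelled in the resulting matrix.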

Question:

I am using `ggpairs` from GGally (an extension of ggplot2).

I need a histogram on the diagonal of the `ggpairs` plot, and I want to superimpose the normal density curve using the mean and sd of the data.

I read the help (https://www.rdocumentation.org/packages/GGally/versions/1.4.0/topics/ggpairs) but can't find an option to do it. I guess I must build my own function (`myfunct`) and then call

`ggpairs(sample.dat, diag=list(continuous = myfunct))`

Has anyone tried this?

I have tried the following:

    head(data)
          x1    x2    x3    x4    x5    x6     F1    F2
    1 -0.749 -1.57 0.408 0.961 0.777 0.171 -0.143 0.345

    myhist = function(data){
      ggplot(data, aes(x)) +
        geom_histogram(aes(y = ..density..),colour = "black") +
        stat_function(fun = dnorm, args = list(mean = mean(x), sd = sd(x)))
    }

    ggpairs(sample.data, diag=list(continuous = myhist))

The result is:

Error in (function (data) : unused argument (mapping = list(~x1))

Answer:

This question provides an example of the code to add a normal curve to a histogram in `ggplot2`. You can use this to write your own function to pass to the `diag` argument of `ggpairs`. The function must accept both `data` and `mapping` arguments, which is why the attempt above fails with "unused argument". To calculate the `mean` and `sd` of the data, you can grab the relevant data using, for example, `eval_data_col(data, mapping$x)`. Example below (perhaps a little more complicated than needed, but it allows you to pass parameters to change colours etc. using the `wrap` functionality):

    library(GGally)

    diag_fun <- function(data, mapping, hist=list(), ...){

      X = eval_data_col(data, mapping$x)
      mn = mean(X)
      s = sd(X)

      ggplot(data, mapping) +
        do.call(function(...) geom_histogram(aes(y =..density..), ...), hist) +
        stat_function(fun = dnorm, args = list(mean = mn, sd = s), ...)
    }

    ggpairs(iris[1:100, 1:4],
            diag=list(continuous=wrap(diag_fun, hist=list(fill="red", colour="blue"),
                                      colour="green", lwd=2)))

Question:

I am using the Auto dataset from the ISLR library and the function `ggpairs()` from the GGally library to create a scatterplot of all possible combinations of variables. My code is the following:

    data(Auto)
    setDT(Auto)
    ggpairs(Auto[, -c("name"), with = FALSE],
            lower = list(continuous = wrap("points", color = "red", alpha = 0.5),
                         combo = wrap("box", color = "orange", alpha = 0.3),
                         discrete = wrap("facetbar", color = "yellow", alpha = 0.3)),
            diag = list(continuous = wrap("densityDiag", color = "blue", alpha = 0.5))) +
      theme(axis.text.x = element_text(angle = 90, hjust = 1))

The plot is the one below:

There are some issues with this plot:

- The axis tick labels are not readable. How could I remove the numbers, or possibly rotate the tick labels to be vertical to the axes?
- How could I enforce different colors for the combo pairs (categorical vs. continuous)?

Your advice will be appreciated.

Answer:

Maybe the proposed solution is not a perfect match with your wishes, but I hope it helps.

- You need to invoke more libraries to get the code to work.
- You will need to have factors to "force" the categorical variables to be known as such.

The following code may do the trick:

    library(ISLR)
    library(data.table)
    library(GGally)
    library(ggplot2)
    data(Auto, package = "ISLR")

    # remove unwanted column and make categorical variables
    Auto2 <- Auto[, -9]
    Auto2$cylinders <- factor(Auto2$cylinders)
    Auto2$origin <- factor(Auto2$origin)

    ggpairs(Auto2,
            lower = list(continuous = wrap("points", color = "red", alpha = 0.5),
                         combo = wrap("box", color = "orange", alpha = 0.3),
                         discrete = wrap("facetbar", color = "yellow", alpha = 0.3)),
            diag = list(continuous = wrap("densityDiag", color = "blue", alpha = 0.5)))

This yields the following picture:

Please let me know whether this is what you want.
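On the tick-label part of the question, here is a small sketch of options that may help. It uses `mtcars` as stand-in data so it runs without ISLR; the column choice and the object names `p_none`, `p_rot` and `p_col` are made up for illustration.

```r
library(ggplot2)
library(GGally)

cars <- mtcars[, c("mpg", "disp", "hp", "cyl")]
cars$cyl <- factor(cars$cyl)  # make the categorical column explicit

# Option 1: drop the axis tick labels entirely.
p_none <- ggpairs(cars, axisLabels = "none")

# Option 2: keep the tick labels but rotate the ones on the x-axis.
p_rot <- ggpairs(cars) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

# Option 3: colour by the categorical column instead of one fixed colour,
# so the box plots in the combo panels differ per level.
p_col <- ggpairs(cars, mapping = aes(colour = cyl))
```

Whether option 3 gives the colours you want depends on the panels involved; a fixed `color` passed through `wrap()` would override the mapped colour in that panel.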

Question:

I am using ggpairs and while plotting the matrix, I receive a matrix as follows

As you can see, some of the labels are long and hence the text is not shown completely. Is there any way that I can wrap the text so that it is completely visible?

Code

ggpairs(df)

I want the text to wrap so that it can be seen something like this

Answer:

You can use the `labeller` argument of `ggpairs` to pass a function to be applied to the facet strip text.

`ggplot2` has a nice ready-made function, `label_wrap_gen()`, that wraps long labels.

By default `ggpairs` uses the column names as labels, and syntactic column names can't contain spaces. `label_wrap_gen()` needs spaces to split the labels over multiple rows.

This is a solution:

    library(ggplot2)
    library(GGally)

    df <- iris
    colnames(df) <- make.names(c('Long colname',
                                 'Quite long colname',
                                 'Longer tha usual colname',
                                 'I\'m not even sure this should be a colname',
                                 'The ever longest colname that one should be allowed to use'))

    ggpairs(df,
            columnLabels = gsub('.', ' ', colnames(df), fixed = TRUE),
            labeller = label_wrap_gen(10))

Question:

I am getting the below error when trying to plot the dat data frame

    library(GGally)
    library(ggplot2)
    dat = data.frame(a=rnorm(5), b=rnorm(5), c=rnorm(5), d=rnorm(5), e=c(1,2,3,4,5))
    dat
                a          b          c           d e
    1  0.21444531  1.9972134  2.1988103 -0.47624689 1
    2 -0.32468591  0.6007088  1.3124130 -0.78860284 2
    3  0.09458353 -1.2512714 -0.2651451 -0.59461727 3
    4 -0.89536336 -0.6111659  0.5431941  1.65090747 4
    5 -1.31080153 -1.1854801 -0.4143399 -0.05402813 5

    ggpairs(dat, mapping=aes(color=e), upper=list(continuous=wrap("cor",size=2)),
            columns = c("a","b","c","d"))

Error:

    Error in `$<-.data.frame`(`*tmp*`, "label", value = ": ") :
      replacement has 1 row, data has 0

I would like to color the data points using column "e"

Any ideas?

Answer:

If you factorize `e` then it runs:

    dat$e <- factor(dat$e)
    ggpairs(dat, mapping=aes(color=e), upper=list(continuous=wrap("cor",size=2)),
            columns = c("a","b","c","d"))

But that is a pretty ugly figure not to mention a useless comparison.

If you eliminate the mapping then the code also runs fine:

    ggpairs(dat, upper=list(continuous=wrap("cor",size=2)),
            columns = c("a","b","c","d"))

Question:

Is it possible to change the labels of factor levels without having to change the values in the data.frame?

For example, in the following graph, can I change the labels Female and Male to F and M respectively without having to change the df?

    library(GGally)
    data(tips, package = "reshape")
    pm <- ggpairs(tips, 1:3, columnLabels = c("Total Bill", "Tip", "Sex"))
    pm

Answer:

After

    pm <- ggpairs(tips, 1:3, columnLabels = c("Total Bill", "Tip", "Sex"))

do this:

    levels(pm$data$sex)[levels(pm$data$sex) == "Male"] = "M"
    levels(pm$data$sex)[levels(pm$data$sex) == "Female"] = "F"

You'll get this plot:

It won't change anything in the `tips` dataset, because only the copy of the data stored inside `pm` is modified:

    head(tips)
      total_bill  tip    sex smoker day   time size
    1      16.99 1.01 Female     No Sun Dinner    2
    2      10.34 1.66   Male     No Sun Dinner    3
    3      21.01 3.50   Male     No Sun Dinner    3
    4      23.68 3.31   Male     No Sun Dinner    2
    5      24.59 3.61 Female     No Sun Dinner    4
    6      25.29 4.71   Male     No Sun Dinner    4

Question:

I would like to generate a correlation plot with my "True" variable pairs with all of the rest (People variables). I am pretty sure this has been brought up somewhere but solutions I have found do not work for me.

    library(ggplot2)
    set.seed(0)

    dt = data.frame(matrix(rnorm(120, 100, 5), ncol = 6))
    colnames(dt) = c('Salary', paste0('People', 1:5))
    ggplot(dt, aes(x=Salary, y=value)) +
      geom_point() +
      facet_grid(.~Salary)

Where I got the error: Error: Column `y` must be a 1d atomic vector or a list.

I know one of the solutions is writing out all of the variables in y - which I am trying to avoid because my true data has 15 columns.

Also, I am not entirely sure what "value" and "variable" refer to in the ggplot call. I have seen them a lot in example code.

Any suggestion is appreciated!

Answer:

You want to convert your data from `wide` to `long` format, using `tidyr::gather()` for example. Here is a solution using packages in the `tidyverse` framework:

    library(tidyr)
    library(ggplot2)
    theme_set(theme_bw(base_size = 14))

    set.seed(0)
    dt = data.frame(matrix(rnorm(120, 100, 5), ncol = 6))
    colnames(dt) = c('Salary', paste0('People', 1:5))

    ### convert data frame from wide to long format
    dt_long <- gather(dt, key, value, -Salary)
    head(dt_long)
    #>      Salary     key     value
    #> 1 106.31477 People1  98.87866
    #> 2  98.36883 People1 101.88698
    #> 3 106.64900 People1 100.66668
    #> 4 106.36215 People1 104.02095
    #> 5 102.07321 People1  99.71447
    #> 6  92.30025 People1 102.51804

    ### plot
    ggplot(dt_long, aes(x = Salary, y = value)) +
      geom_point() +
      facet_grid(. ~ key)

    ### if you want to add regression lines
    library(ggpmisc)

    # define regression formula
    formula1 <- y ~ x

    ggplot(dt_long, aes(x = Salary, y = value)) +
      geom_point() +
      facet_grid(. ~ key) +
      geom_smooth(method = 'lm', se = TRUE) +
      stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~")),
                   label.x.npc = "left", label.y.npc = "top",
                   formula = formula1, parse = TRUE, size = 3) +
      coord_equal()

    ### if you also want ggpairs() from the GGally package
    library(GGally)
    ggpairs(dt)

Created on 2019-02-28 by the reprex package (v0.2.1.9000)
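As a side note, the names `key` and `value` are just column names produced by the reshaping step (`reshape2::melt` uses `variable` and `value` by default), which is why they show up so often in example code. With tidyr 1.0.0 or later, the same wide-to-long step can also be written with `pivot_longer()`. The sketch below assumes the same simulated data as above and keeps the `key`/`value` names so the plotting code would not need to change; the row order may differ from what `gather()` returns.

```r
library(tidyr)

set.seed(0)
dt <- data.frame(matrix(rnorm(120, 100, 5), ncol = 6))
colnames(dt) <- c("Salary", paste0("People", 1:5))

# Keep Salary as the x variable; stack the People columns into
# a name column ("key") and a value column ("value").
dt_long <- pivot_longer(dt, cols = -Salary,
                        names_to = "key", values_to = "value")
```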