Wednesday, February 5, 2014

ggplot2: Histogram with jittered stripchart

Here is an example of a histogram with a vertically jittered stripchart drawn along the x-axis, just below the bars.

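library(ggplot2)
library(ggplot2movies)  # movies moved to this package in ggplot2 2.0.0
movies <- movies[1:1000, ]
m <- ggplot(movies, aes(x = rating))
# strip of points centred at y = -2; width = 0 keeps the jitter vertical only
# (on ggplot2 >= 3.4.0, write linewidth = 1 instead of size = 1 for the bars)
m + geom_histogram(binwidth = 0.2, colour = "darkgreen", fill = "white", size = 1) +
  geom_point(aes(y = -2), position = position_jitter(width = 0, height = 0.8), size = 1)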




Alternatively, using the geom_rug function:

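library(ggplot2)
library(ggplot2movies)  # movies moved to this package in ggplot2 2.0.0
movies <- movies[1:1000, ]
m <- ggplot(movies, aes(x = rating))
# rug along the bottom axis (sides = "b"); position = "jitter" spreads overlapping ticks
m + geom_histogram(binwidth = 0.2, colour = "darkgreen", fill = "white", size = 1) +
  geom_rug(aes(y = -2), position = "jitter", sides = "b")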



Of course, this simplistic method needs tuning: adjust the vertical position of the stripchart or rug (y = -2 here) and the amount of jitter to suit the data.
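One way to avoid hand-tuning the offset is to derive it from the height of the tallest bar. The following is only a sketch: the simulated ratings stand in for the movies data, and the names top, strip_y and strip_h, as well as the 6% and 4% proportions, are my own assumptions rather than anything ggplot2 prescribes.

```r
library(ggplot2)

# Made-up stand-in for the movie ratings, so the sketch runs on its own
set.seed(42)
ratings <- data.frame(rating = rnorm(1000, mean = 6, sd = 1.5))

h <- ggplot(ratings, aes(x = rating)) +
  geom_histogram(binwidth = 0.2, colour = "darkgreen", fill = "white")

# Height of the tallest bar, read back from the computed histogram layer
top <- max(layer_data(h, 1)$count)

# Assumed proportions: strip centred 6% of that height below zero,
# jittered by +/- 4% so the points stay underneath the bars
strip_y <- -0.06 * top
strip_h <- 0.04 * top

h_strip <- h +
  geom_point(aes(y = strip_y),
             position = position_jitter(width = 0, height = strip_h),
             size = 1)
print(h_strip)
```

Because both values scale with the bar height, the strip should keep roughly the same relative size when the bin width or the number of rows changes.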

Sunday, February 2, 2014

Boxplot with mean and standard deviation in ggplot2 (plus jitter)

When you create a boxplot in R, it automatically computes the median, the first and third quartiles ("hinges") and, if notches are requested, an approximate 95% confidence interval of the median ("notches").
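To look at these default statistics directly, base R's boxplot.stats() returns them for a numeric vector. A minimal sketch, using a made-up sample x:

```r
# Made-up sample with one extreme value
x <- c(2, 4, 4, 5, 7, 9, 30)

s <- boxplot.stats(x)

# Lower whisker, lower hinge, median, upper hinge, upper whisker
print(s$stats)

# Notch limits: median -/+ 1.58 * IQR / sqrt(n)
print(s$conf)

# Points beyond the whiskers, drawn individually as outliers
print(s$out)
```

These five values in s$stats are what the custom summary function further down replaces.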

But suppose we want to replace these defaults with the mean, the mean + standard deviation, the mean - standard deviation, and the minimum and maximum values.
Here is an example using the ggplot2 package. The individual values are also plotted as horizontally jittered points.
library(ggplot2)
# create fictitious data (fixed seed so the plot is reproducible)
set.seed(1)
a <- runif(10)
b <- runif(12)
c <- runif(7)
d <- runif(15)
# data groups
group <- factor(rep(1:4, c(10, 12, 7, 15)))
# data frame
mydata <- data.frame(value = c(a, b, c, d), group = group)
# function for computing min, mean - SD, mean, mean + SD and max values
min.mean.sd.max <- function(x) {
  r <- c(min(x), mean(x) - sd(x), mean(x), mean(x) + sd(x), max(x))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}
# ggplot code
p1 <- ggplot(mydata, aes(x = group, y = value))
p1 <- p1 + stat_summary(fun.data = min.mean.sd.max, geom = "boxplot") +
  # height = 0 keeps the jitter horizontal, so the plotted values stay exact
  geom_jitter(position = position_jitter(width = 0.2, height = 0), size = 3) +
  ggtitle("Boxplot with mean, mean +/- SD, min and max values") +
  xlab("Groups") + ylab("Values")
print(p1)
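It can help to print the five values per group before plotting them. The sketch below repeats the summary function so that it runs on its own; the two groups in values are made up, and box_inside_range is a hypothetical check of mine, not part of ggplot2. It flags groups where mean - SD or mean + SD falls outside the observed range, which can happen with small or skewed samples and makes the box extend past the whiskers.

```r
# Same summary function as in the post, repeated so this runs on its own
min.mean.sd.max <- function(x) {
  r <- c(min(x), mean(x) - sd(x), mean(x), mean(x) + sd(x), max(x))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# Made-up groups with fixed values; g2 is strongly skewed
values <- list(g1 = c(0.2, 0.4, 0.6, 0.8), g2 = c(1, 1, 1, 10))

# One column per group, one row per boxplot element
summary_table <- sapply(values, min.mean.sd.max)
print(round(summary_table, 2))

# Hypothetical sanity check: TRUE where mean +/- SD stays inside [min, max]
box_inside_range <- summary_table["lower", ] >= summary_table["ymin", ] &
  summary_table["upper", ] <= summary_table["ymax", ]
print(box_inside_range)
```

For g2 the mean is 3.25 and the standard deviation 4.5, so mean - SD lies below the minimum of 1 and the check returns FALSE for that group.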