I have a data set that more or less looks like this (note that this is a minimal example and the limitedness of the data may affect the visual appearance of the obtained graph):
y x z g
1 0 0 1
2 1 0 1
2 0 0.5 1
3 1 0.5 1
1.5 0 1 1
2 1 1 1
2 0 0 2
2 1 0 2
3 0 0.5 2
3 1 0.5 2
0.5 0 1 2
2 1 1 2
2 0 0 3
2 1 0 3
1 0 0.5 3
1 1 0.5 3
0.5 0 1 3
0.5 1 1 3
I would like to plot the mean of y for each possible combination of x and z, with y on the vertical axis and g on the horizontal axis.
So far I have used the following code:
means <- tapply(y,g,mean)
plot(means, col="red",pch=18, ylim=c(0,3), type = 'l', ylab='y', xlab="g")
Then, for each subset of the data (one per combination of x and z, which I create manually with subset), I draw a new line on the graph in a different color. I use this code:
lines(means, col="black",pch=18)
I would like to be able to do this in a less cumbersome way, using ggplot. I would also like to add 95% confidence intervals.
Thank you very much.
I assume you want a point for each mean of y conditional on g, x and z. With your test data it would be nine points, each with an error bar showing the 95% CI, and a line joining the points along g when they have the same combination of x and z. In that case you could do it like this:
Which produces this graph:
It can be improved: the error bars overlap and it is not very readable. How it looks will depend on the actual data you are working with.
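As a rough sketch of the approach described above (not necessarily the exact code behind the graph), something like the following could work. It assumes the data is in a data frame called dat (a hypothetical name) and that ggplot2 is installed. The summary is computed with base R's aggregate, and the interval is the usual t-based one, mean ± qt(0.975, n - 1) * sd / sqrt(n). The helper columns half and combo are names made up for this example. Note that in the data as posted each combination of g, x and z occurs only once, so the standard deviation is NA and no error bars are drawn; they should only appear with real data that has several observations per combination.

```r
library(ggplot2)

# Sample data from the question; `dat` is a hypothetical name.
dat <- data.frame(
  y = c(1, 2, 2, 3, 1.5, 2,  2, 2, 3, 3, 0.5, 2,  2, 2, 1, 1, 0.5, 0.5),
  x = rep(c(0, 1), times = 9),
  z = rep(rep(c(0, 0.5, 1), each = 2), times = 3),
  g = rep(1:3, each = 6)
)

# Mean, standard deviation and count of y for every (g, x, z) combination.
stats <- aggregate(y ~ g + x + z, data = dat,
                   FUN = function(v) c(mean = mean(v), sd = sd(v), n = length(v)))
stats <- do.call(data.frame, stats)  # flatten into columns y.mean, y.sd, y.n

# Half-width of the 95% CI (t distribution). With a single observation
# sd() is NA, so the half-width is NA as well.
stats$half <- qt(0.975, df = pmax(stats$y.n - 1, 1)) * stats$y.sd / sqrt(stats$y.n)

# One label per (x, z) combination, used for colour and for joining points.
stats$combo <- interaction(stats$x, stats$z, sep = ", ")

# Shift the groups slightly sideways so the error bars overlap less.
pd <- position_dodge(width = 0.2)

p <- ggplot(stats, aes(x = g, y = y.mean, colour = combo, group = combo)) +
  geom_line(position = pd) +
  geom_point(position = pd) +
  geom_errorbar(aes(ymin = y.mean - half, ymax = y.mean + half),
                width = 0.1, position = pd) +
  scale_x_continuous(breaks = sort(unique(stats$g))) +
  labs(x = "g", y = "mean of y", colour = "x, z")

p
```

The position_dodge call is one possible way to reduce the overlap of the error bars mentioned above. Faceting by x or z (for example with facet_wrap(~ z)) would be another option if the plot is still hard to read.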