Preliminary:
First of all, I apologize for not uploading my full code; it's a bit long and I don't know how to upload a file here. If anyone knows how, please tell me so I can update my question.
I downloaded an Excel file from this website:
datosabiertos.segob.gob.mx/DatosAbiertos/sesnsp_Incidencia_delictiva_Fuero_Federal/IDEFF
I saved it as CSV and, after several manipulations of the data, I have this:
> head(d1217full)
   año     mes comercio contra la salud narcomenudeo otros otros_lfcd otros_lgs transporte posesion produccion trafico total
1 2012   enero     0.16               0         5.87  0.13       0.02      0.00       0.02     4.34       0.07       0 10.61
2 2012 febrero     0.04               0         5.30  0.13       0.02      0.00       0.00     4.68       0.02       0 10.19
3 2012   marzo     0.07               0         4.81  0.07       0.05      0.00       0.00     6.12       0.07       0 11.21
4 2012   abril     0.40               0         4.48  0.13       0.04      0.00       0.00     4.85       0.00       0  9.90
5 2012    mayo     0.36               0         5.91  0.36       0.07      0.00       0.02     1.95       0.05       0  8.73
6 2012   junio     0.26               0         5.30  0.15       0.22      1.48       0.02     2.17       0.00       0  9.59
Applying str gives:
> str(d1217full)
'data.frame': 72 obs. of 13 variables:
$ año : Factor w/ 6 levels "2012","2013",..: 1 1 1 1 1 1 1 1 1 1 ...
$ mes : Factor w/ 12 levels "enero","febrero",..: 1 2 3 4 5 6 7 8 9 10 ...
$ comercio : num 0.16 0.04 0.07 0.4 0.36 0.26 0.09 0.04 0.04 0.05 ...
$ contra la salud: num 0 0 0 0 0 0 0 0 0 0.02 ...
$ narcomenudeo : num 5.87 5.3 4.81 4.48 5.91 5.3 4.27 6.6 6.29 9.84 ...
$ otros : num 0.13 0.13 0.07 0.13 0.36 0.15 0.13 0.09 0.02 0.09 ...
$ otros_lfcd : num 0.02 0.02 0.05 0.04 0.07 0.22 0.11 0.11 0.13 0.11 ...
$ otros_lgs : num 0 0 0 0 0 1.48 4.12 0.2 0.46 0.51 ...
$ transporte : num 0.02 0 0 0 0.02 0.02 0 0.02 0.02 0.02 ...
$ posesion : num 4.34 4.68 6.12 4.85 1.95 2.17 2.11 1.91 1.71 1.73 ...
$ produccion : num 0.07 0.02 0.07 0 0.05 0 0.02 0 0 0 ...
$ trafico : num 0 0 0 0 0 0 0 0 0 0 ...
$ total : num 10.61 10.19 11.21 9.9 8.73 ...
What I want is a plot with the following characteristics:
- The x-axis shows the year and month, from 2012/01 to 2017/12.
- The y-axis shows, as separate lines, the variables from comercio through total, that is, 11 lines in all, with "total" emphasized.
Since my time variables are factors, ggplot doesn't let me plot them properly. For example,
ggplot(d1217full, aes(año, total)) +
geom_line()
This results in a plot that is clearly wrong.
And that is before even considering that the rest of the variables would still need to be included in the plot.
I thought about creating a time vector from scratch, adding it to my data frame, and then plotting with something like:
añosgto <- c(seq(as.Date("2012/1/1"), by = "month", length.out = 72))
añosgto <- as.character(añosgto)
temp <- strsplit(añosgto, "-")
temp <- matrix(unlist(temp), ncol=3, byrow=T)
temp <- as.data.frame(temp)
temp <- mutate(temp, año_def=V1, mes_def=V2)
temp <- temp[,-(1:3)]
head(temp)
D1217 <- cbind(d1217full, temp)
D1217 <- select(D1217, -(1:2))
ggplot(D1217, aes(str_c(año_def, "/", str_pad(mes_def, 2, pad = 0)),
total, group=total)) +
geom_line() +
scale_x_discrete(breaks = c("2012/12", "2013/12", "2014/12", "2015/12", "2016/12", "2017/12"))
Which results in this:
As you can see, I've tried graphing without success so any guidance would be greatly appreciated.
UPDATE:
I created a repository on GitHub containing the database and the code.
Since you're already using dplyr, let's make the most of it.
The result:
Detail:
- mutate(order = paste0(año, "-", sprintf("%02d", which(meses %in% mes)))) creates a new column for the x axis. It is easier this way, since año and mes are factors and using them directly would complicate things a bit.
- gather(variable, valor, -año, -mes, -order) reorganizes the structure so that there is one observation for each of the columns, which ggplot will then treat as variables and draw as independent lines.
- The rest of the code is simply the plot configuration: we set the values of x and y, which variable defines the groups and the color, and finally the vertical orientation of the x-axis labels.
- With scale_x_discrete(breaks=unique(paste0(as.character(d1217full$año), "-12"))) we define the breaks on the x axis so that they always fall on December of each year, as you mention in your comments.
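Putting the steps above together, a minimal sketch might look like this. It is not the exact code from the answer, and it rests on several assumptions: `demo` is a small made-up stand-in for d1217full with only two of the series plus total; `meses` is assumed to be a vector of month names in calendar order; match() is used instead of which(meses %in% mes) so that the month number is looked up row by row; and the thicker line for total is just one possible way to emphasize it.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Assumption: month names in calendar order (the `meses` vector used in the answer).
meses <- c("enero", "febrero", "marzo", "abril", "mayo", "junio",
           "julio", "agosto", "septiembre", "octubre", "noviembre", "diciembre")

# Hypothetical stand-in for d1217full: two years, two series and their total.
set.seed(1)
demo <- data.frame(
  año          = factor(rep(2012:2013, each = 12)),
  mes          = factor(rep(meses, times = 2), levels = meses),
  narcomenudeo = round(runif(24, 4, 10), 2),
  posesion     = round(runif(24, 1, 6), 2)
)
demo$total <- demo$narcomenudeo + demo$posesion

largo <- demo %>%
  # Build a sortable "YYYY-MM" key; match() returns each row's month number.
  mutate(order = paste0(año, "-", sprintf("%02d", match(mes, meses)))) %>%
  # Long format: one row per (month, variable) pair.
  gather(variable, valor, -año, -mes, -order)

p <- ggplot(largo, aes(x = order, y = valor, group = variable, color = variable)) +
  geom_line() +
  # Redraw "total" with a thicker line (linewidth needs ggplot2 3.4.0 or later).
  geom_line(data = filter(largo, variable == "total"), linewidth = 1.2) +
  # Label only December of each year on the x axis.
  scale_x_discrete(breaks = unique(paste0(as.character(demo$año), "-12"))) +
  # Rotate the x-axis labels to vertical.
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))

print(p)
```

With the real data, `demo` would be replaced by d1217full, and all eleven columns would be gathered automatically. In recent versions of tidyr, gather() is superseded by pivot_longer(), which could be used in its place.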