I want to combine several columns into a single column, for example:
col1 <- c("Uno", NA, "tres", NA)
col2 <- c(NA, "dos", NA, NA)
col3 <- c(NA,NA,NA, "cuatro")
df <- data.frame(col1,col2, col3)
library(sqldf)
df2<-sqldf("select *, coalesce(col1,col2,col3 ) UNIDOS from df")
  col1 col2   col3 UNIDOS
1  Uno <NA>   <NA>    Uno
2 <NA>  dos   <NA>    dos
3 tres <NA>   <NA>   tres
4 <NA> <NA> cuatro cuatro
But if there are 300 columns instead of 3, how can I write this part
coalesce(col1,col2,col3......col300 )
without typing all 300 names?
I have tried:
unir<-paste0("col", (seq(1,3)))
sqldf("select *, coalesce(unir) UNIDOS from df")
But that does not give the expected result.
Using base R
It could be done as follows:
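The code for this step is not shown here, so what follows is only a minimal sketch that matches the description below: apply over df2[, unir] with MARGIN = 1. The body of function(x), which takes the first non-NA value in each row, is an assumption, and df2 is rebuilt from the question's data so the example runs on its own.

```r
# Rebuild the example data so the sketch is self-contained.
col1 <- c("Uno", NA, "tres", NA)
col2 <- c(NA, "dos", NA, NA)
col3 <- c(NA, NA, NA, "cuatro")
df2 <- data.frame(col1, col2, col3, stringsAsFactors = FALSE)

# Names of the columns to join; seq(1, 300) would cover 300 columns.
unir <- paste0("col", seq(1, 3))

# Assumption: keep the first non-NA value of each row, or NA if there is none.
df2$UNIDOS <- apply(
  X = df2[, unir],
  MARGIN = 1,
  FUN = function(x) {
    no_na <- x[!is.na(x)]
    if (length(no_na) > 0) no_na[1] else NA_character_
  }
)
```

Because unir is built with paste0, going from 3 columns to 300 only means changing the upper bound of seq.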
Several things happen inside the apply call that are worth mentioning:
df2[,unir] selects the columns of the data.frame that you want to join.
MARGIN=1 tells apply to iterate over the rows of that data.frame; if it were 2, it would iterate over the columns.
function(x) is the action we want to perform on each row of the data.frame.
You can always check the help with ?apply to learn more.

Using tidyr
One advantage I see in using tidyr is that we can select the variables we want to join with tidy-select, a very useful tool that, for example, lets us select a range of contiguous variables with the : operator.
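As a rough sketch of the tidyr approach, assuming the function intended here is tidyr::unite (the original code is not shown), the range col1:col3 can be passed directly as a tidy-select expression. The na.rm argument requires tidyr 1.0.0 or later.

```r
library(tidyr)

# Example data from the question, kept as character columns.
col1 <- c("Uno", NA, "tres", NA)
col2 <- c(NA, "dos", NA, NA)
col3 <- c(NA, NA, NA, "cuatro")
df <- data.frame(col1, col2, col3, stringsAsFactors = FALSE)

# col1:col3 is tidy-select for "every column from col1 through col3".
# remove = FALSE keeps the original columns; na.rm = TRUE drops NA values
# before pasting, so each row keeps only its non-missing value(s).
df2 <- unite(df, "UNIDOS", col1:col3, sep = "", remove = FALSE, na.rm = TRUE)
```

With 300 columns the selection would become col1:col300. Note that unite pastes values together, so a row with more than one non-NA value would end up with all of them concatenated, which differs from what coalesce does.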