I have the following data frame:
a b d
SI HOY 1
SI AYER 2
SI <NA> 1
NO AYER 1
SI HOY 2
<NA> HOY 2
NO HOY 2
NO AYER NA
I want to create a new variable called "cond" that takes the value 5 when column a is SI and column b is AYER, the value 10 when column a is NO and column b is HOY, and the value 20 in all other cases; if there is an NA, it should take the value NA.
The code I have made is the following:
df$cond<-ifelse(df$a=="SI" & df$b=="AYER", 5,
ifelse(df$a=="NO" & df$b=="HOY", 10,
ifelse(df$a==""|df$b=="", NA, 20)))
df
Which results in the following:
a b d cond
SI HOY 1 20
SI AYER 2 5
SI <NA> 1 NA
NO AYER 1 20
SI HOY 2 20
<NA> HOY 2 NA
NO HOY 2 10
NO AYER NA 20
My question is: how can I do the same thing with some other function that lets me shorten the code?
In my real database I have to create a variable built from 15 conditionals like the ones shown above.
Thank you very much in advance.
An interesting option is the function dplyr::case_when(). You will write a little less (not much), but above all you will gain code clarity.
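The answer's original code is not shown here, so the following is only a minimal sketch of what such a case_when() call might look like, using the data from the question. The explicit is.na() branch is an assumption added so that rows with a missing a or b give NA; without it, case_when() treats an NA condition as not matched and those rows would fall through to the default of 20.

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  a = c("SI", "SI", "SI", "NO", "SI", NA, "NO", "NO"),
  b = c("HOY", "AYER", NA, "AYER", "HOY", "HOY", "HOY", "AYER"),
  d = c(1, 2, 1, 1, 2, 2, 2, NA)
)

df <- df %>%
  mutate(
    cond = case_when(
      is.na(a) | is.na(b)     ~ NA_real_,  # assumption: any NA in a or b gives NA
      a == "SI" & b == "AYER" ~ 5,
      a == "NO" & b == "HOY"  ~ 10,
      TRUE                    ~ 20         # default for every other combination
    )
  )

df
```

Conditions are evaluated in order and the first match wins, so each of the 15 real conditionals would become one more line before the TRUE default.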
Here, mutate() creates the new column called cond.
case_when() sets the conditions in the form <condition> ~ <desired value>; the default value is set with TRUE ~ 20.
%>% is the pipe operator; if you are not familiar with it, let me know.
You could try creating a table with the conditions.
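The table from this answer is not shown, so as a rough sketch, a double-entry table for the question's rules could be a named matrix. The name lookup and the matrix layout are assumptions for illustration, not the answerer's actual code.

```r
# Hypothetical double-entry table: rows are values of a, columns are values of b
lookup <- matrix(
  c(20,  5,   # a = "SI": b = "HOY" -> 20, b = "AYER" -> 5
    10, 20),  # a = "NO": b = "HOY" -> 10, b = "AYER" -> 20
  nrow = 2, byrow = TRUE,
  dimnames = list(a = c("SI", "NO"), b = c("HOY", "AYER"))
)

lookup
```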
Then use a function that assigns to each row of the data frame the value from that double-entry table.
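One possible way to write such a function, sketched under the assumption that the table is a named matrix, is to turn each row's a and b into row and column positions with match() and index the matrix with them. The names lookup and assign_cond are hypothetical. Rows where a or b is NA get an NA position, and matrix indexing then returns NA for them.

```r
# Hypothetical double-entry table (rows: values of a, columns: values of b)
lookup <- matrix(
  c(20, 5, 10, 20), nrow = 2, byrow = TRUE,
  dimnames = list(a = c("SI", "NO"), b = c("HOY", "AYER"))
)

# Hypothetical helper: look up one value per row of the data frame
assign_cond <- function(a, b, table) {
  i <- match(a, rownames(table))  # row position, NA when a is NA
  j <- match(b, colnames(table))  # column position, NA when b is NA
  table[cbind(i, j)]              # index-matrix lookup; NA rows yield NA
}

# Example data copied from the question
df <- data.frame(
  a = c("SI", "SI", "SI", "NO", "SI", NA, "NO", "NO"),
  b = c("HOY", "AYER", NA, "AYER", "HOY", "HOY", "HOY", "AYER"),
  d = c(1, 2, 1, 1, 2, 2, 2, NA)
)

df$cond <- assign_cond(df$a, df$b, lookup)
df
```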
Then it's just a matter of expanding the double entry table with the additional conditions and it should work.
Greetings.