I have some numerical data that I need to recode into three categories (low, medium and high) according to the interval each value falls in.
To decide which intervals to use, I looked at the quantiles:
quantile(df$cons_conf_id)
0% 25% 50% 75% 100%
-50.8 -42.7 -41.8 -36.4 -26.9
Then I wrote a function to recode the values depending on the range they fall in:
conf_idx <- function(indice_confianza){
  if(indice_confianza > -50.8 & indice_confianza < (-42.7-41.8)/2){
    print("low")
  }else if(indice_confianza > (-42.7-41.8)/2 & indice_confianza < (-41.8-36.4)/2){
    print("medium")
  }else{
    print("high")
  }
}
But when I run it on the column I get the warnings below. It is probably something very basic, but I tried using a while loop and could not get the function to apply to every observation of the variable:
conf_idx(df$cons_conf_id)
[1] "high"
Warning messages:
1: In if (indice_confianza > -50.8 & indice_confianza < (-42.7 - 41.8)/2) { :
the condition has length > 1 and only the first element will be used
2: In if (indice_confianza > (-42.7 - 41.8)/2 & indice_confianza < :
the condition has length > 1 and only the first element will be used
How can I apply the function so that it transforms every numerical observation in the column?
I think a simpler but effective way is to use the following:
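We create the variable level using case_when (from dplyr), which is a vectorized form of if/else, so it evaluates the conditions for every element of the column instead of only the first one. Be careful with the strict inequalities (> and <): I kept them as you wrote them in your code.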
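A minimal sketch of the case_when approach described above, assuming the dplyr package is installed. The small data frame is hypothetical sample data standing in for your df, and the names low_cut and high_cut are introduced here only for readability. Note that because the conditions are strict, a value exactly equal to -50.8 (the minimum) or to one of the cut points matches neither of the first two conditions and ends up in "high"; switch to >= if that is not what you want.

```r
library(dplyr)

# Hypothetical sample data standing in for the real df
df <- data.frame(cons_conf_id = c(-50.8, -46.2, -42.0, -40.1, -36.4, -26.9))

# Cut points copied from the question: midpoints between adjacent quartiles
low_cut  <- (-42.7 - 41.8) / 2   # -42.25
high_cut <- (-41.8 - 36.4) / 2   # -39.1

# case_when checks the conditions in order for every row;
# TRUE ~ "high" is the fallback, like the final else in the question
df <- df %>%
  mutate(level = case_when(
    cons_conf_id > -50.8   & cons_conf_id < low_cut  ~ "low",
    cons_conf_id > low_cut & cons_conf_id < high_cut ~ "medium",
    TRUE                                             ~ "high"
  ))

print(df)
```

With this sample, -46.2 is labelled "low", -42.0 and -40.1 are "medium", and the remaining values, including the minimum -50.8, are "high".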
Another way that is also very effective is to use the cut function.
In this case we also create the variable level, which takes the values low, medium and high based on the 3 intervals defined in the breaks argument. Be careful again with the argument right = FALSE, which controls whether the intervals are closed on the right (right = TRUE, the default) or closed on the left and open on the right (right = FALSE).
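A minimal sketch of the cut approach, using only base R. The data frame is again hypothetical sample data, and using -Inf and Inf as the outer breaks is an assumption made here so that no value falls outside the intervals; the two inner breaks are the midpoints from the question.

```r
# Hypothetical sample data standing in for the real df
df <- data.frame(cons_conf_id = c(-50.8, -46.2, -42.0, -40.1, -36.4, -26.9))

# Four breaks define three intervals; the outer -Inf/Inf are an assumption
breaks <- c(-Inf, (-42.7 - 41.8) / 2, (-41.8 - 36.4) / 2, Inf)

# right = FALSE makes each interval closed on the left and open on the right,
# i.e. [-Inf, -42.25), [-42.25, -39.1), [-39.1, Inf)
df$level <- cut(
  df$cons_conf_id,
  breaks = breaks,
  labels = c("low", "medium", "high"),
  right  = FALSE
)

print(df)
```

Unlike the case_when version, cut returns a factor rather than a character vector, and here the minimum -50.8 is labelled "low" because the first interval has no lower bound.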
I hope that one of the two methods proposed will be useful to you.