Question

Asked: 2020-03-16 04:41:01 +0800 CST

How to convert given numeric values to NA values efficiently in R?

I have two data frames from two surveys (National Household and Adult Health Survey). Many of their variables offer "Don't know/No answer" as a response option, and that option is coded with a number or a pair of numbers.

This response (NS/NC, "don't know/no answer") is coded as follows:

In most variables, the numbers 8 and 9 indicate "does not know" and "does not answer", respectively.
In one variable only 8 is used, since "does not know" is the only option offered. Similarly, another variable uses only 9.
In another variable the codes are 98 and 99.
In others they are 998 and 999.
Other variables use these same values (8, 9, 998...) for options that do not mean "don't know" or "no answer".

What I want is to convert all these NS/NC codes into NA so that I can later handle them together with the rest of the missing values. I am currently doing it as follows:

# Adult Survey variables with an NS/NC response
vars_na_89_A <- c("enf_cron","SM_estres","tipo_act_fis","frec_act_fis","fuma")

vars_na_8_A <- c("sit_lab")
vars_na_9_A <- c("clase_pr")
vars_na_9899_A <- c("niv_est")
vars_na_998999_A <- c("altura","peso")

# Household Survey variables with an NS/NC response
vars_na_89_H <- c("ruido","malos","agua","limpieza", "cont_indus","cont_otras",
                  "escasez_verde","molest_animal" ,"delincuencia")
vars_na_9899_H <- c("n_dormitorios", "ingreso")
vars_na_998999_H <-c("m2")

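# Keep a value only if it is neither 8 nor 9; otherwise set it to NA
datos_adultos[vars_na_89_A] <- lapply(datos_adultos[vars_na_89_A],
                                      function(x) ifelse(x != 8 & x != 9, x, NA))
datos_adultos[vars_na_8_A] <- lapply(datos_adultos[vars_na_8_A],
                                     function(x) ifelse(x != 8, x, NA))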
.
.
.

This way I have to write an ifelse for each type of NS/NC code in each of the two surveys. My question is: how can I get the same result without so many lines of code?

2 Answers

RUBEN lopez · Answer 1 · 2020-03-16T08:54:58+08:00

It is difficult to give you a general answer: I do not know all the columns your df contains, so I cannot give you a reproducible example with your data. I recommend the following code, which only works if you know exactly which columns contain the NS/NC codes, following the patterns you mention. It assumes that in the columns vars_na_89_A, vars_na_8_A, vars_na_9_A, vars_na_9899_A, vars_na_998999_A, vars_na_89_H and vars_na_9899_H the codes 8, 9, 98, 99, 998 and 999 have no other meaning.

It would give wrong results if, for example, 9 meant something other than "NA" in vars_na_8_A.

library(purrr)   # from this library we use map_df(), which is very similar to apply
library(dplyr)
library(stringr) # from this library we use str_detect() to check whether
                 # a column value matches the patterns we want

df_modificar <- df %>%
  select(all_of(c(vars_na_89_A, vars_na_8_A, vars_na_9_A, vars_na_9899_A, vars_na_998999_A,
                  vars_na_89_H, vars_na_9899_H)))
# I do not include the m2 column (vars_na_998999_H) because I do not know whether m2 means NA
# To apply the change across all the columns rather than one at a time,
# I use purrr

# Patterns we want to replace, anchored so that only whole values match
# (an unanchored "8|9" would also match values such as 18 or 80)
patrones <- paste0("^(", paste(c(8, 9, 98, 99, 998, 999), collapse = "|"), ")$")
# Search for the patterns in every column and replace the matches with a real NA
df_limpio <- map_df(df_modificar,
                    function(x) ifelse(str_detect(as.character(x), pattern = patrones), NA, x))
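
As a variation on the same idea, the comparison can be done on exact values with %in% instead of a regular expression, which avoids converting the numbers to text. This is only a sketch in base R: the toy data frame df_ejemplo and the helper codigos_a_na() are made up for illustration and are not part of the original data.

```r
# Hypothetical toy data: 8/9 mean "don't know"/"no answer" in the first two columns
df_ejemplo <- data.frame(
  enf_cron = c(1, 8, 2, 9),
  fuma     = c(9, 1, 18, 8),   # 18 is a valid answer and must be kept
  edad     = c(8, 9, 35, 40)   # not listed below, so it is left untouched
)

# Hypothetical helper: replace the given codes with NA, matching exact values
codigos_a_na <- function(x, codigos) replace(x, x %in% codigos, NA)

# Only the columns named here are modified
cols <- c("enf_cron", "fuma")
df_ejemplo[cols] <- lapply(df_ejemplo[cols], codigos_a_na, codigos = c(8, 9))

df_ejemplo
```

Because each group of columns gets its own call, a column where 9 is a valid answer can simply be processed with codigos = 8.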

Patricio Moracho · Answer 2 · 2020-03-16T19:46:04+08:00

This is a proof of concept of one possible way to reduce the code, although I don't think you gain much. The idea is to define a list of replacements in which you indicate the columns and the values that you want to replace with NA. Assume a data.frame like this:

set.seed(2020)
df <- as.data.frame(matrix(sample(c(1:9,98,99,998,999), 100, replace = TRUE), ncol=10))
df

    V1  V2  V3  V4  V5  V6 V7  V8  V9 V10
1  998   1   2   4 999 999  9 999  98  98
2  998 999 999   5 998   9  3   2   1   1
3    7   8   8   4   8 999  8   5   6 998
4    6   8   8 998 999  99  2   6   9   1
5    8  98   4  99 999   9  1   4 998   1
6    1   2   2   6   2   5  6   3   6   8
7    1   6 998   2 999  99  7   7   8 998
8    4 999   7  98   1  98  8   4   7  98
9   98   2   4   6   6   6  5   4  98   2
10   6   3   2  99   3   2  8   2  98 999

And a replacement list, where we define different criteria:

lista_reemplazo <- list(
  list(cols=c("V1", "V2"), na_vals=c(9,8)),
  list(cols=c("V3"), na_vals=c(998)),
  list(cols=c("V4", "V5", "V6"), na_vals=c(998, 999)),
  list(cols=c("V7", "V8"), na_vals=c(9)),
  list(cols=c("V9", "V10"), na_vals=c(8, 9, 998, 999))
)

For example, the first criterion says that in V1 and V2 the values 8 and 9 are replaced by NA. The next step is to iterate over the list and process each criterion:

for (reemplazo in lista_reemplazo) {
  for (col in reemplazo$cols) {
    df[df[, col] %in% reemplazo$na_vals, col] <- NA
  }
}

df

    V1  V2  V3 V4 V5 V6 V7  V8 V9 V10
1  998   1   2  4 NA NA NA 999 98  98
2  998 999 999  5 NA  9  3   2  1   1
3    7  NA   8  4  8 NA  8   5  6  NA
4    6  NA   8 NA NA 99  2   6 NA   1
5   NA  98   4 99 NA  9  1   4 NA   1
6    1   2   2  6  2  5  6   3  6  NA
7    1   6  NA  2 NA 99  7   7 NA  NA
8    4 999   7 98  1 98  8   4  7  98
9   98   2   4  6  6  6  5   4 98   2
10   6   3   2 99  3  2  8   2 98  NA
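
The same loop can be wrapped in a small function so that a single call handles each survey. The sketch below is only illustrative: the function name reemplazar_por_na() and the toy data frame are assumptions, while the column names and codes are the ones listed in the question.

```r
# Hypothetical helper: apply a list of (cols, na_vals) rules to a data frame
reemplazar_por_na <- function(df, reglas) {
  for (regla in reglas) {
    # Skip columns that are not present instead of failing
    for (col in intersect(regla$cols, names(df))) {
      df[[col]][df[[col]] %in% regla$na_vals] <- NA
    }
  }
  df
}

# Rules for the adult survey, built from the groups listed in the question
reglas_adultos <- list(
  list(cols = c("enf_cron", "SM_estres", "tipo_act_fis", "frec_act_fis", "fuma"),
       na_vals = c(8, 9)),
  list(cols = "sit_lab",  na_vals = 8),
  list(cols = "clase_pr", na_vals = 9),
  list(cols = "niv_est",  na_vals = c(98, 99)),
  list(cols = c("altura", "peso"), na_vals = c(998, 999))
)

# Toy data, made up for illustration only
datos_adultos <- data.frame(
  fuma    = c(1, 8, 9, 2),
  sit_lab = c(8, 9, 3, 1),          # here 9 is a valid answer
  altura  = c(170, 998, 165, 999)
)

datos_adultos <- reemplazar_por_na(datos_adultos, reglas_adultos)
datos_adultos
```

A second list of rules for the household survey would follow the same shape, so each survey needs one list and one function call rather than one ifelse per group.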
