Alejandro Carrera

Asked: 2020-12-20 14:31:46 +0800 CST

How to make this replace function in R work for dataframes?

On many occasions, when working with databases, we find various encodings for unrecorded or unreported data. For example, they can be zeros, -999, or -99, among others, which we can convert to NA for data processing purposes.

With that in mind, I wrote a small function that sets the positions flagged as unrecorded to NA:

'is.na.m<-' <- function(x, value, ...) {
    x[c(value)] = NA
    x
}

We create a test vector:

x <- c(1:3,2:5,1:10, -99, -999, -98)
is.na.m(x) = x%in%c(-99, -999, -98)
x

Output:
[1]  1  2  3  2  3  4  5  1  2  3  4  5  6  7  8  9 10 NA NA NA

Now, I've tried to get this to work on a dataframe, but to no avail:

b <-data.frame(animal=c("perro", "gato", -999), num=c(1,-98,3))
is.na.m(b) = b[b%in%c(-98, -999)]
b

As can be seen, there are no changes in the dataframe:

  animal num
1  perro   1
2   gato -98
3   -999   3

** Note that is.na.m(b) = b %in% c(-98, -999) did not work either.

I also tried indexing the columns, but that didn't work either:

is.na.m(b[,1:ncol(b)]) = b[,1:ncol(b)%in%c(-98, -999)]
b

Now, when I try to use lapply, it gives me an error:

b <- unlist(lapply(b, is.na.m(b)))
Error in is.na.m(b) : could not find function "is.na.m"

The question is: what adjustments must I make to the function so that it works correctly on all the columns of a dataframe?

Thanks in advance for any guidance.

1 Answer

Patricio Moracho · Answer 1 · 2020-12-21T05:56:33+08:00

Alejandro, the main problem I see is a misunderstanding of the %in% operator. It is a binary operator built on the match() function, whose documentation describes the first input parameter as follows:

x vector or NULL: the values to be matched. Long vectors are supported.

That is, the expected input is a vector, or even a two-dimensional one (a matrix or array), but not a data.frame. So x %in% c(-99, -999, -98) works as you expect and returns a logical vector of the same length as the input vector, but b %in% c(-98, -999) does not, because b is a data.frame. The interesting and confusing part is that it does not raise an error: it returns data, just not what you expect. The result is a vector of FALSE values with one element per column of the data.frame:

> b
  animal num
1  perro   1
2   gato -98
3   -999   3
> b %in% c(-98, -999)
[1] FALSE FALSE
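
As a quick check of the contrast described above, the following sketch compares %in% on a vector with %in% on a data.frame, and then applies it column by column with sapply(). The name missing_codes is introduced here only for illustration, and stringsAsFactors = FALSE is set explicitly so that the result does not depend on the R version.

```r
missing_codes <- c(-98, -999)  # illustrative name, not from the question

# On a vector, %in% returns one logical value per element
x <- c(1, -98, 3)
x %in% missing_codes

b <- data.frame(animal = c("perro", "gato", -999),
                num = c(1, -98, 3),
                stringsAsFactors = FALSE)

# On a data.frame, the result has one value per column, not per cell
length(b %in% missing_codes)

# Applied to each column separately, %in% sees ordinary vectors again
per_column <- sapply(b, function(col) col %in% missing_codes)
per_column
```

Here per_column is a logical matrix with one column per variable, so it could be used as an index in the same way as the comparison-based approach.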

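I owe you a full explanation of this behavior: match() is an internal function written in C, and I lack much of the background on the R <-> C API. One likely reading is that a data.frame is a list of columns, so each column is treated as a single element to match, which would explain a result with one value per column. In any case, the bottom line is that you can't use match() the way you do.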

The other problem: I don't know whether you have noticed two details that deserve attention:

a. In c("perro", "gato", -999), automatic coercion produces a character vector: the number -999 is promoted to the most general data type, in this case a string.

b. In R versions before 4.0.0, data.frame() converts strings to factors by default, which adds a bit more complexity. If we don't want this behavior, we should use stringsAsFactors = FALSE.
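
Both points can be verified with a short sketch. The variable names v, as_factor and as_string are chosen here for illustration, and stringsAsFactors is passed explicitly in both calls so the outcome is the same regardless of the default in your R version.

```r
# (a) Mixing strings and a number in c() yields a character vector
v <- c("perro", "gato", -999)
class(v)  # "character": the number became the string "-999"

# (b) The same vector stored as a factor or as plain strings
as_factor <- data.frame(animal = v, stringsAsFactors = TRUE)
as_string <- data.frame(animal = v, stringsAsFactors = FALSE)

class(as_factor$animal)  # "factor"
class(as_string$animal)  # "character"
```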

I mention this because you want to do a match() against numeric values, so: what should the behavior be when we compare against strings? In fact, since a data.frame can hold columns of different types, the same question should be asked for each possible data type.

Now, suppose the following scenario:

b <- data.frame(animal=c("perro", "gato", "-999"), 
                num=c(1,-98,3), 
                num2=c(1,-98,3), 
                stringsAsFactors = FALSE)
b

  animal num num2
1  perro   1    1
2   gato -98  -98
3   -999   3    3

Let's solve the first problem: how to replace the values -98 and "-999", a number and a string respectively. We have already seen that %in% does not work for us on a data.frame, so what we can do is compare with == for each value we are looking for:

lapply(c(-99, -999, -98), `==`, b)

This generates a list in which each element is a logical matrix with the dimensions of the data.frame, one for each value sought. In addition, == performs automatic coercion, so we can successfully compare (if that is what you want) the string "-999" with the number -999. The idea then is to combine the matrices into one, where each TRUE marks a position that we want to replace with NA:

Reduce("|",lapply(c(-99, -999, -98), `==`, b))

     animal   num  num2
[1,]  FALSE FALSE FALSE
[2,]  FALSE  TRUE  TRUE
[3,]   TRUE FALSE FALSE
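
To make the intermediate steps visible, here is a minimal sketch that checks the coercion done by == and builds the per-value matrices before combining them. The names masks and combined are illustrative and do not appear in the original answer.

```r
b <- data.frame(animal = c("perro", "gato", "-999"),
                num = c(1, -98, 3),
                num2 = c(1, -98, 3),
                stringsAsFactors = FALSE)

# == converts the number to a string before comparing it with a string
"-999" == -999

# One logical matrix per searched value
masks <- lapply(c(-99, -999, -98), `==`, b)
length(masks)
dim(masks[[1]])

# Element-wise OR of the three matrices; Reduce("|", masks) does the same
combined <- masks[[1]] | masks[[2]] | masks[[3]]
which(combined, arr.ind = TRUE)  # row and column of each match
```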

Now, with Reduce combining the matrices through a logical OR, we have the positions that need to be replaced. Finally:

b[Reduce("|",lapply(c(-99, -999, -98), `==`, b))] <- NA
b
  animal num num2
1  perro   1    1
2   gato  NA   NA
3   <NA>   3    3

Now your example should work:

'is.na.m<-' <- function(x, value, ...) {
    x[value] <- NA
    x
}

b <- data.frame(animal=c("perro", "gato", "-999"), 
                num=c(1,-98,3), 
                num2=c(1,-98,3), 
                stringsAsFactors = FALSE)

is.na.m(b) <- Reduce("|",lapply(c(-99, -999, -98), `==`, b))
b
  animal num num2
1  perro   1    1
2   gato  NA   NA
3   <NA>   3    3
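
Since the question also mentions lapply, a column-wise alternative may be worth sketching. This is not the approach used above: it relies on %in% applied to each column separately, and the function name replace_codes_with_na is hypothetical.

```r
# Hypothetical helper: sets every value found in `codes` to NA, column by column
replace_codes_with_na <- function(df, codes) {
  # df[] <- ... keeps the data.frame structure while replacing its columns
  df[] <- lapply(df, function(col) {
    col[col %in% codes] <- NA
    col
  })
  df
}

b <- data.frame(animal = c("perro", "gato", "-999"),
                num = c(1, -98, 3),
                num2 = c(1, -98, 3),
                stringsAsFactors = FALSE)

b <- replace_codes_with_na(b, c(-99, -999, -98))
b
```

Because each column is handled on its own, the column types are preserved: num and num2 stay numeric and animal stays character.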

Comments:

It is a bit strange to use an assignment function where the value you pass is not exactly the value to assign; something like is.na.m(b) <- NA would read more naturally, but your example is perfectly valid.
You don't need x[c(value)]; x[value] is enough.
Note that when you use is.na.m(b) without the assignment, you are calling a different function: not is.na.m<-() but is.na.m(). Since that one is not defined, that is where the error comes from when you try to use it with lapply.
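
The last point can be illustrated with a small sketch that shows the assignment syntax next to an explicit call of the replacement function. The vectors x and y are made up for the example.

```r
`is.na.m<-` <- function(x, value, ...) {
  x[value] <- NA
  x
}

x <- c(1, -99, 3)
y <- x

# Replacement syntax...
is.na.m(x) <- x == -99

# ...is roughly shorthand for calling `is.na.m<-` and reassigning the result
y <- `is.na.m<-`(y, value = y == -99)

identical(x, y)

# Only the replacement form exists; no plain function is.na.m was defined
exists("is.na.m<-")
exists("is.na.m")
```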
