Question

Alejandro Carrera

Asked: 2020-04-17 15:52:53 +0800 CST

Error applying lapply in a for loop in R. How to correct it?

I am trying to apply a simple normalization function to the numeric variables of R's iris dataset, using a for loop and lapply, in order to obtain a new data frame containing only the normalized variables:

data(iris)

normal <- function (x) {
num <- x - min(x)
den <- max(x) - min(x)
return (num/den)
}

iris_n <- data.frame()

for (i in 1:length(iris)){
if (is.numeric(iris[,i])) {
}
iris_n[,i] <- as.data.frame(lapply(iris[,i], normal))
}

Error in Summary.factor(1L, na.rm = FALSE) : 
  ‘min’ not meaningful for factors
In addition: There were 50 or more warnings (use warnings() to see the first 50)

iris_n

[1] NaN.   NaN..1 NaN..2 NaN..3
<0 rows> (or 0-length row.names)

Next, I tried calling lapply directly:

iris_n <- as.data.frame(lapply(iris, function(x) {if (is.numeric(x))  normal(x)}))

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 150, 0

No matter how I look at it, I can't find the error. Any guidance would be greatly appreciated. (Note: I am interested in solving this specific problem; I know there are other ways to achieve what I need.)

2 Answers

mpaladino · Answer 1 · 2020-04-17T17:33:55+08:00

The options Patricio gives you are very good, and they explain well how the *apply() functions work.

Two alternatives:

If you don't know in advance which variables are non-numeric, so you cannot drop them directly in the function call, you can apply the function only to the numeric ones like this:

    lapply(iris[sapply(iris, is.numeric)], normal)

sapply() is a relative of lapply(), except that it returns a vector instead of a list when the result can be simplified. Here we use it to ask which columns of iris are numeric (those return TRUE, the others return FALSE). Because it sits inside the square brackets, that logical vector subsets the data, keeping only the numeric columns. lapply() then applies normal to those columns, and you get a list that is easy to coerce into a data.frame. Mind you, you lose the factor Species.
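As a minimal sketch of the steps just described, the snippet below builds the logical mask with sapply(), subsets iris with it, and coerces the lapply() result back into a data.frame. The names is_num and iris_n are only illustrative, and normal is the function from the question, written more compactly.

```r
data(iris)

# Min-max normalization, as defined in the question
normal <- function(x) (x - min(x)) / (max(x) - min(x))

# Named logical vector: TRUE for numeric columns, FALSE for Species
is_num <- sapply(iris, is.numeric)

# Keep only the numeric columns, normalize each one, rebuild a data.frame
iris_n <- as.data.frame(lapply(iris[is_num], normal))

str(iris_n)  # four numeric columns, Species is gone
```

Storing the mask in its own variable makes it easy to inspect which columns were selected before applying the function.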

If you don't mind using an extra package, there is a very neat syntax option in purrr. The function that replaces lapply() or apply() in this case is modify_if(). As its name implies, it modifies an element of a list (iris is a list because all data.frames are lists, although not all lists are data.frames) if a condition is met. In this case, the condition is that the column is numeric. The interesting thing is that it leaves intact the columns for which the condition is not met.

Another distinctive feature of modify_if(), and of the whole modify_*() family, is that it tries to return data with the same structure it receives. That is, if the function receives a data.frame (in this case, iris), it will try to return another data.frame. There is no need to call as.data.frame() afterwards. Very elegant:

library(purrr)
modify_if(iris, is.numeric, normal)

Normalized and with all the original columns.
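For comparison, here is a rough base R sketch that gives a similar result without any extra package. It is not what purrr does internally, just the same idea: overwrite only the numeric columns of a copy, so the data.frame structure and the Species factor are preserved. The names num_cols and iris_n are illustrative.

```r
data(iris)

# Min-max normalization, as defined in the question
normal <- function(x) (x - min(x)) / (max(x) - min(x))

# Work on a copy so the original iris stays untouched
iris_n <- iris

# Logical mask of numeric columns
num_cols <- sapply(iris_n, is.numeric)

# Replace only those columns; everything else is kept as it was
iris_n[num_cols] <- lapply(iris_n[num_cols], normal)

head(iris_n)
```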

However, I think the best option, if you are willing to depend on an extra package, is to use dplyr::mutate_if().

library(dplyr)
iris %>% 
  mutate_if(is.numeric, normal)
            #condition  #function

This guarantees that the result is either a data.frame or an error. That may be better than it sounds...

Patricio Moracho · Answer 2 · 2020-04-17T17:00:44+08:00

First of all, you have this problem:

 Error in Summary.factor(1L, na.rm = FALSE) : 
  ‘min’ not meaningful for factors

That is because, in your code, lapply() is also being applied to the values of the Species column, which is a factor. The call to lapply() should be made inside the if (is.numeric(iris[,i])) block.

The other problem is the NaN values that are generated. This is due to a somewhat debatable behavior of R when subsetting objects with []. When you subset a data.frame, matrix, or similar object and take a single row or column, R by default "coerces" the return value to a more primitive type, in this case a vector. As a result, lapply() is applied to each element of the vector separately. For a single value, both x - min(x) and max(x) - min(x) are 0, so the function computes 0/0, which is NaN.

To avoid this while keeping your code as written, you could simply do:

lapply(iris[,i,drop = FALSE],normal)

or

lapply(iris[i],FUN=normal)
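The difference between these forms of subsetting can be checked directly. The short sketch below uses column 1 of iris as an arbitrary example and the normal function from the question, written more compactly.

```r
data(iris)

# Min-max normalization, as defined in the question
normal <- function(x) (x - min(x)) / (max(x) - min(x))

class(iris[, 1])                # "numeric": the column collapsed to a plain vector
class(iris[, 1, drop = FALSE])  # "data.frame": still a one-column data.frame
class(iris[1])                  # "data.frame": list-style subsetting never drops

# On a single value min and max coincide, so the result is 0/0
normal(5.1)                     # NaN

# lapply() iterates over 150 scalars in one case and over 1 column in the other
length(lapply(iris[, 1], normal))  # 150
length(lapply(iris[1], normal))    # 1
```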

Lastly, the problem with doing this:

as.data.frame(lapply(iris, function(x) {if (is.numeric(x))  normal(x)}))

is that you are not returning anything for one of the columns, Species, so you end up with four list elements of 150 values each and one with none (NULL), which is why you get the error:

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 150, 0

You solve it by returning normal(x) for the numeric columns and x itself for the non-numeric ones. In short, your code can be reduced to this:

iris_n <- as.data.frame(lapply(iris, function(x) {if (is.numeric(x)){normal(x)}else{x}}))
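If you would rather keep the original for loop, the two fixes described above (calling lapply() inside the if block and subsetting with iris[i] so the column is not dropped to a vector) could be combined roughly as follows. This is only a sketch: starting from iris[0], a data.frame with 150 rows and no columns, and the variable col are choices made here for illustration, not part of the original code.

```r
data(iris)

# Min-max normalization, as defined in the question
normal <- function(x) (x - min(x)) / (max(x) - min(x))

# Empty data.frame with the same 150 rows as iris and zero columns
iris_n <- iris[0]

for (i in seq_along(iris)) {
  if (is.numeric(iris[[i]])) {
    col <- names(iris)[i]
    # iris[i] stays a one-column data.frame, so lapply() sees the whole column;
    # [[1]] extracts the single normalized vector from the resulting list
    iris_n[[col]] <- lapply(iris[i], normal)[[1]]
  }
}

str(iris_n)  # only the four normalized numeric columns
```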
