Question

Alejandro Carrera

Asked: 2020-08-09 23:06:31 +0800 CST

How to achieve a vectorized version of this function?

I was reviewing some material on matrices and the Mahalanobis distance, and it occurred to me to write a small function that ranks the observations in each column of a matrix. The code is below:

test <- matrix(c(78.17, 70.25, 75.33, 86.08, 54.97, 43.63, 18.04,
                 0.3, 1.4, 0.5, 1.5, 0.7, 0.2, 0.1,
                 3, 5, 5, 8, 9, 10, 2), ncol = 3)
test

rank_columns <- function(x) {
    # Pre-allocate a result matrix with the same shape as the input
    y <- matrix(ncol = ncol(x), nrow = nrow(x))
    # Rank each column independently
    for (j in seq_len(ncol(x))) {
        y[, j] <- rank(x[, j])
    }
    return(y)
}

rank_columns(test)

The function returns a matrix with the same dimensions as the input, holding the rank of each observation within its column:

rank_columns(test)
     [,1] [,2] [,3]
[1,]    6    3  2.0
[2,]    4    6  3.5
[3,]    5    4  3.5
[4,]    7    7  5.0
[5,]    3    5  6.0
[6,]    2    2  7.0
[7,]    1    1  1.0

As many of you know, I'm not very good with the apply family, so I was wondering whether there is a way to vectorize the function to improve its performance on larger matrices. Thank you very much in advance.

2 Answers

Patricio Moracho · Answer 1 · 2020-08-10T08:07:37+08:00

Alejandro, first of all, the answer Javier Ascunce gave you is undoubtedly a sound way to solve this, but I want to expand on the explanation a little.

In R, you hear over and over that you should avoid explicit loops (for, while, repeat) and use the *apply functions, that is, implicit loops, instead. The reasons are:

The code is usually much more compact, which generally makes it easier to follow.
In some cases, depending on the code, performance improves.

Let me clarify that we are not really talking about "vectorization" here. Your function is already effectively "vectorized" and close to optimal: the loop runs only once per column, and rank() processes each whole column in a single call.
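To make that point concrete, here is a small sketch that counts how many times rank() is actually invoked on a 10,000-row matrix. The wrapper counting_rank and the counter calls are hypothetical names introduced only for this illustration; they are not part of the question's code.

```r
# Hypothetical counter, incremented every time the wrapper is called
calls <- 0
counting_rank <- function(v) {
    calls <<- calls + 1
    rank(v)
}

set.seed(100)
x <- matrix(runif(10000 * 3), ncol = 3)

# apply() hands each column to the wrapper exactly once
ranked <- apply(x, 2, counting_rank)

calls        # 3: one call per column, not one per element
dim(ranked)  # 10000 3
```

The loop, explicit or implicit, therefore only runs as many times as there are columns; the per-element work happens inside rank().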

In your example, where you want to "apply" the function rank to each column of a matrix, and assuming you want a matrix shaped like the original, the easiest way to use an implicit loop is:

apply(test,2,rank)

or in its most explicit version:

apply(X = test, MARGIN = 2, FUN = rank)

That is, we pass in the matrix and, with MARGIN = 2, say that we want to work by columns (MARGIN = 1 would be by rows); to each column we then apply (FUN = rank) the function rank().
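As a quick illustration of the MARGIN argument, the sketch below runs both settings on the small matrix from the question. The variable names by_col and by_row are assumptions made for this example.

```r
# Small matrix from the question: 7 rows, 3 columns
test <- matrix(c(78.17, 70.25, 75.33, 86.08, 54.97, 43.63, 18.04,
                 0.3, 1.4, 0.5, 1.5, 0.7, 0.2, 0.1,
                 3, 5, 5, 8, 9, 10, 2), ncol = 3)

# MARGIN = 2: rank() receives one column at a time; the result is 7 x 3
by_col <- apply(test, 2, rank)
dim(by_col)

# MARGIN = 1: rank() receives one row at a time. apply() stores each
# per-row result as a column, so t() is needed to get back to 7 x 3
by_row <- t(apply(test, 1, rank))
dim(by_row)

by_col[, 1]  # ranks within the first column
by_row[1, ]  # ranks within the first row
```

Note that with MARGIN = 1 the raw result comes back transposed, which is easy to overlook when switching between the two settings.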

Another way is to use sapply(), which is more like what you're doing:

sapply(1:ncol(test),function(col){rank(test[,col])})

In this case we iterate over the column indices of the matrix and apply rank to the slice of the matrix corresponding to each column.
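A slightly more defensive variant of the same idea is sketched below. Using seq_len() instead of 1:ncol(test), and vapply() instead of sapply(), are suggestions added here rather than part of the original answer; the names via_sapply and via_vapply are likewise only illustrative.

```r
test <- matrix(c(78.17, 70.25, 75.33, 86.08, 54.97, 43.63, 18.04,
                 0.3, 1.4, 0.5, 1.5, 0.7, 0.2, 0.1,
                 3, 5, 5, 8, 9, 10, 2), ncol = 3)

# seq_len(ncol(test)) yields 1, 2, 3 here; unlike 1:ncol(test), it yields
# an empty sequence (not c(1, 0)) for a matrix with zero columns
via_sapply <- sapply(seq_len(ncol(test)), function(col) rank(test[, col]))

# vapply() additionally declares what each call must return:
# a numeric vector with one element per row
via_vapply <- vapply(seq_len(ncol(test)),
                     function(col) rank(test[, col]),
                     FUN.VALUE = numeric(nrow(test)))

dim(via_sapply)
all.equal(via_sapply, via_vapply)
```

Both calls bind the per-column results together as columns, so the output keeps the shape of the input matrix.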

What about performance? Let's run a test with a matrix of 10,000 rows, evaluating each approach 1000 times:

library(microbenchmark)
library(ggplot2)

set.seed(100)
ncols = 3
nrows = 10000
test <- matrix(runif(nrows*ncols), ncol=ncols)

mi <- microbenchmark(
     m <- rank_columns(test),
     m <- apply(test,2,rank),
     m <- sapply(1:ncol(test),function(col){rank(test[,col])}),
     times = 1000L)

autoplot(mi)

Interesting: the three ways of doing the same thing perform very similarly. Apart from apply() showing a greater dispersion of values, there is no significant winner; in fact, your rank_columns() might even be a touch faster.
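Since the benchmark only compares timings, it may be worth confirming separately that the three approaches return the same matrix. A minimal sketch, assuming rank_columns() is defined as in the question (it is repeated here so the block runs on its own); the names r_loop, r_apply and r_sapply are introduced only for this check:

```r
rank_columns <- function(x) {
    y <- matrix(ncol = ncol(x), nrow = nrow(x))
    for (j in seq_len(ncol(x))) {
        y[, j] <- rank(x[, j])
    }
    y
}

set.seed(100)
test <- matrix(runif(10000 * 3), ncol = 3)

r_loop   <- rank_columns(test)
r_apply  <- apply(test, 2, rank)
r_sapply <- sapply(seq_len(ncol(test)), function(col) rank(test[, col]))

# stopifnot() raises an error if any pair of results differs
stopifnot(
    isTRUE(all.equal(r_loop, r_apply)),
    isTRUE(all.equal(r_loop, r_sapply))
)
```

If the block runs without an error, the three results agree and the timings are comparing like with like.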

Beyond performance, collapsing several lines of code into a single one is without doubt a significant improvement worth taking advantage of whenever possible.

Javier Ascunce · Answer 2 · 2020-08-10T01:31:19+08:00

Best Answer

test_ranked <- apply(test, 2, rank)

The 2 passed as the second argument (MARGIN) makes apply() run the function column by column.
