I am fitting a regression model to the mtcars data.
mtcars
I want to estimate mpg from the rest of the variables:
m1 <- lm(formula = mpg ~ ., data = mtcars)
Let's see which variables best predict mpg:
library(leaps)
mejores1 <- regsubsets(mpg ~ ., data = mtcars,
                       nvmax = 10,
                       nbest = 1,
                       method = "forward")
summary(mejores1)
For a model with a single variable we select wt.
For a model with two variables we select wt and cyl, and so on.
I would like to turn the summary(mejores1) table into something easier to read. Since there are several methods for selecting variables (method = "seqrep", "backward", "exhaustive"), comparing them is a mess, especially when there are many variables. I would like to change it to something like this:
              forward          backward
1 variable    wt               wt
2 variables   wt + cyl         wt + qsec
3 variables   wt + cyl + hp    wt + qsec + am
or any other way to see the results more clearly.
I have tried with this:
library(dplyr)
library(tidyr)

# One row per (model size, variable); TRUE means the variable is in that model
as.data.frame(summary(mejores1)$which) %>%
  gather(key = variable,
         value = variable_datos, -`(Intercept)`) -> datos
datos$`(Intercept)` <- NULL
# Model size (1 to 10), repeated for each of the 10 candidate variables
datos$posicion <- rep(seq(1, 10), 10)
# Keep only the rows where the variable is included in the model
datos[datos$variable_datos, ] -> datos
datos$variable_datos <- NULL
# Smallest model size at which each variable first appears
datos <- datos %>% group_by(variable) %>%
  summarise(posicion = min(posicion))
datos[order(datos$posicion), ]
It tells you which variable to choose first, which second, and so on, but surely there is a better way to do it.
I don't know if it's much better, but here it goes:
First I create a list with the variables selected by each method, using lapply() and the new lambda-function syntax in R (4.1.0 or later). map() from the tidyverse could be used instead. The idea is to create a data frame as soon as possible and work with it from there; I think that is the simplest option. With base R, perhaps something similar to the previous approach could be used; here I do the same with pure manipulation of lists and lambda functions. It's a bit shorter, but unnecessarily complicated.
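A minimal sketch of the lapply() idea, not necessarily the exact code this answer had in mind. The names metodos, seleccion, ajuste, incluidas and comparacion are made up for illustration, and the last step assumes every method returns exactly one model per size (which holds for nbest = 1 with the three methods used here):

```r
library(leaps)

# Selection methods to compare; the vector can be extended
metodos <- c("forward", "backward", "exhaustive")

# For each method, build one string per model size listing the variables included
seleccion <- lapply(metodos, \(m) {
  ajuste <- regsubsets(mpg ~ ., data = mtcars,
                       nvmax = 10, nbest = 1, method = m)
  # Logical matrix: rows are model sizes, columns are variables (intercept dropped)
  incluidas <- summary(ajuste)$which[, -1, drop = FALSE]
  apply(incluidas, 1,
        \(fila) paste(colnames(incluidas)[fila], collapse = " + "))
})

# One row per model size, one column per method
comparacion <- as.data.frame(setNames(seleccion, metodos))
comparacion <- cbind(n_variables = seq_len(nrow(comparacion)), comparacion)
comparacion
```

Within each cell the variables appear in the column order of mtcars, not in the order in which they entered the model, so a two-variable forward model is printed as "cyl + wt" rather than "wt + cyl".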