I have the following database:
I need to estimate regressions of the following form:
ir_total = b0 + b1 9dejulio + b2 Azul + b3 Bolivar
ir_total = b0 + b1 9dejulio + b2 Azul + b3 CoronelSuarez
.
.
.
.
ir_total = b0 + b1 pehuajo + b2 Dolores + b3 Junin
That is, I need to estimate regressions with y = ir_total and 3 regressors that run through all the different departments, and finally print the coefficients and the R^2 of each regression in a DataFrame or table.
I am trying something like the following
n=ncol(data)
for (i in 3:n) {
model = lm (data = data, formula = data[,2] ~ data[,i] + data [,i+1] + data[,i+2])
}
First, I synthetically generate a data set similar to the one you show:
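The original data was not included, so the following is only a sketch of what such a synthetic data set might look like. The department names, the number of rows, and the column layout (an identifier first, ir_total second, then one column per department) are assumptions made for illustration. Note that a name such as 9dejulio starts with a digit, so it is not a syntactic R name and would need backticks inside a formula; a spelled-out stand-in is used here instead.

```r
set.seed(123)  # make the fictitious data reproducible

# Hypothetical department names; the real data has its own columns
deptos <- c("NueveDeJulio", "Azul", "Bolivar", "CoronelSuarez", "Dolores")
n_obs <- 30  # assumed number of observations

# Assumed layout: column 1 = identifier, column 2 = ir_total,
# followed by one numeric column per department
data <- data.frame(
  periodo  = seq_len(n_obs),
  ir_total = rnorm(n_obs, mean = 100, sd = 10)
)
for (d in deptos) data[[d]] <- rnorm(n_obs, mean = 50, sd = 5)

str(data)
```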
The first problem is to generate all the combinations of 3 independent variables. For this we select the column names that will take part in the calculation (in your case, all except the first two) and then simply use combn() to generate the combinations.
We end up with partial strings, but we still need to complete the formula and actually turn them into a formula:
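A minimal sketch of these two steps could look like the following. The vector vars is a hypothetical stand-in for the regressor column names (something like names(data)[-(1:2)] on the real data), and the response name ir_total is taken from the question.

```r
# Hypothetical regressor names standing in for names(data)[-(1:2)]
vars <- c("NueveDeJulio", "Azul", "Bolivar", "CoronelSuarez", "Dolores")

# combn() returns a character matrix with one combination per column
combos <- combn(vars, 3)

# Collapse each column into the right-hand side of a formula (partial strings)
rhs <- apply(combos, 2, paste, collapse = " + ")

# Prepend the response and convert each string into a formula object
formulas <- lapply(paste("ir_total ~", rhs), as.formula)

length(formulas)
formulas[[1]]
```

With 5 candidate regressors this yields choose(5, 3) = 10 formulas; the count grows quickly as the number of departments increases.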
We now have a list of formulas, so we simply use lapply() to apply lm to each element.
Finally, having a list with all the models, it is relatively easy to extract the data you mention:
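The fitting and extraction steps might be sketched as follows. The data and department names are fictitious, and the shape of the results table (one row per regression, with columns b0 to b3 and r_squared) is an assumption about what is wanted, not the only possible layout.

```r
set.seed(123)

# Fictitious data with hypothetical department names
vars <- c("NueveDeJulio", "Azul", "Bolivar", "CoronelSuarez")
data <- data.frame(ir_total = rnorm(30, mean = 100, sd = 10))
for (v in vars) data[[v]] <- rnorm(30, mean = 50, sd = 5)

# One formula per combination of 3 regressors
rhs <- apply(combn(vars, 3), 2, paste, collapse = " + ")
formulas <- lapply(paste("ir_total ~", rhs), as.formula)

# Fit one linear model per formula
models <- lapply(formulas, lm, data = data)

# Collect the coefficients and R^2 of each model into one row per regression
results <- do.call(rbind, lapply(models, function(m) {
  b <- unname(coef(m))
  data.frame(
    regressors = paste(attr(terms(m), "term.labels"), collapse = ", "),
    b0 = b[1], b1 = b[2], b2 = b[3], b3 = b[4],
    r_squared = summary(m)$r.squared,
    stringsAsFactors = FALSE
  )
}))

print(results)
```

Here b0 is the intercept and b1 to b3 follow the order of the regressors listed in the regressors column.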
Perhaps this is an option to address this question. It may not be the most efficient, but it could help you. Considering that there is no sample of the data, here is an example with fictitious data:
DATAFRAME
OUTPUT