I am trying to fit a quadratic polynomial to my data using the following code:
#Polynomial regression for y ~ x
model <- lm(y ~ x + I(x^2))
summary(model)
#Box and whisker plot + polynomial
boxplot(y ~ x, dat,
        col = c("white", "lightgray"), ylab = "y", xlab = "x")
means <- tapply(y,x,mean)
points(means,col="red",pch=18)
predicted.intervals <- predict(model,data.frame(x=x),interval='confidence',
level=0.99)
lines(x,predicted.intervals[,1],col='green',lwd=3)
lines(x,predicted.intervals[,2],col='black',lwd=1)
lines(x,predicted.intervals[,3],col='black',lwd=1)
The thing is that when I run the program, the box plot appears, along with the red points representing the means and the green line of the polynomial fitted to the data. However, there is also a strange straight line joining the means of levels 1 and 11, and I have no idea where it comes from. Here is the graph:
I have fitted polynomials to my data many times before in nonlinear regressions, but this has never happened to me.
Any solution?
Edit 1:
Graph obtained with data B.
Finally, I have managed to fit the polynomial to the data. The code used is the following:
#Convert the variable x to a factor
x <- as.factor(x)
#Transform the variable back to numeric
x <- as.numeric(x)
#Quadratic regression
model <- lm(y ~ x + I(x^2))
summary(model)
#Polynomial fit over the boxplot
boxplot(y ~ x, dat,
        col = c("white", "lightgray"), ylab = "y", xlab = "x")
means <- tapply(y,x,mean)
points(means,col="red",pch=18)
predicted.intervals <- predict(model,data.frame(x=x),interval='confidence',
level=0.99)
lines(x,predicted.intervals[,1],col='green',lwd=3)
The result for data B is this:
However, I have some questions:
My variable x takes values from 0 to 1 (11 levels, in steps of 0.1).
Why did I have to convert my original variable x to a factor and then transform it back to numeric (so that it takes discrete values from 1 to 11)? Only then can I fit the polynomial to the data, but the regression is then run on the numeric values 1 to 11.
Why, in the case of data set B, should I use
lines(x,predicted.intervals[,1],col='green',lwd=3)
...while in the case of data set A I should use
lines(predicted.intervals[,1],col='green',lwd=3)
?
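To illustrate the recoding in the first question (the x values here are made up, not the real data): as.factor() followed by as.numeric() maps the 11 levels 0, 0.1, ..., 1 to the integers 1 to 11, which are exactly the positions boxplot() uses on its axis.

```r
x <- rep(seq(0, 1, by = 0.1), each = 3)   # 11 levels: 0, 0.1, ..., 1
x2 <- as.numeric(as.factor(x))            # recoded as the level indices 1, 2, ..., 11
range(x2)                                 # 1 11
```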
Let's start with the first problem, data set A. I'm going to work only with the regression, let's see:
Now let's plot just the points and the curve for the regression:
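A minimal sketch of this step (the data here are simulated as a stand-in for data set A; the key property is that the rows are not ordered by x):

```r
# Simulated stand-in for data set A: rows in random order
set.seed(1)
datA <- data.frame(x = sample(rep(seq(0, 1, by = 0.1), each = 5)))
datA$y <- 2 + 3 * datA$x - 2 * datA$x^2 + rnorm(nrow(datA), sd = 0.2)

# Quadratic fit, as in the question
model <- lm(y ~ x + I(x^2), data = datA)

# Joining the fitted values in data order produces the stray lines
plot(datA$x, datA$y, pch = 16, col = "gray")
lines(datA$x, fitted(model), col = "green", lwd = 3)
```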
The result is surely familiar to you:
This is explainable: datA is not ordered by x. When the lines are drawn from the points x and fitted, we could eventually have a point (1, ?) followed by a point (0, ?), so the line goes back toward the origin, making the graph look "circular". To solve this, we simply order by x:

Now the result is more in line with what was sought:
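The ordering step can be sketched like this (simulated data again standing in for datA and model):

```r
# Simulated stand-in data, not ordered by x
set.seed(1)
datA <- data.frame(x = sample(rep(seq(0, 1, by = 0.1), each = 5)))
datA$y <- 2 + 3 * datA$x - 2 * datA$x^2 + rnorm(nrow(datA), sd = 0.2)
model <- lm(y ~ x + I(x^2), data = datA)

# Sort by x before joining the fitted values with lines()
ord <- order(datA$x)
plot(datA$x, datA$y, pch = 16, col = "gray")
lines(datA$x[ord], fitted(model)[ord], col = "green", lwd = 3)
```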
Is this the solution and explanation of the problem? Yes and no. Let's see: if we add this curve to the boxplot, we can see this result:

What can we notice? The original curve was "compressed" between the values 0 and 1. The explanation is that the two coordinate systems are not compatible: boxplot treats the values of x as discrete, while our values of datA$x are not, and this is where factor() comes in, like so:

Now the values of x are consistent with the x positions of the boxplot:

This explains it, and it has worked for me for both data sets; I do not publish the results so as not to make the answer longer. In any case, you may wonder why the two data sets originally behaved differently. The explanation is simple: set A is unordered and set B is ordered (always speaking of the values of x).

I also recommend using spline() to draw these curves: it avoids having to order the data beforehand and, more importantly, you do not pass lines() the complete data set but only the minimum points needed to interpolate the curve in the graph:
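A sketch of the spline() approach (simulated data again; only the 11 level values are passed as support points):

```r
# Simulated stand-in data
set.seed(1)
x <- rep(seq(0, 1, by = 0.1), each = 5)
y <- 2 + 3 * x - 2 * x^2 + rnorm(length(x), sd = 0.2)
model <- lm(y ~ x + I(x^2))

# Predict only at the 11 distinct levels; spline() returns an ordered,
# smoothly interpolated set of (x, y) coordinates for lines()
xs <- seq(0, 1, by = 0.1)
pred <- predict(model, data.frame(x = xs))
plot(x, y, pch = 16, col = "gray")
lines(spline(xs, pred), col = "green", lwd = 3)
```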