Julia First Steps: Linear Regressions
This short notebook aims to demonstrate how easy it is to do linear algebra with Julia. Let's start by generating a vector of 50 random, normally distributed values $X$:
using LinearAlgebra
using Plots
X = randn(50)
50-element Array{Float64,1}:
1.2315276650849976
-1.4524774035807928
-1.331487964705029
1.7910530069325952
1.4344147755194752
0.8381566948997999
0.37732934639480875
-0.08465783247472314
-1.2007366388946283
0.2477648455025647
0.36153611350540193
1.0671530445403783
0.48110313559012585
⋮
-0.9292021823632189
0.6925673689749254
-0.4111717889244624
-1.066442631729392
-1.0254831803316289
1.030013173212353
-0.36342805460780975
0.6463700426665658
-1.4416779178988512
-0.9530989447609137
-0.5719567028504781
-1.494616935078167
Then we use $X$ to generate the target variable $y$ according to $y_i = 5X_i + 2 + \epsilon_i$, where $\epsilon_i$ is a small random noise term:
ϵ = randn(50)/10;
y = (5 * X) + ϵ .+ 2
50-element Array{Float64,1}:
8.1407692513959
-5.3081880651629465
-4.580108604523786
10.979354279905353
9.217772217481926
6.200257553598876
3.974797320462655
1.4380463508634955
-4.010151843847709
3.2260249975738176
3.5870402952610814
7.3820007915715875
4.604098918967308
⋮
-2.743409826160721
5.564858955179007
-0.19705563404837934
-3.3428052322697033
-3.089865165111327
7.234024641459847
0.36568334066534147
5.240195075581495
-5.197421511055493
-2.743474127301406
-0.9295139178487957
-5.497662035395874
This is how simple it is to run a linear regression with Julia:
X\y
4.831442452366241
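With a single regressor and no intercept, the backslash operator returns the one-dimensional least-squares coefficient, $\langle X, y\rangle / \langle X, X\rangle$. A quick sketch (using dot from LinearAlgebra, already loaded above) gives the same value:
# One-regressor, no-intercept least squares: β̂ = (XᵗX)⁻¹ Xᵗ y = ⟨X,y⟩ / ⟨X,X⟩
dot(X, y) / dot(X, X)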
As $y$ has been generated with an intercept, this approach yields an inaccurate result. Let's add a column of ones to $X$: the resulting $50 \times 2$ matrix, Xconst, is our design matrix:
Xconst = hcat(ones(50), X)
50×2 Array{Float64,2}:
1.0 1.23153
1.0 -1.45248
1.0 -1.33149
1.0 1.79105
1.0 1.43441
1.0 0.838157
1.0 0.377329
1.0 -0.0846578
1.0 -1.20074
1.0 0.247765
1.0 0.361536
1.0 1.06715
1.0 0.481103
⋮
1.0 -0.929202
1.0 0.692567
1.0 -0.411172
1.0 -1.06644
1.0 -1.02548
1.0 1.03001
1.0 -0.363428
1.0 0.64637
1.0 -1.44168
1.0 -0.953099
1.0 -0.571957
1.0 -1.49462
Using the backslash operator on the design matrix, we now recover the parameters used to generate $y$:
para = Xconst\y
2-element Array{Float64,1}:
1.9969334464481343
5.012372712566652
This estimates the parameters much more accurately. Note that, in Julia, x = A\y solves $Ax = y$ (in the least-squares sense when $A$ is not square).
If we use the normal equation, we know the parameters can be estimated as $(X^t X)^{-1} X^t y$:
(transpose(Xconst) * Xconst)^(-1) * transpose(Xconst) * y
2-element Array{Float64,1}:
1.9969334464481343
5.012372712566649
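Both approaches give the same estimates. To illustrate the note above that x = A\y solves $Ax = y$ exactly when $A$ is square and invertible, here is a minimal sketch with a small, made-up system:
A = [2.0 1.0; 1.0 3.0]
b = [5.0, 10.0]
x = A \ b      # solves A * x = b
A * x ≈ b      # should return true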
We can visualize the fit by plotting the data together with the estimated regression line:
scatter(Xconst[:,2], y)
Plots.abline!(para[2], para[1], line=:dash, legend=false)
Fortunately, Julia has packages that help with quick diagnostics. The GLM (Generalized Linear Models) package can be used here to get standard errors, p-values, and confidence intervals:
using GLM
lm(Xconst, y)
LinearModel{GLM.LmResp{Array{Float64,1}},GLM.DensePredChol{Float64,Cholesky{Float64,Array{Float64,2}}}}:
Coefficients:
─────────────────────────────────────────────────────────────────
Estimate Std. Error t value Pr(>|t|) Lower 95% Upper 95%
─────────────────────────────────────────────────────────────────
x1 1.99693 0.0138377 144.312 <1e-64 1.96911 2.02476
x2 5.01237 0.0139446 359.448 <1e-83 4.98434 5.04041
─────────────────────────────────────────────────────────────────
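The fitted model also supports the usual accessor functions (a small sketch, assuming the standard StatsBase/StatsAPI interface that GLM implements), so the estimates and intervals can be extracted programmatically:
model = lm(Xconst, y)
coef(model)      # parameter estimates
stderror(model)  # standard errors
confint(model)   # 95% confidence intervals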
To see how easy it is to automatically differentiate a function with Julia, let's define a predict function and a cost function. predict takes two parameters (the intercept and the slope) and returns the predicted values $\hat{y}$; cost evaluates how far $\hat{y}$ is from $y$:
predict(intercept, x) = Xconst[:,1] .* intercept + Xconst[:,2] .* x
predict (generic function with 1 method)
cost(intercept, x) = sum((predict(intercept, x) - y).^2)
cost (generic function with 1 method)
println(cost(1.5, 4))
println(cost(2, 5))
58.77621086610803
0.46418446941640396
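As a quick check, evaluating cost at the backslash estimates obtained earlier should give a value near the minimum of the cost, i.e. the residual sum of squares:
cost(para[1], para[2])   # residual sum of squares at the least-squares solution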
In order to minimize cost, its gradients will be computed automatically by the ForwardDiff package. A fixed number of gradient-descent iterations is then run to find the two values that minimize cost:
using ForwardDiff
ForwardDiff.gradient(z -> cost(z[1], z[2]), [1.0, 2.0])
2-element Array{Float64,1}:
-72.81718390149325
-287.7385372871958
ForwardDiff.gradient(z -> cost(z[1], z[2]), [2.0, 3.0])
2-element Array{Float64,1}:
18.26089198493291
-198.1888809734103
grad(intercept, x) = ForwardDiff.gradient(z -> cost(z[1], z[2]), [intercept, x])
grad (generic function with 1 method)
grad(1.0, 2.0)
2-element Array{Float64,1}:
-72.81718390149325
-287.7385372871958
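As a sanity check, the autodiff gradient can be compared with the closed-form gradient of the squared-error cost, $\nabla \lVert X\beta - y \rVert^2 = 2X^t(X\beta - y)$ (a small sketch added here for illustration; analyticGrad is not part of the original notebook):
analyticGrad(intercept, x) = 2 * transpose(Xconst) * (Xconst * [intercept, x] - y)
analyticGrad(1.0, 2.0)   # should match grad(1.0, 2.0) up to floating-point error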
# Plain gradient descent on cost, starting from (interceptStart, xStart),
# with learning rate lr and niter iterations
function eval(interceptStart=0.0, xStart=0.0, niter=1000, lr=0.01)
    evalIntercept = interceptStart
    evalX = xStart
    for i in 1:niter
        gradVector = grad(evalIntercept, evalX)              # gradient at the current point
        evalIntercept = evalIntercept - lr * gradVector[1]   # step against the gradient
        evalX = evalX - lr * gradVector[2]
    end
    return evalIntercept, evalX
end
eval (generic function with 5 methods)
The gradient descent returns an accurate estimate of the parameters used to generate $y$:
eval()
(1.9969334464481345, 5.012372712566651)
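Up to numerical tolerance, these estimates coincide with the backslash solution para computed earlier:
collect(eval()) ≈ para   # should return true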