Statistics in SQL: Simple Linear Regressions

Comments 0

Share to social media

The series so far:

  1. Statistics in SQL: Pearson’s Correlation
  2. Statistics in SQL: Kendall’s Tau Rank Correlation
  3. Statistics in SQL: Simple Linear Regressions
  4. Statistics in SQL: The Kruskal–Wallis Test
  5. Statistics in SQL: The Mann–Whitney U Test
  6. Statistics in SQL: Student's T Test

Let’s imagine that we have two variables, X and Y which we then plot using scatter graph.

C:\Users\andrew.clarke\SkyDrive\Documents\scatterPlot.gif

It looks a bit like someone has fired a shotgun at a wall but is there a relationship between the two variables? If so, what is it? There seems to be a weak positive linear relationship between the two variables here so we can be fairly confident of plotting a trendline.

Here is the data, and we will proceed to calculate the slope and intercept. We will also calculate the correlation.

It gives the result…

Alpha was calculated by this expression, …

… Beta was calculated thus.

… and Rho by this

The Pearson’s Product Moment correlation can be used to calculate the probability of this being a significant correlation via the Fisher Transformation.

The slope (beta) and intercept (alpha) can be used to calculate any point on the trendline.

We can, for example calculate the value of Y when X is 100 by the equation
y’ = a + bx

Is, in our example

We can calculate the trendline X,Y points by slightly altering the SQL

We can then plot a trend line

C:\Users\andrew.clarke\SkyDrive\Documents\scatterPlot.gif

We are, of course, assuming a linear relationship and parametric data. With Pearson’s Rho and the number of X-Y pairs, we can go on to estimate the fisher transformation (t= 1.912) and probability of 0.061825.

The PowerShell to do the Gnuplot graphs is here

Load comments

About the author

Phil Factor

See Profile

Phil Factor (real name withheld to protect the guilty), aka Database Mole, has 40 years of experience with database-intensive applications. Despite having once been shouted at by a furious Bill Gates at an exhibition in the early 1980s, he has remained resolutely anonymous throughout his career. See also :

Phil Factor's contributions