If you do not know what linear regression is: basically, it means finding the best predictive line (y = w*x + b) that “explains” the relationship between the independent variable (x) and the dependent variable (y). In general there can be more than one independent variable, but we will consider only one here. If you still do not know what “linear regression” is, this is a good point to stop and google it; that will make the rest of this document easier to follow. Many programming languages and software packages (such as Excel) already provide functions that do linear regression quickly and easily.
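To make the "ready-made function" route concrete, here is a minimal sketch using numpy's `np.polyfit`. The five data points are made up for illustration and lie exactly on the line y = 2*x + 1, so the fit should recover w close to 2 and b close to 1 (up to floating-point rounding).

```python
import numpy as np

# Made-up illustration data: five points lying exactly on y = 2*x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0

# A degree-1 polynomial fit is a straight-line fit.
# np.polyfit returns coefficients from the highest power down: slope, then intercept.
w, b = np.polyfit(x, y, 1)

print("slope w:", w)
print("intercept b:", b)
```

This is the one-line approach; it tells you the answer but not how the answer is found.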
That is for the faint-hearted. For the brave reader, we are now going to look at the machinery behind linear regression and write Python code (using numpy) to do it. We are also going to learn a thing or two in the process that will help us understand neural networks in Part III of this document series. You can either use data you already have or generate a new dataset. I encourage you to generate a new dataset to follow this tutorial, and to work on your own data once you are comfortable with the methodology.
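One possible way to generate such a dataset is sketched below. The specific choices here (a true slope of 3, a true intercept of 2, 100 points, x between 0 and 10, Gaussian noise with standard deviation 1, and the names `w_true`, `b_true`, `n_points`) are assumptions made for illustration, not values prescribed by this tutorial. Feel free to change them.

```python
import numpy as np

# Seeded random generator so the dataset is the same on every run.
rng = np.random.default_rng(seed=0)

# Assumed "ground truth" line that the regression should later recover.
w_true, b_true = 3.0, 2.0
n_points = 100

# Independent variable: values drawn uniformly from [0, 10).
x = rng.uniform(0.0, 10.0, size=n_points)

# Dependent variable: the true line plus Gaussian noise,
# so the points scatter around the line instead of sitting on it.
noise = rng.normal(loc=0.0, scale=1.0, size=n_points)
y = w_true * x + b_true + noise

print(x[:5])
print(y[:5])
```

Because you chose `w_true` and `b_true` yourself, you can later check whether your own regression code lands close to them.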