Optimizing Simple Linear Regression with the Gradient Descent Algorithm

Anay Dongre
Jul 17, 2022


The dictionary meaning of the word optimization is “an act, process, or methodology of making something (such as a design, system, or decision) as fully perfect, functional, or effective as possible”. Mathematically, optimization often involves finding the maximum of a function or minimizing the error involved in a given scenario. At a high level, there are two families of optimization techniques.

Derivative-based:
In this technique, the derivative of the objective function is computed (through mathematical differentiation) to obtain the gradient (slope), which is then followed until the optimum is reached.
However, this method can be used only if the function is continuous and differentiable, so that its derivative actually exists. Two common derivative-based optimization techniques are Steepest Descent (Gradient Descent) and Newton’s method.

[Figure: Graphical representation of gradient descent]
[Figure: Comparison of gradient descent (green) and Newton’s method (red)]
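
To connect this back to the title, here is a minimal sketch (not the author’s original code) of steepest descent applied to a toy simple linear regression problem. It assumes a mean-squared-error loss, synthetic data generated as y ≈ 3x + 2 plus noise, and illustrative choices of learning rate and iteration count:

```python
import numpy as np

# Toy data for a simple linear regression: y ≈ 3*x + 2 plus noise.
# All constants and hyperparameters here are illustrative assumptions.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0.0, 10.0, size=100)
y = 3.0 * x + 2.0 + rng.normal(0.0, 1.0, size=100)

# Model: y_hat = w * x + b, fitted by minimizing the mean squared error.
w, b = 0.0, 0.0
learning_rate = 0.02   # assumed step size
n_steps = 5000         # assumed number of iterations

for _ in range(n_steps):
    y_hat = w * x + b
    error = y_hat - y
    # Partial derivatives of the MSE loss with respect to w and b.
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    # Move a small step against the gradient (steepest descent).
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"w ~ {w:.3f}, b ~ {b:.3f}")  # should land near 3 and 2
```

Each update moves the parameters a small distance against the gradient of the loss, which is exactly the “follow the slope until the optimum is reached” idea described above.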

Derivative-free:
But what if the derivative does not exist?
Quite often you will come across discrete problems that cannot be described by continuous functions. Consequently, you cannot carry out differentiation and compute a gradient to find the optimum.
In theory, it is possible to check every case of a discrete problem to find the optimum, but such exhaustive computation is infeasible and time-consuming for most real-life problems. Hence, you need derivative-free optimization techniques for such problems. Two common derivative-free techniques are Random Search and Downhill Simplex.
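
For contrast, here is a minimal sketch of Random Search on the same toy regression problem. The data, perturbation scale, and iteration count are illustrative assumptions; the key point is that only loss evaluations are used, never derivatives:

```python
import numpy as np

# Same toy regression data as above; all hyperparameters are assumptions.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0.0, 10.0, size=100)
y = 3.0 * x + 2.0 + rng.normal(0.0, 1.0, size=100)

def mse(w, b):
    """Mean squared error of the line y_hat = w * x + b on the toy data."""
    return float(np.mean((w * x + b - y) ** 2))

# Random search: repeatedly perturb the best point found so far and
# keep a candidate only if it lowers the loss. No derivatives needed.
best_w, best_b = 0.0, 0.0
best_loss = mse(best_w, best_b)
step_size = 0.5   # assumed perturbation scale

for _ in range(5000):
    cand_w = best_w + rng.normal(0.0, step_size)
    cand_b = best_b + rng.normal(0.0, step_size)
    cand_loss = mse(cand_w, cand_b)
    if cand_loss < best_loss:
        best_w, best_b, best_loss = cand_w, cand_b, cand_loss

print(f"w ~ {best_w:.3f}, b ~ {best_b:.3f}, loss ~ {best_loss:.3f}")
```

Both sketches minimize the same loss; the difference lies only in how each new point is chosen: by following the derivative, or by sampling without it.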
