When I finished the article on gradient descent, I realized that two important points were missing. The first concerns the stochastic approach used when data sets are too large; the second is to see, very concretely, what happens when the learning rate is poorly chosen. I will therefore take advantage of this article to finally continue the previous one 😉
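To give a taste of the second point, here is a minimal sketch (a toy example of my own, not code from the article): minimising f(x) = x², whose gradient is f'(x) = 2x, with plain gradient descent. The learning rate alone decides whether the iterates converge to the minimum or diverge.

```python
def gradient_descent(lr, x0=1.0, steps=50):
    """Run `steps` iterations of x <- x - lr * f'(x) for f(x) = x**2."""
    x = x0
    for _ in range(steps):
        x -= lr * 2 * x  # f'(x) = 2x
    return x

good = gradient_descent(lr=0.1)  # each step multiplies x by 0.8: converges to 0
bad = gradient_descent(lr=1.1)   # each step multiplies x by -1.2: |x| blows up
print(good, bad)
```

With lr = 0.1 the iterate shrinks geometrically towards the minimum at 0; with lr = 1.1 each update overshoots the minimum and the sequence diverges.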
How can one talk about Machine Learning, or even Deep Learning, without addressing the – famous – gradient descent? There are of course many articles on this subject, but you often have to read several of them to fully understand all the mechanisms, and they are often either too mathematical or not mathematical enough. Here I will try above all to explain how it works smoothly, step by step, in order to demystify the subject.
As soon as you begin to create machine learning models, you will face the delicate problem of balancing bias and variance. In this article I try to explain, simply, how to understand these two very important concepts.
While this method of “prediction” based on probabilities and state transitions had its heyday, it now seems less fashionable. In this article we will return to the fundamentals of Markov chains and their application in Python.
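As a preview of the states-and-transitions idea, here is a minimal sketch (a hypothetical weather example, not the article's code): a two-state Markov chain where each row of the transition table gives the probabilities of moving to the next state, and a trajectory is simulated by repeatedly sampling from the current state's row.

```python
import random

# Hypothetical two-state chain: each row sums to 1.
transitions = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def next_state(state):
    """Draw the next state from the current state's transition row."""
    states = list(transitions[state])
    weights = [transitions[state][s] for s in states]
    return random.choices(states, weights=weights)[0]

def simulate(start, steps, seed=0):
    """Simulate a trajectory of `steps` transitions starting from `start`."""
    random.seed(seed)
    chain = [start]
    for _ in range(steps):
        chain.append(next_state(chain[-1]))
    return chain

print(simulate("sunny", 5))
```

The key property of a Markov chain appears directly in `next_state`: the distribution of the next state depends only on the current state, never on the rest of the history.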