It is talked about everywhere. It appears in almost every major marketing message, and above all it sits at the heart of communication and business development strategies. In truth, it is hard today to read a specialist magazine, or even the general press, without coming across a mention of it. But if Machine Learning has pride of place in our communication channels, is it well understood by everyone? Let’s take a quick, lighthearted look together at what Machine Learning actually is.
Machine Learning & Artificial Intelligence
The first confusion we commonly encounter is the conflation of AI (Artificial Intelligence) with ML (Machine Learning). The two concepts are related, but they are distinct. Put simply, Artificial Intelligence (to use the Larousse definition) is a “set of theories and techniques implemented with a view to producing machines capable of simulating intelligence”. Machine Learning is just one of these disciplines.
Machine Learning is therefore the facet of AI dedicated to getting machines to learn. Put more simply, it aims to imitate the way we human beings learn. Until now, almost all of our approaches to software have been deterministic: we lay down rigid laws that the program must follow. Unfortunately these laws quickly reach their limits because, as the saying goes, “the exception is the rule”. As a result, we write algorithms that pile up exceptions until the exceptions outweigh the rule itself. Catastrophe! The software then becomes unmanageable. To sum up, deterministic approaches are not sufficient because they cannot adapt to reality as it is.
To go further, we have to think differently about these rules and laws, and model them on concrete experience.
The objective is therefore to move from a strict, descriptive logic to an experimental one. This means using techniques that feed on statistics, and therefore on data. It may sound complicated, but come to think of it, this is exactly how humans work!
Experiment to produce logic
Let’s take an example, or rather a simple experiment: two pots of different colors are placed in front of you. I am holding pawns of different shapes (round, square, triangular, oval, etc.).
The experiment will take place in two phases:
Phase 1: I place my pawns in the pots:
- The round pawn in pot A
- The square in pot B
- Finally, the triangular pawn in pot B
Have you guessed the distribution rule at this point? Not sure? Let me continue:
- The oval pawn in pot A
- The hexagonal pawn in pot B
And now? Have you found it?
Phase 2: If I now show you a pawn in the shape of a pentagon, where will you put it?
It’s a safe bet you’ll put it in pot B!
Well done, that’s exactly how Machine Learning works!
Naturally, we looked at the cases presented and inferred that the deciding characteristic was whether or not the pawn is rounded. That is exactly how we operate every day whenever we learn. Of course, this is a simple example: in reality there are many more cases, and also many more parameters to take into account.
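To make the experiment concrete, here is a minimal Python sketch of a program that derives the rule from the five pawns of phase 1 instead of being given it. The "number of corners" feature and the threshold search are assumptions chosen for illustration; the experiment itself only names the shapes, and real Machine Learning algorithms are far more elaborate.

```python
# Toy data from the experiment. The corner counts are an illustrative
# assumption: the experiment only names the shapes.
CORNERS = {"round": 0, "oval": 0, "triangle": 3, "square": 4,
           "pentagon": 5, "hexagon": 6}

# (shape, pot) pairs shown during phase 1
observations = [
    ("round", "A"),
    ("square", "B"),
    ("triangle", "B"),
    ("oval", "A"),
    ("hexagon", "B"),
]


def predict(shape, threshold):
    """Apply a rule of the form: fewer corners than threshold -> A, else B."""
    return "A" if CORNERS[shape] < threshold else "B"


def learn_threshold(examples):
    """Try every candidate threshold and keep the first one with fewest errors."""
    best_threshold, best_errors = None, len(examples) + 1
    for threshold in range(0, 8):
        errors = sum(predict(shape, threshold) != pot for shape, pot in examples)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold, best_errors


# Phase 1: learn the rule from the observations.
threshold, errors = learn_threshold(observations)
print("learned threshold:", threshold, "errors on phase 1:", errors)

# Phase 2: apply the learned rule to a shape never seen before.
print("pentagon goes to pot", predict("pentagon", threshold))
```

With these five observations the search settles on "fewer than 1 corner goes to pot A", so the pentagon lands in pot B, just as you guessed.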
Deterministic vs Probabilistic approach
More cases (and therefore more data) also mean more errors, or rather more cases that stand out for one reason or another (new characteristics, perhaps). Imagine that in the previous experiment I had carried out the first phase with thousands of cases. It is hard to imagine that no error or imprecision would have crept in, or that other, hitherto hidden characteristics or parameters (such as size) played no part in the actual choice of pot.
In short, these are our famous exceptions again, the ones we always have trouble managing. Never mind: a data-driven approach can make them statistically insignificant. We are now entering the realm of statistics and probability.
Let’s just sum it up:
- The deterministic approach requires that every rounded pawn goes to pot A and every other pawn to pot B. But be careful: this strict, rigid law does not handle exceptions! From an algorithmic point of view it is simpler, and above all nothing can deviate from the enacted law. Each exception must therefore be handled explicitly.
- The probabilistic or statistical approach is more flexible because it is based on observations. The idea is to break the problem into two phases: a first phase of observation (or learning) in which we collect data, and a second phase which derives a (more or less approximate) law from those observations. In this case you do not know the rule in advance; you have to derive it from the data.
These two approaches are fundamentally different: the first seems rather simple to set up at first sight, while the second adds an essential new component, namely taking uncertainty into account. Indeed, probabilities and statistics also bring randomness and bias with them! These two notions, set aside until now, allow us to define analytical and predictive models that reflect an observed reality rather than an enacted one.
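The contrast between the two approaches can be sketched in a few lines of Python. The observation counts below are invented for illustration (nothing in the experiment provides them), and estimating frequencies is only one simple way, among many, to derive an approximate law from data.

```python
from collections import Counter, defaultdict


def deterministic_pot(is_rounded):
    """Enacted law: rounded pawns go to pot A, all others to pot B."""
    return "A" if is_rounded else "B"


# Hypothetical observations as (is_rounded, pot) pairs. The counts are
# invented: they include a few "exceptions" to the enacted law.
observed = (
    [(True, "A")] * 9 + [(True, "B")] * 1      # one rounded pawn ended up in B
    + [(False, "B")] * 8 + [(False, "A")] * 2  # two angular pawns ended up in A
)


def estimate_law(observations):
    """Observed law: estimate P(pot | is_rounded) from frequencies."""
    counts = defaultdict(Counter)
    for is_rounded, pot in observations:
        counts[is_rounded][pot] += 1
    return {
        key: {pot: n / sum(c.values()) for pot, n in c.items()}
        for key, c in counts.items()
    }


def probable_pot(is_rounded, law):
    """Return the most likely pot together with its estimated probability."""
    pot = max(law[is_rounded], key=law[is_rounded].get)
    return pot, law[is_rounded][pot]


law = estimate_law(observed)
print(deterministic_pot(True))   # always "A", with no notion of doubt
print(probable_pot(True, law))   # the likeliest pot, plus how sure we are
print(probable_pot(False, law))
```

The deterministic function can only say "A" or "B". The estimated law gives the same answers here, but attaches an uncertainty to each of them, which is precisely the new component described above.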
Did you say prediction?
Because yes, the main objective of Machine Learning is to be able to predict behaviors or information
… and to predict them from data alone!
I will come back to this in a future article, but advances in mathematical knowledge, storage capacity, data integration and, quite simply, computing power now make it possible to obtain impressive results. Beyond the Machine Learning toolbox (regression algorithms, classification algorithms and other “clustering” models), we can distinguish two main types of approach: the supervised approach and the unsupervised approach.
The supervised approach applies when you have made observations and also have the corresponding results. To take a medical example, you have clinical cases and their diagnoses in front of you: the symptoms and other characteristics of your patients (the famous parameters described above) together with the observed diagnosis (the label, to use ML jargon). The unsupervised approach is more complex, because you have the clinical cases but not the corresponding diagnoses.
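Here is a small Python sketch of the difference, under heavy assumptions: the temperatures and diagnoses are invented toy values (not clinical data), the supervised part uses a nearest-neighbour lookup, and the unsupervised part uses a one-dimensional two-group clustering. Both techniques are picked only because they fit in a few lines.

```python
# Invented toy data: body temperatures in degrees Celsius with a diagnosis.
# These values are illustrative only and are not real clinical data.
labelled_cases = [
    (36.6, "healthy"), (36.9, "healthy"), (37.1, "healthy"),
    (38.6, "flu"), (39.0, "flu"), (39.4, "flu"),
]


def nearest_neighbour_diagnosis(temperature, cases):
    """Supervised: reuse the label of the closest known case."""
    _, label = min(cases, key=lambda case: abs(case[0] - temperature))
    return label


def two_means(values, iterations=10):
    """Unsupervised: split unlabelled values into two groups (1-D k-means, k=2).

    Assumes the values contain at least two distinct numbers.
    """
    low, high = min(values), max(values)  # initial group centres
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high


# Supervised: the labels are available, so a new case gets a diagnosis.
diagnosis = nearest_neighbour_diagnosis(38.9, labelled_cases)
print(diagnosis)

# Unsupervised: same measurements, labels thrown away. The algorithm can
# still find two groups, but it cannot name them.
temperatures = [t for t, _ in labelled_cases]
groups = two_means(temperatures)
print(groups)
```

In the unsupervised case the program finds that the patients fall into two groups, but it is still up to a human to decide what those groups mean.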
On the way to data computing!
The main challenge is therefore to derive from the data a law that minimizes the error rate. To maximize the quality of your Machine Learning system, you will need to increase:
- The number of data points or observations: remember that in my experiment I had to add cases before the distribution criterion became clear.
- The number of characteristics: remember too that a pawn has a shape, a size, a color, etc.
In short, we have to collect a lot of information to feed our system. Fortunately, we are in the era of Big Data.
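A rough Python sketch shows why both quantities matter. It reuses the five pawns of the experiment and adds a "small" attribute that is purely invented for illustration (the experiment never gives sizes). Each candidate rule has the form "if the characteristic is true, the pawn goes to pot A", and its error rate is measured on the observations.

```python
# The five pawns of phase 1. The "small" attribute is invented for
# illustration: the experiment does not say how big the pawns are.
pawns = [
    {"shape": "round",    "rounded": True,  "small": True,  "pot": "A"},
    {"shape": "square",   "rounded": False, "small": False, "pot": "B"},
    {"shape": "triangle", "rounded": False, "small": False, "pot": "B"},
    {"shape": "oval",     "rounded": True,  "small": False, "pot": "A"},
    {"shape": "hexagon",  "rounded": False, "small": True,  "pot": "B"},
]


def error_rate(feature, observations):
    """Share of observations misclassified by 'feature is true -> pot A'."""
    errors = sum(
        ("A" if pawn[feature] else "B") != pawn["pot"] for pawn in observations
    )
    return errors / len(observations)


def rank_rules(features, observations):
    """Error rate of each candidate rule on the given observations."""
    return {feature: error_rate(feature, observations) for feature in features}


candidates = ["rounded", "small"]

# With only the first three pawns, both rules look perfect.
few = rank_rules(candidates, pawns[:3])
print(few)

# With all five pawns, only the shape-based rule keeps a zero error rate.
more = rank_rules(candidates, pawns)
print(more)
```

With three observations the data cannot tell the two rules apart. The two extra pawns are what expose the size-based rule as wrong, and that comparison is only possible because size was recorded as a characteristic in the first place.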
As I mentioned above, this probabilistic approach will not be free of error. It makes sense: the more data we have, the more variations and errors there will be. Unfortunately, or rather fortunately, this has to be accepted. Why fortunately? Quite simply because Machine Learning is designed to handle variations and errors. And once again, this is what we human beings do naturally, every day, whenever we learn.
To borrow Roman Kacew’s words: “White and black are fed up, gray is just human”. We do not live in a rigid world, and every day we experience small variations that we unconsciously accept. So, if humans work like this, Machine Learning is undoubtedly the digital answer.
I will simply conclude with:
Doing Machine Learning is designing systems from data.
It is therefore a real revolution in the way we approach software and, above all, a concrete entry into the digital age of data!