In this article, I show you how to use a newcomer to data analysis with Python: datapre.eda.
Preparing the datasets in a Machine Learning project is an important step that should not be neglected: otherwise you risk overestimating your model's performance (overfitting) or, conversely, underfitting. In this article, we will go through the essential steps of this delicate operation.
Following up on my article on handling character strings, here is a first part that takes a progressive approach to processing this type of data. Leaving aside any semantic approach (which will be the subject of a later post), we will discuss the bag-of-words technique.
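
To give a first idea of what a bag of words looks like, here is a minimal sketch in plain Python. It is only an illustration under simple assumptions: the helper names `tokenize` and `bag_of_words` and the two sample sentences are hypothetical, tokens are just lowercase runs of letters, and word order is deliberately ignored (which is the whole point of the technique).

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, vectors), with one word-count vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    # The vocabulary is the sorted set of all distinct tokens in the corpus.
    vocabulary = sorted({token for doc in tokenized for token in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        # A Counter returns 0 for words that are absent from the document.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical sample corpus, for illustration only.
docs = ["the cat sat on the mat", "the dog sat"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)  # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors)     # [[1, 0, 1, 1, 1, 2], [0, 1, 0, 0, 1, 1]]
```

Each document becomes a vector of counts over a shared vocabulary, so two texts can be compared numerically even though nothing about their meaning is captured.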