Data prep Archives - A.I. Shelf

dataprep.eda: a newcomer in data analysis

by Benoit Cayla | August 10, 2020 | November 24, 2020

In this article I show you how to use a newcomer to data analysis in Python: dataprep.eda.

Explore your data with DataExplore

by Benoit Cayla | July 19, 2020 | November 24, 2020

In this article, discover how to use the open-source DataExplore tool to visualize and even manipulate your data.

Analyze your data with Pandas-profiling

by Benoit Cayla | February 16, 2020 | November 30, 2020

Analyze your data effortlessly with the pandas_profiling Python library.

Strings Comparison

by Benoit Cayla | February 10, 2020 | November 24, 2020

Find out in this article how to use distance algorithms and the Fuzzywuzzy library to compare strings.

Get started with Tesseract

by Benoit Cayla | December 12, 2019 | December 16, 2020

Interested in OCR? Learn how to use Tesseract (an open-source OCR engine) from the command line and from Python.

Preparing the datasets

by Benoit Cayla | May 13, 2019 | December 4, 2020

Preparing the datasets is a very important step in a Machine Learning project and should not be neglected; otherwise you risk overestimating your model (overfitting) or, conversely, underfitting. In this article we go through the essential steps of this delicate operation.

Variables correlation

by Benoit Cayla | May 6, 2019 | November 28, 2020

This article shows you how to detect relationships between the variables of your observations.

Orange Data Science Tool

by Benoit Cayla | April 27, 2019 | December 4, 2020

In this tutorial-style article, discover how this small open-source data science tool can save you a lot of time!

Bag of Words

by Benoit Cayla | October 28, 2018 | December 4, 2020

As a follow-up to my article on handling character strings, this first part takes a progressive approach to processing this type of data. Leaving aside any semantic approach (which will be the subject of a later post), we discuss here the bag-of-words technique.