Have you noticed? We hardly speak of Big Data anymore! Yet this buzzword was at the heart of the data marketing strategy of many companies and software publishers. But where do things really stand? It’s hard to imagine that the data deluge won’t happen.
Following up on my article on handling character strings, here is a first part that takes a progressive approach to processing this type of data. Leaving aside any semantic approach (which will be the subject of a later post), we will discuss the bag-of-words technique here.
If you want to take an analytical approach to your data, you have no doubt run into the difficulty of working with character strings. So much so that you have very often had to set some of them aside, whether for lack of tools or because of the complexity of handling semantics. In this article (the first in a series), we will tackle these problems and, above all, see how to solve them.
It will soon be back-to-school time. The weather was hot, the beach was pleasant, and the sand was very warm, so you are well rested and ready to go back. It is therefore the right time to review some statistical basics that will help you better understand and use Machine Learning algorithms.