I recently began diving into Machine Learning. My plan is to start with a hands-on approach and dig into theory on demand. After some research and chatting with GPT, I selected the Sentiment Analysis problem and the Naive Bayes algorithm as my first project, for the following reasons:
- Low complexity. Bayes' Theorem is not necessarily intuitive to me, but it's relatively easy to understand and still powerful enough to produce good results. The idea is to build something simple, get familiar with the tools, and test the waters. I was choosing between Naive Bayes and k-NN as my first target.
- Available datasets. For my project, I'm using the IMDB 50K Dataset, which contains a collection of positive and negative movie reviews along with their sentiment labels. There are plenty of other datasets available for training a model for your specific needs.
- Well-known problem. Sentiment Analysis has been widely used for a while now to gain measurable insights from various forms of text.
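
To make the Bayes' Theorem point a bit more concrete, here is a minimal sketch of the calculation for a single word. The function name `posterior` and all the probabilities are made up for illustration; they are not taken from the IMDB dataset, and a real Naive Bayes classifier combines many words rather than one.

```python
def posterior(prior_pos, p_word_given_pos, p_word_given_neg):
    """Return P(positive | word) using Bayes' Theorem.

    Hypothetical helper for illustration only.
    """
    prior_neg = 1.0 - prior_pos
    # Evidence: total probability of seeing the word in any review.
    evidence = p_word_given_pos * prior_pos + p_word_given_neg * prior_neg
    # Bayes' Theorem: P(pos | word) = P(word | pos) * P(pos) / P(word)
    return p_word_given_pos * prior_pos / evidence


# Toy numbers, invented for this example: half of all reviews are positive,
# and the word shows up in 20% of positive reviews but only 5% of negative ones.
p = posterior(prior_pos=0.5, p_word_given_pos=0.2, p_word_given_neg=0.05)
print(round(p, 2))
```

With these toy inputs, a word that is four times more common in positive reviews pushes the probability of "positive" well above the 50% starting point, which is the intuition the classifier relies on.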