Computer Science Sentiment Analysis Using a Naive Bayes Algorithm

A Naive Bayes classifier is not a single algorithm but a family of machine learning algorithms, all based on Bayes' theorem, that classify data. It is probabilistic and also simple to implement. Real-world uses include spam filtering, document classification, text analysis, and medical diagnosis.
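
As a brief illustration of the probabilistic idea, the standard Naive Bayes decision rule can be written as follows. The symbols are assumptions introduced here for illustration: c is a class label (for example, a sentiment), and x_1, ..., x_n are the features of one instance (for example, word counts). The "naive" part is the assumption that the features are conditionally independent given the class.

```latex
% Bayes' theorem with the conditional independence assumption
P(c \mid x_1, \dots, x_n)
  = \frac{P(c) \prod_{i=1}^{n} P(x_i \mid c)}{P(x_1, \dots, x_n)}

% The denominator does not depend on c, so the predicted class is
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```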

To perform sentiment analysis using a Naive Bayes algorithm, complete the following:

    Access the resources related to sentiment analysis, located in the topic Resources (https://www.kaggle.com/datasets/crowdflower/twitter-airline-sentiment)

Note: There are about 50 datasets that are suitable for use in a sentiment analysis task. For this part of the exercise, you must choose one of these datasets, provided it includes at least 10,000 instances.

  1. Ensure that the dataset you choose is suitable for classification using this method.
  2. You may search for data in other repositories, such as Data.gov, Kaggle, or scikit-learn.
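
One possible way to check the size requirement and basic suitability is sketched below. It is a minimal sketch, not a prescribed method: the helper `check_suitability`, the column names `text` and `label`, and the tiny demo frame are all hypothetical and stand in for whichever dataset is selected.

```python
import pandas as pd

MIN_INSTANCES = 10_000  # size threshold stated in the note above


def check_suitability(df, text_col, label_col, min_rows=MIN_INSTANCES):
    """Return basic checks for a text-classification dataset (hypothetical helper)."""
    # Rows missing either the text or the label cannot be used for training.
    usable = df.dropna(subset=[text_col, label_col])
    class_counts = usable[label_col].value_counts()
    return {
        "n_rows": len(usable),
        "enough_rows": len(usable) >= min_rows,
        "n_classes": int(class_counts.size),
        "has_multiple_classes": class_counts.size >= 2,
        "class_counts": class_counts.to_dict(),
    }


# Tiny synthetic frame, illustrative only; a real dataset would be read
# with pd.read_csv(...) and would use its own column names.
demo = pd.DataFrame({
    "text": ["great flight", "lost my bag", None, "ok service"],
    "label": ["positive", "negative", "negative", "neutral"],
})
report = check_suitability(demo, "text", "label")
print(report)
```

The class counts in the report also give a first look at class imbalance, which matters later when interpreting accuracy.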

For your selected dataset, build a classification model as follows, in Python:

  1. Explain the dataset and the type of information you wish to gain by applying a classification method.
  2. Explain the Naive Bayes algorithm and how you will be using it in your analysis (list the steps, the intuition behind the mathematical representation, and address its assumptions).
  3. Import the necessary libraries, then read the dataset into a data frame and perform initial statistical exploration.
  4. Clean the data and address unusual phenomena (e.g., normalization, feature scaling, outliers); use illustrative diagrams and plots and explain them.
  5. Formulate two questions that can be answered by applying a Naive Bayes classification method.
  6. Choose one of the Naive Bayes variants (Gaussian Naive Bayes, Multinomial Naive Bayes, or Bernoulli Naive Bayes) and explain your reasoning.
  7. Split the data into dependent and independent variables (or features and labels).
  8. Vectorize the text into numbers.
  9. Train the Naive Bayes classifier on the training set.
  10. Make classification predictions.
  11. Interpret the results in the context of the questions you asked.
  12. Validate your model using a confusion matrix, accuracy score, ROC-AUC curves, and k-fold cross validation. Then, explain the results.
  13. Include all mathematical formulas used and graphs representing the final outcomes.
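
Steps 7 through 12 above can be sketched with scikit-learn roughly as follows. This is a minimal sketch under stated assumptions, not a complete solution: the toy corpus is invented for illustration, the labels are binary (1 = positive, 0 = negative) so that a single ROC-AUC score applies, and Multinomial Naive Bayes with raw word counts is only one of the possible choices from step 6.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy corpus (assumption): replace with the selected dataset.
positive = [
    "great flight and friendly crew",
    "loved the smooth boarding",
    "comfortable seats, arrived early",
    "excellent service, thank you",
    "quick check in, happy customer",
]
negative = [
    "flight delayed again, terrible",
    "lost my luggage, awful support",
    "rude staff and dirty cabin",
    "cancelled without notice, angry",
    "worst airline, never again",
]
texts = positive * 10 + negative * 10
labels = [1] * (len(positive) * 10) + [0] * (len(negative) * 10)

# Step 7: separate features (texts) from labels, then hold out a test set.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Steps 8-9: vectorize text into word counts and train the classifier.
# The pipeline fits the vocabulary on training data only.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(X_train, y_train)

# Step 10: class predictions and positive-class probabilities.
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]

# Step 12: validation metrics.
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_score)
cv_scores = cross_val_score(model, texts, labels, cv=5, scoring="accuracy")

print("Confusion matrix:\n", cm)
print("Accuracy:", acc)
print("ROC-AUC:", auc)
print("5-fold CV accuracy:", cv_scores.mean())
```

Because the toy corpus repeats the same sentences, its scores say nothing about real performance. For a dataset with more than two sentiment classes, `roc_auc_score` needs its `multi_class` argument set (for example, `multi_class="ovr"`) and the full probability matrix rather than one column.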

From the work done above, prepare a comprehensive technical report as a Jupyter notebook, including all code, code comments, outputs, plots, and analysis. Make sure the project documentation contains:

a) Problem statement

b) Algorithm of the solution

c) Analysis of the findings

d) References
