Semester project for the course Introduction to Machine Learning (CompSci 289A) at UC Berkeley, taught by Prof. Jonathan Shewchuk, Spring 2022
Pipeline Visualization
Metrics of classifiers on test data
Confusion Matrix of best classifier, LDA
Dataset: the (MBTI) Myers-Briggs Personality Type Dataset from Kaggle, originally collected from the PersonalityCafe forum.
.
├── data
│ ├── cleaned_mbti_train.csv # Cleaned train data
│ ├── cleaned_mbti_test.csv # Cleaned test data
│ └── class16_mbti_map.npy
├── docs
│ ├── Final_Project_Report.pdf # Final Project Report
│ └── Final_Presentation_Slides.pdf # Project Presentation Slides
├── notebooks # Collection of notebooks
│ ├── bestmodels.ipynb # Best model of each classifier, with final parameters
│ └── bestmodels_include_tuning.ipynb # Best model of each classifier, with the tuning process
├── result # Results: scores, confusion matrices, plots
└── README.md