Emotion/Mood Detector using a CNN model
This project is part of the COMP 472 Artificial Intelligence course and focuses on developing, analyzing, and mitigating bias in AI models using deep learning techniques. Its ultimate goal is to detect emotional states from facial images while addressing AI bias. Our model classifies images into four emotional states: Happy, Neutral, Surprised, and Focused.
- Data Collection: Images were initially sourced from Kaggle's facial emotion recognition dataset and augmented with images scraped from Google Images using a Python tool.
- Preprocessing: Images were resized and normalized to handle inconsistencies such as varied lighting, orientation, and scale (see the preprocessing sketch after these lists).
- Model Training: We trained convolutional neural networks (CNNs), testing different architectures to optimize our classifiers (see the model and training sketches after these lists).
- Bias Mitigation: After the initial model evaluations, we applied bias mitigation techniques targeting gender and age disparities by augmenting underrepresented groups in the training data (see the balancing sketch after the results below).
- Data Preparation: Scripts for data cleaning, augmentation, and preparation for training.
- Model Definition: Contains CNN architecture definitions (CNN, CNNVariant2, CNNVariant3).
- Training Scripts: Code for training the models on the prepared datasets.
- Evaluation: Scripts for evaluating the models on test data and analyzing bias within model predictions.
- Utilities: Helper functions for metrics calculation, model evaluation, and data loading.
- CNN: Basic CNN with layers optimized for initial feature learning.
- CNNVariant2 & CNNVariant3: Enhanced CNN architectures with deeper layers and adjusted parameters to improve learning efficiency and accuracy.
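The resizing and normalization step can be expressed with standard torchvision transforms. The sketch below is illustrative only: the `data/train` directory layout, the 48x48 grayscale target size, and the normalization statistics are assumptions, not values taken from this repository.

```python
# Minimal preprocessing / data-loading sketch (paths, sizes, and stats are assumptions).
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Resize and normalize every image so the CNN sees a consistent input,
# regardless of the original lighting, orientation, or scale.
train_transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),   # facial-expression images are often grayscale
    transforms.Resize((48, 48)),                   # assumed target resolution
    transforms.RandomHorizontalFlip(),             # light augmentation
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5], std=[0.5]),   # assumed normalization statistics
])

# ImageFolder expects data/train/<class_name>/*.png; the path is hypothetical.
train_set = datasets.ImageFolder("data/train", transform=train_transform)
train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
```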
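For reference, a small emotion-classification CNN in the spirit of the models listed above might look like the sketch below. The layer sizes, dropout rate, and the name `SmallEmotionCNN` are assumptions for illustration; they are not the actual `CNN`, `CNNVariant2`, or `CNNVariant3` definitions from this repository.

```python
# Minimal CNN sketch for 1x48x48 inputs and four emotion classes (sizes are assumptions).
import torch.nn as nn

class SmallEmotionCNN(nn.Module):
    def __init__(self, num_classes: int = 4):  # Happy, Neutral, Surprised, Focused
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 48 -> 24
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 24 -> 12
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 12 * 12, 128), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```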
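Training follows a standard PyTorch loop over the prepared data. The optimizer, learning rate, and epoch count below are assumptions, and `SmallEmotionCNN` / `train_loader` refer to the hypothetical sketches above rather than this repository's actual training scripts.

```python
# Minimal training-loop sketch (hyperparameters are assumptions).
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = SmallEmotionCNN().to(device)           # hypothetical model from the sketch above
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):                        # assumed number of epochs
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:        # train_loader from the preprocessing sketch
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"epoch {epoch + 1}: mean loss = {running_loss / len(train_loader):.4f}")
```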
Models were evaluated on accuracy, precision, recall, and F1-score. Bias was analyzed specifically by comparing these metrics across demographic groups (age and gender) to ensure fairness and inclusivity (see the metrics and fairness sketches below).
- Happy Class: Precision of 89%, Recall of 90%
- Neutral Class: Precision of 85%, Recall of 82%
- Surprised Class: Precision of 78%, Recall of 80%
- Focused Class: Precision of 80%, Recall of 83%

The mitigation strategies improved representation in the dataset, notably adding 173 senior male images and adjusting the number of female images to balance gender representation.
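The per-class figures above correspond to standard scikit-learn metrics; a minimal way to produce such a report is sketched below. The label arrays are placeholders so the snippet runs on its own; in practice they would come from the trained model's predictions on the test set.

```python
# Per-class metric sketch using scikit-learn (arrays here are stand-ins, not real results).
import numpy as np
from sklearn.metrics import classification_report

class_names = ["Happy", "Neutral", "Surprised", "Focused"]

# y_true / y_pred would come from running the trained model on the test set.
y_true = np.array([0, 1, 2, 3, 0, 1, 2, 3])
y_pred = np.array([0, 1, 2, 3, 0, 1, 3, 3])

print(classification_report(y_true, y_pred, target_names=class_names))
```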
We detected and addressed potential biases by augmenting specific subsets of the data and modifying training procedures to achieve a more balanced representation across age and gender. Post-mitigation results showed improved model fairness without a significant compromise in performance.
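A fairness check of the kind described above typically computes the same metrics separately for each demographic group and compares them. The sketch below is generic: the group annotations and label arrays are hypothetical placeholders, not the project's evaluation code.

```python
# Sketch of a per-group fairness check: compute the same metrics separately per group.
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical predictions and gender annotations for a handful of test images.
y_true = np.array([0, 1, 2, 3, 0, 1, 2, 3])
y_pred = np.array([0, 1, 2, 3, 0, 1, 3, 3])
groups = np.array(["male", "female", "male", "female",
                   "female", "male", "female", "male"])

for group in np.unique(groups):
    mask = groups == group
    p, r, f1, _ = precision_recall_fscore_support(
        y_true[mask], y_pred[mask], average="macro", zero_division=0
    )
    print(f"{group}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```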
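The project's mitigation added images for underrepresented groups; a related, generic alternative is to re-weight sampling during training so rare groups are drawn more often. The sketch below uses PyTorch's `WeightedRandomSampler`; the group annotations are assumed, and this is not the repository's actual mitigation code.

```python
# Sketch of group-balanced sampling (generic technique, not the project's mitigation code).
import numpy as np
import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

# One demographic group label per training image (hypothetical annotations).
group_labels = np.array(["young_female", "senior_male", "young_male", "senior_male"])

# Weight each sample inversely to its group's frequency, so underrepresented
# groups (e.g. senior males) are drawn more often in each epoch.
unique, counts = np.unique(group_labels, return_counts=True)
freq = dict(zip(unique, counts))
weights = torch.tensor([1.0 / freq[g] for g in group_labels], dtype=torch.double)

sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
# balanced_loader = DataLoader(train_set, batch_size=64, sampler=sampler)  # train_set from the preprocessing sketch
```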
- Python, PyTorch (for model development and training)
- Scikit-learn (for performance metrics and evaluations)
- Pandas & NumPy (for data manipulation and analysis)
- torchvision (for image transformations and dataset management)
- Kunal Shah
- Imran Ahmed
- Matteo Mazzone