Multimodal Emotion Recognition System
A real-time pipeline that fuses CNN-based facial features with NLP sentiment embeddings to predict emotion.
An applied computer-vision and NLP system that predicts emotion in real time by combining a person's facial expression with the sentiment of their words.
What it does
- Extracts facial features with a CNN and text-sentiment embeddings with an NLP model, then fuses the two modalities for real-time emotion prediction.
- Applies preprocessing optimisations and hyperparameter tuning to improve cross-modal fusion performance.
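
The fusion step described above can be illustrated with a minimal sketch. It assumes feature-level fusion (normalise each modality, then concatenate) followed by a linear softmax head; the project may well use a different fusion strategy. The label set `EMOTIONS`, the vector sizes, and the helper names `l2_normalise`, `fuse`, and `predict` are all hypothetical, and random vectors stand in for real CNN and NLP outputs.

```python
import math
import random

# Hypothetical label set; the project description does not list its classes.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def l2_normalise(vec):
    """Scale a vector to unit length so neither modality dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(face_features, text_embedding):
    """Feature-level fusion: normalise each modality, then concatenate."""
    return l2_normalise(face_features) + l2_normalise(text_embedding)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(fused, weights, biases):
    """Linear classification head over the fused vector."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs


# Stand-in inputs: random vectors in place of real CNN and NLP outputs.
rng = random.Random(0)
face_features = [rng.gauss(0, 1) for _ in range(8)]
text_embedding = [rng.gauss(0, 1) for _ in range(6)]
fused = fuse(face_features, text_embedding)

# Untrained random weights, present only to make the sketch executable.
weights = [[rng.gauss(0, 0.1) for _ in fused] for _ in EMOTIONS]
biases = [0.0] * len(EMOTIONS)

label, probs = predict(fused, weights, biases)
print(label, [round(p, 3) for p in probs])
```

Normalising each modality before concatenation is one simple preprocessing choice that keeps the larger-magnitude embedding from swamping the other. In a trained system the weights would be learned jointly with, or on top of, the two encoders.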
Tech stack: Python · CNNs · NLP · computer vision