Azi Almasi

Data Scientist

Data Analyst


Attention-Based Multimodal Deep Learning for Subject-Independent Stress Detection from Wearable Signals


Project Overview

This project investigates stress detection from wearable physiological signals using deep learning and attention-based multimodal fusion. The goal is to develop a robust, subject-independent model capable of generalising across unseen individuals—a key requirement for real-world health and wellbeing applications.

The work focuses on learning how different physiological modalities contribute to stress, rather than treating all signals equally. To achieve this, I designed a multi-branch neural architecture with attention-based fusion, allowing the model to dynamically weight chest- and wrist-based signals depending on their relevance.


Data & Problem Setup

  • Dataset: WESAD (Wearable Stress and Affect Detection)

  • Signals used:

    • Chest: ECG, EDA, Respiration

    • Wrist: EDA, BVP, Skin Temperature, Accelerometer

  • Tasks:

    • Binary classification: Stress vs Non-Stress

    • Tri-class classification: Baseline vs Stress vs Amusement

  • Evaluation protocol:
    Leave-One-Subject-Out Cross-Validation (LOSO-CV) to ensure strict subject independence
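The LOSO-CV protocol above can be sketched with scikit-learn's LeaveOneGroupOut, which guarantees that every window from the held-out subject is excluded from training. The feature array, subject count, and classifier below are illustrative placeholders, not the project's actual pipeline:

```python
# Sketch of Leave-One-Subject-Out cross-validation with synthetic data.
# Feature dimensions, subject count, and the classifier are placeholders.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))            # windowed physiological features
y = rng.integers(0, 2, size=300)         # stress / non-stress labels
subjects = np.repeat(np.arange(15), 20)  # 15 subjects, 20 windows each

logo = LeaveOneGroupOut()
scores = []
for train_idx, test_idx in logo.split(X, y, groups=subjects):
    # All windows of the held-out subject stay out of the training fold,
    # so each fold measures generalisation to a genuinely unseen person.
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(f1_score(y[test_idx], clf.predict(X[test_idx])))

print(f"Folds: {logo.get_n_splits(groups=subjects)}, mean F1: {np.mean(scores):.3f}")
```

One fold per subject means per-fold scores vary widely across individuals, which is why aggregate LOSO metrics are the honest measure of subject independence.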


Model Architecture

The proposed model consists of:

  • Separate modality-specific encoders for chest and wrist signals

  • Multi-Head Attention layers for:

    • Intra-modality feature weighting

    • Cross-modality fusion

  • Modality Dropout, forcing the network to remain robust when one modality is noisy or missing

  • End-to-end training with calibrated probability outputs

This design allows the model to adaptively focus on the most informative physiological sources, rather than relying on fixed fusion rules.
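A minimal PyTorch sketch of this design is shown below. The layer sizes, per-modality feature dimensions, and the exact modality-dropout scheme are illustrative assumptions, not the project's actual architecture:

```python
# Sketch: two modality encoders, multi-head attention fusion, and modality
# dropout. Dimensions and hyperparameters are assumed for illustration.
import torch
import torch.nn as nn

class AttentionFusionNet(nn.Module):
    def __init__(self, chest_dim=3, wrist_dim=4, d_model=32, n_classes=2, p_drop=0.2):
        super().__init__()
        # Modality-specific encoders project each sensor set into a shared space.
        self.chest_enc = nn.Sequential(nn.Linear(chest_dim, d_model), nn.ReLU())
        self.wrist_enc = nn.Sequential(nn.Linear(wrist_dim, d_model), nn.ReLU())
        # Multi-head attention lets each modality token attend to the other.
        self.fusion = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Linear(d_model, n_classes)
        self.p_drop = p_drop

    def forward(self, chest, wrist):
        # Stack the two modality embeddings as a length-2 token sequence.
        tokens = torch.stack([self.chest_enc(chest), self.wrist_enc(wrist)], dim=1)
        if self.training:
            # Modality dropout: randomly zero a modality token so the network
            # stays usable when one sensor stream is noisy or missing.
            keep = (torch.rand(tokens.shape[:2], device=tokens.device) > self.p_drop)
            tokens = tokens * keep.unsqueeze(-1).float()
        fused, attn = self.fusion(tokens, tokens, tokens)
        logits = self.head(fused.mean(dim=1))
        return logits, attn  # attn exposes the learned modality weighting

model = AttentionFusionNet()
logits, attn = model(torch.randn(8, 3), torch.randn(8, 4))
print(logits.shape, attn.shape)
```

Returning the attention weights alongside the logits is what makes the fusion inspectable: averaged over a test set, they show how much the model leaned on chest versus wrist signals.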


Key Results

Binary Stress Classification (LOSO-CV)

Modality Setup           Accuracy   F1-Score   ROC-AUC
Chest + Wrist (Fusion)   85.8%      85.1%      0.94
Chest Only               78.3%      75.4%      0.99
Wrist Only               56.1%      58.1%      0.64

Insights:

  • Chest signals carry the strongest stress-related information.

  • Wrist signals alone are insufficient for reliable stress detection.

  • Attention-based multimodal fusion significantly improves robustness and overall performance.


Tri-Class Classification (Baseline / Stress / Amusement)

Modality Setup           Accuracy   Macro F1
Chest + Wrist (Fusion)   62.4%      50.2%

Tri-class classification is substantially more challenging due to overlapping physiological responses, yet the fusion model consistently outperformed unimodal alternatives.


Why This Matters

  • Demonstrates realistic, subject-independent evaluation

  • Shows how attention mechanisms improve interpretability and robustness

  • Highlights the limitations of wrist-only wearables for stress detection

  • Provides a scalable foundation for mental-health monitoring, wellbeing analytics, and digital therapeutics


Technical Stack

  • Python, PyTorch

  • NumPy, SciPy, scikit-learn

  • Attention-based deep learning

  • Advanced cross-validation & calibration strategies
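One common way to calibrate a classifier's probability outputs, as mentioned in the architecture above, is temperature scaling: fitting a single scalar on held-out logits to minimise negative log-likelihood. Whether the project uses exactly this method is an assumption; the logits below are synthetic:

```python
# Sketch of temperature scaling on synthetic, overconfident validation logits.
# The data and the use of this particular calibration method are assumptions.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp

def nll(T, logits, labels):
    # Negative log-likelihood of softmax(logits / T), computed stably.
    z = logits / T
    log_probs = z - logsumexp(z, axis=1, keepdims=True)
    return -log_probs[np.arange(len(labels)), labels].mean()

rng = np.random.default_rng(0)
logits = rng.normal(scale=4.0, size=(200, 2))  # large-scale logits: overconfident
labels = (logits[:, 1] + rng.normal(size=200) > logits[:, 0]).astype(int)

res = minimize_scalar(nll, bounds=(0.05, 10.0), args=(logits, labels),
                      method="bounded")
T = res.x
print(f"Fitted temperature: {T:.2f}")
```

Because the temperature is a single parameter fitted after training, it softens (or sharpens) confidence without changing which class the model predicts.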


Future Directions

  • Temporal attention over longer physiological contexts

  • Domain adaptation for real-world wearable noise

  • Extension to anxiety, cognitive load, and affective state detection

  • Deployment-oriented lightweight models for mobile and edge devices