TITLE:
Wearable-Inspired Panic Episode Forecasting with Synthetic Physiological Time Series: A Feature Engineered Gradient Boosting Baseline with Clinically Motivated Thresholding
AUTHORS:
Rocco de Filippis, Abdullah Al Foysal
KEYWORDS:
Panic Disorder, Early Warning Systems, Wearable Sensors, Electrodermal Activity, Heart Rate Variability, Rare-Event Prediction, Threshold Optimization, Synthetic Data, Machine Learning
JOURNAL NAME:
Open Access Library Journal, Vol.13 No.3, March 13, 2026
ABSTRACT: Background: Panic attacks can present rapidly and unpredictably, yet wearable sensors (heart rate, electrodermal activity, respiration, movement) offer a path to continuous monitoring and potentially actionable early warnings. However, developing and validating forecasting pipelines is difficult because of limited labelled datasets, heterogeneous symptom profiles, and ethical constraints on real-world data collection. Objective: We propose a fully reproducible synthetic-data framework that simulates circadian physiology and panic-episode dynamics, then evaluates classical machine-learning baselines for multi-class warning prediction (Normal, Early Warning, Urgent Warning) and for binary panic detection (any warning vs Normal). Methods: We generated 60,000 minute-level samples with circadian rhythms and injected panic episodes with either sudden or gradual onset, each with a severity trajectory and phase-specific physiological shifts. We engineered 125 features (rolling statistics, slopes, rate-of-change, circadian z-scores, interaction terms, and composite arousal/propensity indices) and trained models on 30 selected features. Class imbalance was addressed by applying SMOTE-Tomek to the training split only. We compared Random Forest, Gradient Boosting, and Logistic Regression; the best model was selected by panic-detection F1. We further optimized decision thresholds for clinical deployment trade-offs (alarm burden vs detection). Results: The dataset exhibited extreme class imbalance (Normal ≈ 99.2%, Early ≈ 0.4%, Urgent ≈ 0.4%). Gradient Boosting achieved overall accuracy 0.993 and weighted F1 0.994, but these figures largely reflect the majority class; the more informative binary panic-detection metric reached F1 0.641 with recall 0.787. Discrimination remained strong for the Normal class (AP ≈ 1.00; AUC ≈ 0.995), while minority-class precision-recall performance degraded, as expected in rare-event forecasting. Threshold optimization identified an operational “clinical” threshold near 0.50, yielding ≈ 16.6 alarms/day with recall ≈ 0.809. Temporal analysis indicated stable accuracy across hours of the day, with variable detection rates. Conclusions: A feature-engineered Gradient Boosting baseline can produce operational early-warning signals from wearable-like streams under controlled synthetic assumptions, and thresholding meaningfully tunes clinical alarm burden. The study is a proof-of-concept: results are constrained by synthetic label rules, possible accounting inconsistencies in episode generation, and a lack of subject-level personalization. Real-world validation with calibrated probabilities and prospective evaluation is necessary before any clinical claims can be made.
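The alarm-burden versus detection trade-off described in the abstract can be illustrated with a minimal threshold sweep. The sketch below is not the authors' pipeline: the function names `sweep_thresholds` and `pick_threshold`, the toy labels and scores, and the definition of "alarms per day" as the number of alarmed minutes divided by the number of simulated days are all assumptions made for illustration. The paper may count alarms differently (for example, per episode).

```python
# Illustrative sketch only; names and toy data are hypothetical, not from the paper.
MINUTES_PER_DAY = 1440  # minute-level samples, as in the abstract


def sweep_thresholds(y_true, p_panic, thresholds, minutes_per_day=MINUTES_PER_DAY):
    """For each threshold, report recall, precision and alarmed minutes per day.

    y_true:  list of 0/1 labels (1 = any warning, 0 = Normal)
    p_panic: list of predicted panic probabilities, same length as y_true
    """
    n_days = len(y_true) / minutes_per_day
    n_pos = sum(y_true)
    rows = []
    for t in thresholds:
        alarms = [p >= t for p in p_panic]
        tp = sum(1 for a, y in zip(alarms, y_true) if a and y == 1)
        fp = sum(1 for a, y in zip(alarms, y_true) if a and y == 0)
        rows.append({
            "threshold": t,
            "recall": tp / n_pos if n_pos else 0.0,
            "precision": tp / (tp + fp) if (tp + fp) else 0.0,
            # Assumption: one alarm per alarmed minute.
            "alarms_per_day": (tp + fp) / n_days,
        })
    return rows


def pick_threshold(rows, max_alarms_per_day):
    """Highest-recall threshold within the alarm budget (ties: higher threshold)."""
    feasible = [r for r in rows if r["alarms_per_day"] <= max_alarms_per_day]
    if not feasible:
        return None
    return max(feasible, key=lambda r: (r["recall"], r["threshold"]))


# Toy data: two simulated days, 10 positive minutes, hand-set scores.
y_true = [0] * (2 * MINUTES_PER_DAY)
p_panic = [0.1] * (2 * MINUTES_PER_DAY)
for i in range(100, 110):
    y_true[i] = 1
    p_panic[i] = 0.9 if i < 108 else 0.4   # 8 confident hits, 2 weak ones
for i in range(500, 520):
    p_panic[i] = 0.6                        # 20 moderately scored false alarms
for i in range(1000, 1040):
    p_panic[i] = 0.3                        # 40 weakly scored false alarms

rows = sweep_thresholds(y_true, p_panic, [0.25, 0.5, 0.75])
for r in rows:
    print(r)
print(pick_threshold(rows, max_alarms_per_day=16))
```

On real model output, `p_panic` would come from a held-out split rather than hand-set values, and the alarm budget would be a clinical decision rather than a fixed constant.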