TITLE:
Early Alzheimer’s Disease Detection from Short Speech Samples Using Lightweight, Interpretable Linguistic Markers
AUTHORS:
Rocco de Filippis, Abdullah Al Foysal
KEYWORDS:
Alzheimer’s Disease, Speech-Based Detection, Linguistic Biomarkers, Machine Learning, Explainable AI, Cognitive Decline Monitoring
JOURNAL NAME:
Open Access Library Journal, Vol.12 No.12, December 25, 2025
ABSTRACT: Early detection of Alzheimer’s disease (AD) is critical for intervention and monitoring. Spontaneous speech is a rich behavioural signal of cognitive decline, yet many machine-learning pipelines rely on heavy neural models that are difficult to interpret and deploy clinically. In this study, we present a lightweight and transparent classification pipeline that uses short narrative speech samples. All analyses were conducted on synthetically generated speech transcripts designed to emulate linguistic patterns reported in early Alzheimer’s disease. Features include lexical complexity, disfluency rates, pronoun usage, readability, idea density, and common function-word statistics, modelled using a regularized linear classifier. On the provided validation split, performance is near-perfect: ROC AUC = 1.000, Average Precision = 1.000, and 100% accuracy at a 0.5 probability threshold. The ROC and Precision-Recall curves show a classifier that cleanly separates AD from controls, and the confusion matrix confirms zero false positives and zero false negatives (55 vs. 55 per class). The calibration curve indicates that predicted probabilities remain close to observed frequencies, and t-SNE visualization shows clear cluster separation between AD and control participants in the fused feature space. Interpretability analyses reveal consistent clinical patterns: pauses per sentence, fillers per sentence, pronoun ratio, and reduced readability are the strongest predictors of early AD, while longer sentences, higher idea density, and higher content-word ratio are characteristic of healthy controls. Permutation importance confirms that single keywords contribute minimally, suggesting that the model relies on broader linguistic behaviour rather than dataset artifacts. The distribution of coefficients is sparse, with a few strong drivers and many near-zero weights, which is well suited to a clinically interpretable system.
Given the unusually high performance, these results are interpreted cautiously: the study analyses potential leakage channels, provides a set of leakage checks, and outlines an external validation plan to confirm that the signal is genuine. With proper safeguards, this low-cost pipeline could support clinical screening and longitudinal monitoring of cognitive decline.
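To make the kind of feature named in the abstract concrete, the following is a minimal sketch of how a few of the markers (pauses per sentence, fillers per sentence, pronoun ratio, and a simple lexical-diversity measure) could be computed from a transcript. It is not the authors' implementation: the word lists, the `[pause]` annotation convention, and the function name `extract_features` are all assumptions made for illustration, and readability and idea density are omitted.

```python
import re

# Hypothetical word lists and pause marker; the paper does not specify its own.
FILLERS = {"um", "uh", "er", "erm", "hmm"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they",
            "me", "him", "her", "us", "them"}
PAUSE_MARKER = "[pause]"  # assumed transcript annotation for a silent pause


def extract_features(transcript: str) -> dict:
    """Compute a few interpretable linguistic markers from one transcript."""
    text = transcript.lower()

    # Count annotated pauses, then remove the markers before tokenizing.
    n_pauses = text.count(PAUSE_MARKER)
    text = text.replace(PAUSE_MARKER, " ")

    # Naive sentence split on terminal punctuation; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text)

    # Guard against division by zero for empty transcripts.
    n_sent = max(len(sentences), 1)
    n_tok = max(len(tokens), 1)

    n_fillers = sum(t in FILLERS for t in tokens)
    n_pronouns = sum(t in PRONOUNS for t in tokens)

    return {
        "pauses_per_sentence": n_pauses / n_sent,
        "fillers_per_sentence": n_fillers / n_sent,
        "pronoun_ratio": n_pronouns / n_tok,
        "type_token_ratio": len(set(tokens)) / n_tok,
        "mean_sentence_length": len(tokens) / n_sent,
    }


if __name__ == "__main__":
    sample = ("The boy um takes the cookie. "
              "[pause] She uh sees it [pause] and it falls.")
    for name, value in extract_features(sample).items():
        print(f"{name}: {value:.3f}")
```

A feature dictionary of this form could then be passed to a regularized linear classifier, so that each learned coefficient maps directly to one named marker. This one-to-one mapping is the property the abstract relies on for interpretability.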