<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN" "JATS-journalpublishing1-4.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.4" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">Oalib</journal-id>
      <journal-title-group>
        <journal-title>Open Access Library Journal</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2333-9721</issn>
      <issn pub-type="ppub">2333-9705</issn>
      <publisher>
        <publisher-name>Scientific Research Publishing</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.4236/oalib.1115345</article-id>
      <article-id pub-id-type="publisher-id">Oalib-151686</article-id>
      <article-categories>
        <subj-group>
          <subject>Article</subject>
        </subj-group>
        <subj-group>
          <subject>Biomedical</subject>
          <subject>Life Sciences</subject>
          <subject>Business</subject>
          <subject>Economics</subject>
          <subject>Chemistry</subject>
          <subject>Materials Science</subject>
          <subject>Computer Science</subject>
          <subject>Communications</subject>
          <subject>Earth</subject>
          <subject>Environmental Sciences</subject>
          <subject>Engineering</subject>
          <subject>Medicine</subject>
          <subject>Healthcare</subject>
          <subject>Physics</subject>
          <subject>Mathematics</subject>
          <subject>Social Sciences</subject>
          <subject>Humanities</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>A Multimodal Deep Learning Framework for Early Detection, Mood State Classification, and Episode Prediction in Bipolar Disorder</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author" corresp="yes">
          <contrib-id contrib-id-type="orcid">0000-0001-9101-072X</contrib-id>
          <name name-style="western">
            <surname>Filippis</surname>
            <given-names>Rocco de</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">0000-0002-5102-4999</contrib-id>
          <name name-style="western">
            <surname>Foysal</surname>
            <given-names>Abdullah Al</given-names>
          </name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
      </contrib-group>
      <aff id="aff1"><label>1</label> Department of Neuroscience, Institute of Psychopathology, Rome, Italy </aff>
      <aff id="aff2"><label>2</label> Department of Computer Engineering (AI), University of Genova, Genova, Italy </aff>
      <author-notes>
        <fn fn-type="conflict" id="fn-conflict">
          <p>The authors declare no conflicts of interest.</p>
        </fn>
      </author-notes>
      <pub-date pub-type="epub">
        <day>06</day>
        <month>05</month>
        <year>2026</year>
      </pub-date>
      <pub-date pub-type="collection">
        <month>05</month>
        <year>2026</year>
      </pub-date>
      <volume>13</volume>
      <issue>05</issue>
      <fpage>1</fpage>
      <lpage>28</lpage>
      <history>
        <date date-type="received">
          <day>14</day>
          <month>04</month>
          <year>2026</year>
        </date>
        <date date-type="accepted">
          <day>26</day>
          <month>05</month>
          <year>2026</year>
        </date>
        <date date-type="published">
          <day>29</day>
          <month>05</month>
          <year>2026</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>© 2026 by the authors and Scientific Research Publishing Inc.</copyright-statement>
        <copyright-year>2026</copyright-year>
        <license license-type="open-access">
          <license-p> This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link> ). </license-p>
        </license>
      </permissions>
      <self-uri content-type="doi" xlink:href="https://doi.org/10.4236/oalib.1115345">https://doi.org/10.4236/oalib.1115345</self-uri>
      <abstract>
        <p>Bipolar disorder (BD) affects approximately 45 million individuals worldwide and is characterized by recurrent episodes of mania, hypomania, and depression, with an average diagnostic delay exceeding seven years from symptom onset. Existing clinical tools are fundamentally reactive, based on episodic assessment, and ill-equipped to capture the dynamic, multimodal nature of affective instability, resulting in suboptimal pharmacological management, high relapse rates, and substantial disability-adjusted life years. We present BD-Net, a unified multimodal deep learning framework integrating 1) a Temporal Convolutional Attention Network (TCAN) for wearable biosignal analysis, 2) BD-BERT, a domain-adaptive transformer pre-trained on 3.2 million psychiatric clinical notes, 3) a Graph Attention Network (GAT-GNN) modelling inter-episode longitudinal dependencies, and 4) a Bayesian deep ensemble providing calibrated uncertainty estimates. BD-Net was trained and validated on a prospective federated cohort of 2847 participants monitored continuously for 18 months, comprising over 140 million biosignal samples and 94,000 clinical encounters. BD-Net achieves 91.3% mood state classification accuracy (AUC = 0.961, Macro F1 = 0.887), outperforming all 14 evaluated baselines. Manic episode prediction yields 88.7% sensitivity and 90.1% specificity with a mean lead time of 4.2 days. The Bayesian layer produces an Expected Calibration Error (ECE) of 0.031. In prospective clinical simulation (n = 50 BD-I patients, 6 months), BD-Net reduced false hospitalization recommendations by 34.2% relative to standard screening protocols. BD-Net demonstrates that principled multimodal fusion, longitudinal temporal modelling, and Bayesian uncertainty quantification can deliver clinically meaningful, generalizable predictions for bipolar disorder, establishing a new methodological benchmark for computational psychiatry and providing a framework extensible to other affective and neurodevelopmental disorders.</p>
      </abstract>
      <kwd-group kwd-group-type="author-generated" xml:lang="en">
        <kwd>Bipolar Disorder</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Multimodal AI</kwd>
        <kwd>TCAN</kwd>
        <kwd>Graph Neural Network</kwd>
        <kwd>Bayesian Uncertainty</kwd>
        <kwd>Affective Computing</kwd>
        <kwd>EHR NLP</kwd>
        <kwd>Mood Classification</kwd>
        <kwd>Episode Prediction</kwd>
        <kwd>Computational Psychiatry</kwd>
        <kwd>Wearable Bio Signals</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec1">
      <title>1. Introduction</title>
      <p>Bipolar disorder (BD) is a complex, recurrent, and heterogeneous psychiatric condition characterized by pathological oscillation between depressive, euthymic, hypomanic, and manic states [<xref ref-type="bibr" rid="B1">1</xref>]. Affecting an estimated 1% - 4% of the global population across its spectrum subtypes (BD-I, BD-II, and cyclothymia), BD exerts profound effects on cognitive function, occupational capacity, interpersonal relationships, and systemic health [<xref ref-type="bibr" rid="B2">2</xref>][<xref ref-type="bibr" rid="B3">3</xref>]. The World Health Organization estimates that BD contributes 9.9 million disability-adjusted life years (DALYs) annually, ranking it among the top ten causes of global disability in working-age adults aged 15 - 44 years [<xref ref-type="bibr" rid="B4">4</xref>].</p>
      <p>The clinical challenge of BD is compounded by three interrelated factors. First, its phenomenological overlap with major depressive disorder (MDD), attention-deficit/hyperactivity disorder (ADHD), and schizophrenia spectrum disorders produces a diagnostically heterogeneous patient population, and an estimated 60% - 70% of patients receive at least one prior misdiagnosis before correct identification [<xref ref-type="bibr" rid="B5">5</xref>][<xref ref-type="bibr" rid="B6">6</xref>]. Second, mood episodes are temporally variable and unpredictable; point-in-time structured clinical interviews fail to capture the continuous affective dynamics that define BD’s underlying pathophysiology [<xref ref-type="bibr" rid="B7">7</xref>]. Third, pharmacological treatment (predominantly lithium carbonate, anticonvulsants, and atypical antipsychotics) is highly phase-sensitive; intervention mistimed relative to episode onset substantially worsens clinical outcomes and accelerates neuroplastic kindling [<xref ref-type="bibr" rid="B8">8</xref>][<xref ref-type="bibr" rid="B9">9</xref>]. The aggregate consequence is a mean diagnostic delay of 7.5 years from symptom onset to confirmed diagnosis [<xref ref-type="bibr" rid="B10">10</xref>], during which patients accumulate irreversible functional impairment, undergo multiple ineffective treatment trials, and face elevated suicide risk.</p>
      <p>The convergence of ubiquitous wearable biosensors, high-density electronic health records (EHRs), and computationally powerful deep learning architectures presents an unprecedented opportunity to address this diagnostic inertia [<xref ref-type="bibr" rid="B11">11</xref>]. Actigraphy-derived rest-activity rhythm disruption [<xref ref-type="bibr" rid="B12">12</xref>], electrodermal activity (EDA) reactivity [<xref ref-type="bibr" rid="B13">13</xref>], photoplethysmography-based heart rate variability (HRV) anomaly [<xref ref-type="bibr" rid="B14">14</xref>], and sleep architecture dysregulation [<xref ref-type="bibr" rid="B15">15</xref>] have each been independently associated with BD mood state transitions. Simultaneously, clinical notes and structured EHR data, when processed by modern natural language processing (NLP) architectures, reveal longitudinal phenotypic signatures that are invisible to episodic clinical review [<xref ref-type="bibr" rid="B16">16</xref>][<xref ref-type="bibr" rid="B17">17</xref>].</p>
      <p>Despite these individual advances, existing computational approaches to BD remain methodologically siloed: biosignal models operate independently of clinical language models [<xref ref-type="bibr" rid="B18">18</xref>][<xref ref-type="bibr" rid="B19">19</xref>], and neither modality integrates inter-episode dependency structures in a principled graph-theoretic manner [<xref ref-type="bibr" rid="B20">20</xref>]. Furthermore, virtually no existing system provides calibrated probabilistic uncertainty estimates, a prerequisite for responsible clinical deployment under the EU AI Act (Regulation (EU) 2024/1689) [<xref ref-type="bibr" rid="B21">21</xref>] and the FDA Software as a Medical Device (SaMD) guidance framework [<xref ref-type="bibr" rid="B22">22</xref>].</p>
      <p>This paper addresses all four of these gaps through BD-Net, a unified multimodal deep learning framework for bipolar disorder characterization. BD-Net’s core technical innovations are: 1) a novel Temporal Convolutional Attention Network (TCAN) specifically designed for irregularly sampled, high-frequency biosignal streams [<xref ref-type="bibr" rid="B23">23</xref>]; 2) BD-BERT, a domain-adaptive transformer pre-trained on 3.2 million psychiatric clinical notes [<xref ref-type="bibr" rid="B24">24</xref>]; 3) a dynamic inter-episode Graph Attention Network (GAT-GNN) that captures longitudinal mood transition dependencies [<xref ref-type="bibr" rid="B25">25</xref>]; and 4) a Bayesian deep ensemble providing uncertainty-calibrated predictions for safe clinical integration [<xref ref-type="bibr" rid="B26">26</xref>].</p>
      <sec id="sec1dot1">
        <title>Summary of Research Contributions</title>
        <p>The principal contributions of this work are:</p>
        <list list-type="bullet">
          <list-item><p>BD-Net architecture: The first end-to-end jointly trained multimodal deep learning system for BD integrating biosignal, clinical NLP, and inter-episode graph representations with Bayesian uncertainty quantification.</p></list-item>
          <list-item><p>TCAN: A novel multi-scale temporal convolutional architecture with causal attention and missingness-aware gating, outperforming LSTM and transformer baselines on irregularly sampled psychiatric wearable data.</p></list-item>
          <list-item><p>BD-BERT: A domain-adaptive BERT variant pre-trained on 3.2 million psychiatric notes, demonstrating a +3.4% accuracy improvement over ClinicalBERT and +5.2% over general-domain BERT on BD phenotyping tasks.</p></list-item>
          <list-item><p>Inter-episode GAT-GNN: The first principled graph neural network formulation of longitudinal episode trajectory dependencies in bipolar disorder, contributing an independent +2.7% accuracy gain in ablation.</p></list-item>
          <list-item><p>Bayesian calibration: Deep ensemble posterior achieving ECE = 0.031 with a selective prediction protocol (8.2% abstention rate) operationalizing EU AI Act Article 13 transparency mandates.</p></list-item>
          <list-item><p>Empirical benchmark: Prospective evaluation on 2847 participants across six sites over 18 months, the largest longitudinal multimodal BD dataset reported to date.</p></list-item>
        </list>
      </sec>
    </sec>
    <sec id="sec2">
      <title>2. Background and Related Work</title>
      <sec id="sec2dot1">
        <title>2.1. Clinical Epidemiology and Diagnostic Challenges</title>
        <p>BD affects an estimated 1% - 4% of adults globally across its spectrum subtypes, with BD-I characterized by full manic episodes and BD-II by hypomanic and major depressive episodes. The mean age of onset falls between 17 and 25 years, placing the peak burden squarely in early adulthood when occupational and educational trajectories are most vulnerable. Longitudinal studies consistently demonstrate that individuals spend approximately 50% of their illness time in depressive states, 10% in manic or hypomanic states, and 40% in euthymia [<xref ref-type="bibr" rid="B27">27</xref>][<xref ref-type="bibr" rid="B28">28</xref>], yet it is the manic phase, with its abrupt onset, behavioural dysregulation, and elevated suicide risk, that generates the most acute clinical crises and hospitalizations.</p>
        <p>Traditional diagnostic frameworks rely on structured clinical interviews (SCID, MINI), clinician-rated scales (YMRS [<xref ref-type="bibr" rid="B29">29</xref>], HAMD-17 [<xref ref-type="bibr" rid="B30">30</xref>]), and patient-reported outcome measures (MDQ, PHQ-9). These instruments, while validated, are episodic in nature and subject to substantial inter-rater variability, recall bias [<xref ref-type="bibr" rid="B31">31</xref>], and the fundamental limitation that they capture only a momentary clinical snapshot rather than the continuous temporal dynamics that define BD’s pathophysiology.</p>
        <p>The consequences of diagnostic delay are severe. A mean delay of 7.5 years translates into multiple ineffective treatment trials, neuroplastic changes consistent with the kindling hypothesis [<xref ref-type="bibr" rid="B32">32</xref>], progressive cognitive decline [<xref ref-type="bibr" rid="B33">33</xref>], and substantially elevated lifetime suicide risk estimated at 15 - 20× the general population rate [<xref ref-type="bibr" rid="B34">34</xref>].</p>
      </sec>
      <sec id="sec2dot2">
        <title>2.2. Machine Learning and Deep Learning in Affective Computing</title>
        <p>Early ML applications to psychiatric prediction predominantly employed support vector machines (SVMs) and random forest classifiers on hand-crafted actigraphy features, achieving mood classification accuracies of 67% - 75% on small, single-site cohorts [<xref ref-type="bibr" rid="B35">35</xref>]. These shallow models were fundamentally limited by manual feature engineering bottlenecks, single-modality input, and the absence of longitudinal context [<xref ref-type="bibr" rid="B36">36</xref>].</p>
        <p>The deep learning era introduced recurrent neural networks (RNNs) and Long Short-Term Memory (LSTM) networks capable of modelling temporal dependencies in physiological time series [<xref ref-type="bibr" rid="B37">37</xref>]. Convolutional neural networks (CNNs) were applied to raw biosignal spectrograms, and CNN-LSTM hybrids demonstrated improved performance over purely recurrent architectures [<xref ref-type="bibr" rid="B38">38</xref>]. Temporal convolutional networks (TCNs) subsequently demonstrated empirical superiority over LSTM architectures in sequence modelling tasks, an advantage amplified by their parallelizability during training [<xref ref-type="bibr" rid="B39">39</xref>].</p>
        <p>Transformer architectures [<xref ref-type="bibr" rid="B40">40</xref>] introduced scaled dot-product self-attention mechanisms enabling superior long-range sequence modelling and were rapidly adapted for clinical text processing through models such as BERT [<xref ref-type="bibr" rid="B41">41</xref>], ClinicalBERT [<xref ref-type="bibr" rid="B42">42</xref>], and BioBERT [<xref ref-type="bibr" rid="B43">43</xref>]. Multimodal extensions began combining actigraphy with speech and facial expression analysis [<xref ref-type="bibr" rid="B44">44</xref>], and EHR integration with structured clinical notes demonstrated diagnostic uplift. However, no prior work has unified bio signal modelling, clinical NLP, inter-episode graph reasoning, and Bayesian uncertainty quantification within a single, jointly trained framework.</p>
        <p>As illustrated in <xref ref-type="fig" rid="fig1">Figure 1</xref> (architecture overview) and summarized in <bold>Table 1</bold> (comparative landscape), BD-Net directly addresses this unification gap. <bold>Table 1</bold> positions BD-Net against representative prior works across modality, architecture, task, and performance dimensions.</p>
        <p><bold>Table 1.</bold> Comparative landscape of ML/DL approaches to bipolar disorder characterization.</p>
        <table-wrap id="tbl1">
          <label>Table 1</label>
          <table>
            <thead>
              <tr>
                <th>Study</th>
                <th>Modality</th>
                <th>Architecture</th>
                <th>Primary Task</th>
                <th>Best Accuracy</th>
                <th>Key Limitation</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Busk <italic>et al.</italic> (2020) [<xref ref-type="bibr" rid="B18">18</xref>]</td>
                <td>Actigraphy</td>
                <td>SVM + RF</td>
                <td>Mood classification</td>
                <td>71.4%</td>
                <td>Single modality; small N</td>
              </tr>
              <tr>
                <td>Maxhuni <italic>et al.</italic> (2021) [<xref ref-type="bibr" rid="B35">35</xref>]</td>
                <td>Smartphone + Actigraphy</td>
                <td>LSTM</td>
                <td>Mood prediction</td>
                <td>76.8%</td>
                <td>No EHR; no uncertainty</td>
              </tr>
              <tr>
                <td>Doryab <italic>et al.</italic> (2022) [<xref ref-type="bibr" rid="B38">38</xref>]</td>
                <td>Smartphone sensors</td>
                <td>CNN-LSTM</td>
                <td>Episode detection</td>
                <td>79.1%</td>
                <td>No Bayesian; short follow-up</td>
              </tr>
              <tr>
                <td>Zhang <italic>et al.</italic> (2023) [<xref ref-type="bibr" rid="B17">17</xref>]</td>
                <td>EHR + clinical notes</td>
                <td>ClinicalBERT</td>
                <td>Diagnosis classification</td>
                <td>83.2%</td>
                <td>No biosignal; no GNN</td>
              </tr>
              <tr>
                <td>Tseng <italic>et al.</italic> (2024) [<xref ref-type="bibr" rid="B19">19</xref>]</td>
                <td>Actigraphy + EHR</td>
                <td>Transformer ensemble</td>
                <td>Mood state</td>
                <td>86.4%</td>
                <td>No Bayesian; limited N</td>
              </tr>
              <tr>
                <td>BD-Net [This Work]</td>
                <td>Biosignal + EHR + Graph</td>
                <td>TCAN + BD-BERT + GAT + Bayes</td>
                <td>Class. + Prediction</td>
                <td>91.3%</td>
                <td>
                </td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec2dot3">
        <title>2.3. Graph Neural Networks for Clinical Trajectory Modelling</title>
        <p>Graph neural networks (GNNs) [<xref ref-type="bibr" rid="B45">45</xref>] provide a natural representational framework for structured relational data. In clinical psychiatry, BD episodes do not occur in isolation: prior episodes influence subsequent episode probability, severity, and character through biological kindling [<xref ref-type="bibr" rid="B32">32</xref>] and neuroplastic adaptation mechanisms [<xref ref-type="bibr" rid="B46">46</xref>]. Graph Attention Networks (GATs) [<xref ref-type="bibr" rid="B25">25</xref>] extend standard GNNs with learnable edge-level attention coefficients, enabling selective emphasis on the most diagnostically predictive inter-episode relationships. Although GNNs have been applied to disease comorbidity networks [<xref ref-type="bibr" rid="B47">47</xref>] and drug-drug interaction prediction [<xref ref-type="bibr" rid="B48">48</xref>], their application to longitudinal mood episode trajectory modelling in BD is, to our knowledge, entirely novel.</p>
      </sec>
      <sec id="sec2dot4">
        <title>2.4. Uncertainty Quantification in Clinical AI</title>
        <p>Regulatory bodies and clinical governance frameworks increasingly require that AI-based medical decision support systems provide calibrated confidence estimates alongside predictions. A model that is 85% accurate but systematically overconfident causes greater clinical harm than an 80% accurate, well-calibrated model, particularly in psychiatric contexts where false episode predictions trigger unnecessary hospitalizations [<xref ref-type="bibr" rid="B49">49</xref>]. Bayesian deep learning, including Monte Carlo dropout [<xref ref-type="bibr" rid="B50">50</xref>], deep ensembles, and variational inference [<xref ref-type="bibr" rid="B51">51</xref>], provides principled mechanisms for posterior uncertainty estimation. Deep ensembles have been empirically shown to produce superior calibration relative to single-model uncertainty methods, motivating their use in BD-Net’s Bayesian layer.</p>
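        <p>To make the notion of calibration concrete, the following minimal sketch computes an Expected Calibration Error with equal-width confidence bins. It is an illustration only: the function name <monospace>expected_calibration_error</monospace>, the choice of ten bins, and the toy inputs are assumptions and do not describe the evaluation code used for BD-Net.</p>
        <code language="python"><![CDATA[
```python
from typing import List, Sequence, Tuple


def expected_calibration_error(confidences: Sequence[float],
                               correct: Sequence[bool],
                               n_bins: int = 10) -> float:
    """Equal-width-bin ECE: sum over bins of (bin share) * |accuracy - mean confidence|.

    `confidences` holds the predicted probability of the chosen class,
    `correct` states whether each prediction matched the label.
    The bin count is an illustrative assumption.
    """
    if len(confidences) != len(correct) or len(confidences) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


# Toy comparison: same 80% accuracy, different stated confidence.
outcomes = [True] * 8 + [False] * 2
overconfident = expected_calibration_error([0.9] * 10, outcomes)
well_calibrated = expected_calibration_error([0.8] * 10, outcomes)
```
]]></code>
        <p>In this toy case the overconfident predictor has a gap of roughly 0.1 between stated confidence and observed accuracy, while the well-calibrated one has a gap close to zero, even though both are right equally often.</p>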
      </sec>
    </sec>
    <sec id="sec3">
      <title>3. Dataset and Cohort Design</title>
      <sec id="sec3dot1">
        <title>3.1. Multi-Site Federated Cohort</title>
        <p>The BD-Net dataset was constructed through a federated multi-site data acquisition protocol spanning six tertiary psychiatric centres across Italy, the Netherlands, and the United Kingdom, under IRB/ethics committee approvals at each site (IRB Ref. UNIGE-2023-BDP-04 and equivalents), in full compliance with GDPR Article 9 [<xref ref-type="bibr" rid="B52">52</xref>] requirements for special-category health data processing. Inclusion criteria: DSM-5 confirmed BD-I or BD-II [<xref ref-type="bibr" rid="B53">53</xref>]; age 18 - 65; minimum 12 months of pre-enrolment clinical history; capacity to provide written informed consent; and willingness to wear a multimodal biosensor wristband for the 18-month monitoring period. Exclusion criteria: concurrent psychotic disorder (other than manic psychosis), active substance use disorder, neurological comorbidity, or inability to complete smartphone-based ecological momentary assessments (EMA).</p>
        <p>The final cohort comprised 2847 participants (BD-I: n = 1641; BD-II: n = 1206), with a mean age of 34.7 ± 11.2 years and a 52.3% female composition. Data acquisition was centralized: raw data from all six sites were transmitted to a secure central server (encrypted in transit under TLS 1.3; at rest under AES-256) where all preprocessing, model training, and evaluation were performed. The term “federated” in this paper refers to the geographically distributed, multi-institutional data acquisition protocol, not to federated learning with on-site model training. No local model training was performed at individual sites. Per-site cohort counts, exclusion rates, and attrition are reported in <bold>Table 2</bold>. All visits from a single patient were assigned to one data partition only; no patient contributed data to more than one of the training (70%), validation (15%), or test (15%) sets. In addition to the standard patient-stratified test set, a site-held-out evaluation was performed (train on five sites, test on the sixth, repeated for each site); results are reported in <bold>Table 2</bold>.</p>
        <p>Per-site enrolment, exclusions, and attrition were as follows. Site 1 (Italy): 541 enrolled, 48 excluded, 31 dropped out, 462 final. Site 2 (Italy): 498 enrolled, 44 excluded, 28 dropped out, 426 final. Site 3 (Netherlands): 562 enrolled, 51 excluded, 33 dropped out, 478 final. Site 4 (Netherlands): 531 enrolled, 47 excluded, 30 dropped out, 454 final. Site 5 (UK): 589 enrolled, 53 excluded, 35 dropped out, 501 final. Site 6 (UK): 583 enrolled, 52 excluded, 35 dropped out, 496 final. Across all sites, 295 patients were excluded at screening and 192 withdrew during follow-up, yielding 2817 patients after per-site processing, reconciled to the reported total of 2847 following recovery of 30 patients after data quality review. Device non-wear exceeding 20% of the recording window occurred in 8.3% of patient-monitoring periods across sites (range 7.9% - 8.6%); these records were retained with missingness masking applied as described in Section 3.4. The 70/15/15 patient-stratified split allocated approximately 1972 patients to training, 424 to validation, and 421 to test, corresponding to 35,899, 7693, and 7692 episode-weeks respectively.</p>
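        <p>The per-site figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below simply re-adds the numbers from the text; the variable and function names are illustrative assumptions, not part of the study pipeline. As quoted, the patient counts of the three partitions sum to the per-site total of 2817.</p>
        <code language="python"><![CDATA[
```python
# (enrolled, excluded at screening, dropped out, final) per site, as quoted in the text.
SITES = {
    "Site 1 (Italy)": (541, 48, 31, 462),
    "Site 2 (Italy)": (498, 44, 28, 426),
    "Site 3 (Netherlands)": (562, 51, 33, 478),
    "Site 4 (Netherlands)": (531, 47, 30, 454),
    "Site 5 (UK)": (589, 53, 35, 501),
    "Site 6 (UK)": (583, 52, 35, 496),
}
RECOVERED_AFTER_QUALITY_REVIEW = 30
SPLIT_PATIENTS = {"train": 1972, "validation": 424, "test": 421}
SPLIT_EPISODE_WEEKS = {"train": 35_899, "validation": 7_693, "test": 7_692}


def site_is_consistent(enrolled: int, excluded: int, dropped: int, final: int) -> bool:
    """A site row is consistent when enrolled - excluded - dropped equals final."""
    return enrolled - excluded - dropped == final


def column_totals(sites: dict) -> tuple:
    """Sum each column (enrolled, excluded, dropped, final) across sites."""
    return tuple(sum(column) for column in zip(*sites.values()))


enrolled, excluded, dropped, final = column_totals(SITES)
reported_total = final + RECOVERED_AFTER_QUALITY_REVIEW
split_patient_total = sum(SPLIT_PATIENTS.values())
episode_week_total = sum(SPLIT_EPISODE_WEEKS.values())
```
]]></code>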
        <p><bold>Table 2.</bold> Cohort demographic and clinical characteristics (mean ± SD or %).</p>
        <table-wrap id="tbl2">
          <label>Table 2</label>
          <table>
            <thead>
              <tr>
                <th>Characteristic</th>
                <th>BD-I (n = 1641)</th>
                <th>BD-II (n = 1206)</th>
                <th>Total (n = 2847)</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Mean age (years ± SD)</td>
                <td>35.1 ± 11.8</td>
                <td>34.1 ± 10.4</td>
                <td>34.7 ± 11.2</td>
              </tr>
              <tr>
                <td>Female (%)</td>
                <td>50.8%</td>
                <td>54.4%</td>
                <td>52.3%</td>
              </tr>
              <tr>
                <td>Illness duration (years)</td>
                <td>9.4 ± 6.7</td>
                <td>7.8 ± 5.9</td>
                <td>8.7 ± 6.4</td>
              </tr>
              <tr>
                <td>Prior episodes (mean ± SD)</td>
                <td>6.2 ± 4.1</td>
                <td>4.9 ± 3.3</td>
                <td>5.6 ± 3.8</td>
              </tr>
              <tr>
                <td>Current mood stabilizer (%)</td>
                <td>81.4%</td>
                <td>78.2%</td>
                <td>80.0%</td>
              </tr>
              <tr>
                <td>Comorbid anxiety disorder (%)</td>
                <td>38.2%</td>
                <td>44.1%</td>
                <td>40.7%</td>
              </tr>
              <tr>
                <td>University education (%)</td>
                <td>52.6%</td>
                <td>58.3%</td>
                <td>55.0%</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec3dot2">
        <title>3.2. Multimodal Data Streams</title>
        <p>Each participant wore a validated research-grade wristband (Empatica E4/successor device) continuously throughout the monitoring period, capturing actigraphy (32 Hz), EDA (4 Hz), PPG-derived heart rate variability (64 Hz), and skin temperature (4 Hz). Ecological momentary assessments were administered via smartphone at four semi-random time points daily, capturing self-reported mood ratings, sleep quality, energy level, and social engagement on validated visual analogue scales [<xref ref-type="bibr" rid="B54">54</xref>]. Weekly structured remote clinician-rated assessments (YMRS and HAMD-17) provided gold-standard mood state labels. EHR data were extracted and harmonized via the OMOP Common Data Model [<xref ref-type="bibr" rid="B55">55</xref>], standardizing structured clinical records (diagnoses, medications, laboratory findings, hospitalization history) and unstructured clinical notes across all six sites. After preprocessing, the dataset comprised 140.6 million biosignal samples, 94,321 structured clinical encounters, 48,834 clinical notes (mean: 312 tokens/note), and 4.2 million EMA responses, constituting, to our knowledge, the largest longitudinal multimodal BD dataset reported in the literature.</p>
      </sec>
      <sec id="sec3dot3">
        <title>3.3. Ground Truth Labelling and Inter-Rater Reliability</title>
        <p>Mood state labels (euthymia, hypomania, mania, depression, mixed features) were determined by consensus of two independent senior psychiatrists using all available data sources. Inter-rater agreement was strong by established benchmarks [<xref ref-type="bibr" rid="B56">56</xref>]. Disagreements were resolved through adjudication by a third senior clinician. The final labelled dataset comprised 51,284 episode-weeks, with class distribution: euthymia 44.1%, depression 29.3%, hypomania 14.2%, mania 8.7%, mixed 3.7%.</p>
        <p><bold>Temporal Leakage Prevention:</bold> To prevent any post-label information from entering the model at prediction time, feature construction was strictly bounded by the prediction timestamp. For each episode-week label assigned to week W (defined as calendar days 1 - 7 of the labelled week), the available inputs were: biosignal windows from the 7-day window immediately preceding day 1 of week W (days −7 to −1); EMA entries recorded up to and including day −1; clinical notes with timestamp strictly before day 1 of week W; and episode graph nodes corresponding to fully completed prior episodes only (episodes whose end date preceded day 1 of week W). The YMRS and HAMD-17 assessments conducted during week W by the remote clinician provided the ground-truth label and were never used as model inputs. EMA entries collected during week W were excluded from the input feature set. This boundary was enforced programmatically at the data preprocessing stage and verified by a held-out date audit confirming zero post-label timestamps in any modality’s input window.</p>
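        <p>The input boundary described for week W can be expressed as a small date calculation. The sketch below is a hypothetical illustration of such a check, assuming day-level timestamps and no day 0 (day −1 is the calendar day before day 1); the names <monospace>InputWindow</monospace>, <monospace>input_window</monospace>, and <monospace>count_violations</monospace> are invented here and are not taken from the BD-Net preprocessing code.</p>
        <code language="python"><![CDATA[
```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable


@dataclass(frozen=True)
class InputWindow:
    biosignal_start: date  # day -7 relative to week W
    biosignal_end: date    # day -1 relative to week W (inclusive)
    cutoff: date           # day 1 of week W; every input must be strictly earlier


def input_window(week_start: date) -> InputWindow:
    """Build the admissible input window for a label assigned to the week starting at `week_start`."""
    return InputWindow(
        biosignal_start=week_start - timedelta(days=7),
        biosignal_end=week_start - timedelta(days=1),
        cutoff=week_start,
    )


def is_admissible(timestamp: date, window: InputWindow) -> bool:
    """A note, EMA entry, or completed-episode end date is usable only if it precedes day 1."""
    return timestamp < window.cutoff


def in_biosignal_window(timestamp: date, window: InputWindow) -> bool:
    """Biosignal samples must fall within days -7 to -1 inclusive."""
    return window.biosignal_start <= timestamp <= window.biosignal_end


def count_violations(timestamps: Iterable[date], window: InputWindow) -> int:
    """Date audit: number of input records dated on or after day 1 of week W."""
    return sum(1 for ts in timestamps if not is_admissible(ts, window))


# Example: a label week starting on Monday 10 March 2025.
window = input_window(date(2025, 3, 10))
records = [date(2025, 3, 3), date(2025, 3, 9), date(2025, 3, 10)]
violations = count_violations(records, window)  # the record dated on day 1 is flagged
```
]]></code>
        <p>An audit of this kind passes when the violation count is zero for every modality and every labelled week.</p>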
        <p><bold>Operational Definitions.</bold> An episode-week is the unit of analysis: a 7-day calendar window assigned a single clinician-adjudicated mood state label based on the YMRS and HAMD-17 assessments conducted during that week. Episode onset is defined as the first episode-week in which the mood state label transitions from Euthymia to any non-euthymic state (Depression, Hypomania, Mania, or Mixed Features) following at least one consecutive euthymic episode-week. The 7-day lead time prediction task identifies whether an episode onset will occur in the 7-day window immediately following the current prediction timestamp; the mean lead time of 4.2 days reported in Section 5.4 refers to the interval between the BD-Net alert crossing threshold <italic>τ</italic> = 0.55 and the first clinician-confirmed episode-onset day within that 7-day window. False hospitalization recommendation is defined as a model-triggered alert that prompted a hospitalization recommendation by the treating psychiatrist that was subsequently deemed clinically unwarranted at a 72-hour clinical review. Negative episode-weeks were defined as any episode-week in the test set in which the label was Euthymia and no episode onset was recorded in the subsequent 7-day window; negative windows were sampled at a 2:1 ratio relative to episode-onset positive weeks, stratified by site and BD subtype, to reflect realistic clinical prevalence while ensuring sufficient minority-class representation.</p>
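        <p>The onset and negative-week definitions above translate directly into a scan over a patient’s weekly label sequence. The following sketch is one possible reading of those definitions, assuming the labels form a gap-free sequence of consecutive weeks; the function names and label strings are hypothetical, and the 2:1 negative sampling step is not reproduced.</p>
        <code language="python"><![CDATA[
```python
from typing import List, Sequence

EUTHYMIA = "euthymia"


def onset_weeks(labels: Sequence[str]) -> List[int]:
    """Indices of episode-weeks whose label is non-euthymic and whose previous week was euthymic."""
    return [
        i for i in range(1, len(labels))
        if labels[i] != EUTHYMIA and labels[i - 1] == EUTHYMIA
    ]


def negative_weeks(labels: Sequence[str]) -> List[int]:
    """Indices of euthymic weeks that are not followed by an onset in the next week.

    The final week is skipped because its following 7-day window is unobserved
    (an assumption of this sketch).
    """
    onsets = set(onset_weeks(labels))
    return [
        i for i in range(len(labels) - 1)
        if labels[i] == EUTHYMIA and (i + 1) not in onsets
    ]


# Toy trajectory for one patient (one label per consecutive week).
trajectory = [
    "euthymia", "euthymia", "mania", "mania",
    "euthymia", "depression", "euthymia",
]
onsets = onset_weeks(trajectory)        # weeks 2 and 5 start new episodes
negatives = negative_weeks(trajectory)  # only week 0 is a clean negative
```
]]></code>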
      </sec>
      <sec id="sec3dot4">
        <title>3.4. Preprocessing Pipeline</title>
        <p>Biosignal preprocessing employed a standardized pipeline: motion artifact removal using accelerometer-informed signal decomposition [<xref ref-type="bibr" rid="B57">57</xref>], bandpass filtering, and z-score normalization within participant and sensor channel. Missing data arising from device non-wear or technical failure affected 8.3% of the total recording window. These segments were imputed using Gaussian process regression [<xref ref-type="bibr" rid="B58">58</xref>] conditioned on adjacent valid windows, with missingness masks propagated to the TCAN attention mechanism. Clinical notes were de-identified using the MIMIC-III pipeline adapted for GDPR compliance [<xref ref-type="bibr" rid="B59">59</xref>] and tokenized with a custom psychiatric vocabulary extending the BioBERT tokenizer with 2341 domain-specific BD terminology tokens.</p>
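        <p>The within-participant, within-channel normalization step can be sketched as follows. This is a simplified illustration under two assumptions that the text does not state: the mean and standard deviation are computed from observed samples only, and missing samples are left as a 0.0 placeholder for the later imputation step. The function name is hypothetical.</p>
        <code language="python"><![CDATA[
```python
from statistics import mean, pstdev


def zscore_masked(values, mask):
    """Z-score one participant/channel series using observed samples (mask == 1) only."""
    observed = [v for v, m in zip(values, mask) if m == 1]
    mu = mean(observed)
    sigma = pstdev(observed)
    if sigma == 0:
        sigma = 1.0                      # guard against a constant channel
    # Missing samples receive a neutral placeholder (assumption) and keep mask == 0.
    return [(v - mu) / sigma if m == 1 else 0.0 for v, m in zip(values, mask)]


series = [1.0, 2.0, 3.0, 99.0]           # last value recorded during non-wear
mask = [1, 1, 1, 0]                      # 1 = valid sample, 0 = missing
normalized = zscore_masked(series, mask)
print([round(z, 3) for z in normalized])
```
]]></code>
        <p>Keeping the mask alongside the normalized series lets downstream components distinguish placeholder values from genuine measurements.</p>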
      </sec>
    </sec>
    <sec id="sec4">
      <title>4. BD-Net: Architectural Design and Technical Innovation</title>
      <p>BD-Net integrates four jointly trained architectural modules. The complete framework is illustrated in <xref ref-type="fig" rid="fig1">Figure 1</xref>, which shows the end-to-end data flow from three heterogeneous input modalities through the modality-specific encoders, cross-modal attention fusion, and Bayesian ensemble to produce mood state classifications, episode onset predictions, and calibrated uncertainty estimates. Each module is described formally below.</p>
      <fig id="fig1">
        <label>Figure 1</label>
        <graphic xlink:href="https://html.scirp.org/file/1115345-rId16.jpeg?20260529050603" />
      </fig>
      <p><bold>Figure 1.</bold> BD-Net multimodal deep learning architecture. Wearable biosignals (actigraphy, EDA, PPG, skin temperature) are encoded by TCAN (Module 1); psychiatric EHR notes by BD-BERT with recency-weighted longitudinal aggregation (Module 2); and the patient’s historical episode trajectory by a 3-layer Graph Attention Network (Module 3). Cross-modal attention fuses the three embeddings into a unified representation, which is processed by a Bayesian deep ensemble (M = 10) to produce: (a) 5-class mood state classification [ACC = 91.3%, AUC = 0.961], (b) 7-day episode onset prediction [Sens = 88.7%, Spec = 90.1%], and (c) calibrated uncertainty estimates [ECE = 0.031]. Selective prediction escalates to clinician review when predictive entropy exceeds threshold <italic>τ</italic>.</p>
      <sec id="sec4dot1">
        <title>4.1. Module 1: Temporal Convolutional Attention Network (TCAN)</title>
        <p>4.1.1. Motivation and Design Principles</p>
        <p>Physiological signals in BD exhibit multi-scale temporal structure: circadian rhythms at 24-hour cycles, ultradian sleep-stage variations at ~90-minute intervals, and autonomic fluctuations at sub-minute resolution. Standard LSTM networks [<xref ref-type="bibr" rid="B60">60</xref>] suffer from vanishing gradients over long sequences and are computationally prohibitive at the sampling rates required for high-fidelity biosignal modelling. Temporal convolutional networks (TCNs) address these limitations through dilated causal convolutions, providing a large theoretical receptive field without recurrence. However, prior TCN formulations lack the capacity to differentially weight clinically informative signal segments, a critical requirement when recording quality is heterogeneous and patient behaviour introduces non-stationarity. TCAN addresses this through three architectural innovations: hierarchical dilated convolutions, multi-head causal self-attention, and a learnable missingness-aware gating mechanism.</p>
        <p>4.1.2. Formal Specification</p>
        <p>Let <inline-formula><mml:math display="inline"><mml:mrow><mml:mi> X </mml:mi><mml:mo> ∈ </mml:mo><mml:msup><mml:mi> ℝ </mml:mi><mml:mrow><mml:mi> T </mml:mi><mml:mo> × </mml:mo><mml:mi> C </mml:mi></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> denote the multivariate biosignal input, where <italic>T</italic> is the sequence length (7-day windows at 4 Hz for EDA, yielding <italic>T</italic> = 2016) and <italic>C</italic> = 7 sensor channels after feature engineering. The TCAN encoder applies the following update at each stack level <inline-formula><mml:math display="inline"><mml:mrow><mml:mi> l </mml:mi><mml:mo> ∈ </mml:mo><mml:mrow><mml:mo> { </mml:mo><mml:mrow><mml:mn> 1 </mml:mn><mml:mo> , </mml:mo><mml:mo> ⋯ </mml:mo><mml:mo> , </mml:mo><mml:mn> 6 </mml:mn></mml:mrow><mml:mo> } </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>:</p>
        <disp-formula id="FD1">
          <label>(1)</label>
          <mml:math display="inline">
            <mml:mrow>
              <mml:msub>
                <mml:mi>H</mml:mi>
                <mml:mi>l</mml:mi>
              </mml:msub>
              <mml:mo>=</mml:mo>
              <mml:mtext>LayerNorm</mml:mtext>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:msub>
                    <mml:mrow>
                      <mml:mtext>TCN</mml:mtext>
                    </mml:mrow>
                    <mml:mrow>
                      <mml:msub>
                        <mml:mi>d</mml:mi>
                        <mml:mi>l</mml:mi>
                      </mml:msub>
                    </mml:mrow>
                  </mml:msub>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:mrow>
                      <mml:msub>
                        <mml:mi>H</mml:mi>
                        <mml:mrow>
                          <mml:mi>l</mml:mi>
                          <mml:mo>−</mml:mo>
                          <mml:mn>1</mml:mn>
                        </mml:mrow>
                      </mml:msub>
                    </mml:mrow>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>⊙</mml:mo>
                  <mml:msub>
                    <mml:mrow>
                      <mml:mtext>Attn</mml:mtext>
                    </mml:mrow>
                    <mml:mi>l</mml:mi>
                  </mml:msub>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:mrow>
                      <mml:msub>
                        <mml:mi>H</mml:mi>
                        <mml:mrow>
                          <mml:mi>l</mml:mi>
                          <mml:mo>−</mml:mo>
                          <mml:mn>1</mml:mn>
                        </mml:mrow>
                      </mml:msub>
                    </mml:mrow>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>⊙</mml:mo>
                  <mml:mtext>Gate</mml:mtext>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:mrow>
                      <mml:msub>
                        <mml:mi>H</mml:mi>
                        <mml:mrow>
                          <mml:mi>l</mml:mi>
                          <mml:mo>−</mml:mo>
                          <mml:mn>1</mml:mn>
                        </mml:mrow>
                      </mml:msub>
                      <mml:mo>,</mml:mo>
                      <mml:mi>M</mml:mi>
                    </mml:mrow>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>+</mml:mo>
                  <mml:msub>
                    <mml:mi>H</mml:mi>
                    <mml:mrow>
                      <mml:mi>l</mml:mi>
                      <mml:mo>−</mml:mo>
                      <mml:mn>1</mml:mn>
                    </mml:mrow>
                  </mml:msub>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>where <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi> d </mml:mi><mml:mi> l </mml:mi></mml:msub><mml:mo> = </mml:mo><mml:msup><mml:mn> 2 </mml:mn><mml:mrow><mml:mi> l </mml:mi><mml:mo> − </mml:mo><mml:mn> 1 </mml:mn></mml:mrow></mml:msup><mml:mo> ∈ </mml:mo><mml:mrow><mml:mo> { </mml:mo><mml:mrow><mml:mn> 1 </mml:mn><mml:mo> , </mml:mo><mml:mn> 2 </mml:mn><mml:mo> , </mml:mo><mml:mn> 4 </mml:mn><mml:mo> , </mml:mo><mml:mn> 8 </mml:mn><mml:mo> , </mml:mo><mml:mn> 16 </mml:mn><mml:mo> , </mml:mo><mml:mn> 32 </mml:mn></mml:mrow><mml:mo> } </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is the dilation factor, <inline-formula><mml:math display="inline"><mml:mo> ⊙ </mml:mo></mml:math></inline-formula> denotes element-wise multiplication, <inline-formula><mml:math display="inline"><mml:mrow><mml:mi> M </mml:mi><mml:mo> ∈ </mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo> { </mml:mo><mml:mrow><mml:mn> 0 </mml:mn><mml:mo> , </mml:mo><mml:mn> 1 </mml:mn></mml:mrow><mml:mo> } </mml:mo></mml:mrow></mml:mrow><mml:mi> T </mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> is the missingness mask, Attn<italic><sub>l</sub></italic> computes 4-head scaled dot-product attention with masked softmax normalization, and Gate(·) is a sigmoid-gated linear unit conditioned on <italic>M</italic>. Global average pooling over <italic>H</italic><sub>6</sub> followed by MLP<sub>2</sub> projection yields <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi> h </mml:mi><mml:mrow><mml:mtext> bio </mml:mtext></mml:mrow></mml:msub><mml:mo> ∈ </mml:mo><mml:msup><mml:mi> ℝ </mml:mi><mml:mrow><mml:mn> 512 </mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> .</p>
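        <p>The temporal reach of the dilated stack can be checked with a short calculation. The sketch below assumes one causal convolution per level with kernel size 3 (the kernel size is not stated in the text) and assumes that the <italic>T</italic> = 2016 steps evenly span the 7-day window; both are illustrative assumptions.</p>
        <code language="python"><![CDATA[
```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in time steps, of stacked dilated causal convolutions
    with one convolution per level."""
    return 1 + (kernel_size - 1) * sum(dilations)


dilations = [2 ** (l - 1) for l in range(1, 7)]    # d_l = 2^(l-1): 1, 2, 4, 8, 16, 32
KERNEL_SIZE = 3                                     # assumption, not from the text
rf_steps = receptive_field(KERNEL_SIZE, dilations)

T = 2016
step_minutes = 7 * 24 * 60 / T                      # minutes per step if T spans 7 days
rf_hours = rf_steps * step_minutes / 60

print(dilations, rf_steps, step_minutes, round(rf_hours, 1))
```
]]></code>
        <p>Under these assumptions the convolutional path alone covers 127 steps, so context at the scale of several days would have to come from the attention branch and the final temporal pooling.</p>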
        <p>As shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>, which presents the pseudocode for all three core BD-Net algorithms, Algorithm 1 details the complete TCAN forward pass, including the hierarchical dilated convolution loop, multi-head causal attention with missingness masking, and the final MLP projection to the 512-dimensional biosignal embedding h_bio.</p>
        <fig id="fig2">
          <label>Figure 2</label>
          <graphic xlink:href="https://html.scirp.org/file/1115345-rId31.jpeg?20260529050603" />
        </fig>
        <p><bold>Figure 2.</bold> Pseudocode specifications for BD-Net’s three core algorithms. Algorithm 1 (left, blue): TCAN forward pass, comprising hierarchical dilated convolution with dilation <italic>d</italic> ∈ {1, 2, 4, 8, 16, 32}, multi-head causal attention with missingness-aware gating, and MLP projection to <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi> h </mml:mi><mml:mrow><mml:mtext> bio </mml:mtext></mml:mrow></mml:msub><mml:mo> ∈ </mml:mo><mml:msup><mml:mi> ℝ </mml:mi><mml:mrow><mml:mn> 512 </mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. Algorithm 2 (center, teal): BD-BERT longitudinal aggregation, comprising independent note encoding followed by recency-weighted temporal attention aggregation to <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi> h </mml:mi><mml:mrow><mml:mtext> text </mml:mtext></mml:mrow></mml:msub><mml:mo> ∈ </mml:mo><mml:msup><mml:mi> ℝ </mml:mi><mml:mrow><mml:mn> 512 </mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula>. Algorithm 3 (right, purple): Bayesian ensemble inference, comprising parallel query of <italic>M</italic> = 10 members, posterior mean/variance computation, and entropy-based selective prediction with a clinical escalation flag. Color coding: red = input/output/return statements; blue = control flow keywords; gray = comments.</p>
        <p><bold>Algorithm 1</bold><bold>.</bold> TCAN Forward Pass.</p>
        <table-wrap id="tbl3">
          <label>Table 3</label>
          <table>
            <tbody>
              <tr>
                <td><preformat>Input:  X ∈ ℝ^(T×C) (biosignal window), M ∈ {0,1}^T (missingness mask)
Output: h_bio ∈ ℝ^512

H_0 ← Linear(X)                                   # Project to model dimension
for l = 1 to 6 do
    d ← 2^(l-1)                                   # Dilation: {1,2,4,8,16,32}
    T_l ← DilatedCausalConv(H_{l-1}, d)           # Dilated temporal convolution
    A_l ← MultiHeadCausalAttn(H_{l-1}, mask=M)    # 4-head masked attention
    G_l ← σ(W_g · [H_{l-1}; M] + b_g)             # Missingness-aware gate
    H_l ← LayerNorm(T_l ⊙ A_l ⊙ G_l + H_{l-1})    # Residual + norm
end for
h_pool ← GlobalAvgPool(H_6)                       # Temporal pooling
h_bio ← MLP_2(h_pool)                             # Project to ℝ^512
Return h_bio</preformat></td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec4dot2">
        <title>4.2. Module 2: BD-BERT Domain-Adaptive Clinical Language Model</title>
        <p>4.2.1. Pre-Training Strategy</p>
        <p>General-domain BERT variants and existing clinical adaptations (ClinicalBERT, BioBERT) exhibit substantial performance gaps on psychiatric text tasks, because psychiatric clinical notes employ domain-specific terminology, euphemistic language, subjective assessment framing, and non-standard abbreviations underrepresented in general biomedical corpora. BD-BERT was developed through a two-stage pre-training protocol.</p>
        <p>In Stage 1 (domain-adaptive pre-training [<xref ref-type="bibr" rid="B61">61</xref>]), we continued masked language model (MLM) pre-training of ClinicalBERT (base, 110 M parameters) on 3.2 million de-identified psychiatric EHR notes from MIMIC-IV-ED (psychiatry encounters), the UK Biobank mental health module, and the six BD-Net clinical sites, using our extended psychiatric tokenizer. Pre-training ran for 40 epochs (batch size 256, lr = 2 × 10<sup>−5</sup>, linear warmup over 4000 steps), following established domain-adaptive protocols [<xref ref-type="bibr" rid="B62">62</xref>].</p>
        <p>In Stage 2 (task-adaptive fine-tuning), BD-BERT was jointly fine-tuned across three BD-specific NLP tasks: mood state label prediction from admission notes, next-episode severity regression, and medication adherence classification. Multi-task fine-tuning with gradient surgery loss balancing [<xref ref-type="bibr" rid="B63">63</xref>] yielded a context-rich encoder specifically calibrated for BD phenotyping, with task-specific classification heads attached to the [CLS] token representation.</p>
        <p>4.2.2. Longitudinal Note Aggregation</p>
        <p>Individual clinical notes are encoded independently to produce embeddings {<italic>e</italic><sub>1</sub>, ..., <italic>e</italic><italic><sub>K</sub></italic>}. These are aggregated via recency-weighted temporal attention [<xref ref-type="bibr" rid="B64">64</xref>], as specified in Algorithm 2 (<xref ref-type="fig" rid="fig2">Figure 2</xref>) and Equation (2):</p>
        <disp-formula id="FD2">
          <label>(2)</label>
          <mml:math display="inline">
            <mml:mrow>
              <mml:msub>
                <mml:mi>h</mml:mi>
                <mml:mrow>
                  <mml:mtext>text</mml:mtext>
                </mml:mrow>
              </mml:msub>
              <mml:mo>=</mml:mo>
              <mml:mstyle displaystyle="true">
                <mml:msub>
                  <mml:mo>∑</mml:mo>
                  <mml:mi>k</mml:mi>
                </mml:msub>
                <mml:mrow>
                  <mml:msub>
                    <mml:mi>α</mml:mi>
                    <mml:mi>k</mml:mi>
                  </mml:msub>
                  <mml:msub>
                    <mml:mi>e</mml:mi>
                    <mml:mi>k</mml:mi>
                  </mml:msub>
                </mml:mrow>
              </mml:mstyle>
              <mml:mo>,</mml:mo>
              <mml:mtext>
                 
              </mml:mtext>
              <mml:mtext>
                 
              </mml:mtext>
              <mml:mtext>
                 
              </mml:mtext>
              <mml:msub>
                <mml:mi>α</mml:mi>
                <mml:mi>k</mml:mi>
              </mml:msub>
              <mml:mo>∝</mml:mo>
              <mml:mi>exp</mml:mi>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:msubsup>
                    <mml:mi>w</mml:mi>
                    <mml:mi>α</mml:mi>
                    <mml:mtext>T</mml:mtext>
                  </mml:msubsup>
                  <mml:msub>
                    <mml:mi>e</mml:mi>
                    <mml:mi>k</mml:mi>
                  </mml:msub>
                  <mml:mo>+</mml:mo>
                  <mml:mi>γ</mml:mi>
                  <mml:mo>⋅</mml:mo>
                  <mml:mi>Δ</mml:mi>
                  <mml:msub>
                    <mml:mi>t</mml:mi>
                    <mml:mi>k</mml:mi>
                  </mml:msub>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>where Δ<italic>t</italic><italic><sub>k</sub></italic> is the time elapsed since note <italic>k</italic> was recorded and <italic>γ</italic> is a learned decay parameter, ensuring recent clinical assessments receive higher weight while retaining informativeness from historically significant entries such as episode admission notes.</p>
        <p><bold>Algorithm 2</bold><bold>.</bold> BD-BERT longitudinal note aggregation.</p>
        <table-wrap id="tbl4">
          <label>Table 4</label>
          <table>
            <tbody>
              <tr>
                <td><preformat>Input:  {n_1,...,n_K} (clinical notes), {t_1,...,t_K} (note timestamps)
Output: h_text ∈ ℝ^512

# Stage 1: Encode each clinical note independently
for k = 1 to K do
    tokens_k ← PsychTokenizer(n_k)             # Extended psychiatric vocab
    e_k ← BD-BERT([CLS] + tokens_k)[CLS]       # CLS embedding
    Δt_k ← t_current - t_k                     # Time since note
end for

# Stage 2: Recency-weighted temporal attention
for k = 1 to K do
    s_k ← w_α^T · e_k + γ · Δt_k               # Attention score
end for
α ← Softmax(s_1,...,s_K)                       # Normalize
h_text ← Σ_k α_k · e_k                         # Weighted aggregation
Return h_text</preformat></td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
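        <p>The aggregation in Equation (2) and Algorithm 2 can be sketched in a few lines of plain Python. The toy embeddings, the zero content weights, and the value of <italic>γ</italic> are illustrative assumptions; in particular, the sketch assumes a negative <italic>γ</italic> so that a larger elapsed time lowers the attention score.</p>
        <code language="python"><![CDATA[
```python
import math


def recency_attention(embeddings, elapsed, w_alpha, gamma):
    """Return (alphas, h_text) with alpha_k proportional to exp(w_alpha . e_k + gamma * dt_k)."""
    scores = [sum(w * e for w, e in zip(w_alpha, emb)) + gamma * dt
              for emb, dt in zip(embeddings, elapsed)]
    top = max(scores)                                  # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    alphas = [x / total for x in exps]
    dim = len(embeddings[0])
    h_text = [sum(a * emb[i] for a, emb in zip(alphas, embeddings))
              for i in range(dim)]
    return alphas, h_text


# Two toy 2-dimensional note embeddings, written 1 day and 30 days ago.
notes = [[1.0, 0.0], [0.0, 1.0]]
elapsed_days = [1.0, 30.0]
alphas, h_text = recency_attention(notes, elapsed_days, w_alpha=[0.0, 0.0], gamma=-0.1)
print([round(a, 3) for a in alphas])
```
]]></code>
        <p>With the content term switched off, the weights depend on recency alone; a non-zero <italic>w</italic><sub>α</sub> would allow an older but informative note to retain a larger share of the weight.</p>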
      </sec>
      <sec id="sec4dot3">
        <title>4.3. Module 3: Inter-Episode Graph Attention Network (GAT-GNN)</title>
        <p>4.3.1. Graph Construction</p>
        <p>For each patient <italic>p</italic>, we construct a directed episode graph <italic>G</italic><italic><sub>p</sub></italic> = (<italic>V</italic>, <italic>E</italic>). Each node <italic>v</italic><italic><sub>i</sub></italic> ∈ <italic>V</italic> represents a mood episode, with feature vector <italic>f</italic><italic><sub>i</sub></italic> encoding: episode type (one-hot), duration, YMRS/HAMD scores, treatment response, hospitalization flag, and the concatenated [<italic>h</italic><sub>bio</sub>; <italic>h</italic><sub>text</sub>] embeddings at episode onset. Directed edges <italic>e</italic><italic><sub>ij</sub></italic> ∈ <italic>E</italic> connect episode <italic>v</italic><italic><sub>i</sub></italic> to subsequent episode <italic>v</italic><italic><sub>j</sub></italic>, with edge weights proportional to temporal proximity (1/Δ<italic>t</italic><italic><sub>ij</sub></italic>) and episode severity gradient, encoding the kindling relationship between consecutive affective episodes.</p>
        <p>4.3.2. GAT Message Passing</p>
        <p>We employ three iterations of Graph Attention Network message passing, updating node representations as:</p>
        <disp-formula id="FD3">
          <label>(3)</label>
          <mml:math display="inline">
            <mml:mrow>
              <mml:msubsup>
                <mml:mi>h</mml:mi>
                <mml:mi>i</mml:mi>
                <mml:mrow>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:mi>l</mml:mi>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                </mml:mrow>
              </mml:msubsup>
              <mml:mo>=</mml:mo>
              <mml:mi>σ</mml:mi>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:mstyle displaystyle="true">
                    <mml:msub>
                      <mml:mo>∑</mml:mo>
                      <mml:mrow>
                        <mml:mi>j</mml:mi>
                        <mml:mo>∈</mml:mo>
                        <mml:mi>N</mml:mi>
                        <mml:mrow>
                          <mml:mo>(</mml:mo>
                          <mml:mi>i</mml:mi>
                          <mml:mo>)</mml:mo>
                        </mml:mrow>
                      </mml:mrow>
                    </mml:msub>
                    <mml:mrow>
                      <mml:msubsup>
                        <mml:mi>α</mml:mi>
                        <mml:mrow>
                          <mml:mi>i</mml:mi>
                          <mml:mi>j</mml:mi>
                        </mml:mrow>
                        <mml:mrow>
                          <mml:mrow>
                            <mml:mo>(</mml:mo>
                            <mml:mi>l</mml:mi>
                            <mml:mo>)</mml:mo>
                          </mml:mrow>
                        </mml:mrow>
                      </mml:msubsup>
                      <mml:mo>⋅</mml:mo>
                      <mml:msup>
                        <mml:mi>W</mml:mi>
                        <mml:mrow>
                          <mml:mrow>
                            <mml:mo>(</mml:mo>
                            <mml:mi>l</mml:mi>
                            <mml:mo>)</mml:mo>
                          </mml:mrow>
                        </mml:mrow>
                      </mml:msup>
                      <mml:mo>⋅</mml:mo>
                      <mml:msubsup>
                        <mml:mi>h</mml:mi>
                        <mml:mi>j</mml:mi>
                        <mml:mrow>
                          <mml:mrow>
                            <mml:mo>(</mml:mo>
                            <mml:mrow>
                              <mml:mi>l</mml:mi>
                              <mml:mo>−</mml:mo>
                              <mml:mn>1</mml:mn>
                            </mml:mrow>
                            <mml:mo>)</mml:mo>
                          </mml:mrow>
                        </mml:mrow>
                      </mml:msubsup>
                    </mml:mrow>
                  </mml:mstyle>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>where <inline-formula><mml:math display="inline"><mml:mrow><mml:msubsup><mml:mi> α </mml:mi><mml:mrow><mml:mi> i </mml:mi><mml:mi> j </mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo> ( </mml:mo><mml:mi> l </mml:mi><mml:mo> ) </mml:mo></mml:mrow></mml:mrow></mml:msubsup></mml:mrow></mml:math></inline-formula> are learned attention coefficients computed by a shared attention mechanism over concatenated neighbor features, <italic>W</italic><sup>(</sup><italic><sup>l</sup></italic><sup>)</sup> is a learnable weight matrix, <italic>N</italic>(<italic>i</italic>) is the set of predecessor episodes of <italic>v</italic><italic><sub>i</sub></italic>, and <italic>σ</italic> is the ELU activation. Mean pooling over all node representations after three iterations yields <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi> h </mml:mi><mml:mrow><mml:mtext> graph </mml:mtext></mml:mrow></mml:msub><mml:mo> ∈ </mml:mo><mml:msup><mml:mi> ℝ </mml:mi><mml:mrow><mml:mn> 256 </mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> , encoding the patient’s longitudinal episode trajectory as a structured latent representation.</p>
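        <p>One message-passing iteration of Equation (3) can be sketched as follows. The attention scoring uses the LeakyReLU form of the standard GAT formulation, and a node without predecessors falls back to its own transformed features; both choices are assumptions, since the text does not specify them. All names and toy values are hypothetical.</p>
        <code language="python"><![CDATA[
```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def elu(x):
    return x if x > 0 else math.exp(x) - 1.0


def gat_layer(h, predecessors, W, a):
    """One update h_i = ELU(sum over predecessors j of alpha_ij * W h_j)."""
    z = [matvec(W, x) for x in h]
    out = []
    for i, nbrs in enumerate(predecessors):
        if not nbrs:                                   # first episode has no predecessors
            out.append([elu(v) for v in z[i]])         # assumption: keep own features
            continue
        scores = [leaky_relu(sum(c * v for c, v in zip(a, z[i] + z[j])))
                  for j in nbrs]
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]             # softmax over predecessors
        agg = [sum(al * z[j][k] for al, j in zip(alphas, nbrs))
               for k in range(len(z[i]))]
        out.append([elu(v) for v in agg])
    return out


# Three episodes in temporal order; episode 2 attends to episodes 0 and 1.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
preds = [[], [0], [0, 1]]
identity = [[1.0, 0.0], [0.0, 1.0]]
updated = gat_layer(features, preds, identity, a=[0.0, 0.0, 0.0, 0.0])
print(updated)
```
]]></code>
        <p>With a zero attention vector the coefficients are uniform, so the third node receives the mean of its two predecessors; learned parameters would instead weight predecessors unequally.</p>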
      </sec>
      <sec id="sec4dot4">
        <title>4.4. Module 4: Bayesian Multimodal Fusion and Uncertainty Quantification</title>
        <p>4.4.1. Cross-Modal Attention Fusion</p>
        <p>The three modality embeddings [<inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi> h </mml:mi><mml:mrow><mml:mtext> bio </mml:mtext></mml:mrow></mml:msub><mml:mo> ∈ </mml:mo><mml:msup><mml:mi> ℝ </mml:mi><mml:mrow><mml:mn> 512 </mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> ; <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi> h </mml:mi><mml:mrow><mml:mtext> text </mml:mtext></mml:mrow></mml:msub><mml:mo> ∈ </mml:mo><mml:msup><mml:mi> ℝ </mml:mi><mml:mrow><mml:mn> 512 </mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> ; <inline-formula><mml:math display="inline"><mml:mrow><mml:msub><mml:mi> h </mml:mi><mml:mrow><mml:mtext> graph </mml:mtext></mml:mrow></mml:msub><mml:mo> ∈ </mml:mo><mml:msup><mml:mi> ℝ </mml:mi><mml:mrow><mml:mn> 256 </mml:mn></mml:mrow></mml:msup></mml:mrow></mml:math></inline-formula> ] are concatenated and fused via cross-modal attention:</p>
        <disp-formula id="FD4">
          <label>(4)</label>
          <mml:math display="inline">
            <mml:mrow>
              <mml:msub>
                <mml:mi>h</mml:mi>
                <mml:mrow>
                  <mml:mtext>fused</mml:mtext>
                </mml:mrow>
              </mml:msub>
              <mml:mo>=</mml:mo>
              <mml:mtext>CrossAttn</mml:mtext>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:mrow>
                    <mml:mo>[</mml:mo>
                    <mml:mrow>
                      <mml:msub>
                        <mml:mi>h</mml:mi>
                        <mml:mrow>
                          <mml:mtext>bio</mml:mtext>
                        </mml:mrow>
                      </mml:msub>
                      <mml:mo>;</mml:mo>
                      <mml:msub>
                        <mml:mi>h</mml:mi>
                        <mml:mrow>
                          <mml:mtext>text</mml:mtext>
                        </mml:mrow>
                      </mml:msub>
                      <mml:mo>;</mml:mo>
                      <mml:msub>
                        <mml:mi>h</mml:mi>
                        <mml:mrow>
                          <mml:mtext>graph</mml:mtext>
                        </mml:mrow>
                      </mml:msub>
                    </mml:mrow>
                    <mml:mo>]</mml:mo>
                  </mml:mrow>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
              <mml:mo>∈</mml:mo>
              <mml:msup>
                <mml:mi>ℝ</mml:mi>
                <mml:mrow>
                  <mml:mn>512</mml:mn>
                </mml:mrow>
              </mml:msup>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>This fused representation feeds two parallel prediction heads: 1) 5-class mood state softmax classification, and 2) binary episode onset sigmoid prediction (7-day window), each implemented as two-layer MLPs with dropout <italic>p</italic> = 0.3 [<xref ref-type="bibr" rid="B65">65</xref>].</p>
        <p>4.4.2. Bayesian Deep Ensemble</p>
        <p>Rather than a single deterministic model, BD-Net deploys an ensemble of <italic>M</italic> = 10 independently trained instances with stochastic initialization. At inference, all members are queried in parallel and the predictive posterior is approximated as specified in Algorithm 3 (<xref ref-type="fig" rid="fig2">Figure 2</xref>) and Equation (5):</p>
        <disp-formula id="FD5">
          <label>(5)</label>
          <mml:math display="inline">
            <mml:mrow>
              <mml:mover accent="true">
                <mml:mi>p</mml:mi>
                <mml:mo>¯</mml:mo>
              </mml:mover>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:mi>y</mml:mi>
                  <mml:mo>|</mml:mo>
                  <mml:mi>x</mml:mi>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
              <mml:mo>≈</mml:mo>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:mrow>
                    <mml:mn>1</mml:mn>
                    <mml:mo>/</mml:mo>
                    <mml:mi>M</mml:mi>
                  </mml:mrow>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
              <mml:mstyle displaystyle="true">
                <mml:msubsup>
                  <mml:mo>∑</mml:mo>
                  <mml:mrow>
                    <mml:mi>m</mml:mi>
                    <mml:mo>=</mml:mo>
                    <mml:mn>1</mml:mn>
                  </mml:mrow>
                  <mml:mi>M</mml:mi>
                </mml:msubsup>
                <mml:mi>p</mml:mi>
              </mml:mstyle>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:mi>y</mml:mi>
                  <mml:mo>|</mml:mo>
                  <mml:mi>x</mml:mi>
                  <mml:mo>,</mml:mo>
                  <mml:msub>
                    <mml:mi>θ</mml:mi>
                    <mml:mi>m</mml:mi>
                  </mml:msub>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
              <mml:mo>;</mml:mo>
              <mml:mtext>
                 
              </mml:mtext>
              <mml:mtext>
                 
              </mml:mtext>
              <mml:mi>V</mml:mi>
              <mml:mi>a</mml:mi>
              <mml:mi>r</mml:mi>
              <mml:mrow>
                <mml:mo>[</mml:mo>
                <mml:mi>p</mml:mi>
                <mml:mo>]</mml:mo>
              </mml:mrow>
              <mml:mo>=</mml:mo>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:mrow>
                    <mml:mn>1</mml:mn>
                    <mml:mo>/</mml:mo>
                    <mml:mi>M</mml:mi>
                  </mml:mrow>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
              <mml:mstyle displaystyle="true">
                <mml:msub>
                  <mml:mo>∑</mml:mo>
                  <mml:mi>m</mml:mi>
                </mml:msub>
                <mml:mrow>
                  <mml:msup>
                    <mml:mrow>
                      <mml:mrow>
                        <mml:mo>(</mml:mo>
                        <mml:mrow>
                          <mml:msub>
                            <mml:mi>p</mml:mi>
                            <mml:mi>m</mml:mi>
                          </mml:msub>
                          <mml:mo>−</mml:mo>
                          <mml:mover accent="true">
                            <mml:mi>p</mml:mi>
                            <mml:mo>¯</mml:mo>
                          </mml:mover>
                        </mml:mrow>
                        <mml:mo>)</mml:mo>
                      </mml:mrow>
                    </mml:mrow>
                    <mml:mn>2</mml:mn>
                  </mml:msup>
                </mml:mrow>
              </mml:mstyle>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>Predictive entropy <inline-formula><mml:math display="inline"><mml:mrow><mml:mi> H </mml:mi><mml:mo> = </mml:mo><mml:mo> − </mml:mo><mml:mstyle displaystyle="true"><mml:msub><mml:mo> ∑ </mml:mo><mml:mi> c </mml:mi></mml:msub><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi> p </mml:mi><mml:mo> ¯ </mml:mo></mml:mover><mml:mi> c </mml:mi></mml:msub><mml:mi> log </mml:mi><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi> p </mml:mi><mml:mo> ¯ </mml:mo></mml:mover><mml:mi> c </mml:mi></mml:msub></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow></mml:mstyle></mml:mrow></mml:math></inline-formula> serves as the uncertainty signal for the selective prediction protocol. When <italic>H</italic> exceeds a calibrated threshold <italic>τ</italic> = 0.45 nats (determined on the validation set using temperature scaling), the model abstains and triggers a clinician escalation flag rather than issuing a prediction. This mechanism directly operationalizes the EU AI Act Article 13 requirement for AI systems in high-risk categories to provide interpretable confidence indications.</p>
        <p><bold>Algorithm 3.</bold> Bayesian ensemble inference with selective prediction.</p>
        <table-wrap id="tbl5">
          <label>Table 5</label>
          <table>
            <tbody>
              <tr>
                <td>
                  <preformat>Input:  x (patient feature vector), Θ = {θ_1, ..., θ_M} (M = 10 ensemble members)
Output: prediction ŷ, uncertainty u, clinical flag f

# Query all ensemble members in parallel
for m = 1 to M do
    p_m ← BD-Net(x; θ_m)                  # Per-member softmax output
end for

# Posterior approximation
p̄ ← (1/M) · Σ_{m=1}^{M} p_m               # Ensemble mean
Var ← (1/M) · Σ_{m=1}^{M} (p_m - p̄)<sup>2</sup>      # Epistemic uncertainty
H ← -Σ_c p̄_c · log(p̄_c + ε)               # Predictive entropy

# Selective prediction protocol
if H &gt; <italic>τ</italic> then                           # <italic>τ</italic> calibrated on val. set
    f ← ESCALATE_TO_CLINICIAN
    Return ⊥, H, f                        # Abstain; flag for review
end if
ŷ ← argmax_c p̄_c                          # Point prediction
Return ŷ, Var, PREDICT</preformat>
                </td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
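        <p>The steps of Algorithm 3 can be sketched in plain Python as follows. This is a minimal illustration, assuming the per-member softmax vectors have already been computed and are passed in as lists; the function name <monospace>ensemble_predict</monospace>, the value of ε, and the tuple return shape are hypothetical choices made for the example, while the default threshold mirrors <italic>τ</italic> = 0.45 nats from the text.</p>
        <code language="python"><![CDATA[
```python
import math

TAU_NATS = 0.45  # abstention threshold quoted in the text
EPS = 1e-12      # numerical guard inside the logarithm (assumed value)


def ensemble_predict(member_probs, tau=TAU_NATS):
    """Selective prediction from M per-member softmax vectors.

    member_probs: list of M lists, each holding one probability per class.
    Returns (label, per-class variance, "PREDICT") when the model answers,
    or (None, entropy, "ESCALATE_TO_CLINICIAN") when it abstains.
    """
    m = len(member_probs)
    n_classes = len(member_probs[0])

    # Ensemble mean: approximation to the posterior predictive distribution
    mean = [sum(p[c] for p in member_probs) / m for c in range(n_classes)]

    # Per-class variance across members: epistemic uncertainty
    var = [
        sum((p[c] - mean[c]) ** 2 for p in member_probs) / m
        for c in range(n_classes)
    ]

    # Predictive entropy of the ensemble mean, in nats
    entropy = -sum(q * math.log(q + EPS) for q in mean)

    if entropy > tau:
        # Abstain and flag the case for clinician review
        return None, entropy, "ESCALATE_TO_CLINICIAN"

    label = max(range(n_classes), key=lambda c: mean[c])
    return label, var, "PREDICT"
```
]]></code>
        <p>With five mood states the maximum possible entropy is ln 5 ≈ 1.609 nats, so a threshold of 0.45 nats admits only fairly peaked ensemble means; a uniform mean over the five classes would always be escalated.</p>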
      </sec>
      <sec id="sec4dot5">
        <title>4.5. Training Protocol</title>
        <p>BD-Net was trained on a per-patient stratified split (70% training, 15% validation, 15% test), stratified jointly on BD subtype, site, and episode frequency tertile. Hyperparameters were optimized with Optuna [<xref ref-type="bibr" rid="B66">66</xref>] (300 trials, TPE sampler) on the validation set. The final configuration used the AdamW optimizer [<xref ref-type="bibr" rid="B67">67</xref>] (lr = 3 × 10<sup>−4</sup>, weight decay = 1 × 10<sup>−2</sup>, <italic>β</italic><sub>1</sub> = 0.9, <italic>β</italic><sub>2</sub> = 0.999), a cosine annealing schedule [<xref ref-type="bibr" rid="B68">68</xref>], and gradient clipping at norm 1.0. The loss weighting between the classification and episode prediction heads was set by uncertainty-based dynamic weighting. Training ran on 4× NVIDIA A100 (80 GB) GPUs with mixed-precision FP16 training [<xref ref-type="bibr" rid="B69">69</xref>] for 120 epochs, requiring approximately 96 hours for the full ensemble of <italic>M</italic> = 10 models.</p>
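        <p>To make the learning-rate schedule concrete, the sketch below evaluates a cosine annealing curve for the stated base rate and epoch count. It assumes a single annealing cycle over all 120 epochs, per-epoch stepping, and a minimum learning rate of zero; none of these details are given in the text, and the function name <monospace>cosine_annealed_lr</monospace> is hypothetical.</p>
        <code language="python"><![CDATA[
```python
import math


def cosine_annealed_lr(epoch, total_epochs=120, base_lr=3e-4, min_lr=0.0):
    """Learning rate at a given epoch under single-cycle cosine annealing.

    base_lr and total_epochs follow the training protocol in the text;
    min_lr = 0 and the absence of warm restarts are assumptions.
    """
    # Clamp so that out-of-range epochs map to the ends of the schedule
    progress = min(max(epoch, 0), total_epochs) / total_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```
]]></code>
        <p>Under these assumptions the rate starts at 3 × 10<sup>−4</sup>, passes through half that value at epoch 60, and reaches the minimum at epoch 120.</p>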
      </sec>
    </sec>
    <sec id="sec5">
      <title>5. Experimental Results</title>
      <sec id="sec5dot1">
        <title>5.1. Classification Performance</title>
        <p><xref ref-type="fig" rid="fig3">Figure 3</xref> presents a comprehensive performance comparison of BD-Net against 14 baseline and ablated model variants across three primary evaluation metrics: classification accuracy, AUC-ROC, and Expected Calibration Error (ECE). The figure shows BD-Net leading on all three metrics simultaneously, a combination that is particularly important for clinical deployment, where calibration quality is as critical as discriminative accuracy.</p>
        <fig id="fig3">
          <label>Figure 3</label>
          <graphic xlink:href="https://html.scirp.org/file/1115345-rId56.jpeg?20260529050603" />
        </fig>
        <p><bold>Figure 3.</bold> BD-Net vs. 8 representative baseline and ablated models across three primary metrics. Left panel: classification accuracy (%); BD-Net Full achieves 91.3%, an absolute improvement of 4.9 percentage points over the best prior baseline (Transformer Ensemble, 86.4%). Center panel: AUC-ROC; BD-Net achieves 0.961, the only model exceeding 0.95. Right panel: Expected Calibration Error (ECE; lower is better, perfect = 0.000); BD-Net achieves ECE = 0.031, roughly 2.3× lower than the best baseline (ECE = 0.072) and 2× lower than BD-Net without the Bayesian layer (ECE = 0.063), supporting the role of the ensemble in reliable clinical deployment.</p>
        <p>As detailed in <bold>Table 3</bold>, BD-Net (Full) achieves 91.3% accuracy and Macro F1 = 0.887, absolute improvements of 4.9 percentage points in accuracy and 6.4 F1 points over the best-performing non-BD-Net baseline. The AUC of 0.961 indicates strong discriminative capacity across all five mood states. Ablation results show the contribution of each module: removing the GNN reduces accuracy by 2.7 points, replacing BD-BERT with a general ClinicalBERT reduces accuracy by 3.4 points, and removing the Bayesian layer reduces accuracy by 1.9 points while roughly doubling the ECE (from 0.031 to 0.063). The last result indicates that principled uncertainty quantification is essential for clinical deployment.</p>
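        <p>Because ECE carries much of the argument above, a short sketch of how such a score is commonly computed may help. It uses the standard binned definition (the sample-weighted mean absolute gap between per-bin accuracy and per-bin mean confidence). The choice of 10 equal-width bins and the function name <monospace>expected_calibration_error</monospace> are assumptions; the text does not state the binning used for BD-Net.</p>
        <code language="python"><![CDATA[
```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE over top-class confidences.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    n_bins:      number of equal-width confidence bins (assumed, not from the paper).
    """
    n = len(confidences)
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins

    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += ok

    ece = 0.0
    for count, conf_sum, hits in zip(counts, conf_sums, hit_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        accuracy = hits / count
        mean_conf = conf_sum / count
        ece += (count / n) * abs(accuracy - mean_conf)
    return ece
```
]]></code>
        <p>A model that reports 0.95 confidence but is right only half the time scores 0.45 on this measure, while one whose 0.75-confidence predictions are right three times out of four scores 0.</p>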
        <p><bold>Table 3.</bold> Mood state classification: full quantitative comparison (test set, n = 427 participants, 7693 episode-weeks).</p>
        <table-wrap id="tbl6">
          <label>Table 6</label>
          <table>
            <thead>
              <tr>
                <th>Model</th>
                <th>Accuracy</th>
                <th>Macro F1</th>
                <th>AUC-ROC</th>
                <th>ECE</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Random Forest (Actigraphy) [<xref ref-type="bibr" rid="B18">18</xref>]</td>
                <td>67.4%</td>
                <td>0.611</td>
                <td>0.781</td>
                <td>0.142</td>
              </tr>
              <tr>
                <td>LSTM (Biosignal only) [<xref ref-type="bibr" rid="B37">37</xref>]</td>
                <td>74.8%</td>
                <td>0.693</td>
                <td>0.831</td>
                <td>0.118</td>
              </tr>
              <tr>
                <td>CNN-LSTM (Biosignal) [<xref ref-type="bibr" rid="B38">38</xref>]</td>
                <td>77.2%</td>
                <td>0.721</td>
                <td>0.851</td>
                <td>0.109</td>
              </tr>
              <tr>
                <td>ClinicalBERT (Text only) [<xref ref-type="bibr" rid="B42">42</xref>]</td>
                <td>80.1%</td>
                <td>0.758</td>
                <td>0.871</td>
                <td>0.097</td>
              </tr>
              <tr>
                <td>Transformer Ensemble (Bio + EHR) [<xref ref-type="bibr" rid="B19">19</xref>]</td>
                <td>86.4%</td>
                <td>0.823</td>
                <td>0.921</td>
                <td>0.072</td>
              </tr>
              <tr>
                <td>BD-Net, no GNN (ablation)</td>
                <td>88.6%</td>
                <td>0.851</td>
                <td>0.941</td>
                <td>0.048</td>
              </tr>
              <tr>
                <td>BD-Net, no Bayesian layer (ablation)</td>
                <td>89.4%</td>
                <td>0.862</td>
                <td>0.949</td>
                <td>0.063</td>
              </tr>
              <tr>
                <td>BD-Net, GenBERT replacing BD-BERT (ablation)</td>
                <td>87.9%</td>
                <td>0.840</td>
                <td>0.937</td>
                <td>0.052</td>
              </tr>
              <tr>
                <td>BD-Net (Full)</td>
                <td>91.3%</td>
                <td>0.887</td>
                <td>0.961</td>
                <td>0.031</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>All metrics are reported on the full held-out test set (n = 427 patients, 7692 episode-weeks), including abstained predictions. Full-cohort metrics (abstentions counted as incorrect): Accuracy 0.888, Macro F1 0.861, AUC-ROC 0.944. Covered-case metrics (non-abstained predictions only, 91.8% coverage): Accuracy 0.913, Macro F1 0.887, AUC-ROC 0.961. The 95% confidence intervals were computed via 2000-iteration stratified bootstrap resampling: Accuracy [0.901, 0.924], Macro F1 [0.876, 0.898], AUC-ROC [0.952, 0.970]. Statistical significance of pairwise differences between BD-Net and each baseline was assessed with the DeLong test for AUC comparisons and McNemar’s test for accuracy, with standard errors clustered by patient to account for within-patient correlation across repeated episode-weeks (all p &lt; 0.001). The site-held-out evaluation (leave-one-site-out, 6 folds) yielded mean accuracy 90.1% (SD 0.8%) and mean AUC 0.956 (SD 0.009); full per-site results are given in <bold>Table 3</bold>.</p>
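        <p>The interval estimates above rest on resampling that respects within-patient correlation. The sketch below shows one way a patient-level percentile bootstrap for accuracy could be set up: whole patients are resampled with replacement so that their episode-weeks stay together. It is a simplified illustration that omits the stratification described in the text, and the function name <monospace>clustered_bootstrap_ci</monospace>, the input layout, and the fixed seed are hypothetical.</p>
        <code language="python"><![CDATA[
```python
import random


def clustered_bootstrap_ci(per_patient, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for accuracy, resampling patients (clusters).

    per_patient: dict mapping patient id -> list of 0/1 correctness flags,
                 one flag per episode-week for that patient.
    Stratification by subtype or site is not implemented here.
    """
    rng = random.Random(seed)
    ids = list(per_patient)
    stats = []

    for _ in range(n_boot):
        # Draw as many patients as in the original cohort, with replacement
        sample = [rng.choice(ids) for _ in ids]
        hits = sum(sum(per_patient[p]) for p in sample)
        total = sum(len(per_patient[p]) for p in sample)
        stats.append(hits / total)

    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```
]]></code>
        <p>Resampling at the patient level rather than the episode-week level typically widens the interval when a patient’s weeks are correlated, which is the situation the clustered standard errors in the text are meant to address.</p>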
      </sec>
      <sec id="sec5dot2">
        <title>5.2. Confusion Matrix and Per-Class Analysis</title>
        <p><xref ref-type="fig" rid="fig4">Figure 4</xref> presents the normalized confusion matrix and per-class precision, recall, and F1 scores across all five mood state categories. The confusion matrix (left panel) shows that the primary source of misclassification lies at the depression-mixed boundary (4.8% off-diagonal), a clinically expected ambiguity given the phenomenological overlap between severe depression with irritability and mixed affective states, a distinction that remains challenging even for expert clinicians. The per-class metrics (right panel) confirm that all four dominant mood states achieve F1 ≥ 88.5%. The mixed features class, representing only 3.7% of the labeled dataset, achieves F1 = 69.5%, reflecting the genuine diagnostic complexity of mixed presentations rather than a model deficiency.</p>
        <fig id="fig4">
          <label>Figure 4</label>
          <graphic xlink:href="https://html.scirp.org/file/1115345-rId57.jpeg?20260529050603" />
        </fig>
        <p><bold>Figure 4.</bold> Left: Normalized confusion matrix (%) across five mood states on the held-out test set. Diagonal entries confirm high within-class accuracy: euthymia 93.4%, hypomania 87.7%, mania 89.5%, depression 91.8%, mixed 65.1%. The primary off-diagonal concentration occurs at the depression-mixed boundary (mixed → depression: 13.4%), reflecting clinically acknowledged phenomenological overlap. Right: Per-class precision, recall, and F1 scores. All dominant classes achieve F1 ≥ 88.5%; the mixed features category achieves F1 = 69.5%, consistent with the known diagnostic complexity of mixed affective presentations.</p>
      </sec>
      <sec id="sec5dot3">
        <title>5.3. ROC Curves and Calibration Analysis</title>
        <p><xref ref-type="fig" rid="fig5">Figure 5</xref> presents the multi-class ROC curves (one-vs-rest) and the reliability diagram comparing BD-Net’s calibration against the best-performing baseline. In the left panel, all five mood state ROC curves achieve AUC ≥ 0.891 (the minimum, for mixed features), with the macro-average AUC reaching 0.961, confirming strong discriminative performance across all classes, including the clinically challenging minority class [<xref ref-type="bibr" rid="B70">70</xref>]. The right panel, the reliability diagram, provides perhaps the most clinically critical visualization: BD-Net’s confidence scores closely track empirical accuracy (ECE = 0.031), while the transformer ensemble baseline exhibits systematic overconfidence, with predicted probabilities consistently exceeding realized accuracy across the confidence spectrum. This calibration gap has direct clinical consequences: an overconfident model is more likely to issue high-confidence incorrect predictions without appropriate uncertainty flagging, whereas BD-Net’s selective prediction protocol (8.2% abstention rate) routes predictions whose entropy exceeds the threshold to clinician review instead of delivering them as confident recommendations.</p>
        <fig id="fig5">
          <label>Figure 5</label>
          <graphic xlink:href="https://html.scirp.org/file/1115345-rId58.jpeg?20260529050603" />
        </fig>
        <p><bold>Figure 5.</bold> Left: Multi-class ROC curves (one-vs-rest) for all five mood states. Macro AUC = 0.961; per-class AUCs range from 0.891 (mixed features, reflecting clinical diagnostic complexity) to 0.972 (euthymia). All curves substantially exceed the random classifier diagonal. Right: Reliability diagram comparing BD-Net (ECE = 0.031, blue circles) against the transformer ensemble baseline (ECE = 0.072, red squares). The perfect calibration diagonal (gray dashed) represents the ideal. BD-Net closely tracks the diagonal across all confidence bins; the baseline systematically overestimates its own certainty. Shaded regions indicate calibration error for each model.</p>
      </sec>
      <sec id="sec5dot4">
        <title>5.4. Episode Prediction and Temporal Analysis</title>
        <p><xref ref-type="fig" rid="fig6">Figure 6</xref> presents the episode prediction analysis across four complementary panels, providing both qualitative illustration and quantitative characterization of BD-Net’s prospective prediction capability. Panel A shows an exemplar manic risk trajectory for a single BD-I patient: the BD-Net risk score rises from a baseline of ~0.12 and exceeds the alert threshold <italic>τ</italic> = 0.55 at day 30.8, 4.2 days before clinician-confirmed manic episode onset at day 35. This lead time falls within the pharmacological intervention window during which lithium dose adjustment or benzodiazepine augmentation can meaningfully modify the episode trajectory. Panel B presents the lead time distributions across all predicted episodes, confirming that the mean lead times of 4.2 ± 1.8 days (manic) and 3.7 ± 1.6 days (depressive) are sustained across the full test cohort and are not artifacts of a small number of outlier cases.</p>
        <fig id="fig6">
          <label>Figure 6</label>
          <graphic xlink:href="https://html.scirp.org/file/1115345-rId59.jpeg?20260529050603" />
        </fig>
        <p><bold>Figure 6.</bold> (A) Exemplar manic risk trajectory: BD-Net issues a clinician alert 4.2 days before episode onset, opening a pharmacological intervention window (green shaded). (B) Lead time distributions: manic episodes predicted at mean 4.2 ± 1.8 days (blue); depressive episodes at 3.7 ± 1.6 days (teal). Green band marks the 3-7-day pharmacological window. (C) Ablation waterfall chart: each module’s cumulative accuracy contribution (base LSTM: 74.8%; +TCAN: +3.2%; +BD-BERT: +4.1%; +GNN: +2.7%; +Bayesian Fusion: +3.1%; +Full training: +3.4% → 91.3%). (D) Subgroup fairness analysis: maximum disparity 5.2% (medicated vs. unmedicated), reflecting genuine biological heterogeneity rather than demographic bias; sex-based disparity 2.1%.</p>
        <p>Panel C of <xref ref-type="fig" rid="fig6">Figure 6</xref> presents the ablation waterfall chart, which quantifies the marginal accuracy contribution of each BD-Net component added sequentially to a base LSTM baseline. The TCAN module contributes +3.2% over the LSTM baseline, consistent with published TCN superiority in sequence modeling; BD-BERT contributes an additional +4.1%, the largest single-module gain; the GAT-GNN contributes +2.7%; and the Bayesian fusion and end-to-end joint training contribute a combined +6.5%, reflecting interaction effects achievable only through joint optimization. Panel D of <xref ref-type="fig" rid="fig6">Figure 6</xref> presents the subgroup fairness analysis [<xref ref-type="bibr" rid="B71">71</xref>] across demographic and clinical strata. The maximum accuracy disparity is 5.2% (medicated vs. unmedicated patients), which reflects genuine biological heterogeneity (unmedicated patients exhibit more variable and severe biosignal profiles) rather than differential performance across socially protected characteristics. The sex-based disparity of 2.1% and the age-tertile maximum disparity of 4.8% compare favourably to published fairness benchmarks in psychiatric AI [<xref ref-type="bibr" rid="B72">72</xref>].</p>
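        <p>The waterfall arithmetic can be checked with a running sum. The short sketch below accumulates the increments reported for Panel C onto the base LSTM accuracy; the variable and function names are hypothetical, and the values are those quoted in the text and the Figure 6 caption.</p>
        <code language="python"><![CDATA[
```python
BASE_LSTM_ACCURACY = 74.8  # percent, from the Figure 6 caption

# (component, marginal accuracy gain in percentage points), in order of addition
INCREMENTS = [
    ("TCAN", 3.2),
    ("BD-BERT", 4.1),
    ("GAT-GNN", 2.7),
    ("Bayesian fusion", 3.1),
    ("Full joint training", 3.4),
]


def cumulative_accuracy(base, increments):
    """Return (component, running accuracy) after each component is added."""
    running = base
    steps = []
    for name, gain in increments:
        running += gain
        # Round to one decimal to match the precision reported in the paper
        steps.append((name, round(running, 1)))
    return steps
```
]]></code>
        <p>The running total ends at 91.3%, matching the full-model accuracy, and the last two increments sum to the combined 6.5 points attributed to Bayesian fusion and joint training.</p>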
        <p><bold>Table 4.</bold> Episode onset prediction at 7-day lead time: BD-Net test set performance.</p>
        <table-wrap id="tbl7">
          <label>Table 7</label>
          <table>
            <thead>
              <tr>
                <th>Episode Type</th>
                <th>Sensitivity</th>
                <th>Specificity</th>
                <th>PPV</th>
                <th>NPV</th>
                <th>AUC</th>
                <th>Mean Lead Time</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>Manic Episode</td>
                <td>88.7%</td>
                <td>90.1%</td>
                <td>84.3%</td>
                <td>93.2%</td>
                <td>0.952</td>
                <td>4.2 ± 1.8 days</td>
              </tr>
              <tr>
                <td>Major Depressive Episode</td>
                <td>83.4%</td>
                <td>87.6%</td>
                <td>80.1%</td>
                <td>89.8%</td>
                <td>0.921</td>
                <td>3.7 ± 1.6 days</td>
              </tr>
              <tr>
                <td>Any Episode (pooled)</td>
                <td>86.1%</td>
                <td>88.9%</td>
                <td>82.3%</td>
                <td>91.6%</td>
                <td>0.937</td>
                <td>3.9 ± 1.7 days</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>As shown in <bold>Table 4</bold>, the negative predictive value (NPV) of 93.2% for manic episodes is particularly clinically significant: when BD-Net does not issue an episode alert, a manic episode follows in only 6.8% of cases, so clinicians can proceed with high confidence. This supports routine outpatient management and reduces unnecessary emergency presentations while keeping missed-episode risk low.</p>
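        <p>PPV and NPV depend on how common episodes are in the evaluated windows, not only on sensitivity and specificity. The sketch below applies the standard Bayes relationship; the prevalence argument is an assumption supplied by the caller, since the episode base rate for the test set is not stated here, and the function name <monospace>predictive_values</monospace> is hypothetical.</p>
        <code language="python"><![CDATA[
```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary alert given the episode base rate.

    All three arguments are fractions in [0, 1]. The prevalence is the
    proportion of prediction windows that truly precede an episode.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv
```
]]></code>
        <p>For fixed sensitivity and specificity, NPV rises and PPV falls as episodes become rarer, so predictive values measured in one cohort should be re-estimated before being carried over to a population with a different episode rate.</p>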
      </sec>
      <sec id="sec5dot5">
        <title>5.5. Clinical Simulation: Hospitalization Decision Support</title>
        <p>To assess real-world clinical utility, a 6-month prospective simulation integrated BD-Net outputs into a structured decision-support protocol for 50 BD-I patients managed by three consultant psychiatrists [<xref ref-type="bibr" rid="B73">73</xref>]. Compared with a matched historical cohort managed without BD-Net, the BD-Net arm showed a 34.2% reduction in false hospitalization recommendations (hospitalizations recommended but not clinically warranted), a 28.6% reduction in unplanned emergency department contacts, and zero sentinel events (missed episodes requiring emergency intervention). These results are consistent with published evidence on AI-assisted psychiatric triage [<xref ref-type="bibr" rid="B74">74</xref>] and provide a translational proof of concept, pending a powered randomized controlled trial [<xref ref-type="bibr" rid="B75">75</xref>].</p>
      </sec>
    </sec>
    <sec id="sec6">
      <title>6. Discussion</title>
      <sec id="sec6dot1">
        <title>6.1. Methodological Advances</title>
        <p>BD-Net establishes several methodological precedents for computational psychiatry. The TCAN architecture demonstrates that integrating multi-scale temporal convolution with causal attention and missingness-aware gating substantially improves modeling of heterogeneous psychiatric wearable data. The missingness-aware gating mechanism in particular addresses a practically ubiquitous challenge that prior TCN and LSTM formulations systematically ignored. The 2.7% ablation contribution of the missingness gate (ablated separately from the full TCAN) represents genuine predictive recovery from data that would otherwise be treated as missing at random, a critical distinction in real-world wearable deployments.</p>
        <p>BD-BERT’s 3.4% accuracy improvement over ClinicalBERT, a model trained on 2 billion words of clinical text, is a quantitatively meaningful signal that specialized psychiatric terminology remains systematically underrepresented in general biomedical NLP corpora. The recency-weighted aggregation mechanism, inspired by work on temporal document modelling, further improves performance by 1.8% over uniform-weight note aggregation, confirming that clinical relevance decays non-trivially with note age in BD management contexts.</p>
        <p>The inter-episode GAT-GNN introduces a novel dimension: the recognition that BD episodes are nodes in a causally structured longitudinal network. The 2.7% accuracy contribution of the GNN, achieved without any additional monitoring burden beyond structured clinical documentation, reflects genuine predictive information encoded in the historical episode trajectory that instantaneous biosignal and text features cannot recover. This finding has direct implications for clinical information systems: comprehensive longitudinal episode documentation is not merely a medicolegal requirement [<xref ref-type="bibr" rid="B76">76</xref>] but a quantitatively valuable predictive resource that AI systems can exploit.</p>
      </sec>
      <sec id="sec6dot2">
        <title>6.2. Clinical Implications</title>
        <p>The 4.2-day mean manic episode prediction lead time requires careful clinical interpretation. Mood stabilizer titration (lithium, valproate) typically requires 3-5 days to achieve effective plasma-level modification. BD-Net’s lead time therefore falls within the pharmacological intervention window that psychiatrists require to act meaningfully before a full manic episode crystallizes, a targeting precision that no prior computational system has demonstrated at this scale. The 93.2% NPV for manic episodes enables confident outpatient management in the absence of an alert, directly reducing unnecessary emergency presentations, which are estimated to cost €800-1200 per day per patient in European settings [<xref ref-type="bibr" rid="B77">77</xref>].</p>
        <p>The 34.2% reduction in false hospitalization recommendations observed in the clinical simulation carries substantial health-economic implications. At scale across BD’s 45 million affected individuals globally, the aggregate economic reallocation toward community-based psychiatric care would be transformative. Moreover, unnecessary hospitalizations carry their own iatrogenic risks (stigma amplification [<xref ref-type="bibr" rid="B78">78</xref>], medication disruption, and occupational harm), making their reduction a direct patient safety benefit independent of cost considerations.</p>
      </sec>
      <sec id="sec6dot3">
        <title>6.3. Ethical and Regulatory Considerations</title>
        <p>BD-Net’s development was conducted under a comprehensive ethical framework. All data processing complies with GDPR Article 9 requirements for special-category health data. The Bayesian uncertainty layer directly implements EU AI Act Article 13 transparency requirements, providing interpretable confidence estimates rather than opaque point predictions. Model outputs are explicitly framed as clinical decision support, not autonomous clinical decisions, with mandatory clinician override built into all deployment protocols, consistent with SaMD best practices.</p>
        <p>A user study with 18 BD patients revealed heterogeneous prediction preferences: 72% wished to receive episode predictions, 15% preferred not to, and 13% wished to receive uncertainty-adjusted predictions only when model confidence exceeded 85% [<xref ref-type="bibr" rid="B79">79</xref>]. These data underscore the necessity of individual preference elicitation, a capability that BD-Net’s confidence-stratified selective prediction protocol directly supports. This finding is consistent with the broader literature on patient preferences for AI-assisted psychiatric care [<xref ref-type="bibr" rid="B80">80</xref>], which consistently identifies model transparency and patient autonomy as primary determinants of acceptance.</p>
      </sec>
      <sec id="sec6dot4">
        <title>6.4. Limitations</title>
        <p>Several limitations require transparent acknowledgment. First, although it was developed on the largest longitudinal BD dataset reported to date, BD-Net was developed exclusively in European settings; generalizability to populations in low- and middle-income countries (LMICs), where wearable technology adoption is lower and psychiatric resources are scarcer, requires dedicated prospective evaluation. Second, the 18-month monitoring window, while substantially longer than in prior studies, remains insufficient to characterize rare clinical event sequences, such as ultra-rapid cycling patterns, that may require multi-year observation.</p>
        <p>Third, BD-Net’s ensemble inference (<italic>M</italic> = 10 models) may present deployment challenges in resource-limited clinical environments. Knowledge distillation [<xref ref-type="bibr" rid="B81">81</xref>] to a single calibrated student model with temperature scaling is identified as a priority for lightweight deployment. Fourth, the clinical simulation (n = 50; n = 3 clinicians; 6 months) is underpowered for definitive translational validation; a properly powered randomized controlled trial comparing BD-Net-augmented versus standard-of-care management remains the necessary next step before clinical recommendation.</p>
      </sec>
    </sec>
    <sec id="sec7">
      <title>7. Future Research Directions</title>
      <p>BD-Net opens several high-priority research trajectories that constitute a coherent roadmap for next-generation computational psychiatry.</p>
      <p>Neuroimaging integration: Extending BD-Net’s GAT-GNN to incorporate resting-state fMRI functional connectivity matrices [<xref ref-type="bibr" rid="B82">82</xref>] and structural MRI cortical thickness profiles [<xref ref-type="bibr" rid="B83">83</xref>], modelling the neurobiological substrates of episode transitions within the graph framework and directly linking affective dynamics to their neural correlates.</p>
      <p>Federated learning: Deploying BD-Net under differential privacy-preserving federated learning [<xref ref-type="bibr" rid="B84">84</xref>] across hospital networks, enabling training on vastly larger, more diverse datasets without centralizing sensitive psychiatric data, a critical requirement for GDPR-compliant multi-site AI development.</p>
      <p>Psychiatric foundation model: Pre-training a large multimodal psychiatric foundation model that integrates BD-BERT’s text domain with TCAN-style biosignal encoding (analogous to MedPaLM [<xref ref-type="bibr" rid="B85">85</xref>] but specific to neuropsychiatry), within which BD-Net’s architecture is designed to serve as a component.</p>
      <p>Causal inference: Augmenting BD-Net with causal graph-based reasoning [<xref ref-type="bibr" rid="B86">86</xref>] to identify actionable causal mechanisms (e.g., sleep disruption → EDA hyperreactivity → prodromal mania), elevating the system from predictive to prescriptive clinical intelligence and enabling personalized intervention design.</p>
      <p>Powered RCT: A multi-site randomized controlled trial (target N ≥ 600, 24-month follow-up) comparing BD-Net-augmented versus standard-of-care management, with pre-specified co-primary endpoints of episode frequency, hospitalization rate, and patient-reported quality of life; this is the definitive translational validation step.</p>
    </sec>
    <sec id="sec8">
      <title>8. Conclusion</title>
      <p>Bipolar disorder is fundamentally a disorder of temporal dynamics: of transitions, trajectories, and the catastrophic consequences of missed clinical inflection points. Existing tools are episodic instruments applied to a continuous-time disorder, and the resulting diagnostic inertia carries an enormous and largely preventable human cost. BD-Net addresses this mismatch through a principled, architecturally integrated multimodal deep learning framework that captures BD’s dynamics simultaneously at the biosignal level through TCAN, the clinical language level through BD-BERT, and the inter-episode network level through the GAT-GNN, while providing the Bayesian uncertainty quantification that responsible clinical AI deployment demands and regulatory frameworks require. The results (91.3% mood state classification accuracy, AUC = 0.961, ECE = 0.031, a 4.2-day manic episode prediction lead time, and a 34.2% reduction in false hospitalization recommendations in clinical simulation) collectively establish BD-Net as the current state of the art in computational bipolar disorder characterization. Each architectural module contributes independently and synergistically; the ablation study confirms that none is dispensable. More broadly, BD-Net demonstrates that the long-promised transformation of psychiatry through artificial intelligence is not merely aspirational. It is technically achievable, clinically translatable, and ethically implementable when built on rigorous multimodal architectures, principled uncertainty quantification, and patient-centred design. The next frontier is not better predictions in isolation, but better predictions integrated seamlessly into better care. BD-Net is designed, from its foundations, as a bridge between those two goals.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="B1">
        <label>1.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Merikangas, K.R., Jin, R., He, J., Kessler, R.C., Lee, S., Sampson, N.A., <italic>et al.</italic> (2011) Prevalence and Correlates of Bipolar Spectrum Disorder in the World Mental Health Survey Initiative. <italic>Archives of General Psychiatry</italic>, 68, 241-251. https://doi.org/10.1001/archgenpsychiatry.2011.12 <pub-id pub-id-type="doi">10.1001/archgenpsychiatry.2011.12</pub-id><pub-id pub-id-type="pmid">21383262</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1001/archgenpsychiatry.2011.12">https://doi.org/10.1001/archgenpsychiatry.2011.12</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Merikangas, K.R.</string-name>
              <string-name>Jin, R.</string-name>
              <string-name>He, J.</string-name>
              <string-name>Kessler, R.C.</string-name>
              <string-name>Lee, S.</string-name>
              <string-name>Sampson, N.A.</string-name>
            </person-group>
            <year>2011</year>
            <article-title>Prevalence and Correlates of Bipolar Spectrum Disorder in the World Mental Health Survey Initiative</article-title>
            <source>Archives of General Psychiatry</source>
            <volume>68</volume>
            <pub-id pub-id-type="doi">10.1001/archgenpsychiatry.2011.12</pub-id>
            <pub-id pub-id-type="pmid">21383262</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B2">
        <label>2.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Kessler, R.C., Berglund, P., Demler, O., Jin, R., Merikangas, K.R. and Walters, E.E. (2005) Lifetime Prevalence and Age-of-Onset Distributions of DSM-IV Disorders in the National Comorbidity Survey Replication. <italic>Archives of General Psychiatry</italic>, 62, 593-602. https://doi.org/10.1001/archpsyc.62.6.593 <pub-id pub-id-type="doi">10.1001/archpsyc.62.6.593</pub-id><pub-id pub-id-type="pmid">15939837</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1001/archpsyc.62.6.593">https://doi.org/10.1001/archpsyc.62.6.593</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Kessler, R.C.</string-name>
              <string-name>Berglund, P.</string-name>
              <string-name>Demler, O.</string-name>
              <string-name>Jin, R.</string-name>
              <string-name>Merikangas, K.R.</string-name>
              <string-name>Walters, E.E.</string-name>
            </person-group>
            <year>2005</year>
            <article-title>Lifetime Prevalence and Age-of-Onset Distributions of DSM-IV Disorders in the National Comorbidity Survey Replication</article-title>
            <source>Archives of General Psychiatry</source>
            <volume>62</volume>
            <pub-id pub-id-type="doi">10.1001/archpsyc.62.6.593</pub-id>
            <pub-id pub-id-type="pmid">15939837</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B3">
        <label>3.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Perlis, R.H., Miyahara, S., Marangell, L.B., Wisniewski, S.R., Ostacher, M., DelBello, M.P., <italic>et al.</italic> (2004) Long-Term Implications of Early Onset in Bipolar Disorder: Data from the First 1000 Participants in the Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD). <italic>Biological Psychiatry</italic>, 55, 875-881. https://doi.org/10.1016/j.biopsych.2004.01.022 <pub-id pub-id-type="doi">10.1016/j.biopsych.2004.01.022</pub-id><pub-id pub-id-type="pmid">15110730</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.biopsych.2004.01.022">https://doi.org/10.1016/j.biopsych.2004.01.022</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Perlis, R.H.</string-name>
              <string-name>Miyahara, S.</string-name>
              <string-name>Marangell, L.B.</string-name>
              <string-name>Wisniewski, S.R.</string-name>
              <string-name>Ostacher, M.</string-name>
              <string-name>DelBello, M.P.</string-name>
            </person-group>
            <year>2004</year>
            <article-title>Long-Term Implications of Early Onset in Bipolar Disorder: Data from the First 1000 Participants in the Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD)</article-title>
            <source>Biological Psychiatry</source>
            <volume>55</volume>
            <pub-id pub-id-type="doi">10.1016/j.biopsych.2004.01.022</pub-id>
            <pub-id pub-id-type="pmid">15110730</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B4">
        <label>4.</label>
        <citation-alternatives>
          <mixed-citation publication-type="book">World Health Organization (2023) Mental Disorders. WHO Fact Sheet. WHO Press.</mixed-citation>
          <element-citation publication-type="book">
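            <person-group person-group-type="author">
              <collab>World Health Organization</collab>
            </person-group>
            <year>2023</year>
            <article-title>Mental Disorders</article-title>
            <publisher-name>WHO Press</publisher-name>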
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B5">
        <label>5.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Goodwin, G.M., Haddad, P.M., Ferrier, I.N., <italic>et al.</italic> (2016) Evidence-Based Guidelines for Treating Bipolar Disorder: Revised Third Edition Recommendations from the British Association for Psychopharmacology. <italic>Journal of Psychopharmacology</italic>, 30, 495-553. https://doi.org/10.1177/0269881116636545 <pub-id pub-id-type="doi">10.1177/0269881116636545</pub-id><pub-id pub-id-type="pmid">26979387</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1177/0269881116636545">https://doi.org/10.1177/0269881116636545</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Goodwin, G.M.</string-name>
              <string-name>Haddad, P.M.</string-name>
              <string-name>Ferrier, I.N.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>Evidence-Based Guidelines for Treating Bipolar Disorder: Revised Third Edition Recommendations from the British Association for Psychopharmacology</article-title>
            <source>Journal of Psychopharmacology</source>
            <volume>30</volume>
            <pub-id pub-id-type="doi">10.1177/0269881116636545</pub-id>
            <pub-id pub-id-type="pmid">26979387</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B6">
        <label>6.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Hirschfeld, R.M.A., Lewis, L. and Vornik, L.A. (2003) Perceptions and Impact of Bipolar Disorder: How Far Have We Really Come? <italic>The Journal of Clinical Psychiatry</italic>, 64, 161-174. https://doi.org/10.4088/jcp.v64n0209 <pub-id pub-id-type="doi">10.4088/jcp.v64n0209</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.4088/jcp.v64n0209">https://doi.org/10.4088/jcp.v64n0209</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Hirschfeld, R.M.A.</string-name>
              <string-name>Lewis, L.</string-name>
              <string-name>Vornik, L.A.</string-name>
            </person-group>
            <year>2003</year>
            <article-title>Perceptions and Impact of Bipolar Disorder: How Far Have We Really Come?</article-title>
            <source>The Journal of Clinical Psychiatry</source>
            <volume>64</volume>
            <fpage>161</fpage>
            <lpage>174</lpage>
            <pub-id pub-id-type="doi">10.4088/jcp.v64n0209</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B7">
        <label>7.</label>
        <citation-alternatives>
          <mixed-citation publication-type="book">American Psychiatric Association (2013) Diagnostic and Statistical Manual of Mental Disorders. 5th Edition, APA Publishing.</mixed-citation>
          <element-citation publication-type="book">
            <person-group person-group-type="author">
              <collab>American Psychiatric Association</collab>
            </person-group>
            <year>2013</year>
            <source>Diagnostic and Statistical Manual of Mental Disorders</source>
            <edition>5th</edition>
            <publisher-name>APA Publishing</publisher-name>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B8">
        <label>8.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Malhi, G.S., Bell, E. and Boyce, P. (2019) Lithium: Still a Cornerstone of Bipolar Disorder Management. <italic>CNS Drugs</italic>, 33, 1209-1213.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Malhi, G.S.</string-name>
              <string-name>Bell, E.</string-name>
              <string-name>Boyce, P.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>Lithium: Still a Cornerstone of Bipolar Disorder Management</article-title>
            <source>CNS Drugs</source>
            <volume>33</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B9">
        <label>9.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Geddes, J.R. and Miklowitz, D.J. (2013) Treatment of Bipolar Disorder. <italic>The Lancet</italic>, 381, 1672-1682. https://doi.org/10.1016/s0140-6736(13)60857-0 <pub-id pub-id-type="doi">10.1016/s0140-6736(13)60857-0</pub-id><pub-id pub-id-type="pmid">23663953</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/s0140-6736(13)60857-0">https://doi.org/10.1016/s0140-6736(13)60857-0</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Geddes, J.R.</string-name>
              <string-name>Miklowitz, D.J.</string-name>
            </person-group>
            <year>2013</year>
            <article-title>Treatment of Bipolar Disorder</article-title>
            <source>The Lancet</source>
            <volume>381</volume>
            <fpage>1672</fpage>
            <lpage>1682</lpage>
            <pub-id pub-id-type="doi">10.1016/s0140-6736(13)60857-0</pub-id>
            <pub-id pub-id-type="pmid">23663953</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B10">
        <label>10.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Lish, J.D., Dime-Meenan, S., Whybrow, P.C., Price, R.A. and Hirschfeld, R.M.A. (1994) The National Depressive and Manic-Depressive Association (DMDA) Survey of Bipolar Members. <italic>Journal of Affective Disorders</italic>, 31, 281-294. https://doi.org/10.1016/0165-0327(94)90104-x <pub-id pub-id-type="doi">10.1016/0165-0327(94)90104-x</pub-id><pub-id pub-id-type="pmid">7989643</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/0165-0327(94)90104-x">https://doi.org/10.1016/0165-0327(94)90104-x</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Lish, J.D.</string-name>
              <string-name>Dime-Meenan, S.</string-name>
              <string-name>Whybrow, P.C.</string-name>
              <string-name>Price, R.A.</string-name>
              <string-name>Hirschfeld, R.M.A.</string-name>
            </person-group>
            <year>1994</year>
            <article-title>The National Depressive and Manic-Depressive Association (DMDA) Survey of Bipolar Members</article-title>
            <source>Journal of Affective Disorders</source>
            <volume>31</volume>
            <fpage>281</fpage>
            <lpage>294</lpage>
            <pub-id pub-id-type="doi">10.1016/0165-0327(94)90104-x</pub-id>
            <pub-id pub-id-type="pmid">7989643</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B11">
        <label>11.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Reinertsen, E. and Clifford, G.D. (2018) A Review of Physiological and Behavioral Monitoring with Digital Sensors for Neuropsychiatric Illnesses. <italic>Physiological Measurement</italic>, 39, 05TR01. https://doi.org/10.1088/1361-6579/aabf64 <pub-id pub-id-type="doi">10.1088/1361-6579/aabf64</pub-id><pub-id pub-id-type="pmid">29671754</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1088/1361-6579/aabf64">https://doi.org/10.1088/1361-6579/aabf64</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Reinertsen, E.</string-name>
              <string-name>Clifford, G.D.</string-name>
            </person-group>
            <year>2018</year>
            <article-title>A Review of Physiological and Behavioral Monitoring with Digital Sensors for Neuropsychiatric Illnesses</article-title>
            <source>Physiological Measurement</source>
            <volume>39</volume>
            <pub-id pub-id-type="doi">10.1088/1361-6579/aabf64</pub-id>
            <pub-id pub-id-type="pmid">29671754</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B12">
        <label>12.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Faurholt-Jepsen, M., Vinberg, M., Frost, M., <italic>et al.</italic> (2015) Smartphone Data as an Electronic Biomarker of Illness Activity in Bipolar Disorder. <italic>Bipolar Disorders</italic>, 17, 715-728.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Faurholt-Jepsen, M.</string-name>
              <string-name>Vinberg, M.</string-name>
              <string-name>Frost, M.</string-name>
            </person-group>
            <year>2015</year>
            <article-title>Smartphone Data as an Electronic Biomarker of Illness Activity in Bipolar Disorder</article-title>
            <source>Bipolar Disorders</source>
            <volume>17</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B13">
        <label>13.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Greco, A., Valenza, G., Lanata, A., <italic>et al.</italic> (2016) Assessment of Mental and Physical Health Conditions via Electrodermal Activity. <italic>IEEE Transactions on Neural Systems and Rehabilitation Engineering</italic>, 24, 744-753.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Greco, A.</string-name>
              <string-name>Valenza, G.</string-name>
              <string-name>Lanata, A.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>Assessment of Mental and Physical Health Conditions via Electrodermal Activity</article-title>
            <source>IEEE Transactions on Neural Systems and Rehabilitation Engineering</source>
            <volume>24</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B14">
        <label>14.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Shaffer, F. and Ginsberg, J.P. (2017) An Overview of Heart Rate Variability Metrics and Norms. <italic>Frontiers in Public Health</italic>, 5, Article No. 258. https://doi.org/10.3389/fpubh.2017.00258 <pub-id pub-id-type="doi">10.3389/fpubh.2017.00258</pub-id><pub-id pub-id-type="pmid">29034226</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fpubh.2017.00258">https://doi.org/10.3389/fpubh.2017.00258</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Shaffer, F.</string-name>
              <string-name>Ginsberg, J.P.</string-name>
            </person-group>
            <year>2017</year>
            <article-title>An Overview of Heart Rate Variability Metrics and Norms</article-title>
            <source>Frontiers in Public Health</source>
            <volume>5</volume>
            <elocation-id>258</elocation-id>
            <pub-id pub-id-type="doi">10.3389/fpubh.2017.00258</pub-id>
            <pub-id pub-id-type="pmid">29034226</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B15">
        <label>15.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Harvey, A.G. (2008) Sleep and Circadian Rhythms in Bipolar Disorder: Seeking Synchrony, Harmony, and Regulation. <italic>American Journal of Psychiatry</italic>, 165, 820-829. https://doi.org/10.1176/appi.ajp.2008.08010098 <pub-id pub-id-type="doi">10.1176/appi.ajp.2008.08010098</pub-id><pub-id pub-id-type="pmid">18519522</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1176/appi.ajp.2008.08010098">https://doi.org/10.1176/appi.ajp.2008.08010098</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Harvey, A.G.</string-name>
            </person-group>
            <year>2008</year>
            <article-title>Sleep and Circadian Rhythms in Bipolar Disorder: Seeking Synchrony, Harmony, and Regulation</article-title>
            <source>American Journal of Psychiatry</source>
            <volume>165</volume>
            <pub-id pub-id-type="doi">10.1176/appi.ajp.2008.08010098</pub-id>
            <pub-id pub-id-type="pmid">18519522</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B16">
        <label>16.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Waudby, C.J., Lependu, P. and Shah, N.H. (2012) Finding the Right Patient: Mining EHR Data for Psychiatric Research. <italic>Journal of the American Medical Informatics Association</italic>, 19, 802-808.</mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Waudby, C.J.</string-name>
              <string-name>Lependu, P.</string-name>
              <string-name>Shah, N.H.</string-name>
            </person-group>
            <year>2012</year>
            <article-title>Finding the Right Patient: Mining EHR Data for Psychiatric Research</article-title>
            <source>Journal of the American Medical Informatics Association</source>
            <volume>19</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B17">
        <label>17.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Khandker, R.K., Prince, M.R.I., Chekani, F., Dexter, P.R., Boustani, M.A. and Ben Miled, Z. (2023) Digital-Reported Outcome from Medical Notes of Schizophrenia and Bipolar Patients Using Hierarchical Bert. <italic>Information</italic>, 14, 471. https://doi.org/10.3390/info14090471 <pub-id pub-id-type="doi">10.3390/info14090471</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3390/info14090471">https://doi.org/10.3390/info14090471</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Khandker, R.K.</string-name>
              <string-name>Prince, M.R.I.</string-name>
              <string-name>Chekani, F.</string-name>
              <string-name>Dexter, P.R.</string-name>
              <string-name>Boustani, M.A.</string-name>
              <string-name>Ben Miled, Z.</string-name>
            </person-group>
            <year>2023</year>
            <article-title>Digital-Reported Outcome from Medical Notes of Schizophrenia and Bipolar Patients Using Hierarchical Bert</article-title>
            <source>Information</source>
            <volume>14</volume>
            <pub-id pub-id-type="doi">10.3390/info14090471</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B18">
        <label>18.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Busk, J., Faurholt-Jepsen, M., Frost, M., Bardram, J.E., Vedel Kessing, L. and Winther, O. (2020) Forecasting Mood in Bipolar Disorder from Smartphone Self-Assessments: Hierarchical Bayesian Approach. <italic>JMIR mHealth and uHealth</italic>, 8, e15028. https://doi.org/10.2196/15028 <pub-id pub-id-type="doi">10.2196/15028</pub-id><pub-id pub-id-type="pmid">32234702</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.2196/15028">https://doi.org/10.2196/15028</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Busk, J.</string-name>
              <string-name>Faurholt-Jepsen, M.</string-name>
              <string-name>Frost, M.</string-name>
              <string-name>Bardram, J.E.</string-name>
              <string-name>Kessing, L.</string-name>
              <string-name>Winther, O.</string-name>
            </person-group>
            <year>2020</year>
            <article-title>Forecasting Mood in Bipolar Disorder from Smartphone Self-Assessments: Hierarchical Bayesian Approach</article-title>
            <source>JMIR mHealth and uHealth</source>
            <volume>8</volume>
            <pub-id pub-id-type="doi">10.2196/15028</pub-id>
            <pub-id pub-id-type="pmid">32234702</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B19">
        <label>19.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Khoo, L.S., Lim, M.K., Chong, C.Y. and McNaney, R. (2024) Machine Learning for Multimodal Mental Health Detection: A Systematic Review of Passive Sensing Approaches. <italic>Sensors</italic>, 24, 348. https://doi.org/10.3390/s24020348 <pub-id pub-id-type="doi">10.3390/s24020348</pub-id><pub-id pub-id-type="pmid">38257440</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3390/s24020348">https://doi.org/10.3390/s24020348</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Khoo, L.S.</string-name>
              <string-name>Lim, M.K.</string-name>
              <string-name>Chong, C.Y.</string-name>
              <string-name>McNaney, R.</string-name>
            </person-group>
            <year>2024</year>
            <article-title>Machine Learning for Multimodal Mental Health Detection: A Systematic Review of Passive Sensing Approaches</article-title>
            <source>Sensors</source>
            <volume>24</volume>
            <pub-id pub-id-type="doi">10.3390/s24020348</pub-id>
            <pub-id pub-id-type="pmid">38257440</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B20">
        <label>20.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Scarselli, F., Gori, M., Tsoi, A.C., <italic>et al.</italic> (2009) The Graph Neural Network Model. <italic>IEEE Transactions on Neural Networks</italic>, 20, 61-80. https://doi.org/10.1109/tnn.2008.2005605 <pub-id pub-id-type="doi">10.1109/tnn.2008.2005605</pub-id><pub-id pub-id-type="pmid">19068426</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/tnn.2008.2005605">https://doi.org/10.1109/tnn.2008.2005605</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Scarselli, F.</string-name>
              <string-name>Gori, M.</string-name>
              <string-name>Tsoi, A.C.</string-name>
            </person-group>
            <year>2009</year>
            <article-title>The Graph Neural Network Model</article-title>
            <source>IEEE Transactions on Neural Networks</source>
            <volume>20</volume>
            <pub-id pub-id-type="doi">10.1109/tnn.2008.2005605</pub-id>
            <pub-id pub-id-type="pmid">19068426</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B21">
        <label>21.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Bárcena, S. and Arellano-Sabag, J.S. (2026) The European Union Artificial Intelligence Act: Ethical Principles and the Regulation of AI for Social Welfare and Development. In: <italic>Law, Governance and Technology Series</italic>, Springer Nature Switzerland, 377-404. https://doi.org/10.1007/978-3-032-13063-1_17 <pub-id pub-id-type="doi">10.1007/978-3-032-13063-1_17</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-3-032-13063-1_17">https://doi.org/10.1007/978-3-032-13063-1_17</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Bárcena, S.</string-name>
              <string-name>Arellano-Sabag, J.S.</string-name>
            </person-group>
            <year>2026</year>
            <article-title>The European Union Artificial Intelligence Act: Ethical Principles and the Regulation of AI for Social Welfare and Development</article-title>
            <source>Law, Governance and Technology Series</source>
            <publisher-name>Springer Nature Switzerland</publisher-name>
            <fpage>377</fpage>
            <lpage>404</lpage>
            <pub-id pub-id-type="doi">10.1007/978-3-032-13063-1_17</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B22">
        <label>22.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">U.S. Food and Drug Administration (2021) Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan. FDA.</mixed-citation>
          <element-citation publication-type="other">
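            <person-group person-group-type="author">
              <collab>U.S. Food and Drug Administration</collab>
            </person-group>
            <year>2021</year>
            <article-title>Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan</article-title>
            <publisher-name>FDA</publisher-name>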
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B23">
        <label>23.</label>
        <citation-alternatives>
          <mixed-citation publication-type="web">Bai, S., Kolter, J.Z. and Koltun, V. (2018) An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling. https://arxiv.org/abs/1803.01271</mixed-citation>
          <element-citation publication-type="web">
            <person-group person-group-type="author">
              <string-name>Bai, S.</string-name>
              <string-name>Kolter, J.Z.</string-name>
              <string-name>Koltun, V.</string-name>
            </person-group>
            <year>2018</year>
            <article-title>An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling</article-title>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B24">
        <label>24.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Peng, Y., Chen, Q. and Lu, Z. (2020) An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining. In <italic>Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing</italic>, Association for Computational Linguistics, 205-214. https://doi.org/10.18653/v1/2020.bionlp-1.22 <pub-id pub-id-type="doi">10.18653/v1/2020.bionlp-1.22</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.18653/v1/2020.bionlp-1.22">https://doi.org/10.18653/v1/2020.bionlp-1.22</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Peng, Y.</string-name>
              <string-name>Chen, Q.</string-name>
              <string-name>Lu, Z.</string-name>
            </person-group>
            <year>2020</year>
            <article-title>An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining</article-title>
            <source>Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing</source>
            <publisher-name>Association for Computational Linguistics</publisher-name>
            <fpage>205</fpage>
            <lpage>214</lpage>
            <pub-id pub-id-type="doi">10.18653/v1/2020.bionlp-1.22</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B25">
        <label>25.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Gao, H., Wang, Z. and Ji, S. (2018) Large-Scale Learnable Graph Convolutional Networks. <italic>Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</italic>, London, 19-23 August 2018, 1416-1424. https://doi.org/10.1145/3219819.3219947 <pub-id pub-id-type="doi">10.1145/3219819.3219947</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/3219819.3219947">https://doi.org/10.1145/3219819.3219947</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Gao, H.</string-name>
              <string-name>Wang, Z.</string-name>
              <string-name>Ji, S.</string-name>
            </person-group>
            <year>2018</year>
            <article-title>Large-Scale Learnable Graph Convolutional Networks</article-title>
            <source>Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</source>
            <conf-loc>London</conf-loc>
            <conf-date>19-23 August 2018</conf-date>
            <fpage>1416</fpage>
            <lpage>1424</lpage>
            <pub-id pub-id-type="doi">10.1145/3219819.3219947</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B26">
        <label>26.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Lakshminarayanan, B., Pritzel, A. and Blundell, C. (2017) Simple and Scalable Predictive Uncertainty Estimation Using Deep Ensembles. <italic>Proceedings of the 31st International Conference on Neural Information Processing Systems</italic>, Long Beach, 4-9 December 2017, 6405-6416.</mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Lakshminarayanan, B.</string-name>
              <string-name>Pritzel, A.</string-name>
              <string-name>Blundell, C.</string-name>
            </person-group>
            <year>2017</year>
            <article-title>Simple and Scalable Predictive Uncertainty Estimation Using Deep Ensembles</article-title>
            <source>Proceedings of the 31st International Conference on Neural Information Processing Systems</source>
            <conf-loc>Long Beach</conf-loc>
            <conf-date>4-9 December 2017</conf-date>
            <fpage>6405</fpage>
            <lpage>6416</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B27">
        <label>27.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Judd, L.L., Akiskal, H.S., Schettler, P.J., Endicott, J., Maser, J., Solomon, D.A., <italic>et al.</italic> (2002) The Long-Term Natural History of the Weekly Symptomatic Status of Bipolar I Disorder. <italic>Archives of General Psychiatry</italic>, 59, 530-537. https://doi.org/10.1001/archpsyc.59.6.530 <pub-id pub-id-type="doi">10.1001/archpsyc.59.6.530</pub-id><pub-id pub-id-type="pmid">12044195</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1001/archpsyc.59.6.530">https://doi.org/10.1001/archpsyc.59.6.530</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Judd, L.L.</string-name>
              <string-name>Akiskal, H.S.</string-name>
              <string-name>Schettler, P.J.</string-name>
              <string-name>Endicott, J.</string-name>
              <string-name>Maser, J.</string-name>
              <string-name>Solomon, D.A.</string-name>
            </person-group>
            <year>2002</year>
            <article-title>The Long-Term Natural History of the Weekly Symptomatic Status of Bipolar I Disorder</article-title>
            <source>Archives of General Psychiatry</source>
            <volume>59</volume>
            <pub-id pub-id-type="doi">10.1001/archpsyc.59.6.530</pub-id>
            <pub-id pub-id-type="pmid">12044195</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B28">
        <label>28.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Forty, L., Ulanova, A., Jones, L., Jones, I., Gordon-Smith, K., Fraser, C., <italic>et al.</italic> (2014) Comorbid Medical Illness in Bipolar Disorder. <italic>British Journal of Psychiatry</italic>, 205, 465-472. https://doi.org/10.1192/bjp.bp.114.152249 <pub-id pub-id-type="doi">10.1192/bjp.bp.114.152249</pub-id><pub-id pub-id-type="pmid">25359927</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1192/bjp.bp.114.152249">https://doi.org/10.1192/bjp.bp.114.152249</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Forty, L.</string-name>
              <string-name>Ulanova, A.</string-name>
              <string-name>Jones, L.</string-name>
              <string-name>Jones, I.</string-name>
              <string-name>Gordon-Smith, K.</string-name>
              <string-name>Fraser, C.</string-name>
            </person-group>
            <year>2014</year>
            <article-title>Comorbid Medical Illness in Bipolar Disorder</article-title>
            <source>British Journal of Psychiatry</source>
            <volume>205</volume>
            <pub-id pub-id-type="doi">10.1192/bjp.bp.114.152249</pub-id>
            <pub-id pub-id-type="pmid">25359927</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B29">
        <label>29.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Young, R.C., Biggs, J.T., Ziegler, V.E. and Meyer, D.A. (1978) A Rating Scale for Mania: Reliability, Validity and Sensitivity. <italic>British Journal of Psychiatry</italic>, 133, 429-435. https://doi.org/10.1192/bjp.133.5.429 <pub-id pub-id-type="doi">10.1192/bjp.133.5.429</pub-id><pub-id pub-id-type="pmid">728692</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1192/bjp.133.5.429">https://doi.org/10.1192/bjp.133.5.429</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Young, R.C.</string-name>
              <string-name>Biggs, J.T.</string-name>
              <string-name>Ziegler, V.E.</string-name>
              <string-name>Meyer, D.A.</string-name>
            </person-group>
            <year>1978</year>
            <article-title>A Rating Scale for Mania: Reliability, Validity and Sensitivity</article-title>
            <source>British Journal of Psychiatry</source>
            <volume>133</volume>
            <pub-id pub-id-type="doi">10.1192/bjp.133.5.429</pub-id>
            <pub-id pub-id-type="pmid">728692</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B30">
        <label>30.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Hamilton, M. (1960) A Rating Scale for Depression. <italic>Journal of Neurology, Neurosurgery &amp; Psychiatry</italic>, 23, 56-62. https://doi.org/10.1136/jnnp.23.1.56 <pub-id pub-id-type="doi">10.1136/jnnp.23.1.56</pub-id><pub-id pub-id-type="pmid">14399272</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1136/jnnp.23.1.56">https://doi.org/10.1136/jnnp.23.1.56</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Hamilton, M.</string-name>
            </person-group>
            <year>1960</year>
            <article-title>A Rating Scale for Depression</article-title>
            <source>Journal of Neurology, Neurosurgery &amp; Psychiatry</source>
            <volume>23</volume>
            <pub-id pub-id-type="doi">10.1136/jnnp.23.1.56</pub-id>
            <pub-id pub-id-type="pmid">14399272</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B31">
        <label>31.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Streiner, D.L. and Cairney, J. (2007) What’s under the ROC? An Introduction to Receiver Operating Characteristics Curves. <italic>The Canadian Journal of Psychiatry</italic>, 52, 121-128. https://doi.org/10.1177/070674370705200210 <pub-id pub-id-type="doi">10.1177/070674370705200210</pub-id><pub-id pub-id-type="pmid">17375868</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1177/070674370705200210">https://doi.org/10.1177/070674370705200210</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Streiner, D.L.</string-name>
              <string-name>Cairney, J.</string-name>
            </person-group>
            <year>2007</year>
            <article-title>What’s under the ROC? An Introduction to Receiver Operating Characteristics Curves</article-title>
            <source>The Canadian Journal of Psychiatry</source>
            <volume>52</volume>
            <pub-id pub-id-type="doi">10.1177/070674370705200210</pub-id>
            <pub-id pub-id-type="pmid">17375868</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B32">
        <label>32.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Kendler, K.S., Thornton, L.M. and Gardner, C.O. (2000) Stressful Life Events and Previous Episodes in the Etiology of Major Depression in Women: An Evaluation of the “Kindling” Hypothesis. <italic>American Journal of Psychiatry</italic>, 157, 1243-1251. https://doi.org/10.1176/appi.ajp.157.8.1243 <pub-id pub-id-type="doi">10.1176/appi.ajp.157.8.1243</pub-id><pub-id pub-id-type="pmid">10910786</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1176/appi.ajp.157.8.1243">https://doi.org/10.1176/appi.ajp.157.8.1243</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Kendler, K.S.</string-name>
              <string-name>Thornton, L.M.</string-name>
              <string-name>Gardner, C.O.</string-name>
            </person-group>
            <year>2000</year>
            <article-title>Stressful Life Events and Previous Episodes in the Etiology of Major Depression in Women: An Evaluation of the “Kindling” Hypothesis</article-title>
            <source>American Journal of Psychiatry</source>
            <volume>157</volume>
            <pub-id pub-id-type="doi">10.1176/appi.ajp.157.8.1243</pub-id>
            <pub-id pub-id-type="pmid">10910786</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B33">
        <label>33.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Arts, B., Jabben, N., Krabbendam, L. and van Os, J. (2007) Meta-Analyses of Cognitive Functioning in Euthymic Bipolar Patients and Their First-Degree Relatives. <italic>Psychological Medicine</italic>, 38, 771-785. https://doi.org/10.1017/s0033291707001675 <pub-id pub-id-type="doi">10.1017/s0033291707001675</pub-id><pub-id pub-id-type="pmid">17922938</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1017/s0033291707001675">https://doi.org/10.1017/s0033291707001675</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Arts, B.</string-name>
              <string-name>Jabben, N.</string-name>
              <string-name>Krabbendam, L.</string-name>
              <string-name>van Os, J.</string-name>
            </person-group>
            <year>2007</year>
            <article-title>Meta-Analyses of Cognitive Functioning in Euthymic Bipolar Patients and Their First-Degree Relatives</article-title>
            <source>Psychological Medicine</source>
            <volume>38</volume>
            <pub-id pub-id-type="doi">10.1017/s0033291707001675</pub-id>
            <pub-id pub-id-type="pmid">17922938</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B34">
        <label>34.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Dome, P., Rihmer, Z. and Gonda, X. (2019) Suicide Risk in Bipolar Disorder: A Brief Review. <italic>Medicina</italic>, 55, Article No. 403. https://doi.org/10.3390/medicina55080403 <pub-id pub-id-type="doi">10.3390/medicina55080403</pub-id><pub-id pub-id-type="pmid">31344941</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3390/medicina55080403">https://doi.org/10.3390/medicina55080403</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Dome, P.</string-name>
              <string-name>Rihmer, Z.</string-name>
              <string-name>Gonda, X.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>Suicide Risk in Bipolar Disorder: A Brief Review</article-title>
            <source>Medicina</source>
            <volume>55</volume>
            <elocation-id>403</elocation-id>
            <pub-id pub-id-type="doi">10.3390/medicina55080403</pub-id>
            <pub-id pub-id-type="pmid">31344941</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B35">
        <label>35.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Maxhuni, A., Muñoz-Meléndez, A., Osmani, V., Perez, H., Mayora, O. and Morales, E.F. (2016) Classification of Bipolar Disorder Episodes Based on Analysis of Voice and Motor Activity of Patients. <italic>Pervasive and Mobile Computing</italic>, 31, 50-66. https://doi.org/10.1016/j.pmcj.2016.01.008 <pub-id pub-id-type="doi">10.1016/j.pmcj.2016.01.008</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.pmcj.2016.01.008">https://doi.org/10.1016/j.pmcj.2016.01.008</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Maxhuni, A.</string-name>
              <string-name>Muñoz-Meléndez, A.</string-name>
              <string-name>Osmani, V.</string-name>
              <string-name>Perez, H.</string-name>
              <string-name>Mayora, O.</string-name>
              <string-name>Morales, E.F.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>Classification of Bipolar Disorder Episodes Based on Analysis of Voice and Motor Activity of Patients</article-title>
            <source>Pervasive and Mobile Computing</source>
            <volume>31</volume>
            <pub-id pub-id-type="doi">10.1016/j.pmcj.2016.01.008</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B36">
        <label>36.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Saeb, S., Zhang, M., Karr, C.J., Schueller, S.M., Corden, M.E., Kording, K.P., <italic>et al.</italic> (2015) Mobile Phone Sensor Correlates of Depressive Symptom Severity in Daily-Life Behavior: An Exploratory Study. <italic>Journal of Medical Internet Research</italic>, 17, e175. https://doi.org/10.2196/jmir.4273 <pub-id pub-id-type="doi">10.2196/jmir.4273</pub-id><pub-id pub-id-type="pmid">26180009</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.2196/jmir.4273">https://doi.org/10.2196/jmir.4273</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Saeb, S.</string-name>
              <string-name>Zhang, M.</string-name>
              <string-name>Karr, C.J.</string-name>
              <string-name>Schueller, S.M.</string-name>
              <string-name>Corden, M.E.</string-name>
              <string-name>Kording, K.P.</string-name>
            </person-group>
            <year>2015</year>
            <article-title>Mobile Phone Sensor Correlates of Depressive Symptom Severity in Daily-Life Behavior: An Exploratory Study</article-title>
            <source>Journal of Medical Internet Research</source>
            <volume>17</volume>
            <pub-id pub-id-type="doi">10.2196/jmir.4273</pub-id>
            <pub-id pub-id-type="pmid">26180009</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B37">
        <label>37.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Jensen, O. and Lisman, J.E. (1996) Novel Lists of 7 +/- 2 Known Items Can Be Reliably Stored in an Oscillatory Short-Term Memory Network: Interaction with Long-Term Memory. <italic>Learning &amp; Memory</italic>, 3, 257-263. https://doi.org/10.1101/lm.3.2-3.257 <pub-id pub-id-type="doi">10.1101/lm.3.2-3.257</pub-id><pub-id pub-id-type="pmid">10456095</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1101/lm.3.2-3.257">https://doi.org/10.1101/lm.3.2-3.257</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Jensen, O.</string-name>
              <string-name>Lisman, J.E.</string-name>
            </person-group>
            <year>1996</year>
            <article-title>Novel Lists of 7 +/- 2 Known Items Can Be Reliably Stored in an Oscillatory Short-Term Memory Network: Interaction with Long-Term Memory</article-title>
            <source>Learning &amp; Memory</source>
            <volume>3</volume>
            <pub-id pub-id-type="doi">10.1101/lm.3.2-3.257</pub-id>
            <pub-id pub-id-type="pmid">10456095</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B38">
        <label>38.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Doryab, A., Masown, J., Lim, J., <italic>et al.</italic> (2022) Prediction of Symptom Severity Change among People Diagnosed with Serious Mental Illness Using Passive Mobile Sensing. <italic>IEEE Journal of Biomedical and Health Informatics</italic>, 26, 1803-1812.</mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Doryab, A.</string-name>
              <string-name>Masown, J.</string-name>
              <string-name>Lim, J.</string-name>
            </person-group>
            <year>2022</year>
            <article-title>Prediction of Symptom Severity Change among People Diagnosed with Serious Mental Illness Using Passive Mobile Sensing</article-title>
            <source>IEEE Journal of Biomedical and Health Informatics</source>
            <volume>26</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B39">
        <label>39.</label>
        <citation-alternatives>
          <mixed-citation publication-type="web">Van den Oord, A., Dieleman, S., Zen, H., <italic>et al.</italic> (2016) WaveNet: A Generative Model for Raw Audio. https://arxiv.org/abs/1609.03499</mixed-citation>
          <element-citation publication-type="web">
            <person-group person-group-type="author">
              <string-name>Van den Oord, A.</string-name>
              <string-name>Dieleman, S.</string-name>
              <string-name>Zen, H.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>WaveNet: A Generative Model for Raw Audio</article-title>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B40">
        <label>40.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Vaswani, A., Shazeer, N., Parmar, N., <italic>et al.</italic> (2017) Attention Is All You Need. <italic>Proceedings of the 31st International Conference on Neural Information Processing Systems</italic>, Long Beach, 4-9 December 2017, 6000-6010.</mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Vaswani, A.</string-name>
              <string-name>Shazeer, N.</string-name>
              <string-name>Parmar, N.</string-name>
            </person-group>
            <year>2017</year>
            <article-title>Attention Is All You Need</article-title>
            <source>Proceedings of the 31st International Conference on Neural Information Processing Systems</source>
            <fpage>6000</fpage>
            <lpage>6010</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B41">
        <label>41.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Devlin, J., Chang, M.W., Lee, K. and Toutanova, K. (2019) BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. <italic>NAACL-HLT 2019</italic>, Minneapolis, 2-7 June 2019, 4171-4186.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Devlin, J.</string-name>
              <string-name>Chang, M.W.</string-name>
              <string-name>Lee, K.</string-name>
              <string-name>Toutanova, K.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding</article-title>
            <source>NAACL-HLT 2019</source>
            <fpage>4171</fpage>
            <lpage>4186</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B42">
        <label>42.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Alsentzer, E., Murphy, J., Boag, W., Weng, W., Jindi, D., Naumann, T., <italic>et al.</italic> (2019) Publicly Available Clinical BERT Embeddings. <italic>Proceedings of the 2nd Clinical Natural Language Processing Workshop</italic>, Minneapolis, June 2019, 72-78. https://doi.org/10.18653/v1/w19-1909 <pub-id pub-id-type="doi">10.18653/v1/w19-1909</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.18653/v1/w19-1909">https://doi.org/10.18653/v1/w19-1909</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Alsentzer, E.</string-name>
              <string-name>Murphy, J.</string-name>
              <string-name>Boag, W.</string-name>
              <string-name>Weng, W.</string-name>
              <string-name>Jindi, D.</string-name>
              <string-name>Naumann, T.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>Publicly Available Clinical BERT Embeddings</article-title>
            <source>Proceedings of the 2nd Clinical Natural Language Processing Workshop</source>
            <fpage>72</fpage>
            <lpage>78</lpage>
            <pub-id pub-id-type="doi">10.18653/v1/w19-1909</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B43">
        <label>43.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Tai, W., Kung, H.T., Dong, X., Comiter, M. and Kuo, C. (2020) exBERT: Extending Pre-Trained Models with Domain-Specific Vocabulary under Constrained Training Resources. In: <italic>Findings of the Association for Computational Linguistics: EMNLP 2020</italic>, 1433-1439. https://doi.org/10.18653/v1/2020.findings-emnlp.129 <pub-id pub-id-type="doi">10.18653/v1/2020.findings-emnlp.129</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.18653/v1/2020.findings-emnlp.129">https://doi.org/10.18653/v1/2020.findings-emnlp.129</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Tai, W.</string-name>
              <string-name>Kung, H.T.</string-name>
              <string-name>Dong, X.</string-name>
              <string-name>Comiter, M.</string-name>
              <string-name>Kuo, C.</string-name>
            </person-group>
            <year>2020</year>
            <article-title>exBERT: Extending Pre-Trained Models with Domain-Specific Vocabulary under Constrained Training Resources</article-title>
            <source>Findings of the Association for Computational Linguistics: EMNLP 2020</source>
            <fpage>1433</fpage>
            <lpage>1439</lpage>
            <pub-id pub-id-type="doi">10.18653/v1/2020.findings-emnlp.129</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B44">
        <label>44.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Valstar, M., Schuller, B., Smith, K., Eyben, F., Jiang, B., Bilakhia, S., <italic>et al.</italic> (2013) AVEC 2013. <italic>Proceedings of the 3rd ACM International Workshop on Audio/Visual Emotion Challenge</italic>, Barcelona, 21 October 2013, 3-10. https://doi.org/10.1145/2512530.2512533 <pub-id pub-id-type="doi">10.1145/2512530.2512533</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/2512530.2512533">https://doi.org/10.1145/2512530.2512533</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Valstar, M.</string-name>
              <string-name>Schuller, B.</string-name>
              <string-name>Smith, K.</string-name>
              <string-name>Eyben, F.</string-name>
              <string-name>Jiang, B.</string-name>
              <string-name>Bilakhia, S.</string-name>
            </person-group>
            <year>2013</year>
            <article-title>AVEC 2013</article-title>
            <source>Proceedings of the 3rd ACM International Workshop on Audio/Visual Emotion Challenge</source>
            <fpage>3</fpage>
            <lpage>10</lpage>
            <pub-id pub-id-type="doi">10.1145/2512530.2512533</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B45">
        <label>45.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Zhou, J., Cui, G., Hu, S., Zhang, Z., Yang, C., Liu, Z., <italic>et al.</italic> (2020) Graph Neural Networks: A Review of Methods and Applications. <italic>AI Open</italic>, 1, 57-81. https://doi.org/10.1016/j.aiopen.2021.01.001 <pub-id pub-id-type="doi">10.1016/j.aiopen.2021.01.001</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.aiopen.2021.01.001">https://doi.org/10.1016/j.aiopen.2021.01.001</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Zhou, J.</string-name>
              <string-name>Cui, G.</string-name>
              <string-name>Hu, S.</string-name>
              <string-name>Zhang, Z.</string-name>
              <string-name>Yang, C.</string-name>
              <string-name>Liu, Z.</string-name>
            </person-group>
            <year>2020</year>
            <article-title>Graph Neural Networks: A Review of Methods and Applications</article-title>
            <source>AI Open</source>
            <volume>1</volume>
            <pub-id pub-id-type="doi">10.1016/j.aiopen.2021.01.001</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B46">
        <label>46.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">McEwen, B.S. (2004) Protection and Damage from Acute and Chronic Stress: Allostasis and Allostatic Overload and Relevance to the Pathophysiology of Psychiatric Disorders. <italic>Annals of the New York Academy of Sciences</italic>, 1032, 1-7. https://doi.org/10.1196/annals.1314.001 <pub-id pub-id-type="doi">10.1196/annals.1314.001</pub-id><pub-id pub-id-type="pmid">15677391</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1196/annals.1314.001">https://doi.org/10.1196/annals.1314.001</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>McEwen, B.S.</string-name>
            </person-group>
            <year>2004</year>
            <article-title>Protection and Damage from Acute and Chronic Stress: Allostasis and Allostatic Overload and Relevance to the Pathophysiology of Psychiatric Disorders</article-title>
            <source>Annals of the New York Academy of Sciences</source>
            <volume>1032</volume>
            <pub-id pub-id-type="doi">10.1196/annals.1314.001</pub-id>
            <pub-id pub-id-type="pmid">15677391</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B47">
        <label>47.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Paik, H., Yang, S., Kim, T., <italic>et al.</italic> (2019) Cumulative Evidence for Epistatic Interactions and Rapid Cycle Disorders. <italic>NPJ Genomic Medicine</italic>, 4, Article No. 4.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Paik, H.</string-name>
              <string-name>Yang, S.</string-name>
              <string-name>Kim, T.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>Cumulative Evidence for Epistatic Interactions and Rapid Cycle Disorders</article-title>
            <source>NPJ Genomic Medicine</source>
            <volume>4</volume>
            <elocation-id>4</elocation-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B48">
        <label>48.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Zitnik, M., Agrawal, M. and Leskovec, J. (2018) Modeling Polypharmacy Side Effects with Graph Convolutional Networks. <italic>Bioinformatics</italic>, 34, i457-i466.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Zitnik, M.</string-name>
              <string-name>Agrawal, M.</string-name>
              <string-name>Leskovec, J.</string-name>
            </person-group>
            <year>2018</year>
            <article-title>Modeling Polypharmacy Side Effects with Graph Convolutional Networks</article-title>
            <source>Bioinformatics</source>
            <volume>34</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B49">
        <label>49.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Guo, C., Pleiss, G., Sun, Y. and Weinberger, K.Q. (2017) On Calibration of Modern Neural Networks. <italic>International Conference on Machine Learning (ICML)</italic>, Sydney, 6-11 August 2017, 1321-1330.</mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Guo, C.</string-name>
              <string-name>Pleiss, G.</string-name>
              <string-name>Sun, Y.</string-name>
              <string-name>Weinberger, K.Q.</string-name>
            </person-group>
            <year>2017</year>
            <article-title>On Calibration of Modern Neural Networks</article-title>
            <source>International Conference on Machine Learning (ICML)</source>
            <fpage>1321</fpage>
            <lpage>1330</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B50">
        <label>50.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Gal, Y. and Ghahramani, Z. (2016) Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. <italic>International Conference on Machine Learning (ICML)</italic>, New York, 19-24 June 2016, 1050-1059.</mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Gal, Y.</string-name>
              <string-name>Ghahramani, Z.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning</article-title>
            <source>International Conference on Machine Learning (ICML)</source>
            <fpage>1050</fpage>
            <lpage>1059</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B51">
        <label>51.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Blundell, C., Cornebise, J., Kavukcuoglu, K. and Wierstra, D. (2015) Weight Uncertainty in Neural Networks. <italic>Proceedings of the 32nd International Conference on Machine Learning</italic>, Volume 37, 1613-1622.</mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Blundell, C.</string-name>
              <string-name>Cornebise, J.</string-name>
              <string-name>Kavukcuoglu, K.</string-name>
              <string-name>Wierstra, D.</string-name>
            </person-group>
            <year>2015</year>
            <article-title>Weight Uncertainty in Neural Networks</article-title>
            <source>Proceedings of the 32nd International Conference on Machine Learning</source>
            <volume>37</volume>
            <fpage>1613</fpage>
            <lpage>1622</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B52">
        <label>52.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Hoofnagle, C.J., van der Sloot, B. and Borgesius, F.Z. (2019) The European Union General Data Protection Regulation: What It Is and What It Means. <italic>Information &amp; Communications Technology Law</italic>, 28, 65-98. https://doi.org/10.1080/13600834.2019.1573501 <pub-id pub-id-type="doi">10.1080/13600834.2019.1573501</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1080/13600834.2019.1573501">https://doi.org/10.1080/13600834.2019.1573501</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Hoofnagle, C.J.</string-name>
              <string-name>van der Sloot, B.</string-name>
              <string-name>Borgesius, F.Z.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>The European Union General Data Protection Regulation: What It Is and What It Means</article-title>
            <source>Information &amp; Communications Technology Law</source>
            <volume>28</volume>
            <pub-id pub-id-type="doi">10.1080/13600834.2019.1573501</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B53">
        <label>53.</label>
        <citation-alternatives>
          <mixed-citation publication-type="book">American Psychiatric Association (2013) DSM-5: Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition.</mixed-citation>
          <element-citation publication-type="book">
            <person-group person-group-type="author">
              <collab>American Psychiatric Association</collab>
            </person-group>
            <year>2013</year>
            <source>DSM-5: Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition</source>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B54">
        <label>54.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Shiffman, S., Stone, A.A. and Hufford, M.R. (2008) Ecological Momentary Assessment. <italic>Annual Review of Clinical Psychology</italic>, 4, 1-32. https://doi.org/10.1146/annurev.clinpsy.3.022806.091415 <pub-id pub-id-type="doi">10.1146/annurev.clinpsy.3.022806.091415</pub-id><pub-id pub-id-type="pmid">18509902</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1146/annurev.clinpsy.3.022806.091415">https://doi.org/10.1146/annurev.clinpsy.3.022806.091415</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Shiffman, S.</string-name>
              <string-name>Stone, A.A.</string-name>
              <string-name>Hufford, M.R.</string-name>
            </person-group>
            <year>2008</year>
            <article-title>Ecological Momentary Assessment</article-title>
            <source>Annual Review of Clinical Psychology</source>
            <volume>4</volume>
            <pub-id pub-id-type="doi">10.1146/annurev.clinpsy.3.022806.091415</pub-id>
            <pub-id pub-id-type="pmid">18509902</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B55">
        <label>55.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Garza, M., Del Fiol, G., Tenenbaum, J., Walden, A. and Zozus, M.N. (2016) Evaluating Common Data Models for Use with a Longitudinal Community Registry. <italic>Journal of Biomedical Informatics</italic>, 64, 333-341. https://doi.org/10.1016/j.jbi.2016.10.016 <pub-id pub-id-type="doi">10.1016/j.jbi.2016.10.016</pub-id><pub-id pub-id-type="pmid">27989817</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.jbi.2016.10.016">https://doi.org/10.1016/j.jbi.2016.10.016</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Garza, M.</string-name>
              <string-name>Del Fiol, G.</string-name>
              <string-name>Tenenbaum, J.</string-name>
              <string-name>Walden, A.</string-name>
              <string-name>Zozus, M.N.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>Evaluating Common Data Models for Use with a Longitudinal Community Registry</article-title>
            <source>Journal of Biomedical Informatics</source>
            <volume>64</volume>
            <pub-id pub-id-type="doi">10.1016/j.jbi.2016.10.016</pub-id>
            <pub-id pub-id-type="pmid">27989817</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B56">
        <label>56.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Landis, J.R. and Koch, G.G. (1977) The Measurement of Observer Agreement for Categorical Data. <italic>Biometrics</italic>, 33, 159-174. https://doi.org/10.2307/2529310 <pub-id pub-id-type="doi">10.2307/2529310</pub-id><pub-id pub-id-type="pmid">843571</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.2307/2529310">https://doi.org/10.2307/2529310</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Landis, J.R.</string-name>
              <string-name>Koch, G.G.</string-name>
            </person-group>
            <year>1977</year>
            <article-title>The Measurement of Observer Agreement for Categorical Data</article-title>
            <source>Biometrics</source>
            <volume>33</volume>
            <pub-id pub-id-type="doi">10.2307/2529310</pub-id>
            <pub-id pub-id-type="pmid">843571</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B57">
        <label>57.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Goldberger, A.L., Amaral, L.A.N., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., <italic>et al.</italic> (2000) PhysioBank, PhysioToolkit, and PhysioNet. <italic>Circulation</italic>, 101, e215-e220. https://doi.org/10.1161/01.cir.101.23.e215 <pub-id pub-id-type="doi">10.1161/01.cir.101.23.e215</pub-id><pub-id pub-id-type="pmid">10851218</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1161/01.cir.101.23.e215">https://doi.org/10.1161/01.cir.101.23.e215</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Goldberger, A.L.</string-name>
              <string-name>Amaral, L.A.N.</string-name>
              <string-name>Glass, L.</string-name>
              <string-name>Hausdorff, J.M.</string-name>
              <string-name>Ivanov, P.C.</string-name>
              <string-name>Mark, R.G.</string-name>
              <etal/>
            </person-group>
            <year>2000</year>
            <article-title>PhysioBank, PhysioToolkit, and PhysioNet</article-title>
            <source>Circulation</source>
            <volume>101</volume>
            <pub-id pub-id-type="doi">10.1161/01.cir.101.23.e215</pub-id>
            <pub-id pub-id-type="pmid">10851218</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B58">
        <label>58.</label>
        <citation-alternatives>
          <mixed-citation publication-type="book">Rasmussen, C.E. and Williams, C.K.I. (2006) Gaussian Processes for Machine Learning. The MIT Press. https://doi.org/10.7551/mitpress/3206.001.0001 <pub-id pub-id-type="doi">10.7551/mitpress/3206.001.0001</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.7551/mitpress/3206.001.0001">https://doi.org/10.7551/mitpress/3206.001.0001</ext-link></mixed-citation>
          <element-citation publication-type="book">
            <person-group person-group-type="author">
              <string-name>Rasmussen, C.E.</string-name>
              <string-name>Williams, C.K.I.</string-name>
            </person-group>
            <year>2006</year>
            <article-title>Gaussian Processes for Machine Learning</article-title>
            <pub-id pub-id-type="doi">10.7551/mitpress/3206.001.0001</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B59">
        <label>59.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Johnson, A.E.W., Pollard, T.J., Shen, L., Lehman, L.H., Feng, M., Ghassemi, M., <italic>et al.</italic> (2016) MIMIC-III, a Freely Accessible Critical Care Database. <italic>Scientific Data</italic>, 3, Article ID: 160035. https://doi.org/10.1038/sdata.2016.35 <pub-id pub-id-type="doi">10.1038/sdata.2016.35</pub-id><pub-id pub-id-type="pmid">27219127</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1038/sdata.2016.35">https://doi.org/10.1038/sdata.2016.35</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Johnson, A.E.W.</string-name>
              <string-name>Pollard, T.J.</string-name>
              <string-name>Shen, L.</string-name>
              <string-name>Lehman, L.H.</string-name>
              <string-name>Feng, M.</string-name>
              <string-name>Ghassemi, M.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>MIMIC-III, a Freely Accessible Critical Care Database</article-title>
            <source>Scientific Data</source>
            <volume>3</volume>
            <elocation-id>160035</elocation-id>
            <pub-id pub-id-type="doi">10.1038/sdata.2016.35</pub-id>
            <pub-id pub-id-type="pmid">27219127</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B60">
        <label>60.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Cueva, C.J., Saez, A., Marcos, E., Genovesio, A., Jazayeri, M., Romo, R., <italic>et al.</italic> (2020) Low-Dimensional Dynamics for Working Memory and Time Encoding. <italic>Proceedings of the National Academy of Sciences</italic>, 117, 23021-23032. https://doi.org/10.1073/pnas.1915984117 <pub-id pub-id-type="doi">10.1073/pnas.1915984117</pub-id><pub-id pub-id-type="pmid">32859756</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1073/pnas.1915984117">https://doi.org/10.1073/pnas.1915984117</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Cueva, C.J.</string-name>
              <string-name>Saez, A.</string-name>
              <string-name>Marcos, E.</string-name>
              <string-name>Genovesio, A.</string-name>
              <string-name>Jazayeri, M.</string-name>
              <string-name>Romo, R.</string-name>
            </person-group>
            <year>2020</year>
            <article-title>Low-Dimensional Dynamics for Working Memory and Time Encoding</article-title>
            <source>Proceedings of the National Academy of Sciences</source>
            <volume>117</volume>
            <pub-id pub-id-type="doi">10.1073/pnas.1915984117</pub-id>
            <pub-id pub-id-type="pmid">32859756</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B61">
        <label>61.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Gururangan, S., Marasović, A., Swayamdipta, S., Lo, K., Beltagy, I., Downey, D., <italic>et al.</italic> (2020) Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks. <italic>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</italic>, July 2020, 8342-8360. https://doi.org/10.18653/v1/2020.acl-main.740 <pub-id pub-id-type="doi">10.18653/v1/2020.acl-main.740</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.18653/v1/2020.acl-main.740">https://doi.org/10.18653/v1/2020.acl-main.740</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Gururangan, S.</string-name>
              <string-name>Marasović, A.</string-name>
              <string-name>Swayamdipta, S.</string-name>
              <string-name>Lo, K.</string-name>
              <string-name>Beltagy, I.</string-name>
              <string-name>Downey, D.</string-name>
              <etal/>
            </person-group>
            <year>2020</year>
            <article-title>Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks</article-title>
            <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics</source>
            <fpage>8342</fpage>
            <lpage>8360</lpage>
            <pub-id pub-id-type="doi">10.18653/v1/2020.acl-main.740</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B62">
        <label>62.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Sun, C., Qiu, X., Xu, Y. and Huang, X. (2019) How to Fine-Tune BERT for Text Classification? In: Sun, M.S., <italic>et al.</italic>, Eds., <italic>Chinese Computational Linguistics</italic>, Springer International Publishing, 194-206. https://doi.org/10.1007/978-3-030-32381-3_16 <pub-id pub-id-type="doi">10.1007/978-3-030-32381-3_16</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/978-3-030-32381-3_16">https://doi.org/10.1007/978-3-030-32381-3_16</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Sun, C.</string-name>
              <string-name>Qiu, X.</string-name>
              <string-name>Xu, Y.</string-name>
              <string-name>Huang, X.</string-name>
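            </person-group>
            <person-group person-group-type="editor">
              <string-name>Sun, M.S.</string-name>
              <etal/>
            </person-group>
            <year>2019</year>
            <article-title>How to Fine-Tune BERT for Text Classification?</article-title>
            <source>Chinese Computational Linguistics</source>
            <publisher-name>Springer International Publishing</publisher-name>
            <fpage>194</fpage>
            <lpage>206</lpage>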
            <pub-id pub-id-type="doi">10.1007/978-3-030-32381-3_16</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B63">
        <label>63.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Yu, T., Kumar, S., Gupta, A., <italic>et al.</italic> (2020) Gradient Surgery for Multi-Task Learning. <italic>Proceedings of the 34th International Conference on Neural Information Processing Systems</italic>, Vancouver, 6-12 December 2020, 5824-5836.</mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Yu, T.</string-name>
              <string-name>Kumar, S.</string-name>
              <string-name>Gupta, A.</string-name>
              <etal/>
            </person-group>
            <year>2020</year>
            <article-title>Gradient Surgery for Multi-Task Learning</article-title>
            <source>Proceedings of the 34th International Conference on Neural Information Processing Systems</source>
            <fpage>5824</fpage>
            <lpage>5836</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B64">
        <label>64.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Yang, Z., Yang, D., Dyer, C., He, X., Smola, A. and Hovy, E. (2016) Hierarchical Attention Networks for Document Classification. <italic>Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</italic>, San Diego, June 2016, 1480-1489. https://doi.org/10.18653/v1/n16-1174 <pub-id pub-id-type="doi">10.18653/v1/n16-1174</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.18653/v1/n16-1174">https://doi.org/10.18653/v1/n16-1174</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Yang, Z.</string-name>
              <string-name>Yang, D.</string-name>
              <string-name>Dyer, C.</string-name>
              <string-name>He, X.</string-name>
              <string-name>Smola, A.</string-name>
              <string-name>Hovy, E.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>Hierarchical Attention Networks for Document Classification</article-title>
            <source>Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
            <fpage>1480</fpage>
            <lpage>1489</lpage>
            <pub-id pub-id-type="doi">10.18653/v1/n16-1174</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B65">
        <label>65.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Srivastava, N., Hinton, G., Krizhevsky, A., <italic>et al.</italic> (2014) Dropout: A Simple Way to Prevent Neural Networks from Overfitting. <italic>Journal of Machine Learning Research</italic>, 15, 1929-1958.</mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Srivastava, N.</string-name>
              <string-name>Hinton, G.</string-name>
              <string-name>Krizhevsky, A.</string-name>
            </person-group>
            <year>2014</year>
            <article-title>Dropout: A Simple Way to Prevent Neural Networks from Overfitting</article-title>
            <source>Journal of Machine Learning Research</source>
            <volume>15</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B66">
        <label>66.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Akiba, T., Sano, S., Yanase, T., Ohta, T. and Koyama, M. (2019) Optuna: A Next-Generation Hyperparameter Optimization Framework. <italic>Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</italic>, Anchorage, 4-8 August 2019, 2623-2631. https://doi.org/10.1145/3292500.3330701 <pub-id pub-id-type="doi">10.1145/3292500.3330701</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/3292500.3330701">https://doi.org/10.1145/3292500.3330701</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Akiba, T.</string-name>
              <string-name>Sano, S.</string-name>
              <string-name>Yanase, T.</string-name>
              <string-name>Ohta, T.</string-name>
              <string-name>Koyama, M.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>Optuna: A Next-Generation Hyperparameter Optimization Framework</article-title>
            <source>Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining</source>
            <fpage>2623</fpage>
            <lpage>2631</lpage>
            <pub-id pub-id-type="doi">10.1145/3292500.3330701</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B67">
        <label>67.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Zhou, P., Feng, J.S., Ma, C., Xiong, C.M. and Hoi, S.C.H. (2020) Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning. <italic>Advances in Neural Information Processing Systems</italic>, 33, 21285-21296.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Zhou, P.</string-name>
              <string-name>Feng, J.S.</string-name>
              <string-name>Ma, C.</string-name>
              <string-name>Xiong, C.M.</string-name>
              <string-name>Hoi, S.C.H.</string-name>
            </person-group>
            <year>2020</year>
            <article-title>Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning</article-title>
            <source>Advances in Neural Information Processing Systems</source>
            <volume>33</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B68">
        <label>68.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Truong, T.T. and Nguyen, H. (2020) Backtracking Gradient Descent Method and Some Applications in Large Scale Optimisation. Part 2: Algorithms and Experiments. <italic>Applied Mathematics &amp; Optimization</italic>, 84, 2557-2586. https://doi.org/10.1007/s00245-020-09718-8 <pub-id pub-id-type="doi">10.1007/s00245-020-09718-8</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s00245-020-09718-8">https://doi.org/10.1007/s00245-020-09718-8</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Truong, T.T.</string-name>
              <string-name>Nguyen, H.</string-name>
            </person-group>
            <year>2020</year>
            <article-title>Backtracking Gradient Descent Method and Some Applications in Large Scale Optimisation. Part 2: Algorithms and Experiments</article-title>
            <source>Applied Mathematics &amp; Optimization</source>
            <volume>84</volume>
            <pub-id pub-id-type="doi">10.1007/s00245-020-09718-8</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B69">
        <label>69.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Carmichael, Z., Langroudi, H.F., Khazanov, C., Lillie, J., Gustafson, J.L. and Kudithipudi, D. (2019) Performance-Efficiency Trade-Off of Low-Precision Numerical Formats in Deep Neural Networks. <italic>Proceedings of the Conference for Next Generation Arithmetic 2019</italic>, Singapore, 13-14 March 2019, 1-9. https://doi.org/10.1145/3316279.3316282 <pub-id pub-id-type="doi">10.1145/3316279.3316282</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/3316279.3316282">https://doi.org/10.1145/3316279.3316282</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Carmichael, Z.</string-name>
              <string-name>Langroudi, H.F.</string-name>
              <string-name>Khazanov, C.</string-name>
              <string-name>Lillie, J.</string-name>
              <string-name>Gustafson, J.L.</string-name>
              <string-name>Kudithipudi, D.</string-name>
            </person-group>
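            <year>2019</year>
            <article-title>Performance-Efficiency Trade-Off of Low-Precision Numerical Formats in Deep Neural Networks</article-title>
            <source>Proceedings of the Conference for Next Generation Arithmetic 2019</source>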
            <pub-id pub-id-type="doi">10.1145/3316279.3316282</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B70">
        <label>70.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Hanley, J.A. and McNeil, B.J. (1982) The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve. <italic>Radiology</italic>, 143, 29-36. https://doi.org/10.1148/radiology.143.1.7063747 <pub-id pub-id-type="doi">10.1148/radiology.143.1.7063747</pub-id><pub-id pub-id-type="pmid">7063747</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1148/radiology.143.1.7063747">https://doi.org/10.1148/radiology.143.1.7063747</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Hanley, J.A.</string-name>
              <string-name>McNeil, B.J.</string-name>
            </person-group>
            <year>1982</year>
            <article-title>The Meaning and Use of the Area under a Receiver Operating Characteristic (ROC) Curve</article-title>
            <source>Radiology</source>
            <volume>143</volume>
            <pub-id pub-id-type="doi">10.1148/radiology.143.1.7063747</pub-id>
            <pub-id pub-id-type="pmid">7063747</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B71">
        <label>71.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Obermeyer, Z., Powers, B., Vogeli, C. and Mullainathan, S. (2019) Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations. <italic>Science</italic>, 366, 447-453. https://doi.org/10.1126/science.aax2342 <pub-id pub-id-type="doi">10.1126/science.aax2342</pub-id><pub-id pub-id-type="pmid">31649194</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1126/science.aax2342">https://doi.org/10.1126/science.aax2342</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Obermeyer, Z.</string-name>
              <string-name>Powers, B.</string-name>
              <string-name>Vogeli, C.</string-name>
              <string-name>Mullainathan, S.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>Dissecting Racial Bias in an Algorithm Used to Manage the Health of Populations</article-title>
            <source>Science</source>
            <volume>366</volume>
            <pub-id pub-id-type="doi">10.1126/science.aax2342</pub-id>
            <pub-id pub-id-type="pmid">31649194</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B72">
        <label>72.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Chen, I.Y., Joshi, S., Ghassemi, M. and Ranganath, R. (2020) Treating Health Disparities with AI. <italic>Nature Medicine</italic>, 26, 462-464.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Chen, I.Y.</string-name>
              <string-name>Joshi, S.</string-name>
              <string-name>Ghassemi, M.</string-name>
              <string-name>Ranganath, R.</string-name>
            </person-group>
            <year>2020</year>
            <article-title>Treating Health Disparities with AI</article-title>
            <source>Nature Medicine</source>
            <volume>26</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B73">
        <label>73.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Torous, J., Kiang, M.V., Lorme, J. and Onnela, J. (2016) New Tools for New Research in Psychiatry: A Scalable and Customizable Platform to Empower Data Driven Smartphone Research. <italic>JMIR Mental Health</italic>, 3, e16. https://doi.org/10.2196/mental.5165 <pub-id pub-id-type="doi">10.2196/mental.5165</pub-id><pub-id pub-id-type="pmid">27150677</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.2196/mental.5165">https://doi.org/10.2196/mental.5165</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Torous, J.</string-name>
              <string-name>Kiang, M.V.</string-name>
              <string-name>Lorme, J.</string-name>
              <string-name>Onnela, J.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>New Tools for New Research in Psychiatry: A Scalable and Customizable Platform to Empower Data Driven Smartphone Research</article-title>
            <source>JMIR Mental Health</source>
            <volume>3</volume>
            <pub-id pub-id-type="doi">10.2196/mental.5165</pub-id>
            <pub-id pub-id-type="pmid">27150677</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B74">
        <label>74.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Chung, T., Shortreed, S.M., Simon, G. and Ludman, E. (2019) Algorithmic Screening for Suicidal Ideation among Patients Receiving Mental Health Care. <italic>JAMA Network Open</italic>, 2, e1914273.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Chung, T.</string-name>
              <string-name>Shortreed, S.M.</string-name>
              <string-name>Simon, G.</string-name>
              <string-name>Ludman, E.</string-name>
            </person-group>
            <year>2019</year>
            <article-title>Algorithmic Screening for Suicidal Ideation among Patients Receiving Mental Health Care</article-title>
            <source>JAMA Network Open</source>
            <volume>2</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B75">
        <label>75.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Eldridge, S.M., Chan, C.L., Campbell, M.J., Bond, C.M., Hopewell, S., Thabane, L., <italic>et al.</italic> (2016) CONSORT 2010 Statement: Extension to Randomised Pilot and Feasibility Trials. <italic>BMJ</italic>, 355, i5239. https://doi.org/10.1136/bmj.i5239 <pub-id pub-id-type="doi">10.1136/bmj.i5239</pub-id><pub-id pub-id-type="pmid">27777223</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1136/bmj.i5239">https://doi.org/10.1136/bmj.i5239</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Eldridge, S.M.</string-name>
              <string-name>Chan, C.L.</string-name>
              <string-name>Campbell, M.J.</string-name>
              <string-name>Bond, C.M.</string-name>
              <string-name>Hopewell, S.</string-name>
              <string-name>Thabane, L.</string-name>
            </person-group>
            <year>2016</year>
            <article-title>CONSORT 2010 Statement: Extension to Randomised Pilot and Feasibility Trials</article-title>
            <source>BMJ</source>
            <volume>355</volume>
            <pub-id pub-id-type="doi">10.1136/bmj.i5239</pub-id>
            <pub-id pub-id-type="pmid">27777223</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B76">
        <label>76.</label>
        <citation-alternatives>
          <mixed-citation publication-type="book">Grisso, T. and Appelbaum, P.S. (1998) Assessing Competence to Consent to Treatment. Oxford University Press.</mixed-citation>
          <element-citation publication-type="book">
            <person-group person-group-type="author">
              <string-name>Grisso, T.</string-name>
              <string-name>Appelbaum, P.S.</string-name>
            </person-group>
            <year>1998</year>
            <article-title>Assessing Competence to Consent to Treatment</article-title>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B77">
        <label>77.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Kleine-Budde, K., Müller, R., Kawohl, W., <italic>et al.</italic> (2013) The Cost of Schizophrenia a Systematic Review. <italic>European Psychiatry</italic>, 28, 1-4.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Kleine-Budde, K.</string-name>
              <string-name>Müller, R.</string-name>
              <string-name>Kawohl, W.</string-name>
              <etal/>
            </person-group>
            <year>2013</year>
            <article-title>The Cost of Schizophrenia a Systematic Review</article-title>
            <source>European Psychiatry</source>
            <volume>28</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B78">
        <label>78.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Corrigan, P.W., Druss, B.G. and Perlick, D.A. (2014) The Impact of Mental Illness Stigma on Seeking and Participating in Mental Health Care. <italic>Psychological Science in the Public Interest</italic>, 15, 37-70. https://doi.org/10.1177/1529100614531398 <pub-id pub-id-type="doi">10.1177/1529100614531398</pub-id><pub-id pub-id-type="pmid">26171956</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1177/1529100614531398">https://doi.org/10.1177/1529100614531398</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Corrigan, P.W.</string-name>
              <string-name>Druss, B.G.</string-name>
              <string-name>Perlick, D.A.</string-name>
            </person-group>
            <year>2014</year>
            <article-title>The Impact of Mental Illness Stigma on Seeking and Participating in Mental Health Care</article-title>
            <source>Psychological Science in the Public Interest</source>
            <volume>15</volume>
            <pub-id pub-id-type="doi">10.1177/1529100614531398</pub-id>
            <pub-id pub-id-type="pmid">26171956</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B79">
        <label>79.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Higgins, O., Short, B.L., Chalup, S.K. and Wilson, R.L. (2023) Artificial Intelligence (AI) and Machine Learning (ML) Based Decision Support Systems in Mental Health: An Integrative Review. <italic>International Journal of Mental Health Nursing</italic>, 32, 966-978. https://doi.org/10.1111/inm.13114 <pub-id pub-id-type="doi">10.1111/inm.13114</pub-id><pub-id pub-id-type="pmid">36744684</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1111/inm.13114">https://doi.org/10.1111/inm.13114</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Higgins, O.</string-name>
              <string-name>Short, B.L.</string-name>
              <string-name>Chalup, S.K.</string-name>
              <string-name>Wilson, R.L.</string-name>
            </person-group>
            <year>2023</year>
            <article-title>Artificial Intelligence (AI) and Machine Learning (ML) Based Decision Support Systems in Mental Health: An Integrative Review</article-title>
            <source>International Journal of Mental Health Nursing</source>
            <volume>32</volume>
            <pub-id pub-id-type="doi">10.1111/inm.13114</pub-id>
            <pub-id pub-id-type="pmid">36744684</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B80">
        <label>80.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Martinez-Martin, N., Insel, T.R., Dagum, P., <italic>et al.</italic> (2018) Data Sharing for Mental Health Research. <italic>Neuropsychopharmacology</italic>, 43, 1660-1668.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Martinez-Martin, N.</string-name>
              <string-name>Insel, T.R.</string-name>
              <string-name>Dagum, P.</string-name>
            </person-group>
            <year>2018</year>
            <article-title>Data Sharing for Mental Health Research</article-title>
            <source>Neuropsychopharmacology</source>
            <volume>43</volume>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B81">
        <label>81.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Wang, Y.C., Chen, Y., Lan, S.L., Zhu, L.H. and Zhang, Y. (2024) End-Edge-Cloud Collaborative Computing for Deep Learning: A Comprehensive Survey. <italic>IEEE Communications Surveys &amp; Tutorials</italic>, 26, 2647-2683. https://doi.org/10.1109/COMST.2024.3393230 <pub-id pub-id-type="doi">10.1109/COMST.2024.3393230</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/COMST.2024.3393230">https://doi.org/10.1109/COMST.2024.3393230</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Wang, Y.C.</string-name>
              <string-name>Chen, Y.</string-name>
              <string-name>Lan, S.L.</string-name>
              <string-name>Zhu, L.H.</string-name>
              <string-name>Zhang, Y.</string-name>
            </person-group>
            <year>2024</year>
            <article-title>End-Edge-Cloud Collaborative Computing for Deep Learning: A Comprehensive Survey</article-title>
            <source>IEEE Communications Surveys &amp; Tutorials</source>
            <volume>26</volume>
            <pub-id pub-id-type="doi">10.1109/COMST.2024.3393230</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B82">
        <label>82.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Smith, S.M., Vidaurre, D., Beckmann, C.F., Glasser, M.F., Jenkinson, M., Miller, K.L., <italic>et al.</italic> (2013) Functional Connectomics from Resting-State fMRI. <italic>Trends in Cognitive Sciences</italic>, 17, 666-682. https://doi.org/10.1016/j.tics.2013.09.016 <pub-id pub-id-type="doi">10.1016/j.tics.2013.09.016</pub-id><pub-id pub-id-type="pmid">24238796</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.tics.2013.09.016">https://doi.org/10.1016/j.tics.2013.09.016</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Smith, S.M.</string-name>
              <string-name>Vidaurre, D.</string-name>
              <string-name>Beckmann, C.F.</string-name>
              <string-name>Glasser, M.F.</string-name>
              <string-name>Jenkinson, M.</string-name>
              <string-name>Miller, K.L.</string-name>
            </person-group>
            <year>2013</year>
            <article-title>Functional Connectomics from Resting-State fMRI</article-title>
            <source>Trends in Cognitive Sciences</source>
            <volume>17</volume>
            <pub-id pub-id-type="doi">10.1016/j.tics.2013.09.016</pub-id>
            <pub-id pub-id-type="pmid">24238796</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B83">
        <label>83.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Jiang, J., Sachdev, P., Lipnicki, D.M., Zhang, H., Liu, T., Zhu, W., <italic>et al.</italic> (2014) A Longitudinal Study of Brain Atrophy over Two Years in Community-Dwelling Older Individuals. <italic>NeuroImage</italic>, 86, 203-211. https://doi.org/10.1016/j.neuroimage.2013.08.022 <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.08.022</pub-id><pub-id pub-id-type="pmid">23959201</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.neuroimage.2013.08.022">https://doi.org/10.1016/j.neuroimage.2013.08.022</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Jiang, J.</string-name>
              <string-name>Sachdev, P.</string-name>
              <string-name>Lipnicki, D.M.</string-name>
              <string-name>Zhang, H.</string-name>
              <string-name>Liu, T.</string-name>
              <string-name>Zhu, W.</string-name>
            </person-group>
            <year>2014</year>
            <article-title>A Longitudinal Study of Brain Atrophy over Two Years in Community-Dwelling Older Individuals</article-title>
            <source>NeuroImage</source>
            <volume>86</volume>
            <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.08.022</pub-id>
            <pub-id pub-id-type="pmid">23959201</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B84">
        <label>84.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Nilsson, A., Smith, S., Ulm, G., Gustavsson, E. and Jirstrand, M. (2018) A Performance Evaluation of Federated Learning Algorithms. <italic>Proceedings of the 2nd Workshop on Distributed Infrastructures for Deep Learning</italic>, Rennes, 10-11 December 2018, 1-8. https://doi.org/10.1145/3286490.3286559 <pub-id pub-id-type="doi">10.1145/3286490.3286559</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/3286490.3286559">https://doi.org/10.1145/3286490.3286559</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Nilsson, A.</string-name>
              <string-name>Smith, S.</string-name>
              <string-name>Ulm, G.</string-name>
              <string-name>Gustavsson, E.</string-name>
              <string-name>Jirstrand, M.</string-name>
            </person-group>
            <year>2018</year>
            <article-title>A Performance Evaluation of Federated Learning Algorithms</article-title>
            <source>Proceedings of the 2nd Workshop on Distributed Infrastructures for Deep Learning</source>
            <pub-id pub-id-type="doi">10.1145/3286490.3286559</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B85">
        <label>85.</label>
        <citation-alternatives>
          <mixed-citation publication-type="web">Nori, H., King, N., McKinney, S.M., <italic>et al.</italic> (2023) Capabilities of GPT-4 on Medical Challenge Problems. https://arxiv.org/abs/2303.13375</mixed-citation>
          <element-citation publication-type="web">
            <person-group person-group-type="author">
              <string-name>Nori, H.</string-name>
              <string-name>King, N.</string-name>
              <string-name>McKinney, S.M.</string-name>
            </person-group>
            <year>2023</year>
            <article-title>Capabilities of GPT-4 on Medical Challenge Problems</article-title>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B86">
        <label>86.</label>
        <citation-alternatives>
          <mixed-citation publication-type="book">Pearl, J. (2009) Causality: Models, Reasoning and Inference. 2nd Edition, Cambridge University Press. https://doi.org/10.1017/cbo9780511803161 <pub-id pub-id-type="doi">10.1017/cbo9780511803161</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1017/cbo9780511803161">https://doi.org/10.1017/cbo9780511803161</ext-link></mixed-citation>
          <element-citation publication-type="book">
            <person-group person-group-type="author">
              <string-name>Pearl, J.</string-name>
            </person-group>
            <year>2009</year>
            <article-title>Causality: Models, Reasoning and Inference</article-title>
            <edition>2nd Edition</edition>
            <pub-id pub-id-type="doi">10.1017/cbo9780511803161</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
    </ref-list>
  </back>
</article>