<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">AM</journal-id><journal-title-group><journal-title>Applied Mathematics</journal-title></journal-title-group><issn pub-type="epub">2152-7385</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/am.2021.128048</article-id><article-id pub-id-type="publisher-id">AM-111291</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Physics&amp;Mathematics</subject></subj-group></article-categories><title-group><article-title>
 
 
  A Gaussian Multivariate Hidden Markov Model for Breast Tumor Diagnosis
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Angelo</surname><given-names>Raherinirina</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Adore</surname><given-names>Randriamandroso</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Aimé</surname><given-names>Richard Hajalalaina</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Rivo</surname><given-names>Andry Rakotoarivelo</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Fontaine</surname><given-names>Rafamatantantsoa</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>University of Fianarantsoa, Fianarantsoa, Madagascar</addr-line></aff><pub-date pub-type="epub"><day>12</day><month>08</month><year>2021</year></pub-date><volume>12</volume><issue>08</issue><fpage>679</fpage><lpage>693</lpage><history><date date-type="received"><day>4,</day>	<month>July</month>	<year>2021</year></date><date date-type="rev-recd"><day>13,</day>	<month>August</month>	<year>2021</year>	</date><date date-type="accepted"><day>16,</day>	<month>August</month>	<year>2021</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). 
http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  The stage of a tumor is sometimes hard to predict, especially early in its development. The size and complexity of the observations are the major problems that lead to false diagnoses. Even experienced doctors can make mistakes, with terrible consequences for the patient. We propose a mathematical tool for the diagnosis of breast cancer. The aim is to help specialists decide on the likely condition of a patient given the available series of observations. This may increase the patient’s chances of recovery. With a hidden Markov model with multivariate observations, we describe the evolution of the disease by taking the geometric properties of the tumor as observable variables. The latent variable corresponds to the type of tumor: malignant or benign. The analysis of the covariance matrix makes it possible to delineate the zones of occurrence for each group belonging to a type of tumor. It is therefore possible to summarize the properties that characterize each of the tumor categories using the parameters of the model. These parameters highlight the differences between the types of tumors.
 
</p></abstract><kwd-group><kwd>Hidden Markov Chain</kwd><kwd>Gaussian Mixture</kwd><kwd>Breast Tumor</kwd><kwd>Malignant and Benign</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Breast cancer remains the most common and deadliest cancer worldwide. Statistics show that it affects women much more than men [<xref ref-type="bibr" rid="scirp.111291-ref1">1</xref>]. With an upward trend over the past twenty years, the developed countries seem to be the most affected. Little information is available on the situation in underdeveloped countries [<xref ref-type="bibr" rid="scirp.111291-ref2">2</xref>] [<xref ref-type="bibr" rid="scirp.111291-ref3">3</xref>].</p><p>In the case of Madagascar, the absence of official statistics makes it impossible to take stock of breast cancer. Data from the Institut Pasteur de Madagascar (IPM) show that the average age of patients is 48 years. Although the IPM data do not represent the whole of the Malagasy reality, they nevertheless show that the number of breast tumor cases continues to increase; it has become the most frequently observed cancer. The most affected age group in the series is 36 to 55 years old, see <xref ref-type="fig" rid="fig1">Figure 1</xref>. The youngest patient is 22 years old and the oldest 90 years old. Of all the cases that could be typed histologically, two thirds are grade 3 infiltrating ductal carcinomas in the Scarff-Bloom-Richardson (SBR) histoprognostic classification [<xref ref-type="bibr" rid="scirp.111291-ref2">2</xref>].</p><p>Usually the diagnosis is very late and most cases are seen clinically only at an advanced stage of the tumor [<xref ref-type="bibr" rid="scirp.111291-ref4">4</xref>]. In addition to socio-economic factors, the structure of the health system makes the early diagnosis of cancerous lesions impossible in much of the Malagasy territory. 
Care facilities are almost non-existent outside the main cities (Antananarivo, Fianarantsoa). Indeed, according to IPM data, 80% of the cases observed come from the two large hospitals in the capital Antananarivo. In these situations, the ability to quickly and unambiguously detect the nature of a tumor is of paramount importance. It helps specialists choose the type of treatment to prescribe from the early stages of the tumor's development [<xref ref-type="bibr" rid="scirp.111291-ref5">5</xref>] [<xref ref-type="bibr" rid="scirp.111291-ref6">6</xref>].</p><p>However, studies have shown that the diagnosis of the intrinsic nature of cancer is not necessarily reliable at a fairly early stage of its development [<xref ref-type="bibr" rid="scirp.111291-ref7">7</xref>]. By leading to inadequate treatment, a false diagnosis increases the death rate of the disease. The biggest challenge is therefore to provide tools for the rapid and reliable detection of the nature of tumors. In addition to mammographic exams, mathematical and computer modeling tools can help us in this process [<xref ref-type="bibr" rid="scirp.111291-ref8">8</xref>] [<xref ref-type="bibr" rid="scirp.111291-ref9">9</xref>]. Several models have already been proposed to describe the evolution of breast cancer [<xref ref-type="bibr" rid="scirp.111291-ref8">8</xref>] [<xref ref-type="bibr" rid="scirp.111291-ref10">10</xref>] [<xref ref-type="bibr" rid="scirp.111291-ref11">11</xref>].</p><p>We are interested in so-called stochastic models to describe the evolution of the breast tumor [<xref ref-type="bibr" rid="scirp.111291-ref11">11</xref>] [<xref ref-type="bibr" rid="scirp.111291-ref12">12</xref>] [<xref ref-type="bibr" rid="scirp.111291-ref13">13</xref>]. Given the small amount of information available, these models seem the most appropriate. 
In [<xref ref-type="bibr" rid="scirp.111291-ref11">11</xref>] the authors use a Markov chain to model the metastatic course of lung cancer. Generally, these models assume that tumor types are unambiguously determined based on observations. This is not the case in reality: appearances can be deceptive. Hence the interest in considering the dynamics behind the observed data. The hidden Markov chain is one of the tools used to study the dynamics of a latent phenomenon such as breast cancer. It has already proved its worth in various situations such as the characterization and prediction of the evolution of tumors [<xref ref-type="bibr" rid="scirp.111291-ref14">14</xref>] [<xref ref-type="bibr" rid="scirp.111291-ref15">15</xref>].</p><p>For breast cancer, the biomarker provides information on the nature of a tumor: benign or malignant [<xref ref-type="bibr" rid="scirp.111291-ref16">16</xref>]. However, due to the complexity and quantity of data to be exploited, it is necessary to be able to rely on a tool that judges the observations objectively and reliably. In this article, our objective is to provide a tool to help diagnose the nature of a breast tumor at a primary stage. Given observations of the biomarker, we treat the nature of the tumor as a hidden variable. The observable variables consist of the geometric properties of the tumor located in the mammary area. With probabilistic techniques, we will characterize the true nature of the tumor according to the observations. Based on a series of observations, we will determine the probability that the tumor is malignant or benign. This information is important in a country like Madagascar where diagnostic instruments are still very expensive and inaccessible.</p></sec><sec id="s2"><title>2. Gaussian Mixture Hidden Markov Chain Model</title><p>A Markov chain is a model which allows the study of sequences of a random quantity. 
It is based on the very strong assumption that only the current information is needed to predict the future [<xref ref-type="bibr" rid="scirp.111291-ref17">17</xref>]. Presented by Rabiner and Juang [<xref ref-type="bibr" rid="scirp.111291-ref18">18</xref>] in 1986, the hidden Markov chain is an extension of this model. In most cases, the phenomena that interest us are not observable. The hidden Markov model (HMM) takes into account observed data and hidden events. Consider a Markovian process <inline-formula><tex-math>$(X_t)$</tex-math></inline-formula> in a discrete state space</p><p><disp-formula><label>(1)</label><tex-math>$$S = (S_1, S_2, \cdots, S_n)$$</tex-math></disp-formula></p><p>Because we are working on a discrete-time Markov chain, it is necessary to introduce the transition matrix A such that</p><p><disp-formula><label>(2)</label><tex-math>$$A_{ij} = P(X_t = j \mid X_{t-1} = i),$$</tex-math></disp-formula></p><p>with an initial condition</p><p><disp-formula><label>(3)</label><tex-math>$$\mu_i = P(X_0 = i).$$</tex-math></disp-formula></p><p>For all t, the evolution of <inline-formula><tex-math>$(X_t)$</tex-math></inline-formula> is unobservable. What we have instead is an observation <inline-formula><tex-math>$(Y_t)$</tex-math></inline-formula> which takes its value in a finite set</p><p><disp-formula><label>(4)</label><tex-math>$$O = \{y_1, y_2, \cdots, y_M\}$$</tex-math></disp-formula></p><p>and which depends on these hidden variables.</p><p>We then have an observation function</p><p><disp-formula><label>(5)</label><tex-math>$$f_i(y_k) = P(Y_t = y_k \mid S_t = i)$$</tex-math></disp-formula></p><p>which gives us the probability of the observation knowing the state, where <inline-formula><tex-math>$S_t$</tex-math></inline-formula> denotes the hidden state at time t, that is, the value of <inline-formula><tex-math>$X_t$</tex-math></inline-formula>.</p><p>Equations (1)-(5) characterize a hidden Markov model. The aim of our modeling work is to determine the nature of a breast tumor in the first phase of its evolution.</p><p>We assume that the patient's condition at time t corresponds to a malignant tumor or a benign tumor:</p><p><disp-formula><label>(6)</label><tex-math>$$X_t \in S = \{\text{Malign}, \text{Benign}\}.$$</tex-math></disp-formula></p><p>The process <inline-formula><tex-math>$(X_t)_{t \ge 0}$</tex-math></inline-formula> is considered as an unobservable finite state space Markov chain, see <xref ref-type="fig" rid="fig2">Figure 2</xref>. The variable <inline-formula><tex-math>$Y \in \mathbb{R}^n$</tex-math></inline-formula> represents the observations corresponding to the geometric properties of the tumor. Our goal is to use the series of observations to estimate the different parameters of the process.</p><p>Several powerful algorithms have been developed to do this [<xref ref-type="bibr" rid="scirp.111291-ref19">19</xref>]. 
The forward-backward algorithm calculates the probability of the current state knowing the sequence of observations <inline-formula><tex-math>$(y_1, y_2, \cdots, y_T)$</tex-math></inline-formula> [<xref ref-type="bibr" rid="scirp.111291-ref20">20</xref>].</p><p>Let us summarize in θ all the parameters of the model: the transition matrix and the parameters of the probability distribution (mean, variance) for each state. By defining the functions α and β such that</p><p><disp-formula><label>(7)</label><tex-math>$$\alpha_{t+1}(j) = P(Y_{1:(t+1)} = y_{1:(t+1)}, S_{t+1} = j \mid \theta) = \sum_{i=1}^{N} \alpha_t(i)\, a_{ij}\, f_j(y_{t+1}),$$</tex-math></disp-formula></p><p><disp-formula><label>(8)</label><tex-math>$$\beta_t(i) = P(Y_{(t+1):T} = y_{(t+1):T} \mid S_t = i, \theta) = \sum_{j=1}^{N} \beta_{t+1}(j)\, a_{ij}\, f_j(y_{t+1}),$$</tex-math></disp-formula></p><p>we have, for any t,</p><p><disp-formula><label>(9)</label><tex-math>$$P(Y_{1:T} = y_{1:T} \mid \theta) = \sum_{i=1}^{N} \alpha_t(i)\, \beta_t(i).$$</tex-math></disp-formula></p><p>For the parameter θ, α gives the joint probability of the observations up to time t and of the state at time t, while β gives the probability of the observations after time t given the state at time t.</p><p>These values will be used in the learning phase (updating the parameters given new observations). The implementation is described in <xref ref-type="table" rid="table1">Table 1</xref>.</p><p>In our case, we are interested in the sequence of hidden states knowing the sequence of observations it generated. The Viterbi algorithm, described in <xref ref-type="table" rid="table2">Table 2</xref>, solves this problem [<xref ref-type="bibr" rid="scirp.111291-ref21">21</xref>].</p><p>Regarding breast cancer, the observable variable is not a single value but a d-dimensional vector. This corresponds to the different measurements made on a patient during a diagnosis (temperature, concentration, etc.). 
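</p><p>The forward recursion of Equation (7) and the Viterbi recursion of Table 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy probabilities, and the convention that lik[t][i] holds the already evaluated emission density of state i for the observation at time t are all hypothetical. State 0 stands for benign and state 1 for malignant, and all probabilities are assumed strictly positive so that logarithms are defined.</p><preformat preformat-type="code">
```python
import math


def forward_likelihood(init, trans, lik):
    """Probability of the whole observation sequence (forward pass).

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    lik[t][i]   : emission density of state i for the observation at time t
    """
    n = len(init)
    alpha = [init[i] * lik[0][i] for i in range(n)]
    for t in range(1, len(lik)):
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * lik[t][j]
                 for j in range(n)]
    # Summing the last alpha over the states gives the sequence likelihood.
    return sum(alpha)


def viterbi(init, trans, lik):
    """Most probable hidden state path, computed in log space."""
    n = len(init)
    delta = [math.log(init[i] * lik[0][i]) for i in range(n)]
    back = []  # back[t-1][j] = best predecessor of state j at time t
    for t in range(1, len(lik)):
        prev, delta, psi = delta, [], []
        for j in range(n):
            best = max(range(n), key=lambda i: prev[i] + math.log(trans[i][j]))
            psi.append(best)
            delta.append(prev[best] + math.log(trans[best][j]) + math.log(lik[t][j]))
        back.append(psi)
    # Backtrack from the most probable final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for psi in reversed(back):
        state = psi[state]
        path.append(state)
    return path[::-1]


# Toy example (assumed numbers): state 0 = benign, state 1 = malignant.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.3, 0.7]]
lik = [[0.8, 0.1], [0.7, 0.2], [0.05, 0.9]]
print(forward_likelihood(init, trans, lik))
print(viterbi(init, trans, lik))
```
</preformat><p>Working with logarithms in the Viterbi step avoids numerical underflow on long sequences. The forward pass is left unscaled here for brevity and would need rescaling at each step for long sequences.</p><p>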
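Thus, we have chosen a multivariate Gaussian density to model the probability of observation of each state [<xref ref-type="bibr" rid="scirp.111291-ref22">22</xref>].</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title>Baum-Welch Algorithm for the implementation of the EM algorithm</title></caption><table><tbody>
<tr><td align="left" valign="middle" >Data: <inline-formula><tex-math>$(O_1, O_2, \cdots, O_n)$</tex-math></inline-formula>, set of observations</td></tr>
<tr><td align="left" valign="middle" >Initialization of <inline-formula><tex-math>$\theta_0$</tex-math></inline-formula>: <inline-formula><tex-math>$(a_{ij})$</tex-math></inline-formula>, transition matrix; <inline-formula><tex-math>$p_{il}$</tex-math></inline-formula>, weight of mixture component l of state i; <inline-formula><tex-math>$\mu_{il}$</tex-math></inline-formula>, mean vector of mixture component l of state i; <inline-formula><tex-math>$\sigma_{il}$</tex-math></inline-formula>, covariance matrix of mixture component l of state i</td></tr>
<tr><td align="left" valign="middle" >Forward <inline-formula><tex-math>$(O_1, O_2, \cdots, O_n)$</tex-math></inline-formula>, Backward <inline-formula><tex-math>$(O_1, O_2, \cdots, O_n)$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >While <inline-formula><tex-math>$\log P(O_1, O_2, \cdots, O_n) \le p_0$</tex-math></inline-formula> (fixed threshold)</td></tr>
<tr><td align="left" valign="middle" >Expectation:</td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$\xi_t(i,j) = P(s_t = i, s_{t+1} = j \mid O, \theta)$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$\gamma_t(i) = P(s_t = i \mid O, \theta) = \sum_j \xi_t(i,j)$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$\gamma_{il}(t) = P(S_t = i, m = l \mid O_t, \theta) = \gamma_t(i)\, \dfrac{p_{il}\, P(O_t \mid \mu_{il}, \sigma_{il}, \theta)}{P(O_t \mid S_t = i, \theta)}$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >Maximization:</td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$a_{ij} = \dfrac{\sum_{t=1}^{T} \xi_t(i,j)}{\sum_{t=1}^{T} \gamma_t(i)}$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$\mu_{il} = \dfrac{\sum_{t=1}^{T} \gamma_{il}(t)\, O_t}{\sum_{t=1}^{T} \gamma_{il}(t)}$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$\sigma_{il} = \dfrac{\sum_{t=1}^{T} \gamma_{il}(t)\, (O_t - \mu_{il})(O_t - \mu_{il})^{\top}}{\sum_{t=1}^{T} \gamma_{il}(t)}$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >End While</td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$f_j(O_{t+1})$</tex-math></inline-formula>: probability density of state j; <inline-formula><tex-math>$\gamma_{il}(t)$</tex-math></inline-formula>: probability that component l of state i generated the observation at time t; <inline-formula><tex-math>$\gamma_t(i)$</tex-math></inline-formula>: probability that state i generated the observation at time t</td></tr>
</tbody></table></table-wrap><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title>Viterbi algorithm (generation of hidden state)</title></caption><table><tbody>
<tr><td align="left" valign="middle" >For i = 1 to N do</td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$\delta_1(i) = p_i\, f_i(O_1)$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >End For</td></tr>
<tr><td align="left" valign="middle" >For t = 2 to T do</td></tr>
<tr><td align="left" valign="middle" >For j = 1 to N do</td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$\delta_t(j) = \max_i \big(\delta_{t-1}(i)\, a_{ij}\big)\, f_j(O_t)$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$\psi_t(j) = \arg\max_i \big(\delta_{t-1}(i)\, a_{ij}\big)$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >End For</td></tr>
<tr><td align="left" valign="middle" >End For</td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$P^{*} = \max_i \delta_T(i)$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$Q_T^{*} = \arg\max_i \delta_T(i)$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >For t = T − 1 to 1 do</td></tr>
<tr><td align="left" valign="middle" ><inline-formula><tex-math>$Q_t^{*} = \psi_{t+1}(Q_{t+1}^{*})$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >End For</td></tr>
</tbody></table></table-wrap><p>A multivariate Gaussian mixture is a density formed as a weighted sum of several component densities, each of them a multivariate Gaussian [<xref ref-type="bibr" rid="scirp.111291-ref23">23</xref>]. Given that our observations are multidimensional (several observed characters), we use a random vector <inline-formula><tex-math>$X = [X_1, X_2, \cdots, X_n]$</tex-math></inline-formula> distributed according to a Gaussian mixture distribution. 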
Each component of the mixture has mean</p><p><disp-formula><label>(10)</label><tex-math>$$\mu = [E(X_1), E(X_2), \cdots, E(X_n)],$$</tex-math></disp-formula></p><p>and a covariance matrix</p><p><disp-formula><label>(11)</label><tex-math><![CDATA[$$\sigma = \begin{pmatrix} V(X_1) & Cov(X_1, X_2) & \cdots & Cov(X_1, X_n) \\ Cov(X_1, X_2) & V(X_2) & \ddots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ Cov(X_1, X_n) & \cdots & \cdots & V(X_n) \end{pmatrix},$$]]></tex-math></disp-formula></p><p>with <inline-formula><tex-math>$\sigma_{ij} = \sigma_{ji}$</tex-math></inline-formula> and <inline-formula><tex-math>$\sigma_{ii} = V(X_i) = Cov(X_i, X_i)$</tex-math></inline-formula>.</p><p>Each component i is associated with a weight <inline-formula><tex-math>$p_i$</tex-math></inline-formula>, with <inline-formula><tex-math>$\sum_i p_i = 1$</tex-math></inline-formula>, and the density of a multidimensional Gaussian component is</p><p><disp-formula><label>(12)</label><tex-math>$$f(x, \mu, \sigma) = \frac{1}{(2\pi)^{n/2}\, |\sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^{\top} \sigma^{-1} (x - \mu)\right)$$</tex-math></disp-formula></p><p>Thus, the Gaussian mixture <inline-formula><tex-math>$f_k(x)$</tex-math></inline-formula> associated with the state k is characterized by:</p><p>1) <inline-formula><tex-math>$p_{ik} \in \mathbb{R}$</tex-math></inline-formula>, the weight of component i of the mixture representing the observation density of state k,</p><p>2) <inline-formula><tex-math>$\mu_{ik} \in \mathbb{R}^n$</tex-math></inline-formula>, the mean vector of component i of the mixture k,</p><p>3) <inline-formula><tex-math>$\sigma_{ik} \in \mathbb{R}^{n \times n}$</tex-math></inline-formula>, the covariance matrix of the Gaussian density of component i of the mixture k.</p><p>The parameters are estimated by implementing the Expectation-Maximization algorithm [<xref ref-type="bibr" rid="scirp.111291-ref22">22</xref>] described in <xref ref-type="table" rid="table3">Table 3</xref>.</p></sec><sec id="s3"><title>3. Results and Discussion</title><p>As we do not have any database on breast cancer in Madagascar, we used the published Wisconsin breast cancer database to illustrate our model [<xref ref-type="bibr" rid="scirp.111291-ref24">24</xref>]. With a sample made up of 569 individuals, the authors measured 30 parameters which make it possible to determine the state of the tumor (malignant or benign). These 30 variables characterize the two types of tumor that we want to identify. <xref ref-type="fig" rid="fig3">Figure 3</xref> represents the correlation matrix of the characteristic variables.</p><p>This figure shows that some of the variables are related. 
The color scale, from light blue to dark blue, indicates the strength of the relationship between the covariates.</p><p>It is possible to reduce the number of variables by a principal component analysis. We have plotted the empirical distribution of the geometric characteristics according to the type of tumor, see <xref ref-type="fig" rid="fig4">Figure 4</xref>. In particular, we find that some of them make it easy to distinguish between the tumor types, while others do not.</p><p>After applying principal component analysis, we concluded that a 2-dimensional subspace is sufficient to keep 95% of the variance of the observed data. <xref ref-type="fig" rid="fig5">Figure 5</xref> represents the projection of individuals on the two principal axes.</p><table-wrap id="table3" ><label><xref ref-type="table" rid="table3">Table 3</xref></label><caption><title>Expectation Maximization algorithm for a multivariate Gaussian mixture model</title></caption><table><tbody>
<tr><td align="left" valign="middle" >Observation: <inline-formula><tex-math>$(x_1, x_2, \cdots, x_N)$</tex-math></inline-formula> with <inline-formula><tex-math>$x_i \in \mathbb{R}^n$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >Initialization: <inline-formula><tex-math>$\mu_0, \sigma_0, \pi_0$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >While <inline-formula><tex-math>$\log P(x_1, x_2, \cdots, x_N) \le p_0$</tex-math></inline-formula> (threshold)</td></tr>
<tr><td align="left" valign="middle" >Expectation: <inline-formula><tex-math>$r_{nk} = \dfrac{p_k\, N(x_n \mid \mu_k, \sigma_k)}{\sum_{j=1}^{K} p_j\, N(x_n \mid \mu_j, \sigma_j)}$</tex-math></inline-formula></td></tr>
<tr><td align="left" valign="middle" >Maximization: <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x60.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x61.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x62.png" xlink:type="simple"/></inline-formula></td></tr>
<tr><td align="left" valign="middle" >End While</td></tr>
</tbody></table></table-wrap><p>The number of components of each mixture can be calculated using the Bayesian Information Criterion (BIC) on each of the data sets that we have for the different types of tumors [<xref ref-type="bibr" rid="scirp.111291-ref25">25</xref>]. 
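</p><p>The selection of the number of components by BIC can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names gauss2d, mixture_loglik and bic, the toy data and the hand-picked parameters are assumptions, and the parameters are taken as given instead of being fitted by EM. The sketch evaluates the Gaussian density of Equation (12) for n = 2, the log-likelihood of a mixture, and BIC = k ln N − 2 ln L, where k counts the free parameters of a K-component mixture with full covariance matrices. Under this sign convention, the lowest value is preferred.</p><preformat preformat-type="code">
```python
import math


def gauss2d(x, mean, cov):
    """Density of a 2-D Gaussian (Equation (12) with n = 2)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean) for a 2x2 matrix.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def mixture_loglik(data, weights, means, covs):
    """Log-likelihood of the data under a Gaussian mixture."""
    total = 0.0
    for x in data:
        total += math.log(sum(w * gauss2d(x, m, s)
                              for w, m, s in zip(weights, means, covs)))
    return total


def bic(loglik, n_components, n_points, dim=2):
    """BIC = k ln N - 2 ln L for a full-covariance Gaussian mixture."""
    # Free parameters: K - 1 weights, K * dim means,
    # K * dim * (dim + 1) / 2 covariance entries.
    k = ((n_components - 1) + n_components * dim
         + n_components * dim * (dim + 1) // 2)
    return k * math.log(n_points) - 2.0 * loglik


# Toy data and hand-picked parameters (assumed, not fitted).
data = [(0.0, 0.1), (0.2, -0.1), (3.0, 3.1), (2.9, 3.0)]
identity = ((1.0, 0.0), (0.0, 1.0))
one = mixture_loglik(data, [1.0], [(1.5, 1.5)], [identity])
two = mixture_loglik(data, [0.5, 0.5], [(0.1, 0.0), (2.95, 3.05)],
                     [identity, identity])
print(bic(one, 1, len(data)), bic(two, 2, len(data)))
```
</preformat><p>In practice the parameters of each candidate mixture would first be fitted (for example with the EM algorithm of Table 3), and the criterion would then be compared across the candidate values of K for each tumor type.</p><p>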
For an observation, the point clouds are obtained by taking its scalar product with each of the two eigenvectors.</p><p>We thus have the optimal number of components for each mixture. Our results show that the suitable number of components for the benign type is k = 2 while that of the malignant type is k = 3. This is consistent with the arrangement of the points in <xref ref-type="fig" rid="fig5">Figure 5</xref>: the point cloud of malignant cases is quite dispersed, so it is logical that its mixture has more components than the other type. For each of the two states (malignant and benign), we estimate the parameters of their distribution using the Expectation Maximization algorithm. Learning is supervised because we already know for each observation the state that generated it. The parameters of the Gaussian distribution for each mixture are given in <xref ref-type="table" rid="table4">Table 4</xref> and <xref ref-type="table" rid="table5">Table 5</xref>.</p><p>Thus, we have a probability density which makes it possible to associate a probability value with each observation, see <xref ref-type="fig" rid="fig6">Figure 6</xref>. For each observation<inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x65.png" xlink:type="simple"/></inline-formula>, we calculate its probability conditioned by all the parameters<inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x66.png" xlink:type="simple"/></inline-formula>. The “+” sign represents the means. 
Its size is proportional to the weight of the component in the mixture.</p><table-wrap id="table4" ><label><xref ref-type="table" rid="table4">Table 4</xref></label><caption><title>Parameters of the distribution of the “malignant” tumor state</title></caption><table><thead><tr><th align="center" valign="middle" >Malignant Tumor</th><th align="center" valign="middle" >Eigenvectors</th><th align="center" valign="middle" >Eigenvalues</th></tr></thead><tbody><tr><td align="center" valign="middle" >Covariance <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x68.png" xlink:type="simple"/></inline-formula> Mean <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x69.png" xlink:type="simple"/></inline-formula> Weight: 0.55</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x70.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x71.png" xlink:type="simple"/></inline-formula></td><td align="center" valign="middle" >1.856 5.476789</td></tr><tr><td align="center" valign="middle" >Covariance <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x72.png" xlink:type="simple"/></inline-formula> Mean <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x73.png" xlink:type="simple"/></inline-formula> Weight: 0.20</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x74.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x75.png" xlink:type="simple"/></inline-formula></td><td align="center" valign="middle" >7.9519 18.39</td></tr><tr><td align="center" valign="middle" >Covariance <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x76.png" xlink:type="simple"/></inline-formula> Mean <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x77.png" xlink:type="simple"/></inline-formula> Weight: 0.25</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x78.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x79.png" xlink:type="simple"/></inline-formula></td><td align="center" valign="middle" >1.513 3.55</td></tr></tbody></table></table-wrap><p>With the estimated parameters, we performed a simulation of the model. The results of the principal component analysis of the simulated data are represented in <xref ref-type="fig" rid="fig7">Figure 7</xref>.</p><p>By comparing this result with the real data in <xref ref-type="fig" rid="fig5">Figure 5</xref>, we can deduce that the estimation is reliable. This reflects the efficiency of the EM algorithm in estimating the parameters of a hidden Markov model.</p><table-wrap id="table5" ><label><xref ref-type="table" rid="table5">Table 5</xref></label><caption><title>Parameters of the distribution of the “benign” tumor state</title></caption><table><thead><tr><th align="center" valign="middle" >Benign Tumor</th><th align="center" valign="middle" >Eigenvectors</th><th align="center" valign="middle" >Eigenvalues</th></tr></thead><tbody><tr><td align="center" valign="middle" >Covariance <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x80.png" xlink:type="simple"/></inline-formula> Mean <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x81.png" xlink:type="simple"/></inline-formula> Weight: 0.84</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x82.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x83.png" xlink:type="simple"/></inline-formula></td><td align="center" valign="middle" >1.2802 2.34</td></tr><tr><td align="center" valign="middle" >Covariance <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x84.png" xlink:type="simple"/></inline-formula> Mean <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x85.png" xlink:type="simple"/></inline-formula> Weight: 0.16</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x86.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="/html.scirp.org/file/3-7404743x87.png" xlink:type="simple"/></inline-formula></td><td align="center" valign="middle" >2.389 12.1202</td></tr></tbody></table></table-wrap><p>Using a simulation, we illustrate the approximation of the real data by the simulated data from the model. Considering the two tumor states, malignant and benign, <xref ref-type="fig" rid="fig8">Figure 8</xref> represents a realization of the Viterbi algorithm (<xref ref-type="table" rid="table2">Table 2</xref>).</p><p>The first curve, in red, represents the actual condition of the tumor (1 for benign and 0 for malignant), which is assumed to be hidden. The blue curve below corresponds to the state estimated by the algorithm, that is, the state most likely to have generated the observations. The estimate was made using the observation sequences projected onto the two eigenvectors, which are represented by the two lowest blue curves. According to this figure, the model manages to approximate the true nature of the tumor during its evolution.</p><p>This model is therefore a real additional tool which can help specialists in the diagnosis of breast tumors at an early stage. Simulations with breast cancer data show that inference with a Gaussian mixture hidden Markov model can help us characterize the internal nature of the disease.</p></sec><sec id="s4"><title>4. 
Conclusions</title><p>Faced with the difficulties in diagnosing breast cancer, specialists need other decision-making tools. In an underdeveloped country like Madagascar especially, patient care facilities remain insufficient. The cost of health examinations is very high and care is provided late. This article constitutes our contribution to earlier decision-making based on the geometric characteristics of the tumor. As we do not have a database on breast cancer in Madagascar, the inference of our hidden Markov model was made with data on breast cancer from Wisconsin, available on the web. As the observations relate to several variables describing the geometric characteristics of the tumor, we chose the multivariate Gaussian distribution for the distribution of observations corresponding to each state.</p><p>By implementing the EM algorithm, we succeeded in identifying the parameters of the distribution of observations corresponding to each state of the tumor, benign or malignant, and the Viterbi algorithm was used to estimate the hidden state. With the estimated parameters, we reproduced the initial data set under the same conditions. The simulated data are consistent with the real data.</p><p>The interest of our modeling work is both to characterize the nature of tumors as early as possible and to predict their evolution. This is especially important while our health system is not yet ready to provide adequate care. Even if the data from the Institut Pasteur de Madagascar (IPM) are not representative, the frequency of the observed cases shows that the situation is critical. To be able to carry out more in-depth studies, specialists recommend the establishment of a database on breast cancer in Madagascar.</p></sec><sec id="s5"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s6"><title>Cite this paper</title><p>Raherinirina, A., Randriamandroso, A., Hajalalaina, A.R., Rakotoarivelo, R.A. and Rafamatantantsoa, F. 
(2021) A Gaussian Multivariate Hidden Markov Model for Breast Tumor Diagnosis. Applied Mathematics, 12, 679-693. https://doi.org/10.4236/am.2021.128048</p></sec></body><back><ref-list><title>References</title><ref id="scirp.111291-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Azamjah, N., Soltan-Zadeh, Y. and Zayeri, F. (2019) Global Trend of Breast Cancer Mortality Rate: A 25-Year Study. Asian Pacific Journal of Cancer Prevention, 20, 2015-2020. https://doi.org/10.31557/APJCP.2019.20.7.2015</mixed-citation></ref><ref id="scirp.111291-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Sado, A. (2015) Application of Brody Growth Function to Describe Dynamics of Breast Cancer Cells. American Journal of Applied Mathematics, 3, 138-145. https://doi.org/10.11648/j.ajam.20150303.20</mixed-citation></ref><ref id="scirp.111291-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Al-Moundhri, M., Al-Bahrani, B., Pervez, I., Ganguly, S., Nirmala, V., Al-Madhani, A., Al-Mawaly, K. and Grant, C. (2004) The Outcome of Treatment of Breast Cancer in a Developing Country—Oman. The Breast, 13, 139-145. https://doi.org/10.1016/j.breast.2003.10.001</mixed-citation></ref><ref id="scirp.111291-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Raharisolo, V., Rabarijaona, L., Rajemiarimoelisoa, C., Rasendramino, M. and Migliani, R. (2002) Management of Breast Cancers Diagnosed at the Institut Pasteur de Madagascar from 1995 to 2001. Archives de l’Institut Pasteur de Madagascar, 68, 104-108.</mixed-citation></ref><ref id="scirp.111291-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Cianfrocca, M. and Goldstein, L. (2004) Prognostic and Predictive Factors in Early-Stage Breast Cancer. Oncologist, 9, 606-616. 
https://doi.org/10.1634/theoncologist.9-6-606</mixed-citation></ref><ref id="scirp.111291-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Chuwa, E., Yeo, A., Koong, H., Wong, C., Yong, W., Tan, P., Ho, J., Wong, J. and Ho, G. (2009) Early Detection of Breast Cancer through Population-Based Mammographic Screening in Asian Women: A Comparison Study between Screen-Detected and Symptomatic Breast Cancers. The Breast Journal, 15, 133-139. https://doi.org/10.1111/j.1524-4741.2009.00687.x</mixed-citation></ref><ref id="scirp.111291-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Bicar, A. (2018) Le cancer du sein chez la jeune femme et sa prise en charge. PhD Thesis, University of Limoges, Limoges, 15.</mixed-citation></ref><ref id="scirp.111291-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Botesteanu, D., Lipkowitz, S., Lee, J. and Levy, D. (2016) Mathematical Models of Breast and Ovarian Cancers. WIREs Systems Biology and Medicine, 8, 337-362. https://doi.org/10.1002/wsbm.1343</mixed-citation></ref><ref id="scirp.111291-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Harris, J., Lippman, M., Veronesi, U. and Willett, W. (1992) Breast Cancer. New England Journal of Medicine, 327, 473-480. https://doi.org/10.1056/NEJM199208133270706</mixed-citation></ref><ref id="scirp.111291-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Enderling, H., Chaplain, M., Anderson, A. and Vaidya, J. (2007) A Mathematical Model of Breast Cancer Development, Local Treatment and Recurrence. Journal of Theoretical Biology, 246, 245-259. https://doi.org/10.1016/j.jtbi.2006.12.010</mixed-citation></ref><ref id="scirp.111291-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Newton, P., Mason, J., Bethel, K., Bazhenova, L., Nieva, J. and Kuhn, P. 
(2012) A Stochastic Markov Chain Model to Describe Lung Cancer Growth and Metastasis. PLoS ONE, 7, e34637. https://doi.org/10.1371/journal.pone.0034637</mixed-citation></ref><ref id="scirp.111291-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Lee, S. and Zelen, M. (2006) A Stochastic Model for Predicting the Mortality of Breast Cancer. JNCI Monographs, 2006, 79-86. https://doi.org/10.1093/jncimonographs/lgj011</mixed-citation></ref><ref id="scirp.111291-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Speer, J., Petrosky, V., Retsky, M. and Wardwell, R. (1984) A Stochastic Numerical Model of Breast Cancer Growth That Simulates Clinical Data. Cancer Research, 44, 4124-4130.</mixed-citation></ref><ref id="scirp.111291-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Mohammadreza, M., Mohammadreza, S. and Hossein, R. (2020) Using Hidden Markov Model to Predict Recurrence of Breast Cancer Based on Sequential Patterns in Gene Expression Profiles. Journal of Biomedical Informatics, 111, Article ID: 103570. https://doi.org/10.1016/j.jbi.2020.103570</mixed-citation></ref><ref id="scirp.111291-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Mahmoudzadeh, E., Montazeri, M., Zekri, M. and Sadri, S. (2015) Extended Hidden Markov Model for Optimized Segmentation of Breast Thermography Images. Infrared Physics and Technology, 72, 19-28. https://doi.org/10.1016/j.infrared.2015.06.012</mixed-citation></ref><ref id="scirp.111291-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Zaman, K., Ambrosetti, A., Perey, L., Jeanneret-Sozzi, W., Delaloye, J. and Ziegler, D. (2007) Cancer du sein chez la jeune femme: Traitements adjuvants et désir de grossesse. 
Revue Médicale Suisse, 7, 1298-1304.</mixed-citation></ref><ref id="scirp.111291-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Norris, J. (1998) Markov Chains. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511810633</mixed-citation></ref><ref id="scirp.111291-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Rabiner, L. and Juang, B. (1986) An Introduction to Hidden Markov Models. IEEE ASSP Magazine, 3, 4-16. https://doi.org/10.1109/MASSP.1986.1165342</mixed-citation></ref><ref id="scirp.111291-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Rabiner, L. (1989) A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 77, 257-286. https://doi.org/10.1109/5.18626</mixed-citation></ref><ref id="scirp.111291-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Devijver, P. (1985) Baum’s Forward-Backward Algorithm Revisited. Pattern Recognition Letters, 3, 369-373. https://doi.org/10.1016/0167-8655(85)90023-6</mixed-citation></ref><ref id="scirp.111291-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Lou, H. (1995) Implementing the Viterbi Algorithm. IEEE Signal Processing Magazine, 12, 42-52. https://doi.org/10.1109/79.410439</mixed-citation></ref><ref id="scirp.111291-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Xuan, G.R., Zhang, W. and Chai, P.Q. (2001) EM Algorithms of Gaussian Mixture Model and Hidden Markov Model. International Conference on Image Processing, 1, 145-148.</mixed-citation></ref><ref id="scirp.111291-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Pinto, R. and Engel, P. (2015) A Fast Incremental Gaussian Mixture Model. PLoS ONE, 10, e0139931. 
https://doi.org/10.1371/journal.pone.0139931</mixed-citation></ref><ref id="scirp.111291-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Dua, D. and Graff, C. (2019) UCI Machine Learning Repository. University of California, School of Information and Computer Science, Irvine. http://archive.ics.uci.edu/ml</mixed-citation></ref><ref id="scirp.111291-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Deisenroth, M., Faisal, A. and Ong, C. (2020) Mathematics for Machine Learning. Cambridge University Press, Cambridge. https://mml-book.com/ https://doi.org/10.1017/9781108679930</mixed-citation></ref></ref-list></back></article>