<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">OJS</journal-id><journal-title-group><journal-title>Open Journal of Statistics</journal-title></journal-title-group><issn pub-type="epub">2161-718X</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/ojs.2019.91006</article-id><article-id pub-id-type="publisher-id">OJS-90198</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Physics&amp;Mathematics</subject></subj-group></article-categories><title-group><article-title>
 
 
  Analysis of Hospital Mortality Data: The Role of DRG’s
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Mohamed</surname><given-names>M. Shoukri</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Sara</surname><given-names>N. Algahtani</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Abdelmoneim</surname><given-names>M. Eldali</given-names></name><xref ref-type="aff" rid="aff3"><sup>3</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Manal</surname><given-names>R. AlMarzouqi</given-names></name><xref ref-type="aff" rid="aff3"><sup>3</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Saleh</surname><given-names>M. 
Al-Ageel</given-names></name><xref ref-type="aff" rid="aff3"><sup>3</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Department of Epidemiology and Biostatistics, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada</addr-line></aff><aff id="aff3"><addr-line>Department of Biostatistics, Epidemiology, and Scientific Computing, King Faisal Specialist Hospital and Research Center, Riyadh, KSA</addr-line></aff><aff id="aff2"><addr-line>King Fahd Medical City, Riyadh, KSA</addr-line></aff><pub-date pub-type="epub"><day>18</day><month>01</month><year>2019</year></pub-date><volume>09</volume><issue>01</issue><fpage>62</fpage><lpage>73</lpage><history><date date-type="received"><day>24,</day>	<month>December</month>	<year>2018</year></date><date date-type="rev-recd"><day>22,</day>	<month>January</month>	<year>2019</year>	</date><date date-type="accepted"><day>25,</day>	<month>January</month>	<year>2019</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  <em>Background</em>: Factors associated with hospital mortality are usually identified, and their effects quantified, through statistical modeling. To guide the choice of the best statistical model, we first quantify the predictive ability of each model and then use the CIHI index to see whether the hospital policy needs any change. <b>Objectives</b>: The main purpose of this study was to compare three statistical models in evaluating the association between hospital mortality and two risk factors, namely the subject’s age at admission and the length of stay, adjusting for the effect of Diagnostic Related Groups (DRG). <b>Methods</b>: We used several SAS procedures to quantify the effect of DRG on the variability in hospital mortality. These procedures are the Logistic Regression model (ignoring the DRG effect), the Generalized Estimating Equation (GEE), which takes into account the within-DRG clustering effect (but treats the within-cluster correlation as a nuisance parameter), and the Generalized Linear Mixed Model (GLIMMIX). We showed that the GLIMMIX is superior to the other models as it properly accounts for the clustering effect of DRG. <b>Results</b>: The GLM procedure showed that the proportional contribution of DRG is 16%. All three models showed a significant and increasing trend in mortality (P &lt; 0.0001) with respect to the two risk factors (age at admission and hospital length of stay). The CIHI index did not differ under the three models. We re-estimated the model parameters after dichotomizing the risk factors at the optimal cut-off points, using the ROC curve. The parameter estimates and their significance did not change.
 
</p></abstract><kwd-group><kwd>Diagnostic Related Groups</kwd><kwd>Intra-Cluster Correlation</kwd><kwd>GEE Models</kwd><kwd>GLIMMIX Models</kwd><kwd>Odds Ratios</kwd><kwd>ROC Curves</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>The ability to gauge hospital performance using patient outcome data depends upon many factors. In principle, the outcome needs to reflect features that are directly affected by the quality of hospital care, such as mortality, readmission rates, and patient and employee satisfaction. Beyond this, however, there are a number of important data and statistical considerations:</p><p>1) Data must be available and used to adjust for differences in patient health at admission across different hospitals (case-mix differences). These adjustments are required to ensure that variations in reported performance apply to hospitals’ contributions to their patients’ outcomes rather than to the intrinsic difficulty of the patients they treat. Needless to say, the performance of these adjustments depends on the type and quality of the available data.</p><p>2) In distinct contrast to the previous point, reported performance should not adjust away differences related to the quality of the hospital. For example, if “presence of a special dialysis care unit” is systematically associated with better survival following organ failure, a hospital’s reported performance should capture the benefit provided by that unit and, as a consequence, such a hospital-level characteristic should not influence the risk adjustment.</p><p>3) The reported performance measure should be little affected by the variability associated with rates based on small numbers of cases.</p><p>In this report we address technical statistical issues associated with the hospital mortality data of the King Faisal Specialist Hospital and Research Center (KFSHRC). The salient point is that there is no consensus to guide our choice of an appropriate statistical model. 
However, we shall use statistical models that suit the structure of our data. To enhance the traditional modeling techniques, we consider: more flexible models incorporating Diagnosis Related Groups (DRG) adjustment [<xref ref-type="bibr" rid="scirp.90198-ref1">1</xref>]; statistical distributions that do not belong to the well-known Gaussian family used in hierarchical, random effects models; evaluation of the effectiveness of current outlier detection methods; and production of an ensemble of hospital-specific Standardized Mortality Ratios (HSMR) that accurately estimates the true, underlying distribution of ratios [<xref ref-type="bibr" rid="scirp.90198-ref2">2</xref>]. Discussions with clinicians and other quality experts concluded that risk adjustments should not reflect hospital characteristics; rather, they should be used to reduce confounding of the case-mix/risk relation. Statistical models are available for each of these operations. The ability to develop and implement such models is now available following the adoption of the SAS software and the acquisition of its important components.</p><p>In Section 2 we define what is meant by DRG, and in Section 3 we describe the data that were made available to us, with mortality as the primary outcome at the King Faisal Specialist Hospital and Research Center (KFSHRC). In Section 4 we compare the models, and in Section 5 we discuss the quantitative merits of these models, followed by recommendations.</p></sec><sec id="s2"><title>2. The Importance of Incorporating DRGs within the Proposed Models</title><p>The Diagnostic Related Groups (DRGs) were first developed at Yale University in 1975. The main objective was to group patients with similar treatments and conditions for comparative studies. DRGs were designed to be homogeneous units of hospital activity to which binding prices could be attached. 
A central theme in the advocacy of DRGs was that this reimbursement system would, by constraining the hospitals, oblige their administrators to alter the behavior of the physicians and surgeons comprising their medical staffs. Hospitals were forced to leave the nearly risk-free world of cost reimbursement, and face the uncertain financial consequences associated with the provision of health care. DRGs were designed to provide practice pattern information that administrators could use to influence individual physician behavior.</p><p>In 2007, author Rick Mayes [<xref ref-type="bibr" rid="scirp.90198-ref3">3</xref>] described DRGs as:</p><p>...the single most influential postwar innovation in medical financing: Medicare’s prospective payment system (PPS). Inexorably rising medical inflation and deep economic deterioration forced policymakers in the late 1970s to pursue radical reform of Medicare to keep the program from insolvency.</p><p>In the USA the most significant change in health policy since Medicare and Medicaid’s passage in 1965 went virtually unnoticed by the general public [<xref ref-type="bibr" rid="scirp.90198-ref4">4</xref>] . Nevertheless, the change was nothing short of revolutionary. For the first time, the federal government gained the upper hand in its financial relationship with the hospital industry. 
Medicare’s new prospective payment system with DRGs triggered a shift in the balance of political and economic power between the providers of medical care (hospitals and physicians) and those who paid for it, power that providers had successfully accumulated for more than half a century. From a statistical viewpoint, DRGs can be considered artificial clusters of subjects.</p><p>Krumholz et al. [<xref ref-type="bibr" rid="scirp.90198-ref5">5</xref>] discussed several factors that should be considered when assessing hospital quality. These relate to differences in the chronic and clinical acuity of patients at hospital presentation, the numbers of patients treated at a hospital, the frequency of the outcome studied, the extent to which the outcome reflects a hospital quality signal, and the form of the performance metric used to assess hospital quality. However, issues related to DRG have not been considered as factors of importance. Since the outcome of interest is hospital mortality, any attempt to derive risk-adjusted mortality that does not take into account the relative importance of DRG will produce biased estimates [<xref ref-type="bibr" rid="scirp.90198-ref6">6</xref>]. The performance measure is reported as:</p><p>$$\frac{\text{Observed number of deaths}}{\text{Expected (model-based) number of deaths}} \qquad (1)$$</p><p>The denominator of Equation (1) results from applying a model that adjusts/standardizes for an ensemble of patient-level, pre-admission risk factors, rather than only demographic factors such as age and gender, as is typical in epidemiological applications. The statistical issues arising in the estimation of the standardized death rate and the SMR are identical because the latter is simply the hospital-specific observed number of deaths divided by the expected number of deaths computed from the postulated risk model.</p></sec><sec id="s3"><title>3. Methods</title><sec id="s3_1"><title>3.1. 
Study Design</title><p>Hospital discharge status for 2014 through 2016 was extracted from the hospital medical records. For each subject, the age at admission, length of stay and DRG membership were included in this cross-sectional retrospective design. The study was reviewed and approved by the Institutional Review Board at the King Faisal Specialist Hospital and Research Center (KFSHRC).</p></sec><sec id="s3_2"><title>3.2. Study Variables</title><sec id="s3_2_1"><title>3.2.1. Dependent Variable</title><p>Discharge status (dead/alive) is the dependent variable. Because the outcome follows a Bernoulli distribution, the log-odds of death were calculated in the analytical cohort.</p></sec><sec id="s3_2_2"><title>3.2.2. Independent Variables</title><p>Regression models included parameters for age at admission, length of stay, gender, and DRG. Because DRG is a categorical variable with a very large number of levels, our modeling strategy used DRG as a clustering variable and as a random effect. The fundamental aim was to adjust the standard errors of the estimated model parameters for the possible within-DRG correlation. A further aim was to preserve the stochastic and hierarchical structure of the data and to develop an effective risk adjustment. Because patient-specific outcomes are binary (death indicator), a Bernoulli model operating at the patient level is appropriate. Risk adjustment and stabilization should adopt this model, and thus logistic regression is a suitable approach for including the effects of patient-level characteristics. With flexible modeling of covariate influences, the model would produce a valid risk adjustment and there is no reason to replace the logistic by another function.</p><p>The evaluation process must be based on an effective risk adjustment. 
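</p><p>The patient-level Bernoulli (logistic) risk model described above and the observed-to-expected ratio of Equation (1) can be illustrated with a minimal Python sketch. The intercept, the coefficients and the four patients are hypothetical values chosen for illustration only; they are not estimates from the KFSHRC data, and the function names are assumptions introduced here.</p><preformat>
```python
import math

def predicted_risk(intercept, coefs, covariates):
    """Death probability from a logistic model: 1 / (1 + exp(-eta))."""
    eta = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-eta))

def observed_to_expected(deaths, risks):
    """Equation (1): observed deaths divided by model-based expected deaths."""
    return sum(deaths) / sum(risks)

# Hypothetical coefficients (log-odds per year of age and per day of stay)
# and hypothetical patients given as (age at admission, length of stay).
intercept, coefs = -4.0, (0.02, 0.02)
patients = [(30, 2), (65, 10), (80, 40), (5, 1)]
deaths = [0, 0, 1, 0]  # hypothetical death indicators

risks = [predicted_risk(intercept, coefs, p) for p in patients]
ratio = observed_to_expected(deaths, risks)
print(ratio)
```
</preformat><p>In this sketch the expected number of deaths is taken as the sum of the model-based death probabilities, which is one common way of forming the denominator of Equation (1).</p><p>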
Though one might wish to have additional information on patient attributes and clinical severity, even with currently available data we should evaluate whether a more flexible risk adjustment model will improve performance. Patient characteristics (clinical and demographic) are of three types: measured and accounted for; measurable but not accounted for; and difficult or impossible to measure. Prudence dictates that risk adjustments should include pre-admission medical conditions, but whether or not to include demographic attributes is a policy decision.</p></sec></sec><sec id="s3_3"><title>3.3. Statistical Analysis</title><p>Univariate and descriptive statistics were used to profile the study covariates, including the frequency distribution of the top twelve DRGs, as shown in <xref ref-type="table" rid="table1">Table 1</xref>. Because of the binary nature of the outcome of interest (patient’s status when discharged), we fitted logistic regression models to estimate the level (intercept) and trend (slope) of the log-odds of death as functions of age at admission and length of stay. Each model was adjusted to account for the clustering effect of DRG.</p><p>Three statistical estimation procedures in SAS (GLM, GEE, GLIMMIX) were used to account for the correlation between responses within a DRG and heterogeneity across individuals in the study. The intra-class correlation was calculated from a one-way ANOVA using the GLM procedure in SAS. 
Data management and analyses were accomplished via PC-SAS (v9.4) [<xref ref-type="bibr" rid="scirp.90198-ref7">7</xref>] , with an a-priori Type I error rate set at 0.01.</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> The ICD9 diagnoses for the top most frequent DRG’s in the hospital data base</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >DRG</th><th align="center" valign="middle" >frequency</th><th align="center" valign="middle" >ICD9 Diagnosis</th></tr></thead><tr><td align="center" valign="middle" >1-P67D</td><td align="center" valign="middle" >4270</td><td align="center" valign="middle" >Neonate-admission weight &gt; 2499 g W/O significant OR procedure.</td></tr><tr><td align="center" valign="middle" >2-O60Z</td><td align="center" valign="middle" >2097</td><td align="center" valign="middle" >Vaginal delivery complications</td></tr><tr><td align="center" valign="middle" >3-F42B</td><td align="center" valign="middle" >1434</td><td align="center" valign="middle" >Diseases of the circulatory system</td></tr><tr><td align="center" valign="middle" >4-R60B</td><td align="center" valign="middle" >1308</td><td align="center" valign="middle" >Acute Leukemia W catastrophic CC</td></tr><tr><td align="center" valign="middle" >5-R61B</td><td align="center" valign="middle" >1273</td><td align="center" valign="middle" >Lymphoma &amp; non-acute Leukemia W/O catastrophic CC</td></tr><tr><td align="center" valign="middle" >6-O01B</td><td align="center" valign="middle" >1187</td><td align="center" valign="middle" >Pregnancy complications</td></tr><tr><td align="center" valign="middle" >7-L04C</td><td align="center" valign="middle" >1173</td><td align="center" valign="middle" >Diseases of Kidney and Urinary tract</td></tr><tr><td align="center" valign="middle" >8-K64B</td><td align="center" valign="middle" >1127</td><td align="center" valign="middle" >Endocrine and metabolic 
diseases</td></tr><tr><td align="center" valign="middle" >9-Q60A</td><td align="center" valign="middle" >1064</td><td align="center" valign="middle" >Immunity disorders</td></tr><tr><td align="center" valign="middle" >10-E62B</td><td align="center" valign="middle" >1016</td><td align="center" valign="middle" >Diseases of the respiratory system</td></tr><tr><td align="center" valign="middle" >11-O66Z</td><td align="center" valign="middle" >1004</td><td align="center" valign="middle" >Antenatal and other obstetric admissions</td></tr><tr><td align="center" valign="middle" >12-B02C</td><td align="center" valign="middle" >1002</td><td align="center" valign="middle" >Diseases of the nervous system</td></tr></tbody></table></table-wrap><p>The analyses produced point estimates and 95% confidence intervals of the odds ratios, with the two covariates entered into the models either as continuous or as categorical variables.</p></sec></sec><sec id="s4"><title>4. Results</title><p>The study included 191943 discharges, of which 184907 were discharged alive and 7046 died. The summary statistics are outlined in <xref ref-type="table" rid="table1">Table 1</xref> and <xref ref-type="table" rid="table2">Table 2</xref>.</p><p>Let $k$ denote the number of DRGs in the database and $k_i$ the size of the $i$-th DRG. The estimated intra-cluster correlation obtained from the one-way ANOVA using the GLM procedure in SAS (Shoukri [<xref ref-type="bibr" rid="scirp.90198-ref8">8</xref>]) is given in Equation (2):</p><p>$$\hat{\rho} = \frac{\mathrm{MSBD} - \mathrm{MSWD}}{\mathrm{MSBD} + (k_0 - 1)\,\mathrm{MSWD}} \qquad (2)$$</p><p>where MSBD and MSWD are, respectively, the between-DRG mean square and the within-DRG mean square. Moreover:</p><p>$$k_0 = \frac{1}{k - 1}\left[\sum_{i=1}^{k} k_i - \frac{\sum_{i=1}^{k} k_i^2}{\sum_{i=1}^{k} k_i}\right], \qquad n = \sum_{i=1}^{k} k_i$$</p><p>Summary statistics for age at admission and length of stay are presented in <xref ref-type="table" rid="table2">Table 2</xref>. 
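</p><p>The ANOVA estimator in Equation (2) can be sketched in a few lines of Python. The clusters below are hypothetical 0/1 outcomes standing in for DRGs (they are not the hospital data), the function name is an assumption, and $k_0$ is computed here with the usual weighted-average cluster size formula, which divides by the number of clusters minus one.</p><preformat>
```python
def anova_icc(clusters):
    """Return (rho_hat, k0, msbd, mswd) for a list of clusters of 0/1 outcomes."""
    k = len(clusters)                    # number of clusters (DRGs)
    sizes = [len(c) for c in clusters]   # cluster sizes k_i
    n = sum(sizes)                       # total number of subjects
    grand_mean = sum(sum(c) for c in clusters) / n
    means = [sum(c) / len(c) for c in clusters]
    # Between-cluster and within-cluster sums of squares
    ssb = sum(s * (m - grand_mean) ** 2 for s, m in zip(sizes, means))
    ssw = sum((y - m) ** 2 for c, m in zip(clusters, means) for y in c)
    msbd = ssb / (k - 1)                 # between-DRG mean square
    mswd = ssw / (n - k)                 # within-DRG mean square
    # Weighted average cluster size (assumed form of k0)
    k0 = (n - sum(s * s for s in sizes) / n) / (k - 1)
    rho = (msbd - mswd) / (msbd + (k0 - 1) * mswd)
    return rho, k0, msbd, mswd

# Hypothetical toy data: four clusters of four patients each (1 = died).
toy = [[1, 1, 1, 0], [0, 0, 0, 0], [1, 0, 1, 1], [0, 0, 1, 0]]
rho_hat, k0, msbd, mswd = anova_icc(toy)
print(rho_hat, k0)
```
</preformat><p>For equal-sized clusters $k_0$ reduces to the common cluster size, which gives a quick sanity check of such an implementation.</p><p>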
Note that the standard deviation formula uses the number of observations minus one to produce an unbiased estimator of the corresponding population parameter (Shoukri [<xref ref-type="bibr" rid="scirp.90198-ref8">8</xref>]).</p><p>The main purpose of using the GLM, which requires independent responses, is to produce a point estimator of the within-cluster correlation (Shoukri [<xref ref-type="bibr" rid="scirp.90198-ref8">8</xref>]).</p><p>From the GLM procedure we have MSBD = 1.8214, MSWD = 0.0218 and $k_0 = 474$. From Equation (2), the intra-cluster correlation coefficient is $\hat{\rho} = 0.16$.</p><p>Note that:</p><p>The Effect of Dichotomization</p><p>Measurements of continuous variables are made in all branches of epidemiological studies, aiding in the diagnosis and treatment of patients. In clinical practice it is helpful to label individuals as having or not having an attribute, such as being “old” or “young” or having a “long stay”, depending on the number of days.</p><table-wrap-group id="2"><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> Summary statistics for age at admission (AAA) and length of stay (LOS) presented by discharge status (Alive, Dead). (a) Status = Alive; (b) Status = Dead</title></caption><table-wrap id="2_1"><caption><title> (a)</title></caption><table><tbody><thead><tr><th align="center" valign="middle" ></th><th align="center" valign="middle" >N</th><th align="center" valign="middle" >Minimum</th><th align="center" valign="middle" >Maximum</th><th align="center" valign="middle" >Mean</th><th align="center" valign="middle" >Std. Deviation</th></tr></thead><tr><td align="center" valign="middle" >AAA</td><td align="center" valign="middle" >184907</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >100</td><td align="center" valign="middle" >32.39</td><td align="center" valign="middle" >24.841</td></tr><tr><td align="center" valign="middle" >LOS</td><td align="center" valign="middle" >184894</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >358</td><td align="center" valign="middle" >9.10</td><td align="center" valign="middle" >15.352</td></tr></tbody></table></table-wrap><table-wrap id="2_2"><caption><title> (b)</title></caption><table><tbody><thead><tr><th align="center" valign="middle" ></th><th align="center" valign="middle" >N</th><th align="center" valign="middle" >Minimum</th><th align="center" valign="middle" >Maximum</th><th align="center" valign="middle" >Mean</th><th align="center" valign="middle" >Std. Deviation</th></tr></thead><tr><td align="center" valign="middle" >AAA</td><td align="center" valign="middle" >7046</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >100</td><td align="center" valign="middle" >46.31</td><td align="center" valign="middle" >27.196</td></tr><tr><td align="center" valign="middle" >LOS</td><td align="center" valign="middle" >6990</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >362</td><td align="center" valign="middle" >28.74</td><td align="center" valign="middle" >41.572</td></tr></tbody></table></table-wrap></table-wrap-group><p>Dichotomization of continuous variables is also common in clinical research, but it has some serious drawbacks for the statistical analysis, as it reduces the precision of the estimated effect sizes. Though grouping may help data presentation, notably in tables, categorization is unnecessary for statistical analysis. 
Here we consider the impact of converting continuous data to two groups (dichotomizing), as this is the most common approach in clinical research.</p><p>Within each model we estimated, for each effect, the log-odds as an effect size using continuous and categorized covariates. We found that the GLIMMIX has an advantage over the logistic regression and the GEE models. We calculated the optimal split for the AAA and LOS using the “Receiver Operating Characteristic” (ROC) curve. <xref ref-type="fig" rid="fig1">Figure 1</xref> and <xref ref-type="table" rid="table3">Table 3</xref> show that the optimal cut-off for LOS is 160 days, with a corresponding area under the curve of 73%. This means that the risk of death is significantly higher among patients who are hospitalized for 160 days or more relative to those who stay less than 160 days (corrected P-value = 0.0001). Additionally, <xref ref-type="fig" rid="fig2">Figure 2</xref> and <xref ref-type="table" rid="table4">Table 4</xref> show that the optimal split for AAA is 53 years. The area under the ROC curve corresponding to the dichotomized covariate AAA is 65%.</p><table-wrap id="table3" ><label><xref ref-type="table" rid="table3">Table 3</xref></label><caption><title> Area under the curve for the LOS optimal cut-off point</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Area</th><th align="center" valign="middle"  rowspan="2"  >Std. 
Error</th><th align="center" valign="middle"  rowspan="2"  >P-value</th><th align="center" valign="middle"  colspan="2"  >Asymptotic 95% Confidence Interval</th></tr></thead><tr><td align="center" valign="middle" >Lower Bound</td><td align="center" valign="middle" >Upper Bound</td></tr><tr><td align="center" valign="middle" >0.727</td><td align="center" valign="middle" >0.004</td><td align="center" valign="middle" >0.0001</td><td align="center" valign="middle" >0.720</td><td align="center" valign="middle" >0.734</td></tr></tbody></table></table-wrap><p>Corrected P-value = 0.0001; the optimal cut-off point is LOS ≥ 160 days.</p><table-wrap id="table4" ><label><xref ref-type="table" rid="table4">Table 4</xref></label><caption><title> Area under the curve for the AAA optimal cut-off point</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Area</th><th align="center" valign="middle"  rowspan="2"  >Std. Error</th><th align="center" valign="middle"  rowspan="2"  >P-value</th><th align="center" valign="middle"  colspan="2"  >Asymptotic 95% Confidence Interval</th></tr></thead><tr><td align="center" valign="middle" >Lower Bound</td><td align="center" valign="middle" >Upper Bound</td></tr><tr><td align="center" valign="middle" >0.647</td><td align="center" valign="middle" >0.004</td><td align="center" valign="middle" >0.0001</td><td align="center" valign="middle" >0.640</td><td align="center" valign="middle" >0.654</td></tr></tbody></table></table-wrap><p>Corrected P-value = 0.0001; the optimal cut-off point is age ≥ 53 years.</p><p>Dichotomizing leads to several problems. First, much information is lost, as can be seen from the increase in the estimated standard errors of the odds ratios. Moreover, the odds ratio point estimates are inflated as well and are therefore potentially biased. 
For example, in <xref ref-type="table" rid="table5">Table 5</xref> the estimated odds ratio of the dichotomized LOS is 14.53, while its value is 1.024 per additional day when LOS is measured on the continuous scale under the same model. The same remark holds if the fitted model is the GEE, as shown in <xref ref-type="table" rid="table6">Table 6</xref>. The estimates are somewhat more stable under the GLIMMIX, as shown in <xref ref-type="table" rid="table7">Table 7</xref>. One may conclude that dichotomization may increase the risk of a positive result being a false positive.</p><p>Categorical Age:</p><p>As an alternative to dichotomization, we categorized age in a meaningful way such that:</p><p>Group 1: Age is less than 14 years</p><p>Group 2: Age between 15 and 30</p><p>Group 3: Age between 31 and 59</p><p>Group 4: Age above 60.</p><p>When we plotted the mortality rate, with 95% confidence limits, against the 4 age categories, as shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>, there was an increasing trend in mortality</p><table-wrap id="table5" ><label><xref ref-type="table" rid="table5">Table 5</xref></label><caption><title> Estimating the odds ratios using the logistic regression models</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Effect</th><th align="center" valign="middle"  colspan="2"  >Continuous Covariates</th><th align="center" valign="middle"  colspan="2"  >Dichotomized Covariates</th></tr></thead><tr><td align="center" valign="middle" >OR</td><td align="center" valign="middle" >95% CI</td><td align="center" valign="middle" >OR</td><td align="center" valign="middle" >95% CI</td></tr><tr><td align="center" valign="middle" >AAA</td><td align="center" valign="middle" >1.021</td><td align="center" valign="middle" >1.02 - 1.022</td><td align="center" valign="middle" >2.88</td><td align="center" valign="middle" >2.75 - 3.03</td></tr><tr><td align="center" valign="middle" >LOS</td><td align="center" valign="middle" 
>1.024</td><td align="center" valign="middle" >1.023 - 1.025</td><td align="center" valign="middle" >14.53</td><td align="center" valign="middle" >11.86 - 17.80</td></tr><tr><td align="center" valign="middle" >AIC = 54522</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td><td align="center" valign="middle" >AIC = 57694</td><td align="center" valign="middle" ></td></tr></tbody></table></table-wrap><table-wrap id="table6" ><label><xref ref-type="table" rid="table6">Table 6</xref></label><caption><title> Estimating the odds ratios using the GEE models</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Effect</th><th align="center" valign="middle"  colspan="2"  >Continuous Covariates</th><th align="center" valign="middle"  colspan="2"  >Dichotomized Covariates</th></tr></thead><tr><td align="center" valign="middle" >OR</td><td align="center" valign="middle" >95% CI</td><td align="center" valign="middle" >OR</td><td align="center" valign="middle" >95% CI</td></tr><tr><td align="center" valign="middle" >AAA</td><td align="center" valign="middle" >1.025</td><td align="center" valign="middle" >1.02 - 1.031</td><td align="center" valign="middle" >3.22</td><td align="center" valign="middle" >2.304 - 4.511</td></tr><tr><td align="center" valign="middle" >LOS</td><td align="center" valign="middle" >1.030</td><td align="center" valign="middle" >1.024 - 1.036</td><td align="center" valign="middle" >19.88</td><td align="center" valign="middle" >10.665 - 37.308</td></tr><tr><td align="center" valign="middle" >QIC = 13192</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td><td align="center" valign="middle" >QIC = 14319</td><td align="center" valign="middle" ></td></tr></tbody></table></table-wrap><table-wrap id="table7" ><label><xref ref-type="table" rid="table7">Table 7</xref></label><caption><title> Estimating the odds ratios using the GLIMMIX 
models</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Effect</th><th align="center" valign="middle"  colspan="2"  >Continuous Covariates</th><th align="center" valign="middle"  colspan="2"  >Dichotomized Covariates</th></tr></thead><tr><td align="center" valign="middle" >OR</td><td align="center" valign="middle" >95% CI</td><td align="center" valign="middle" >OR</td><td align="center" valign="middle" >95% CI</td></tr><tr><td align="center" valign="middle" >AAA</td><td align="center" valign="middle" >1.014</td><td align="center" valign="middle" >1.012 - 1.017</td><td align="center" valign="middle" >1.68</td><td align="center" valign="middle" >1.48 - 1.90</td></tr><tr><td align="center" valign="middle" >LOS</td><td align="center" valign="middle" >1.013</td><td align="center" valign="middle" >1.011 - 1.015</td><td align="center" valign="middle" >2.42</td><td align="center" valign="middle" >1.48 - 3.97</td></tr><tr><td align="center" valign="middle"  colspan="2"  >$\sigma_b^2 = 3.24 \pm 0.50$</td><td align="center" valign="middle" ></td><td align="center" valign="middle"  colspan="2"  >$\sigma_b^2 = 3.56 \pm 0.54$</td></tr><tr><td align="center" valign="middle"  colspan="2"  >Scaled deviance = 0.62</td><td align="center" valign="middle" ></td><td align="center" valign="middle"  colspan="2"  >Scaled deviance = 0.65</td></tr><tr><td align="center" valign="middle"  colspan="2"  >−2 Res log-likelihood = 477379</td><td align="center" valign="middle" ></td><td align="center" valign="middle"  colspan="2"  >−2 Res log pseudo-likelihood = 478461</td></tr></tbody></table></table-wrap><p>as age groups moved up. The one-degree-of-freedom Cochran-Armitage test for trend was quite significant, with p-value &lt; 0.001.</p><p>$\sigma_b^2 = 3.56 \pm 0.54$, scaled deviance = 0.61, and −2 Res log pseudo-likelihood = 477030.</p><p>The GLIMMIX estimation when age is categorized into 4 groups is given in <xref ref-type="table" rid="table8">Table 8</xref>. 
The odds ratio estimate for LOS, which is highly correlated with age, has improved and is in fact almost the same as the odds ratio estimated when age was taken as a continuous covariate. There is almost no change in the between-DRG variance component estimate, <inline-formula><tex-math>\sigma_b^2 = 3.56</tex-math></inline-formula>, which is consistent with the hypothesis that the measured covariates are not correlated with the random component in the model. The scaled deviance is less than one, indicating that the model has captured the effect of the measured and the unmeasured covariates. The value of −2 Res log pseudo-likelihood (which is equivalently defined as the AIC) indicates that the model goodness of fit is also acceptable.</p><p>Under the GLIMMIX model, whose results are summarized in <xref ref-type="table" rid="table8">Table 8</xref>, the CIHI [<xref ref-type="bibr" rid="scirp.90198-ref9">9</xref>] index of hospital performance, when mortality is the outcome of interest, is computed from:</p><p><disp-formula><tex-math>\mathrm{NUM} = \sum O_i = 7053, \qquad \mathrm{DEN} = \sum E_i = 7117</tex-math></disp-formula></p><p>giving CIHI = NUM/DEN = 7053/7117 ≈ 0.991, which is less than unity. This indicates that the hospital's risk-adjusted mortality meets the CIHI criteria for quality.</p></sec><sec id="s5"><title>5. Discussion</title><p>Although the fitted models produced odds ratio estimates that agreed in direction, trend, and significance, the magnitudes of the point estimates and the widths of the confidence intervals varied. The logistic regression clearly produced the narrowest confidence intervals. This is to be expected, since this model ignores the correlation among responses within each DRG, and hence the standard errors are underestimated. The GEE, introduced by Liang and Zeger [<xref ref-type="bibr" rid="scirp.90198-ref10">10</xref>], is suitable for the analysis of clustered data. The GEE estimation produced point estimates of similar magnitude but wider, and therefore less precise, confidence intervals. 
The GEE is expected to produce consistent estimates even if the within-DRG correlation structure is misspecified. In our case we specified an exchangeable working correlation to represent the average within-DRG correlation. Finally, the GLIMMIX produced an entirely different set of estimates, together with an estimate of an additional parameter, the variance component, which quantifies the between-DRG variation.</p><table-wrap id="table8" ><label><xref ref-type="table" rid="table8">Table 8</xref></label><caption><title>GLIMMIX: Age is categorized into 4 groups with group 4 being the reference</title></caption><table><thead><tr><th align="center" valign="middle" >Effect</th><th align="center" valign="middle" >OR</th><th align="center" valign="middle" >95% CI</th></tr></thead><tbody><tr><td align="center" valign="middle" >Age group 1</td><td align="center" valign="middle" >0.901</td><td align="center" valign="middle" >0.816 - 0.996</td></tr><tr><td align="center" valign="middle" >Age group 2</td><td align="center" valign="middle" >1.805</td><td align="center" valign="middle" >1.684 - 1.935</td></tr><tr><td align="center" valign="middle" >Age group 3</td><td align="center" valign="middle" >3.589</td><td align="center" valign="middle" >3.353 - 3.842</td></tr><tr><td align="center" valign="middle" >LOS</td><td align="center" valign="middle" >1.024</td><td align="center" valign="middle" >1.023 - 1.025</td></tr></tbody></table></table-wrap><p>When applied to a dichotomous response variable, the GEE results are to be interpreted at the population-average (PA) level and do not account for heterogeneity across the DRGs. The GEE indicates that the risk of death depends on AAA and LOS, which are measured at the individual level, and not on any random effect across the clusters of DRGs. 
This model uses a working correlation as an instrument to account for the within-DRG variation.</p><p>In contrast, the fundamental feature of the GLIMMIX is the assumption of heterogeneity across DRGs in our study population. The GLIMMIX interprets the estimated model parameter (the odds ratio) as conditional on DRG-specific intercepts. These intercepts reflect a natural heterogeneity due to unmeasured covariates. For example, the presence or absence of co-morbid conditions, or re-admission (yes/no), may produce a different trajectory for the risk of death. Therefore, as expected, there is a sizable difference in the magnitude of the uncertainties between the GEE and the GLIMMIX.</p></sec><sec id="s6"><title>6. Conclusion</title><p>In conclusion, health services and quality of care research might use any of the above models, depending on the scientific question posed. The GEE will produce estimates that are of most interest to health services quality managers and policy makers who evaluate hospital performance at an average level. On the other hand, the GLIMMIX will have an RDRG-level interpretation.</p></sec><sec id="s7"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s8"><title>Cite this paper</title><p>Shoukri, M.M., Algahtani, S.N., Eldali, A.M., AlMarzouqi, M.R. and Al-Ageel, S.M. (2019) Analysis of Hospital Mortality Data: The Role of DRG’s. Open Journal of Statistics, 9, 62-73. https://doi.org/10.4236/ojs.2019.91006</p></sec></body><back><ref-list><title>References</title><ref id="scirp.90198-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Blackford, G. (2008) Discover How to Survive Medicare Severity DRGs with a Successful Clinical Documentation Improvement Program and Present on Admission Indicators. 
2008 AHIMA Convention Proceedings, 11-16 October 2008, Seattle, WA.</mixed-citation></ref><ref id="scirp.90198-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Glance, L.G., Dick, A.W., Mukamel, D.B., et al. (2010) How Well Do Hospital Mortality Rates Reported in the New York State CABG Report Cards Predict Subsequent Hospital Performance? Medical Care, 48, 466-471. 
https://doi.org/10.1097/MLR.0b013e3181d568f7</mixed-citation></ref><ref id="scirp.90198-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Mayes, R. (2007) The Origins, Development, and Passage of Medicare's Revolutionary Prospective Payment System. Journal of the History of Medicine and Allied Sciences, 62, 21-55. https://doi.org/10.1093/jhmas/jrj038</mixed-citation></ref><ref id="scirp.90198-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Donabedian, A. (1988) The Quality of Care. How Can It Be Assessed? JAMA, 260, 1743-1748. https://doi.org/10.1001/jama.1988.03410120089033</mixed-citation></ref><ref id="scirp.90198-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Krumholz, H.M. (2008) Outcomes Research: Generating Evidence for Best Practice. Circulation, 118, 309-318.  
https://doi.org/10.1161/CIRCULATIONAHA.107.690917</mixed-citation></ref><ref id="scirp.90198-ref6"><label>6</label><mixed-citation publication-type="journal" xlink:type="simple"><name name-style="western"><surname>Martin</surname><given-names> G.</given-names></name>, <etal>et al</etal>. (<year>2008</year>) <article-title>MS-DRG Journey: How One Hospital Joined Together to Successfully Implement MS-DRGs</article-title>. <source>Journal of AHIMA</source>, <volume>79</volume>, <fpage>70</fpage>-<lpage>72</lpage>.</mixed-citation></ref><ref id="scirp.90198-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">SAS Institute (2010) STAT-SAS, Version 9.4. SAS Institute, Cary, North Carolina.</mixed-citation></ref><ref id="scirp.90198-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Shoukri, M. (2010) Measures of Inter-Observer Agreement and Reliability. Chapman and Hall-CRC Press, Boca Raton. https://doi.org/10.1201/b10433</mixed-citation></ref><ref id="scirp.90198-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Canadian Institute for Health Information (2009) The CIHI Data Quality Framework. http://www.cihi.ca/CIHI-ext-portal/pdf/internet/DATA</mixed-citation></ref><ref id="scirp.90198-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Liang, K.Y. and Zeger, S.L. (1993) Regression Analysis for Correlated Data. Annual Review of Public Health, 14, 43-68. 
https://doi.org/10.1146/annurev.pu.14.050193.000355</mixed-citation></ref></ref-list></back></article>