<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">AM</journal-id><journal-title-group><journal-title>Applied Mathematics</journal-title></journal-title-group><issn pub-type="epub">2152-7385</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/am.2016.77053</article-id><article-id pub-id-type="publisher-id">AM-65573</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Physics&amp;Mathematics</subject></subj-group></article-categories><title-group><article-title>
 
 
  Statistical Analysis of Fuzzy Linear Regression Model Based on Centroid Method
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>iwu</surname><given-names>Zhang</given-names></name><xref ref-type="aff" rid="aff1"><sub>1</sub></xref></contrib></contrib-group><aff id="aff1"><label>1</label><addr-line>School of Mathematical Science, Yancheng Teachers University, Yancheng, China</addr-line></aff><author-notes><corresp id="cor1">* E-mail:</corresp></author-notes><pub-date pub-type="epub"><day>18</day><month>04</month><year>2016</year></pub-date><volume>07</volume><issue>07</issue><fpage>579</fpage><lpage>586</lpage><history><date date-type="received"><day>9</day>	<month>September</month>	<year>2015</year></date><date date-type="rev-recd"><day>accepted</day>	<month>15</month>	<year>April</year>	</date><date date-type="accepted"><day>18</day>	<month>April</month>	<year>2016</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  This paper transforms fuzzy numbers into crisp numbers using the centroid method, so that the fuzzy linear regression model can be studied as a traditional linear regression model. The model’s input and output are fuzzy numbers, and the regression coefficients are crisp numbers. The paper considers parameter estimation and impact analysis based on data deletion. A worked example and a comparison with other models indicate that the proposed model is easy to apply and gives a smaller total error in that example.
 
</p></abstract><kwd-group><kwd>Centroid Method</kwd><kwd> Fuzzy Linear Regression Model</kwd><kwd> Parameter Estimation</kwd><kwd> Data Deletion Model</kwd><kwd> Cook Distance</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Regression analysis is an important and comprehensive approach to analyzing the relationship between a dependent variable and one or more independent variables; it has a very wide range of applications in engineering, the social sciences, economics and finance. Traditional regression analysis methods often require that both the independent and the dependent variables be crisp data. However, the data in practical problems are often fuzzy rather than crisp, for example observations described in language, such as something large, something heavy, or approximately equal to 3. Because of such ambiguous indicators, analyzing these problems by traditional regression alone cannot give satisfactory and complete results.</p><p>By means of Zadeh’s [<xref ref-type="bibr" rid="scirp.65573-ref1">1</xref>] fuzzy set theory, researchers have established various fuzzy regression models and derived their solutions. After Tanaka et al. [<xref ref-type="bibr" rid="scirp.65573-ref2">2</xref>] , Diamond [<xref ref-type="bibr" rid="scirp.65573-ref3">3</xref>] estimated regression coefficients using the fuzzy least squares method (FLS), which is similar to the traditional LS estimate. Savic and Pedrycz [<xref ref-type="bibr" rid="scirp.65573-ref4">4</xref>] established a two-step model of fuzzy regression analysis by combining FLS with linear programming. More recently, Chang [<xref ref-type="bibr" rid="scirp.65573-ref5">5</xref>] compared fuzzy regression methods and summarized three approaches: the minimum fuzziness criterion, the least squares fitting criterion, and interval regression analysis. 
The main difference between fuzzy regression and conventional regression is that the residual is a fuzzy variable in fuzzy regression but a random variable in traditional regression. Fuzzy regression models can be divided into several cases: the regression coefficients are fuzzy numbers; some of the variables are fuzzy; or both the input and the output variables are fuzzy [<xref ref-type="bibr" rid="scirp.65573-ref6">6</xref>] - [<xref ref-type="bibr" rid="scirp.65573-ref10">10</xref>] .</p><p>This paper starts from fuzzy input and output variables and transforms them into crisp data by the centroid method [<xref ref-type="bibr" rid="scirp.65573-ref11">11</xref>] , so that the fuzzy linear regression problem (with crisp regression coefficients) becomes a traditional linear regression problem. The problems of fuzzy linear regression analysis can thus be addressed with the estimation and statistical diagnosis methods of the traditional linear regression model.</p></sec><sec id="s2"><title>2. Fuzzy Regression Model and Parameter Estimation</title><p>Assume that <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x6.png" xlink:type="simple"/></inline-formula> is an observational data set of fuzzy inputs and fuzzy outputs, and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x7.png" xlink:type="simple"/></inline-formula> (<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x8.png" xlink:type="simple"/></inline-formula> is the set of fuzzy numbers on the real number set R). 
Then the fuzzy linear regression model can be expressed as</p><disp-formula id="scirp.65573-formula37"><label>(1)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x9.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x10.png" xlink:type="simple"/></inline-formula>, and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x11.png" xlink:type="simple"/></inline-formula> is the i-th observation error.</p><p>Assume</p><disp-formula id="scirp.65573-formula38"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x12.png"  xlink:type="simple"/></disp-formula><p>Then the fuzzy linear regression model (1) can be expressed in matrix form as</p><disp-formula id="scirp.65573-formula39"><label>(2)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x13.png"  xlink:type="simple"/></disp-formula><p>where the membership functions of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x14.png" xlink:type="simple"/></inline-formula> are <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x15.png" xlink:type="simple"/></inline-formula>.</p><p>For convenience of discussion, we assume that all observations are triangular fuzzy numbers</p><disp-formula id="scirp.65573-formula40"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x16.png"  xlink:type="simple"/></disp-formula><p>where the <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x17.png" xlink:type="simple"/></inline-formula> are all real numbers. 
Their membership functions are</p><disp-formula id="scirp.65573-formula41"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x18.png"  xlink:type="simple"/></disp-formula><disp-formula id="scirp.65573-formula42"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x19.png"  xlink:type="simple"/></disp-formula><p>To obtain the estimator of the regression coefficient <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x20.png" xlink:type="simple"/></inline-formula>, a natural idea is to turn the fuzzy observation data <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x21.png" xlink:type="simple"/></inline-formula> into crisp data and then use the traditional least squares method to calculate the estimator of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x22.png" xlink:type="simple"/></inline-formula> [<xref ref-type="bibr" rid="scirp.65573-ref12">12</xref>] . 
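A minimal sketch of this idea in Python, under stated assumptions rather than as the implementation used in this paper: each triangular fuzzy number is assumed to be stored as its three vertices (left, peak, right), so that its centroid is the mean of the three; an intercept column is assumed in the crisp model; the function names centroid and fit_centroid_ols, the array layout and the sample data are hypothetical.
<preformat>
```python
import numpy as np


def centroid(tri):
    """Centroid of triangular fuzzy numbers stored as (left, peak, right).

    The last axis of `tri` holds the three vertices; the centroid of a
    triangular membership function is the mean of those vertices.
    """
    tri = np.asarray(tri, dtype=float)
    return tri.mean(axis=-1)


def fit_centroid_ols(x_tri, y_tri):
    """Defuzzify inputs and outputs by centroid, then fit crisp coefficients.

    x_tri: array of shape (n, p, 3), n observations of p fuzzy inputs.
    y_tri: array of shape (n, 3), n fuzzy outputs.
    Returns the coefficient vector of length p + 1, intercept first.
    """
    x_c = centroid(x_tri)                      # crisp inputs, shape (n, p)
    y_c = centroid(y_tri)                      # crisp outputs, shape (n,)
    design = np.column_stack([np.ones(len(y_c)), x_c])
    beta, *_ = np.linalg.lstsq(design, y_c, rcond=None)
    return beta


# Hypothetical data: four observations of one fuzzy input and one fuzzy output.
x_tri = np.array([[[0, 1, 2]], [[1, 2, 3]], [[2, 3, 4]], [[3, 4, 5]]])
y_tri = np.array([[2, 3, 4], [4, 5, 6], [6, 7, 8], [8, 9, 10]])
# The centroids satisfy y = 1 + 2x exactly, so beta is close to [1, 2].
print(fit_centroid_ols(x_tri, y_tri))
```
</preformat>
If the data are stored instead as a center with left and right spreads, they would first have to be converted to vertices before calling centroid. 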
There are many ways to transform fuzzy data into crisp data; one of the most common is the centroid method [<xref ref-type="bibr" rid="scirp.65573-ref7">7</xref>] .</p><p>The fuzzy data <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x23.png" xlink:type="simple"/></inline-formula> are transformed into crisp data <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x24.png" xlink:type="simple"/></inline-formula> (<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x25.png" xlink:type="simple"/></inline-formula> are usually called the centroids of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x26.png" xlink:type="simple"/></inline-formula>) with the formula</p><disp-formula id="scirp.65573-formula43"><label>(3)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x27.png"  xlink:type="simple"/></disp-formula><p>Obviously, when an observation is a symmetric triangular fuzzy number, its centroid is the center of symmetry of that number.</p><p>Lemma 1 [<xref ref-type="bibr" rid="scirp.65573-ref13">13</xref>] Consider the 
traditional linear regression model</p><disp-formula id="scirp.65573-formula44"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x28.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x29.png" xlink:type="simple"/></inline-formula></p><p><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x30.png" xlink:type="simple"/></inline-formula>.</p><p>Then the least squares estimator of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x31.png" xlink:type="simple"/></inline-formula> is</p><p><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x32.png" xlink:type="simple"/></inline-formula>.</p><p>According to Lemma 1, it is easy to get the estimator for the parameter in model (1) or (2).</p><p>Theorem 2.1 Assume that the fuzzy linear regression model is</p><disp-formula id="scirp.65573-formula45"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x33.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x34.png" xlink:type="simple"/></inline-formula> are observed triangular fuzzy data and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x35.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x36.png" xlink:type="simple"/></inline-formula>. 
Then the estimator for <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x37.png" xlink:type="simple"/></inline-formula> is</p><disp-formula id="scirp.65573-formula46"><label>(4)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x38.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x39.png" xlink:type="simple"/></inline-formula> <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x40.png" xlink:type="simple"/></inline-formula> can be calculated by (3).</p><p>When the fuzzy data reduce to crisp data, the least squares estimator of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x41.png" xlink:type="simple"/></inline-formula> is the conventional least squares estimator.</p><p>Specifically, when the fuzzy linear regression model is <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x42.png" xlink:type="simple"/></inline-formula>, we have</p><p>Corollary Let <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x43.png" xlink:type="simple"/></inline-formula> be a set of triangular fuzzy data for the fuzzy linear regression model <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x44.png" xlink:type="simple"/></inline-formula>. Then</p><disp-formula id="scirp.65573-formula47"><label>(5)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x45.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x46.png" xlink:type="simple"/></inline-formula></p><p>If the fuzzy observation data <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x47.png" xlink:type="simple"/></inline-formula> are not triangular fuzzy data, the centroid method can still be applied.</p></sec><sec id="s3"><title>3. The Evaluation Performance of Fuzzy Linear Regression Model</title><p>To evaluate the performance of a fuzzy regression model, Kim and Bishu [<xref ref-type="bibr" rid="scirp.65573-ref11">11</xref>] introduced the absolute difference between the observed fuzzy dependent variable and the estimated one as</p><disp-formula id="scirp.65573-formula48"><label>(6)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x48.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x49.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x50.png" xlink:type="simple"/></inline-formula> are the supports of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x51.png" xlink:type="simple"/></inline-formula> 
and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x52.png" xlink:type="simple"/></inline-formula>, respectively.</p><p>Essentially, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x53.png" xlink:type="simple"/></inline-formula> is the estimated error term: the smaller the value of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x54.png" xlink:type="simple"/></inline-formula>, the better the fit of the fuzzy linear regression model. Nasrabadi and Nasrabadi [<xref ref-type="bibr" rid="scirp.65573-ref12">12</xref>] showed the general calculation steps. 
Kao and Chyu [<xref ref-type="bibr" rid="scirp.65573-ref13">13</xref>] showed that, for the fuzzy linear regression model <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x55.png" xlink:type="simple"/></inline-formula>, the formula for <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x56.png" xlink:type="simple"/></inline-formula> is</p><disp-formula id="scirp.65573-formula49"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x57.png"  xlink:type="simple"/></disp-formula><p>Substituting <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x58.png" xlink:type="simple"/></inline-formula> from (5) into the above formula gives</p><disp-formula id="scirp.65573-formula50"><label>(7)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x59.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x60.png" xlink:type="simple"/></inline-formula>. 
The value of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x61.png" xlink:type="simple"/></inline-formula> can be determined by the following method</p><disp-formula id="scirp.65573-formula51"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x62.png"  xlink:type="simple"/></disp-formula></sec><sec id="s4"><title>4. Parameter Estimation and Impact Analysis of the Data Deleted Fuzzy Linear Regression Model</title><sec id="s4_1"><title>4.1. The Fuzzy Linear Regression Model Based on Data Deletion</title><p>For the fuzzy linear regression model (1) or (2), in order to evaluate the role and impact of the i-th data point <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x63.png" xlink:type="simple"/></inline-formula> in the regression analysis, we can compare the inference results before and after deleting the i-th data point <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x64.png" xlink:type="simple"/></inline-formula>. We can also test whether this point is an outlier. 
The fuzzy linear regression model with the i-th point deleted is called a case deletion fuzzy linear regression model (FCDM); its component form and matrix form are, respectively,</p><disp-formula id="scirp.65573-formula52"><label>(8)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x65.png"  xlink:type="simple"/></disp-formula><disp-formula id="scirp.65573-formula53"><label>(9)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x66.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x67.png" xlink:type="simple"/></inline-formula> are the vectors or matrices obtained by deleting the i-th data point from <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x68.png" xlink:type="simple"/></inline-formula>, respectively, and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x69.png" xlink:type="simple"/></inline-formula> denotes the least squares estimator of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x70.png" xlink:type="simple"/></inline-formula> in model (9).</p></sec><sec id="s4_2"><title>4.2. 
The Parameter Estimate of Case Deletion Fuzzy Linear Regression Model</title><p>According to Lemma 1 and Theorem 2.1, we can obtain the least squares estimator <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x71.png" xlink:type="simple"/></inline-formula>.</p><p>Theorem 4.1 For the case deletion fuzzy linear regression model (8) or (9), the least squares estimator of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x72.png" xlink:type="simple"/></inline-formula> is</p><p><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x73.png" xlink:type="simple"/></inline-formula> and</p><disp-formula id="scirp.65573-formula54"><label>(10)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/1-7402896x74.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x75.png" xlink:type="simple"/></inline-formula>, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x76.png" xlink:type="simple"/></inline-formula> is the vector composed of the elements of the i-th row of the matrix <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x77.png" xlink:type="simple"/></inline-formula>, and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x78.png" xlink:type="simple"/></inline-formula> is the main diagonal element of the matrix <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x79.png" xlink:type="simple"/></inline-formula>. The proof of formula (10) can be found in Wei et al. [<xref ref-type="bibr" rid="scirp.65573-ref13">13</xref>] .</p><p>Theorem 4.1 gives a formula for the regression coefficient after the i-th data point is deleted and also shows the relationship between the regression coefficients before and after the deletion. It is the basis on which we evaluate whether this point is an outlier or a strong impact point. If the i-th data point is a normal point, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x80.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x81.png" xlink:type="simple"/></inline-formula> should differ little. If they differ greatly, the i-th data point seriously affects the estimation of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x82.png" xlink:type="simple"/></inline-formula>, and this data point may be an outlier or a strong impact point.</p></sec><sec id="s4_3"><title>4.3. 
The Impact Analysis of the Case Deletion Fuzzy Linear Regression Model</title><p>Existing methods for fuzzy regression models do not consider that actual data often contain outliers or strong impact points. However, because of gross errors, rounding errors and other interfering factors, it is difficult to prevent actual data from being mixed with a certain proportion of outliers or strong impact points. Once the data are contaminated with outliers, these methods face serious challenges and may even lead to wrong conclusions. Studying the impact of the data on the model is an important part of statistical diagnosis, and one of the most straightforward ways is to delete data [<xref ref-type="bibr" rid="scirp.65573-ref6">6</xref>] . Since we transform fuzzy data into crisp data, the fuzzy linear regression problem becomes a traditional linear regression problem. Therefore, impact analysis for the data-deleted fuzzy regression model reduces to impact analysis for a traditional data-deleted linear regression model.</p><p>Although we can obtain <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x83.png" xlink:type="simple"/></inline-formula> by formula (10), it is a vector, which is difficult to compare. In practice, Cook’s distance is often used to measure <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x84.png" xlink:type="simple"/></inline-formula>. 
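For orientation, a minimal Python sketch of this diagnostic on data that have already been defuzzified, under stated assumptions: it uses the standard textbook form of Cook’s distance, the quadratic form of the coefficient change in the Gram matrix of the design divided by p times the residual variance estimate RSS/(n - p), and this scaling may differ from the definition adopted in this paper; the function names and the sample data are hypothetical. The first function works from a single full fit, the second refits with each point deleted, and the two agree.
<preformat>
```python
import numpy as np


def cooks_distance(design, y):
    """Cook's distance from one full fit, via residuals and leverages."""
    design = np.asarray(design, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = design.shape
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    # Leverage h_ii = x_i' (X'X)^{-1} x_i, the diagonal of the hat matrix.
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    sigma2 = resid @ resid / (n - p)
    return resid**2 * leverage / (p * sigma2 * (1.0 - leverage) ** 2)


def cooks_distance_by_deletion(design, y):
    """Same quantity computed literally: refit with each point deleted."""
    design = np.asarray(design, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = design.shape
    gram = design.T @ design
    beta = np.linalg.solve(gram, design.T @ y)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - p)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i               # mask that drops point i
        beta_i = np.linalg.lstsq(design[keep], y[keep], rcond=None)[0]
        diff = beta - beta_i
        out[i] = diff @ gram @ diff / (p * sigma2)
    return out


# Hypothetical centroid data; the last response is deliberately shifted upward.
x_c = np.arange(1.0, 7.0)
y_c = np.array([1.1, 1.9, 3.2, 3.9, 5.1, 9.0])
design = np.column_stack([np.ones_like(x_c), x_c])
d = cooks_distance(design, y_c)
print(np.argmax(d))  # index of the most influential point
```
</preformat>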
Cook’s distance is one of the most important diagnostic statistics; it was originally proposed by Cook in 1977 on the basis of the confidence region of the parameters [<xref ref-type="bibr" rid="scirp.65573-ref14">14</xref>] .</p><p>Cook’s distance is defined as</p><disp-formula id="scirp.65573-formula55"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x85.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x86.png" xlink:type="simple"/></inline-formula></p><p>A large <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x87.png" xlink:type="simple"/></inline-formula> shows that the estimate <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x88.png" xlink:type="simple"/></inline-formula> is far away from the true parameter <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x89.png" xlink:type="simple"/></inline-formula>, and that <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x90.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x91.png" xlink:type="simple"/></inline-formula> differ greatly.</p><p>The following theorem gives a simple formula for calculating Cook’s distance.</p><p>Theorem 4.2 [<xref ref-type="bibr" rid="scirp.65573-ref13">13</xref>] For the fuzzy linear regression model (2) or (9), Cook’s distance can be expressed as</p><disp-formula id="scirp.65573-formula56"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x92.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x93.png" xlink:type="simple"/></inline-formula>. 
<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x93.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x94.png" xlink:type="simple"/></inline-formula> are the fitted values before and after deleting the i-th data point.</p><p>In a concrete data analysis, we first calculate Cook’s distance point by point, written as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x95.png" xlink:type="simple"/></inline-formula>, and then use a list or figure to find one or more particularly large <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x95.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x96.png" xlink:type="simple"/></inline-formula> (though not necessarily large in absolute terms). A data point with a large effect on the parameter estimates may be an outlier or a strong impact point.</p></sec></sec><sec id="s5"><title>5. 
Analysis of Practical Example</title><p>The following study shows an application of the centroid method and compares the method proposed in this paper with Diamond’s method and Sakawa-Yano’s method.</p><p>The data in <xref ref-type="table" rid="table1">Table 1</xref> are triangular fuzzy numbers. We establish the fuzzy linear regression model <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x97.png" xlink:type="simple"/></inline-formula> and discuss the model’s error term and outlier data points.</p><p>By Theorem 2.1 (centroid), we obtain the fuzzy linear regression equation</p><disp-formula id="scirp.65573-formula57"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x98.png"  xlink:type="simple"/></disp-formula><p>Using Diamond’s method [<xref ref-type="bibr" rid="scirp.65573-ref3">3</xref>], we obtain the fuzzy linear regression equation</p><disp-formula id="scirp.65573-formula58"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x99.png"  xlink:type="simple"/></disp-formula><p>Similarly, using Sakawa-Yano’s method [<xref ref-type="bibr" rid="scirp.65573-ref15">15</xref>], we obtain</p><disp-formula id="scirp.65573-formula59"><graphic  xlink:href="http://html.scirp.org/file/1-7402896x100.png"  xlink:type="simple"/></disp-formula><p>By formula (7), we calculate the model’s error term <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x101.png" xlink:type="simple"/></inline-formula> and its sum for the centroid method, Diamond’s method and Sakawa-Yano’s method; the results are listed in <xref ref-type="table" rid="table1">Table 1</xref>. <xref ref-type="table" rid="table1">Table 1</xref> shows that the sum of the error terms under the centroid method (7.470) is smaller than under Diamond’s method (7.709) and Sakawa-Yano’s method (9.431). 
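</p><p>As a quick check of the totals in <xref ref-type="table" rid="table1">Table 1</xref>, the following minimal Python sketch sums the tabulated error terms for each method. The values are transcribed from the table; the dictionary keys and the helper name total_errors are illustrative assumptions, not names used in the paper.</p><preformat preformat-type="code">
```python
# Error terms transcribed from Table 1 (one value per data point, No. 1 to 8).
# The method labels used as dictionary keys are hypothetical names.
errors = {
    "centroid": [0.848, 0.208, 1.489, 0.910, 0.760, 1.449, 1.000, 0.806],
    "diamond": [0.861, 0.227, 1.520, 0.945, 0.785, 1.477, 1.060, 0.834],
    "sakawa_yano": [0.633, 0.453, 1.613, 1.165, 0.770, 1.977, 1.368, 1.452],
}


def total_errors(table):
    """Return the sum of the error terms per method, rounded to 3 decimals."""
    return {method: round(sum(values), 3) for method, values in table.items()}


totals = total_errors(errors)

# The method with the smallest total error is the best fit by this criterion.
best = min(totals, key=totals.get)

for method, total in totals.items():
    print(method, total)
print("smallest total:", best)
```
</preformat><p>Under these assumptions, the totals agree with the last row of <xref ref-type="table" rid="table1">Table 1</xref> (7.470, 7.709 and 9.431). The comparison concerns the totals only: for data point No. 1, for example, the Sakawa-Yano error (0.633) is smaller than the centroid error (0.848).</p><p>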
Thus, judged by the sum of the error terms, the fuzzy linear regression model fitted by the centroid method performs better than those fitted by Diamond’s method and Sakawa-Yano’s method.</p><p><xref ref-type="fig" rid="fig1">Figure 1</xref> was produced with Matlab.</p><p><xref ref-type="table" rid="table2">Table 2</xref> and <xref ref-type="fig" rid="fig1">Figure 1</xref> show Cook’s distance under the centroid method and the corresponding scatter plot, respectively. Data point No. 7 has the largest Cook’s distance (0.2735), which indicates that it is an outlier or strong impact point.</p><fig id="fig1"  position="float"><label><xref ref-type="fig" rid="fig1">Figure 1</xref></label><caption><title> The scatter plot of Cook’s distance under centroid method</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/1-7402896x102.png"/></fig><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> Data from Sakawa and Yano [<xref ref-type="bibr" rid="scirp.65573-ref15">15</xref>] </title></caption><table><thead><tr><th align="center" valign="middle"  rowspan="2"  >No.</th><th align="center" valign="middle"  rowspan="2"  ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x103.png" xlink:type="simple"/></inline-formula></th><th align="center" valign="middle"  rowspan="2"  ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x104.png" xlink:type="simple"/></inline-formula></th><th align="center" valign="middle"  colspan="3"  >Error term <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x105.png" xlink:type="simple"/></inline-formula></th></tr><tr><th align="center" valign="middle" >Centroid method</th><th align="center" valign="middle" >Diamond</th><th align="center" valign="middle" >Sakawa-Yano</th></tr></thead><tbody><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >(1.5, 2.0, 2.5)</td><td 
align="center" valign="middle" >(3.5, 4.0, 4.5)</td><td align="center" valign="middle" >0.848</td><td align="center" valign="middle" >0.861</td><td align="center" valign="middle" >0.633</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >(3.0, 3.5, 4.0)</td><td align="center" valign="middle" >(5.0, 5.5, 6.0)</td><td align="center" valign="middle" >0.208</td><td align="center" valign="middle" >0.227</td><td align="center" valign="middle" >0.453</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >(4.5, 5.5, 6.5)</td><td align="center" valign="middle" >(6.5, 7.5, 8.5)</td><td align="center" valign="middle" >1.489</td><td align="center" valign="middle" >1.520</td><td align="center" valign="middle" >1.613</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >(6.5, 7.0, 7.5)</td><td align="center" valign="middle" >(6.0, 6.5, 7.0)</td><td align="center" valign="middle" >0.910</td><td align="center" valign="middle" >0.945</td><td align="center" valign="middle" >1.165</td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >(8.0, 8.5, 9.0)</td><td align="center" valign="middle" >(8.0, 8.5, 9.0)</td><td align="center" valign="middle" >0.760</td><td align="center" valign="middle" >0.785</td><td align="center" valign="middle" >0.770</td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >(9.5, 10.5, 11.5)</td><td align="center" valign="middle" >(7.0, 8.0, 9.0)</td><td align="center" valign="middle" >1.449</td><td align="center" valign="middle" >1.477</td><td align="center" valign="middle" >1.977</td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >(10.5, 11.0, 11.5)</td><td align="center" valign="middle" >(10.0, 10.5, 11.0)</td><td align="center" valign="middle" >1.000</td><td align="center" valign="middle" >1.060</td><td align="center" valign="middle" 
>1.368</td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >(12.0, 12.5, 13.0)</td><td align="center" valign="middle" >(9.0, 9.5, 10.0)</td><td align="center" valign="middle" >0.806</td><td align="center" valign="middle" >0.834</td><td align="center" valign="middle" >1.452</td></tr><tr><td align="center" valign="middle" >Sum of error terms</td><td align="center" valign="middle" ></td><td align="center" valign="middle" ></td><td align="center" valign="middle" >7.470</td><td align="center" valign="middle" >7.709</td><td align="center" valign="middle" >9.431</td></tr></tbody></table></table-wrap><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> Cook’s distance under centroid method</title></caption><table><thead><tr><th align="center" valign="middle" >No.</th><th align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x106.png" xlink:type="simple"/></inline-formula></th><th align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x107.png" xlink:type="simple"/></inline-formula></th><th align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/1-7402896x108.png" xlink:type="simple"/></inline-formula></th></tr></thead><tbody><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >(1.5, 2.0, 2.5)</td><td align="center" valign="middle" >(3.5, 4.0, 4.5)</td><td align="center" valign="middle" >0.2267</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >(3.0, 3.5, 4.0)</td><td align="center" valign="middle" >(5.0, 5.5, 6.0)</td><td align="center" valign="middle" >0.0030</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >(4.5, 5.5, 6.5)</td><td align="center" valign="middle" >(6.5, 7.5, 8.5)</td><td align="center" 
valign="middle" >0.1199</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >(6.5, 7.0, 7.5)</td><td align="center" valign="middle" >(6.0, 6.5, 7.0)</td><td align="center" valign="middle" >0.0362</td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >(8.0, 8.5, 9.0)</td><td align="center" valign="middle" >(8.0, 8.5, 9.0)</td><td align="center" valign="middle" >0.0202</td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >(9.5, 10.5, 11.5)</td><td align="center" valign="middle" >(7.0, 8.0, 9.0)</td><td align="center" valign="middle" >0.1554</td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >(10.5, 11.0, 11.5)</td><td align="center" valign="middle" >(10.0, 10.5, 11.0)</td><td align="center" valign="middle" >0.2735</td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >(12.0, 12.5, 13.0)</td><td align="center" valign="middle" >(9.0, 9.5, 10.0)</td><td align="center" valign="middle" >0.1306</td></tr></tbody></table></table-wrap></sec><sec id="s6"><title>6. Conclusion</title><p>By transforming fuzzy data into crisp data, the fuzzy linear regression model is transformed into a traditional linear regression model. We study parameter estimation and impact analysis for the case-deletion fuzzy linear regression model. Comparison with other methods on a practical example suggests that the method proposed in this paper is easy to use and has good fitting performance.</p></sec><sec id="s7"><title>Acknowledgements</title><p>This research is supported by the National Natural Science Foundation (Grant No. 11171065) and the National Statistical Scientific Foundation (Grant No. 2014LY059).</p></sec><sec id="s8"><title>Cite this paper</title><p>Aiwu Zhang (2016) Statistical Analysis of Fuzzy Linear Regression Model Based on Centroid Method. Applied Mathematics, 7, 579-586. 
doi: 10.4236/am.2016.77053</p></sec></body><back><ref-list><title>References</title><ref id="scirp.65573-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Zadeh, L.A. (1975) The Concept of Linguistic Variable and Its Application to Approximate Reasoning. Information Sciences, 8, 99-244, 301-357.</mixed-citation></ref><ref id="scirp.65573-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Tanaka, H., Uejima, S. and Asai, K. (1982) Linear Regression Analysis with Fuzzy Model. IEEE Transactions on Systems, Man, and Cybernetics, 12, 903-907.</mixed-citation></ref><ref id="scirp.65573-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Diamond, P. (1988) Fuzzy Least Squares. Information Sciences, 46, 141-157. http://dx.doi.org/10.1016/0020-0255(88)90047-3</mixed-citation></ref><ref id="scirp.65573-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Savic, D.A. and Pedrycz, W. (1988) Evaluation of Fuzzy Linear Regression Models. Fuzzy Sets and Systems, 46, 141-157.</mixed-citation></ref><ref id="scirp.65573-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Chang, Y.-H.O. and Ayyub, B.M. (2001) Fuzzy Regression Methods—A Comparative Assessment. Fuzzy Sets and Systems, 119, 225-246. http://dx.doi.org/10.1016/S0165-0114(99)00092-5</mixed-citation></ref><ref id="scirp.65573-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Kim, B. and Bishu, R.R. (1998) Evaluation of Fuzzy Linear Regression Model by Comparison Membership Function. Fuzzy Sets and Systems, 100, 343-352. http://dx.doi.org/10.1016/S0165-0114(97)00100-0</mixed-citation></ref><ref id="scirp.65573-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Nasrabadi, M.M. and Nasrabadi, E. (2004) A Mathematical-Programming Approach to Fuzzy Linear Regression Analysis. Applied Mathematics and Computation, 155, 873-881. 
http://dx.doi.org/10.1016/j.amc.2003.07.031</mixed-citation></ref><ref id="scirp.65573-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Kao, C. and Chyu, C.L. (2002) A Fuzzy Linear Regression Model with Better Explanatory Power. Fuzzy Sets and Systems, 126, 401-409. http://dx.doi.org/10.1016/S0165-0114(01)00069-0</mixed-citation></ref><ref id="scirp.65573-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Yeh, C.-T. (2011) A Formula for Fuzzy Linear Regression Analysis. 2011 IEEE International Conference on Fuzzy Systems, Taipei, 27-30 June 2011, 2845-2850.</mixed-citation></ref><ref id="scirp.65573-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Azadeh, A., Neshat, N. and Rafiee, K. (2015) An Adaptive Neural Network-Fuzzy Linear Regression Approach for Improved Car Ownership Estimation and Forecasting in Complex and Uncertain Environments: The Case of Iran. Transportation Planning and Technology, 35, 221. http://dx.doi.org/10.1080/03081060.2011.651887</mixed-citation></ref><ref id="scirp.65573-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Yager, R.R. (1980) On a General Class of Fuzzy Connectives. Fuzzy Sets and Systems, 4, 235-242. http://dx.doi.org/10.1016/0165-0114(80)90013-5</mixed-citation></ref><ref id="scirp.65573-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Chen, S.J. and Hwang, C.L. (1992) Fuzzy Multiple Attribute Decision Making. Springer, NY. http://dx.doi.org/10.1007/978-3-642-46768-4</mixed-citation></ref><ref id="scirp.65573-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Wei, B.C., Lin, J.G. and Xie, F.C. (2009) Diagnostic Statistics. Higher Education Press, Beijing, 19-44.</mixed-citation></ref><ref id="scirp.65573-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Cook, R.D. 
(1977) Detection of Influential Observations in Linear Regression. Technometrics, 19, 15-18. http://dx.doi.org/10.1080/00401706.1977.10489493</mixed-citation></ref><ref id="scirp.65573-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Sakawa, M. and Yano, H. (1992) Multiobjective Fuzzy Linear Regression Analysis for Fuzzy Input-Output Data. Fuzzy Sets and Systems, 47, 173-181. http://dx.doi.org/10.1016/0165-0114(92)90175-4</mixed-citation></ref></ref-list></back></article>