<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">OALibJ</journal-id><journal-title-group><journal-title>Open Access Library Journal</journal-title></journal-title-group><issn pub-type="epub">2333-9705</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/oalib.1108549</article-id><article-id pub-id-type="publisher-id">OALibJ-116343</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Biomedical&amp;Life Sciences</subject><subject> Business&amp;Economics</subject><subject> Chemistry&amp;Materials Science</subject><subject> Computer Science&amp;Communications</subject><subject> Earth&amp;Environmental Sciences</subject><subject> Engineering</subject><subject> Medicine&amp;Healthcare</subject><subject> Physics&amp;Mathematics</subject><subject> Social Sciences&amp;Humanities</subject></subj-group></article-categories><title-group><article-title>
 
 
  A Comparative Analysis of Neural Network and Decision Tree Model for Detecting Result Anomalies
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Stanley</surname><given-names>Ziweritin</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Barilee</surname><given-names>Barisi Baridam</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Ugochi</surname><given-names>Adaku Okengwu</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib></contrib-group><aff id="aff2"><addr-line>Department of Computer Science, University of Port Harcourt, Choba, Nigeria</addr-line></aff><aff id="aff1"><addr-line>Department of Estate Management, Akanu Ibiam Federal Polytechnic, Unwana, Nigeria</addr-line></aff><pub-date pub-type="epub"><day>04</day><month>03</month><year>2022</year></pub-date><volume>09</volume><issue>03</issue><fpage>1</fpage><lpage>15</lpage><history><date date-type="received"><day>1,</day>	<month>March</month>	<year>2022</year></date><date date-type="rev-recd"><day>28,</day>	<month>March</month>	<year>2022</year>	</date><date date-type="accepted"><day>31,</day>	<month>March</month>	<year>2022</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Decision tree and neural network models are among the faster and easier-to-use techniques that can learn from classified data patterns. These models can be used to detect result anomalies that are measurable under normal circumstances, on the basis that the student is healthy, had no problem and sat for the exams. Existing techniques do not efficiently detect irregularities between students’ continuous assessment and exam scores. The addition of weights and calibrated values aided the learning process and addressed this limitation of the existing methods by giving suitable control over the objective function. The added calibrated value helped control wrongly classified data patterns and improved the model. In this paper, K-fold cross-validation was employed to obtain a better classification report with the best split. The aim of this research was to provide a comparative analysis of neural network and decision tree models for detecting result anomalies. The functionality of both models was used as a measure to check against result anomalies. The feed-forward multi-layered neural network and the decision tree technique achieved accuracies of 96% and 91%, respectively.
 
</p></abstract><kwd-group><kwd>Anomaly Detection</kwd><kwd> Decision Tree</kwd><kwd> Feed-Forward</kwd><kwd> Neural Network</kwd><kwd> Pre-Processing</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Data professionals all over the world are increasingly exploring machine learning models, which provide insights and innovative solutions to real-life problems. Neural Networks (NNs) and Decision Trees (DTs) are machine learning methods that data professionals often prefer for solving realistic problems. They can also be used to detect irregularities in student examination results, which comprise course work, assignment, practical and test/quiz scores as Continuous Assessment (CA), together with exam scores. The choice between them depends mainly on the complexity of the model and the problem at hand [<xref ref-type="bibr" rid="scirp.116343-ref1">1</xref>]. Both NNs and DTs use sets of input data to produce output for decision-making [<xref ref-type="bibr" rid="scirp.116343-ref2">2</xref>]. An NN structure comprises input nodes, one or more hidden layers and output units, and carries out self-optimization through learning, with each neuron receiving input and performing an operation. The input layer accepts data from the external environment and feeds it to the hidden layer. The hidden layer transforms it into a form that can be accepted and interpreted by the output layer, which presents the result to the external environment [<xref ref-type="bibr" rid="scirp.116343-ref3">3</xref>]. The hidden units were introduced as a single layer in feed-forward and back-propagation NNs and used random weights to present their output as input to the output layer [<xref ref-type="bibr" rid="scirp.116343-ref4">4</xref>]. In general, and unlike traditional gradient-based algorithms, which learn network weights slowly, such NNs learn comparatively fast; they can also learn to detect anomalies from classified data patterns and produce the desired result even when a neuron fails to respond as required. Training and testing NN and DT models exposes the existing methods to problems such as misclassification error and overfitting when detecting CA and exam anomalies. The boundaries between normal and abnormal behavior of data are often not precisely defined for one or more data domains. This is due to the lack of a well-defined (standard) data representation, which poses challenges for both conventional and some machine learning techniques. The growing need for large-scale anomaly detection has made it difficult for existing techniques to find outliers (the optimal solution) when the volume of training data increases significantly. The aim of this paper is to present a comparative analysis of NN and DT models for detecting anomalies in students’ results. The models are developed on the condition that the student is healthy, had no problem and sat for the exam [<xref ref-type="bibr" rid="scirp.116343-ref5">5</xref>], and the functionality of both models is used to check against result anomalies. We feed the models with weighted CA and exam scores to compute the calibrated value and the CA and exam types of anomaly after the training and testing stages. The addition of weights and calibrated values helps control the objective function in checking against result anomalies.</p></sec><sec id="s2"><title>2. Literature Review</title><p>A combination of K-means clustering and decision tree Machine Learning (ML) techniques was applied to a set of training instances to form k clusters for detecting network attacks, known as anomalies [<xref ref-type="bibr" rid="scirp.116343-ref6">6</xref>]. 
This study minimized the False Positive Rate (FPR) and maximized the Balance Detection Rates (BDRs), as stated. The need to improve the false positive rate further, in order to control attacks on users, led to the use of Naive Bayes and DT for an adaptive intrusion detection system. The adoption of the Naive Bayes model with adjusted gamma (γ) and beta (β) variables made it possible to keep track of the false positive rate for different types of network attack and balanced the detection rates required for user network attacks, but it produced a poor detection rate with the existing dataset [<xref ref-type="bibr" rid="scirp.116343-ref7">7</xref>]. A genetic algorithm (GA) was proposed as a self-assessment tool to improve the quality of academic activities after evaluating the marks allotted to students. This model was developed and trained with the crossover and mutation process to predict the overall performance of students at the end of every semester examination. The student assessment model made use of population chromosomes to represent Potential Solutions (PSs) to the identified problem through the fitness function. A future generation was created by selecting fitter individuals from the population with the use of genetic operators (crossover and mutation). The crossover operator created new individuals by combining parts from multiple individuals, and the mutation operator created new individuals from the old individuals. The fitness function converged and produced accurate results after repeating this process for a number of generations in analyzing the subject marks of compiler, automata and data structure. 
The experimental results revealed that compiler and data structure were given high importance in reaching a better performance, but the approach produced a poor level of metric accuracy.</p><p>A novel decision tree and support vector machine (SVM) approach was proposed for detecting anomalies [<xref ref-type="bibr" rid="scirp.116343-ref8">8</xref>]. The dataset was first passed through the decision tree, and the output of the DT was fed to the support vector machine to find anomalous content and obtain the desired output. The results showed good performance with the existing system dataset, but the improved SVM model produced more accurate results by comparison. A decision tree result anomaly detection system was developed with classified data instances that connected root, sub-nodes and edges. In that work, the C4.5 type of decision tree model was adopted, with the number of instances per leaf node fixed at 2 and the pruning confidence level set to 0.25. The dataset was subdivided into two groups, a training set (80%) and a testing set (20%), but the system was unable to detect some cases involving result anomalies at the training stage, which resulted in a low sensitivity value. This means that the sensitivity value was not accurate and reliable, leading to inaccurate detection that was affected by the magnitude and variation of the dataset. A binary classification and regression tree model was developed to divide the forest space into subsets of trees with leaf nodes corresponding to different subdivisions, as determined by the splitting rule at each internal node [<xref ref-type="bibr" rid="scirp.116343-ref9">9</xref>]. The step function produced errors and resulted in poor performance with the existing dataset. The authors adopted the ID3 type of decision tree in combination with CART and C4.5 trees to predict the safety of a car when fully loaded with passengers and luggage. 
CART had the longest training time (about 0.5 seconds) and the highest prediction accuracy (97.36%) compared with the ID3 and C4.5 trees, but it required more training time to perform better and suffered from an over-fitting problem caused by missing data. A combination of KNN, SVM and DT models was developed to predict students’ performance [<xref ref-type="bibr" rid="scirp.116343-ref10">10</xref>]. In the classification report, SVM produced the highest accuracy of about 95%, followed by DT with 93%, while KNN recorded the lowest with 92%.</p><p>The paper is organized as follows: Section One is the introduction; Section Two presents a brief review of previous approaches in the study area and the gap that the proposed model explores; Section Three introduces the materials and methods adopted for developing the model; Section Four presents and discusses the results; Section Five concludes the paper.</p></sec><sec id="s3"><title>3. Materials and Methods</title><p>We build on the areas in which we have identified lapses, subject to the condition that the student is healthy, had no mental problem and sat for the exams. We employ the DT and NN models to detect CA and exam types of result anomalies using MATLAB. The addition of weighted CA, exam and calibrated values improves the performance of the system in predicting the target (<xref ref-type="fig" rid="fig1">Figure 1</xref>).</p><p>The architecture is based on a multi-layer feed-forward neural network and the C4.5 type of DT, with students’ weighted CA and exam scores as input. The weight values lie between 0 and 1 and control the output function, which is set up to work well with a particular type of data.</p><sec id="s3_1"><title>3.1. Dataset</title><p>The data used by this model is sourced from an experimental dataset of 1,300 items generated randomly using the MATLAB random function. 
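</p><p>A dataset of this kind could be generated along the following lines. This is a minimal Python sketch rather than the MATLAB code used in the study: the uniform integer draws, the score ranges (0 to 30 for CA, 0 to 70 for the exam), the rounding, the seed and all function names are assumptions made for illustration. The weighting rescales each score to a 100-point scale as described in Section 3.3, and the labels are assigned from the sign of the weighted difference, mirroring the IF ... THEN rule that Section 4 gives for the predicted difference.</p><preformat preformat-type="code"><![CDATA[
```python
import random


def weight_scores(ca, exam):
    """Rescale CA (max 30) and exam (max 70) to a common 100-point scale."""
    w_ca = round(ca * 100 / 30)      # weighted CA
    w_exam = round(exam * 100 / 70)  # weighted exam
    return w_ca, w_exam


def label_record(w_ca, w_exam):
    """Label a record by the sign of W_CA - W_Exam (assumed labelling rule)."""
    diff = w_ca - w_exam
    if diff > 0:
        return diff, "Exam_Anomaly"
    if diff < 0:
        return diff, "CA_Anomaly"
    return diff, "No_Anomaly"


def make_dataset(n=1300, seed=0):
    """Draw n random (CA, exam) pairs and derive the remaining columns."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rows = []
    for sn in range(1, n + 1):
        ca = rng.randint(0, 30)    # assumed range for raw CA
        exam = rng.randint(0, 70)  # assumed range for raw exam score
        w_ca, w_exam = weight_scores(ca, exam)
        diff, target = label_record(w_ca, w_exam)
        rows.append((sn, ca, exam, w_ca, w_exam, diff, target))
    return rows


if __name__ == "__main__":
    for row in make_dataset(n=5):
        print(row)
```
]]></preformat><p>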
The dataset contains instances with detailed information about students’ weighted CA and exam scores, as shown in <xref ref-type="table" rid="table1">Table 1</xref>.</p></sec><sec id="s3_2"><title>3.2. Pre-Processing</title><p>Pre-processing is one of the preliminary stages, in which data is transformed into a format that the computer can understand and accept. Data from the outside world is often inconsistent and incomplete, lacks some behavioral trends and is likely to contain errors [<xref ref-type="bibr" rid="scirp.116343-ref11">11</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref12">12</xref>]. In some cases, the data must be scaled with a standard scale to minimize errors. To build prediction models, we have to convert the categorical labels (Ex_Anomaly, CA_Anomaly and No_Anomaly) into numerical labels (1, 2 and 3). The details depend on the method of analysis, and both techniques required scaled data to fit the prediction model [<xref ref-type="bibr" rid="scirp.116343-ref13">13</xref>].</p></sec><sec id="s3_3"><title>3.3. 
Weighted CA and Exam Scores</title><p>The student Continuous Assessment (CA), comprising test/quiz and assignment scores, is set to a maximum of 30% and the exam to a maximum of 70%, which together amount to 100%. Both are rescaled to a 100-point scale, as given by the equations below:</p><p><disp-formula><label>(1)</label><tex-math>\mathrm{WeightedCA}\ (\mathrm{W\_CA}) = \mathrm{CA} \times \frac{100}{30} \approx \mathrm{CA} \times 3.33</tex-math></disp-formula></p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> Dataset</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >S/N</th><th align="center" valign="middle" >CA</th><th align="center" valign="middle" >Exam</th><th align="center" valign="middle" >W_CA</th><th align="center" valign="middle" >W_Exam</th><th align="center" valign="middle" >Exact_Diff</th><th align="center" valign="middle" >TARGET</th></tr></thead><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >22</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >31</td><td align="center" valign="middle" >−28</td><td align="center" valign="middle" >CA_Anomaly</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >13</td><td align="center" valign="middle" >27</td><td align="center" valign="middle" >19</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >17</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >57</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >49</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >43</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >61</td><td align="center" 
valign="middle" >−31</td><td align="center" valign="middle" >CA_Anomaly</td></tr><tr><td align="center" valign="middle" >5</td><td align="center" valign="middle" >26</td><td align="center" valign="middle" >51</td><td align="center" valign="middle" >87</td><td align="center" valign="middle" >87</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >No_Anomaly</td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >13</td><td align="center" valign="middle" >43</td><td align="center" valign="middle" >43</td><td align="center" valign="middle" >61</td><td align="center" valign="middle" >−18</td><td align="center" valign="middle" >CA_Anomaly</td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >19</td><td align="center" valign="middle" >13</td><td align="center" valign="middle" >63</td><td align="center" valign="middle" >19</td><td align="center" valign="middle" >44</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >41</td><td align="center" valign="middle" >100</td><td align="center" valign="middle" >59</td><td align="center" valign="middle" >41</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >9</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >52</td><td align="center" valign="middle" >100</td><td align="center" valign="middle" >74</td><td align="center" valign="middle" >26</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >10</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >15</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >21</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td 
align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td></tr><tr><td align="center" valign="middle" >1300</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >43</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >61</td><td align="center" valign="middle" >−31</td><td align="center" valign="middle" >CA_Anomaly</td></tr></tbody></table></table-wrap><p><disp-formula><label>(2)</label><tex-math>\mathrm{WeightedExam}\ (\mathrm{W\_Exam}) = \mathrm{Exam} \times \frac{100}{70} = \mathrm{Exam} \times \frac{10}{7}</tex-math></disp-formula></p><p>The exact difference is the absolute value of the difference between the weighted CA and exam scores, given as:</p><p><disp-formula><label>(3)</label><tex-math>\mathrm{E\_Exact} = \left| \mathrm{W\_CA} - \mathrm{W\_Exam} \right|</tex-math></disp-formula></p><p>while the decision tree error term (DT<sub>Err</sub>) is the difference between the exact difference and the decision tree (DT) predicted difference, stated as:</p><p><disp-formula><label>(4)</label><tex-math>\mathrm{DT}_{\mathrm{Err}} = \mathrm{E\_Diff} - \mathrm{P\_DiffD}</tex-math></disp-formula></p></sec><sec id="s3_4"><title>3.4. Methodology</title><p>The models are developed using a feed-forward Artificial Neural Network (ANN) and the C4.5 type of DT for detecting CA and exam anomalies from students’ weighted CA (W_CA) and exam (W_Exam) scores. The NN and DT models are trained with 80% (1040 items) and tested with 20% (260 items) of the 1300 items, with the error rate set to 0.0001. Error tolerance is the ability of a model to learn when the sample data received is corrupted in some way, which is common and important because in several applications it is not possible to access noise-free data. The proposed model is developed in a modular fashion, with each module performing a specific task [<xref ref-type="bibr" rid="scirp.116343-ref14">14</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref15">15</xref>].</p></sec><sec id="s3_5"><title>3.5. 
Neural Network Model</title><p>The neural network architecture of the proposed system is made up of two input variables (CA and exam scores), ten hidden layers and one output layer (CA or exam anomalies), as shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>, with the introduction of a weighted CA (W_CA) to control the NN objective function in producing a reliable result. We adopted a feed-forward multi-layer neural network as the structure for input and output connections [<xref ref-type="bibr" rid="scirp.116343-ref16">16</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref17">17</xref>]. The NN algorithm is fed with inputs comprising 80% training and 20% testing data from a total of 1300 items. We allowed the network to weight and compute the difference between the weighted CA and exam scores, with the tolerance level set to 0.00001. The idea is to keep reducing this error as the NN learns from the training data. Finally, we used the NN activation function to control the output in producing a better and more accurate result.</p><p><disp-formula><label>(5)</label><tex-math>A_j(\bar{x}, \bar{w}) = \sum_{i=0}^{n} x_i w_{ji}</tex-math></disp-formula></p><p>The output function uses the sigmoid function:</p><p><disp-formula><label>(6)</label><tex-math>O_j(\bar{x}, \bar{w}) = \frac{1}{1 + e^{-A_j(\bar{x}, \bar{w})}}</tex-math></disp-formula></p><p>We defined the error function for the output of each neuron as:</p><p><disp-formula><label>(7)</label><tex-math>E_j(\bar{x}, \bar{w}, d) = \sum \left( O_j(\bar{x}, \bar{w}) - d_j \right)^2</tex-math></disp-formula></p><p>The weights are all adjusted as follows:</p><p><disp-formula><label>(8)</label><tex-math>\Delta w_{ji} = -\eta \frac{\partial E}{\partial w_{ji}}</tex-math></disp-formula></p><p>where x are the inputs, w<sub>ji</sub> are the weights, <inline-formula><tex-math>O_j(\bar{x}, \bar{w})</tex-math></inline-formula> are the actual outputs, d<sub>j</sub> are the expected outputs and η is the learning rate.</p><p>The layers between the input and the output layer are the hidden layers of the neural network [<xref ref-type="bibr" rid="scirp.116343-ref18">18</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref19">19</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref20">20</xref>]. 
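</p><p>Equations (5) to (8) can be illustrated for a single neuron with the following Python sketch. It is not the network trained in this study: it assumes the standard logistic sigmoid (with a negative exponent), a single output neuron so that the sum in Equation (7) reduces to one term, and hypothetical function names and learning rate.</p><preformat preformat-type="code"><![CDATA[
```python
import math


def activation(x, w):
    """Equation (5): weighted sum of the inputs of one neuron."""
    return sum(xi * wi for xi, wi in zip(x, w))


def output(x, w):
    """Equation (6): logistic sigmoid of the activation (assumed form)."""
    return 1.0 / (1.0 + math.exp(-activation(x, w)))


def error(x, w, d):
    """Equation (7) for one output neuron: squared error against target d."""
    return (output(x, w) - d) ** 2


def update(x, w, d, eta=0.5):
    """Equation (8): move each weight against the gradient of the error."""
    o = output(x, w)
    # Chain rule: dE/dw_i = 2 (o - d) * o (1 - o) * x_i
    grad = [2.0 * (o - d) * o * (1.0 - o) * xi for xi in x]
    return [wi - eta * gi for wi, gi in zip(w, grad)]


if __name__ == "__main__":
    x, w, d = [1.0, 0.5], [0.1, -0.2], 1.0  # illustrative values only
    for step in range(3):
        print(step, error(x, w, d))
        w = update(x, w, d)
```
]]></preformat><p>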
Increasing the number of hidden layers does not necessarily improve accuracy; the appropriate number depends on the complexity of the problem at hand, and it also affects the neural network training time. The output of the input layer is stored and processed by the hidden layer, which transforms it into something that can be used and accepted by the output layer. The output of the hidden layer becomes the input to the output layer, as shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>. The learning process proceeds by presenting the network with a training set consisting of input and response patterns. The error values of the NN are computed using Equation (3) by comparing the actual and target outputs for a given pattern. The error value is then used to alter the connection strength between layers in order to achieve a better network response in subsequent iterations.</p></sec><sec id="s3_6"><title>3.6. The Decision Tree Model</title><p>A decision tree learns through divide and conquer, splitting the source dataset into subdivisions called sub-nodes based on a value test [<xref ref-type="bibr" rid="scirp.116343-ref21">21</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref22">22</xref>]. This process is repeated on each subset in a manner called recursive partitioning. The recursion terminates when splitting no longer adds value to the predictions. In a DT, data arrives in the form of records [<xref ref-type="bibr" rid="scirp.116343-ref23">23</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref24">24</xref>], written as:</p><p><disp-formula><label>(9)</label><tex-math>(x, Y) = (x_1, x_2, x_3, \cdots, x_k, Y)</tex-math></disp-formula></p><p>where Y is the dependent variable, taken as the target variable, and x is the vector of independent input variables x<sub>1</sub>, x<sub>2</sub>, x<sub>3</sub>, ⋯, x<sub>k</sub>. 
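</p><p>The recursive partitioning described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the C4.5 model used in this study: it uses Gini impurity rather than the gain ratio of C4.5, binary threshold splits on numeric inputs, no pruning, and hypothetical function names. Records are (x, Y) pairs as in Equation (9), and the recursion stops when no split reduces the impurity.</p><preformat preformat-type="code"><![CDATA[
```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(records):
    """Return the (feature index, threshold) that lowers impurity most, or None."""
    best, best_score = None, gini([y for _, y in records])
    for j in range(len(records[0][0])):
        for t in sorted({x[j] for x, _ in records}):
            left = [y for x, y in records if x[j] <= t]
            right = [y for x, y in records if x[j] > t]
            if not left or not right:
                continue  # a split must leave records on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
            if score < best_score:
                best, best_score = (j, t), score
    return best


def grow(records):
    """Recursively partition the records; leaves hold the majority class."""
    split = best_split(records)
    if split is None:  # splitting no longer adds value
        return Counter(y for _, y in records).most_common(1)[0][0]
    j, t = split
    left = [r for r in records if r[0][j] <= t]
    right = [r for r in records if r[0][j] > t]
    return (j, t, grow(left), grow(right))


def predict(tree, x):
    """Follow the threshold tests from the root down to a leaf label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree


if __name__ == "__main__":
    # One input per record: the weighted difference (illustrative values).
    data = [((-28,), "CA_Anomaly"), ((8,), "Exam_Anomaly"), ((0,), "No_Anomaly")]
    print(grow(data))
```
]]></preformat><p>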
The DT describes a structure for making decisions characterized by an ordered pair of nodes [<xref ref-type="bibr" rid="scirp.116343-ref25">25</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref26">26</xref>]. Each sub-node is associated with a decision function of one or more features generated from IF…THEN rules. There are different types of DTs, which include Classification and Regression Trees (CRT), Chi-Square Automatic Interaction Detection (CHAID) and C4.5 as an extension [<xref ref-type="bibr" rid="scirp.116343-ref27">27</xref>] [<xref ref-type="bibr" rid="scirp.116343-ref28">28</xref>]. Training the DT as a binary classifier aids learning from a set of instances in order to classify new instances. The learning rate for each variant is fine-tuned to fall between 0.1 and 1 with a specified number of iterations [<xref ref-type="bibr" rid="scirp.116343-ref29">29</xref>].</p></sec></sec><sec id="s4"><title>4. Results and Discussion</title><p>We discussed several machine learning techniques and adopted the NN and DT algorithms. The multi-layered feed-forward NN was employed using two input neurons and ten hidden layers to produce a single output indicating an anomaly or no anomaly. The weights, calibrated values and the C4.5 type of binary classification tree model were employed to obtain the best split and solution.</p><p><xref ref-type="fig" rid="fig3">Figure 3</xref> depicts the total number of data items tested, with their respective classes of result anomaly and frequency. The model automatically groups results into their anomaly classes as Ex_Anomaly, CA_Anomaly and No_Anomaly (free cases), shown in <xref ref-type="fig" rid="fig1">Figure 1</xref>. Ex_Anomaly recorded 660 cases, CA_Anomaly 629 and anomaly-free cases 11. Summing all the classes gives the total dataset, i.e. 
660 + 629 + 11 = 1300 items.</p><p><xref ref-type="fig" rid="fig4">Figure 4</xref> depicts the decreasing Mean Square Error (MSE) validation graph for the training and testing stages, which terminates at its tolerance level within the iterations. There was no sign of over-fitting or under-fitting because the validation MSE decreases and coincides with the best possible solution at epoch 2000, marked with a colored circle. The NN was able to learn with high accuracy and a best validation performance value of 4.9928e−09 in predicting the target.</p><p><xref ref-type="fig" rid="fig5">Figure 5</xref> shows the graph of the DT estimated objective function. The estimated objective function ranges from 1, 0.95, 0.9 down to 0.6 against a minimum leaf node size ranging from 10<sup>0</sup>, 10<sup>1</sup>, 10<sup>2</sup> to 10<sup>3</sup>, and it decreases and converges within the work space. The minimum feasible point occurs at 0.65 on the Y axis, while the set of observed points and the model error mean occur within 10<sup>0</sup> to 10<sup>2</sup> on the horizontal axis.</p><p>The function evaluation graph shows the convergence of the minimum observed objective function. The objective function increased and decreased within the range of 0 to 5 but was constant from 5 to 30 on the horizontal axis, meeting the Y axis at 0.65, and the model was able to detect CA and exam anomalies within the work space, showing the distance for the whole spectrum of iterations for the dataset (<xref ref-type="fig" rid="fig6">Figure 6</xref>).</p><p><xref ref-type="fig" rid="fig7">Figure 7</xref> shows the tree structure of the proposed DT obtained from the classified dataset fed into the model with the concept of recursive partitioning. This generated root, sub- and leaf nodes in a tree-like fashion with some predefined rules using MATLAB. The decision tree shown in <xref ref-type="fig" rid="fig7">Figure 7</xref> is constructed and grown by the following IF ... 
THEN statement using the DT predicted difference (DT_Predicted_Diff). This clearly defines the anomaly class boundaries, which is an added advantage over the existing methods.</p><p>IF (DT_Predicted_Diff &gt; 0) THEN Exam_Anomaly</p><p>ELSEIF (DT_Predicted_Diff &lt; 0) THEN CA_Anomaly</p><p>ELSE</p><p>No_Anomaly (normal or free case)</p><p>END IF</p><p>Comparing the Results of the NN and DT Model</p><p>A comparative study of the proposed system was carried out with a total dataset of 1300 items containing students’ CA, exam, weighted CA and weighted exam scores, the exact difference, the NN and DT predicted differences, the DT error term and the anomaly detected by both models, given below in <xref ref-type="table" rid="table2">Table 2</xref> for the first few data items.</p><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> Comparing the proposed and existing system model</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  colspan="6"  ></th><th align="center" valign="middle" ></th><th align="center" valign="middle"  colspan="2"  >Neural Network (NN)</th><th align="center" valign="middle"  colspan="3"  >Decision Tree (DT)</th></tr></thead><tr><td align="center" valign="middle" >S/N</td><td align="center" valign="middle" >CA</td><td align="center" valign="middle" >Exam</td><td align="center" valign="middle" >W_CA</td><td align="center" valign="middle" >W_Exam</td><td align="center" valign="middle" >E_Diff</td><td align="center" valign="middle" >ABS (E_Diff)</td><td align="center" valign="middle" >P_DiffN</td><td align="center" valign="middle" >ANOMALY</td><td align="center" valign="middle" >P_DiffD</td><td align="center" valign="middle" >Err</td><td align="center" valign="middle" >ANOMALY</td></tr><tr><td align="center" valign="middle" >1</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >22</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >31</td><td align="center" 
valign="middle" >−28</td><td align="center" valign="middle" >28</td><td align="center" valign="middle" >28.429</td><td align="center" valign="middle" >CA_Anomaly</td><td align="center" valign="middle" >25</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >CA_Anomaly</td></tr><tr><td align="center" valign="middle" >2</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >13</td><td align="center" valign="middle" >27</td><td align="center" valign="middle" >19</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >8.4285</td><td align="center" valign="middle" >Exam_Anomaly</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >No_Anomaly</td></tr><tr><td align="center" valign="middle" >3</td><td align="center" valign="middle" >17</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >57</td><td align="center" valign="middle" >8</td><td align="center" valign="middle" >49</td><td align="center" valign="middle" >49</td><td align="center" valign="middle" >48.857</td><td align="center" valign="middle" >Exam_Anomaly</td><td align="center" valign="middle" >51</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >4</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >43</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >61</td><td align="center" valign="middle" >−31</td><td align="center" valign="middle" >31</td><td align="center" valign="middle" >31.429</td><td align="center" valign="middle" >CA_Anomaly</td><td align="center" valign="middle" >35</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >CA_Anomaly</td></tr><tr><td align="center" valign="middle" >5</td><td 
align="center" valign="middle" >26</td><td align="center" valign="middle" >51</td><td align="center" valign="middle" >87</td><td align="center" valign="middle" >87</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >No_Anomaly</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >CA_Anomaly</td></tr><tr><td align="center" valign="middle" >6</td><td align="center" valign="middle" >13</td><td align="center" valign="middle" >43</td><td align="center" valign="middle" >43</td><td align="center" valign="middle" >61</td><td align="center" valign="middle" >18</td><td align="center" valign="middle" >18</td><td align="center" valign="middle" >18.429</td><td align="center" valign="middle" >CA_Anomaly</td><td align="center" valign="middle" >19</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >No_Anomaly</td></tr><tr><td align="center" valign="middle" >7</td><td align="center" valign="middle" >19</td><td align="center" valign="middle" >13</td><td align="center" valign="middle" >63</td><td align="center" valign="middle" >19</td><td align="center" valign="middle" >44</td><td align="center" valign="middle" >44</td><td align="center" valign="middle" >44.429</td><td align="center" valign="middle" >Exam_Anomaly</td><td align="center" valign="middle" >44</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >8</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >41</td><td align="center" valign="middle" >100</td><td align="center" valign="middle" >59</td><td align="center" valign="middle" >41</td><td align="center" valign="middle" >41</td><td align="center" valign="middle" >41.429</td><td align="center" valign="middle" >Exam_Anomaly</td><td align="center" 
valign="middle" >38</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >9</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >52</td><td align="center" valign="middle" >100</td><td align="center" valign="middle" >74</td><td align="center" valign="middle" >26</td><td align="center" valign="middle" >26</td><td align="center" valign="middle" >25.714</td><td align="center" valign="middle" >Exam_Anomaly</td><td align="center" valign="middle" >21</td><td align="center" valign="middle" >4</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >10</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >15</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >21</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >8.5714</td><td align="center" valign="middle" >Exam_Anomaly</td><td align="center" valign="middle" >7</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >Exam_Anomaly</td></tr><tr><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td><td align="center" valign="middle" >-</td></tr><tr><td align="center" valign="middle" >1300</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >43</td><td align="center" valign="middle" >30</td><td align="center" valign="middle" >61</td><td align="center" valign="middle" >-31</td><td 
align="center" valign="middle" >31</td><td align="center" valign="middle" >31.429</td><td align="center" valign="middle" >CA_Anomaly</td><td align="center" valign="middle" >35</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >CA_Anomaly</td></tr></tbody></table></table-wrap><p><xref ref-type="table" rid="table2">Table 2</xref> shows how the proposed model detected CA and exam anomalies. The weighted CA (W_CA) and weighted exam (W_Exam) values are computed by the NN and DT algorithms. The exact difference (E_Diff) is W_CA − W_Exam, with its magnitude abs (W_CA − W_Exam) reported as ABS (E_Diff), while the DT error (Err) value = (E_Diff − P_DiffD). Comparing the differences predicted by the Neural Network (NN) for the first data items in <xref ref-type="table" rid="table2">Table 2</xref>, P_DiffN = (28.429, 8.4285, 48.857, 31.429, 0, 18.429, ...), with the differences predicted by the DT, P_DiffD = (25, 8, 51, 35, 3, 19, 44, 38, ...), shows that the NN values lie closer to the exact differences. The NN therefore produced more precise and more reliable results than the Decision Tree (DT) in terms of detected anomalies.</p><p>The cases detected by the NN and DT models are shown in the overlapping chart (<xref ref-type="fig" rid="fig8">Figure 8</xref>) produced with MATLAB. The DT registered 116 free (no-anomaly) cases, more than the 57 free cases recorded by the NN. The DT detected 605 exam anomalies, fewer than the 633 detected by the NN. Likewise, the DT detected 579 CA anomalies, fewer than the 610 detected by the NN. 
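</p><p>The IF-THEN rule stated earlier in this section can be illustrated with a minimal Python sketch. It assumes that the predicted difference is signed (weighted CA minus weighted exam score). The function names classify and tally and the sample values are hypothetical; they are not taken from the authors’ MATLAB implementation or from the model output.</p><preformat preformat-type="code">
```python
from collections import Counter


def classify(predicted_diff):
    """Label one case from the sign of its predicted CA-minus-exam difference."""
    if predicted_diff > 0:
        return "Exam_anomaly"
    if 0 > predicted_diff:
        return "CA_anomaly"
    return "Normal_Case"


def tally(predicted_diffs):
    """Count how many cases fall into each class."""
    return Counter(classify(d) for d in predicted_diffs)


# Illustrative signed differences (hypothetical, not model output).
sample = [-28.0, 8.0, 49.0, -31.0, 0.0]
counts = tally(sample)
print(counts)
```
</preformat><p>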
The training dataset contains 1300 items in total.</p><p>The accuracy of the proposed model is the ratio of correct classifications (true positives and true negatives) to the overall number of cases, computed with a MATLAB function according to the equation:</p><p><disp-formula><label>(10)</label><tex-math>\mathrm{Accuracy} = \frac{\text{total number of correct classifications}}{\text{total number of cases}} = \frac{TP + TN}{TP + TN + FP + FN}</tex-math></disp-formula></p><p>where TN is the number of true negatives, FP the number of false positives, TP the number of true positives and FN the number of false negatives. For the DT model, TN = 579, FP = 97, TP = 605 and FN = 19, which gives an accuracy of 91%. For the NN model, TN = 633, FP = 37, TP = 610 and FN = 20, which gives an accuracy of 96%.</p></sec><sec id="s5"><title>5. Conclusion</title><p>The error values of both models converged for the selected types of result anomalies. In the experiments, the DT recorded a wider error margin than the NN model. The NN error also converged faster, and the NN achieved a higher accuracy (96%) than the DT (91%) when used to check for result anomalies. The added weights and calibrated values reduced the error and improved the overall performance of the proposed system. We therefore conclude that the NN model performed better than the DT model in terms of accuracy and precision in detecting anomalies.</p></sec><sec id="s6"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest.</p></sec><sec id="s7"><title>Cite this paper</title><p>Ziweritin, S., Baridam, B.B. and Okengwu, U.A. (2022) A Comparative Analysis of Neural Network and Decision Tree Model for Detecting Result Anomalies. Open Access Library Journal, 9: e8549. https://doi.org/10.4236/oalib.1108549</p></sec></body><back><ref-list><title>References</title><ref id="scirp.116343-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Omlin, C.W. and Snyders, S. (2003) Inductive Bias Strength in Knowledge-Based Neural Networks: Application to Magnetic Resonance Spectroscopy of Breast Tissues. 
Artificial Intelligence in Medicine, 28, 30-54.  
https://doi.org/10.1016/S0933-3657(03)00062-9</mixed-citation></ref><ref id="scirp.116343-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Yedjour, D., Yedjour, H. and Benyettou, A. (2011) Explaining Results of Artificial Neural Networks. Journal of Applied Sciences, 2, 1-45.  
https://doi.org/10.3923/jas.2011.2855.2860</mixed-citation></ref><ref id="scirp.116343-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Balogun, A.O., Mabayoje, M.A., Salihu, S. and Arinze, S.A. (2015) Enhanced Classification via Clustering Techniques Using Decision Tree for Feature Selection. International Journal of Applied Information System (IJAIS), 9, 11-16.  
https://doi.org/10.5120/ijais2015451425</mixed-citation></ref><ref id="scirp.116343-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Bhargava, N., Sharma, G., Bhargava, R. and Mathuria, M. (2013) Decision Tree Analysis on J48 Algorithm for Data Mining. Proceedings of International Journal of Advanced Research in Computer Science and Software Engineering (IJARCSSE), 2, 45-98.</mixed-citation></ref><ref id="scirp.116343-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Patel, H.P. and Prajapati, P. (2018) Study and Analysis Tree Based Classification Algorithms. International Journal of Computer Sciences and Engineering (IJCSE), 6, 74-78.  
https://doi.org/10.26438/ijcse/v6i10.7478</mixed-citation></ref><ref id="scirp.116343-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Soranamageswari, M. and Meena, C. (2020) Histogram Based Image Spam Detection Using Back Propagation Neural Networks. Global Journal of Computer Science and Technology (GJCST), 9, 62-67.</mixed-citation></ref><ref id="scirp.116343-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Lakshmi, T.M., Martin, A., Begum, R.M. and Venkatesan, V.P. (2013) An Analysis on Performance of Decision Tree Algorithms Using Students Qualitative Data. International Journal of Modern Education and Computer Science (IJMECS), 5, 18-27.  
https://doi.org/10.5815/ijmecs.2013.05.03</mixed-citation></ref><ref id="scirp.116343-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Li, Y., Xing, H., Hua, Q. and Wang, X. (2014) Classification of BGP Anomalies Using Decision Trees and Fuzzy Rough Sets. IEEE International Conference on Systems, Man and Cybernetics (SMC), San Diego, 5-8 October 2014, 1312-1317.  
https://doi.org/10.1109/SMC.2014.6974096</mixed-citation></ref><ref id="scirp.116343-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Chien-Liang, L. and Ching-Lung, F. (2019) Evaluation of CART, CHAID, and QUEST Algorithms: A Case Study of Construction Defects in Taiwan. Journal of Asian Architecture and Building Engineering, 18, 1-40.  
https://doi.org/10.1080/13467581.2019.1696203</mixed-citation></ref><ref id="scirp.116343-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Zhang, X. and Jiang, S.A. (2012) A Splitting Criteria Based on Similarity in Decision Tree Learning. JSW, 7, 82-1775. https://doi.org/10.4304/jsw.7.8.1775-1782</mixed-citation></ref><ref id="scirp.116343-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Huang, M. and Hsu, Y. (2012) Fetal Distress Prediction Using Discriminant Analysis, Decision Tree, and Artificial Neural Network. Journal of Biomedical Science and Engineering (JBSE), 5, 526-533. https://doi.org/10.4236/jbise.2012.59065</mixed-citation></ref><ref id="scirp.116343-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">John, E.B., Derek, T.A. and Chee, S.C. (2017) Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools, and Challenges for the Community. Journal of Applied Remote Sensing, 11, 609. https://doi.org/10.1117/1.JRS.11.042609</mixed-citation></ref><ref id="scirp.116343-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Ziweritin, S., Baridam, B.B. and Okengwu, U.A. (2020) Neural Network Model for Detection of Result Anomalies in Higher Education. Scientia Africana: An International Journal of Pure and Applied Sciences (IJPAS), 19, 91-104.</mixed-citation></ref><ref id="scirp.116343-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Zhou, Z.H., Jiang, Y., Yang, Y.B. and Chen, S.F. (2002) Lung Cancer Cell Identification Based on Artificial Neural Network Ensembles. Artificial Intelligence in Medicine, 24, 25-23. https://doi.org/10.1016/S0933-3657(01)00094-X</mixed-citation></ref><ref id="scirp.116343-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Badr, H.H.E., Abdelkarim, M. and Mohammed, E. 
(2014) A Comparative Study of Decision Tree ID3 and C4.5. International Journal of Advanced Computer Science and Applications, 4, 13-19. https://doi.org/10.14569/SpecialIssue.2014.040203</mixed-citation></ref><ref id="scirp.116343-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Aneetha, A.S. and Bose, S. (2012) The Combined Approach for Anomaly Detection Using Neural Networks and Clustering Techniques. Computer Science and Engineering: An International Journal (CSEIJ), 2, 37-46.  
https://doi.org/10.5121/cseij.2012.2404</mixed-citation></ref><ref id="scirp.116343-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Anyanwu, M.N. and Shiva, S.G. (2009) Comparative Analysis of Serial Decision Tree Classification Algorithms. International Journal of Computer Science and Security (IJCSS), 3, 37-46.</mixed-citation></ref><ref id="scirp.116343-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Abhishek, K. and Anil, P. (2019) Implement of Students Result by Using Genetic Algorithm. International Journal of Computer Sciences and Engineering, 7, 51-56.</mixed-citation></ref><ref id="scirp.116343-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Abdellatif, H., Mohamed, S. and Mohamed, F. (2016) Face Recognition: Synthesis of Classification Methods. International Journal of Computer Science and Information Security (IJCSIS), 14, 5-11.</mixed-citation></ref><ref id="scirp.116343-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Wiyono, S. and Abidin, T. (2019) Comparative Study of Machine Learning KNN, SVM, and Decision Tree Algorithm to Predict Student’s Performance. International Journal of Research, 7, 1-13. https://doi.org/10.29121/granthaalayah.v7.i1.2019.1048</mixed-citation></ref><ref id="scirp.116343-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Joseph, J.F., Das, A. and Seet, B.C. (2011) Cross-Layer Detection of Sinking Behavior in Wireless Ad Hoc Networks Using SVM and FDA. IEEE Transaction on Dependable and Secure Computing, 8, 12-23. https://doi.org/10.1109/TDSC.2009.48</mixed-citation></ref><ref id="scirp.116343-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Jha, J. and Ragha, L. (2013) Intrusion Detection System Using Support Vector Machine. 
International Journal of Applied Information System (IJAIS), 5, 25-30.</mixed-citation></ref><ref id="scirp.116343-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Peddabachigari, S., Abraham, A. and Grosan, C. (2007) Modeling Intrusion Detection System Using Hybrid Intelligent Systems. Journal of Network and Computer Applications, 30, 114-132. https://doi.org/10.1016/j.jnca.2005.06.003</mixed-citation></ref><ref id="scirp.116343-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Farid, D.M., Harbi, N. and Rahman, M.Z. (2010) Combining Naive Bayes and Decision Tree for Adaptive Intrusion Detection. International Journal of Network Security and Its Applications, 2, 12-25. https://doi.org/10.5121/ijnsa.2010.2202</mixed-citation></ref><ref id="scirp.116343-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Hamza, O.S., Ruqayyah, S. and Mohammed, O. (2016) Detecting Anomalies in Students’ Results Using Decision Trees. International Journal of Modern Education and Computer Science (MECS), 8, 1312-1317. https://doi.org/10.5815/ijmecs.2016.07.04</mixed-citation></ref><ref id="scirp.116343-ref26"><label>26</label><mixed-citation publication-type="other" xlink:type="simple">Talwar, A. and Kumar, Y. (2013) Machine Learning: An Artificial Intelligence Methodology. International Journal of Engineering and Computer Science, 2, 3400-3404.</mixed-citation></ref><ref id="scirp.116343-ref27"><label>27</label><mixed-citation publication-type="other" xlink:type="simple">Aderemi, O.A. and Andronicus, A.A. (2017) A Survey of Machine-Learning and Nature-Inspired Based Credit Card Fraud Detection Techniques. International Journal of System Assurance Engineering and Management, 8, 937-953.  
https://doi.org/10.1007/s13198-016-0551-y</mixed-citation></ref><ref id="scirp.116343-ref28"><label>28</label><mixed-citation publication-type="other" xlink:type="simple">Huang, G.B., Zhu, Q.Y. and Siew, C.K. (2016) Extreme Learning Machine: Theory and Applications. International Journal of Neurocomputing, 70, 489-501.  
https://doi.org/10.1016/j.neucom.2005.12.126</mixed-citation></ref><ref id="scirp.116343-ref29"><label>29</label><mixed-citation publication-type="other" xlink:type="simple">Hawkins, D.M. (2001) The Detection of Errors in Multivariate Data Using Principal Components. Journal of the American Statistical Association, 69, 340-344.  
https://doi.org/10.1080/01621459.1974.10482950</mixed-citation></ref></ref-list></back></article>