<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">AM</journal-id><journal-title-group><journal-title>Applied Mathematics</journal-title></journal-title-group><issn pub-type="epub">2152-7385</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/am.2021.1210060</article-id><article-id pub-id-type="publisher-id">AM-112603</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Physics&amp;Mathematics</subject></subj-group></article-categories><title-group><article-title>
 
 
  Solving Riccati-Type Nonlinear Differential Equations with Novel Artificial Neural Networks
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Roseline</surname><given-names>N. Okereke</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Olaniyi</surname><given-names>S. Maliki</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Department of Mathematics, Michael Okpara University of Agriculture, Umudike, Nigeria</addr-line></aff><pub-date pub-type="epub"><day>13</day><month>10</month><year>2021</year></pub-date><volume>12</volume><issue>10</issue><fpage>919</fpage><lpage>930</lpage><history><date date-type="received"><day>19,</day>	<month>November</month>	<year>2020</year></date><date date-type="rev-recd"><day>18,</day>	<month>October</month>	<year>2021</year>	</date><date date-type="accepted"><day>21,</day>	<month>October</month>	<year>2021</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  In this study we investigate neural network solutions to nonlinear differential equations of Riccati type. We employ a feed-forward Multilayer Perceptron Neural Network (MLPNN), but avoid the standard back-propagation algorithm for updating the intrinsic weights. Our objective is to minimize an error that is a function of the network parameters, i.e., the weights and biases. Once the weights of the neural network are obtained by our systematic procedure, we need not adjust all the parameters in the network, as postulated by many researchers before us, in order to achieve convergence. We only need to fine-tune the biases, which are restricted to a given range, and convergence to a solution with an acceptably small error is achieved. This greatly reduces the computational complexity of the given problem. We provide two ODE examples. For the first, a Riccati-type differential equation, the procedure agrees with the exact solution to about four decimal places. The second example, however, yields only an acceptable approximation to the exact solution. Our novel artificial neural network procedure demonstrates the function approximation capabilities of ANNs in the solution of nonlinear differential equations of Riccati type.
 
</p></abstract><kwd-group><kwd>Riccati ODE</kwd><kwd> MLPNN</kwd><kwd> GRBF</kwd><kwd> Network Training</kwd><kwd> MathCAD 14</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>We present a new perspective for obtaining solutions of initial value problems of Riccati type [<xref ref-type="bibr" rid="scirp.112603-ref1">1</xref>], using Artificial Neural Networks (ANN). This is an extension of the procedure developed by Okereke [<xref ref-type="bibr" rid="scirp.112603-ref2">2</xref>]. A neural-network-based model for the solution of ordinary differential equations (ODEs) offers a number of advantages over standard numerical methods. Firstly, the neural-network-based solution is differentiable and is in closed analytic form, whereas most other techniques offer a discretized solution or a solution with limited differentiability. Secondly, the neural-network-based method provides a solution with very good generalization properties. The major advantage of our method is that it considerably reduces the computational complexity involved in weight updating, while maintaining satisfactory accuracy.</p><sec id="s1_1"><title>1.1. Neural Network Structure</title><p>A neural network is an interconnection of processing elements (units or nodes) whose functionality resembles that of human neurons. The processing ability of the network is stored in the connection strengths, simply called weights, which are obtained by a process of adaptation to a set of training patterns. Neural network methods can solve both ordinary and partial differential equations. They rely on the function approximation property of feed-forward neural networks, which results in a solution written in closed analytic form. This form employs a feed-forward neural network as the basic approximation element. 
Training of the neural network can be done by an optimization technique, which requires the computation of the gradient of the error with respect to the network parameters, by a regression-based model, or by basis function approximation.</p></sec><sec id="s1_2"><title>1.2. Neural Networks are Universal Approximators</title><p>An artificial neural network can realize a nonlinear mapping from the inputs to the outputs of the corresponding system of neurons, which makes it suitable for analyzing initial/boundary value problems that have no analytical solution or whose solution cannot be easily computed. One application of the multilayer feed-forward neural network is the global approximation of real-valued multivariable functions in closed analytic form; that is, such neural networks are universal approximators. It has been established in the literature that multilayer feed-forward neural networks with one hidden layer using arbitrary squashing functions are capable of approximating any Borel measurable function from one finite-dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available. This is made precise in the following theorem.</p></sec><sec id="s1_3"><title>1.3. Universal Approximation Theorem</title><p>The universal approximation theorem for the MLP was proved by Cybenko [<xref ref-type="bibr" rid="scirp.112603-ref3">3</xref>] and Hornik et al. [<xref ref-type="bibr" rid="scirp.112603-ref4">4</xref>] in 1989. Let <inline-formula><tex-math>I_n</tex-math></inline-formula> represent the <inline-formula><tex-math>n</tex-math></inline-formula>-dimensional unit cube containing all possible input samples <inline-formula><tex-math>x = (x_1, x_2, \cdots, x_n)</tex-math></inline-formula> with <inline-formula><tex-math>x_i \in [0, 1]</tex-math></inline-formula>, <inline-formula><tex-math>i = 1, 2, \cdots, n</tex-math></inline-formula>. Let <inline-formula><tex-math>C(I_n)</tex-math></inline-formula> be the space of continuous functions on <inline-formula><tex-math>I_n</tex-math></inline-formula>. Given a continuous sigmoid function <inline-formula><tex-math>\varphi(\cdot)</tex-math></inline-formula>, the universal approximation theorem states that the finite sums of the form</p><p><disp-formula><label>(1)</label><tex-math>y_k = y_k(x, w) = \sum_{i=1}^{N_2} w^{3}_{ki}\, \varphi\!\left( \sum_{j=0}^{n} w^{2}_{ij}\, x_j \right), \quad k = 1, 2, \cdots, m</tex-math></disp-formula></p><p>are dense in <inline-formula><tex-math>C(I_n)</tex-math></inline-formula>. 
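This simply means that, given any function <inline-formula><tex-math>f \in C(I_n)</tex-math></inline-formula> and <inline-formula><tex-math>\varepsilon &gt; 0</tex-math></inline-formula>, there is a sum <inline-formula><tex-math>y(x, w)</tex-math></inline-formula> of the above form that satisfies</p><p><disp-formula><label>(2)</label><tex-math>|y(x, w) - f(x)| &lt; \varepsilon, \quad \forall\, x \in I_n .</tex-math></disp-formula></p></sec><sec id="s1_4"><title>1.4. Learning in Neural Networks</title><p>A neural network has to be configured such that the application of a set of inputs produces the desired set of outputs. Various methods exist for setting the strengths of the connections. One way is to set the weights explicitly, using a priori knowledge. Another way is to train the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule. The term learning is widely used in the neural network field to describe this process; it may be formally described as determining an optimized set of weights based on the statistics of the examples. Learning in neural networks may be classified into distinct types: supervised learning, unsupervised learning, reinforcement learning and competitive learning [<xref ref-type="bibr" rid="scirp.112603-ref5">5</xref>].</p></sec><sec id="s1_5"><title>1.5. Gradient Computation with Respect to Network Inputs</title><p>The next step is to compute the gradient with respect to the input vector. For this purpose, let us consider a multilayer perceptron (MLP) neural network [<xref ref-type="bibr" rid="scirp.112603-ref6">6</xref>] with <inline-formula><tex-math>n</tex-math></inline-formula> input units, a hidden layer with <inline-formula><tex-math>m</tex-math></inline-formula> sigmoid units and a linear output unit. For a given input vector <inline-formula><tex-math>x = (x_1, x_2, \cdots, x_n)</tex-math></inline-formula>, the output of the network is written:</p><p><disp-formula><label>(3)</label><tex-math>N(x, p) = \sum_{j=1}^{m} v_j\, \varphi(z_j), \qquad z_j = \sum_{i=1}^{n} w_{ji}\, x_i + u_j .</tex-math></disp-formula></p><p>Here <inline-formula><tex-math>w_{ji}</tex-math></inline-formula> denotes the weight from input unit <inline-formula><tex-math>i</tex-math></inline-formula> to hidden unit <inline-formula><tex-math>j</tex-math></inline-formula>, <inline-formula><tex-math>v_j</tex-math></inline-formula> denotes the weight from hidden unit <inline-formula><tex-math>j</tex-math></inline-formula> to the output unit, <inline-formula><tex-math>u_j</tex-math></inline-formula> denotes the biases, and <inline-formula><tex-math>\varphi(z_j)</tex-math></inline-formula> is the sigmoid activation function.</p><p>The derivative of the network output <inline-formula><tex-math>N</tex-math></inline-formula> with respect to the input component <inline-formula><tex-math>x_i</tex-math></inline-formula> is:</p><p><disp-formula><label>(4)</label><tex-math>\frac{\partial}{\partial x_i} N(x, p) = \frac{\partial}{\partial x_i} \left( \sum_{j=1}^{m} v_j\, \varphi(z_j) \right) = \sum_{j=1}^{m} v_j\, w_{ji}\, \varphi_j^{(1)}</tex-math></disp-formula></p><p>where <inline-formula><tex-math>\varphi^{(1)} \equiv \partial \varphi(z) / \partial z</tex-math></inline-formula>. Similarly, the <inline-formula><tex-math>k</tex-math></inline-formula><sup>th</sup> derivative of <inline-formula><tex-math>N</tex-math></inline-formula> is computed as <inline-formula><tex-math>\partial^k N / \partial x_i^k = \sum_{j=1}^{m} v_j\, w_{ji}^{k}\, \varphi_j^{(k)}</tex-math></inline-formula>,</p><p>where <inline-formula><tex-math>\varphi_j \equiv \varphi(z_j)</tex-math></inline-formula> and <inline-formula><tex-math>\varphi^{(k)}</tex-math></inline-formula> denotes the <inline-formula><tex-math>k</tex-math></inline-formula><sup>th</sup> order derivative of the sigmoid activation function.</p></sec></sec><sec id="s2"><title>2. General Formulation for Differential Equations</title><p>Let us consider the following general differential equation, which represents both ordinary and partial differential equations (Majidzadeh [<xref ref-type="bibr" rid="scirp.112603-ref7">7</xref>]):</p><p><disp-formula><label>(5)</label><tex-math>G\left( x, \psi(x), \nabla\psi(x), \nabla^2\psi(x), \cdots \right) = 0, \quad \forall\, x \in D,</tex-math></disp-formula></p><p>subject to some initial or boundary conditions, where <inline-formula><tex-math>x = (x_1, x_2, \cdots, x_n) \in \mathbb{R}^n</tex-math></inline-formula>, <inline-formula><tex-math>D \subset \mathbb{R}^n</tex-math></inline-formula> denotes the domain, and <inline-formula><tex-math>\psi(x)</tex-math></inline-formula> is the unknown scalar-valued solution to be computed. Here, <inline-formula><tex-math>G</tex-math></inline-formula> is the function which defines the structure of the differential equation and <inline-formula><tex-math>\nabla</tex-math></inline-formula> is a differential operator. Let <inline-formula><tex-math>\psi_t(x, p)</tex-math></inline-formula> denote the trial solution with parameters (weights, biases) <inline-formula><tex-math>p</tex-math></inline-formula>. Lagaris et al. [<xref ref-type="bibr" rid="scirp.112603-ref8">8</xref>] gave the following general formulation for the solution of the differential Equation (5) using ANN. The trial solution <inline-formula><tex-math>\psi_t(x, p)</tex-math></inline-formula> may be written as the sum of two terms</p><p><disp-formula><label>(6)</label><tex-math>\psi_t(x, p) = A(x) + F\left( x, N(x, p) \right)</tex-math></disp-formula></p><p>where <inline-formula><tex-math>A(x)</tex-math></inline-formula> satisfies the initial or boundary conditions and contains no adjustable parameters, whereas <inline-formula><tex-math>N(x, p)</tex-math></inline-formula> is the output of a feed-forward neural network with parameters <inline-formula><tex-math>p</tex-math></inline-formula> and input data <inline-formula><tex-math>x</tex-math></inline-formula>. 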
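</p><p>As an illustration only (it is not part of the original computations), the network output of Equation (3) and the input derivative of Equation (4) can be checked numerically with the following minimal Python/NumPy sketch. The parameter values, the random seed and the function names <monospace>mlp_output</monospace> and <monospace>mlp_input_gradient</monospace> are hypothetical choices made for this example; the sketch relies on the standard identity that the logistic sigmoid has derivative s(1 − s).</p><preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))


def mlp_output(x, w, u, v):
    """N(x, p) = sum_j v_j * sigmoid(z_j), with z_j = sum_i w[j, i] * x[i] + u[j]."""
    z = w @ x + u
    return float(v @ sigmoid(z))


def mlp_input_gradient(x, w, u, v):
    """dN/dx_i = sum_j v_j * w[j, i] * sigmoid'(z_j), using sigmoid' = s * (1 - s)."""
    s = sigmoid(w @ x + u)
    return (v * s * (1.0 - s)) @ w


# Hypothetical parameters for illustration only: n = 2 inputs, m = 3 hidden units.
rng = np.random.default_rng(0)
w = rng.normal(size=(3, 2))   # input-to-hidden weights w[j, i]
u = rng.normal(size=3)        # hidden biases
v = rng.normal(size=3)        # hidden-to-output weights
x = np.array([0.3, 0.7])

analytic = mlp_input_gradient(x, w, u, v)

# Central finite differences as an independent check of the analytic gradient.
h = 1e-6
numeric = np.array([
    (mlp_output(x + h * e, w, u, v) - mlp_output(x - h * e, w, u, v)) / (2 * h)
    for e in np.eye(2)
])
print(analytic, numeric)
```
]]></preformat><p>The two printed vectors should agree to several decimal places, which confirms that the closed-form derivative can replace numerical differentiation when the error function is evaluated.</p><p>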
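The function <inline-formula><tex-math>F(x, N(x, p))</tex-math></inline-formula> is the operational model of the neural network. A feed-forward neural network (FFNN) thus converts the differential equation problem into a function approximation problem. The neural network <inline-formula><tex-math>N(x, p)</tex-math></inline-formula> is given by</p><p><disp-formula><label>(7)</label><tex-math>N(x, p) = \sum_{j=1}^{m} v_j\, \sigma(z_j), \qquad z_j = \sum_{i=1}^{n} w_{ji}\, x_i + u_j .</tex-math></disp-formula></p><p>Here <inline-formula><tex-math>w_{ji}</tex-math></inline-formula> denotes the weight from input unit <inline-formula><tex-math>i</tex-math></inline-formula> to hidden unit <inline-formula><tex-math>j</tex-math></inline-formula>, <inline-formula><tex-math>v_j</tex-math></inline-formula> denotes the weight from hidden unit <inline-formula><tex-math>j</tex-math></inline-formula> to the output unit, <inline-formula><tex-math>u_j</tex-math></inline-formula> denotes the biases, and <inline-formula><tex-math>\sigma(z_j)</tex-math></inline-formula> is the sigmoid activation function.</p><sec id="s2_1"><title>2.1. Neural Network Training</title><p>The neural network weights determine the closeness of the predicted outcome to the desired outcome. In our procedure, if the computed weights do not yield the correct prediction, only the biases need to be adjusted. The basis function we apply in this work in training the neural network is the sigmoid activation function given by</p><p><disp-formula><label>(8)</label><tex-math>\sigma(z_j) = \left( 1 + e^{-z_j} \right)^{-1} .</tex-math></disp-formula></p></sec><sec id="s2_2"><title>2.2. Neural Network Model for Solving First Order Nonlinear ODE</title><p>Let us consider the first order ordinary differential equation</p><p><disp-formula><label>(9)</label><tex-math>\psi'(x) = f(x, \psi), \quad x \in [a, b]</tex-math></disp-formula></p><p>with initial condition <inline-formula><tex-math>\psi(a) = A</tex-math></inline-formula>. Here we assume the function <inline-formula><tex-math>f</tex-math></inline-formula> is nonlinear in its arguments. The ANN trial solution may be written as</p><p><disp-formula><label>(10)</label><tex-math>\psi_t(x, p) = A + x\, N(x, p),</tex-math></disp-formula></p><p>where <inline-formula><tex-math>N(x, p)</tex-math></inline-formula> is the output of the feed-forward network with a single input <inline-formula><tex-math>x</tex-math></inline-formula> and parameters <inline-formula><tex-math>p</tex-math></inline-formula>. The trial solution <inline-formula><tex-math>\psi_t(x, p)</tex-math></inline-formula> satisfies the initial condition when <inline-formula><tex-math>a = 0</tex-math></inline-formula>. To solve this problem using a neural network (NN), we employ an NN architecture with three layers: one input layer with one neuron, one hidden layer with <inline-formula><tex-math>n</tex-math></inline-formula> neurons, and one output layer with one output unit, as depicted in <xref ref-type="fig" rid="fig1">Figure 1</xref> below.</p><p>Each neuron is connected to the neurons of the previous layer through adaptable synaptic weights <inline-formula><tex-math>w_{1j}</tex-math></inline-formula> and biases <inline-formula><tex-math>u_j</tex-math></inline-formula>. 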
Now, <inline-formula><tex-math>\psi_t(x_i, p) = A + x_i\, N(x_i, p)</tex-math></inline-formula> with</p><p><disp-formula><label>(11)</label><tex-math>N(x, p) = \sum_{j=1}^{n} v_j\, \sigma(x w_j + u_j), \qquad z_j = x w_j + u_j .</tex-math></disp-formula></p><p>It is possible to have multilayer perceptrons with more than three layers, in which case there are more hidden layers [<xref ref-type="bibr" rid="scirp.112603-ref9">9</xref>] [<xref ref-type="bibr" rid="scirp.112603-ref10">10</xref>]. The most important application of multilayer perceptrons is function approximation. The Kolmogorov existence theorem guarantees that a three-layered perceptron with <inline-formula><tex-math>n(2n + 1)</tex-math></inline-formula> nodes can compute any continuous function of <inline-formula><tex-math>n</tex-math></inline-formula> variables [<xref ref-type="bibr" rid="scirp.112603-ref11">11</xref>] [<xref ref-type="bibr" rid="scirp.112603-ref12">12</xref>]. The accuracy of the approximation depends only on the number of neurons in the hidden layer and not on the number of hidden layers [<xref ref-type="bibr" rid="scirp.112603-ref13">13</xref>]. For the purpose of numerical computation, as mentioned previously, the sigmoidal activation function <inline-formula><tex-math>\sigma(\cdot)</tex-math></inline-formula> for the hidden units of our neural network is taken to be</p><p><disp-formula><label>(12)</label><tex-math>\sigma(z) = \left( 1 + e^{-z} \right)^{-1}</tex-math></disp-formula></p><p>with the property that</p><p><disp-formula><label>(13)</label><tex-math>\sigma'(z) = \sigma(z) \left( 1 - \sigma(z) \right) .</tex-math></disp-formula></p><p>Differentiating the trial solution <inline-formula><tex-math>\psi_t(x, p)</tex-math></inline-formula> gives</p><p><disp-formula><label>(14)</label><tex-math>\frac{d \psi_t(x, p)}{dx} = N(x, p) + x\, \frac{d N(x, p)}{dx},</tex-math></disp-formula></p><p>We observe that</p><p><disp-formula><tex-math>\frac{d N(x, p)}{dx} = \sum_{j=1}^{n} v_j\, \frac{d}{dx} \sigma(x w_j + u_j) = \sum_{j=1}^{n} v_j\, w_j\, \sigma'(z_j)</tex-math></disp-formula></p><p><disp-formula><tex-math>\Rightarrow \frac{d N(x, p)}{dx} = \sum_{j=1}^{n} v_j\, w_j\, \sigma(z_j) \left( 1 - \sigma(z_j) \right)</tex-math></disp-formula></p><p>To evaluate the derivative term on the right-hand side of (14), we use Equations (11)–(13).</p><p>The error function for this case is formulated as</p><p><disp-formula><label>(15)</label><tex-math>E(p) = \sum_{i=1}^{n} \left( \frac{d \psi_t(x_i, p)}{d x_i} - f\left( x_i, \psi_t(x_i, p) \right) \right)^2 .</tex-math></disp-formula></p><p>Minimization of the above error function is considered as a procedure for training the neural network, where the error at each input point <inline-formula><tex-math>x_i</tex-math></inline-formula> is the residual <inline-formula><tex-math>d\psi_t/dx - f(x_i, \psi_t)</tex-math></inline-formula>, which has to become zero. In computing this error value, we require the network output as well as the derivatives of the output with respect to the inputs. Therefore, while computing the gradient of the error with respect to the network parameters, we need to compute not only the gradient of the network but also the gradient of the network derivatives with respect to its inputs [<xref ref-type="bibr" rid="scirp.112603-ref14">14</xref>]. This process can be quite tedious computationally, and in this work we avoid it by introducing the novel procedure outlined in this paper.</p></sec></sec><sec id="s3"><title>3. Numerical Example</title><p>The Riccati equation is a nonlinear ordinary differential equation of first order of the form:</p><p><disp-formula><label>(16)</label><tex-math>y'(x) = p(x)\, y + q(x)\, y^2 + r(x)</tex-math></disp-formula></p><p>where <inline-formula><tex-math>p(x), q(x), r(x)</tex-math></inline-formula> are continuous functions of <inline-formula><tex-math>x</tex-math></inline-formula>. The neural network method can also solve this type of ODE. We show how our new approach does so by redefining the neural network according to the form the ODE takes. Specifically, we consider the initial value problem:</p><p><disp-formula><label>(17)</label><tex-math>y'(x) = 2 y(x) - y^2(x) + 1, \quad y(0) = 0, \quad x \in [0, 1],</tex-math></disp-formula></p><p>which was solved by Otadi and Mosleh (2011) [<xref ref-type="bibr" rid="scirp.112603-ref15">15</xref>]. The exact solution is <inline-formula><tex-math>y(x) = \sqrt{2} \tanh\left( x \sqrt{2} \right)</tex-math></inline-formula>.</p><p>The trial solution is given by <inline-formula><tex-math>y_t(x) = A + x\, \aleph(x, p)</tex-math></inline-formula>. Applying the initial condition gives <inline-formula><tex-math>A = 0</tex-math></inline-formula>. Therefore <inline-formula><tex-math>y_t(x) = x\, \aleph(x, p)</tex-math></inline-formula>, which satisfies the given initial condition. We observe that in Equation (17), the term <inline-formula><tex-math>y^2(x)</tex-math></inline-formula> is what makes the ODE nonlinear. Also, this term cannot be separated from <inline-formula><tex-math>2 y(x)</tex-math></inline-formula>. 
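</p><p>For illustration, the residual error of Equation (15) for the initial value problem (17) can be evaluated as in the following Python/NumPy sketch, which uses the plain network of Equation (11) and the derivative of Equation (14). It is a sketch under stated assumptions, not the authors' MathCAD worksheet: the weights and biases are hypothetical, the grid of eleven points on [0, 1] is an arbitrary choice, and the function names are introduced only for this example.</p><preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def sigmoid(z):
    """Logistic sigmoid of Eq. (12)."""
    return 1.0 / (1.0 + np.exp(-z))


def network(x, w, u, v):
    """N(x, p) of Eq. (11) for scalar or 1-D array x; one input, len(w) hidden units."""
    z = np.multiply.outer(x, w) + u
    return sigmoid(z) @ v


def network_dx(x, w, u, v):
    """dN/dx = sum_j v_j w_j s_j (1 - s_j), from the sigmoid property of Eq. (13)."""
    s = sigmoid(np.multiply.outer(x, w) + u)
    return (s * (1.0 - s)) @ (v * w)


def residual_error(xs, w, u, v, f, A=0.0):
    """E(p) of Eq. (15) for the trial solution psi_t = A + x N(x, p)."""
    N = network(xs, w, u, v)
    psi = A + xs * N
    dpsi = N + xs * network_dx(xs, w, u, v)   # Eq. (14)
    return float(np.sum((dpsi - f(xs, psi)) ** 2))


def riccati_rhs(x, y):
    """Right-hand side of the initial value problem (17)."""
    return 2.0 * y - y ** 2 + 1.0


xs = np.linspace(0.0, 1.0, 11)      # collocation points (arbitrary choice)
w = np.array([0.5, -1.0, 1.5])      # hypothetical input-to-hidden weights
u = np.array([0.1, 0.2, -0.3])      # hypothetical biases
v = np.array([1.0, -0.5, 0.8])      # hypothetical hidden-to-output weights

E = residual_error(xs, w, u, v, riccati_rhs)
print(E)
```
]]></preformat><p>Because the derivative of the trial solution is available in closed form, the error is obtained from a single pass over the collocation points; any parameter adjustment then only has to re-evaluate this scalar.</p><p>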
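Therefore, we incorporate <inline-formula><tex-math>2 y(x) - y^2(x)</tex-math></inline-formula> into the neural network to take care of the nonlinearity in the given differential equation. Thus, the new neural network becomes</p><p><disp-formula><label>(18)</label><tex-math>\aleph(x, p) = \sum_{j=1}^{m} v_j \left[ 2 \sigma(z_j) - \sigma^2(z_j) \right] = \sum_{j=1}^{m} v_j\, \sigma(z_j) \left[ 2 - \sigma(z_j) \right]</tex-math></disp-formula></p><p>The error to be minimized is</p><p><disp-formula><label>(19)</label><tex-math>E = \frac{1}{2} \sum_{i=1}^{n} \left\{ \frac{d}{dx} y_t(x_i, p) - \left[ 2 y_t(x_i, p) - y_t^2(x_i, p) + 1 \right] \right\}^2</tex-math></disp-formula></p><p>where <inline-formula><tex-math>\{ x_i,\ i = 1, \cdots, n \}</tex-math></inline-formula> is the set of discrete points in the interval <inline-formula><tex-math>[0, 1]</tex-math></inline-formula>. We proceed as follows.</p><p>To compute the weights <inline-formula><tex-math>w_j</tex-math></inline-formula>, <inline-formula><tex-math>j = 1, 2, 3</tex-math></inline-formula>, from the input layer to the hidden layer (<xref ref-type="fig" rid="fig1">Figure 1</xref>), we construct a function <inline-formula><tex-math>\vartheta(x)</tex-math></inline-formula> such that <inline-formula><tex-math>w = \phi^{-1} f</tex-math></inline-formula>, where <inline-formula><tex-math>f</tex-math></inline-formula> and <inline-formula><tex-math>\phi</tex-math></inline-formula> are defined as follows. In particular, for <inline-formula><tex-math>x = (x_1, x_2, x_3)</tex-math></inline-formula>, <inline-formula><tex-math>f(x) = (\vartheta(x_1), \vartheta(x_2), \vartheta(x_3))^T</tex-math></inline-formula>. Here <inline-formula><tex-math>N = 3</tex-math></inline-formula> and the solution <inline-formula><tex-math>w = \phi^{-1} f</tex-math></inline-formula> is given by</p><p><disp-formula><label>(20)</label><tex-math><![CDATA[\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} \varphi_1(x_1) & \varphi_2(x_1) & \varphi_3(x_1) \\ \varphi_1(x_2) & \varphi_2(x_2) & \varphi_3(x_2) \\ \varphi_1(x_3) & \varphi_2(x_3) & \varphi_3(x_3) \end{bmatrix}^{-1} \begin{bmatrix} f_1 \\ f_2 \\ f_3 \end{bmatrix}]]></tex-math></disp-formula></p><p>Here</p><p><disp-formula><label>(21)</label><tex-math>\varphi_i(x) = \exp\left( - \frac{|x - x_i|^2}{2 \sigma^2} \right), \qquad \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2, \qquad \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i</tex-math></disp-formula></p><p>The above is the so-called Gaussian Radial Basis Function (GRBF) approximation model. To obtain the weights <inline-formula><tex-math>v_j</tex-math></inline-formula>, <inline-formula><tex-math>j = 1, 2, 3</tex-math></inline-formula>, from the hidden layer to the output layer, we construct another function <inline-formula><tex-math>\theta(x)</tex-math></inline-formula> such that <inline-formula><tex-math>v = \phi^{-1} f</tex-math></inline-formula>, where <inline-formula><tex-math>f(x) = (\theta(x_1), \theta(x_2), \theta(x_3))^T</tex-math></inline-formula>, <inline-formula><tex-math>x = (x_1, x_2, x_3)</tex-math></inline-formula> and <inline-formula><tex-math>\phi</tex-math></inline-formula> is given in Equation (20). We only need to replace the <inline-formula><tex-math>w_j</tex-math></inline-formula>’s by the <inline-formula><tex-math>v_j</tex-math></inline-formula>’s, <inline-formula><tex-math>j = 1, 2, 3</tex-math></inline-formula>.</p><p>The exact form of <inline-formula><tex-math>f(x)</tex-math></inline-formula> depends on the nature of the given differential equation, as made clear below. The nonlinear differential Equation (17) is rewritten as <inline-formula><tex-math>y'(x) - 2 y(x) + y^2(x) = 1</tex-math></inline-formula>.</p><p>We now form a linear function based on the default sign of the differential equation, i.e. <inline-formula><tex-math>\vartheta(x) = a x - b</tex-math></inline-formula>, where <inline-formula><tex-math>a</tex-math></inline-formula> is the coefficient of the derivative of <inline-formula><tex-math>y</tex-math></inline-formula> and <inline-formula><tex-math>b</tex-math></inline-formula> is the coefficient of <inline-formula><tex-math>y</tex-math></inline-formula> (i.e. <inline-formula><tex-math>a = 1</tex-math></inline-formula>, <inline-formula><tex-math>b = -2</tex-math></inline-formula>). Thus</p><p><disp-formula><tex-math>\vartheta(x) = x + 2, \qquad f(x) = (\vartheta(x_1), \vartheta(x_2), \vartheta(x_3))^T = (2.1, 2.2, 2.3)^T \quad \text{for } x = (0.1, 0.2, 0.3)^T .</tex-math></disp-formula></p><p>This we apply to get the weights from the input layer to the hidden layer. Thus <inline-formula><tex-math>f = (2.1, 2.2, 2.3)^T</tex-math></inline-formula> and <inline-formula><tex-math>w = \phi^{-1} f</tex-math></inline-formula>:</p><p><disp-formula><label>(22)</label><tex-math><![CDATA[\Rightarrow \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 1 & 0.94 & 0.78 \\ 0.94 & 1 & 0.94 \\ 0.78 & 0.94 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 2.1 \\ 2.2 \\ 2.3 \end{bmatrix}]]></tex-math></disp-formula></p><p>Hence, the weights from the input layer to the hidden layer are</p><p><disp-formula><label>(23)</label><tex-math><![CDATA[\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 41.335 & -73.437 & 36.79 \\ -73.437 & 139.062 & -73.437 \\ 36.79 & -73.437 & 41.335 \end{bmatrix} \begin{bmatrix} 2.1 \\ 2.2 \\ 2.3 \end{bmatrix}, \qquad \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 9.858 \\ -17.187 \\ 10.767 \end{bmatrix}]]></tex-math></disp-formula></p><p>The weights from the input layer to the hidden layer are: <inline-formula><tex-math>w_1 = 9.858</tex-math></inline-formula>, <inline-formula><tex-math>w_2 = -17.187</tex-math></inline-formula>, <inline-formula><tex-math>w_3 = 10.767</tex-math></inline-formula>.</p><p>In order to get the weights from the hidden layer to the output layer, we now apply the forcing function, which in this case is the constant function <inline-formula><tex-math>\theta(x) = 1</tex-math></inline-formula>.</p><p><disp-formula><label>(24)</label><tex-math>\Rightarrow \hat{f} = (\theta(x_1), \theta(x_2), \theta(x_3))^T = (1, 1, 1)^T</tex-math></disp-formula></p><p><inline-formula><tex-math>\theta(x)</tex-math></inline-formula> being the nonhomogeneous term. With <inline-formula><tex-math>v = \phi^{-1} \hat{f}</tex-math></inline-formula>, the weights from the hidden layer to the output layer are given by</p><p><disp-formula><tex-math><![CDATA[\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 1 & 0.94 & 0.78 \\ 0.94 & 1 & 0.94 \\ 0.78 & 0.94 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 41.335 & -73.437 & 36.79 \\ -73.437 & 139.062 & -73.437 \\ 36.79 & -73.437 & 41.335 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}]]></tex-math></disp-formula></p><p><disp-formula><label>(25)</label><tex-math><![CDATA[\Rightarrow \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 4.687 \\ -7.812 \\ 4.687 \end{bmatrix}]]></tex-math></disp-formula></p><p>Thus the weights from the hidden layer to the output layer are: <inline-formula><tex-math>v_1 = 4.687</tex-math></inline-formula>, <inline-formula><tex-math>v_2 = -7.812</tex-math></inline-formula>, <inline-formula><tex-math>v_3 = 4.687</tex-math></inline-formula>.</p><p>The biases are restricted to lie between −20 and 20. 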
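</p><p>The computation of Equations (20)–(25) can be sketched in a few lines of Python with NumPy; this is an illustrative reconstruction, not the authors' MathCAD worksheet. Two assumptions are made and flagged in the comments: the Gaussian width in the exponent is taken as 2σ (with σ the standard deviation of the three points), because this choice reproduces the printed matrix entries 0.94 and 0.78, and the matrix is rounded to two decimals as printed in Equation (22). The function names <monospace>grbf_matrix</monospace> and <monospace>aleph</monospace> are hypothetical. The last lines evaluate the network of Equation (18) at x = 1 with the rounded weights and the biases quoted in the MathCAD listing that follows, and compare the result with the closed form whose values match the exact row of Table 1.</p><preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def grbf_matrix(x, width):
    """Phi[k, i] = exp(-|x_k - x_i|^2 / width), the Gaussian RBF matrix of Eq. (20)."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / width)


def sigmoid(z):
    """Logistic sigmoid of Eq. (12)."""
    return 1.0 / (1.0 + np.exp(-z))


def aleph(x, w, u, v):
    """Modified network of Eq. (18): sum_j v_j s_j (2 - s_j), with s_j = sigmoid(w_j x + u_j)."""
    s = sigmoid(w * x + u)
    return float(np.sum(v * s * (2.0 - s)))


x = np.array([0.1, 0.2, 0.3])
sigma = np.sqrt(np.mean((x - x.mean()) ** 2))

# Assumption: width = 2*sigma reproduces the printed entries 0.94 and 0.78;
# rounding to two decimals mirrors the matrix shown in Eq. (22).
Phi = np.round(grbf_matrix(x, 2.0 * sigma), 2)

f = x + 2.0                        # vartheta(x) = x + 2  ->  (2.1, 2.2, 2.3)
f_hat = np.ones(3)                 # theta(x) = 1
w = np.linalg.solve(Phi, f)        # input-to-hidden weights, Eq. (23)
v = np.linalg.solve(Phi, f_hat)    # hidden-to-output weights, Eq. (25)
print(np.round(w, 3), np.round(v, 3))

# Forward pass at x = 1 with the rounded weights and the biases from the listing.
w_r = np.array([9.858, -17.187, 10.767])
v_r = np.array([4.687, -7.812, 4.687])
u = np.array([-20.0, 10.0, -12.534])
y_p = 1.0 * aleph(1.0, w_r, u, v_r)                  # trial solution y_t = x * aleph
y_d = np.sqrt(2.0) * np.tanh(np.sqrt(2.0) * 1.0)     # closed form matching Table 1
print(y_p, y_d)
```
]]></preformat><p>Solving the linear systems directly with <monospace>numpy.linalg.solve</monospace> avoids forming the explicit inverse, which is preferable here because the GRBF matrix is nearly singular (its determinant is about 0.0028). In double precision the predicted value agrees with the one in the listing to about three decimal places.</p><p>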
We now train the network with the available parameters using our MathCAD 14 [<xref ref-type="bibr" rid="scirp.112603-ref16">16</xref>] algorithm (computer output) as follows:</p><p><disp-formula><tex-math>\begin{array}{l} w_1 := 9.858 \quad w_2 := -17.187 \quad w_3 := 10.767 \quad x := 1 \\ v_1 := 4.687 \quad v_2 := -7.812 \quad v_3 := 4.687 \\ u_1 := -20 \quad u_2 := 10 \quad u_3 := -12.534 \\ z_1 := w_1 \cdot x + u_1 = -10.142 \quad z_2 := w_2 \cdot x + u_2 = -7.187 \quad z_3 := w_3 \cdot x + u_3 = -1.767 \\ \sigma(z_1) := [1 + \exp(-z_1)]^{-1} = 3.9388 \times 10^{-5} \\ \sigma(z_2) := [1 + \exp(-z_2)]^{-1} = 7.5578 \times 10^{-4} \\ \sigma(z_3) := [1 + \exp(-z_3)]^{-1} = 0.1459 \\ \aleph := v_1 \cdot \sigma(z_1) \cdot (2 - \sigma(z_1)) + v_2 \cdot \sigma(z_2) \cdot (2 - \sigma(z_2)) + v_3 \cdot \sigma(z_3) \cdot (2 - \sigma(z_3)) = 1.256457 \\ y_p(x) := x \cdot \aleph = 1.256457, \quad y_d(x) := \sqrt{2} \cdot \tanh(x \cdot \sqrt{2}) = 1.256367 \\ E := 0.5 \cdot (y_d(x) - y_p(x))^2 = 4.05 \times 10^{-9} \end{array}</tex-math></disp-formula></p><p>The plots of the exact and predicted values in <xref ref-type="table" rid="table1">Table 1</xref> are depicted in <xref ref-type="fig" rid="fig2">Figure 2</xref> below.</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> Comparison of the exact and predicted solutions of Equation (17)</title></caption><table><thead><tr><th align="center" valign="middle" >Input data (x)</th><th align="center" valign="middle" >0</th><th align="center" valign="middle" >0.1</th><th align="center" valign="middle" >0.2</th><th align="center" valign="middle" >0.3</th><th align="center" valign="middle" >0.4</th><th align="center" valign="middle" >0.5</th><th align="center" valign="middle" >0.6</th><th align="center" valign="middle" >0.7</th><th align="center" valign="middle" >0.8</th><th align="center" valign="middle" >0.9</th><th align="center" valign="middle" >1</th></tr></thead><tbody><tr><td align="center" valign="middle" >y exact</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0.19868</td><td align="center" valign="middle" >0.38967</td><td align="center" valign="middle" >0.56642</td><td align="center" valign="middle" >0.72434</td><td align="center" valign="middle" >0.86106</td><td align="center" valign="middle" >0.97623</td><td align="center" valign="middle" >1.07104</td><td align="center" valign="middle" >1.14761</td><td align="center" valign="middle" >1.20852</td><td align="center" valign="middle" >1.25637</td></tr><tr><td align="center" valign="middle" >y predicted</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0.19867</td><td align="center" valign="middle" >0.38967</td><td align="center" valign="middle" >0.56642</td><td align="center" valign="middle" >0.72433</td><td align="center" valign="middle" >0.86106</td><td align="center" valign="middle" >0.97622</td><td align="center" valign="middle" >1.07103</td><td align="center" valign="middle" >1.14764</td><td align="center" valign="middle" >1.20849</td><td align="center" valign="middle" >1.25639</td></tr></tbody></table></table-wrap><p>Example 2</p><p>We consider the initial value problem:</p><p><disp-formula><label>(26)</label><tex-math>x^2 y' + x^2 y^2 = 2, \quad y\left( \tfrac{1}{2} \right) = 0, \quad x \in (0, 1]</tex-math></disp-formula></p><p>The exact solution is easily computed as <inline-formula><tex-math>y(x) = (8 x^3 - 1)(x + 4 x^4)^{-1}</tex-math></inline-formula>.</p><p>Our trial solution for the given problem is <inline-formula><tex-math>y_t(x) = A + x\, \aleph(x, p)</tex-math></inline-formula>. Applying the initial condition gives</p><p><disp-formula><label>(27)</label><tex-math>A = -\tfrac{1}{2}\, \aleph\left( \tfrac{1}{2}, p \right), \quad \text{so that} \quad y_t(x) = -\tfrac{1}{2}\, \aleph\left( \tfrac{1}{2}, p \right) + x\, \aleph(x, p)</tex-math></disp-formula></p><p>In Equation (26), the nonlinear term <inline-formula><tex-math>y^2(x)</tex-math></inline-formula> stands alone in the ODE (i.e., after dividing through by <inline-formula><tex-math>x^2</tex-math></inline-formula>). Therefore, our neural network for this problem takes the form:</p><p><disp-formula><label>(28)</label><tex-math>\aleph(x, p) = \sum_{j=1}^{3} v_j\, \sigma^2(z_j) = \sum_{j=1}^{3} v_j\, \sigma(z_j) \left[ \sigma(z_j) \right]</tex-math></disp-formula></p><p>We form an algebraic equation of degree one with the default sign of the ODE. Thus <inline-formula><tex-math>\vartheta(x) = a x + b</tex-math></inline-formula> (<inline-formula><tex-math>a = x^2</tex-math></inline-formula>, <inline-formula><tex-math>b = 0</tex-math></inline-formula>). Hence <inline-formula><tex-math>\vartheta(x) = x^3 \Rightarrow f(x) = (0.001, 0.008, 0.027)^T</tex-math></inline-formula> for <inline-formula><tex-math>x = (0.1, 0.2, 0.3)^T</tex-math></inline-formula>.</p><p>This we apply to get the weights from the input layer to the hidden layer. We employ the GRBF here for the weights <inline-formula><tex-math>w = \phi^{-1} f</tex-math></inline-formula>. Hence</p><p><disp-formula><label>(29)</label><tex-math><![CDATA[\begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 1 & 0.94 & 0.78 \\ 0.94 & 1 & 0.94 \\ 0.78 & 0.94 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0.001 \\ 0.008 \\ 0.027 \end{bmatrix} \Rightarrow \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} 0.447 \\ -0.944 \\ 0.565 \end{bmatrix}]]></tex-math></disp-formula></p><p>The weights from the input layer to the hidden layer are: <inline-formula><tex-math>w_1 = 0.447</tex-math></inline-formula>, <inline-formula><tex-math>w_2 = -0.944</tex-math></inline-formula>, <inline-formula><tex-math>w_3 = 0.565</tex-math></inline-formula>.</p><p>We now use the forcing function, a constant function in this case, to get the weights from the hidden layer to the output layer. That is, <inline-formula><tex-math>\theta(x) = 2 \Rightarrow \hat{f}(x) = (2, 2, 2)^T</tex-math></inline-formula> for <inline-formula><tex-math>x = (0.1, 0.2, 0.3)^T</tex-math></inline-formula>. Hence, the weights <inline-formula><tex-math>v = \phi^{-1} \hat{f}</tex-math></inline-formula> from the hidden layer to the output layer are</p><p><disp-formula><label>(30)</label><tex-math><![CDATA[\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 1 & 0.94 & 0.78 \\ 0.94 & 1 & 0.94 \\ 0.78 & 0.94 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix} \Rightarrow \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = \begin{bmatrix} 9.375 \\ -15.625 \\ 9.375 \end{bmatrix}]]></tex-math></disp-formula></p><p>The weights from the hidden layer to the output layer are: <inline-formula><tex-math>v_1 = 9.375</tex-math></inline-formula>, <inline-formula><tex-math>v_2 = -15.625</tex-math></inline-formula>, <inline-formula><tex-math>v_3 = 9.375</tex-math></inline-formula>.</p><p>The biases are restricted to lie between −10 and 10. 
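</p><p>As a cross-check of the quantities quoted for the initial value problem (26), the following Python/NumPy sketch (an illustration under stated assumptions, not the authors' MathCAD worksheet) verifies by central differences that the quoted closed form satisfies Equation (26), evaluates it on the grid 0.1, 0.2, …, 1.1 used in Table 2, and recomputes the weights of Equations (29) and (30) from the rounded GRBF matrix printed in the text. The step size and the function name <monospace>exact</monospace> are choices made only for this example.</p><preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def exact(x):
    """Closed form quoted for the initial value problem (26)."""
    return (8.0 * x**3 - 1.0) / (x + 4.0 * x**4)


# Residual of x^2 y' + x^2 y^2 - 2, with y' approximated by central differences.
xs = np.linspace(0.1, 1.1, 11)
h = 1e-6
dy = (exact(xs + h) - exact(xs - h)) / (2.0 * h)
residual = xs**2 * dy + xs**2 * exact(xs) ** 2 - 2.0
print(np.max(np.abs(residual)), exact(0.5))

# Weights of Eqs. (29) and (30) from the rounded GRBF matrix printed in the text.
Phi = np.array([[1.00, 0.94, 0.78],
                [0.94, 1.00, 0.94],
                [0.78, 0.94, 1.00]])
x3 = np.array([0.1, 0.2, 0.3])
w = np.linalg.solve(Phi, x3**3)              # f = (0.001, 0.008, 0.027)
v = np.linalg.solve(Phi, np.full(3, 2.0))    # f_hat = (2, 2, 2)
print(np.round(w, 3), np.round(v, 3))
```
]]></preformat><p>The residual stays at the level of the finite-difference error, and the initial condition at x = 1/2 is met exactly because the numerator 8x³ − 1 vanishes there.</p><p>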
We now train the network with the available parameters using our MathCAD 14 algorithm as follows:</p><p><disp-formula><tex-math>\begin{array}{l} w_1 := 1.234 \quad w_2 := -2.725 \quad w_3 := 1.716 \quad x := 1 \\ v_1 := 9.375 \quad v_2 := -15.625 \quad v_3 := 9.375 \\ u_1 := -7 \quad u_2 := -4 \quad u_3 := -7 \\ z_1 := w_1 \cdot x + u_1 = -5.766 \quad z_2 := w_2 \cdot x + u_2 = -6.725 \quad z_3 := w_3 \cdot x + u_3 = -5.284 \end{array}</tex-math></disp-formula></p><p><disp-formula><tex-math>\begin{array}{l} \sigma(z_1) := [1 + \exp(z_1)]^{-1} = 0.998, \quad \sigma(z_2) := [1 + \exp(z_2)]^{-1} = 0.995, \quad \sigma(z_3) := [1 + \exp(z_3)]^{-1} = 0.998 \\ \aleph(0.5) := v_1 \cdot \sigma(0.5 \cdot w_1 + u_1)^2 + v_2 \cdot \sigma(0.5 \cdot w_2 + u_2)^2 + v_3 \cdot \sigma(0.5 \cdot w_3 + u_3)^2 = 3.199 \\ \aleph := v_1 \cdot \sigma(z_1)^2 + v_2 \cdot \sigma(z_2)^2 + v_3 \cdot \sigma(z_3)^2 = 3.01 \\ y_p(x) := -0.5 \cdot \aleph(0.5) + x \cdot \aleph = 1.41, \quad y_d(x) := (8 \cdot x^3 - 1)(x + 4 \cdot x^4)^{-1} = 1.4 \\ E := 0.5 \cdot (y_d(x) - y_p(x))^2 = 5 \times 10^{-5} \end{array}</tex-math></disp-formula></p><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> Comparison of the exact and predicted solutions of Equation (26)</title></caption><table><thead><tr><th align="center" valign="middle" >Input data (x)</th><th align="center" valign="middle" >0.1</th><th align="center" valign="middle" >0.2</th><th align="center" valign="middle" >0.3</th><th align="center" valign="middle" >0.4</th><th align="center" valign="middle" >0.5</th><th align="center" valign="middle" >0.6</th><th align="center" valign="middle" >0.7</th><th align="center" valign="middle" >0.8</th><th align="center" valign="middle" >0.9</th><th align="center" valign="middle" >1</th><th align="center" valign="middle" >1.1</th></tr></thead><tbody><tr><td align="center" valign="middle" >y exact</td><td align="center" valign="middle" >−9.881</td><td align="center" valign="middle" >−4.535</td><td align="center" valign="middle" >−2.359</td><td align="center" valign="middle" >−0.971</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >0.651</td><td align="center" valign="middle" >1.050</td><td align="center" valign="middle" >1.269</td><td align="center" valign="middle" >1.371</td><td align="center" valign="middle" >1.4</td><td align="center" valign="middle" >1.3869</td></tr><tr><td align="center" valign="middle" >y predicted</td><td align="center" valign="middle" >−1.245</td><td align="center" valign="middle" >−0.953</td><td align="center" valign="middle" >−0.66</td><td align="center" valign="middle" >−0.368</td><td align="center" valign="middle" >−0.075</td><td align="center" valign="middle" >0.218</td><td align="center" valign="middle" >0.51</td><td align="center" valign="middle" >0.803</td><td align="center" valign="middle" >1.095</td><td align="center" valign="middle" >1.388</td><td align="center" valign="middle" >1.68</td></tr></tbody></table></table-wrap><p>The plots of the exact and predicted values in <xref ref-type="table" rid="table2">Table 2</xref> are depicted in <xref ref-type="fig" rid="fig3">Figure 3</xref>.</p></sec><sec id="s4"><title>4. Conclusion</title><p>A novel neural network approach was developed recently by Okereke for solving first and second order linear ordinary differential equations. In this article, the procedure is extended to investigate neural network solutions to nonlinear differential equations of Riccati type. Specifically, we employ a feed-forward Multilayer Perceptron Neural Network (MLPNN), but avoid the standard back-propagation algorithm for updating the intrinsic weights. This greatly reduces the computational complexity of the given problem. Our objective is to minimize an error that is a function of the network parameters, i.e., the weights and biases. Once the weights of the neural network are obtained by our systematic procedure, we need not adjust all the parameters in the network, as postulated by many researchers before us, in order to achieve convergence. 
We only need to fine-tune the biases, which are restricted to a given interval, and convergence to a solution with an acceptably small error is achieved. For the first example, an ODE of Riccati type, the procedure agrees with the exact solution to about four decimal places. The second example, however, yields only an acceptable approximation to the exact solution. This demonstrates the function approximation capabilities of ANNs in the solution of nonlinear differential equations of Riccati type. The method still requires some refinement so that it can be generalized to solve any type of nonlinear differential equation, including partial differential equations.</p></sec><sec id="s5"><title>Conflicts of Interest</title><p>The authors declare no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s6"><title>Cite this paper</title><p>Okereke, R.N. and Maliki, O.S. (2021) Solving Riccati-Type Nonlinear Differential Equations with Novel Artificial Neural Networks. Applied Mathematics, 12, 919-930. https://doi.org/10.4236/am.2021.1210060</p></sec></body><back><ref-list><title>References</title><ref id="scirp.112603-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Polyanin, A.D. and Zaitsev, V.F. (2003) Handbook of Exact Solutions for Ordinary Differential Equations. 2nd Edition, Chapman &amp; Hall/CRC, Boca Raton.</mixed-citation></ref><ref id="scirp.112603-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Okereke, R.N. (2019) A New Perspective to the Solution of Ordinary Differential Equations Using Artificial Neural Networks. Ph.D Dissertation, Mathematics Department, Michael Okpara University of Agriculture, Umudike.</mixed-citation></ref><ref id="scirp.112603-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Cybenko, G. (1989) Approximation by Superpositions of a Sigmoidal Function. 
Mathematics of Control, Signals and Systems, 2, 303-314. https://doi.org/10.1007/BF02551274</mixed-citation></ref><ref id="scirp.112603-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Hornik, K., Stinchcombe, M. and White, H. (1989) Multilayer Feedforward Networks Are Universal Approximators. Neural Networks, 2, 359-366. https://doi.org/10.1016/0893-6080(89)90020-8</mixed-citation></ref><ref id="scirp.112603-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Graupe, D. (2007) Principles of Artificial Neural Networks. Vol. 6, 2nd Edition, World Scientific Publishing Co. Pte. Ltd., Singapore.</mixed-citation></ref><ref id="scirp.112603-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Rumelhart, D.E. and McClelland, J.L. (1986) Parallel Distributed Processing, Explorations in the Microstructure of Cognition I and II. MIT Press, Cambridge. https://doi.org/10.7551/mitpress/5236.001.0001</mixed-citation></ref><ref id="scirp.112603-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Majidzadeh, K. (2011) Inverse Problem with Respect to Domain and Artificial Neural Network Algorithm for the Solution. Mathematical Problems in Engineering, 2011, Article ID: 145608, 16 p. https://doi.org/10.1155/2011/145608</mixed-citation></ref><ref id="scirp.112603-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Lagaris, I.E., Likas, A.C. and Fotiadis, D.I. (1997) Artificial Neural Networks for Solving Ordinary and Partial Differential Equations. arXiv: physics/9705023v1.</mixed-citation></ref><ref id="scirp.112603-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Chen, R.T.Q., Rubanova, Y., Bettencourt, J. and Duvenaud, D. (2018) Neural Ordinary Differential Equations. 
arXiv: 1806.07366v1.</mixed-citation></ref><ref id="scirp.112603-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Mall, S. and Chakraverty, S. (2013) Comparison of Artificial Neural Network Architecture in Solving Ordinary Differential Equations. Advances in Artificial Neural Systems, 2013, Article ID: 181895. https://doi.org/10.1155/2013/181895</mixed-citation></ref><ref id="scirp.112603-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Gurney, K. (1997) An Introduction to Neural Networks. UCL Press, London.</mixed-citation></ref><ref id="scirp.112603-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Samath, J.A., Kumar, P.S. and Begum, A. (2010) Solution of Linear Electrical Circuit Problem Using Neural Networks. International Journal of Computer Applications, 2, 6-13. https://doi.org/10.5120/618-869</mixed-citation></ref><ref id="scirp.112603-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Werbos, P.J. (1974) Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. Thesis, Harvard University, Cambridge.</mixed-citation></ref><ref id="scirp.112603-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Manoj, K. and Yadav, N. (2011) Multilayer Perceptrons and Radial Basis Function Neural Network Methods for the Solution of Differential Equations, A Survey. Computers and Mathematics with Applications, 62, 3796-3811. https://doi.org/10.1016/j.camwa.2011.09.028</mixed-citation></ref><ref id="scirp.112603-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Otadi, M. and Mosleh, M. (2011) Numerical Solution of Quadratic Riccati Differential Equations by Neural Network. 
Mathematical Sciences, 5, 249-257.</mixed-citation></ref><ref id="scirp.112603-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">PTC (Parametric Technology Corporation) (2007) Mathcad Version 14. http://communications@ptc.com</mixed-citation></ref></ref-list></back></article>