<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">OALibJ</journal-id><journal-title-group><journal-title>Open Access Library Journal</journal-title></journal-title-group><issn pub-type="epub">2333-9705</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/oalib.1106799</article-id><article-id pub-id-type="publisher-id">OALibJ-103104</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Biomedical&amp;Life Sciences</subject><subject> Business&amp;Economics</subject><subject> Chemistry&amp;Materials Science</subject><subject> Computer Science&amp;Communications</subject><subject> Earth&amp;Environmental Sciences</subject><subject> Engineering</subject><subject> Medicine&amp;Healthcare</subject><subject> Physics&amp;Mathematics</subject><subject> Social Sciences&amp;Humanities</subject></subj-group></article-categories><title-group><article-title>
 
 
  Parallel Self-Timed Adder with Lookahead-Carry Generator
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Mohammad</surname><given-names>Ashfak Habib</given-names></name><xref ref-type="aff" rid="aff1"><sub>1</sub></xref></contrib></contrib-group><aff id="aff1"><label>1</label><addr-line>Department of Computer Science and Engineering, Chittagong University of Engineering and Technology, Chittagong, Bangladesh</addr-line></aff><pub-date pub-type="epub"><day>01</day><month>09</month><year>2020</year></pub-date><volume>07</volume><issue>09</issue><fpage>1</fpage><lpage>11</lpage><history><date date-type="received"><day>7</day><month>September</month><year>2020</year></date><date date-type="rev-recd"><day>20</day><month>September</month><year>2020</year></date><date date-type="accepted"><day>23</day><month>September</month><year>2020</year></date></history><permissions><copyright-statement>&#169; Copyright 2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  
    The parallel self-timed adder (PASTA) is a recently introduced asynchronous adder. It shows appreciable average-case performance without any special speed-up circuitry or lookahead scheme, but its worst-case performance is close to that of a ripple carry adder. It is therefore important to find a technique that improves its worst-case performance without significantly compromising its other performance characteristics. This paper investigates whether the basic architecture of PASTA can be improved in this way by changing its carry propagation scheme. The existing ripple-fashioned carry propagation scheme is replaced by four different lookahead-carry generators, which gives four different implementations of PASTA. The carry propagation delays of the proposed implementations are compared with that of the basic implementation of PASTA. The proposed implementations show better worst-case performance: the improvement ranges from 45.25% to 61.09%. The proposed designs are regular and do not have any practical fan-in or fan-out limitations. Simulation-based results validate the practicality as well as the superiority of the proposed architecture over the existing architecture of PASTA. 
  
 
</p></abstract><kwd-group><kwd>Arithmetic Circuit</kwd><kwd>Binary Adder</kwd><kwd>Asynchronous Circuit</kwd><kwd>Self-Timed Adder</kwd><kwd>Parallel Prefix Adder</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Statistical analysis shows that, in a prototypical RISC machine, 72 percent of the instructions perform addition (or subtraction) in the datapath [<xref ref-type="bibr" rid="scirp.103104-ref1">1</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref2">2</xref>]. The figure is reported to reach 80 percent in ARM processors [<xref ref-type="bibr" rid="scirp.103104-ref3">3</xref>]. Binary addition is therefore one of the most common arithmetic operations that a computer processor performs. According to the hardware design principle "make the common case fast" [<xref ref-type="bibr" rid="scirp.103104-ref4">4</xref>], faster addition hardware can yield a faster processor. Designing a faster binary adder is therefore an interesting research issue.</p><p>Researchers have been investigating the addition operation since the beginning of modern computing [<xref ref-type="bibr" rid="scirp.103104-ref5">5</xref>], and the work continues. Recently, Rahman et al. [<xref ref-type="bibr" rid="scirp.103104-ref6">6</xref>] discussed the architecture and performance of a new adder, PASTA, which uses a recursive process to generate the final result. Although PASTA is an asynchronous adder with ripple-fashioned carry propagation, its performance has been compared with that of various well-regarded synchronous and asynchronous adders and shown to be competitive in all respects [<xref ref-type="bibr" rid="scirp.103104-ref6">6</xref>]. 
The notable achievements of PASTA are a simple design (equivalent to a ripple carry adder in area and interconnection), logarithmic average-time performance, and a highly practical and efficient completion detection unit.</p><p>PASTA is a self-timed adder with a completion detection mechanism. An adder that can announce the completion of its operation can take advantage of the shorter average-case propagation delay and, in turn, exhibit average-case performance [<xref ref-type="bibr" rid="scirp.103104-ref7">7</xref>]. Some recent studies [<xref ref-type="bibr" rid="scirp.103104-ref8">8</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref9">9</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref10">10</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref11">11</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref12">12</xref>] further investigated the architecture of PASTA, but none of them changed its carry propagation technique. Although PASTA has appreciable average-case performance, improving its worst-case performance would make it more widely acceptable. This paper presents four enhanced architectures of PASTA with a modified carry propagation technique. The modification roughly halves the worst-case delay without seriously degrading the other performance characteristics. The proposed implementations are appropriate not only for asynchronous systems but also for Globally Asynchronous Locally Synchronous (GALS) systems [<xref ref-type="bibr" rid="scirp.103104-ref13">13</xref>], because of their improved worst-case performance and efficient completion detection mechanism. For ease of explanation, the term Enhanced Parallel Self-Timed Adder (EPASTA) is used for the proposed adder circuits.</p></sec><sec id="s2"><title>2. Methods</title><p>A total of nine different 16-bit adders are implemented. Five of them are existing adders and the remaining four are the four varieties of the proposed EPASTA. 
The carry propagation delays of the proposed adders are compared with the delays of the existing adders.</p><sec id="s2_1"><title>2.1. Existing Adders</title><p>The basic architecture of an n-bit PASTA is adopted from [<xref ref-type="bibr" rid="scirp.103104-ref6">6</xref>] and is illustrated in <xref ref-type="fig" rid="fig1">Figure 1</xref>. Examining the architecture of PASTA shows that its carry propagation mechanism is similar to that of the basic ripple carry adder. Its operation is divided into two phases, the Initial Phase and the Recursive Phase. The common selector (SEL) drives all the 2 &#215; 1 multiplexers and thereby determines the phase: SEL = 0 in the initial phase and SEL = 1 in the recursive phase. In the initial phase, the multiplexer in the i<sup>th</sup> bit position passes a<sub>i</sub> and b<sub>i</sub> to the corresponding half adder, which produces the initial values of the sum (S<sub>i</sub>) and the output carry (C<sub>i+1</sub>) bits. In the recursive phase, the feedback path allows the initial sum to be added to the input carry C<sub>i</sub> recursively. The values of the sum and the output carry bits are recalculated in every recursion cycle. The recursion terminates when the stopping criterion is met. The completion detection unit then asserts the Terminate signal to indicate the completion of the operation.</p><p>The lookahead-carry generation techniques of four well-known tree-like parallel synchronous adders are used in the EPASTA. The chosen parallel synchronous adders are the Block Carry Lookahead Adder (BCLA), the Kogge-Stone Adder (KSA), the Brent-Kung Adder (BKA) and Sklansky’s Conditional Sum Adder (SCSA). These adders are well known and have good worst-case performance. 
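</p><p>The two-phase operation of PASTA described earlier in this subsection can be illustrated with a minimal behavioural sketch in Python. It is not the circuit of [<xref ref-type="bibr" rid="scirp.103104-ref6">6</xref>]: the function name pasta_add, the use of Python integers as bit vectors, and the stopping criterion (all carry bits equal to zero) are assumptions made purely for illustration. The sketch also counts recursion cycles, which makes it visible why a long carry chain leads to a worst case close to that of a ripple carry adder.</p><preformat preformat-type="code"><![CDATA[
```python
def pasta_add(a, b, n=32):
    """Behavioural sketch of an n-bit PASTA addition (illustrative only).

    Returns (result, cycles): result includes the carry-out bit and cycles
    is the number of recursive-phase iterations that were needed.
    """
    mask = (1 << (n + 1)) - 1          # n sum bits plus one carry-out bit
    # Initial phase (SEL = 0): every half adder sees a_i and b_i.
    s = a ^ b                          # S_i = a_i XOR b_i
    c = ((a & b) << 1) & mask          # C_(i+1) = a_i AND b_i
    cycles = 0
    # Recursive phase (SEL = 1): add the fed-back sum to the incoming carry.
    while c != 0:                      # assumed stopping criterion
        s, c = s ^ c, ((s & c) << 1) & mask
        cycles += 1
    return s, cycles


if __name__ == "__main__":
    for a, b in [(0xFFFFFFFF, 0x00000001), (0x55E19D5C, 0xAA1E62A3)]:
        total, cycles = pasta_add(a, b)
        print(f"{a:08X} + {b:08X} = {total:09X} in {cycles} recursive cycles")
```
]]></preformat><p>In this sketch the operand pair FFFF FFFF and 0000 0001 needs 32 recursive cycles, whereas a pair that generates no carry at all needs none.</p><p>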
The basic structures of these adders are explained in [<xref ref-type="bibr" rid="scirp.103104-ref14">14</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref15">15</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref16">16</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref17">17</xref>]. Four types of sub-circuits are used repeatedly in these adders. For ease of explanation, these sub-circuits are called modules in this paper and are represented by four different symbols. The sub-circuits and their corresponding symbols are shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>. Here, i, j and k indicate bit positions, where i ≥ j ≥ k. The X-module is used in all four types of adders. The X-module of the i<sup>th</sup> bit position computes the following outputs:</p><p>Carry Propagate: <disp-formula><label>(1)</label><tex-math>p_i = a_i \oplus b_i</tex-math></disp-formula></p><p>Carry Generate: <disp-formula><label>(2)</label><tex-math>g_i = a_i \cdot b_i</tex-math></disp-formula></p><p>Sum: <disp-formula><label>(3)</label><tex-math>S_i = a_i \oplus b_i \oplus C_i</tex-math></disp-formula></p><p>Here, a<sub>i</sub> and b<sub>i</sub> are the i<sup>th</sup> bits of the n-bit operands A and B respectively. The symbol C<sub>i</sub> represents the input carry of the i<sup>th</sup> position, that is, the carry output of the (i−1)<sup>th</sup> position. In all these expressions i = 0, 1, …, n − 1. The BCLA uses the Y-module, which computes the carry bits. It also computes the block-carry-propagate and block-carry-generate signals as follows:</p><p>Block Carry Propagate: <disp-formula><label>(4)</label><tex-math>P_{i,k} = P_{i,j} \cdot P_{j-1,k}</tex-math></disp-formula></p><p>Block Carry Generate: <disp-formula><label>(5)</label><tex-math>G_{i,k} = G_{i,j} + P_{i,j} \cdot G_{j-1,k}</tex-math></disp-formula></p><p>Carry: <disp-formula><label>(6)</label><tex-math>C_j = G_{j-1,k} + P_{j-1,k} \cdot C_k</tex-math></disp-formula></p><p>where P<sub>i,i</sub> = p<sub>i</sub> and G<sub>i,i</sub> = g<sub>i</sub>.</p><p>The other three adders (KSA, BKA and SCSA) use the gray-module and the white-module in addition to the X-module. These adders also operate on the principle of Block Carry Propagate and Block Carry Generate [<xref ref-type="bibr" rid="scirp.103104-ref14">14</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref15">15</xref>]. 
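</p><p>The module equations (1) to (6) can be checked with a small bit-level sketch. The Python below is an illustration under assumptions, not a description of any of the cited circuits: the function names (x_module, gray_module, white_module, add_with_block_signals) are hypothetical, and the block signals are folded serially from the least significant bit purely for clarity, whereas the adders discussed here arrange the same operations as trees.</p><preformat preformat-type="code"><![CDATA[
```python
def x_module(a_i, b_i):
    """Eqs. (1) and (2): carry propagate p_i and carry generate g_i."""
    return a_i ^ b_i, a_i & b_i


def gray_module(upper, lower):
    """Eqs. (4) and (5): merge block [i..j] (upper) with block [j-1..k] (lower)."""
    p_upper, g_upper = upper
    p_lower, g_lower = lower
    return p_upper & p_lower, g_upper | (p_upper & g_lower)


def white_module(block, c_k):
    """Eq. (6): carry out of block [j-1..k] when its carry-in is c_k."""
    p_block, g_block = block
    return g_block | (p_block & c_k)


def add_with_block_signals(a, b, n=16, c0=0):
    """Add two n-bit integers using only the module equations (sketch)."""
    pg = [x_module((a >> i) & 1, (b >> i) & 1) for i in range(n)]
    carries = [c0]
    block = None                       # (P, G) of bits [i..0]
    for i in range(n):
        block = pg[i] if block is None else gray_module(pg[i], block)
        carries.append(white_module(block, c0))
    # Eq. (3): S_i = p_i XOR C_i
    total = sum((pg[i][0] ^ carries[i]) << i for i in range(n))
    return total | (carries[n] << n)   # carry-out placed above the sum bits


if __name__ == "__main__":
    print(hex(add_with_block_signals(0xFFFF, 0x0001)))  # prints 0x10000
```
]]></preformat><p>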
The gray-module and the white-module are subdivisions of the Y-module of BCLA. The gray-module computes the Block Carry Propagate and Block Carry Generate signals, whereas the white-module computes the carry C<sub>j</sub>. The internal circuits of the gray-module and the white-module are shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>(c) and <xref ref-type="fig" rid="fig2">Figure 2</xref>(d) respectively.</p></sec><sec id="s2_2"><title>2.2. Enhanced Parallel Self-Timed Adder (EPASTA)</title><p>The basic architecture of PASTA has a uniform design. An n-bit PASTA is constructed from n+1 similar blocks of circuitry. In the rest of this paper the circuitry of each block of PASTA is called a PASTA-block. The structure of a PASTA-block is shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>(a). This PASTA-block is used in the proposed architecture. To improve the readability of the EPASTA architecture diagrams, a symbol is defined to represent the PASTA-block, as shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>(b). The new symbol is called the EP-module. To keep the figure simple, the SEL terminal is not shown in the EP-module.</p><p>Carry propagation in PASTA is similar to that of a ripple carry adder, so the worst-case computation time of PASTA is unsatisfactory. Adders with a lower worst-case computation time compute the carries earlier by using lookahead-carry generation techniques. Any carry can be computed by using only two levels of basic logic gates [<xref ref-type="bibr" rid="scirp.103104-ref4">4</xref>], but the size of the circuit increases exponentially with the size of the operands. For large operands (n &gt; 4) it is impractical and inefficient to use two levels of logic for carry computation because of fan-in and fan-out limitations, irregular structure, the use of many long wires, etc. [<xref ref-type="bibr" rid="scirp.103104-ref1">1</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref18">18</xref>]. 
Practical carry-lookahead schemes therefore usually use tree-like circuits that have simple, regular structures [<xref ref-type="bibr" rid="scirp.103104-ref1">1</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref16">16</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref19">19</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref20">20</xref>] [<xref ref-type="bibr" rid="scirp.103104-ref21">21</xref>]. Four different architectures of EPASTA, with four different lookahead-carry generation schemes, are presented below.</p><p>In the initial phase, the outputs at the S<sub>i</sub><sup>0</sup> and C<sub>i+1</sub><sup>0</sup> terminals of the EP-module can be expressed as S<sub>i</sub><sup>0</sup> = a<sub>i</sub> ⊕ b<sub>i</sub> and C<sub>i+1</sub><sup>0</sup> = a<sub>i</sub> · b<sub>i</sub>. These outputs are the same as the p<sub>i</sub> and g<sub>i</sub> outputs of the X-module. In the first cycle of the recursive phase, the Sum terminal of the EP-module produces S<sub>i</sub><sup>1</sup> = S<sub>i</sub><sup>0</sup> ⊕ C<sub>i</sub> = a<sub>i</sub> ⊕ b<sub>i</sub> ⊕ C<sub>i</sub>. This output is exactly the same as the Sum output of the X-module. Therefore, the EP-module is equivalent to the X-module. The EPASTA circuits are implemented by replacing the X-modules of the four selected tree-like parallel adder circuits with EP-modules. The first of the four 16-bit EPASTA architectures, which is constructed by replacing the X-modules of BCLA with EP-modules, is shown in <xref ref-type="fig" rid="fig4">Figure 4</xref>. To store the correct value of C<sub>out</sub>, an additional EP-module is attached after the most significant bit position. Moreover, a completion detection unit similar to that of PASTA is attached to sense the completion of the operation. This version of EPASTA uses the lookahead-carry generator of BCLA to compute the carry bits.</p><p>It has been shown that the X-module and the EP-module are interchangeable. The second version of EPASTA, which uses the lookahead-carry generator of KSA, can therefore be constructed easily by replacing the X-modules of KSA with EP-modules. 
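</p><p>The construction can be summarised behaviourally: the EP-modules produce p<sub>i</sub> and g<sub>i</sub> in the initial phase, the lookahead-carry generator derives every carry from them, and one recursive cycle adds each carry to the initial sum. The Python sketch below follows this flow with a Kogge-Stone style prefix network. It is an illustration resting on several assumptions, not the circuit of the proposed adder: the function names are hypothetical, the input carry is taken as 0, and completion detection is not modelled.</p><preformat preformat-type="code"><![CDATA[
```python
def kogge_stone_carries(p, g):
    """Return carries C_0..C_n from per-bit propagate/generate lists (C_0 = 0)."""
    n = len(p)
    P, G = list(p), list(g)
    distance = 1
    while distance < n:
        new_P, new_G = list(P), list(G)
        for i in range(distance, n):
            # Merge the block ending at bit i with the block ending at i - distance.
            new_G[i] = G[i] | (P[i] & G[i - distance])
            new_P[i] = P[i] & P[i - distance]
        P, G = new_P, new_G
        distance *= 2
    return [0] + G                     # C_(i+1) is the block generate of bits [i..0]


def epasta_ksa_add(a, b, n=16):
    """Behavioural sketch of EPASTA with a Kogge-Stone lookahead-carry generator."""
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    # Initial phase: each EP-module acts as a half adder.
    s0 = [x ^ y for x, y in zip(a_bits, b_bits)]   # S_i^0 = p_i
    c0 = [x & y for x, y in zip(a_bits, b_bits)]   # C_(i+1)^0 = g_i
    carries = kogge_stone_carries(s0, c0)
    # First recursive cycle: S_i^1 = S_i^0 XOR C_i.
    s1 = [s0[i] ^ carries[i] for i in range(n)]
    result = sum(bit << i for i, bit in enumerate(s1))
    return result | (carries[n] << n)  # the extra EP-module holds C_out


if __name__ == "__main__":
    print(hex(epasta_ksa_add(0xFFFF, 0x0001)))  # prints 0x10000
```
]]></preformat><p>The number of prefix levels in this sketch grows with the logarithm of the operand width, which is the property that the lookahead-carry generator brings to the worst case.</p><p>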
As in the previous version of EPASTA, an additional EP-module is added to store the correct value of C<sub>out</sub>, and a completion detection unit is added to sense the completion of the operation. <xref ref-type="fig" rid="fig5">Figure 5</xref> illustrates the architecture of a 16-bit EPASTA with the lookahead-carry generator of KSA.</p><p>A similar procedure is followed to construct the other two versions of EPASTA. The X-modules of the BKA and SCSA are replaced by EP-modules. Both implementations require an additional EP-module to capture the final value of C<sub>out</sub>. A completion detection unit is also included in both implementations to detect the completion of the operation. The EPASTA implementations with the lookahead-carry generators of BKA and SCSA are illustrated in <xref ref-type="fig" rid="fig6">Figure 6</xref> and <xref ref-type="fig" rid="fig7">Figure 7</xref> respectively.</p></sec></sec><sec id="s3"><title>3. Results</title><p>This section presents the simulation results for all the adders discussed in this paper. Although the illustrated adders are 16-bit adders, 32-bit versions of those adders are implemented for simulation. All the simulations are performed with an industry-standard software tool on a 64-bit Linux platform, for three different TSMC processes.</p><p>Three types of adder operation are analyzed: worst case, best case and average case. The best-case addition does not involve any carry propagation and hence incurs only a single-bit adder delay to produce the result. The worst case involves the maximum cascaded carry propagation delay, because the carry propagates through all 32 bits. In the average case, each carry propagation is confined to its own propagation chain and progresses simultaneously with the other carry propagation chains. 
Some test cases representative of these three situations are chosen. The dataset used for this experiment is summarized in <xref ref-type="table" rid="table1">Table 1</xref>. The expected carry chain length for n-bit binary numbers is established in [<xref ref-type="bibr" rid="scirp.103104-ref22">22</xref>]. The carry chain length is the maximum number of consecutive PASTA-blocks that propagate a carry bit (1). When two n-bit binary numbers are added, a carry chain length of m means that a carry bit propagates through at most m consecutive PASTA-blocks, and does so at least once during the whole n-bit addition. Since the carry chain length affects the delay, five random operand pairs are chosen for the average case. Three of them have a maximum carry chain length of 5 and two have a maximum carry chain length of 6, so they represent an average carry chain length of 5.4. The delay is measured at the 70% transition point of the related signals.</p><p>The delay performance of the different adders is shown in <xref ref-type="table" rid="table2">Table 2</xref>. The table is divided into three parts to differentiate between the conventional synchronous parallel adders, the basic architecture of PASTA and the four versions of EPASTA. The top part shows the results for the conventional adders (i.e. BKA, BCLA, KSA and SCSA). Worst-case delay is the important figure for these adders because they do not have any completion sensing mechanism. 
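</p><p>The maximum carry chain lengths listed in <xref ref-type="table" rid="table1">Table 1</xref> can be reproduced with a short sketch. The Python function below is only an illustration and rests on an assumed reading of the definition given above: a chain starts at a position where both operand bits are 1 (carry generate) and extends through every immediately following position where exactly one operand bit is 1 (carry propagate). The function name and the dictionary are hypothetical.</p><preformat preformat-type="code"><![CDATA[
```python
def max_carry_chain_length(a, b, n=32):
    """Longest run of consecutive bit positions that pass a carry of 1 (sketch)."""
    longest = current = 0
    for i in range(n):
        a_i, b_i = (a >> i) & 1, (b >> i) & 1
        if a_i & b_i:                  # generate: a new chain starts here
            current = 1
        elif (a_i ^ b_i) and current:  # propagate: the live chain grows
            current += 1
        else:                          # no carry leaves this position
            current = 0
        longest = max(longest, current)
    return longest


# Operand pairs copied from Table 1.
TABLE_1 = {
    "Worst Case": (0xFFFFFFFF, 0x00000001),
    "Average Case 1": (0x05016A44, 0xFC3F0499),
    "Average Case 2": (0x3F050FC0, 0x01300041),
    "Average Case 3": (0x09026A44, 0xF83E0499),
    "Average Case 4": (0x3E050F80, 0x02300081),
    "Average Case 5": (0x005240A2, 0x57C50F84),
    "Best Case": (0x55E19D5C, 0xAA1E62A3),
}

if __name__ == "__main__":
    for name, (a, b) in TABLE_1.items():
        print(f"{name}: {max_carry_chain_length(a, b)}")
```
]]></preformat><p>Under this reading the seven operand pairs give chain lengths of 32, 6, 6, 5, 5, 5 and 0, which agrees with the values listed in the table and with the stated average of 5.4 for the five average-case pairs.</p><p>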
The adders in the middle and the bottom part</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title>Dataset for comparing different adders</title></caption><table><thead><tr><th align="center" valign="middle" >Test Case</th><th align="center" valign="middle" >Operand A (hex)</th><th align="center" valign="middle" >Operand B (hex)</th><th align="center" valign="middle" >Maximum Carry Chain Length</th></tr></thead><tbody><tr><td align="center" valign="middle" >Worst Case</td><td align="center" valign="middle" >FFFF FFFF</td><td align="center" valign="middle" >0000 0001</td><td align="center" valign="middle" >32</td></tr><tr><td align="center" valign="middle" >Average Case 1</td><td align="center" valign="middle" >0501 6A44</td><td align="center" valign="middle" >FC3F 0499</td><td align="center" valign="middle" >6</td></tr><tr><td align="center" valign="middle" >Average Case 2</td><td align="center" valign="middle" >3F05 0FC0</td><td align="center" valign="middle" >0130 0041</td><td align="center" valign="middle" >6</td></tr><tr><td align="center" valign="middle" >Average Case 3</td><td align="center" valign="middle" >0902 6A44</td><td align="center" valign="middle" >F83E 0499</td><td align="center" valign="middle" >5</td></tr><tr><td align="center" valign="middle" >Average Case 4</td><td align="center" valign="middle" >3E05 0F80</td><td align="center" valign="middle" >0230 0081</td><td align="center" valign="middle" >5</td></tr><tr><td align="center" valign="middle" >Average Case 5</td><td align="center" valign="middle" >0052 40A2</td><td align="center" valign="middle" >57C5 0F84</td><td align="center" valign="middle" >5</td></tr><tr><td align="center" valign="middle" >Best Case</td><td align="center" valign="middle" >55E1 9D5C</td><td align="center" valign="middle" >AA1E 62A3</td><td align="center" valign="middle" >0</td></tr></tbody></table></table-wrap><table-wrap id="table2" ><label><xref ref-type="table" 
rid="table2">Table 2</xref></label><caption><title>SPICE timing report for different 32-bit adders</title></caption><table><thead><tr><th align="center" valign="middle" rowspan="2">Adder</th><th align="center" valign="middle" colspan="3">TSMC 0.35 μm (V<sub>dd</sub> = 3.3 V)</th><th align="center" valign="middle" colspan="3">TSMC 0.25 μm (V<sub>dd</sub> = 2.5 V)</th><th align="center" valign="middle" colspan="3">TSMC 0.18 μm (V<sub>dd</sub> = 1.8 V)</th></tr><tr><th align="center" valign="middle" >Best (ns)</th><th align="center" valign="middle" >Avg. (ns)</th><th align="center" valign="middle" >Worst (ns)</th><th align="center" valign="middle" >Best (ns)</th><th align="center" valign="middle" >Avg. (ns)</th><th align="center" valign="middle" >Worst (ns)</th><th align="center" valign="middle" >Best (ns)</th><th align="center" valign="middle" >Avg. (ns)</th><th align="center" valign="middle" >Worst (ns)</th></tr></thead><tbody><tr><td align="center" valign="middle" >BKA</td><td align="center" valign="middle" >0.3090</td><td align="center" valign="middle" >1.4493</td><td align="center" valign="middle" >2.8693</td><td align="center" valign="middle" >0.2251</td><td align="center" valign="middle" >1.2158</td><td align="center" valign="middle" >2.4743</td><td align="center" valign="middle" >0.1342</td><td align="center" valign="middle" >0.8497</td><td align="center" valign="middle" >1.7939</td></tr><tr><td align="center" valign="middle" >BCLA</td><td align="center" valign="middle" >0.2039</td><td align="center" valign="middle" >1.3224</td><td align="center" valign="middle" >2.6980</td><td align="center" valign="middle" >0.1432</td><td align="center" valign="middle" >1.1601</td><td align="center" valign="middle" >2.3761</td><td align="center" valign="middle" >0.0796</td><td align="center" valign="middle" >0.8207</td><td align="center" valign="middle" >1.7342</td></tr><tr><td align="center" valign="middle" >KSA</td><td align="center" valign="middle" >0.3245</td><td 
align="center" valign="middle" >1.3423</td><td align="center" valign="middle" >1.7786</td><td align="center" valign="middle" >0.2383</td><td align="center" valign="middle" >1.1238</td><td align="center" valign="middle" >1.5254</td><td align="center" valign="middle" >0.1468</td><td align="center" valign="middle" >0.7823</td><td align="center" valign="middle" >1.0931</td></tr><tr><td align="center" valign="middle" >SCSA</td><td align="center" valign="middle" >0.3094</td><td align="center" valign="middle" >1.4563</td><td align="center" valign="middle" >2.5772</td><td align="center" valign="middle" >0.2254</td><td align="center" valign="middle" >1.2232</td><td align="center" valign="middle" >2.2399</td><td align="center" valign="middle" >0.1346</td><td align="center" valign="middle" >0.8580</td><td align="center" valign="middle" >1.6168</td></tr><tr><td align="center" valign="middle" >PASTA</td><td align="center" valign="middle" >0.6313</td><td align="center" valign="middle" >2.0387</td><td align="center" valign="middle" >9.0872</td><td align="center" valign="middle" >0.9593</td><td align="center" valign="middle" >1.9673</td><td align="center" valign="middle" >8.1748</td><td align="center" valign="middle" >1.8610</td><td align="center" valign="middle" >2.8601</td><td align="center" valign="middle" >7.6258</td></tr><tr><td align="center" valign="middle" >EPASTA-BKA</td><td align="center" valign="middle" >1.1919</td><td align="center" valign="middle" >3.1586</td><td align="center" valign="middle" >4.4193</td><td align="center" valign="middle" >1.2188</td><td align="center" valign="middle" >2.8962</td><td align="center" valign="middle" >3.9819</td><td align="center" valign="middle" >2.9017</td><td align="center" valign="middle" >2.8594</td><td align="center" valign="middle" >4.1784</td></tr><tr><td align="center" valign="middle" >EPASTA-BCLA</td><td align="center" valign="middle" >1.0021</td><td align="center" valign="middle" >2.8991</td><td align="center" valign="middle" 
>4.0468</td><td align="center" valign="middle" >1.0444</td><td align="center" valign="middle" >2.6826</td><td align="center" valign="middle" >3.6860</td><td align="center" valign="middle" >2.5684</td><td align="center" valign="middle" >2.6333</td><td align="center" valign="middle" >3.7517</td></tr><tr><td align="center" valign="middle" >EPASTA-KSA</td><td align="center" valign="middle" >1.1957</td><td align="center" valign="middle" >3.2059</td><td align="center" valign="middle" >3.6728</td><td align="center" valign="middle" >1.2244</td><td align="center" valign="middle" >2.9455</td><td align="center" valign="middle" >3.3933</td><td align="center" valign="middle" >2.9138</td><td align="center" valign="middle" >2.8426</td><td align="center" valign="middle" >2.9672</td></tr><tr><td align="center" valign="middle" >EPASTA-SCSA</td><td align="center" valign="middle" >1.1903</td><td align="center" valign="middle" >3.2221</td><td align="center" valign="middle" >4.5280</td><td align="center" valign="middle" >1.2270</td><td align="center" valign="middle" >2.9578</td><td align="center" valign="middle" >4.0840</td><td align="center" valign="middle" >2.8942</td><td align="center" valign="middle" >2.9153</td><td align="center" valign="middle" >3.8388</td></tr></tbody></table></table-wrap><p>are asynchronous adders that have a completion detection mechanism and whose complete operation is divided into two phases (initial phase and recursive phase). The delay of these adders is therefore computed with the following relation:</p><p><disp-formula><tex-math>t_{\text{total}} = t_{\text{initial}} + t_{\text{recursive}}</tex-math></disp-formula></p><p>Here t<sub>initial</sub> represents the time required for the state transition of the initial phase and t<sub>recursive</sub> represents the delay between the SEL and Terminate signals. The value of t<sub>total</sub> is listed in <xref ref-type="table" rid="table2">Table 2</xref>. 
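</p><p>The worst-case figures of <xref ref-type="table" rid="table2">Table 2</xref> can be compared directly. The short Python sketch below, offered as an illustration rather than as part of the original analysis, recomputes the worst-case improvement of each EPASTA variant over PASTA from the t<sub>total</sub> values in the table. The dictionary layout and the function name are hypothetical.</p><preformat preformat-type="code"><![CDATA[
```python
# Worst-case t_total values (ns) copied from Table 2, keyed by process.
WORST_CASE_NS = {
    "TSMC 0.35": {"PASTA": 9.0872, "EPASTA-BKA": 4.4193, "EPASTA-BCLA": 4.0468,
                  "EPASTA-KSA": 3.6728, "EPASTA-SCSA": 4.5280},
    "TSMC 0.25": {"PASTA": 8.1748, "EPASTA-BKA": 3.9819, "EPASTA-BCLA": 3.6860,
                  "EPASTA-KSA": 3.3933, "EPASTA-SCSA": 4.0840},
    "TSMC 0.18": {"PASTA": 7.6258, "EPASTA-BKA": 4.1784, "EPASTA-BCLA": 3.7517,
                  "EPASTA-KSA": 2.9672, "EPASTA-SCSA": 3.8388},
}


def worst_case_improvements(table=WORST_CASE_NS):
    """Return (process, adder, saved_ns, saved_percent) for every EPASTA variant."""
    rows = []
    for process, delays in table.items():
        base = delays["PASTA"]
        for adder, delay in delays.items():
            if adder != "PASTA":
                saved = base - delay
                rows.append((process, adder, saved, 100.0 * saved / base))
    return rows


if __name__ == "__main__":
    for process, adder, saved, percent in worst_case_improvements():
        print(f"{process}: {adder} saves {saved:.4f} ns ({percent:.2f}%)")
```
]]></preformat><p>Run on the tabulated values, the sketch gives a smallest relative improvement of about 45.2% (EPASTA-BKA) and a largest of about 61.1% (EPASTA-KSA), both for the TSMC 0.18 μm process.</p><p>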
Since the major concern of this study is to analyze the performance of the proposed adders with respect to PASTA, the minimum delays of the asynchronous adders are shown in bold face.</p><p><xref ref-type="table" rid="table2">Table 2</xref> shows that all four proposed architectures of EPASTA give a better worst-case delay than PASTA. The relative improvement ranges from 45.25% (3.45 ns) to 61.09% (4.66 ns), with both extremes occurring for the TSMC 0.18 μm process. Although the best-case and average-case delays of EPASTA are, in most cases, higher than those of PASTA, the increase is modest (at most 1.18 ns).</p><p>Among the four proposed architectures, EPASTA with the lookahead-carry generator of KSA gives the best result for the worst case. For the other two cases (best case and average case), EPASTA with the lookahead-carry generator of BCLA performs better than the other architectures of EPASTA.</p></sec><sec id="s4"><title>4. Conclusion</title><p>The major objective of this paper was to analyze the practicality of using a lookahead-carry generator with the recently introduced self-timed adder PASTA. The study also investigated a possible way to improve the worst-case performance of PASTA. The results show that the proposed architecture gives better worst-case performance than the basic architecture of PASTA without a major compromise in best-case and average-case performance. These results also support the practicality of the proposed architecture of self-timed adders.</p></sec><sec id="s5"><title>Conflicts of Interest</title><p>The author declares no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s6"><title>Cite this paper</title><p>Habib, M.A. (2020) Parallel Self-Timed Adder with Lookahead-Carry Generator. Open Access Library Journal, 7: e6799. 
https://doi.org/10.4236/oalib.1106799</p></sec></body><back><ref-list><title>References</title><ref id="scirp.103104-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Hennessy, J.L. and Patterson, D.A. (1990) Computer Architecture: A Quantitative Approach. Morgan Kaufmann, Waltham.</mixed-citation></ref><ref id="scirp.103104-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Franklin, M.A. and Pan, T. (1994) Performance Comparison of Asynchronous Adders. Proceedings of IEEE Symposium on Advanced Research in Asynchronous Circuits and Systems, Salt Lake City, 3-5 November 1994, 117-125. 
https://doi.org/10.1109/ASYNC.1994.656299</mixed-citation></ref><ref id="scirp.103104-ref3"><label>3</label><mixed-citation publication-type="book" xlink:type="simple">Garside, J.D. (1993) A CMOS VLSI Implementation of an Asynchronous ALU. In: Furber, S. and Edwards, M., Eds., Asynchronous Design Methodologies, IFIP Transactions, North Holland, 181-192.</mixed-citation></ref><ref id="scirp.103104-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Patterson, D.A. and Hennessy, J.L. (2014) Computer Organization and Design: The Hardware/Software Interface. 5th Edition, Morgan Kaufmann, Waltham.</mixed-citation></ref><ref id="scirp.103104-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Zimmermann, R. (1997) Binary Adder Architectures for Cell-based VLSI and Their Synthesis. Ph.D. Dissertation, Swiss Federal Institute of Technology, Zurich.</mixed-citation></ref><ref id="scirp.103104-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Rahman, M.Z., Kleeman, L. and Habib, M.A. (2014) Recursive Approach to the Design of a Parallel Self-Timed Adder. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 23, 213-217. https://doi.org/10.1109/TVLSI.2014.2303809</mixed-citation></ref><ref id="scirp.103104-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Johnson, D. and Akella, V. (1998) Design and Analysis of Asynchronous Adders. IEE Proceedings—Computers and Digital Techniques, 145, 1-8. 
https://doi.org/10.1049/ip-cdt:19981770</mixed-citation></ref><ref id="scirp.103104-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Jayanthi, A.N. (2019) Performance Improvement in VLSI Adders. International Journal of Research in Arts and Science, 5, 76-87.  
https://doi.org/10.9756/BP2019.1002/07</mixed-citation></ref><ref id="scirp.103104-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Sivakumar, M. and Omkumar, S. (2018) Design and FPGA Implementation of FBMC Transmitter by Using Clock Gating Technique based QAM, Inverse FFT and Filter Bank for Low Power and High Speed Applications. Journal of Electrical Engineering &amp; Technology, 13, 2479-2484.</mixed-citation></ref><ref id="scirp.103104-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Jhamb, M. (2017) Efficient Adders for Assistive Devices. Engineering Science and Technology, 20, 95-104. https://doi.org/10.1016/j.jestch.2016.09.007</mixed-citation></ref><ref id="scirp.103104-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Vigneshwari, R., Jayasimha, T. and Sasikumar, P. (2017) Power Analysis by Combining the Modules PASTA Using DGMOSFET. Advances in Natural and Applied Sciences, 11, 691-698.</mixed-citation></ref><ref id="scirp.103104-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Sivakumar, M. and Omkumar, S. (2016) Integration of Optimized GDI Logic Based NOR Gate and Half Adder into PASTA for Low Power &amp; Low Area Applications. International Journal of Applied Engineering Research, 11, 2629-2633.</mixed-citation></ref><ref id="scirp.103104-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Chapiro, D.M. (1985) Globally-Asynchronous Locally-Synchronous Systems. Ph.D. Dissertation, Stanford University, California.</mixed-citation></ref><ref id="scirp.103104-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Kogge, P.M. and Stone, H.S. (1973) A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations. IEEE Transactions on Computers, C-22, 786-793. 
https://doi.org/10.1109/TC.1973.5009159</mixed-citation></ref><ref id="scirp.103104-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Rabaey, J.M., Chandrakasan, A.P. and Nikolić, B. (2003) Digital Integrated Circuits: A Design Perspective. Second Edition, Prentice Hall, New Jersey. </mixed-citation></ref><ref id="scirp.103104-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Brent, R.P. and Kung, H.T. (1982) A Regular Layout for Parallel Adders. IEEE Transactions on Computers, C-31, 260-264.  
https://doi.org/10.1109/TC.1982.1675982</mixed-citation></ref><ref id="scirp.103104-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Sklansky, J. (1960) Conditional-Sum Addition Logic. IRE Transactions on Electronic Computers, EC-9, 226-231. https://doi.org/10.1109/TEC.1960.5219822</mixed-citation></ref><ref id="scirp.103104-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Ngai, T.F., Irwin, M.J. and Rawat, S. (1986) Regular Area-Time Efficient Carry-Lookahead Adders. Journal of Parallel and Distributed Computing, 3, 92-105.  
https://doi.org/10.1016/0743-7315(86)90029-8</mixed-citation></ref><ref id="scirp.103104-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Flores, I. (1963) The Logic of Computer Arithmetic. Prentice Hall, New Jersey.</mixed-citation></ref><ref id="scirp.103104-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Unger, S.H. (1977) Tree Realizations of Iterative Circuits. IEEE Transactions on Computers, C-26, 365-383. https://doi.org/10.1109/TC.1977.1674846</mixed-citation></ref><ref id="scirp.103104-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Cheng, F.C., Unger, S.H. and Theobald, M. (2000) Self-Timed Carry-Lookahead Adders. IEEE Transactions on Computers, 49, 659-672.  
https://doi.org/10.1109/12.863035</mixed-citation></ref><ref id="scirp.103104-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Reitwiesner, G.W. (1960) The Determination of Carry Propagation Length for Binary Addition. IRE Transactions on Electronic Computers, EC-9, 35-38. 
https://doi.org/10.1109/TEC.1960.5221602</mixed-citation></ref></ref-list></back></article>