<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">JCC</journal-id><journal-title-group><journal-title>Journal of Computer and Communications</journal-title></journal-title-group><issn pub-type="epub">2327-5219</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/jcc.2016.410004</article-id><article-id pub-id-type="publisher-id">JCC-69624</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject></subj-group></article-categories><title-group><article-title>
 
 
  Hardware Design of Moving Object Detection on Reconfigurable System
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Chen</surname><given-names>Hung-Yu</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Wang</surname><given-names>Yuan-Kai</given-names></name><xref ref-type="aff" rid="aff2"><sup>2</sup></xref></contrib></contrib-group><aff id="aff2"><addr-line>Department of Electrical Engineering, Fu Jen Catholic University, Taiwan</addr-line></aff><aff id="aff1"><addr-line>Graduate Institute of Applied Science and Engineering, Fu Jen Catholic University, Taiwan</addr-line></aff><pub-date pub-type="epub"><day>10</day><month>08</month><year>2016</year></pub-date><volume>04</volume><issue>10</issue><fpage>30</fpage><lpage>43</lpage><history><date date-type="received"><day>20</day> <month>June</month> <year>2016</year></date><date date-type="rev-recd"><day>accepted</day> <month>7</month> <year>August</year> </date><date date-type="accepted"><day>10</day> <month>August</month> <year>2016</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Moving object detection, including background subtraction and morphological processing, is a critical research topic for video surveillance because of its high computational load and power consumption. This paper proposes a hardware design that accelerates background subtraction with low power consumption. A real-time background subtraction method is designed with a frame-buffer scheme and function partitioning to improve throughput, and is implemented in Verilog HDL on an FPGA. The design parallelizes the computations of background update and subtraction with a seven-stage pipeline. A stripe-based morphological processing scheme that accounts for the completeness of detected objects is also devised. Simulation results for VGA-resolution videos on a low-end FPGA device show a throughput of 368 fps for the real-time background subtraction module alone, and 51 fps for the whole system, including off-chip memory access. Real-time efficiency with low power consumption and low resource utilization is thus demonstrated.
 
</p></abstract><kwd-group><kwd>Background Subtraction</kwd><kwd> Moving Object Detection</kwd><kwd> Field Programmable Gate Array (FPGA)</kwd><kwd> Hardware Acceleration</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>Moving object detection is typically the first processing stage in video surveillance systems. It is one of the most important steps in smart video surveillance, which aims to detect foreground objects and events. The foreground is essential for tracking objects and maintaining their identities. Foreground objects can be detected by maintaining a representation of the scene background: a foreground object is found by locating significant differences between the current frame and the background representation.</p><p>Many methods have been proposed for detecting moving objects, such as temporal difference, optical flow and background subtraction [<xref ref-type="bibr" rid="scirp.69624-ref1">1</xref>] . Non-recursive methods, including temporal difference and optical flow, adopt sliding windows to build background models. These methods store the most recent image frames in a moving window and apply a statistical measure such as a median filter [<xref ref-type="bibr" rid="scirp.69624-ref2">2</xref>] - [<xref ref-type="bibr" rid="scirp.69624-ref4">4</xref>] or a mean filter [<xref ref-type="bibr" rid="scirp.69624-ref5">5</xref>] to the history of each pixel within the window in order to estimate the background image. The moving window keeps the background model up to date by discarding old pixels and admitting new ones. However, slowly moving objects require longer windows, which in turn require a very large pixel memory. Background subtraction is the focus of the present study and has the best accuracy among these methods. 
Common background subtraction methods include Running Average (RA) [<xref ref-type="bibr" rid="scirp.69624-ref6">6</xref>] , Gaussian mixture model (GMM) [<xref ref-type="bibr" rid="scirp.69624-ref7">7</xref>] and nonparametric kernel methods [<xref ref-type="bibr" rid="scirp.69624-ref8">8</xref>] . Although GMM and nonparametric methods are stable algorithms, their complexity [<xref ref-type="bibr" rid="scirp.69624-ref9">9</xref>] [<xref ref-type="bibr" rid="scirp.69624-ref10">10</xref>] makes them difficult to implement in hardware. RA is also a stable algorithm, and it is more tractable for hardware implementation because its statistical model has a single modality.</p><p><xref ref-type="fig" rid="fig1">Figure 1</xref> illustrates the general algorithmic steps of background subtraction. <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x6.png" xlink:type="simple"/></inline-formula> is the pixel of the current real-time image at a given location at time <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x7.png" xlink:type="simple"/></inline-formula>. <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x8.png" xlink:type="simple"/></inline-formula> is the pixel at that location in the currently established background image. <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x9.png" xlink:type="simple"/></inline-formula> is the pixel at that location in the established background image at time t. <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x10.png" xlink:type="simple"/></inline-formula> is the adaptive threshold value at that location for the object image with noise at time <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x11.png" xlink:type="simple"/></inline-formula>. 
<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x12.png" xlink:type="simple"/></inline-formula> is the adaptive threshold value that is input to the Morphology process unit. <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x13.png" xlink:type="simple"/></inline-formula> is the adaptive threshold value at that location for the background image after morphology processing at time <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x14.png" xlink:type="simple"/></inline-formula>.</p><p>DSPs (Digital Signal Processors), GPUs (Graphics Processing Units) and FPGAs (Field Programmable Gate Arrays) are three hardware solutions used to accelerate background subtraction algorithms. A DSP is a powerful and very fast microprocessor able to achieve real-time digital signal processing. It can use instruction-level parallelism with multiple paths to achieve speedup, but its computational power is limited [<xref ref-type="bibr" rid="scirp.69624-ref11">11</xref>] . GPUs are very powerful processors that outperform central processing units in specialized applications on computers and portable devices. GPUs are also capable of parallel computing by distributing many threads across a massive number of cores and executing them at the same time. However, they are not power efficient. An FPGA is an integrated circuit designed to be configured by a designer; its configuration is generally specified using a hardware description language (HDL) similar to that used for an ASIC (Application-Specific Integrated Circuit). Its reconfigurable fabric allows an FPGA to handle complex operations in parallel. 
This capability, combined with pipeline design, can speed up background subtraction. The architecture of the FPGA is highly parallel and well suited to implementing image processing and other complex algorithms efficiently. FPGAs are therefore suitable for implementations of image processing and computer vision algorithms in embedded systems.</p><fig id="fig1"  position="float"><label><xref ref-type="fig" rid="fig1">Figure 1</xref></label><caption><title> Computational flow of background subtraction</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x15.png"/></fig><p>Many studies have been devoted to accelerating moving object detection by FPGA. The following works aim at real-time performance for 640 &#215; 480 or 720 &#215; 576 resolutions using high-end, high-power FPGAs. Appiah et al. [<xref ref-type="bibr" rid="scirp.69624-ref12">12</xref>] achieved 60 fps on an FPGA, which is promising compared with the 25 fps achieved by a 3 GHz Pentium 4. However, the method requires more hardware resources because four frame buffers must be allocated. Cucchiara et al. [<xref ref-type="bibr" rid="scirp.69624-ref13">13</xref>] applied four FPGAs with four frame buffers to achieve object detection. Its best performance is only 5 fps, which is far from meeting real-time requirements. Elgammal et al. [<xref ref-type="bibr" rid="scirp.69624-ref8">8</xref>] used Wronskian statistics for moving object detection and implemented it on FPGA. However, its maximum performance of 15 fps is not sufficient for real-time detection [<xref ref-type="bibr" rid="scirp.69624-ref14">14</xref>] - [<xref ref-type="bibr" rid="scirp.69624-ref16">16</xref>] . Genovese et al. [<xref ref-type="bibr" rid="scirp.69624-ref17">17</xref>] applied an OpenCV GMM algorithm implementation able to process 22 fps at 1920 &#215; 1080 resolution when implemented on a Virtex5 FPGA. Jang et al. 
[<xref ref-type="bibr" rid="scirp.69624-ref18">18</xref>] proposed a circuit able to process 1024 &#215; 1024 video sequences at 38 fps when implemented on a VirtexII FPGA platform. Genovese et al. [<xref ref-type="bibr" rid="scirp.69624-ref19">19</xref>] presented two hardware implementations of the OpenCV version of the Gaussian Mixture Model (GMM), a background identification algorithm. When implemented on a Virtex6, the proposed circuit processed 60 fps at 1920 &#215; 1080 resolution.</p><p>This study proposes an FPGA hardware architecture for the running average and morphology algorithms. The system is designed to achieve real-time performance with low power consumption. The proposed circuit has been validated through experimental measurements on a hardware platform.</p><p>The main contributions of this paper are the following:</p><p>1) An innovative, hardware-oriented formulation of the RA equations that improves hardware speed and saves hardware resources.</p><p>2) A background subtraction and morphology algorithm that is accelerated by reconfigurable hardware, which allows embedded systems to operate in real time.</p><p>3) An experimental demonstration of the proposed FPGA circuit running in on-line video systems.</p><p>The main block includes five modules, as shown in <xref ref-type="fig" rid="fig2">Figure 2</xref>. The block must convert the color format from RGB to YCbCr, establish the background model from the Y components, apply background subtraction to compare the Y components of the foreground and the background, and finally perform morphology to obtain the object detection result.</p><p>The remainder of this paper is organized as follows. Section 2 reviews the running average and morphology algorithms, and also presents the proposed Real-Time Background Subtraction (RTBS) design. Section 3 presents the proposed RTBS system. Simulation and verification of the design are given in Section 4. 
Section 5 concludes with the advantages of the design.</p></sec><sec id="s2"><title>2. Methodology</title><p>This section describes the algorithms and methods on which the system is built. It first reviews the formulation of background subtraction and the RTBS algorithm. The RTBS improvement of the RA algorithm, which removes the division operation, is then described, and the dataflow of the RTBS is analyzed. Next, the morphology algorithm and its dataflow are discussed. Finally, the pipeline process diagram of morphology is explained.</p><fig id="fig2"  position="float"><label><xref ref-type="fig" rid="fig2">Figure 2</xref></label><caption><title> Detailed processing modules of the main block</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x16.png"/></fig><sec id="s2_1"><title>2.1. Background Subtraction</title><p>The background subtraction algorithm consists of two steps: differencing and background modeling. The differencing step extracts motion pixels by computing the difference between the current frame and the background model. The background model is built statistically under a single-modality assumption. 
The differencing step can be formulated as follows:</p><disp-formula id="scirp.69624-formula549"><label>(1)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/4-1730404x17.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x18.png" xlink:type="simple"/></inline-formula> is the pixel value of the current frame at time <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x19.png" xlink:type="simple"/></inline-formula>, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x20.png" xlink:type="simple"/></inline-formula> is the background model at time t, and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x21.png" xlink:type="simple"/></inline-formula> is the subtraction result.</p><p>The background model of each pixel is assumed to be single Gaussian, and its parameters can be recursively updated by a new frame, which can improve computational efficiency and reduce memory resource allocation [<xref ref-type="bibr" rid="scirp.69624-ref8">8</xref>] . The recursive form of the expected value of the single Gaussian, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x22.png" xlink:type="simple"/></inline-formula>, is described as follows:</p><disp-formula id="scirp.69624-formula550"><label>(2)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/4-1730404x23.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x24.png" xlink:type="simple"/></inline-formula> is the established background image at time t, and k is the learning parameter controlling the background learning speed. 
The difference image <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x25.png" xlink:type="simple"/></inline-formula> is then thresholded into <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x26.png" xlink:type="simple"/></inline-formula> by an adaptive threshold obtained by recursively updating the variance of the Gaussian model.</p><disp-formula id="scirp.69624-formula551"><label>(3)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/4-1730404x27.png"  xlink:type="simple"/></disp-formula><disp-formula id="scirp.69624-formula552"><label>(4)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/4-1730404x28.png"  xlink:type="simple"/></disp-formula><p>where the standard deviation <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x29.png" xlink:type="simple"/></inline-formula> is applied to the adaptive threshold and recursively updated by the current frame. The parameter <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x30.png" xlink:type="simple"/></inline-formula> determines the desired precision of thresholding.</p><p>The above formulation for background subtraction and updating includes five multiplications, two divisions, and one radical (square-root) operation. The divisions in Equation (2) and Equation (4) require significant logic circuit resources and can slow down computation. 
By applying integer arithmetic and replacing the division operation with bit shifting, the hardware design of the RA algorithm is significantly improved. In addition, the variance <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x31.png" xlink:type="simple"/></inline-formula> obtained in Equation (4) must be square-rooted into the standard deviation <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x32.png" xlink:type="simple"/></inline-formula> for the thresholding in Equation (3). This implicit radical expression must also be eliminated. As a result, reformulation is necessary in order to shorten computational time as well as reduce resource consumption.</p><p>In order to apply a shift circuit instead of division, the denominators of Equation (2) and Equation (4) must be modified. The <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x33.png" xlink:type="simple"/></inline-formula> is replaced with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x34.png" xlink:type="simple"/></inline-formula>, and the <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x35.png" xlink:type="simple"/></inline-formula> in Equation (3) is substituted with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x36.png" xlink:type="simple"/></inline-formula>, where both sides of the condition must be squared. The three new equations are given below:</p><disp-formula id="scirp.69624-formula553"><label>(5)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/4-1730404x37.png"  xlink:type="simple"/></disp-formula><disp-formula id="scirp.69624-formula554"><label>(6)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/4-1730404x38.png"  xlink:type="simple"/></disp-formula><disp-formula id="scirp.69624-formula555"><label>(7)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/4-1730404x39.png"  xlink:type="simple"/></disp-formula><p>where N is the m-th power of 2, i.e., <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x40.png" xlink:type="simple"/></inline-formula>. 
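As a software illustration of this kind of reformulation, the following minimal Python sketch performs one per-pixel update using only multiplications, additions, subtractions and right shifts. It is a sketch under stated assumptions rather than the authors' circuit: the exact forms of Equations (5) to (7) are not reproduced here, so a running average with weight 1/N and a squared threshold test are assumed, and the names rtbs_step, m and lam_sq are hypothetical.
<preformat preformat-type="code">
```python
def rtbs_step(pixel, mean, var, m=4, lam_sq=9):
    """One division-free background update for a single pixel.

    Assumptions (not taken from the paper): all inputs are non-negative
    integers, N = 2**m is the averaging window, and lam_sq is the squared
    threshold factor (lambda squared).
    """
    n = 2 ** m
    diff = pixel - mean
    diff_sq = diff * diff
    # Squared comparison: |diff| > lambda * sigma is equivalent to
    # diff^2 > lambda^2 * variance, so no square root is needed.
    foreground = diff_sq > lam_sq * var
    # Dividing by N = 2**m is a right shift by m bits.
    new_mean = ((n - 1) * mean + pixel) >> m
    new_var = ((n - 1) * var + diff_sq) >> m
    return foreground, new_mean, new_var


if __name__ == "__main__":
    print(rtbs_step(200, 100, 25))  # a large jump is flagged as foreground
    print(rtbs_step(101, 100, 25))  # a small change stays background
```
</preformat>
In this sketch the divider of a conventional running average becomes an m-bit right shift, and comparing squared quantities removes the square root, which mirrors the motivation given above. 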
The <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x41.png" xlink:type="simple"/></inline-formula> of RA is replaced with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x42.png" xlink:type="simple"/></inline-formula>, and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x43.png" xlink:type="simple"/></inline-formula> is replaced with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x44.png" xlink:type="simple"/></inline-formula>. As a result, a shift circuit replaces the division operations. Mathematically, the reformulation produces a residue between RA and RTBS in background updating. 
However, it can be demonstrated that, in practice, the residue vanishes as t increases.</p><p>Before using Verilog for hardware design, Equation (5) is analyzed in more detail to identify the data flow of background updating. The data flow is shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>(a). The data flow of Equation (6) to find the adaptive threshold is shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>(b). Equation (7) performs a new adaptive thresholding mechanism to find objects. Its dataflow diagram is shown in <xref ref-type="fig" rid="fig3">Figure 3</xref>(c).</p><p>The arithmetic operations required by RTBS and RA are now compared. A detailed comparison is given in <xref ref-type="table" rid="table1">Table 1</xref>. Although RTBS requires two extra multiplications, it eliminates the need for division and radical expression, which can dramatically reduce hardware resource utilization. <xref ref-type="fig" rid="fig4">Figure 4</xref> illustrates the residues for <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x45.png" xlink:type="simple"/></inline-formula>. Higher residues may exist for small t, but residues diminish to zero when <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x46.png" xlink:type="simple"/></inline-formula>.</p><fig id="fig3"  position="float"><label><xref ref-type="fig" rid="fig3">Figure 3</xref></label><caption><title> Data flow diagram: (a) Background modeling in Equation (5). (b) Adaptive threshold determination in Equation (6). 
(c) Foreground extraction in Equation (7)</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x47.png"/></fig><fig id="fig4"  position="float"><label><xref ref-type="fig" rid="fig4">Figure 4</xref></label><caption><title> Residue analysis between Equation (2) and Equation (5)</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x48.png"/></fig><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> Arithmetic operations of running average and the RTBS</title></caption><table><thead><tr><th align="center" valign="middle" >Operation</th><th align="center" valign="middle" >RA</th><th align="center" valign="middle" >RTBS</th></tr></thead><tbody><tr><td align="center" valign="middle" >Addition (+)</td><td align="center" valign="middle" >4</td><td align="center" valign="middle" >2</td></tr><tr><td align="center" valign="middle" >Subtraction (−)</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >3</td></tr><tr><td align="center" valign="middle" >Multiplication (&#215;)</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >7</td></tr><tr><td align="center" valign="middle" >Division (/)</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >0</td></tr><tr><td align="center" valign="middle" >Radical Expression (√)</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >0</td></tr></tbody></table></table-wrap></sec><sec id="s2_2"><title>2.2. Morphology</title><p>Hardware designs for morphology mainly implement two operations, erosion and dilation. Before discussing these two operations, it is necessary to understand an important concept, the structuring element, which is shown in <xref ref-type="fig" rid="fig5">Figure 5</xref>. 
This study mainly uses a 3 &#215; 3 mask, using line buffers along with registers to obtain the 9 closest pixels (M1 ~ M9), of which M5 is the origin.</p><p>Erosion and dilation are the two basic operations of morphological image processing, and both Opening and Closing are built from them. Both erosion and dilation must establish the 3 &#215; 3 image window in order to obtain the pixels M1 ~ M9, and define P1 ~ P9 as the values in the structuring element. Combining the final values with an AND gate gives erosion, while combining them with an OR gate gives dilation (see <xref ref-type="fig" rid="fig6">Figure 6</xref>).</p><p>The flow of the morphology data process is described next. In the data flow graph (DFG) shown in <xref ref-type="fig" rid="fig7">Figure 7</xref>, two nodes are first defined as follows:</p><p>・ “E” is the 3 &#215; 3 image window for erosion.</p><p>・ “D” is the 3 &#215; 3 image window for dilation.</p><p>First, it is necessary to establish a 3 &#215; 3 image window and enter node “E” for erosion followed by node “D” for dilation, which together perform Opening. 
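The AND/OR structure of the two operations can be illustrated in software with the following minimal Python sketch. It assumes a binary image stored as a list of rows, an all-ones 3 &#215; 3 structuring element, and zero padding outside the image border; these choices and all function names are assumptions made for illustration and are not taken from the hardware design.
<preformat preformat-type="code">
```python
def window3x3(img, r, c):
    """Return the nine pixels M1..M9 around (r, c), row by row.

    Pixels outside the image are treated as 0 (an assumed border policy).
    """
    h, w = len(img), len(img[0])
    return [img[rr][cc] if (rr in range(h) and cc in range(w)) else 0
            for rr in (r - 1, r, r + 1)
            for cc in (c - 1, c, c + 1)]


def erode(img):
    # Erosion: logical AND over the 3x3 window.
    return [[int(all(window3x3(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def dilate(img):
    # Dilation: logical OR over the 3x3 window.
    return [[int(any(window3x3(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]


def opening(img):
    # Node "E" followed by node "D": removes small isolated noise.
    return dilate(erode(img))


def closing(img):
    # Node "D" followed by node "E": fills small holes.
    return erode(dilate(img))


if __name__ == "__main__":
    noisy = [[0, 0, 0, 0, 1],
             [0, 1, 1, 1, 0],
             [0, 1, 1, 1, 0],
             [0, 1, 1, 1, 0],
             [0, 0, 0, 0, 0]]
    for row in opening(noisy):
        print(row)
```
</preformat>
In hardware, the nine window pixels would come from line buffers and registers rather than from indexed reads, but the per-pixel AND and OR reductions are the same. 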
After the image has been cleaned up by Opening, a 3 &#215; 3 image window is established and node “D” is entered for dilation; another 3 &#215; 3 image window is then established to enter node “E” for erosion, which completes the Closing that repairs the image.</p><fig id="fig5"  position="float"><label><xref ref-type="fig" rid="fig5">Figure 5</xref></label><caption><title> 3 &#215; 3 structuring element</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x49.png"/></fig><fig-group id="fig6"><label><xref ref-type="fig" rid="fig6">Figure 6</xref></label><caption><title> Hardware implementation structure: (a) Erosion, (b) Dilation.</title></caption><fig id ="fig6_1"><label> (b)</label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x51.png"/></fig><fig id ="fig6_2"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x50.png"/></fig></fig-group><fig id="fig7"  position="float"><label><xref ref-type="fig" rid="fig7">Figure 7</xref></label><caption><title> Morphology process DFG</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x52.png"/></fig><p>In Morphology, a 4-stage pipeline is applied, as shown in <xref ref-type="fig" rid="fig8">Figure 8</xref>.</p></sec></sec><sec id="s3"><title>3. System</title><p>This section first reviews the system that implements the RTBS algorithm, which is built on a Field Programmable Gate Array (FPGA). 
<xref ref-type="fig" rid="fig9">Figure 9</xref> shows the system structure of the moving object detection system.</p><p>A 1.3 megapixel CMOS digital module acquires the images, and all further image processing is conducted by the FPGA: the image color format is converted from RGB to YCbCr, the background is established from the Y component, Background Subtraction compares the Y components of the foreground and the background, and moving object detection is finally achieved.</p><p>The Real Time Background Subtraction (RTBS) module integrated into the system implements Background Subtraction, background updating and adaptive thresholding. The overall operation can thus be implemented in hardware using subtractors, adders, shifters and multipliers. The RTBS employs hardware features such as parallelism and pipelining, and its architecture is pipelined into 7 stages. The input stage fetches 2 pixels from the input image port and forwards them to the input ports of the line buffers (10 bits to each buffer), which have a FIFO structure.</p><p>Background Subtraction and Morphology were integrated mainly because of their simple hardware structure, which can be easily parallelized to provide more than one pixel per cycle. This is important, as the ability of the RTBS to perform parallel computations depends on the ability of the line buffer to provide multiple pixels per cycle. The Morphology module employs parallelism and pipelining to parallelize the repetitive calculations involved in the Erosion and Dilation operations, and uses optimized memory structures to reduce redundant memory reads. The architecture of the RTBS is shown in <xref ref-type="fig" rid="fig10">Figure 10</xref>.</p><p>The architecture implementation of the RTBS algorithm consists of a memory controller, RGB to YCbCr color space converters (RGB2YCbCr) and Background Image storage. 
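For reference, the luma extraction performed by such a converter can be sketched in Python as follows. The sketch assumes the widely used ITU-R BT.601 luma weights scaled to 8-bit fixed point and 8-bit RGB inputs; the coefficients, bit widths and rounding of the actual RGB2YCbCr unit are not stated here, and the function name rgb_to_luma is hypothetical.
<preformat preformat-type="code">
```python
def rgb_to_luma(r, g, b):
    """Approximate Y = 0.299 R + 0.587 G + 0.114 B with integer arithmetic.

    The weights 77, 150 and 29 sum to 256, so the final division by 256
    becomes a right shift by 8 bits (assumed fixed-point scaling).
    """
    return (77 * r + 150 * g + 29 * b) >> 8


if __name__ == "__main__":
    print(rgb_to_luma(255, 255, 255))  # white maps to full-scale luma
    print(rgb_to_luma(255, 0, 0))      # pure red maps to a low luma value
```
</preformat>
Only the Y component is needed downstream, since the background model and the subtraction operate on luma alone. 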
The memory controller fetches the RGB color values for the first Line Buffer (FIFO Buffer 1) and the second Line Buffer (FIFO Buffer 2) from the external memory in a column-wise fashion (1 pixel value of the input image every clock cycle). The architecture of the 4 port SDRAM control block is shown in <xref ref-type="fig" rid="fig11">Figure 11</xref>.</p><p>Those values are then converted by the RGB2YCbCr units to their corresponding 10-bit YCbCr representation, of which the Y component provides the grayscale value. The Background image values of RTBS computed by the Background Subtraction unit and Morphology unit are temporarily stored in SDRAM frame buffers.</p><p>The design was simulated and implemented on a low-end FPGA with low power consumption. The FPGA has about five hundred thousand gates, 150 18 &#215; 18 multipliers and 60 KB of internal memory. The main system clock runs at 25 MHz, which gives very low power consumption compared with that of a PC with a 2.4 GHz processor. Verilog HDL is adopted for implementation and Quartus II for synthesis. 
An external 64 Mbit (8 MB) SDRAM memory is used for frame buffers to store background models.</p><fig id="fig8"  position="float"><label><xref ref-type="fig" rid="fig8">Figure 8</xref></label><caption><title> Pipeline process diagram for Morphology</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x53.png"/></fig><fig id="fig9"  position="float"><label><xref ref-type="fig" rid="fig9">Figure 9</xref></label><caption><title> Structure of FPGA moving object detection system</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x54.png"/></fig><fig id="fig10"  position="float"><label><xref ref-type="fig" rid="fig10">Figure 10</xref></label><caption><title> The architecture of the RTBS</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x55.png"/></fig><fig id="fig11"  position="float"><label><xref ref-type="fig" rid="fig11">Figure 11</xref></label><caption><title> The 4 port SDRAM control block</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x56.png"/></fig><p>A 1.3 megapixel CMOS sensor module is responsible for acquiring raw color images for the FPGA platform. The raw data arrive in a Bayer pattern arrangement and have to be converted to RGB color format. The system is divided into three blocks, as shown in <xref ref-type="fig" rid="fig12">Figure 12</xref>. The main block receives raw Bayer data and performs the RTBS background subtraction task. 
Background models <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x57.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x57.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x58.png" xlink:type="simple"/></inline-formula> are stored in the off-chip SDRAM memory because of the limited internal memory of the FPGA.</p><p>The subtraction result is sent to the VGA controller for display. Due to the constraints of peripheral circuits, the system has two clock rates: 25 MHz for CMOS image acquisition and processing, and 120 MHz for SDRAM access of background models.</p></sec><sec id="s4"><title>4. Simulation and Experiment</title><p>This section first tests the FPGA hardware circuit on experimental images to verify its accuracy. Next, the frame rate is used to analyze and discuss the throughput. The experiment also compares the results of the BS and RTBS equations to determine the difference between the two. Finally, the FPGA hardware resource usage is explained.</p><sec id="s4_1"><title>4.1. Frame Rate Analysis and Experiment</title><p>This section further verifies the performance of the system. A seven-segment display was used to count the number of frames processed within a minute, giving an average of 51 fps. This result differs from the 127 fps obtained from synthesis and simulation because the hardware experiment and the simulation run at different clock rates.</p><p>Next, the main clock is analyzed. The clock rate in the main block is 25 MHz. Each frame also includes blanking time, which depends on the current exposure setting, so only about 70% of the clock cycles carry image information. The effective pixel clock is therefore 25 MHz &#215; 0.70 = 17.5 MHz. 
The resulting frame rate is 56 fps (17.5 MHz/0.3072 M), where 0.3072 M is the number of pixels in a 640 &#215; 480 frame.</p><p>The estimate of 56 fps derived from this analysis is very close to the measured 51 fps. The frame rate in this experiment is therefore limited by the main clock, which is set by the CMOS sensor clock.</p><p>For comparison, the RTBS was also implemented in software on a 1.8 GHz P4 CPU with 1 GB DDR/333 MHz memory. This yielded a frame rate of 3.22 fps, which differs significantly from the FPGA result. <xref ref-type="table" rid="table2">Table 2</xref> shows more detailed frame rate results, and demonstrates that hardware is far better than software in terms of efficiency.</p><fig id="fig12"  position="float"><label><xref ref-type="fig" rid="fig12">Figure 12</xref></label><caption><title> Diagram of system block</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x59.png"/></fig><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> Frame rate analysis</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Block</th><th align="center" valign="middle" >FPGA Simulation</th><th align="center" valign="middle" >FPGA Verification</th><th align="center" valign="middle" >PC</th></tr></thead><tr><td align="center" valign="middle" >Main Block</td><td align="center" valign="middle" >368</td><td align="center" valign="middle" >56</td><td align="center" valign="middle" >−</td></tr><tr><td align="center" valign="middle" >System Block</td><td align="center" valign="middle" >127</td><td align="center" valign="middle" >51</td><td align="center" valign="middle" >3.22</td></tr></tbody></table></table-wrap><p>This study uses QUARTUS II to further analyze the frame rate performance by simulation. QUARTUS II contains the TimeQuest Timing Analyzer, which applies the industry-standard Synopsys Design Constraints (SDC) methodology to constrain designs. 
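</p><p>The frame rate estimate for the hardware experiment is simply the effective pixel clock divided by the number of pixels per frame. The following sketch reproduces that arithmetic under the assumption that one pixel is processed per active clock cycle; the function name frame_rate and its parameters are illustrative and not part of the design.</p><preformat preformat-type="code">
```python
PIXELS_PER_FRAME = 640 * 480  # 307,200 pixels, written as 0.3072 M in the text


def frame_rate(clock_hz, active_percent=100, pixels=PIXELS_PER_FRAME):
    """Frames per second, assuming one pixel per active clock cycle."""
    effective_hz = clock_hz * active_percent / 100
    return effective_hz / pixels


# Hardware experiment: 25 MHz main clock, about 70% of cycles carry pixels.
measured_estimate = frame_rate(25_000_000, active_percent=70)
print(f"{measured_estimate:.2f} fps")
```
</preformat><p>With a 25 MHz clock and 70% active cycles this gives just under 57 fps, which the text reports as 56 fps.</p><p>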
The timing characteristics and timing performance of our system are obtained from the analyzer. The timing analysis reports that the clock period of the main block is 8.8 ns, which corresponds to about 113 MHz. Therefore, theoretically, the frame rate of the main block is 368 fps (113 MHz/0.3072 M). The critical path of the SDRAM controller has a clock period of 6.4 ns, which corresponds to about 156 MHz. However, because the SDRAM controller serves 4 read/write ports, only about 39 MHz (156 MHz/4) can be achieved. Therefore, the frame rate of the whole system is 127 fps (39 MHz/0.3072 M).</p><p>The proposed method was tested against the test set, and achieved the results shown in <xref ref-type="fig" rid="fig13">Figure 13</xref>.</p></sec><sec id="s4_2"><title>4.2. Difference between BS and RTBS Equations</title><p>The background subtraction approach has one well-known issue. If an object is present in the image when the system starts to establish its background, the object will stay in the background for a period. This issue could also be a problem for the proposed RTBS method because the RTBS is an approximation of background subtraction methods. An experiment on this residual image was therefore conducted for the RTBS, and analyzed with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730404x60.png" xlink:type="simple"/></inline-formula>. During the FPGA experiment the frame number is shown on a seven-segment display, which is incremented by one each time the system completes the processing of one image. The experiment uses a man’s hand as a foreign object in the background, and keeps it in place for 20, 40, 60, 80 and 100 frames of 128, after which the hand is removed. The number of frames needed to clear the object from the background is then recorded.</p><p>Results of the experiment are shown in <xref ref-type="fig" rid="fig14">Figure 14</xref>. The RTBS needs two frames to remove the hand object if the object appears at the beginning and lasts for 20 frames. 
The longer the object stays, the longer it takes to remove, and the removal time increases linearly. However, even after 80 frames, only around 10 image frames are needed to clear the object.</p></sec><sec id="s4_3"><title>4.3. FPGA Hardware Resource Usage</title><p>FPGA resources include logic circuits and memory. In addition, two frame buffers occupying about 998 KB of external SDRAM memory are required. <xref ref-type="table" rid="table3">Table 3</xref> shows the usage of FPGA resources for each function in the system. The resource utilization and memory requirements were obtained with the Altera QUARTUS II analyzer tool. QUARTUS II also allows the user to launch the ModelSim simulator from within the software using NativeLink, which simplifies simulation by providing an easy-to-use mechanism and precompiled libraries for EDA (Electronic Design Automation) RTL (Register Transfer Level) and gate-level timing simulation.</p><fig-group id="fig13"><label><xref ref-type="fig" rid="fig13">Figure 13</xref></label><caption><title> Detection results of the RTBS. 
(a) Background frames, (b) Test frames, (c) Subtraction results.</title></caption><fig id ="fig13_1"><label> (b)</label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x63.png"/></fig><fig id ="fig13_2"><label> (c)</label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x62.png"/></fig><fig id ="fig13_3"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x61.png"/></fig><fig id ="fig13_4"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x66.png"/></fig><fig id ="fig13_5"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x65.png"/></fig><fig id ="fig13_6"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x64.png"/></fig></fig-group><fig id="fig14"  position="float"><label><xref ref-type="fig" rid="fig14">Figure 14</xref></label><caption><title> The linear relationship between the lasting time of an object and the removal time of the object</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730404x67.png"/></fig><table-wrap id="table3" ><label><xref ref-type="table" rid="table3">Table 3</xref></label><caption><title> Resource utilization in FPGA</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Module</th><th align="center" valign="middle"  rowspan="2"  >Memory</th><th align="center" valign="middle"  colspan="2"  >Logic Circuit</th></tr></thead><tr><td align="center" valign="middle" >Logic Element</td><td align="center" valign="middle" >Equivalent Gate Count</td></tr><tr><td align="center" valign="middle" >RTBS 
Module</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >154</td><td align="center" valign="middle" >1848</td></tr><tr><td align="center" valign="middle" >Morphology Module</td><td align="center" valign="middle" >7656</td><td align="center" valign="middle" >104</td><td align="center" valign="middle" >1248</td></tr><tr><td align="center" valign="middle" >RAW2RGB Module</td><td align="center" valign="middle" >12,800</td><td align="center" valign="middle" >88</td><td align="center" valign="middle" >1056</td></tr><tr><td align="center" valign="middle" >Mirror Module</td><td align="center" valign="middle" >19,200</td><td align="center" valign="middle" >21</td><td align="center" valign="middle" >252</td></tr><tr><td align="center" valign="middle" >RGB2Y Module</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >80</td><td align="center" valign="middle" >960</td></tr><tr><td align="center" valign="middle" >SDRAM Controller Module</td><td align="center" valign="middle" >52,348</td><td align="center" valign="middle" >828</td><td align="center" valign="middle" >9936</td></tr><tr><td align="center" valign="middle" >I/O and Other Modules</td><td align="center" valign="middle" >900</td><td align="center" valign="middle" >343</td><td align="center" valign="middle" >4116</td></tr><tr><td align="center" valign="middle" >Total Used Resource</td><td align="center" valign="middle" >92,904</td><td align="center" valign="middle" >1618</td><td align="center" valign="middle" >19,416</td></tr></tbody></table></table-wrap><p>Generally speaking, an ALTERA LE (Logic Element) is equivalent to 8 to 21 logic gates, typically 12 [<xref ref-type="bibr" rid="scirp.69624-ref20">20</xref>]; one bit of internal memory is usually counted as 4 logic gates [<xref ref-type="bibr" rid="scirp.69624-ref20">20</xref>]. This study uses these typical values to estimate the equivalent gate count of the design. 
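</p><p>The conversion from FPGA resources to equivalent gates can be written out as a short calculation. The sketch below applies the typical factors quoted in the text (12 gates per LE, 4 gates per memory bit) to the totals in <xref ref-type="table" rid="table3">Table 3</xref>; the function and constant names are illustrative.</p><preformat preformat-type="code">
```python
GATES_PER_LOGIC_ELEMENT = 12  # typical value from the text (range 8 to 21)
GATES_PER_MEMORY_BIT = 4      # four gates counted per bit of internal memory


def equivalent_gates(logic_elements, memory_bits):
    """Return (logic_gates, memory_gates, total_gates) for a design."""
    logic_gates = logic_elements * GATES_PER_LOGIC_ELEMENT
    memory_gates = memory_bits * GATES_PER_MEMORY_BIT
    return logic_gates, memory_gates, logic_gates + memory_gates


# Totals from Table 3: 1618 logic elements and 92,904 memory bits.
logic, memory, total = equivalent_gates(logic_elements=1618, memory_bits=92_904)
print(logic, memory, total)
```
</preformat><p>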
<xref ref-type="table" rid="table4">Table 4</xref> shows the resource usage relative to the total resources of the DE2 Development board.</p><p>From these figures, the M4K internal memory used by the design is equivalent to about 371,616 logic gates, while the BS equation and other logic circuits use about 19,416 logic gates. Therefore, the whole design uses around 371,616 + 19,416 = 391,032 logic gates.</p><p>Finally, <xref ref-type="table" rid="table5">Table 5</xref> compares the performance of the proposed hardware with that reported in past papers on FPGA implementations of Background Subtraction algorithms.</p></sec></sec><sec id="s5"><title>5. Conclusion</title><p>This paper proposes a background subtraction and morphology algorithm accelerated by reconfigurable hardware, which can help embedded systems achieve real-time security monitoring. The design partitions the functions into background modeling, subtraction and morphology. The high-cost function, background modeling, is reformulated by eliminating division operations, which both reduces resource utilization and improves performance. Data flow analysis further details the calculation of the design. In simulation, a high frame rate of 384 fps for the background subtraction with morphology and modeling module can be achieved at 25 MHz for 640 &#215; 480 resolution videos. 
Real-time performance of 51 fps for the whole system, including off-chip memory</p><table-wrap id="table4" ><label><xref ref-type="table" rid="table4">Table 4</xref></label><caption><title> Using resource percentage in DE22C35FPGA</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Module</th><th align="center" valign="middle"  colspan="2"  >Memory</th><th align="center" valign="middle"  colspan="2"  >Logic Circuit</th><th align="center" valign="middle"  rowspan="2"  >18&#215;18 Multiplier</th><th align="center" valign="middle"  rowspan="2"  >PLLs</th><th align="center" valign="middle"  rowspan="2"  >I/O pin</th><th align="center" valign="middle"  rowspan="2"  >8 MB SDRAM (bits)</th></tr></thead><tr><td align="center" valign="middle" >Block</td><td align="center" valign="middle" >Memory bit</td><td align="center" valign="middle" >Logic Element</td><td align="center" valign="middle" >Equivalent Gate Count</td></tr><tr><td align="center" valign="middle" >Total DE2 Resource</td><td align="center" valign="middle" >105</td><td align="center" valign="middle" >483,840</td><td align="center" valign="middle" >33,216</td><td align="center" valign="middle" >398,592</td><td align="center" valign="middle" >35</td><td align="center" valign="middle" >4</td><td align="center" valign="middle" >475</td><td align="center" valign="middle" >67,108,864</td></tr><tr><td align="center" valign="middle" >Used Resource</td><td align="center" valign="middle" >36</td><td align="center" valign="middle" >92,904</td><td align="center" valign="middle" >1618</td><td align="center" valign="middle" >19,416</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >1</td><td align="center" valign="middle" >425</td><td align="center" valign="middle" >7,987,200</td></tr><tr><td align="center" valign="middle" >Used Percentage</td><td align="center" valign="middle" >34.29%</td><td align="center" valign="middle" >19.20%</td><td align="center" 
valign="middle" >4.87%</td><td align="center" valign="middle" >4.87%</td><td align="center" valign="middle" >14.29%</td><td align="center" valign="middle" >25.00%</td><td align="center" valign="middle" >89.47%</td><td align="center" valign="middle" >11.90%</td></tr></tbody></table></table-wrap><table-wrap id="table5" ><label><xref ref-type="table" rid="table5">Table 5</xref></label><caption><title> FPGA effectiveness and resource comparison table for past papers</title></caption><table><tbody><thead><tr><th align="center" valign="middle"  rowspan="2"  >Ref</th><th align="center" valign="middle"  rowspan="2"  >HW Algorithm</th><th align="center" valign="middle" >Device</th><th align="center" valign="middle"  colspan="4"  >Logic Circuit</th><th align="center" valign="middle"  rowspan="2"  >Clock (MHz)</th><th align="center" valign="middle"  colspan="3"  >Performance</th><th align="center" valign="middle"  rowspan="2"  >Power (w)</th></tr></thead><tr><td align="center" valign="middle" ></td><td align="center" valign="middle" >LUTs</td><td align="center" valign="middle" >FF</td><td align="center" valign="middle" >B-RAM</td><td align="center" valign="middle" >DSP</td><td align="center" valign="middle" >Image Size</td><td align="center" valign="middle" >fps</td><td align="center" valign="middle" >Nfps<sup>*</sup></td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.69624-ref12">12</xref>]</td><td align="center" valign="middle" >RTBS Module</td><td align="center" valign="middle" >Virtex II XC2v6000</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >25</td><td align="center" valign="middle" >720 &#215; 576 monochromatic</td><td align="center" valign="middle" >60</td><td align="center" valign="middle" >27</td><td align="center" valign="middle" >No 
Data</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.69624-ref13">13</xref>]</td><td align="center" valign="middle" >Morphology Module</td><td align="center" valign="middle" >XC4010E</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >25</td><td align="center" valign="middle" >640 &#215; 480</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >No Data</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.69624-ref8">8</xref>]</td><td align="center" valign="middle" >RAW2RGB Module</td><td align="center" valign="middle" >XCV800</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >25</td><td align="center" valign="middle" >640 &#215; 480</td><td align="center" valign="middle" >15</td><td align="center" valign="middle" >15</td><td align="center" valign="middle" >0.121</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.69624-ref17">17</xref>]</td><td align="center" valign="middle" >Mirror Module</td><td align="center" valign="middle" >Virtex 5</td><td align="center" valign="middle" >1572</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >47</td><td align="center" valign="middle" >1920 &#215; 1080</td><td align="center" valign="middle" >22</td><td align="center" valign="middle" >79</td><td align="center" valign="middle" >0.027</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" 
rid="scirp.69624-ref18">18</xref>]</td><td align="center" valign="middle" >RGB2Y Module</td><td align="center" valign="middle" >Virtex 2</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >No data</td><td align="center" valign="middle" >40</td><td align="center" valign="middle" >1024 &#215; 1024</td><td align="center" valign="middle" >38</td><td align="center" valign="middle" >81</td><td align="center" valign="middle" >No Data</td></tr><tr><td align="center" valign="middle" >[<xref ref-type="bibr" rid="scirp.69624-ref19">19</xref>]</td><td align="center" valign="middle" >SDRAM Controller Module</td><td align="center" valign="middle" >Virtex 6</td><td align="center" valign="middle" >788</td><td align="center" valign="middle" >363</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >189</td><td align="center" valign="middle" >1920 &#215; 1080</td><td align="center" valign="middle" >91</td><td align="center" valign="middle" >81</td><td align="center" valign="middle" >0.0012</td></tr><tr><td align="center" valign="middle" >Our</td><td align="center" valign="middle" >I/O and Other Modules</td><td align="center" valign="middle" >Cyclone II</td><td align="center" valign="middle" >1618</td><td align="center" valign="middle" >0</td><td align="center" valign="middle" >36</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >25</td><td align="center" valign="middle" >640 &#215; 480</td><td align="center" valign="middle" >56</td><td align="center" valign="middle" >56</td><td align="center" valign="middle" >0.0026</td></tr><tr><td align="center" valign="middle" >Our<sup>#</sup></td><td align="center" valign="middle" >Total Used Resource</td><td align="center" valign="middle" >Cyclone II</td><td align="center" valign="middle" >1618</td><td 
align="center" valign="middle" >0</td><td align="center" valign="middle" >36</td><td align="center" valign="middle" >5</td><td align="center" valign="middle" >113</td><td align="center" valign="middle" >720 &#215; 576</td><td align="center" valign="middle" >368</td><td align="center" valign="middle" >110</td><td align="center" valign="middle" >0.0026</td></tr></tbody></table></table-wrap><p>Nfps<sup>*</sup>: Fps Normalized by 150 MHz clock and 1920 &#215; 1080 Resolution; Our<sup>#</sup>: Our simulation results for 720 &#215; 576 videos.</p><p>access, demonstrates the efficiency of the design. The implementation on low-end FPGA with low frequency indicates low power consumption. The final verification results show resource utilization of no more than 400 K gate counts, two frame buffers, and 1 MB SDRAM memory size. Further study of complex background subtraction algorithms such as Gaussian mixture model and LBP background subtraction is promising.</p></sec><sec id="s6"><title>Cite this paper</title><p>Hung-Yu Chen,Yuan-Kai Wang, (2016) Hardware Design of Moving Object Detection on Reconfigurable System. Journal of Computer and Communications,04,30-43. doi: 10.4236/jcc.2016.410004</p></sec></body><back><ref-list><title>References</title><ref id="scirp.69624-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Piccardi, M. (2004) Background Subtraction Techniques: A Review. Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, The Hague, Netherlands, 10-13 October 2004, 3099-3104.  
http://dx.doi.org/10.1109/icsmc.2004.1400815</mixed-citation></ref><ref id="scirp.69624-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Cucchiara, R., Grana, C., Piccardi, M. and Prati, A. (2003) Detecting Moving Objects, Ghosts, and Shadows in Video Streams. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25, 1337-1342.  
http://dx.doi.org/10.1109/TPAMI.2003.1233909</mixed-citation></ref><ref id="scirp.69624-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Yang, Y.H. and Levine, M.D. (1992) The Background Primal Sketch: An Approach for Tracking Moving Objects. Machine Vision and Applications, 5, 17-34.</mixed-citation></ref><ref id="scirp.69624-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Masoud, O. and Papanikolopoulos, N.P. (2001) A Novel Method for Tracking and Counting Pedestrians in Real-time Using a Single Camera. IEEE Transactions Vehicular Technology, 50, 1267-1278.</mixed-citation></ref><ref id="scirp.69624-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Shoushtarian, B. and Bez, H.E. (2005) A Practical Adaptive Approach for Dynamic Backgrounds Subtraction Using an Invariant Colour Model and Object Tracking. Pattern Recognition Letters, 26, 5-26.</mixed-citation></ref><ref id="scirp.69624-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Wren, C.R., Azarbayejani, A., Darrell, T. and Pentland, A.P. (1997) Pfinder: Real-Time Tracking of the Human Body. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 780-785. http://dx.doi.org/10.1109/34.598236</mixed-citation></ref><ref id="scirp.69624-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Stauffer, C. and Grimson, W.E.L. (2000) Learning Patterns of Activity Using Real-Time Tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 747-757.</mixed-citation></ref><ref id="scirp.69624-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Elgammal, A., Duraiswami, R., Harwood, D. and Davis, L.S. (2002) Background and Foreground Modeling Using Nonparametric Kernel Density Estimation for Visual Surveillance. Proceedings of the IEEE, 90, 1151-1163.  
http://dx.doi.org/10.1109/JPROC.2002.801448</mixed-citation></ref><ref id="scirp.69624-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Chen, T.P., Horst, H., Alexander, B., Roman, B., Konstantin, R. and Alexander, K. (2005) Computer Vision Workload Analysis: Case Study of Video Surveillance Systems. Intel Technology Journal, 9, 109-118.</mixed-citation></ref><ref id="scirp.69624-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Ooi, M.P. (2006) Hardware Implementation for Face Detection on Xilinx Virtex-II FPGA Using the Reversible Component Transformation Colour Space. Proceedings of the IEEE International Workshop on Electronic Design, Test and Applications, Kuala Lumpur, 17-19 January 2006, 41-46.</mixed-citation></ref><ref id="scirp.69624-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Bramberger, M., Brunner, J. and Rinner, B. (2004) Real Time Video Analysis on an Embedded Smart Camera for Traffic Surveillance. Proceedings of the IEEE 10th Computer Society Conference on Real-Time and Embedded Technology and Applications Symposium, Toronto, Canada, 25-28 May 2004, 174-181.</mixed-citation></ref><ref id="scirp.69624-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Appiah, P., Hunter, K., Ormston, A. and Ormston, S. (2005) An FPGA-Based Infant Monitoring System. Proceedings of the IEEE International Conference on Field-Programmable Technology, Singapore, 11-14 December 2005, 315-316.</mixed-citation></ref><ref id="scirp.69624-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Cucchiara, R., Onfiani, P., Prati, A. and Scarabottolo, N. (1999) Segmentation of Moving Objects at Frame Rate: A Dedicated Hardware Solution. Proceedings of the IEEE 7th International Conference, Image Processing and Its Applications, Manchester, UK, 12-15 July 1999, 138-142. 
http://dx.doi.org/10.1049/cp:19990297</mixed-citation></ref><ref id="scirp.69624-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Stauffer, C. and Grimson, W.E.L. (1999) Adaptive Background Mixture Models for Real-time Tracking. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 246-252.</mixed-citation></ref><ref id="scirp.69624-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Kaew TraKul Pong, P. and Bowden, R. (2001) An Improved Background Mixture Model for Real-time Tracking with Shadow Detection. Proceedings of the 2nd European Workshop on Advanced Video Based Surveillance Systems, London, UK, 7-10 September 2001, 135-144.</mixed-citation></ref><ref id="scirp.69624-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Theocharides, T., Vijaykrishnan, N. and Irwin, M.J. (2006) A Parallel Architecture for Hardware Face Detection. Proceedings of the IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures, Karlsruhe, 2-3 March 2006, 452-453. http://dx.doi.org/10.1109/ISVLSI.2006.10</mixed-citation></ref><ref id="scirp.69624-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Genovese, M., Napoli, E. and Petra, N. (2010) OpenCV Compatible Real Time Processor for Background Foreground Identification. Proceedings of the International Conference on Microelectronics, Nis, Serbia, 16-19 May 2010, 467-470.</mixed-citation></ref><ref id="scirp.69624-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Jang, H. Ardo, H. and Owall, V. (2005) Hardware Accelerator Design for Video Segmentation with Multi-Modal Background Modelling. Proc. IEEE Int. Symp. 
Circuits Syst., Japan, 23-26 May 2005, 1142-1145.</mixed-citation></ref><ref id="scirp.69624-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Genovese, M. and Napoli, E. (2014) ASIC and FPGA Implementation of the Gaussian Mixture Model Algorithm for Real-Time Segmentation of High Definition Video. IEEE Transaction on Very Large Scale Integration (VLSI) System, 22, 537-547.</mixed-citation></ref><ref id="scirp.69624-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Altera Corporation (1999) Gate Counting Methodology for APEX 20K Devices.</mixed-citation></ref></ref-list></back></article>