<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">JCC</journal-id><journal-title-group><journal-title>Journal of Computer and Communications</journal-title></journal-title-group><issn pub-type="epub">2327-5219</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/jcc.2016.46004</article-id><article-id pub-id-type="publisher-id">JCC-66919</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Computer Science&amp;Communications</subject></subj-group></article-categories><title-group><article-title>
 
 
  Supporting Information Extraction from Visual Documents
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Della Penna</surname><given-names>Giuseppe</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Orefice</surname><given-names>Sergio</given-names></name><xref ref-type="aff" rid="aff1"><sup>1</sup></xref></contrib></contrib-group><aff id="aff1"><addr-line>Department of Information Engineering, Computer Science and Mathematics, University of L’Aquila, Via Vetoio, L’Aquila, Italy</addr-line></aff><pub-date pub-type="epub"><day>12</day><month>05</month><year>2016</year></pub-date><volume>04</volume><issue>06</issue><fpage>36</fpage><lpage>48</lpage><history><date date-type="received"><day>10</day>	<month>March</month>	<year>2016</year></date><date date-type="rev-recd"><day>accepted</day>	<month>27</month>	<year>May</year>	</date><date date-type="accepted"><day>30</day>	<month>May</month>	<year>2016</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Visual Information Extraction (VIE) is a technique that enables users to perform information extraction from visual documents, driven by the visual appearance of the elements in the document and the spatial relations occurring among them. In particular, the extractions are expressed through a query language similar to the well-known SQL. To further reduce the human effort in the extraction task, in this paper we present a fully formalized assistance mechanism that helps users in the interactive formulation of the queries.
 
</p></abstract><kwd-group><kwd>Information Extraction</kwd><kwd> Spatial Relations</kwd><kwd> Visual Appearance</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>The process of knowledge acquisition from generic information domains has its central phase in Information Extraction (IE), which aims to extract from the located documents relevant information that appears in certain semantic or syntactic relationships [<xref ref-type="bibr" rid="scirp.66919-ref1">1</xref>]. In particular, IE tries to process the relevant information found in the documents in order to make it available to structured queries. Most often, information extraction systems are customised for specific application domains, and require manual or semi-automatic training sessions.</p><p>In order to apply IE to visual documents, we have developed a technique called Visual Information Extraction (VIE), an IE approach based on the visual appearance of the information, conceived as its user-perceived rendering. This allows one to shift the IE problem from the low level of code (e.g., raster graphics, vector drawing, word processor formatted text, web page, etc.) to the higher level of visual features, providing a paradigm of the kind “what you see drives your search” that supports a natural query formulation. In particular, the extractions are based on the spatial arrangement occurring among the objects of a document, which is modelled by common qualitative spatial relations like topological (i.e., overlapping, adjacency, containment) and direction (e.g., left, up) relations. The focus of our approach is therefore on the spatial feature, which best characterises it and also represents the main novelty with respect to the current IE literature.</p><p>The origin of the VIE technique can be found in [<xref ref-type="bibr" rid="scirp.66919-ref2">2</xref>] [<xref ref-type="bibr" rid="scirp.66919-ref3">3</xref>]. 
In those works, the approach is inspired by the box spatial relation theory [<xref ref-type="bibr" rid="scirp.66919-ref4">4</xref>], such that graphical objects are syntactically described and manipulated through their bounding boxes, whatever the shape of the object is. However, the box formalism has some intrinsic limitations that make it impossible to use in contexts where graphical objects are represented by complex figures. The major shortcoming of such an early proposal is that the Visual Information Extraction technique is strongly customised to specific application domains. Indeed, in [<xref ref-type="bibr" rid="scirp.66919-ref2">2</xref>] [<xref ref-type="bibr" rid="scirp.66919-ref3">3</xref>] the VIE framework is adapted ad hoc to perform extractions on web pages and PDF documents, respectively. Finally, in [<xref ref-type="bibr" rid="scirp.66919-ref5">5</xref>] the VIE theory is refined, allowing it to work directly on the real shapes, and generalised in order to be applied to a wide range of interesting visual domains, including geospatial data.</p><p>The Visual Information Extraction technique is realised through an extraction language, namely the Spatial Relation Query Language (SRQL), which allows users to write SQL-like queries based on the visual arrangement of the information. 
Moreover, both the spatial relation theory and the SRQL language are implemented in a graphical software system, the Spatial Relation Query (SRQ) tool, which also allows the use of further semantic information to refine the queries, thus providing an environment where the information extraction can be performed integrating spatial relations, visual attributes, textual content, etc.</p><p>The main purpose of the research presented in this paper is to further reduce the human effort needed to compose the extraction queries through an assistance mechanism that may be used to interactively formulate the queries. To this aim, we have developed a spatial relation-based assistance technique whose theoretical issues are described in this paper. In particular, we reused the spatial relation formalism of [<xref ref-type="bibr" rid="scirp.66919-ref5">5</xref>] (which has been extended with connection-based relations in order to model also interconnections among objects) and we extensively developed the information extraction task theory based upon the new notion of pictorial view, providing a more refined specification of the queries that has been integrated in the algorithm for the assisted formulation of the queries.</p><p>The paper is organised as follows. Related work is summarised in Section 2. In Section 3 we illustrate the spatial relation formalism underlying the VIE technique, whereas in Section 4 we introduce the notion of Pictorial View and give an in-depth overview of the formalisation of the query process. Section 5 contains the algorithm for the interactive formulation of the queries together with some experimentation examples. Finally, some concluding remarks are outlined in Section 6.</p></sec><sec id="s2"><title>2. 
Related Work</title><p>Information extraction is a very active research area that has received growing attention from different communities, such as the Artificial Intelligence, Information Retrieval and Processing and Web communities.</p><p>In particular, the enormous amount of information present on the web makes it the most appealing domain for developing new information extraction techniques. In the last decade, many approaches have been proposed in this field, and they are usually classified into the following categories (for an extended survey of these approaches we refer the reader to [<xref ref-type="bibr" rid="scirp.66919-ref6">6</xref>] [<xref ref-type="bibr" rid="scirp.66919-ref7">7</xref>]):</p><p>・ HTML-aware tools (e.g., LIXTO [<xref ref-type="bibr" rid="scirp.66919-ref8">8</xref>] and SCRAP [<xref ref-type="bibr" rid="scirp.66919-ref9">9</xref>]), which make use of the HTML parse trees to create extraction rules.</p><p>・ Wrapper induction tools (e.g., DEPTA [<xref ref-type="bibr" rid="scirp.66919-ref10">10</xref>]), where the extraction rules are derived from a given set of training examples or using pattern discovery techniques.</p><p>・ NLP-based tools (e.g., WHISK [<xref ref-type="bibr" rid="scirp.66919-ref11">11</xref>] or the technique in [<xref ref-type="bibr" rid="scirp.66919-ref12">12</xref>]), where the rules are derived by considering phrases and sentences and applying either syntactic analysis, filtering, part-of-speech tagging or lexical semantic tagging.</p><p>・ Modelling-based tools (e.g., DEByE [<xref ref-type="bibr" rid="scirp.66919-ref13">13</xref>] or the Bayesian learning framework in [<xref ref-type="bibr" rid="scirp.66919-ref14">14</xref>]), which rely on locating in the web page portions of data conforming to a predefined structure.</p><p>・ Ontology-based tools (e.g., WeDaX [<xref ref-type="bibr" rid="scirp.66919-ref15">15</xref>] or [<xref ref-type="bibr" rid="scirp.66919-ref16">16</xref>]), that are based on the data and 
not on the structure of the source documents, and use ontologies to locate constants in the page and construct objects with them.</p><p>Furthermore, a number of languages for wrapper development have been designed (e.g., Minerva [<xref ref-type="bibr" rid="scirp.66919-ref17">17</xref>] or TSIMMIS [<xref ref-type="bibr" rid="scirp.66919-ref18">18</xref>]) as well as specifically targeted tools such as WiNTs [<xref ref-type="bibr" rid="scirp.66919-ref19">19</xref>], VENTex [<xref ref-type="bibr" rid="scirp.66919-ref20">20</xref>], GRAPE [<xref ref-type="bibr" rid="scirp.66919-ref21">21</xref>] or SRES [<xref ref-type="bibr" rid="scirp.66919-ref22">22</xref>].</p><p>What differentiates our approach from the cited works is that we directly exploit the visual appearance of the information, rather than relying on the structural and textual information. Furthermore, we do not require any predefined model (as is the case for the approaches based on training sets, ontologies or wrappers).</p><p>Information Extraction literature does include works dealing with the visual appearance of information (see, e.g., [<xref ref-type="bibr" rid="scirp.66919-ref19">19</xref>] [<xref ref-type="bibr" rid="scirp.66919-ref20">20</xref>] [<xref ref-type="bibr" rid="scirp.66919-ref23">23</xref>] [<xref ref-type="bibr" rid="scirp.66919-ref24">24</xref>]), but they are based on visual web page analysis and are targeted to specific tasks (like table extraction or similarity detection). Finally, in [<xref ref-type="bibr" rid="scirp.66919-ref25">25</xref>] the authors propose a visual information extraction algorithm which is specifically targeted to extraction from PDF documents.</p><p>Some more recent works are presented in [<xref ref-type="bibr" rid="scirp.66919-ref26">26</xref>] - [<xref ref-type="bibr" rid="scirp.66919-ref28">28</xref>]. The first presents an algebra for expressing spatial and textual rules that can be defined directly at a layout level. 
It is an interesting approach that, however, is restricted to the web-page domain. On the other hand, the work in [<xref ref-type="bibr" rid="scirp.66919-ref27">27</xref>] deals with the domain of charts, and proposes an approach for improving their accessibility. Finally, the work in [<xref ref-type="bibr" rid="scirp.66919-ref28">28</xref>] presents a hybrid approach based on decision tree learning and extracting rules.</p></sec><sec id="s3"><title>3. Spatial Relation Theory</title><p>In this section we provide the basic notions of the spatial relation theory underlying our VIE technique.</p><sec id="s3_1"><title>3.1. Graphical Objects</title><p>A graphical object is formally defined as a pair <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x6.png" xlink:type="simple"/></inline-formula>, where C denotes the set of all the points <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x7.png" xlink:type="simple"/></inline-formula> forming the external contour of O (which is disjoint from its internal area), and A is the set of the object attributes. Each attribute is a pair <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x8.png" xlink:type="simple"/></inline-formula>, where a is a property name and v its value. In more detail, A models the object semantic properties (e.g., name, colour, font or latitude), and never includes attributes depending on C (such as the object width, area, etc.), whereas C is used essentially for its syntactic manipulation.</p><p>We consider only graphical objects which do not contain holes and have a contour that can be modelled as a closed curve without self loops (simple curve). 
We adopt this restriction because it still leaves a wide and significant domain, excluding only graphical objects with very irregular shapes, which would be hard to handle and, often, of little interest.</p><p>In the following, we shall write <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x9.png" xlink:type="simple"/></inline-formula> to indicate a generic point of C and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x10.png" xlink:type="simple"/></inline-formula>, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x11.png" xlink:type="simple"/></inline-formula> to indicate the x and y-coordinate, respectively, of p. Moreover, we shall use the expression <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x12.png" xlink:type="simple"/></inline-formula> to indicate the set of points enclosed by the contour C of the graphical object O, which we shall call the internal points of O. 
Finally, we shall write <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x13.png" xlink:type="simple"/></inline-formula> to indicate a generic attribute in A and shall use <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x14.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x15.png" xlink:type="simple"/></inline-formula> to refer to its property name and value, respectively. For any attribute a with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x16.png" xlink:type="simple"/></inline-formula>, we shall use the common object-oriented notation <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x17.png" xlink:type="simple"/></inline-formula> to denote <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x18.png" xlink:type="simple"/></inline-formula>.</p><p><xref ref-type="fig" rid="fig1">Figure 1</xref> shows a fragment of a geographic map containing five nations (a, c, e, f, g), a lake (b) and a river (d). All these map symbols can be naturally modelled as graphical objects. As an example, let us give a possible definition of the graphical object <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x19.png" xlink:type="simple"/></inline-formula> corresponding to the nation e on the map. 
In this case, the component C of <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x20.png" xlink:type="simple"/></inline-formula> would be the set of points forming the nation contour highlighted in the figure, whereas the graphical object attributes <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x20.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x21.png" xlink:type="simple"/></inline-formula> may be, e.g., the set</p><disp-formula id="scirp.66919-formula121"><graphic  xlink:href="http://html.scirp.org/file/4-1730342x22.png"  xlink:type="simple"/></disp-formula></sec><sec id="s3_2"><title>3.2. Spatial Relations</title><p>In this section, we briefly illustrate the spatial relation formalism underlying the VIE technique.</p><p>So far [<xref ref-type="bibr" rid="scirp.66919-ref5">5</xref>], we have considered only position-based spatial relations, describing where a graphical object is placed relative to another. Below we give a brief summary of such relations.</p><sec id="s3_2_1"><title>3.2.1. 
Position-Based Relations</title><p>The position-based relations can be subdivided into three classes:</p><p>・ disjoint relations, holding between objects with disjoint contours and areas (e.g., “left of”, “above”, “close to”);</p><fig-group id="fig1"><label><xref ref-type="fig" rid="fig1">Figure 1</xref></label><caption><title> A sample geographic map.</title></caption><fig id ="fig1_1"><label></label><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x23.png"/></fig></fig-group><p>・ overlap relations, holding between objects with intersecting areas and possibly intersecting contours (i.e., “full inclusion” or “partial intersection”);</p><p>・ attach relations, holding between objects with intersecting contours but not areas (i.e., adjacency in one or more points).</p><p>The four distinct disjoint spatial relations are UP, DOWN, LEFT and RIGHT, and are sufficient to model any disjoint spatial arrangement between two graphical objects. Since UP and DOWN are inverses of each other, as are LEFT and RIGHT, we could restrict the set of disjoint spatial relations to one relation from each pair, e.g., LEFT and DOWN. However, for the sake of completeness, in the following we will present the complete set of disjoint spatial relations.</p><p>In order to define them, we need to introduce the following terminology for a generic graphical object <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x24.png" xlink:type="simple"/></inline-formula>. 
Note that, in our formalism, we refer to the canonical orientation of the cartesian axes (i.e., the x coordinate increases rightwards, and the y one increases upwards).</p><p>1) <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x25.png" xlink:type="simple"/></inline-formula>represents the highest y coordinate of the points in the contour C, that is the y coordinate of each point in<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x25.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x26.png" xlink:type="simple"/></inline-formula>, where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x25.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x26.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x27.png" xlink:type="simple"/></inline-formula> denotes the set of upmost points of the contour C.</p><p>2) <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x28.png" xlink:type="simple"/></inline-formula>represents the lowest y coordinate of the points in the contour C, that is the y coordinate of each point in<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x28.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x29.png" xlink:type="simple"/></inline-formula>, where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x28.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x29.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x30.png" xlink:type="simple"/></inline-formula> 
denotes the set of downmost points of the contour<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x28.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x29.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x30.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x31.png" xlink:type="simple"/></inline-formula>.</p><p>3) <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x32.png" xlink:type="simple"/></inline-formula>represents the lowest x coordinate of the points in the contour C, that is the x coordinate of each point in<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x32.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x33.png" xlink:type="simple"/></inline-formula>, where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x32.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x33.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x34.png" xlink:type="simple"/></inline-formula> denotes the set of leftmost points of the contour C.</p><p>4) <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x35.png" xlink:type="simple"/></inline-formula>represents the highest x coordinate of the points in the contour C, that is the x coordinate of each point in<inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x35.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x36.png" 
xlink:type="simple"/></inline-formula>, where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x35.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x36.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x37.png" xlink:type="simple"/></inline-formula> denotes the set of rightmost points of the contour C.</p><p>Then, given two graphical objects <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x38.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x38.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x39.png" xlink:type="simple"/></inline-formula>, the four disjoint spatial relations are defined as in <xref ref-type="table" rid="table1">Table 1</xref>. Intuitively, the UP relation models a spatial arrangement between a and b whenever the graphical object b is completely (both internal points and external contour) below the graphical object a. For example, <xref ref-type="fig" rid="fig2">Figure 2</xref>(a) shows a spatial arrangement where the UP relation holds between a hexagon a and a circle b. 
Note that, in the figure, the round callouts are not part of the graphical objects, but are only used to indicate their names.</p><p>The two overlap spatial relations INCLUDE and INTERSECT are defined as shown in <xref ref-type="table" rid="table2">Table 2</xref>.</p><p>In particular, the INCLUDE relation models a spatial arrangement between a and b whenever the graphical object b is properly contained in a, and then their external contours do not intersect.</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> Definition of disjoint spatial relations</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >UP</th><th align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x40.png" xlink:type="simple"/></inline-formula></th></tr></thead><tr><td align="center" valign="middle" >DOWN</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x41.png" xlink:type="simple"/></inline-formula></td></tr><tr><td align="center" valign="middle" >LEFT</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x42.png" xlink:type="simple"/></inline-formula></td></tr><tr><td align="center" valign="middle" >RIGHT</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x43.png" xlink:type="simple"/></inline-formula></td></tr></tbody></table></table-wrap><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 2</xref></label><caption><title> Definition of overlap spatial relations</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >INCLUDE</th><th align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x44.png" 
xlink:type="simple"/></inline-formula></th></tr></thead><tr><td align="center" valign="middle" >INTERSECT</td><td align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x45.png" xlink:type="simple"/></inline-formula></td></tr></tbody></table></table-wrap><fig id="fig2"  position="float"><label><xref ref-type="fig" rid="fig2">Figure 2</xref></label><caption><title> Spatial relations UP, INTERSECT and TOUCH</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x46.png"/></fig><p>On the other hand, the INTERSECT relation models a spatial arrangement between a and b whenever there is a partial or full overlapping of their internal points, and there is an intersection between their external contours.</p><p>For example, <xref ref-type="fig" rid="fig2">Figure 2</xref>(b) shows a spatial arrangement where the INTERSECT relation holds between the objects a and b.</p><p>Finally, the attach spatial relation TOUCH is defined as shown in <xref ref-type="table" rid="table3">Table 3</xref>.</p><p>The TOUCH relation models a spatial arrangement between a and b whenever there is no overlapping of their internal points, but there is an intersection between their external contours.</p><p>For example, <xref ref-type="fig" rid="fig2">Figure 2</xref>(c) shows a spatial arrangement where the TOUCH relation holds between the objects a and b.</p></sec><sec id="s3_2_2"><title>3.2.2. Connection-Based Relations</title><p>In this section, we introduce the new connection-based spatial relation LINK(h,k), which models the interconnections among graphical objects, and is defined as shown in <xref ref-type="table" rid="table4">Table 4</xref>. 
Note that LINK(h,k) is an explicit relation and is therefore visualised through a polyline.</p><p>In other words, LINK(h,k) is a relation that models a spatial arrangement between graphical objects a and b whenever they are disjoint and there is a connection between a point <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x47.png" xlink:type="simple"/></inline-formula> and a point <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x47.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x48.png" xlink:type="simple"/></inline-formula>. For example, <xref ref-type="fig" rid="fig3">Figure 3</xref> shows a spatial arrangement where the LINK(h,k) relation holds between the objects a and b. It is worth noting that, in the definition of LINK(h,k) given in <xref ref-type="table" rid="table4">Table 4</xref>, h and k are predefined (both in number and position) subsets of C and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x47.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x48.png" xlink:type="simple"/></inline-formula><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x49.png" xlink:type="simple"/></inline-formula>, respectively.</p><p>In particular, in general graphs (e.g., state-transition diagrams, see <xref ref-type="fig" rid="fig4">Figure 4</xref>(a)), where the specific connection point is not relevant for the relation semantics, these subsets coincide with the overall contours, whereas in “puzzle”-like languages (e.g., tiles, see <xref ref-type="fig" rid="fig4">Figure 4</xref>(b)) they coincide with predefined contiguous segments of the contours. In the latter case, the specific connection point is still not relevant, as long as it belongs to one of the predefined segments. 
Finally, in plex languages [<xref ref-type="bibr" rid="scirp.66919-ref29">29</xref>] (e.g., flow charts, see <xref ref-type="fig" rid="fig4">Figure 4</xref>(c)), h and k contain specific predefined points of the contours, which determine the relation semantics. For example, in <xref ref-type="fig" rid="fig4">Figure 4</xref>(c), the flowchart diamond D has exactly three connection points, with a precise position on the contour of the object,</p><table-wrap id="table3" ><label><xref ref-type="table" rid="table3">Table 3</xref></label><caption><title> Definition of the TOUCH spatial relation</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >TOUCH</th><th align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x50.png" xlink:type="simple"/></inline-formula></th></tr></thead></tbody></table></table-wrap><table-wrap id="table4" ><label><xref ref-type="table" rid="table4">Table 4</xref></label><caption><title> Definition of the LINK(h,k) spatial relation</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >LINK(h,k)</th><th align="center" valign="middle" ><inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x51.png" xlink:type="simple"/></inline-formula></th></tr></thead></tbody></table></table-wrap><fig id="fig3"  position="float"><label><xref ref-type="fig" rid="fig3">Figure 3</xref></label><caption><title> Spatial relation LINK(h,k)</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x52.png"/></fig><fig id="fig4"  position="float"><label><xref ref-type="fig" rid="fig4">Figure 4</xref></label><caption><title> Examples of interconnections</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x53.png"/></fig><p>corresponding to the input flow, the “true” output flow and the 
“false” output flow, respectively.</p></sec></sec></sec><sec id="s4"><title>4. VIE Query Theory</title><p>In this section we provide the theoretical foundations of the VIE query task, showing how to formalise queries that extract information from visual documents, i.e., generic documents composed of graphical objects with attributes arranged in a two-dimensional space.</p><p>To represent a visual document formally, we first introduce the notion of pictorial view, which is defined as a set of triples</p><disp-formula id="scirp.66919-formula122"><graphic  xlink:href="http://html.scirp.org/file/4-1730342x54.png"  xlink:type="simple"/></disp-formula><p>where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x55.png" xlink:type="simple"/></inline-formula> are graphical objects and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x56.png" xlink:type="simple"/></inline-formula> is a spatial relation. 
The triple <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x57.png" xlink:type="simple"/></inline-formula> means that the spatial relation <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x58.png" xlink:type="simple"/></inline-formula> holds between the objects <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x59.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x60.png" xlink:type="simple"/></inline-formula> in the given visual representation.</p><p>In other words, we represent visual documents through the notion of pictorial view, conceived as a collection of graphical objects and a set of spatial relations arranging them in a two-dimensional space. Pictorial views allow one to characterise the essential content of visual representations, including only the knowledge strictly necessary to manipulate their graphical objects.</p><p>Indeed, a visual document can have many different pictorial views, each one containing only the part of the overall information which is strictly necessary for that specific view. Therefore, a pictorial view is not required to refer to all the graphical objects present in a visual document or to list all the relations holding among them.</p><p>Then, we can define in detail the structure of a VIE query, modelling the task of information extraction from a pictorial view. 
In the following, we use P to denote a generic pictorial view, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x61.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x62.png" xlink:type="simple"/></inline-formula> to denote the sets of graphical objects and spatial relations included in P, respectively, and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x63.png" xlink:type="simple"/></inline-formula> a set of attribute names relative to the graphical objects in <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x64.png" xlink:type="simple"/></inline-formula>.</p><p>Definition 1 (Query) A VIE query Q can be:</p><p>1) a simple query, i.e., a sequence <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x65.png" xlink:type="simple"/></inline-formula> of query steps;</p><p>2) a combined query, i.e., 
the combination of two queries through the set operations <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x66.png" xlink:type="simple"/></inline-formula> (union), <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x67.png" xlink:type="simple"/></inline-formula> (intersection), \ (difference), e.g., <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x68.png" xlink:type="simple"/></inline-formula>.</p><p>Definition 2 (Query step) A query step s can be one of the following:</p><p>1) a direct relation step, written as R or <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x69.png" xlink:type="simple"/></inline-formula>, with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x70.png" xlink:type="simple"/></inline-formula>;</p><p>2) an inverse relation step, written as R or <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x71.png" xlink:type="simple"/></inline-formula>, with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x72.png" xlink:type="simple"/></inline-formula>;</p><p>3) a global (direct or inverse) relation step, written as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x73.png" 
xlink:type="simple"/></inline-formula> (direct) or <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x74.png" xlink:type="simple"/></inline-formula> (inverse), with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x75.png" xlink:type="simple"/></inline-formula>;</p><p>4) an attribute filter step, written as a sequence <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x76.png" xlink:type="simple"/></inline-formula>, where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x77.png" xlink:type="simple"/></inline-formula>, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x78.png" xlink:type="simple"/></inline-formula> is a constant value and <inline-formula><inline-graphic 
xlink:href="http://html.scirp.org/file/4-1730342x79.png" xlink:type="simple"/></inline-formula>;</p><p>5) an order step, written as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x80.png" xlink:type="simple"/></inline-formula>, where <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x81.png" xlink:type="simple"/></inline-formula>, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x82.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x83.png" xlink:type="simple"/></inline-formula> are “internal attributes” denoting the uppermost and lowermost y-coordinate and the leftmost and rightmost x-coordinate of a point in an object contour, respectively;</p><p>6) an extract step, written as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x84.png" xlink:type="simple"/></inline-formula> or <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x85.png" 
xlink:type="simple"/></inline-formula>;</p><p>7) a subquery step, written as a valid query <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x86.png" xlink:type="simple"/></inline-formula>.</p><p>Definition 3 (Query Context) A query context is an ordered list of graphical objects from P, written as <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x87.png" xlink:type="simple"/></inline-formula> with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x88.png" xlink:type="simple"/></inline-formula>.</p><p>Definition 4 (Query Results) A query Q, executed on a pictorial view P in a context <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x89.png" xlink:type="simple"/></inline-formula>, returns a list of graphical objects denoted by <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x90.png" xlink:type="simple"/></inline-formula>, which is recursively defined as follows.</p><p>For simple queries:</p><p>1) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x91.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x92.png" xlink:type="simple"/></inline-formula></p><p>2) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x93.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x94.png" xlink:type="simple"/></inline-formula></p><p>3) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x95.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x96.png" xlink:type="simple"/></inline-formula></p><p>4) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x97.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x98.png" xlink:type="simple"/></inline-formula></p><p>(note that the actual order of objects in the result of these steps is left unspecified).</p><p>5) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x99.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x100.png" xlink:type="simple"/></inline-formula> with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x101.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x102.png" xlink:type="simple"/></inline-formula> holds.</p><p>6) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x103.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x104.png" xlink:type="simple"/></inline-formula> ordered w.r.t. the ascending/descending (d) value of attribute a.</p><p>7) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x105.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x106.png" xlink:type="simple"/></inline-formula>.</p><p>8) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x107.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x108.png" xlink:type="simple"/></inline-formula>.</p><p>9) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x109.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x110.png" xlink:type="simple"/></inline-formula>.</p><p>10) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x111.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x112.png" xlink:type="simple"/></inline-formula>.</p><p>For combined queries, let <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x113.png" xlink:type="simple"/></inline-formula> be two (sub) queries, with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x114.png" xlink:type="simple"/></inline-formula>, <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x115.png" xlink:type="simple"/></inline-formula>:</p><p>11) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x116.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x117.png" xlink:type="simple"/></inline-formula>.</p><p>12) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x118.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x119.png" xlink:type="simple"/></inline-formula> with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x120.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x121.png" xlink:type="simple"/></inline-formula>.</p><p>13) if <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x122.png" xlink:type="simple"/></inline-formula>, then <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x123.png" xlink:type="simple"/></inline-formula> with <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x124.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x125.png" xlink:type="simple"/></inline-formula>.</p><p>To better understand how a VIE query extracts graphical objects from a pictorial view following the rules introduced in the above query results definition, let us consider the visual document depicted in <xref ref-type="fig" rid="fig5">Figure 5</xref>. For the sake of simplicity, we avoid introducing specific pictorial views of this document, and in the queries reported below we consider only spatial relations (e.g., Left or Right) and attributes (e.g., the shape) which can be clearly inferred from that document. In other words, the following queries will be based on an overall pictorial view containing all the ten graphical objects together with all the disjoint spatial relations holding among them.</p><p>・ <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x126.png" xlink:type="simple"/></inline-formula></p><p>(this query extracts all the objects having a square on the left, i.e., 
T1, T3, T4, C2, C3, S3)</p><p>・ <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x127.png" xlink:type="simple"/></inline-formula></p><p>(this query extracts all the objects having a circle on the upper right, i.e., T2, S2, T3, T4)</p><p>・ <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x128.png" xlink:type="simple"/></inline-formula></p><p>(this query combines the previous <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x129.png" xlink:type="simple"/></inline-formula> and <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x130.png" xlink:type="simple"/></inline-formula> to extract all the objects having a square on the left and a circle on the upper right, i.e., T3, T4)</p><p>・ <inline-formula><inline-graphic xlink:href="http://html.scirp.org/file/4-1730342x131.png" xlink:type="simple"/></inline-formula></p><disp-formula id="scirp.66919-formula123"><label>(this query uses the previous query as a subquery in order to extract the first object in y-order having a square on the left and a circle on the upper right, i.e., T3)</label><graphic position="anchor" xlink:href="http://html.scirp.org/file/4-1730342x132.png"  xlink:type="simple"/></disp-formula></sec><sec id="s5"><title>5. Assisted Query Formulation</title><p>In the Introduction, we outlined the origin and meaning of the VIE research, whereas in the previous section we presented the new VIE query theory. In this section, we show how the VIE framework has been upgraded by developing an incremental algorithm to assist the user in the interactive formulation of extraction queries. The algorithm follows the query formalisation illustrated in Section 4, which has been implemented in a prototype software application written in Java. 
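</p><p>As a rough illustration of that formalisation (a minimal Python sketch, not the authors' Java implementation), the fragment below models a pictorial view as a set of (a, R, b) triples and a simple query as a sequence of steps applied to a context, i.e., an ordered list of objects. The object names, the Above relation, the weight values and every function name are hypothetical, and the reading of a relation step as "keep the context objects a for which some triple (a, R, b) exists" is an assumption made only for this example.</p><preformat>
```python
import operator

# Hypothetical pictorial view: a triple (a, R, b) says that relation R holds
# between objects a and b. Assumption: (a, "Above", b) reads "a is above b".
VIEW = {("A", "Above", "B"), ("C", "Above", "D"), ("E", "Above", "B")}

# Hypothetical attribute values of the graphical objects.
ATTRS = {
    "A": {"weight": 40},
    "B": {"weight": 10},
    "C": {"weight": 20},
    "D": {"weight": 34},
    "E": {"weight": 34},
}


def relation_step(rel):
    """Keep each context object a that has some triple (a, rel, b) in the view."""
    def step(view, context):
        sources = {a for (a, r, _b) in view if r == rel}
        return [obj for obj in context if obj in sources]
    return step


def attribute_step(name, compare, constant):
    """Keep the objects whose attribute compares successfully with a constant."""
    def step(view, context):
        return [obj for obj in context
                if name in ATTRS[obj] and compare(ATTRS[obj][name], constant)]
    return step


def order_step(name, descending=False):
    """Sort the context by the value of an attribute."""
    def step(view, context):
        return sorted(context, key=lambda obj: ATTRS[obj][name], reverse=descending)
    return step


def extract_step(position):
    """Keep only the object at a 1-based position, if there is one."""
    def step(view, context):
        return context[position - 1:position]
    return step


def run_query(steps, view, context):
    """Apply the steps in sequence; each result is the next step's context."""
    for step in steps:
        context = step(view, context)
    return context


def intersection(first, second):
    """Combine two result lists, keeping the order of the first one."""
    return [obj for obj in first if obj in second]


everything = sorted(ATTRS)  # an empty query conventionally selects all objects
stored = [relation_step("Above"), attribute_step("weight", operator.ge, 34)]
reused = stored + [extract_step(1)]
print(run_query(stored, VIEW, everything))  # ['A', 'E']
print(run_query(reused, VIEW, everything))  # ['A']
```
</preformat><p>Under these assumptions, the stored query mirrors the sample session described later in this section (objects with some other object below them and a weight of at least 34), and the reused query adds an extract step that keeps only the first result.</p><p>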
The overall query definition algorithm is shown in <xref ref-type="fig" rid="fig6">Figure 6</xref>.</p><p>Starting from a pictorial view (which is shown in the application interface) and an empty query (which conventionally selects all the graphical objects of the pictorial view), in each iteration the application highlights the objects returned by the current query and allows the user to choose the next step to add to the query. Then, the objects resulting from the application of this step to the current query are previewed on the interface. Once the user confirms the step, the query is updated to reflect the current results. This process is repeated until the user confirms the query, which is then stored in the application query library.</p><p>In more detail, <xref ref-type="fig" rid="fig7">Figure 7</xref> shows the specific algorithm fragments underlying the Get_Step function. As an example, as shown in <xref ref-type="fig" rid="fig7">Figure 7</xref>(a), after selecting the “Relation Step” mode, the user should choose a feasible graphical object from the highlighted ones. Then, the interface opens a menu displaying the feasible spatial relations holding between the selected object and the other ones, and the user can select one of these relations. This process is repeated so that the user can select more objects and related relations, which are combined through the AND operator. 
Finally, the corresponding direct relation step (which is the only relation step currently supported by the application) is built and returned to the overall algorithm.</p><p>Figures 8-11 show an example where the above algorithm is executed during a sample running session of the</p><fig id="fig5"  position="float"><label><xref ref-type="fig" rid="fig5">Figure 5</xref></label><caption><title> A sample visual document</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x133.png"/></fig><fig id="fig6"  position="float"><label><xref ref-type="fig" rid="fig6">Figure 6</xref></label><caption><title> Overall query definition algorithm</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x134.png"/></fig><fig id="fig7"  position="float"><label><xref ref-type="fig" rid="fig7">Figure 7</xref></label><caption><title> Specific query step definition algorithms</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x135.png"/></fig><fig id="fig8"  position="float"><label><xref ref-type="fig" rid="fig8">Figure 8</xref></label><caption><title> The application main interface</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x136.png"/></fig><fig id="fig9"  position="float"><label><xref ref-type="fig" rid="fig9">Figure 9</xref></label><caption><title> Building a relation step</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x137.png"/></fig><fig id="fig10"  position="float"><label><xref ref-type="fig" rid="fig10">Figure 10</xref></label><caption><title> Building an attribute step</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  
xlink:href="http://html.scirp.org/file/4-1730342x138.png"/></fig><fig id="fig11"  position="float"><label><xref ref-type="fig" rid="fig11">Figure 11</xref></label><caption><title> Completing the query definition</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x139.png"/></fig><p>application graphical interface. In particular, the user builds a query which selects the graphical objects having some other object below them (<xref ref-type="fig" rid="fig9">Figure 9</xref>) and a weight attribute greater than or equal to 34 (<xref ref-type="fig" rid="fig10">Figure 10</xref>). Then, this query is stored in the library (<xref ref-type="fig" rid="fig11">Figure 11</xref>(a)) and reused to build a second query which adds an extraction step (<xref ref-type="fig" rid="fig11">Figure 11</xref>(b)) in order to select only the first of such objects, in extraction order.</p><fig id="fig12"  position="float"><label><xref ref-type="fig" rid="fig12">Figure 12</xref></label><caption><title> Query library in XML format</title></caption><graphic mimetype="image"   position="float"  xlink:type="simple"  xlink:href="http://html.scirp.org/file/4-1730342x140.png"/></fig></sec><sec id="s6"><title>6. Conclusions</title><p>In this paper we have developed a general query theory which has been integrated in the VIE technique, making it possible to formalise how graphical objects can be extracted from visual documents.</p><p>This theory has been used to define an incremental algorithm implemented in a software application that allows users to formulate queries through a user-friendly graphical interface, directly interacting with the visual document.</p><p>Our main future research aims to integrate this application within the SRQ tool, providing it with an assisted visual editor to perform VIE without the need to write complex SRQL statements. 
To this end, the library of queries created by the application can be exported (see the XML fragment in <xref ref-type="fig" rid="fig12">Figure 12</xref> as an example) in a format compatible with the SRQ internal query representation.</p><p>Once accomplished, this integration would allow us to experiment with our assisted query mechanism on the other meaningful domains which are currently supported by the SRQ tool, like web pages or geospatial data.</p></sec><sec id="s7"><title>Cite this paper</title><p>Giuseppe Della Penna, Sergio Orefice (2016) Supporting Information Extraction from Visual Documents. Journal of Computer and Communications, 04, 36-48. doi: 10.4236/jcc.2016.46004</p></sec></body><back><ref-list><title>References</title><ref id="scirp.66919-ref1"><label>1</label><mixed-citation publication-type="journal" xlink:type="simple"><name name-style="western"><surname>Moens</surname><given-names> M.-F. </given-names></name>,<etal>et al</etal>. (<year>2006</year>)<article-title>Information Extraction: Algorithms and Prospects in a Retrieval Context</article-title><source> The Information Retrieval Series</source><volume> 21</volume>,<fpage> 1</fpage>-<lpage>45</lpage>.<pub-id pub-id-type="doi"></pub-id></mixed-citation></ref><ref id="scirp.66919-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Della Penna, G., Magazzeni, D. and Orefice, S. (2010) Visual Extraction of Information from Web Pages. Journal of Visual Languages and Computing, 21, 23-32. http://dx.doi.org/10.1016/j.jvlc.2009.06.001</mixed-citation></ref><ref id="scirp.66919-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Della Penna, G., Magazzeni, D. and Orefice, S. (2012) A Spatial Relation-Based Framework to Perform Visual Information Extraction. 
Knowledge and Information Systems, 30, 667-692.</mixed-citation></ref><ref id="scirp.66919-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Costagliola, G., De Lucia, A., Orefice, S. and Polese, G. (2002) A Classification Framework to Support the Design of Visual Languages. Journal of Visual Languages and Computing, 13, 573-600. http://dx.doi.org/10.1006/jvlc.2002.0234</mixed-citation></ref><ref id="scirp.66919-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Della Penna, G., Magazzeni, D. and Orefice, S. (2013) A General Theory of Spatial Relations to Support a Graphical Tool for Visual Information Extraction. Journal of Visual Languages and Computing, 24, 71-87.  
http://dx.doi.org/10.1016/j.jvlc.2012.11.002</mixed-citation></ref><ref id="scirp.66919-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Laender, A.H.F., Ribeiro-Neto, B.A., da Silva, A.S. and Teixeira, J.S. (2002) A Brief Survey of Web Data Extraction Tools. ACM SIGMOD Record, 31, 84-93. http://dx.doi.org/10.1145/565117.565137</mixed-citation></ref><ref id="scirp.66919-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Lam, M. and Gong, Z. (2005) Web Information Extraction. Proceedings of the IEEE International Conference on Information Acquisition, Hong Kong and Macau, 27 June-3 July 2005, 596-601.</mixed-citation></ref><ref id="scirp.66919-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Gottlob, G., Koch, C., Baumgartner, R., Herzog, M. and Flesca, S. (2004) The LIXTO Data Extraction Project—Back and Forth between Theory and Practice. Proceedings of the Symposium on Principles of Database Systems (PODS-04), Paris, 14 June 2004, 1-12.</mixed-citation></ref><ref id="scirp.66919-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Fazzinga, B., Flesca, S. and Tagarelli, A. (2011) Schema-Based Web Wrapping. Knowledge and Information Systems, 26, 127-173. http://dx.doi.org/10.1007/s10115-009-0275-2</mixed-citation></ref><ref id="scirp.66919-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Zhai, Y. and Liu, B. (2005) Web Data Extraction Based on Partial Tree Alignment. WWW’05: Proceedings of the 14th International Conference on World Wide Web, Chiba, 10-14 May 2005, 76-85.  
http://dx.doi.org/10.1145/1060745.1060761</mixed-citation></ref><ref id="scirp.66919-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Soderland, S. (1999) Learning Information Extraction Rules for Semi-Structured and Free Text. Machine Learning, 34, 233-272. http://dx.doi.org/10.1023/A:1007562322031</mixed-citation></ref><ref id="scirp.66919-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Mittendorfer, M. and Winiwarter, W. (2002) Exploiting Syntactic Analysis of Queries for Information Retrieval. Data &amp; Knowledge Engineering, 42, 315-325. http://dx.doi.org/10.1016/S0169-023X(02)00049-6</mixed-citation></ref><ref id="scirp.66919-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Laender, A.H.F., Ribeiro-Neto, B. and da Silva, A.S. (2002) DEByE—Data Extraction by Example. Data &amp; Knowledge Engineering, 40, 121-154. http://dx.doi.org/10.1016/S0169-023X(01)00047-7</mixed-citation></ref><ref id="scirp.66919-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Wong, T.-L. and Lam, W. (2010) Learning to Adapt Web Information Extraction Knowledge and Discovering New Attributes via a Bayesian Approach. IEEE Transactions on Knowledge and Data Engineering, 22, 523-536.  
http://dx.doi.org/10.1109/TKDE.2009.111</mixed-citation></ref><ref id="scirp.66919-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Snoussi, H., Magnin, L. and Nie, J. (2002) Towards an Ontology-Based Web Data Extraction. BASeWEB Proceedings of the Fifteenth Canadian Conference on Artificial Intelligence AI 2002, Alberta, 27-29 May 2002.</mixed-citation></ref><ref id="scirp.66919-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Jimeno-Yepes, A., Llavori, R.B. and Rebholz-Schuhmann, D. (2010) Ontology Refinement for Improved Information Retrieval. Information Processing &amp; Management, 46, 426-435. http://dx.doi.org/10.1016/j.ipm.2009.05.008</mixed-citation></ref><ref id="scirp.66919-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Crescenzi, V. and Mecca, G. (1998) Grammars Have Exceptions. Information Systems, 23, 539-565.  
http://dx.doi.org/10.1016/S0306-4379(98)00028-3</mixed-citation></ref><ref id="scirp.66919-ref18"><label>18</label><mixed-citation publication-type="book" xlink:type="simple">Hammer, J., McHugh, J. and Garcia-Molina, H. (1997) Semistructured Data: The TSIMMIS Experience. In: Manthey, R. and Wolfengagen, V., Eds., Advances in Databases and Information Systems, Springer, St. Petersburg, 1-13.</mixed-citation></ref><ref id="scirp.66919-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Zhao, H., Meng, W., Wu, Z., Raghavan, V. and Yu, C. (2005) Fully Automatic Wrapper Generation for Search Engines. Proceedings of the 14th International Conference on World Wide Web, Chiba, 10-14 May 2005, 66-75.  
http://dx.doi.org/10.1145/1060745.1060760</mixed-citation></ref><ref id="scirp.66919-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Gatterbauer, W., Bohunsky, P., Herzog, M., Krüpl, B. and Pollak, B. (2007) Towards Domain-Independent Information Extraction from Web Tables. Proceedings of the 16th International Conference on World Wide Web, Banff, 8-12 May 2007, 71-80. http://dx.doi.org/10.1145/1242572.1242583</mixed-citation></ref><ref id="scirp.66919-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Jiang, L., Wang, J., An, N., Wang, S., Zhan, J. and Li, L. (2009) GRAPE: A Graph-Based Framework for Disambiguating People Appearances in Web Search. Proceedings of IEEE International Conference on Data Mining, Miami, 6-9 December 2009, 199-208. http://dx.doi.org/10.1109/icdm.2009.25</mixed-citation></ref><ref id="scirp.66919-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Rosenfeld, B. and Feldman, R. (2008) Self-Supervised Relation Extraction from the Web. Knowledge and Information Systems, 17, 17-33. http://dx.doi.org/10.1007/s10115-007-0110-6</mixed-citation></ref><ref id="scirp.66919-ref23"><label>23</label><mixed-citation publication-type="book" xlink:type="simple">Gu, X., Chen, J., Ma, W. and Chen, G. (2002) Visual Based Content Understanding towards Web Adaptation. In: De Bra, P., Brusilovsky, P. and Conejo, R., Eds., Adaptive Hypermedia and Adaptive Web-Based Systems, Springer-Verlag, Berlin, 164-173. http://dx.doi.org/10.1007/3-540-47952-x_18</mixed-citation></ref><ref id="scirp.66919-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Yang, Y. and Zhang, H. (2001) HTML Page Analysis Based on Visual Cues. Proceedings of the 6th International Conference on Document Analysis and Recognition, Seattle, 10-13 September 2001, 859-864.  
http://dx.doi.org/10.1109/ICDAR.2001.953909</mixed-citation></ref><ref id="scirp.66919-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">Aumann, Y., Feldman, R., Liberzon, Y., Rosenfeld, B. and Schler, J. (2006) Visual Information Extraction. Knowledge and Information Systems, 10, 1-15. http://dx.doi.org/10.1007/s10115-006-0014-x</mixed-citation></ref><ref id="scirp.66919-ref26"><label>26</label><mixed-citation publication-type="book" xlink:type="simple">Chenthamarakshan, V., Varadarajan, R., Deshpande, P.M., Krishnapuram, R. and Stolze, K. (2012) WYSIWYE: An Algebra for Expressing Spatial and Textual Rules for Information Extraction. In: Gao, H., Lim, L., Wang, W., Li, C. and Chen, L., Eds., Web-Age Information Management—13th International Conference, Springer, Berlin, 419-433.  
http://dx.doi.org/10.1007/978-3-642-32281-5_41</mixed-citation></ref><ref id="scirp.66919-ref27"><label>27</label><mixed-citation publication-type="other" xlink:type="simple">Gao, J., Zhou, Y. and Barner, K.E. (2012) View: Visual Information Extraction Widget for Improving Chart Images Accessibility. 19th IEEE International Conference on Image Processing, Orlando, 30 September-3 October 2012, 2865-2868. http://dx.doi.org/10.1109/icip.2012.6467497</mixed-citation></ref><ref id="scirp.66919-ref28"><label>28</label><mixed-citation publication-type="other" xlink:type="simple">Uzun, E., Agun, H.V. and Yerlikaya, T. (2013) A Hybrid Approach for Extracting Informative Content from Web Pages. Information Processing &amp; Management, 49, 928-944. http://dx.doi.org/10.1016/j.ipm.2013.02.005</mixed-citation></ref><ref id="scirp.66919-ref29"><label>29</label><mixed-citation publication-type="other" xlink:type="simple">Feder, J. (1971) Plex Languages. Information Sciences, 3, 225-241. http://dx.doi.org/10.1016/S0020-0255(71)80008-7</mixed-citation></ref></ref-list></back></article>