<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN" "JATS-journalpublishing1-4.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.4" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">nr</journal-id>
      <journal-title-group>
        <journal-title>Natural Resources</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2158-7086</issn>
      <issn pub-type="ppub">2158-706X</issn>
      <publisher>
        <publisher-name>Scientific Research Publishing</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.4236/nr.2025.1613028</article-id>
      <article-id pub-id-type="publisher-id">nr-148411</article-id>
      <article-categories>
        <subj-group>
          <subject>Article</subject>
        </subj-group>
        <subj-group>
          <subject>Earth</subject>
          <subject>Environmental Sciences</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Extraction of Unique Plant Species Communities from the Sub-Humid Humid Bioclimate of Martinique</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name name-style="western">
            <surname>Simphor</surname>
            <given-names>Jean-Emile</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <contrib-id contrib-id-type="orcid">0000-0001-6313-3976</contrib-id>
          <name name-style="western">
            <surname>Claude</surname>
            <given-names>Jean-Philippe</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name name-style="western">
            <surname>Joseph</surname>
            <given-names>Philippe</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
      </contrib-group>
      <aff id="aff1"><label>1</label> UMR ESPACE DEV-BIORECA Laboratory, University of Antilles, Schœlcher, France </aff>
      <author-notes>
        <fn fn-type="conflict" id="fn-conflict">
          <p>The authors declare no conflicts of interest regarding the publication of this paper.</p>
        </fn>
      </author-notes>
      <pub-date pub-type="epub">
        <day>17</day>
        <month>12</month>
        <year>2025</year>
      </pub-date>
      <pub-date pub-type="collection">
        <month>12</month>
        <year>2025</year>
      </pub-date>
      <volume>16</volume>
      <issue>13</issue>
      <fpage>565</fpage>
      <lpage>583</lpage>
      <history>
        <date date-type="received">
          <day>15</day>
          <month>01</month>
          <year>2025</year>
        </date>
        <date date-type="accepted">
          <day>26</day>
          <month>12</month>
          <year>2025</year>
        </date>
        <date date-type="published">
          <day>29</day>
          <month>12</month>
          <year>2025</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>© 2025 by the authors and Scientific Research Publishing Inc.</copyright-statement>
        <copyright-year>2025</copyright-year>
        <license license-type="open-access">
          <license-p> This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link> ). </license-p>
        </license>
      </permissions>
      <self-uri content-type="doi" xlink:href="https://doi.org/10.4236/nr.2025.1613028">https://doi.org/10.4236/nr.2025.1613028</self-uri>
      <abstract>
        <p>Biodiversity in forest ecosystems is crucial for regulating ecological processes and delivering essential ecosystem services. In this study, we investigate how specific plant species communities in secondary sylvatic formations (SSF) reflect particular bioclimatic conditions and successional stages in a Sub-Humid Humid (SHH) environment. Our dataset comprises four survey stations (S51, S103, S119, S120) where minimal sampling areas were determined (ranging from 500 to 1000 m<sup>2</sup>), yielding 29 to 45 recorded species per station. Environmental variables such as altitude, total biomass, and total basal area were also collected. Using the ECLAT algorithm to identify frequent species sets, followed by hierarchical ascendant classification (HAC) and principal component analysis (PCA), we highlight recurring species assemblages and singular, ecologically significant species. Further bivariate analyses of the distribution index (Id) against basal area (St_Totale) confirmed that the most frequent species often exhibit higher distribution and larger basal area. These findings underscore the potential of combining data mining techniques with conventional statistical methods to unravel complex patterns in forest ecosystem dynamics. Our results provide a valuable foundation for scaling up to more extensive datasets to better understand the ecological and environmental drivers shaping secondary forest communities under changing climate conditions.</p>
      </abstract>
      <kwd-group kwd-group-type="author-generated" xml:lang="en">
        <kwd>Biodiversity</kwd>
        <kwd>Ecology</kwd>
        <kwd>Item</kwd>
        <kwd>Itemset</kwd>
        <kwd>Frequent Itemset</kwd>
        <kwd>Clustering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec1">
      <title>1. Introduction</title>
      <p>The biological diversity of forest ecosystems plays a crucial role in regulating ecological processes and providing essential ecosystem services. As forests evolve, their plant species composition changes, reflecting adaptations to shifting bioclimatic conditions as well as complex biotic and abiotic interactions [<xref ref-type="bibr" rid="B1">1</xref>][<xref ref-type="bibr" rid="B2">2</xref>]. Investigating unique plant species communities, whether very frequent or, conversely, very rare within specific bioclimates, and their association with different stages of succession can provide valuable insights into forest ecosystem dynamics in the face of environmental disturbances, particularly climate change.</p>
      <p>In this article, our initial focus is on the bioclimatic characteristics and the stage of forest formation evolution. Specifically, we concentrate on secondary forest formations. Our objective is to demonstrate the viability of a methodological approach combining data mining techniques and statistical analyses to extract meaningful insights from a small dataset, based exclusively on station-level species surveys. By deliberately focusing on a limited number of stations (four), carefully selected to reflect significant ecological contrasts (altitude, biomass, management history), we propose a qualitative exploratory approach to evaluate the relevance of the methods employed before scaling them up to larger datasets.</p>
      <p>This deliberate choice of a small number of stations is based on the need to test the robustness of the methodology in distinct ecological contexts, including <italic>Swietenia macrophylla</italic> plantations where vegetation regeneration is influenced by anthropogenic management, as well as “classic” stations with no known recent human intervention. This approach avoids biases linked to premature generalizations and lays the groundwork for future extrapolation. Thus, our central question is whether, using this limited dataset, it is possible to reveal unique plant species communities that reflect specific ecological conditions, while addressing the challenges posed by the complexity of forest ecosystems.</p>
      <p>By validating this small-scale qualitative methodology, we hope to pave the way for its application on a larger scale, integrating richer and more diverse datasets to better understand ecological dynamics in tropical bioclimates.</p>
    </sec>
    <sec id="sec2">
      <title>2. Materials: Station Data from the Sub-Humid Humid Bioclimate for Secondary Sylvatic Stage Formations</title>
      <p>Our data come from four plant species survey stations in the Sub-Humid Humid (SHH) bioclimate, whose physiognomic types correspond to Secondary Sylvatic Formations (SSF): S51, S103, S119, and S120, referred to as such throughout. The four stations exhibit distinct ecological characteristics, reflecting significant variations in environmental conditions and management history. Stations S120 and S119 are classified as “classic,” representing natural environments with no known recent or major human intervention. In contrast, stations S51 and S103 are <italic>Swietenia macrophylla</italic> (SWMAC) plantations, meaning their original vegetation was removed before being reconstituted under the canopy of these plantations.</p>
      <p>In terms of altitude, S103 is the highest (130 m), followed by S120 (45 m), S119 (39 m), and finally S51, the lowest station (30 m). The station areas also vary: S103 is the largest (1000 m<sup>2</sup>), while S119 is the smallest (500 m<sup>2</sup>), with S120 and S51 having intermediate areas of 920 m<sup>2</sup> and 700 m<sup>2</sup>, respectively.</p>
      <p>For each station, plant species surveys were conducted on a surface corresponding to the minimal area, <italic>i.e.</italic>, the smallest sampling surface beyond which enlarging the plot reveals no (or very few) new species. In other words, it is the smallest area that gives an accurate representation of the floristic composition of an environment. The minimal areas for stations S51, S103, S119, and S120 are 700, 1000, 500, and 920 m<sup>2</sup>, respectively.</p>
      <p>Total biomass, an indicator of ecosystem density and productivity, is highest in station S119 (11.04) and lowest in S120 (3.31). The <italic>Swietenia macrophylla</italic> (SWMAC) plantations (S51 and S103) exhibit intermediate biomass levels (5.14 and 7.17, respectively). These ecological differences provide an opportunity to explore how plant community dynamics vary in response to the specific conditions of each station.</p>
      <p>Stations S51, S103, S119, S120 contain 29, 45, 38, and 30 species respectively. We also have environmental information for each station. This includes altitude, total species biomass, total species density (m<sup>2</sup>), station surface area, and forest type. Similarly, for each species in each station, we have the following variables: rf (relative frequency), Density, Di (Distribution index), BA_Total (Basal Area), DI (Dominance Index).</p>
      <p>We emphasize that, within the framework of our qualitative methodological approach, only the species abundance matrix from the four stations serves as our initial dataset. None of the aforementioned station characteristics were known or used in the implementation of our approach.</p>
      <fig id="fig1">
        <label>Figure 1</label>
        <graphic xlink:href="https://html.scirp.org/file/2001269-rId15.jpeg?20251229112534" />
      </fig>
      <p>Map of Martinique with Station Representation</p>
    </sec>
    <sec id="sec3">
      <title>3. Methods</title>
      <p>To search for frequent and unique plant species communities in the SHH bioclimate at the SSF evolution stage, and starting from the abundance matrix and the environmental data table, we proceed in two main steps: a descriptive analysis followed by an exploratory analysis. The abundance matrix records, for each species identified in the four stations of our study sample, its presence and its abundance in each station; an excerpt is presented in Section 4.1.</p>
      <sec id="sec3dot1">
        <title>3.1. Descriptive Data Analysis</title>
        <p>The descriptive analysis will allow us to summarize and characterize the data to provide an overall view. The objective is to describe and synthesize, using an observational approach, the main characteristics of our dataset.</p>
      </sec>
      <sec id="sec3dot2">
        <title>3.2. Exploratory Data Analysis</title>
        <p>With exploratory data analysis (EDA), we will search for relationships, structures, and trends in the data. The goal is to highlight patterns (trends, anomalies, groupings) that might not have been noticed with descriptive analysis alone. We aim to generate or validate hypotheses to better understand the underlying dynamics within the dataset.</p>
        <p>3.2.1. Extraction of Frequent Plant Species Communities</p>
        <p><bold>1) Concept of Items, Itemsets, Frequent Itemsets, and Support</bold></p>
        <p>We rely on the concepts of itemsets and frequent itemsets, which are used in knowledge discovery in data or data mining. An itemset is a set of elements or items that appear together in a transaction or dataset. It represents a combination of features, attributes, or elements analyzed to uncover interesting patterns or associations within large databases. Itemsets are fundamental for analytical techniques such as association rules, where they help identify significant relationships and co-occurrences within the data.</p>
        <p>Let I = {i<sub>1</sub>, i<sub>2</sub>, …, i<sub>m</sub>} be a set of items. A transactional database D is a set of transactions, where each transaction T is a subset of items, <italic>i.e.</italic>, T ⊆ I.</p>
        <p>An itemset X is any subset of I, such that X ⊆ I. The support of an itemset X in D is the proportion of transactions T ∈ D for which X ⊆ T [<xref ref-type="bibr" rid="B3">3</xref>][<xref ref-type="bibr" rid="B4">4</xref>].</p>
        <p>An itemset X ⊆ I is considered frequent if it appears in a proportion of transactions at least equal to a predefined threshold, known as the minimum support threshold, min_supp. Formally, if supp(X) denotes the support of the itemset X in the transactional database D, then X is frequent if supp(X) ≥ min_supp.</p>
        <p>In our study, an item corresponds to a plant species, a transaction to a station, an itemset to a community of plant species, and a frequent itemset to a community of plant species that occurs together in a proportion of stations reaching the predefined support threshold. We hypothesize that frequent plant species communities can reveal groups of species that share preferences for certain ecological conditions or are similarly influenced by environmental disturbances.</p>
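        <p>The definitions above can be illustrated with a minimal sketch, which is not the implementation used in this study. The function names <monospace>support</monospace> and <monospace>is_frequent</monospace> are hypothetical, and the station lists are deliberately partial: they keep only six species codes whose presences are reported in Section 4.</p>
        <code language="python"><![CDATA[
```python
from typing import Dict, Iterable, Set

# Partial presence lists (assumption: only six species codes are kept, using
# the presences reported in Section 4; the real surveys hold 29 to 45 species).
stations: Dict[str, Set[str]] = {
    "S51": {"TACI", "INLA", "PIFRAG", "SWMAC"},
    "S103": {"TACI", "INLA", "PIFRAG", "SWMAC", "FUEL", "MYSPLEN"},
    "S119": {"TACI", "INLA", "FUEL", "MYSPLEN"},
    "S120": {"TACI", "PIFRAG", "FUEL", "MYSPLEN"},
}


def support(itemset: Iterable[str], transactions: Dict[str, Set[str]]) -> float:
    """Proportion of transactions (stations) containing every item of `itemset`."""
    wanted = set(itemset)
    hits = sum(1 for items in transactions.values() if wanted.issubset(items))
    return hits / len(transactions)


def is_frequent(itemset: Iterable[str],
                transactions: Dict[str, Set[str]],
                min_supp: float) -> bool:
    """An itemset X is frequent when supp(X) >= min_supp."""
    return support(itemset, transactions) >= min_supp


print(support({"TACI"}, stations))             # 1.0
print(support({"TACI", "INLA"}, stations))     # 0.75
print(is_frequent({"SWMAC"}, stations, 0.75))  # False
```
]]></code>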
        <p><bold>2) Algorithms for Extracting Frequent Itemsets</bold></p>
        <p>In data mining, one of the first algorithms developed for extracting frequent itemsets is the Apriori algorithm [<xref ref-type="bibr" rid="B3">3</xref>][<xref ref-type="bibr" rid="B4">4</xref>]. This algorithm relies on the “anti-monotone” property of support: if an itemset is not frequent, then none of its supersets can be frequent either. This significantly reduces the search space. Several other algorithms have been developed since, such as FP-growth [<xref ref-type="bibr" rid="B5">5</xref>] and Fiasco [<xref ref-type="bibr" rid="B6">6</xref>]-[<xref ref-type="bibr" rid="B8">8</xref>].</p>
        <p>Another notable algorithm is ECLAT (Equivalence Class Clustering and bottom-up Lattice Traversal), introduced by Zaki <italic>et al.</italic> (1997) [<xref ref-type="bibr" rid="B9">9</xref>], an efficient method used in data mining for finding frequent itemsets. Its key points are as follows:</p>
        <list list-type="bullet">
          <list-item><p>It employs a vertical data format, where each item is associated with a list of transaction IDs (TIDs) in which it appears, instead of a traditional horizontal layout of transactions.</p></list-item>
          <list-item><p>The algorithm performs a depth-first search (DFS) to explore the lattice of itemsets.</p></list-item>
          <list-item><p>By intersecting TID lists of items, ECLAT efficiently computes the support of itemsets without generating unnecessary candidates.</p></list-item>
          <list-item><p>It is well-suited for large and sparse datasets, providing better performance in such cases compared to Apriori.</p></list-item>
        </list>
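        <p>The key points listed above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used in this study: the function name <monospace>eclat</monospace>, the plot identifiers P1 to P4, and the item codes A to D are all hypothetical, and the threshold is expressed as an absolute count of transactions rather than a proportion.</p>
        <code language="python"><![CDATA[
```python
from typing import Dict, FrozenSet, List, Set, Tuple


def eclat(transactions: Dict[str, Set[str]],
          min_count: int) -> Dict[FrozenSet[str], int]:
    """Return every itemset contained in at least `min_count` transactions."""
    # Vertical layout: item -> set of transaction ids (tidset) containing it.
    tidsets: Dict[str, Set[str]] = {}
    for tid, items in transactions.items():
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent: Dict[FrozenSet[str], int] = {}

    def extend(prefix: Tuple[str, ...],
               candidates: List[Tuple[str, Set[str]]]) -> None:
        # Depth-first search: grow `prefix` with each candidate, then recurse
        # only on the candidates that come after it (no itemset visited twice).
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[frozenset(itemset)] = len(tids)
            narrowed = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids  # support via tidset intersection
                if len(shared) >= min_count:
                    narrowed.append((other, shared))
            if narrowed:
                extend(itemset, narrowed)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


# Hypothetical toy data: four plots and four item codes.
plots = {
    "P1": {"A", "B", "C"},
    "P2": {"A", "B"},
    "P3": {"A", "C"},
    "P4": {"A", "B", "D"},
}
result = eclat(plots, min_count=3)
for itemset, count in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```
]]></code>
        <p>With this toy input and a threshold of three plots, the sketch reports {A} with a count of 4, and {B} and {A, B} with a count of 3 each; C and D are pruned before any intersection is computed.</p>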
        <p>In our ecological context, characterized by a large number of short-lived species communities and relatively high minimum support thresholds, the ECLAT algorithm is well suited. The input data in our case consist of the different stations, each with the list of plant species identified there.</p>
        <p>3.2.2. Hierarchical Clustering Using Agglomerative Classification</p>
        <p>Hierarchical Ascendant Classification (HAC) [<xref ref-type="bibr" rid="B10">10</xref>][<xref ref-type="bibr" rid="B11">11</xref>], also known as agglomerative hierarchical clustering, is a clustering method used to classify objects (individuals, sites, variables) into homogeneous groups.</p>
        <p>Initially, each object forms a separate cluster. At each step, the two most similar clusters are merged based on an agglomeration criterion (Ward, centroid, single linkage, etc.). The process continues until a single cluster containing all objects is formed. The results are visualized as a dendrogram, a tree-like diagram that helps determine the optimal number of groups. This technique is widely used in statistics, ecology, marketing, and any field requiring data segmentation.</p>
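        <p>The merging procedure described above can be sketched as follows. This is an illustration only: it assumes single linkage as the agglomeration criterion (one of the options cited above, not necessarily the one retained in this study), and the function name <monospace>agglomerate</monospace>, the object labels P to S, and the distance values are hypothetical.</p>
        <code language="python"><![CDATA[
```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Cluster = FrozenSet[str]


def agglomerate(labels: List[str],
                dist: Dict[FrozenSet[str], float]) -> List[Tuple[Cluster, Cluster, float]]:
    """Single-linkage agglomerative clustering; returns the merge history."""
    # Initially, each object forms its own cluster.
    clusters: List[Cluster] = [frozenset([label]) for label in labels]
    merges: List[Tuple[Cluster, Cluster, float]] = []

    def linkage(a: Cluster, b: Cluster) -> float:
        # Single linkage: distance between the two closest members.
        return min(dist[frozenset((x, y))] for x in a for y in b)

    while len(clusters) > 1:
        # Merge the two most similar clusters and record the merge height.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage(pair[0], pair[1]))
        merges.append((a, b, linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


# Hypothetical pairwise distances between four objects.
toy_dist = {
    frozenset(("P", "Q")): 0.2,
    frozenset(("R", "S")): 0.3,
    frozenset(("P", "R")): 0.8,
    frozenset(("P", "S")): 0.9,
    frozenset(("Q", "R")): 0.7,
    frozenset(("Q", "S")): 0.85,
}
history = agglomerate(["P", "Q", "R", "S"], toy_dist)
for left, right, height in history:
    print(sorted(left), sorted(right), height)
```
]]></code>
        <p>The recorded merge heights (here 0.2, 0.3, then 0.7) are the values a dendrogram would display on its vertical axis.</p>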
        <p><bold>1) Jaccard Distance Matrix</bold></p>
        <p>In this article, we have chosen to work with the Jaccard distance matrix, calculated from the Jaccard similarity index [<xref ref-type="bibr" rid="B11">11</xref>][<xref ref-type="bibr" rid="B12">12</xref>] and [<xref ref-type="bibr" rid="B13">13</xref>]. The Jaccard similarity index is defined, for two stations <italic>A</italic> and <italic>B</italic>, as:</p>
        <disp-formula id="FD1">
          <mml:math>
            <mml:mrow>
              <mml:mi>J</mml:mi>
              <mml:mrow>
                <mml:mo>(</mml:mo>
                <mml:mrow>
                  <mml:mi>A</mml:mi>
                  <mml:mo>,</mml:mo>
                  <mml:mi>B</mml:mi>
                </mml:mrow>
                <mml:mo>)</mml:mo>
              </mml:mrow>
              <mml:mo>=</mml:mo>
              <mml:mfrac>
                <mml:mrow>
                  <mml:mtext>number of species present simultaneously in</mml:mtext>
                  <mml:mi>A</mml:mi>
                  <mml:mtext>and</mml:mtext>
                  <mml:mi>B</mml:mi>
                </mml:mrow>
                <mml:mrow>
                  <mml:mtext>number of species present in at least</mml:mtext>
                  <mml:mi>A</mml:mi>
                  <mml:mtext>or</mml:mtext>
                  <mml:mi>B</mml:mi>
                </mml:mrow>
              </mml:mfrac>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>More formally, let:</p>
        <list list-type="simple">
          <list-item><p><italic>a</italic>: the number of species present in both stations;</p></list-item>
          <list-item><p><italic>b</italic>: the number of species present in station <italic>A</italic> but absent in <italic>B</italic>;</p></list-item>
          <list-item><p><italic>c</italic>: the number of species present in station <italic>B</italic> but absent in <italic>A</italic>.</p></list-item>
        </list>
        <p>The similarity index is then expressed as: <inline-formula><mml:math><mml:mrow><mml:mi> J </mml:mi><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mi> A </mml:mi><mml:mo> , </mml:mo><mml:mi> B </mml:mi></mml:mrow><mml:mo> ) </mml:mo></mml:mrow><mml:mo> = </mml:mo><mml:mfrac><mml:mi> a </mml:mi><mml:mrow><mml:mi> a </mml:mi><mml:mo> + </mml:mo><mml:mi> b </mml:mi><mml:mo> + </mml:mo><mml:mi> c </mml:mi></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula> .</p>
        <p>The Jaccard distance is calculated as: <inline-formula><mml:math><mml:mrow><mml:msub><mml:mi> d </mml:mi><mml:mrow><mml:mtext> Jaccard </mml:mtext></mml:mrow></mml:msub><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mi> A </mml:mi><mml:mo> , </mml:mo><mml:mi> B </mml:mi></mml:mrow><mml:mo> ) </mml:mo></mml:mrow><mml:mo> = </mml:mo><mml:mn> 1 </mml:mn><mml:mo> − </mml:mo><mml:mi> J </mml:mi><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mi> A </mml:mi><mml:mo> , </mml:mo><mml:mi> B </mml:mi></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> .</p>
        <p>This approach highlights the species effectively shared between stations. The Jaccard index is an asymmetric binary coefficient, meaning that joint presences and joint absences are not treated alike, although the index itself is symmetric in its arguments, J(A, B) = J(B, A). The more species two stations share (<italic>i.e.</italic>, the larger <italic>a</italic>), the higher their similarity index and the lower their distance. Joint absences do not enter the formula and therefore cannot artificially inflate the similarity, as Jaccard focuses exclusively on actual presences. Thus, the similarity between two stations primarily reflects the number of shared species rather than the number of species absent in both.</p>
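        <p>As a worked illustration of the two formulas above, the following sketch computes <italic>a</italic>, <italic>b</italic>, and <italic>c</italic> from two presence lists and derives the similarity and the distance. The function names and the two species lists (sp1 to sp5) are hypothetical, and raising an error for two empty lists is a convention chosen here, not one stated in the text.</p>
        <code language="python"><![CDATA[
```python
from typing import Set, Tuple


def jaccard_counts(station_a: Set[str], station_b: Set[str]) -> Tuple[int, int, int]:
    """Return (a, b, c): shared species, species only in A, species only in B."""
    a = len(station_a & station_b)
    b = len(station_a - station_b)
    c = len(station_b - station_a)
    return a, b, c


def jaccard_similarity(station_a: Set[str], station_b: Set[str]) -> float:
    """J(A, B) = a / (a + b + c); joint absences never enter the formula."""
    a, b, c = jaccard_counts(station_a, station_b)
    if a + b + c == 0:
        raise ValueError("Jaccard index is undefined for two empty species lists")
    return a / (a + b + c)


def jaccard_distance(station_a: Set[str], station_b: Set[str]) -> float:
    """d_Jaccard(A, B) = 1 - J(A, B)."""
    return 1.0 - jaccard_similarity(station_a, station_b)


# Hypothetical presence lists for two stations.
A = {"sp1", "sp2", "sp3", "sp4"}
B = {"sp3", "sp4", "sp5"}

print(jaccard_counts(A, B))      # (2, 2, 1)
print(jaccard_similarity(A, B))  # 0.4
print(jaccard_distance(A, B))
```
]]></code>
        <p>Because the inputs list presences only, a species absent from both stations cannot change the result, which is the property discussed above.</p>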
      </sec>
    </sec>
    <sec id="sec4">
      <title>4. Results</title>
      <sec id="sec4dot1">
        <title>4.1. Species Data Abundance Matrix and Descriptive Analysis</title>
        <p>We construct our abundance matrix, which includes 90 species columns for the four stations (S51, S103, S119, S120) representing the evolutionary stage of secondary sylvatic forest formations.</p>
        <p>An excerpt of our initial abundance matrix is presented in <xref ref-type="fig" rid="fig1">Figure 1</xref> below.</p>
        <fig id="fig2">
          <label>Figure 2</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId22.jpeg?20251229112542" />
        </fig>
        <p><bold>Figure 1</bold><bold>.</bold> Extract from the abundance matrix data.</p>
        <p>For example, in the excerpt of this abundance matrix, we can observe the species ANIN (<italic>Andira inermis</italic>) in the third column, with 2 individuals in S103, 17 in S119, 44 in S120, and 0 in S51.</p>
        <p>In terms of species richness per station, the results are represented in <xref ref-type="fig" rid="fig2">Figure 2</xref>:</p>
        <fig id="fig3">
          <label>Figure 3</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId23.jpeg?20251229112542" />
        </fig>
        <p><bold>Figure 2</bold><bold>.</bold> Richness per station.</p>
        <p>We observe that station S103 has the highest number of species, with a total of 45, followed by S119 with 38. Stations S120 and S51 are less diverse, with 30 and 29 species, respectively. The stations S103, S119, S120, and S51 contain 1216, 666, 530, and 801 floristic units, respectively.</p>
        <p>To enrich the information provided in the above figure with the species names and their counts per station, we use stacked bar charts to represent the most abundant species in the four stations. Specifically, in <xref ref-type="fig" rid="fig3">Figure 3(a)</xref> and <xref ref-type="fig" rid="fig3">Figure 3(b)</xref> below, we show the top 10 most abundant species by number of observations and by percentage (Top 10). In <xref ref-type="fig" rid="fig4">Figure 4(a)</xref> and <xref ref-type="fig" rid="fig4">Figure 4(b)</xref> below, we display the species ranked 11th to 20th by number of observations and percentage (Top 11-20). Charts for other less abundant species can be provided if needed.</p>
        <fig id="fig4">
          <label>Figure 4</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId24.jpeg?20251229112542" />
        </fig>
        <fig id="fig5">
          <label>Figure 5</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId25.jpeg?20251229112542" />
        </fig>
        <p>(a) (b)</p>
        <p><bold>Figure 3</bold><bold>.</bold> Top 10: Species abundance by station (a): Number, (b): Percentage.</p>
        <fig id="fig6">
          <label>Figure 6</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId26.jpeg?20251229112542" />
        </fig>
        <fig id="fig7">
          <label>Figure 7</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId27.jpeg?20251229112542" />
        </fig>
        <p>(a) (b)</p>
        <p><bold>Figure 4</bold><bold>.</bold> Top 11 to 20: Species abundance by station (a): Number, (b): Percentage.</p>
        <p>We observe from the above charts that the species SWMAC (<italic>Swietenia macrophylla</italic>, <italic>Meliaceae</italic>, Tree) is the most abundant species in terms of observations, with a total count of 685, including 564 units in S103 and 121 units in S51. Notably, SWMAC is present in only two out of the four stations.</p>
        <p>We also note that the next most abundant species is MYSPLEN (<italic>Myrcia splendens</italic>, <italic>Myrtaceae</italic>, Shrub), with a total count of 328, distributed as 144 units in S103, 79 units in S119, and 105 units in S120. MYSPLEN is found in three out of the four stations. Similarly, the species FUEL (<italic>Funtumia elastica</italic>, <italic>Apocynaceae</italic>, Tree) has a total observation count of 285, with 25 units in S103, 236 in S119, and 24 in S120. This species is also present in three out of the four stations, the same three as MYSPLEN.</p>
        <p>Additionally, the species TACI is present in all four stations, with counts of 4, 15, 19, and 8 units in stations S51, S120, S119, and S103, respectively.</p>
        <p>Furthermore, we provide in <xref ref-type="fig" rid="fig5">Figure 5</xref> below a boxplot representation of the species abundance distribution. For readability purposes, we have only included species found in at least two of the four stations. Species specific to a single station are therefore not represented.</p>
        <fig id="fig8">
          <label>Figure 8</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId28.jpeg?20251229112542" />
        </fig>
        <p><bold>Figure 5</bold><bold>.</bold> Boxplot: Abundance by species.</p>
        <p>Just like in <xref ref-type="fig" rid="fig3">Figure 3</xref>, <xref ref-type="fig" rid="fig5">Figure 5</xref> also highlights the dominance in terms of abundance of the species SWMAC, MYSPLEN, and FUEL. An additional piece of information provided by <xref ref-type="fig" rid="fig5">Figure 5</xref> concerns the species FUEL, which shows an “outlier” in station S119 with 236 units. This indicates that for FUEL, which is highly abundant in the stations where it is present (S103, S119, S120), there is an overabundance in station S119 compared to stations S103 and S120.</p>
        <p>Similarly, <xref ref-type="fig" rid="fig5">Figure 5</xref> reveals the presence of other “outliers” concerning the species COSW (<italic>Coccoloba swartzii</italic>, Polygonaceae, Tree), COAL (<italic>Cordia alliodora</italic>, Boraginaceae, Tree), ININ (<italic>Inga ingoides</italic>, Mimosaceae, Tree), MYCIT (<italic>Myrcia citrifolia</italic>, Myrtaceae, Shrub), MYFAL (<italic>Myrcia fallax</italic>, Myrtaceae, Tree), OCCOR (<italic>Ocotea coriacea</italic>, Lauraceae, Tree), PIRET (<italic>Piper reticulatum</italic>, Piperaceae, Shrub), and PIFRAG (<italic>Pisonia fragrans</italic>, Nyctaginaceae, Tree).</p>
        <p>To provide an exhaustive view, <xref ref-type="fig" rid="fig6">Figure 6</xref> below presents the total abundance by species for all species across the four stations. This table is sorted in descending order of species abundance.</p>
        <fig id="fig9">
          <label>Figure 9</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId29.jpeg?20251229112542" />
        </fig>
        <p><bold>Figure 6</bold><bold>.</bold> Total abundance by species.</p>
        <p>We also present <xref ref-type="fig" rid="fig7">Figure 7</xref> below, which provides information on the absolute frequency of all species by station, <italic>i.e.</italic>, the number of stations where each species was observed. This table is sorted in descending order of the number of stations where the species were observed.</p>
        <fig id="fig10">
          <label>Figure 10</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId30.jpeg?20251229112542" />
        </fig>
        <p><bold>Figure 7</bold><bold>.</bold> Species frequency.</p>
        <p>From <xref ref-type="fig" rid="fig4">Figure 4</xref> and <xref ref-type="fig" rid="fig6">Figure 6</xref>, we can note that there is only one species, TACI (<italic>Tabernaemontana citrifolia</italic>, Apocynaceae, Tree), which is present in all four stations. Following this, the species ANIN (<italic>Andira inermis</italic>, Fabaceae, Tree), COSW (<italic>Coccoloba swartzii</italic>, Polygonaceae, Tree), EUMO (<italic>Eugenia monticola</italic>, Myrtaceae, Shrub), FUEL (<italic>Funtumia elastica</italic>, Apocynaceae, Tree), ININ (<italic>Inga ingoides</italic>, Mimosaceae, Tree), MYSPLEN (<italic>Myrcia splendens</italic>, Myrtaceae, Shrub), SIAM (<italic>Simarouba amara</italic>, Simaroubaceae, Tree), SWAU (<italic>Swietenia aubrevilleana</italic>, Meliaceae, Tree), INLA (<italic>Inga laurina</italic>, Mimosaceae, Tree), and PIFRAG (<italic>Pisonia fragrans</italic>, Nyctaginaceae, Tree) are present in three out of the four stations.</p>
        <p>From <xref ref-type="fig" rid="fig1">Figure 1</xref>, <xref ref-type="fig" rid="fig5">Figure 5</xref> and <xref ref-type="fig" rid="fig6">Figure 6</xref>, we can also highlight species that are found in only one of the four stations. For example, ACRE (<italic>Acacia retusa</italic>) is found exclusively in station S103, and BUGL (<italic>Bunchosia glandulosa</italic>) is present only in station S51. These species will henceforth be referred to as station-specific species.</p>
        <p>From this descriptive analysis of the data, two major characteristics emerge:</p>
        <list list-type="bullet">
          <list-item><p>Abundance or overabundance of certain species compared to others, as observed, for example, with the species SWMAC, MYSPLEN, and FUEL.</p></list-item>
          <list-item><p>Frequency of species occurrence across stations. For instance, the species TACI, though far less abundant than SWMAC, is present in all four stations, while SWMAC is found in only two. Moreover, several species are present in only one station.</p></list-item>
        </list>
      </sec>
      <sec id="sec4dot2">
        <title>4.2. Exploratory Analysis</title>
        <p>With the descriptive analysis in the previous section, we highlighted the abundance, and even overabundance, of certain species across stations. In this section, we aim to advance our analysis by identifying groupings or communities of distinctive species that are significant indicators of particular ecological situations and conditions.</p>
        <p>We adopt a community-based approach, focusing on the species composition of the stations before considering abundance. This analysis builds upon the “frequent” species communities introduced in Section 3.2.1.</p>
        <p>Frequent and Specific Plant Species Communities</p>
        <p>All the results presented below were obtained using the ECLAT algorithm.</p>
        <p><bold>1) Frequent plant species communities across the four stations</bold></p>
        <p>With a minimum support value of min_supp = 100%, only one species, TACI, is present in all four stations: S51 ∩ S103 ∩ S119 ∩ S120 = {TACI}.</p>
        <p>TACI (<italic>Tabernaemontana citrifolia</italic>, <italic>Apocynaceae</italic>, Tree) is thus the only species in our dataset that occurs in all four stations of the secondary sylvatic formations.</p>
        <p>In terms of abundance, it was observed in stations S103, S119, S120, and S51 with counts of 8, 19, 15, and 4, respectively.</p>
        <p><bold>2) Frequent plant species communities across three stations</bold></p>
        <p>With a minimum support value of min_supp = 75%, the following species communities are present in three out of the four stations:</p>
        <p>S103∩S119∩S120 = {ANIN (<italic>Andira inermis</italic>, Fabaceae, Tree), COSW (<italic>Coccoloba swartzii</italic>, Polygonaceae, Tree), EUMO (<italic>Eugenia monticola</italic>, Myrtaceae, Shrub), FUEL (<italic>Funtumia elastica</italic>, Apocynaceae, Tree), ININ (<italic>Inga ingoides</italic>, Mimosaceae, Tree), MYSPLEN (<italic>Myrcia splendens</italic>, Myrtaceae, Shrub), SIAM (<italic>Simarouba amara</italic>, Simaroubaceae, Tree), SWAU (<italic>Swietenia aubrevilleana</italic>, Meliaceae, Tree), TACI (<italic>Tabernaemontana citrifolia</italic>, Apocynaceae, Tree)}.</p>
        <p>This community is present in stations S103, S119, and S120. It is referred to as maximal because it cannot be expanded further without reducing its support (3/4), which is the ratio of the number of stations containing the community (3) to the total number of stations (4).</p>
        <p>S51∩S103∩S119 = {TACI (<italic>Tabernaemontana citrifolia</italic>, Apocynaceae, Tree), INLA (<italic>Inga laurina</italic>, Mimosaceae, Tree)}.</p>
        <p>S51∩S103∩S120 = {TACI (<italic>Tabernaemontana citrifolia</italic>, Apocynaceae, Tree), PIFRAG (<italic>Pisonia fragrans</italic>, Nyctaginaceae, Tree)}.</p>
        <p>These two communities, namely {TACI, INLA} and {TACI, PIFRAG}, are also maximal itemsets.</p>
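        <p>As a minimal sketch of the support computation described above, the following Python fragment stores a few of the species listed in this section in a vertical layout (each species mapped to the set of stations where it occurs), which is the representation ECLAT works on. The function names (stations_of, support, is_maximal) are illustrative assumptions rather than the implementation used for this study, and maximality is only checked against the handful of species included in the fragment.</p>
        <code language="python">
```python
# Illustrative sketch only: a small subset of the species listed in this
# section, stored as species -> set of stations (vertical "tidset" layout).
STATIONS = ("S51", "S103", "S119", "S120")

tidsets = {
    "TACI": {"S51", "S103", "S119", "S120"},
    "INLA": {"S51", "S103", "S119"},
    "PIFRAG": {"S51", "S103", "S120"},
    "ANIN": {"S103", "S119", "S120"},
    "BUGL": {"S51"},
}


def stations_of(itemset):
    """Stations that contain every species of the itemset."""
    return set(STATIONS).intersection(*(tidsets[s] for s in itemset))


def support(itemset):
    """Number of stations containing the itemset / total number of stations."""
    return len(stations_of(itemset)) / len(STATIONS)


def is_maximal(itemset, min_supp):
    """True if the itemset is frequent and no one-species extension is."""
    if not support(itemset) >= min_supp:
        return False
    others = set(tidsets) - set(itemset)
    return all(not support(set(itemset) | {s}) >= min_supp for s in others)


if __name__ == "__main__":
    print(support({"TACI"}))                   # share of stations with TACI
    print(stations_of({"TACI", "INLA"}))       # stations holding both species
    print(is_maximal({"TACI", "INLA"}, 0.75))  # maximal within this subset?
```
        </code>
        <p>Under these assumptions, the pair {TACI, INLA} has a support of 3/4 and cannot be extended by any other species of the subset without falling below 75%, which is consistent with the maximal communities reported above.</p>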
        <p><bold>3) Frequent plant species communities across two stations</bold></p>
        <p>Similarly, with a minimum support value of min_supp = 50%, we obtain:</p>
        <p>S51 ∩ S119 = {INLA, OCCOR, PIAMA, TACI}.</p>
        <p>S51 ∩ S120 = {PIFRAG, TACI}.</p>
        <p>S51 ∩ S103 = {BOSU, CHCO, CHAR, COCA, ERHA, INLA, MYCIT, PIFRAG, PSMI, PSNE, SWMAC, TACI}.</p>
        <p>S103 ∩ S119 = {AISP, ANIN, COSW, EUMO, FUEL, ININ, INLA, MAIN, MYSPLEN, PIDIL, SIAM, SWAU, TACI}.</p>
        <p>S103 ∩ S120 = {ANIN, BRAL, CACA, COSW, EUMO, FUEL, ININ, MYFAL, MYSPLEN, PIFRAG, SIAM, SWAU, TACI}.</p>
        <p>S119 ∩ S120 = {ANIN, CASP, CEPE, COSW, COAL, COSU, CUAM, EUMO, FUEL, HITRI, ININ, MYSPLEN, OCCER, PIADUN, PIRET, SASAM, SACAR, SIAM, SWAU, TACI, ZACA}.</p>
        <p><bold>4) Plant species specific to each site</bold></p>
        <p>Finally, with min_supp = 25%, we obtain the species that are specific to a single station, that is, those found in only one of the four stations. For example, BUGL, which belongs to the set S51_SpecificSpecies below, is found exclusively in station S51 and not in S103, S119, or S120.</p>
        <p>S51_SpecificSpecies = {BUGL, CACY, CAIN, COPU, COBA, COCO, EULI, MAZA, MEB, OUGUIL, RIHUM, SIFO, SIOB, TAHET, TRTR}.</p>
        <p>S103_SpecificSpecies = {ACRE, BAMU, CADE, CHAL, EUAL, EUPS, FICI, HACA, IXFER, LOHE, LOPU, MABI, MALAE, ODNY, PACR, PIPE, PIRACE, PSMA, RAACU}.</p>
        <p>S119_SpecificSpecies = {ARAL, CESC, CELA, CESP, CISP, COLI, COMO, MAAME, OCPAT, SWMAH, TEC}.</p>
        <p>S120_SpecificSpecies = {ANMU, BUSI, CLHI, HORAC, HYCOU}.</p>
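        <p>The notion of station-specific species can be sketched in a few lines of Python: a species is specific when the set of stations in which it occurs has exactly one element. The fragment below uses a small subset of the species named above; the dictionary layout and the function name specific_species are assumptions made for illustration only.</p>
        <code language="python">
```python
from collections import defaultdict

# Illustrative subset: species -> stations in which the species occurs.
presence = {
    "TACI": {"S51", "S103", "S119", "S120"},
    "INLA": {"S51", "S103", "S119"},
    "BUGL": {"S51"},
    "CACY": {"S51"},
    "ACRE": {"S103"},
    "ARAL": {"S119"},
    "ANMU": {"S120"},
}


def specific_species(presence):
    """Group the species found in exactly one station by that station."""
    by_station = defaultdict(list)
    for species, stations in presence.items():
        if len(stations) == 1:
            (station,) = stations  # unpack the single station
            by_station[station].append(species)
    return {station: sorted(codes) for station, codes in by_station.items()}


if __name__ == "__main__":
    for station, codes in sorted(specific_species(presence).items()):
        print(station, codes)
```
        </code>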
      </sec>
      <sec id="sec4dot3">
        <title>4.3. Species Clustering</title>
        <p>Starting from the frequent species communities obtained with the ECLAT algorithm in the previous section, we apply a second approach based on clustering, using Hierarchical Ascendant Classification (HAC) and Principal Component Analysis (PCA), to verify whether these communities are indicative of specific ecological conditions. We first perform the HAC on our dataset. For readability, species specific to a single station are not displayed in the dendrograms below. <xref ref-type="fig" rid="fig8">Figure 8(a)</xref> shows the dendrogram obtained using the Ward method, and <xref ref-type="fig" rid="fig8">Figure 8(b)</xref> shows the dendrogram obtained using the “complete linkage” method.</p>
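        <p>The caption of Figure 8 indicates that the classification relies on the Jaccard distance. As a hedged illustration, and assuming that each species is described by the set of stations in which it occurs (an assumption made here; the exact input matrix is not detailed in this section), the distance between two species can be sketched as follows. The function name jaccard_distance is hypothetical.</p>
        <code language="python">
```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: 1 - |intersection| / |union|."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets are identical
    return 1.0 - len(a.intersection(b)) / len(union)


# Station sets taken from the communities listed in Section 4.2.
TACI = {"S51", "S103", "S119", "S120"}
INLA = {"S51", "S103", "S119"}
ANIN = {"S103", "S119", "S120"}
BUGL = {"S51"}

if __name__ == "__main__":
    print(jaccard_distance(TACI, INLA))  # share 3 of 4 stations
    print(jaccard_distance(INLA, ANIN))  # share 2 of 4 stations
    print(jaccard_distance(BUGL, ANIN))  # no station in common
```
        </code>
        <p>Species that always occur together are at distance 0, and species that never share a station are at distance 1, so a hierarchical classification built on this distance tends to group species with the same station profile.</p>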
        <p>Below, we present the cross-tables that allow for a comparison of the clusters obtained using the Ward, Single, and Complete Linkage methods.</p>
        <p>From the cross-tabulation tables in <xref ref-type="fig" rid="fig9">Figure 9</xref>, the three methods Ward, Single, and Complete produce identical results when the dendrograms are divided into 9 clusters. The cross-tabulation tables are diagonal, meaning the species clusters obtained are exactly the same across all three methods.</p>
        <p>We therefore retain 9 clusters. The clusters obtained are represented on the factorial plane in <xref ref-type="fig" rid="fig10">Figure 10</xref>.</p>
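        <p>The comparison by cross-tables can be sketched as follows. Given two cluster assignments of the same species, the cross-table counts the species in each pair of clusters; the two partitions are identical, up to a renumbering of the clusters, when every row and every column of the table holds exactly one non-zero cell. The label vectors below are hypothetical and do not reproduce the clusters of this study, and the function names are illustrative.</p>
        <code language="python">
```python
from collections import Counter


def cross_table(labels_a, labels_b):
    """Count the species falling in each (cluster in A, cluster in B) cell."""
    return Counter(zip(labels_a, labels_b))


def same_partition(labels_a, labels_b):
    """True if each row and each column has exactly one non-zero cell."""
    cells = cross_table(labels_a, labels_b)
    rows = Counter(a for a, _ in cells)
    cols = Counter(b for _, b in cells)
    one_per_row = all(n == 1 for n in rows.values())
    one_per_col = all(n == 1 for n in cols.values())
    return one_per_row and one_per_col


# Hypothetical cluster labels for five species (not the study's results).
ward = [1, 1, 2, 3, 3]
single = [2, 2, 1, 3, 3]    # same groups as ward, numbered differently
complete = [1, 1, 2, 2, 3]  # a genuinely different grouping

if __name__ == "__main__":
    print(same_partition(ward, single))
    print(same_partition(ward, complete))
```
        </code>
        <p>The check deliberately tolerates a different numbering of the clusters, since the labels returned by two linkage methods need not be aligned even when the groups themselves coincide.</p>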
        <fig id="fig11">
          <label>Figure 11</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId31.jpeg?20251229112545" />
        </fig>
        <fig id="fig12">
          <label>Figure 12</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId32.jpeg?20251229112545" />
        </fig>
        <p>(a) (b)</p>
        <p><bold>Figure 8.</bold> HAC with Jaccard distance: (a) Ward method; (b) complete linkage.</p>
        <fig id="fig13">
          <label>Figure 13</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId33.jpeg?20251229112545" />
        </fig>
        <fig id="fig14">
          <label>Figure 14</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId34.jpeg?20251229112546" />
        </fig>
        <p>(a) (b)</p>
        <p><bold>Figure 9.</bold> Cross-tables: (a) Ward and single linkage; (b) complete and single linkage.</p>
        <fig id="fig15">
          <label>Figure 15</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId35.jpeg?20251229112546" />
        </fig>
        <p><bold>Figure 10</bold><bold>.</bold> Clusters in the factorial plane.</p>
        <p>In the first factorial plane, dimension 1 explains 62.7% and dimension 2 explains 20% of the variance in the abundance matrix, for a total of 82.7%. This plane therefore captures most of the variance.</p>
      </sec>
      <sec id="sec4dot4">
        <title>4.4. Bivariate Analysis of Basal Area as a Function of the Species Distribution Index for Frequent Species Communities</title>
        <p>Among the variables characterizing the species in the stations, namely fr (relative frequency), Density, Id (Distribution Index), St_Totale (Basal Area), and ID (Dominance Index), we know that ID = Id * St_Totale and Id = fr * Density. We therefore select the two uncorrelated variables, Id and St_Totale, to perform a bivariate analysis. The objective is to analyze the positioning of the previously identified frequent species with respect to these two variables.</p>
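        <p>The two relations Id = fr * Density and ID = Id * St_Totale can be written out as a short Python sketch. The record layout, the attribute names, and the numerical values are hypothetical and serve only to show how the indices derive from one another; they are not measurements from the stations.</p>
        <code language="python">
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeciesRecord:
    code: str
    fr: float         # relative frequency
    density: float    # density
    st_totale: float  # total basal area

    @property
    def distribution_index(self):
        """Id = fr * Density."""
        return self.fr * self.density

    @property
    def dominance_index(self):
        """ID = Id * St_Totale."""
        return self.distribution_index * self.st_totale


# Hypothetical values for a placeholder species, not measured data.
sp1 = SpeciesRecord(code="SP1", fr=0.5, density=8.0, st_totale=0.25)

if __name__ == "__main__":
    print(sp1.distribution_index)  # Id
    print(sp1.dominance_index)     # ID
```
        </code>
        <p>Because ID is fully determined by Id and St_Totale, plotting these two variables against each other retains the information carried by ID.</p>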
        <p>The relevance of analyzing the distribution index in relation to basal area was demonstrated in [<xref ref-type="bibr" rid="B14">14</xref>]. <xref ref-type="fig" rid="fig11">Figure 11</xref> below illustrates the case of station S51.</p>
        <p>We use three colors to code the species:</p>
        <p>Green: species found exclusively in station S51 and absent from all others, that is, species specific to station S51 (Speci_S51).</p>
        <p>Blue: species present in two stations, including S51 (Freq_2_St).</p>
        <p>Red: species found in three or four stations, including S51 (Freq_3_4_St).</p>
        <p>This color coding is consistent across <xref ref-type="fig" rid="fig11">Figures 11-13</xref>.</p>
        <p>In <xref ref-type="fig" rid="fig11">Figure 11(a)</xref>, we observe that one species, SWMAC, has a significantly higher basal area and distribution index compared to other species. Statistically, this species is an outlier, as shown by the corresponding boxplots for basal area and distribution index, located at the top and right of the graph, respectively. Such cases are flagged for the ecologist, who must then provide an ecological explanation.</p>
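        <p>Boxplots conventionally flag as outliers the values lying more than 1.5 times the interquartile range beyond the quartiles. Assuming that this usual rule underlies the boxplots of the figures (the whisker rule is not stated in the text), the detection of outlying species can be sketched as follows. The species codes and basal-area values are placeholders, not data from station S51, and the function name tukey_outliers is hypothetical.</p>
        <code language="python">
```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Codes whose value lies beyond k * IQR from the first or third quartile."""
    q1, _, q3 = quantiles(values.values(), n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return sorted(c for c, v in values.items() if v > upper or lower > v)


# Placeholder basal areas for seven hypothetical species (not measured data).
basal_area = {
    "SP1": 2.0,
    "SP2": 3.0,
    "SP3": 4.0,
    "SP4": 4.5,
    "SP5": 5.0,
    "SP6": 6.0,
    "SP7": 40.0,
}

if __name__ == "__main__":
    print(tukey_outliers(basal_area))
```
        </code>
        <p>Applying the same function separately to the distribution index and to the basal area would give, for each station, the species to be submitted to the ecologist for interpretation.</p>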
        <p>Regarding the species PIFRAG (Pisonia fragrans), INLA (Inga laurina), and TACI (Tabernaemontana citrifolia), marked in red on the figure and part of the community of frequent species present in three or four stations, the corresponding boxplots for both basal area and distribution index show that they have basal areas and distributions comparable to other species, except for the outliers. These species provide indications about the ecological development level of the station. </p>
        <p>We note that all species are represented on the figure, but not all are labeled for readability purposes. This will also be the case for the three figures below. Similarly, for species referenced as Freq_2_St (excluding SWMAC), their distribution is equivalent to that of species referenced as Freq_3_4_St.</p>
        <p>Concerning the species specific to station S51, in terms of the distribution index, two species behave as outliers: MAZA (Manilkara zapota) and TRTR (Triphasia trifolia). In terms of basal area, three species exhibit outlier behavior: EULI (Eugenia ligustrina), MAZA (Manilkara zapota), and TRTR (Triphasia trifolia).</p>
        <p>To improve the readability of the graph, <xref ref-type="fig" rid="fig11">Figure 11(b)</xref> is presented as a modified version of <xref ref-type="fig" rid="fig11">Figure 11(a)</xref>, excluding the representation of species SWMAC and EULI. It is observed that, in both distribution and basal area, species referenced as Freq_3_4_St, Freq_2_St, and Speci_S51 are distributed equivalently.</p>
        <fig id="fig16">
          <label>Figure 16</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId36.jpeg?20251229112547" />
        </fig>
        <fig id="fig17">
          <label>Figure 17</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId37.jpeg?20251229112547" />
        </fig>
        <p>(a) (b)</p>
        <p><bold>Figure 11.</bold> S51, Id = f(St): (a) with all species; (b) without SWMAC, EULI, TAHET.</p>
        <p>In <xref ref-type="fig" rid="fig12">Figure 12</xref> below, we present the bivariate graph for station S103.</p>
        <fig id="fig18">
          <label>Figure 18</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId38.jpeg?20251229112546" />
        </fig>
        <fig id="fig19">
          <label>Figure 19</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId39.jpeg?20251229112546" />
        </fig>
        <p>(a) (b)</p>
        <p><bold>Figure 12.</bold> S103, Id = f(St): (a) with all species; (b) without SWMAC, COSW, MYSPLEN, PIRACE, MAIN.</p>
        <p>Regarding station S103 and as shown in <xref ref-type="fig" rid="fig12">Figure 12(a)</xref>, several species, including SWMAC (Swietenia macrophylla), COSW (Coccoloba swartzii), MAIN (Mangifera indica), MYSPLEN (Myrcia splendens), and PIRACE (Pimenta racemosa), are positioned as “outliers.” The other frequent species referenced as Freq_3_4_St in <xref ref-type="fig" rid="fig12">Figure 12(b)</xref>, namely SWAU (Swietenia aubrevilleana), INLA (Inga laurina), ININ (Inga ingoides), TACI (Tabernaemontana citrifolia), ANIN (Andira inermis), FUEL (Funtumia elastica), and PIFRAG (Pisonia fragrans), exhibit distributions and basal areas that are highly significant for this station.</p>
        <p>In <xref ref-type="fig" rid="fig13">Figure 13</xref> below, we present the bivariate graph for station S119.</p>
        <fig id="fig20">
          <label>Figure 20</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId40.jpeg?20251229112547" />
        </fig>
        <fig id="fig21">
          <label>Figure 21</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId41.jpeg?20251229112547" />
        </fig>
        <p>(a) (b)</p>
        <p><bold>Figure 13.</bold> S119, Id = f(St): (a) with all species; (b) without FUEL, SIAM, ININ, SACAR, SASAM, TEC, APAL, MYSPLEN, CUAM.</p>
        <p>Regarding station S119, as shown in <xref ref-type="fig" rid="fig13">Figure 13(a)</xref>, for species referenced as Freq_3_4_St, the distribution index highlights FUEL (Funtumia elastica) and MYSPLEN (Myrcia splendens) as “outliers.” These species are thus significantly distributed in this station. Similarly, for Freq_3_4_St species, but considering basal area, SIAM (Simarouba amara) and ININ (Inga ingoides) exhibit significantly higher basal areas than the others.</p>
        <p>A similar observation applies to species referenced as Freq_2_St, with CUAM (Cupania americana) and SACAR (Samanea saman) as outliers for distribution, and SACAR (Sapium caribaeum), SASAM (Samanea saman), MAIN (Mangifera indica), and HITRI (Hirtella triandra) as outliers for basal area.</p>
        <p>Regarding species specific to station S119 (Speci_S119), TEC (Terminalia catappa) and ARAL (Artocarpus altilis) are the most significant in terms of basal area, while COLI (Coffea liberica) is the most significant in terms of the distribution index.</p>
        <p>In <xref ref-type="fig" rid="fig14">Figure 14</xref> below, we present the bivariate graph for station S120.</p>
        <p>Station S120 exhibits a profile similar to that of station S119, in the sense that the species assemblages Freq_3_4_St and Freq_2_St are the most widely distributed and have the largest basal areas.</p>
        <p>More specifically, regarding distribution for the Freq_3_4_St assemblage, species such as MYSPLEN (Myrcia splendens), ININ (Inga ingoides), FUEL (Funtumia elastica), TACI (Tabernaemontana citrifolia), EUMO (Eugenia monticola), SWAU (Swietenia aubrevilleana), and COSW (Coccoloba swartzii) are highly distributed in this station.</p>
        <p>For the Freq_2_St assemblage, the most widely distributed species are CUAM (Cupania americana), COAL (Cordia alliodora), COSU (Cordia sulcata), and SACAR (Sapium caribaeum). Notably, COAL also exhibits a significantly higher basal area compared to other species.</p>
        <fig id="fig22">
          <label>Figure 22</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId42.jpeg?20251229112547" />
        </fig>
        <fig id="fig23">
          <label>Figure 23</label>
          <graphic xlink:href="https://html.scirp.org/file/2001269-rId43.jpeg?20251229112547" />
        </fig>
        <p>(a) (b)</p>
        <p><bold>Figure 14.</bold> S120, Id = f(St): (a) with all species; (b) without MYSPLEN, ININ, CUAM, ANIN, SIAM, COAL, HYCOU, HORAC.</p>
        <p>Regarding species specific to station S120 (Speci_S120), notable cases include HYCOU (Hymenaea courbaril), HORAC (Homalium racemosum), and ANMU (Annona muricata), which primarily show significant development in terms of basal area.</p>
      </sec>
    </sec>
    <sec id="sec5">
      <title>5. Discussion</title>
      <p>Through this work, we aimed to demonstrate that knowledge discovery in data, data mining, and statistical analyses make it possible to highlight unique species communities that are indicative of specific ecological situations and conditions. Given the complexity of ecosystems, we selected a small dataset to test our knowledge extraction methodology.</p>
      <p>The most notable result concerns the identification of <italic>Tabernaemontana citrifolia</italic> (TACI), the only species common to all four stations, regardless of altitude, biomass, or management history. The presence of this species across all stations suggests its ability to adapt to diverse ecological conditions, ranging from the “classic” stations (S119 and S120) to the anthropized conditions of the <italic>Swietenia macrophylla</italic> plantations (S51 and S103). Based on the bivariate analysis from the previous section, this species is well distributed, with a significant basal area, particularly in the “classic” stations S119 and S120. This finding highlights <italic>T. citrifolia</italic> as a potentially key species in the dynamics of the studied secondary forest formations, likely playing a central role in ecological resilience.</p>
      <p>The analysis using a support threshold of 75% identified groups of species present in three out of the four stations, indicating shared resilience despite notable ecological differences. For example, the community {ANIN, COSW, EUMO, FUEL, ININ, MYSPLEN, SIAM, SWAU, TACI} includes species widely distributed across stations S103, S119, and S120. Although these stations differ in altitude, biomass, and management history, they appear to share relatively homogeneous mesophilic conditions. This species assemblage includes trees (<italic>Andira inermis</italic>, <italic>Simarouba amara</italic>, and <italic>Swietenia aubrevilleana</italic>) as well as shrubs (<italic>Eugenia monticola</italic> and <italic>Myrcia splendens</italic>). This may reflect a balanced community dynamic between pioneer and mature species.</p>
      <p>The species common to S51 and two other stations (S103 and either S119 or S120) also reveal interesting ecological patterns. For instance, the presence of <italic>Inga laurina</italic> (INLA) in S51, S103, and S119, as well as <italic>Pisonia fragrans</italic> (PIFRAG) in S51, S103, and S120, indicates that these species are capable of tolerating the specific ecological conditions of the <italic>Swietenia macrophylla</italic> plantations. This suggests that, despite the management history of these stations, certain species are able to persist under conditions of high anthropic influence.</p>
      <p>The analysis using a 50% support threshold revealed species associations specific to station pairs. These results enhance our understanding of ecological gradients. For example, the assemblage shared by S103 and S120 {ANIN, BRAL, CACA, COSW, EUMO, FUEL, ININ, MYFAL, MYSPLEN, PIFRAG, SIAM, SWAU, TACI} includes a mix of pioneer and mature species, which aligns with the intermediate biomass levels of these stations and their relatively higher altitude. In contrast, the species associations specific to S51 and S119, such as {INLA, OCCOR, PIAMA, TACI}, likely reflect ecological responses specific to the anthropized conditions of S51 and the advanced successional characteristics of S119.</p>
      <p>The species specific to each station provide additional insights into local ecological characteristics. For example, Bougainvillea glabra (BUGL) and Cecropia schreberiana (CESC), specific to S51 and S119, respectively, reflect the pronounced differences between a plantation station and a “classic” station at a more advanced successional stage. These indicator species could be useful in future studies to refine the ecological diagnostics of these stations.</p>
      <p>Aside from Tabernaemontana citrifolia (TACI), the species assemblages identified in the Swietenia macrophylla plantations (S51 and S103) exhibit distinct compositions compared to the “classic” stations (S119 and S120). Specifically, S51 ∩ S103 = {BOSU, CHCO, CHAR, COCA, ERHA, INLA, MYCIT, PIFRAG, PSMI, PSNE, SWMAC, TACI}, while S119 ∩ S120 = {ANIN, CASP, CEPE, COSW, COAL, COSU, CUAM, EUMO, FUEL, HITRI, ININ, MYSPLEN, OCCER, PIADUN, PIRET, SASAM, SACAR, SIAM, SWAU, TACI, ZACA}. These distinctions likely reflect the influence of management history on plant community dynamics.</p>
      <p>Additionally, station S103, which has the highest altitude (130 m) and the largest surface area (1000 m<sup>2</sup>), harbors species assemblages (S103_SpecificSpecies = {ACRE, BAMU, CADE, CHAL, EUAL, EUPS, FICI, HACA, IXFER, LOHE, LOPU, MABI, MALAE, ODNY, PACR, PIPE, PIRACE, PSMA, RAACU}) that include species commonly associated with cooler, mesophilic, and better-preserved conditions. These results highlight the role of SWMAC plantations in shaping community structures while also revealing interactions between post-plantation regeneration and abiotic factors such as altitude.</p>
      <p>The bivariate analyses conducted on the Distribution Index (Id) and total basal area (St_Totale) revealed that the most widespread species exhibit some of the highest Id values and, in certain cases, also a high basal area. This may indicate a dominant position within the community. More broadly, these results suggest that a species combining high frequency with substantial coverage plays a significant role in ecological processes such as competition, resource availability, and canopy architecture, potentially influencing successional dynamics.</p>
    </sec>
    <sec id="sec6">
      <title>6. Conclusions</title>
      <p>This study highlights the relevance of an approach for identifying frequent plant species communities in the sub-humid humid bioclimate and mesophilic forests. The ECLAT algorithm, originally designed for analyzing transactional data in supermarket sales, successfully identified frequent and specific plant species assemblages while avoiding biases related to overabundance caused by extensive anthropization. The results were obtained without incorporating prior knowledge of the environmental or historical characteristics of the stations.</p>
      <p>Robust co-occurrence patterns were identified, accurately reflecting the ecological characteristics of the studied stations. The identification of frequent plant species assemblages, as well as station-specific species, helped to delineate characteristic groups, supporting the hypothesis that certain floristic associations are more recurrent and likely better adapted to specific bioclimatic and edaphic conditions.</p>
      <p>The validation of these results using well-established statistical methods (HAC and PCA) and the dominant positioning of frequent communities in relation to the variables Id and St_Totale further reinforces the robustness of our methodology.</p>
      <p>Additionally, the results demonstrate the relevance of a qualitative approach in an exploratory context, where the primary objective is to assess the feasibility and reliability of the methods before applying them to larger datasets and broader scales. By relying on a small dataset with significant ecological contrasts, our qualitative approach proves effective in addressing the challenges posed by the complexity of forest ecosystems while opening avenues for more extensive analyses. It provides a strong methodological foundation for further investigating vegetation succession processes and contributing to the study of ecosystem resilience under environmental pressures.</p>
      <p>Future work should extend this study to a larger set of stations and bioclimatic conditions to determine whether the observed trends persist across broader ecological contexts.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="B1">
        <label>1.</label>
        <citation-alternatives>
          <mixed-citation publication-type="book">François, R. (2012) Éléments d’écologie: Écologie appliquée. 7th Edition, Dunod.</mixed-citation>
          <element-citation publication-type="book">
            <person-group person-group-type="author">
              <string-name>François, R.</string-name>
            </person-group>
            <year>2012</year>
            <source>Éléments d’écologie: Écologie appliquée</source>
            <edition>7th</edition>
            <publisher-name>Dunod</publisher-name>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B2">
        <label>2.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Joseph, P. (2009) La végétation forestière des Petites Antilles: Synthèse biogéographique et écologique, bilan et perspectives. Karthala.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Joseph, P.</string-name>
            </person-group>
            <year>2009</year>
            <source>La végétation forestière des Petites Antilles: Synthèse biogéographique et écologique, bilan et perspectives</source>
            <publisher-name>Karthala</publisher-name>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B3">
        <label>3.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Agrawal, R., Imieliński, T. and Swami, A. (1993) Mining Association Rules between Sets of Items in Large Databases. <italic>Proceedings of the</italic> 1993 <italic>ACM SIGMOD International Conference on Management of Data</italic>, Washington D.C., 26-28 May 1993, 207-216. https://doi.org/10.1145/170035.170072 <pub-id pub-id-type="doi">10.1145/170035.170072</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/170035.170072">https://doi.org/10.1145/170035.170072</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Agrawal, R.</string-name>
              <string-name>Imieliński, T.</string-name>
              <string-name>Swami, A.</string-name>
            </person-group>
            <year>1993</year>
            <article-title>Mining Association Rules between Sets of Items in Large Databases</article-title>
            <source>Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data</source>
            <fpage>207</fpage>
            <lpage>216</lpage>
            <pub-id pub-id-type="doi">10.1145/170035.170072</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B4">
        <label>4.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Agrawal, R. and Srikant, R. (1994) Fast Algorithms for Mining Association Rules. <italic>Proceedings of the 20th VLDB Conference</italic>, Santiago, Chile, 12-15 September 1994, 487-499.</mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Agrawal, R.</string-name>
              <string-name>Srikant, R.</string-name>
            </person-group>
            <year>1994</year>
            <article-title>Fast Algorithms for Mining Association Rules</article-title>
            <source>Proceedings of the 20th VLDB Conference</source>
            <fpage>487</fpage>
            <lpage>499</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B5">
        <label>5.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Han, J., Pei, J. and Yin, Y. (2000) Mining Frequent Patterns without Candidate Generation. <italic>Proceedings of the</italic> 2000 <italic>ACM SIGMOD International Conference on Management of Data</italic>, Dallas, 16-18 May 2000. https://doi.org/10.1145/342009.335372 <pub-id pub-id-type="doi">10.1145/342009.335372</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/342009.335372">https://doi.org/10.1145/342009.335372</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Han, J.</string-name>
              <string-name>Pei, J.</string-name>
              <string-name>Yin, Y.</string-name>
            </person-group>
            <year>2000</year>
            <article-title>Mining Frequent Patterns without Candidate Generation</article-title>
            <source>Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data</source>
            <pub-id pub-id-type="doi">10.1145/342009.335372</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B6">
        <label>6.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Symphor, J.E., Mancheron, A., Vinceslas, L. and Poncelet, P. (2008) Le FIA: Un nouvel automate permettant l’extraction efficace d’itemsets fréquents dans les flots de données. <italic>Extraction et gestion des connaissances</italic> (<italic>EGC</italic>’2008), <italic>Actes des 8èmes journées Extraction et Gestion des Connaissances</italic>, Sophia, 29 janvier au 1er février 2008, 157-168.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Symphor, J.E.</string-name>
              <string-name>Mancheron, A.</string-name>
              <string-name>Vinceslas, L.</string-name>
              <string-name>Poncelet, P.</string-name>
            </person-group>
            <year>2008</year>
            <article-title>Le FIA: Un nouvel automate permettant l’extraction efficace d’itemsets fréquents dans les flots de données</article-title>
            <source>Extraction et gestion des connaissances (EGC’2008)</source>
            <fpage>157</fpage>
            <lpage>168</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B7">
        <label>7.</label>
        <citation-alternatives>
          <mixed-citation publication-type="web">Vinceslas, L., Symphor, J.E., Mancheron, A. and Poncelet, P. (2008) FIASCO: Un nouvel algorithme d’extraction d’itemsets fréquents dans les flots de données. <italic>Extraction et gestion des connaissances</italic> (<italic>EGC</italic>’2008), <italic>Actes des 8èmes journées Extraction et Gestion des Connaissances</italic>, Sophia, 29 janvier au 1er février 2008, 235-236. http://editions-rnti.fr/?inprocid=1000603</mixed-citation>
          <element-citation publication-type="web">
            <person-group person-group-type="author">
              <string-name>Vinceslas, L.</string-name>
              <string-name>Symphor, J.E.</string-name>
              <string-name>Mancheron, A.</string-name>
              <string-name>Poncelet, P.</string-name>
            </person-group>
            <year>2008</year>
            <article-title>FIASCO: Un nouvel algorithme d’extraction d’itemsets fréquents dans les flots de données</article-title>
            <source>Extraction et gestion des connaissances (EGC’2008)</source>
            <fpage>235</fpage>
            <lpage>236</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B8">
        <label>8.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Laur, P., Nock, R., Symphor, J. and Poncelet, P. (2007) Mining Evolving Data Streams for Frequent Patterns. <italic>Pattern Recognition</italic>, 40, 492-503. https://doi.org/10.1016/j.patcog.2006.03.006 <pub-id pub-id-type="doi">10.1016/j.patcog.2006.03.006</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.patcog.2006.03.006">https://doi.org/10.1016/j.patcog.2006.03.006</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Laur, P.</string-name>
              <string-name>Nock, R.</string-name>
              <string-name>Symphor, J.</string-name>
              <string-name>Poncelet, P.</string-name>
            </person-group>
            <year>2007</year>
            <article-title>Mining Evolving Data Streams for Frequent Patterns</article-title>
            <source>Pattern Recognition</source>
            <volume>40</volume>
            <fpage>492</fpage>
            <lpage>503</lpage>
            <pub-id pub-id-type="doi">10.1016/j.patcog.2006.03.006</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B9">
        <label>9.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Zaki, M.J., Parthasarathy, S., Ogihara, M. and Li, W. (1997) New Algorithms for Fast Discovery of Association Rules.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Zaki, M.J.</string-name>
              <string-name>Parthasarathy, S.</string-name>
              <string-name>Ogihara, M.</string-name>
              <string-name>Li, W.</string-name>
            </person-group>
            <year>1997</year>
            <article-title>New Algorithms for Fast Discovery of Association Rules</article-title>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B10">
        <label>10.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Zhang, Y., Taylor, W.W. and Liu, T. (2022) Hierarchical Clustering Reveals Distinct Fish Community Structures in Response to Environmental Variation in the Yangtze River. <italic>Aquatic Conservation: Marine and Freshwater Ecosystems</italic>, 32, 1567-1579.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Zhang, Y.</string-name>
              <string-name>Taylor, W.W.</string-name>
              <string-name>Liu, T.</string-name>
            </person-group>
            <year>2022</year>
            <article-title>Hierarchical Clustering Reveals Distinct Fish Community Structures in Response to Environmental Variation in the Yangtze River</article-title>
            <source>Aquatic Conservation: Marine and Freshwater Ecosystems</source>
            <volume>32</volume>
            <fpage>1567</fpage>
            <lpage>1579</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B11">
        <label>11.</label>
        <citation-alternatives>
          <mixed-citation publication-type="book">Legendre, P. and Legendre, L. (2012) Numerical Ecology. 3rd Edition, Elsevier.</mixed-citation>
          <element-citation publication-type="book">
            <person-group person-group-type="author">
              <string-name>Legendre, P.</string-name>
              <string-name>Legendre, L.</string-name>
            </person-group>
            <year>2012</year>
            <source>Numerical Ecology</source>
            <edition>3rd</edition>
            <publisher-name>Elsevier</publisher-name>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B12">
        <label>12.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Kissling, W.D. and Field, R. (2022) Using Species Co-Occurrence Networks to Explore Biodiversity Patterns and Processes with the Jaccard Index. <italic>Journal of Biogeography</italic>, 49, 973-987.</mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Kissling, W.D.</string-name>
              <string-name>Field, R.</string-name>
            </person-group>
            <year>2022</year>
            <article-title>Using Species Co-Occurrence Networks to Explore Biodiversity Patterns and Processes with the Jaccard Index</article-title>
            <source>Journal of Biogeography</source>
            <volume>49</volume>
            <fpage>973</fpage>
            <lpage>987</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B13">
        <label>13.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Borcard, D. and Gillet, F. (2018) Numerical Ecology with R. Springer.</mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Borcard, D.</string-name>
              <string-name>Gillet, F.</string-name>
            </person-group>
            <year>2018</year>
            <source>Numerical Ecology with R</source>
            <publisher-name>Springer</publisher-name>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B14">
        <label>14.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Joseph, P., Symphor, J.É., Baillard, K., Elymarius, S., Claude, J.P., Abati, Y. and Jean-François, Y. (2017) The Effects of Topography on Martinique’s Mesological and Floristic Differentiations: The Case of Morne Carrière (Commune of VAUCLIN). <italic>IOSR Journal of Environmental Science Toxicology and Food Technology</italic>, 11, 74-96.</mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Joseph, P.</string-name>
              <string-name>Symphor, J.É.</string-name>
              <string-name>Baillard, K.</string-name>
              <string-name>Elymarius, S.</string-name>
              <string-name>Claude, J.P.</string-name>
              <string-name>Abati, Y.</string-name>
              <string-name>Jean-François, Y.</string-name>
            </person-group>
            <year>2017</year>
            <article-title>The Effects of Topography on Martinique’s Mesological and Floristic Differentiations: The Case of Morne Carrière (Commune of VAUCLIN)</article-title>
            <source>IOSR Journal of Environmental Science Toxicology and Food Technology</source>
            <volume>11</volume>
            <fpage>74</fpage>
            <lpage>96</lpage>
          </element-citation>
        </citation-alternatives>
      </ref>
    </ref-list>
  </back>
</article>