<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE article  PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd"><article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" dtd-version="3.0" xml:lang="en" article-type="research article"><front><journal-meta><journal-id journal-id-type="publisher-id">AJMB</journal-id><journal-title-group><journal-title>American Journal of Molecular Biology</journal-title></journal-title-group><issn pub-type="epub">2161-6620</issn><publisher><publisher-name>Scientific Research Publishing</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.4236/ajmb.2021.114008</article-id><article-id pub-id-type="publisher-id">AJMB-111391</article-id><article-categories><subj-group subj-group-type="heading"><subject>Articles</subject></subj-group><subj-group subj-group-type="Discipline-v2"><subject>Biomedical&amp;Life Sciences</subject></subj-group></article-categories><title-group><article-title>
 
 
  Prediction of Monophyletic Groups Based on Gene Order and Sequence Similarity in Organelle DNA
 
</article-title></title-group><contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Matyas</surname><given-names>Cserhati</given-names></name><xref ref-type="aff" rid="aff1"><sub>1</sub></xref><xref ref-type="corresp" rid="cor1"><sup>*</sup></xref></contrib></contrib-group><aff id="aff1"><label>1</label><addr-line>Independent Scholar</addr-line></aff><pub-date pub-type="epub"><day>20</day><month>08</month><year>2021</year></pub-date><volume>11</volume><issue>04</issue><fpage>83</fpage><lpage>99</lpage><history><date date-type="received"><day>14</day><month>July</month><year>2021</year></date><date date-type="rev-recd"><day>17</day><month>August</month><year>2021</year></date><date date-type="accepted"><day>20</day><month>August</month><year>2021</year></date></history><permissions><copyright-statement>&#169; Copyright  2014 by authors and Scientific Research Publishing Inc. </copyright-statement><copyright-year>2014</copyright-year><license><license-p>This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/</license-p></license></permissions><abstract><p>
 
 
  Organelle genomics has become its own field of study. Much information can be gleaned from the study of cell organelles. The differences in the genomes of organelles, such as the mitochondrion and the chloroplast, are amenable to phylogenetic and cladistic studies. These differences include the genome sequence, GC%, genome length and gene order. The conserved nature of the organelle genomes and the gene inventory of both mitochondrial and chloroplast genomes also make such studies easier to accomplish. This paper includes a review of existing organelle genome software, including gene annotation and genome visualization tools as well as organelle gene databases for both mitochondrion and plastid. A new R tool, available on GitHub, called “Organelle DNA Lineages”, or ODL, was written to compare and classify organelle genomes based on their genome sequence and gene order. The software was run on the mitochondrial genomes of a set of 51 cephalopod species, delineating ten separate monophyletic groups, including argonauts, nautiluses, octopuses, cuttlefish, and six squid groups. This new tool can help enrich and expand the field of organelle genomics.
 
</p></abstract><kwd-group><kwd>Organelle</kwd><kwd>Genome</kwd><kwd>Mitochondrion</kwd><kwd>Plastid</kwd><kwd>Organelle DNA Lineage</kwd><kwd>Chloroplast</kwd></kwd-group></article-meta></front><body><sec id="s1"><title>1. Introduction</title><p>The DNA inside mitochondria and chloroplasts can be very useful in analyzing phylogenetic relationships between species. Several characteristics demonstrate why organelle DNA is amenable to such analyses. First, because of its small size, organelle DNA is easy to isolate and sequence. Many species which do not have their nuclear genomes available have mitochondrial and chloroplast genome sequences instead in the NCBI database. The Organelle Genome Database at NCBI contains 19,320 organelle genome sequences as of July 28, 2021. The distribution of these organelle genomes can be seen in <xref ref-type="table" rid="table1">Table 1</xref>. Second, the mitochondrial DNA (mtDNA) overwhelmingly follows maternal inheritance patterns, avoiding complex biparental inheritance patterns. This paper will focus on the mitochondrion and the plastid.</p><p>Despite its rapid mutation rate, organelle DNA (oDNA) retains a very conserved gene inventory, very similar gene order, genome sequence similarity, and genome size between related species. Gene rearrangements include recombination, inversions, transpositions, inverse transpositions, and tandem duplication random losses, but are thought to be rare [<xref ref-type="bibr" rid="scirp.111391-ref1">1</xref>] [<xref ref-type="bibr" rid="scirp.111391-ref2">2</xref>]. The insertion of introns and mobile elements is also thought to result in gene order rearrangement [<xref ref-type="bibr" rid="scirp.111391-ref3">3</xref>]. Plastid DNA (plDNA) may expand or contract due to the presence or absence of inverted repeats [<xref ref-type="bibr" rid="scirp.111391-ref4">4</xref>]. In general, genes, rRNAs and tRNAs are colinear with one another between species in the same monophyletic group. 
This makes the identification and classification of newly sequenced and annotated oDNA relatively easy.</p><p>Mitochondrial genes, for example, take part in the production of adenosine triphosphate (ATP), the energy molecule, as well as oxidative phosphorylation (OXPHOS), the tricarboxylic acid cycle (TCA), the β-oxidation of fatty acids, calcium handling, the regulation of apoptosis, and the cell cycle. Because mtDNA lacks histones, it is more prone to molecular stress. Damaged mtDNA lies behind a number of human diseases [<xref ref-type="bibr" rid="scirp.111391-ref5">5</xref>] [<xref ref-type="bibr" rid="scirp.111391-ref6">6</xref>] [<xref ref-type="bibr" rid="scirp.111391-ref7">7</xref>]. Other illnesses related to mitochondrial dysfunction include skeletal muscle dysfunction, lung injury, acute renal failure, and immune function dysregulation [<xref ref-type="bibr" rid="scirp.111391-ref8">8</xref>].</p><p>An example of the classical mtDNA genome is that of human (NC_012920, <xref ref-type="fig" rid="fig1">Figure 1</xref>), which is 16,569 bp long and has 37 genes, coding for 13 protein subunits, 22 tRNAs and 2 rRNAs [<xref ref-type="bibr" rid="scirp.111391-ref9">9</xref>]. The 13 protein subunits are components of the oxidative phosphorylation complexes I, III, IV and V [<xref ref-type="bibr" rid="scirp.111391-ref7">7</xref>]. The mtDNA consists of two strands, the heavy (H) strand and the light (L) strand, which are differentiated based on their base composition, the H-strand being richer in guanine. This distinction corresponds to the sense and antisense terminology. The codon usage of the mtDNA also differs from that of the nuclear genome (see <xref ref-type="table" rid="table2">Table 2</xref>), and the mtDNA also has a ten-fold higher mutation rate [<xref ref-type="bibr" rid="scirp.111391-ref10">10</xref>]. The thirteen protein subunits coded on the mtDNA all take part in energy metabolism. 
A list of genes in the mitochondrial genome can be seen in <xref ref-type="table" rid="table3">Table 3</xref>.</p><p>The oDNA of different groups can also differ based on the presence or absence of specific genes. These genes serve as diagnostic markers of these groups. For example, several algal lineages have a DnaB helicase encoded in the plastid genome [<xref ref-type="bibr" rid="scirp.111391-ref11">11</xref>]. Opisthokonts (animals and fungi) use DNA polymerase γ (Polγ) for the replication of the mtDNA [<xref ref-type="bibr" rid="scirp.111391-ref12">12</xref>]. The number and type of DNA polymerases also vary across eukaryotes. For example, parasitic apicomplexans have lost them altogether [<xref ref-type="bibr" rid="scirp.111391-ref11">11</xref>].</p><table-wrap id="table1" ><label><xref ref-type="table" rid="table1">Table 1</xref></label><caption><title> List of different kinds of organelle genomes in the NCBI Organelle Browser as of July 28, 2021</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Type of organelle</th><th align="center" valign="middle" >Number of genomes in NCBI</th></tr></thead><tr><td align="center" valign="middle" >Mitochondrion</td><td align="center" valign="middle" >12,528</td></tr><tr><td align="center" valign="middle" >Chloroplast</td><td align="center" valign="middle" >5660</td></tr><tr><td align="center" valign="middle" >Plastid</td><td align="center" valign="middle" >1064</td></tr><tr><td align="center" valign="middle" >Apicoplast</td><td align="center" valign="middle" >62</td></tr><tr><td align="center" valign="middle" >Kinetoplast</td><td align="center" valign="middle" >3</td></tr><tr><td align="center" valign="middle" >Chromatophore</td><td align="center" valign="middle" >2</td></tr><tr><td align="center" valign="middle" >Cyanelle</td><td align="center" valign="middle" >1</td></tr></tbody></table></table-wrap><table-wrap id="table2" ><label><xref ref-type="table" rid="table2">Table 
2</xref></label><caption><title> Divergent codon assignments in the mtDNA compared to the nuclear genome</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Codon</th><th align="center" valign="middle" >Universal genetic code</th><th align="center" valign="middle" >Mitochondrial genetic code</th></tr></thead><tr><td align="center" valign="middle" >UGA</td><td align="center" valign="middle" >Stop</td><td align="center" valign="middle" >Trp</td></tr><tr><td align="center" valign="middle" >AUA</td><td align="center" valign="middle" >Ile</td><td align="center" valign="middle" >Met</td></tr><tr><td align="center" valign="middle" >AGA</td><td align="center" valign="middle" >Arg</td><td align="center" valign="middle" >Stop</td></tr><tr><td align="center" valign="middle" >AGG</td><td align="center" valign="middle" >Arg</td><td align="center" valign="middle" >Stop</td></tr></tbody></table></table-wrap><table-wrap id="table3" ><label><xref ref-type="table" rid="table3">Table 3</xref></label><caption><title> The 37 genes encoded by the human mitochondrial genome</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Gene symbol</th><th align="center" valign="middle" >Function</th><th align="center" valign="middle" >Strand</th></tr></thead><tr><td align="center" valign="middle" >RNR1</td><td align="center" valign="middle" >12S ribosomal RNA</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >RNR2</td><td align="center" valign="middle" >16S ribosomal RNA</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ATP6</td><td align="center" valign="middle" >ATP synthase F0 subunit 6</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ATP8</td><td align="center" valign="middle" >ATP synthase F0 subunit 8</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >CYB</td><td 
align="center" valign="middle" >cytochrome b</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >CO1</td><td align="center" valign="middle" >cytochrome c oxidase subunit I</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >CO2</td><td align="center" valign="middle" >cytochrome c oxidase subunit II</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >CO3</td><td align="center" valign="middle" >cytochrome c oxidase subunit III</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ND1</td><td align="center" valign="middle" >NADH dehydrogenase subunit 1</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ND2</td><td align="center" valign="middle" >NADH dehydrogenase subunit 2</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ND3</td><td align="center" valign="middle" >NADH dehydrogenase subunit 3</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ND4</td><td align="center" valign="middle" >NADH dehydrogenase subunit 4</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ND4L</td><td align="center" valign="middle" >NADH dehydrogenase subunit 4L</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ND5</td><td align="center" valign="middle" >NADH dehydrogenase subunit 5</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >ND6</td><td align="center" valign="middle" >NADH dehydrogenase subunit 6</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TA</td><td align="center" valign="middle" >tRNA-Ala</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TR</td><td 
align="center" valign="middle" >tRNA-Arg</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TN</td><td align="center" valign="middle" >tRNA-Asn</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TD</td><td align="center" valign="middle" >tRNA-Asp</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TC</td><td align="center" valign="middle" >tRNA-Cys</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TQ</td><td align="center" valign="middle" >tRNA-Gln</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TE</td><td align="center" valign="middle" >tRNA-Glu</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TG</td><td align="center" valign="middle" >tRNA-Gly</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TH</td><td align="center" valign="middle" >tRNA-His</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TI</td><td align="center" valign="middle" >tRNA-Ile</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TL1</td><td align="center" valign="middle" >tRNA-Leu-UUR</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TL2</td><td align="center" valign="middle" >tRNA-Leu-CUN</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TK</td><td align="center" valign="middle" >tRNA-Lys</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TM</td><td align="center" valign="middle" >tRNA-Met</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TF</td><td align="center" valign="middle" >tRNA-Phe</td><td align="center" 
valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TP</td><td align="center" valign="middle" >tRNA-Pro</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TS1</td><td align="center" valign="middle" >tRNA-Ser-UCN</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TS2</td><td align="center" valign="middle" >tRNA-Ser-AGY</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TT</td><td align="center" valign="middle" >tRNA-Thr</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TW</td><td align="center" valign="middle" >tRNA-Trp</td><td align="center" valign="middle" >Heavy</td></tr><tr><td align="center" valign="middle" >TY</td><td align="center" valign="middle" >tRNA-Tyr</td><td align="center" valign="middle" >Light</td></tr><tr><td align="center" valign="middle" >TV</td><td align="center" valign="middle" >tRNA-Val</td><td align="center" valign="middle" >Heavy</td></tr></tbody></table></table-wrap><p>Besides genes, the presence or absence of certain DNA motifs can help classify species into different groups. Such motifs are abundant, have a simple genomic structure, are conserved, and can be used as genetic markers [<xref ref-type="bibr" rid="scirp.111391-ref13">13</xref>]. For example, Xi Wen et al. [<xref ref-type="bibr" rid="scirp.111391-ref14">14</xref>] found that 218 simple sequence repeats (SSRs) in the plastid genome of M. grandiflora correlate with those in three other Magnoliaceae species. These SSRs range from mononucleotides to hexanucleotides.</p><p>In contrast with the mtDNA, the plDNA is much larger and contains around 250 genes, including tRNAs and rRNAs, involved in photosynthesis and plastid gene regulation. Xiao-Ming et al. 
[<xref ref-type="bibr" rid="scirp.111391-ref15">15</xref>] found that in a study of 126 plDNA genes in 272 angiosperm species, 106 (84.1%) were found in 245 species (90.1%). Genome sizes range from 10 to 250 kbp, and between 100 and 1000 copies can be present inside the plastid [<xref ref-type="bibr" rid="scirp.111391-ref16">16</xref>]. Some genes in the plastid genome of some species also have introns, such as in Magnolia grandiflora [<xref ref-type="bibr" rid="scirp.111391-ref14">14</xref>]. Similar to the mtDNA, the plDNA is inherited from one parent only, except for 20% of angiosperms, where plastids are inherited from both parents. Compared to mitogenomes, plastid genomes also have twice as many genes and a three-fold higher silent substitution rate [<xref ref-type="bibr" rid="scirp.111391-ref17">17</xref>]. Plastid genomes are also conserved in structure and are poor in repeat elements. There are several types of plastids, which fulfil different roles in the plant. The well-known chloroplasts harvest light energy during photosynthesis. Chromoplasts are colored plastids which store different pigments. Leucoplasts are colorless plastids, which store proteins, fat, monoterpenes, tannins, polyphenols, and other secondary metabolites [<xref ref-type="bibr" rid="scirp.111391-ref18">18</xref>].</p><p>In this paper, a new resource called the Organelle DNA Lineages (ODL) software will be described and compared to existing organelle DNA resources.</p></sec><sec id="s2"><title>2. Research Materials and Methods</title><sec id="s2_1"><title>2.1. Description of Software</title><p>The Organelle DNA Lineages (ODL) software was written in R, version 4.0.3. It is available at https://github.com/csmatyi/odl together with all supplementary files and figures. The script is called odl5.R. As input, it takes a list with two columns: the first column contains the Latin name of each species in the study, and the second column contains the NCBI accession number of its oDNA sequence. 
Species and their corresponding accession numbers can be downloaded at the Organelle Genome Browser at https://www.ncbi.nlm.nih.gov/genome/browse#!/organelles/.</p><p>The software removes duplicate species entries. Using the “getAnnotationsGenBank” function from the “ape” package, it downloads and stores the annotation for each accession. Using the “read.GenBank” function, it retrieves the genome sequence for each accession. The software adds all of the organelle genome sequences into a DNAStringSet, and then calls the “msa” function to create a multiple sequence alignment. Then, in an all-versus-all pairwise fashion, the script calculates the sequence similarity between all possible pairs of genome sequences. The sequence similarity values are then stored in a symmetric square sequence similarity matrix.</p><p>The gene, tRNA and rRNA elements are also written to an output file, which is compatible with the input format of the CREx software [<xref ref-type="bibr" rid="scirp.111391-ref19">19</xref>]. The output can be directly uploaded to the CREx website at http://pacosy.informatik.uni-leipzig.de/crex. The GenBank records for all sequences are also written to a subdirectory as part of the output. These can be used as input for the GenomeVx visualization software [<xref ref-type="bibr" rid="scirp.111391-ref20">20</xref>].</p><p>A gene order distance is calculated for each species pair, and stored in a distance matrix, by comparing the indexes of each gene/tRNA/rRNA between the two species using the following equation:</p><p><disp-formula><label>(1)</label><tex-math>d_{a,b} = \sum_{i=1}^{k} \left| \mathrm{mean}(\mathrm{index}_{a}(i)) - \mathrm{mean}(\mathrm{index}_{b}(i)) \right| + \mathrm{unique}(a) + \mathrm{unique}(b)</tex-math></disp-formula></p><p>In Equation (1), the absolute difference between the mean index values of element i (meaning genes, tRNAs, and rRNAs) in species a and b is summed up for all k common elements between the two species. 
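</p><p>As an illustration only, Equation (1) can be sketched in Python; ODL itself is written in R, and this is not its code. The sketch assumes 1-based indexes, takes unique(a) and unique(b) to be the number of elements found in only one of the two species, and uses hypothetical gene orders rather than those of Figure 2. The function names mean_indexes and gene_order_distance are likewise hypothetical.</p><preformat>
```python
from statistics import mean


def mean_indexes(gene_order):
    """Map each element name to the mean of its 1-based positions.

    Elements present in several copies get the average of their indexes.
    """
    positions = {}
    for index, name in enumerate(gene_order, start=1):
        positions.setdefault(name, []).append(index)
    return {name: mean(found) for name, found in positions.items()}


def gene_order_distance(order_a, order_b):
    """Equation (1): summed index differences plus counts of unique elements.

    Assumption: unique(a) and unique(b) are counts of elements that occur
    in only one of the two species.
    """
    idx_a = mean_indexes(order_a)
    idx_b = mean_indexes(order_b)
    shared = set(idx_a).intersection(idx_b)
    moved = sum(abs(idx_a[name] - idx_b[name]) for name in shared)
    unique_a = len(set(idx_a) - shared)
    unique_b = len(set(idx_b) - shared)
    return moved + unique_a + unique_b


# Hypothetical gene orders (not those of Figure 2).
species_x = ["A", "B", "C", "D"]
species_y = ["C", "B", "C", "A", "E"]
print(gene_order_distance(species_x, species_y))
```
</preformat><p>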
The sum also includes the number of elements unique to species a and to species b, respectively.</p><p><xref ref-type="fig" rid="fig2">Figure 2</xref> depicts a simple example of calculating the gene order distance between three hypothetical species. For example, the distance between species 1 and species 3 is 3 + 1 + 2.5 + 3 = 9.5. The distance in the index value for gene A is 4 – 1 = 3. For gene B it is 3 – 2 = 1. For gene C it is (4 + 3)/2 – 1 = 2.5. For gene D it is 5 – 2 = 3.</p><p>It is possible that a certain gene/rRNA/tRNA is present in multiple copies. In this instance, the average index value is taken for that particular gene/rRNA/tRNA. The distance value between two species may be greater than 1, since we are looking at indexes. The gene order distance matrix is then normalized by dividing each element in the distance matrix by the maximum value of the matrix, and each normalized value is then subtracted from 1 to derive a gene order similarity matrix.</p><p>The genome sequence similarity matrix and the gene order similarity matrix are then visualized on a heatmap. Furthermore, these two matrixes are also combined into a “combined” matrix, giving equal weight to both the genome sequence similarity matrix and the gene order similarity matrix. The “heatmap” function is called, using the “ward.D2” method to perform species clustering. A silhouette plot is also created by the software, which depicts the average silhouette width for each possible number of clusters, showing the optimal number of clusters with a dashed line, where the average silhouette width is highest. The mean silhouette width measures how much closer the points in a cluster are to each other than to the points in the nearest neighboring cluster.</p><p>Clustering is performed using the “ward.D2” method on the sequence similarity matrix. Several statistical measures are written to an output file, such as the Hopkins clustering statistic, which denotes how well the matrix forms clusters. 
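</p><p>The normalization and equal-weight combination of the two matrices described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the ODL code (which is written in R): equal weighting is taken to mean the element-wise arithmetic mean, the function names to_similarity and combine are hypothetical, and the matrix values are invented for three hypothetical species.</p><preformat>
```python
def to_similarity(distance):
    """Divide every entry by the matrix maximum and subtract it from 1."""
    largest = max(max(row) for row in distance)
    if largest == 0:
        # All genomes have identical gene order: similarity is 1 everywhere.
        return [[1.0 for _ in row] for row in distance]
    return [[1.0 - value / largest for value in row] for row in distance]


def combine(seq_sim, order_sim):
    """Element-wise mean of two same-shaped similarity matrices.

    Assumption: "equal weight" means the arithmetic mean of the two values.
    """
    return [
        [(s + o) / 2.0 for s, o in zip(row_s, row_o)]
        for row_s, row_o in zip(seq_sim, order_sim)
    ]


# Hypothetical values for three species.
order_dist = [[0.0, 6.0, 3.0], [6.0, 0.0, 12.0], [3.0, 12.0, 0.0]]
seq_sim = [[1.0, 0.8, 0.9], [0.8, 1.0, 0.6], [0.9, 0.6, 1.0]]

order_sim = to_similarity(order_dist)
combined = combine(seq_sim, order_sim)
for row in combined:
    print([round(value, 3) for value in row])
```
</preformat><p>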
Other parameters are also recorded for each of the clusters (which correspond to monophyletic groups), including the number of species in the cluster, the average oDNA length &#177; one standard deviation, the minimum, mean, and maximum similarity value, the standard error of the mean (SEM), and a p-value, which tests whether the similarity values between species within the cluster differ significantly from those between species within the cluster and species outside the cluster. This p-value is a good measure of discontinuity between monophyletic groups.</p><p>The main output of the software is a gene order map, showing the order of genes/rRNAs/tRNAs along the organelle genome per species per cluster. A color-coded legend helps the user identify a given gene/rRNA/tRNA along the organelle genome of a given species.</p></sec><sec id="s2_2"><title>2.2. Other Software</title><p>GenomeVx was used to create the image of the human reference mitogenome in <xref ref-type="fig" rid="fig1">Figure 1</xref>.</p></sec><sec id="s2_3"><title>2.3. Used Sequences</title><p>The mitogenomes for 51 cephalopod species were downloaded from the Organelle Genome Browser, specified in Section 2.1. If a given species occurred more than once, one of its accession numbers was randomly selected.</p></sec></sec><sec id="s3"><title>3. Results and Discussion</title><sec id="s3_1"><title>3.1. Review of Existing Organelle Software and Databases</title><p>Many databases and programs are tailored for storing and visualizing genomes and synteny. Some of them are suitable for visualizing organelle genomes and gene order. In the following, some of these software tools and databases will be reviewed. 
The most commonly used mitogenome and plastid genome annotation tools are listed in <xref ref-type="table" rid="table4">Table 4</xref>.</p><p>GenBank and RefSeq are the standard databases used for storing sequence data, including organelle genomes. RefSeq offers curated and non-redundant data for users. Several databases are built upon GenBank and RefSeq; they store data derived from these databases with improved annotation and data quality. Such databases include OGRe [<xref ref-type="bibr" rid="scirp.111391-ref21">21</xref>], MetAMIGA [<xref ref-type="bibr" rid="scirp.111391-ref22">22</xref>], MitoZOA [<xref ref-type="bibr" rid="scirp.111391-ref23">23</xref>], Mamit-tRNA [<xref ref-type="bibr" rid="scirp.111391-ref24">24</xref>], MamMiBase [<xref ref-type="bibr" rid="scirp.111391-ref25">25</xref>], and tRNAdb [<xref ref-type="bibr" rid="scirp.111391-ref26">26</xref>].</p><p>CREx is software that calculates the hypothetical most parsimonious gene order rearrangement between two organelle genomes, and then visualizes the output. Possible rearrangements include transpositions, reverse transpositions, reversals and tandem duplication random losses (TDRLs) [<xref ref-type="bibr" rid="scirp.111391-ref19">19</xref>].</p><p>Similar to CREx, EqualTDRL is software that compares two input genomes (e.g., two mt genomes) and calculates different series of tandem duplication random losses (TDRLs) which hypothetically transform the gene order of one genome into the other. It creates a diagonally symmetric map showing all the possible series of TDRLs between the two genomes [<xref ref-type="bibr" rid="scirp.111391-ref27">27</xref>].</p><p>GenomeVx is web-based software which allows the user to create editable, colorful, publication-ready images of circular genomes, such as mitochondrial and plastid genomes as well as large plasmids. 
As input, it takes raw feature positions or GenBank files, which the user can upload to the GenomeVx website [<xref ref-type="bibr" rid="scirp.111391-ref20">20</xref>].</p><table-wrap id="table4" ><label><xref ref-type="table" rid="table4">Table 4</xref></label><caption><title> Existing software amenable to comparative depiction of organelle genomes and gene order</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Software</th><th align="center" valign="middle" >Description</th><th align="center" valign="middle" >Website</th></tr></thead><tr><td align="center" valign="middle" >CGAP</td><td align="center" valign="middle" >Plastid genome collection, visualization, content comparison, and annotation tool</td><td align="center" valign="middle" >http://www.herbbol.org:8080/chloroplast</td></tr><tr><td align="center" valign="middle" >CpGAVAS/ CpGAVAS2</td><td align="center" valign="middle" >Integrated web server for annotating, visualizing, analyzing and submitting plastid genomes to GenBank</td><td align="center" valign="middle" >http://www.herbalgenomics.org/cpgavas; http://47.96.249.172:16019/analyzer/home</td></tr><tr><td align="center" valign="middle" >CpGDB</td><td align="center" valign="middle" >Database storing chloroplast genomes, gene sequences and annotation</td><td align="center" valign="middle" >http://gndu.ac.in/CpGDB/</td></tr><tr><td align="center" valign="middle" >CREx/CREx2</td><td align="center" valign="middle" >Visualizes most parsimonious gene order rearrangements between two species</td><td align="center" valign="middle" >http://pacosy.informatik.uni-leipzig.de/crex; http://pacosy.informatik.uni-leipzig.de/271-0-CREx2.html</td></tr><tr><td align="center" valign="middle" >DOGMA</td><td align="center" valign="middle" >Gene annotation server for both mitogenomes and plastid genomes</td><td align="center" valign="middle" >https://dogma.ccbb.utexas.edu/</td></tr><tr><td align="center" valign="middle" 
>GeSeq</td><td align="center" valign="middle" >Fast, high-quality HMM-based organelle genome annotator, mainly for plastid genomes</td><td align="center" valign="middle" >https://chlorobox.mpimp-golm.mpg.de/geseq.html</td></tr><tr><td align="center" valign="middle" >EqualTDRL</td><td align="center" valign="middle" >Visualizes hypothetical series of gene order rearrangements between two species</td><td align="center" valign="middle" >http://pacosy.informatik.uni-leipzig.de/269-0-EqualTdrl.html</td></tr><tr><td align="center" valign="middle" >GenomeVx</td><td align="center" valign="middle" >Gene order visualization on small chromosomes and large plasmids</td><td align="center" valign="middle" >http://wolfe.ucd.ie/genomevx/</td></tr><tr><td align="center" valign="middle" >Mitofish/ Mitoannotator</td><td align="center" valign="middle" >Annotation database and pipeline for fish mitogenomes</td><td align="center" valign="middle" >http://mitofish.aori.u-tokyo.ac.jp/</td></tr><tr><td align="center" valign="middle" >MITOS</td><td align="center" valign="middle" >Metazoan mitogenome gene annotation server</td><td align="center" valign="middle" >http://mitos.bioinf.uni-leipzig.de/index.py</td></tr></tbody></table></table-wrap><p>Mitofish is a database of annotations and re-annotations of the mitochondrial genomes of a large number of fish species. Its companion program, Mitoannotator, is an extremely rapid, high quality annotation pipeline for fish mitogenomes [<xref ref-type="bibr" rid="scirp.111391-ref28">28</xref>].</p><p>The MITOchondrial genome annotation Server (MITOS) is a well-known, high-quality mitochondrial genome annotation pipeline. MITOS uses standardized gene names and gene boundary designations. It uses BLAST to compare existing mitochondrial proteins to the raw genome sequence to annotate the location of these genes. It also has a web server where users can upload their raw genome sequences. 
After selecting a translation code, the server will calculate and present results for the user. These results include a tabular summary of annotated genetic elements and a visualization of genes within the uploaded genome sequence. The results can then be downloaded by the user [<xref ref-type="bibr" rid="scirp.111391-ref29">29</xref>] [<xref ref-type="bibr" rid="scirp.111391-ref30">30</xref>]. DOGMA is another web service, which uses BLAST against internal databases to detect protein-coding genes and rRNAs, and uses tRNAscan-SE to discover tRNAs; however, according to Guyeux et al. [<xref ref-type="bibr" rid="scirp.111391-ref31">31</xref>], it is outdated. DOGMA and MITOS both use metazoan database data [<xref ref-type="bibr" rid="scirp.111391-ref29">29</xref>] [<xref ref-type="bibr" rid="scirp.111391-ref32">32</xref>].</p><p>For plastid genomes, the Chloroplast Genome Database (CpGDB) stores plastid genome and individual gene sequences, as well as annotation records, for 3823 species [<xref ref-type="bibr" rid="scirp.111391-ref33">33</xref>]. Plastid genome annotation tools include CpGAVAS (Chloroplast Genome Annotation, Visualization, Analysis, and GenBank Submission) and CpGAVAS2 [<xref ref-type="bibr" rid="scirp.111391-ref34">34</xref>] [<xref ref-type="bibr" rid="scirp.111391-ref35">35</xref>], CGAP (Chloroplast Genome Annotation Platform) [<xref ref-type="bibr" rid="scirp.111391-ref36">36</xref>], and GeSeq [<xref ref-type="bibr" rid="scirp.111391-ref37">37</xref>].</p><p>The ChloroMitoSSRDB database is an open-source repository of perfect and imperfect microsatellites, two to six nucleotides long. Information includes the position of repeats, size, motif and length polymorphisms. 
The repeat sequences are hyperlinked to annotated gene regions at NCBI [<xref ref-type="bibr" rid="scirp.111391-ref38">38</xref>].</p><p>The current software differs from all the previous methods in that it clusters the organelle genomes based on both gene order and sequence similarity. It then draws a linear organelle genome map for each cluster and species in the study, showing the position of each gene. The software also produces accompanying statistics files and heatmaps showing species relationships based on gene order and sequence similarity. Other software calculates genome distance as the number of rearrangements needed to transform the organelle genome of one species into that of another [<xref ref-type="bibr" rid="scirp.111391-ref39">39</xref>]. The present software calculates this distance from the difference between the index of each gene in the gene order of the two species.</p></sec><sec id="s3_2"><title>3.2. Analysis of Cephalopod Mitogenomes</title><p>Since the main goal of this paper is to present a new bioinformatics tool, the use of the ODL software is showcased here in the mitochondrial mapping of cephalopods. Cephalopods form a class in the phylum Mollusca (mollusks). The class has two subclasses, Nautiloidea (nautiluses) and Coleoidea, the latter of which is made up of two superorders, Decapodiformes (squids and cuttlefish) and Octopodiformes (octopuses and argonauts). The mitogenomes of 51 species, as well as that of the outlier Danio rerio, were analyzed to discover putative monophyletic groups.</p><p>The software identified ten putative monophyletic groups, in addition to the outlier species Danio rerio. A list of species and accession numbers can be found in Supplementary File 1. The statistics for each cluster can be seen in <xref ref-type="table" rid="table5">Table 5</xref>. The matrices, clusters, and statistics for the mitochondrial gene order, sequence similarity, and combined analyses can be found in Supplementary Files 2-4.
Only the results for the combined matrix are reported here. The mitogenome map can be seen in <xref ref-type="fig" rid="fig3">Figure 3</xref>, and the corresponding combined heatmap can be seen in <xref ref-type="fig" rid="fig4">Figure 4</xref>. The Hopkins clustering statistic for the combined matrix is 0.81, which indicates that the data cluster well. Supplementary Figures 1-3 show the silhouette plots for clustering based on gene order similarity, mitogenome sequence similarity, and the combination of both.</p><table-wrap id="table5" ><label><xref ref-type="table" rid="table5">Table 5</xref></label><caption><title>Clustering statistics for predicted cephalopod clusters with more than one member, based on mt genome sequence similarity and gene order similarity</title></caption><table><tbody><thead><tr><th align="center" valign="middle" >Cluster</th><th align="center" valign="middle" >No. species</th><th align="center" valign="middle" >Mean DNA length &#177; SD (bp)</th><th align="center" valign="middle" >Min</th><th align="center" valign="middle" >Mean</th><th align="center" valign="middle" >Max</th><th align="center" valign="middle" >SEM</th><th align="center" valign="middle" >p-value</th></tr></thead><tr><td align="center" valign="middle" >1. Nautilus</td><td align="center" valign="middle" >3</td><td align="center" valign="middle" >16,027.667 &#177; 296.598</td><td align="center" valign="middle" >0.801</td><td align="center" valign="middle" >0.813</td><td align="center" valign="middle" >0.834</td><td align="center" valign="middle" >1.1E-02</td><td align="center" valign="middle" >0.000421</td></tr><tr><td align="center" valign="middle" >2. 
Argonauta</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >15,665.500 &#177; 656.902</td><td align="center" valign="middle" >0.75</td><td align="center" valign="middle" >0.75</td><td align="center" valign="middle" >0.75</td><td align="center" valign="middle" >NA</td><td align="center" valign="middle" >1.42E-68</td></tr><tr><td align="center" valign="middle" >3. Octopus</td><td align="center" valign="middle" >17</td><td align="center" valign="middle" >15,794.824 &#177; 196.267</td><td align="center" valign="middle" >0.756</td><td align="center" valign="middle" >0.819</td><td align="center" valign="middle" >0.903</td><td align="center" valign="middle" >7E-03</td><td align="center" valign="middle" >1.33E-211</td></tr><tr><td align="center" valign="middle" >4. Squids I</td><td align="center" valign="middle" >4</td><td align="center" valign="middle" >20,303.75 &#177; 34.798</td><td align="center" valign="middle" >0.629</td><td align="center" valign="middle" >0.68</td><td align="center" valign="middle" >0.725</td><td align="center" valign="middle" >0.023</td><td align="center" valign="middle" >0.000539</td></tr><tr><td align="center" valign="middle" >6. Squids II</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >20,564.500 &#177; 405.172</td><td align="center" valign="middle" >0.692</td><td align="center" valign="middle" >0.692</td><td align="center" valign="middle" >0.692</td><td align="center" valign="middle" >NA</td><td align="center" valign="middle" >3.97E-28</td></tr><tr><td align="center" valign="middle" >7. 
Squids III</td><td align="center" valign="middle" >9</td><td align="center" valign="middle" >17,244.889 &#177; 255.089</td><td align="center" valign="middle" >0.729</td><td align="center" valign="middle" >0.821</td><td align="center" valign="middle" >0.899</td><td align="center" valign="middle" >1.6E-02</td><td align="center" valign="middle" >1.46E-31</td></tr><tr><td align="center" valign="middle" >8. Squids IV</td><td align="center" valign="middle" >2</td><td align="center" valign="middle" >20,200.500 &#177; 152.028</td><td align="center" valign="middle" >0.66</td><td align="center" valign="middle" >0.66</td><td align="center" valign="middle" >0.66</td><td align="center" valign="middle" >NA</td><td align="center" valign="middle" >3.22E-40</td></tr><tr><td align="center" valign="middle" >10. Sepia</td><td align="center" valign="middle" >10</td><td align="center" valign="middle" >16,197.500 &#177; 26.488</td><td align="center" valign="middle" >0.801</td><td align="center" valign="middle" >0.846</td><td align="center" valign="middle" >0.918</td><td align="center" valign="middle" >9E-03</td><td align="center" valign="middle" >9.05E-66</td></tr></tbody></table></table-wrap></sec><sec id="s3_3"><title>3.3. The Ten Cephalopod Clusters</title><p>Argonauta and Nautilus form two separate clusters. Nautilus has a notably large non-coding region between the tRNAs for glutamine (Q) and threonine (T) [<xref ref-type="bibr" rid="scirp.111391-ref40">40</xref>]. It also has a mean GC% of 40% &#177; 0.26%, whereas for Argonauta this value is 22.9% &#177; 0.05%.</p><p>Octopodidae, with 17 species, is the largest and most significant monophyletic group, with a distinctive morphology of eight arms. These species come from the genera Amphioctopus, Callistoctopus, Cistopus, Hapalochlaena, and Octopus. The 17 octopus species in this study have a very conserved gene order and a genome length between 15,479 bp and 16,084 bp.
The octopus genomes have a mean GC% of 24.5% &#177; 0.88%. The enigmatic Vampyroteuthis infernalis [<xref ref-type="bibr" rid="scirp.111391-ref41">41</xref>] is classified as a member of this group.</p><p>Ten species of cuttlefish from the genera Sepia and Sepiella also form a monophyletic group. Their gene order is very conserved, but very different from that of all other cephalopod groups. The genome length is also very conserved, between 16,163 and 16,244 bp. The GC% is 24.6% &#177; 1.47%. Takumiya et al. [<xref ref-type="bibr" rid="scirp.111391-ref42">42</xref>] also found significant differences between coleoid cephalopods based on mtDNA, separating cuttlefish from all other groups.</p><p>The remaining clusters are several squid groups, comprising four monophyletic groups and two species that do not belong anywhere else. The first of the four groups is Loliginidae (pencil squids), with species from the genera Doryteuthis, Heterololigo, Loliolus, Sepioteuthis, and Uroteuthis. The species Sepioteuthis lessoniana is an outlier due to the translocation of three tRNA sequences (for isoleucine, valine, and tryptophan) and of the l-rRNA and s-rRNA genes in its mt genome. Yokobori et al. [<xref ref-type="bibr" rid="scirp.111391-ref43">43</xref>] also found that this squid group is monophyletic.</p><p>Four species, Architeuthis dux, Dosidicus gigas, Sthenoteuthis oualaniensis, and Todarodes pacificus, form a separate group. These four species have a mean GC% of 29.8% &#177; 1.4% and a genome length from 20,254 to 20,331 bp. Xu et al. [<xref ref-type="bibr" rid="scirp.111391-ref44">44</xref>] also separate them from all other squids.</p><p>Bathyteuthis abyssicola (the deep-sea squid) belongs to its own family, Bathyteuthidae, in the suborder Oegopsina. It possibly even belongs to its own order, according to Uribe and Zardoya [<xref ref-type="bibr" rid="scirp.111391-ref45">45</xref>]. Kawashima et al. 
[<xref ref-type="bibr" rid="scirp.111391-ref46">46</xref>] found that the mitogenome structure of Semirossia patagonica (Patagonian bobtail squid), from the family Sepiolidae, is unique among Decapodiformes. For example, as in Nautilus species, the ATP6 and ATP8 genes are not adjacent to one another. Ommastrephes bartramii (the neon flying squid) and Watasenia scintillans (the firefly squid), as well as Chiroteuthis picteti and Illex argentinus, cluster together into putative groups.</p></sec></sec><sec id="s4"><title>4. Conclusion</title><p>Studying organelle genomes is a new and expanding field of genomics research. Several tools have been devised to determine phylogeny based on gene order. Several tools and databases exist to annotate and visualize organelle genomes and to store organelle sequences. The present software, Organelle DNA Lineages, was designed to cluster species into monophyletic groups based on genome sequence similarity and gene order, and to visualize the results. The software was run on a set of 51 cephalopod species and found ten clusters. This software can help expand the field of organelle genomics.</p></sec><sec id="s5"><title>Acknowledgements</title><p>No acknowledgements. The author reports no conflicts of interest and no funding.</p></sec><sec id="s6"><title>Conflicts of Interest</title><p>The author declares no conflicts of interest regarding the publication of this paper.</p></sec><sec id="s7"><title>Cite this paper</title><p>Cserhati, M. (2021) Prediction of Monophyletic Groups Based on Gene Order and Sequence Similarity in Organelle DNA. American Journal of Molecular Biology, 11, 83-99. 
https://doi.org/10.4236/ajmb.2021.114008</p></sec><sec id="s8"><title>Abbreviations and Acronyms</title><p>ATP: adenosine triphosphate</p><p>BLAST: Basic Local Alignment Search Tool</p><p>CpGAVAS: Chloroplast Genome Annotation, Visualization, Analysis, and GenBank Submission</p><p>CpGDB: Chloroplast Genome Database</p><p>CREx: Common interval Rearrangement Explorer</p><p>DOGMA: Dual Organellar GenoMe Annotator</p><p>GC%: GC content</p><p>MITOS: MITOchondrial genome annotation Server</p><p>MSA: multiple sequence alignment</p><p>mt: mitochondrial</p><p>mtDNA: mitochondrial DNA</p><p>NCBI: National Center for Biotechnology Information</p><p>oDNA: organelle DNA</p><p>ODL: Organelle DNA Lineages</p><p>OGRe: Organellar Genome Retrieval</p><p>OXPHOS: oxidative phosphorylation</p><p>Pl: plastid</p><p>plDNA: plastid DNA</p><p>rRNA: ribosomal RNA</p><p>SEM: standard error of the mean</p><p>SSR: simple sequence repeat</p><p>TCA: tricarboxylic acid</p><p>TDRL: tandem duplication random loss</p><p>tRNA: transfer RNA</p><p>tRNAdb: transfer RNA database</p></sec></body><back><ref-list><title>References</title><ref id="scirp.111391-ref1"><label>1</label><mixed-citation publication-type="other" xlink:type="simple">Boore, J.L., Collins, T.M., Stanton, D., Daehler, L.L. and Brown, W.M. (1995) Deducing the Pattern of Arthropod Phylogeny from Mitochondrial DNA Rearrangements. Nature, 376, 163-165. https://doi.org/10.1038/376163a0</mixed-citation></ref><ref id="scirp.111391-ref2"><label>2</label><mixed-citation publication-type="other" xlink:type="simple">Boore, J.L., Lavrov, D.V. and Brown, W.M. (1998) Gene Translocation Links Insects and Crustaceans. Nature, 392, 667-668. https://doi.org/10.1038/33577</mixed-citation></ref><ref id="scirp.111391-ref3"><label>3</label><mixed-citation publication-type="other" xlink:type="simple">Aguileta, G., de Vienne, D.M., Ross, O.N., Hood, M.E., Giraud, T., Petit, E. and Gabaldón, T. 
(2014) High Variability of Mitochondrial Gene Order among Fungi. Genome Biology and Evolution, 6, 451-465. https://doi.org/10.1093/gbe/evu028</mixed-citation></ref><ref id="scirp.111391-ref4"><label>4</label><mixed-citation publication-type="other" xlink:type="simple">Ravi, V., Khurana, J.P., Tyagi, A.K. and Khurana, P. (2008) An Update on Chloroplast Genomes. Plant Systematics and Evolution, 271, 101-122.  
https://doi.org/10.1007/s00606-007-0608-0</mixed-citation></ref><ref id="scirp.111391-ref5"><label>5</label><mixed-citation publication-type="other" xlink:type="simple">Friedman, J.R. and Nunnari, J. (2014) Mitochondrial Form and Function. Nature, 505, 335-343. https://doi.org/10.1038/nature12985</mixed-citation></ref><ref id="scirp.111391-ref6"><label>6</label><mixed-citation publication-type="other" xlink:type="simple">Suomalainen, A. and Battersby, B.J. (2018) Mitochondrial Diseases: The Contribution of Organelle Stress Responses to Pathology. Nature Reviews Molecular Cell Biology, 19, 77-92. https://doi.org/10.1038/nrm.2017.66</mixed-citation></ref><ref id="scirp.111391-ref7"><label>7</label><mixed-citation publication-type="other" xlink:type="simple">Yan, C., Duanmu, X., Zeng, L., Liu, B. and Song, Z. (2019) Mitochondrial DNA: Distribution, Mutations, and Elimination. Cells, 8, 379. 
https://doi.org/10.3390/cells8040379</mixed-citation></ref><ref id="scirp.111391-ref8"><label>8</label><mixed-citation publication-type="other" xlink:type="simple">Supinski, G.S., Schroder, E.A. and Callahan, L.A. (2020) Mitochondria and Critical Illness. Chest, 157, 310-322.</mixed-citation></ref><ref id="scirp.111391-ref9"><label>9</label><mixed-citation publication-type="other" xlink:type="simple">Anderson, S., Bankier, A.T., Barrell, B.G., de Bruijn, M.H., Coulson, A.R., Drouin, J., Eperon, I.C., Nierlich, D.P., Roe, B.A., Sanger, F., Schreier, P.H., Smith, A.J., Staden, R. and Young, I.G. (1981) Sequence and Organization of the Human Mitochondrial Genome. Nature, 290, 457-465. https://doi.org/10.1038/290457a0</mixed-citation></ref><ref id="scirp.111391-ref10"><label>10</label><mixed-citation publication-type="other" xlink:type="simple">Goodman, S.R. (2007) Medical Cell Biology. Third Edition, Academic Press, Cambridge.</mixed-citation></ref><ref id="scirp.111391-ref11"><label>11</label><mixed-citation publication-type="other" xlink:type="simple">Hirakawa, Y. and Watanabe, A. (2019) Organellar DNA Polymerases in Complex Plastid-Bearing Algae. Biomolecules, 9, 140. https://doi.org/10.3390/biom9040140</mixed-citation></ref><ref id="scirp.111391-ref12"><label>12</label><mixed-citation publication-type="other" xlink:type="simple">Moriyama, T., Terasawa, K. and Sato, N. (2011) Conservation of POPs, the Plant Organellar DNA Polymerases, in Eukaryotes. Protist, 162, 177-187.  
https://doi.org/10.1016/j.protis.2010.06.001</mixed-citation></ref><ref id="scirp.111391-ref13"><label>13</label><mixed-citation publication-type="other" xlink:type="simple">Jiao, Y., Jia, H.M., Li, X.W., Chai, M.L., Jia, H.J., Chen, Z., Wang, G.Y., Chai, C.Y., van de Weg, E. and Gao, Z.S. (2012) Development of Simple Sequence Repeat (SSR) Markers from a Genome Survey of Chinese Bayberry (Myrica rubra). BMC Genomics, 13, 201. https://doi.org/10.1186/1471-2164-13-201</mixed-citation></ref><ref id="scirp.111391-ref14"><label>14</label><mixed-citation publication-type="other" xlink:type="simple">Li, X., Gao, H., Wang, Y., Song, J., Henry, R., Wu, H., Hu, Z., Yao, H., Luo, H., Luo, K., Pan, H. and Chen, S. (2013) Complete Chloroplast Genome Sequence of Magnolia grandiflora and Comparative Analysis with Related Species. Science China. Life Sciences, 56, 189-198.</mixed-citation></ref><ref id="scirp.111391-ref15"><label>15</label><mixed-citation publication-type="other" xlink:type="simple">Zheng, X.-M., Wang, J.R., Feng, L., Liu, S., Pang, H.B., Qi, L., Li, J., Sun, Y., Qiao, W.H., Zhang, L.F., Cheng, Y.L. and Yang, Q.W. (2017) Inferring the Evolutionary Mechanism of the Chloroplast Genome Size by Comparing Whole-Chloroplast Genome Sequences in Seed Plants. Scientific Reports, 7, Article No. 1555.</mixed-citation></ref><ref id="scirp.111391-ref16"><label>16</label><mixed-citation publication-type="other" xlink:type="simple">Wakasugi, T., Tsudzuki, T. and Sugiura, M. (2001) The Genomics of Land Plant Chloroplasts: Gene Content and Alteration of Genomic Information by RNA Editing. Photosynthesis Research, 70, 107-118. 
https://doi.org/10.1023/A:1013892009589</mixed-citation></ref><ref id="scirp.111391-ref17"><label>17</label><mixed-citation publication-type="other" xlink:type="simple">Wolfe, K.H., Li, W.H. and Sharp, P.M. (1987) Rates of Nucleotide Substitution Vary Greatly among Plant Mitochondrial, Chloroplast, and Nuclear DNAs. Proceedings of the National Academy of Sciences of the United States of America, 84, 9054-9058. https://doi.org/10.1073/pnas.84.24.9054</mixed-citation></ref><ref id="scirp.111391-ref18"><label>18</label><mixed-citation publication-type="other" xlink:type="simple">Wise, R.R. (2006) The Diversity of Plastid Form and Function. Advances in Photosynthesis and Respiration Book 23. Springer, Berlin, 3-26.</mixed-citation></ref><ref id="scirp.111391-ref19"><label>19</label><mixed-citation publication-type="other" xlink:type="simple">Bernt, M., Merkle, D., Ramsch, K., Fritzsch, G., Perseke, M., Bernhard, D., Schlegel, M., Stadler, P.F. and Middendorf, M. (2007) CREx: Inferring Genomic Rearrangements Based on Common Intervals. Bioinformatics (Oxford, England), 23, 2957-2958. https://doi.org/10.1093/bioinformatics/btm468</mixed-citation></ref><ref id="scirp.111391-ref20"><label>20</label><mixed-citation publication-type="other" xlink:type="simple">Conant, G.C. and Wolfe, K.H. (2008) GenomeVx: Simple Web-Based Creation of Editable Circular Chromosome Maps. Bioinformatics (Oxford, England), 24, 861-862. https://doi.org/10.1093/bioinformatics/btm598</mixed-citation></ref><ref id="scirp.111391-ref21"><label>21</label><mixed-citation publication-type="other" xlink:type="simple">Jameson, D., Gibson, A.P., Hudelot, C. and Higgs, P.G. (2003) OGRe: A Relational Database for Comparative Analysis of Mitochondrial Genomes. Nucleic Acids Research, 31, 202-206. https://doi.org/10.1093/nar/gkg077</mixed-citation></ref><ref id="scirp.111391-ref22"><label>22</label><mixed-citation publication-type="other" xlink:type="simple">Feijao, P.C., Neiva, L.S., de Azeredo-Espin, A.M. 
and Lessinger, A.C. (2006) AMiGA: The Arthropodan Mitochondrial Genomes Accessible Database. Bioinformatics, 22, 902-903. https://doi.org/10.1093/bioinformatics/btl021</mixed-citation></ref><ref id="scirp.111391-ref23"><label>23</label><mixed-citation publication-type="other" xlink:type="simple">Lupi, R., de Meo, P.D., Picardi, E., D’Antonio, M., Paoletti, D., Castrignanò, T., Pesole, G. and Gissi, C. (2010) MitoZoa: A Curated Mitochondrial Genome Database of Metazoans for Comparative Genomics Studies. Mitochondrion, 10, 192-199.  
https://doi.org/10.1016/j.mito.2010.01.004</mixed-citation></ref><ref id="scirp.111391-ref24"><label>24</label><mixed-citation publication-type="other" xlink:type="simple">Pütz, J., Dupuis, B., Sissler, M. and Florentz, C. (2007) Mamit-tRNA, a Database of Mammalian Mitochondrial tRNA Primary and Secondary Structures. RNA (New York, N.Y.), 13, 1184-1190.</mixed-citation></ref><ref id="scirp.111391-ref25"><label>25</label><mixed-citation publication-type="other" xlink:type="simple">de Vasconcelos, A.T., Guimaraes, A.C., Castelletti, C.H., Caruso, C.S., Ribeiro, C., Yokaichiya, F., Armoa, G.R., Pereira, G., da Silva, I.T., Schrago, C.G., Fernandes, A.L., da Silveira, A.R., Carneiro, A.G., Carvalho, B.M., Viana, C.J., Gramkow, D., Lima, F.J., Corrêa, L.G., Mudado, M., Nehab-Hess, P., et al. (2005) MamMiBase: A Mitochondrial Genome Database for Mammalian Phylogenetic Studies. Bioinformatics (Oxford, England), 21, 2566-2567. 
https://doi.org/10.1093/bioinformatics/bti326</mixed-citation></ref><ref id="scirp.111391-ref26"><label>26</label><mixed-citation publication-type="other" xlink:type="simple">Bernt, M., Braband, A., Middendorf, M., Misof, B., Rota-Stabelli, O. and Stadler, P.F. (2013) Bioinformatics Methods for the Comparative Analysis of Metazoan Mitochondrial Genome Sequences. Molecular Phylogenetics and Evolution, 69, 320-327. https://doi.org/10.1016/j.ympev.2012.09.019</mixed-citation></ref><ref id="scirp.111391-ref27"><label>27</label><mixed-citation publication-type="other" xlink:type="simple">Hartmann, T., Bernt, M. and Middendorf, M. (2018) EqualTDRL: Illustrating Equivalent Tandem Duplication Random Loss Rearrangements. BMC Bioinformatics, 19, 192. https://doi.org/10.1186/s12859-018-2170-x</mixed-citation></ref><ref id="scirp.111391-ref28"><label>28</label><mixed-citation publication-type="other" xlink:type="simple">Iwasaki, W., Fukunaga, T., Isagozawa, R., Yamada, K., Maeda, Y., Satoh, T.P., Sado, T., Mabuchi, K., Takeshima, H., Miya, M. and Nishida, M. (2013) MitoFish and MitoAnnotator: A Mitochondrial Genome Database of Fish with an Accurate and Automatic Annotation Pipeline. Molecular Biology and Evolution, 30, 2531-2540.  
https://doi.org/10.1093/molbev/mst141</mixed-citation></ref><ref id="scirp.111391-ref29"><label>29</label><mixed-citation publication-type="other" xlink:type="simple">Bernt, M., Donath, A., Jühling, F., Externbrink, F., Florentz, C., Fritzsch, G., Pütz, J., Middendorf, M. and Stadler, P.F. (2013) MITOS: Improved De Novo Metazoan Mitochondrial Genome Annotation. Molecular Phylogenetics and Evolution, 69, 313-319. https://doi.org/10.1016/j.ympev.2012.08.023</mixed-citation></ref><ref id="scirp.111391-ref30"><label>30</label><mixed-citation publication-type="other" xlink:type="simple">Donath, A., Jühling, F., Al-Arab, M., Bernhart, S.H., Reinhardt, F., Stadler, P.F., Middendorf, M. and Bernt, M. (2019) Improved Annotation of Protein-Coding Genes Boundaries in Metazoan Mitochondrial Genomes. Nucleic Acids Research, 47, 10543-10552. https://doi.org/10.1093/nar/gkz833</mixed-citation></ref><ref id="scirp.111391-ref31"><label>31</label><mixed-citation publication-type="other" xlink:type="simple">Guyeux, C., Charr, J.C., Tran, H., Furtado, A., Henry, R.J., Crouzillat, D., Guyot, R. and Hamon, P. (2019) Evaluation of Chloroplast Genome Annotation Tools and Application to Analysis of the Evolution of Coffee Species. PLoS ONE, 14, e0216347.</mixed-citation></ref><ref id="scirp.111391-ref32"><label>32</label><mixed-citation publication-type="other" xlink:type="simple">Wyman, S.K., Jansen, R.K. and Boore, J.L. (2004) Automatic Annotation of Organellar Genomes with DOGMA. Bioinformatics (Oxford, England), 20, 3252-3255.  
https://doi.org/10.1093/bioinformatics/bth352</mixed-citation></ref><ref id="scirp.111391-ref33"><label>33</label><mixed-citation publication-type="other" xlink:type="simple">Singh, B.P., Kumar, A., Kaur, H., Singh, H. and Nagpal, A.K. (2020) CpGDB: A Comprehensive Database of Chloroplast Genomes. Bioinformation, 16, 171-175.  
https://doi.org/10.6026/97320630016171</mixed-citation></ref><ref id="scirp.111391-ref34"><label>34</label><mixed-citation publication-type="other" xlink:type="simple">Liu, C., Shi, L., Zhu, Y., Chen, H., Zhang, J., Lin, X. and Guan, X. (2012) CpGAVAS, an Integrated Web Server for the Annotation, Visualization, Analysis, and GenBank Submission of Completely Sequenced Chloroplast Genome Sequences. BMC Genomics, 13, 715. https://doi.org/10.1186/1471-2164-13-715</mixed-citation></ref><ref id="scirp.111391-ref35"><label>35</label><mixed-citation publication-type="other" xlink:type="simple">Shi, L., Chen, H., Jiang, M., Wang, L., Wu, X., Huang, L. and Liu, C. (2019) CPGAVAS2, an Integrated Plastome Sequence Annotator and Analyzer. Nucleic Acids Research, 47, W65-W73. https://doi.org/10.1093/nar/gkz345</mixed-citation></ref><ref id="scirp.111391-ref36"><label>36</label><mixed-citation publication-type="other" xlink:type="simple">Cheng, J., Zeng, X., Ren, G. and Liu, Z. (2013) CGAP: A New Comprehensive Platform for the Comparative Analysis of Chloroplast Genomes. BMC Bioinformatics, 14, 95. https://doi.org/10.1186/1471-2105-14-95</mixed-citation></ref><ref id="scirp.111391-ref37"><label>37</label><mixed-citation publication-type="other" xlink:type="simple">Tillich, M., Lehwark, P., Pellizzer, T., Ulbricht-Jones, E.S., Fischer, A., Bock, R. and Greiner, S. (2017) GeSeq—Versatile and Accurate Annotation of Organelle Genomes. Nucleic Acids Research, 45, W6-W11.</mixed-citation></ref><ref id="scirp.111391-ref38"><label>38</label><mixed-citation publication-type="other" xlink:type="simple">Sablok, G., Mudunuri, S.B., Patnana, S., Popova, M., Fares, M.A. and Porta, N.L. (2013) ChloroMitoSSRDB: Open Source Repository of Perfect and Imperfect Repeats in Organelle Genomes for Evolutionary Genomics. DNA Research, 20, 127-133. 
https://doi.org/10.1093/dnares/dss038</mixed-citation></ref><ref id="scirp.111391-ref39"><label>39</label><mixed-citation publication-type="other" xlink:type="simple">Bernt, M. and Middendorf, M. (2011) A Method for Computing an Inventory of Metazoan Mitochondrial Gene Order Rearrangements. BMC Bioinformatics, 12, S6.</mixed-citation></ref><ref id="scirp.111391-ref40"><label>40</label><mixed-citation publication-type="other" xlink:type="simple">Boore, J.L. (2006) The Complete Sequence of the Mitochondrial Genome of Nautilus macromphalus (Mollusca: Cephalopoda). BMC Genomics, 7, 182.  
https://doi.org/10.1186/1471-2164-7-182</mixed-citation></ref><ref id="scirp.111391-ref41"><label>41</label><mixed-citation publication-type="other" xlink:type="simple">Robison, B.H., Reisenbichler, K.R., Hunt, J.C. and Haddock, S.H. (2003) Light Production by the Arm Tips of the Deep-Sea Cephalopod Vampyroteuthis infernalis. The Biological Bulletin, 205, 102-109. https://doi.org/10.2307/1543231</mixed-citation></ref><ref id="scirp.111391-ref42"><label>42</label><mixed-citation publication-type="other" xlink:type="simple">Takumiya, M., Kobayashi, M., Tsuneki, K. and Furuya, H. (2005) Phylogenetic Relationships among Major Species of Japanese Coleoid Cephalopods (Mollusca: Cephalopoda) Using Three Mitochondrial DNA Sequences. Zoological Science, 22, 147-155. https://doi.org/10.2108/zsj.22.147</mixed-citation></ref><ref id="scirp.111391-ref43"><label>43</label><mixed-citation publication-type="other" xlink:type="simple">Yokobori, S., Lindsay, D.J., Yoshida, M., Tsuchiya, K., Yamagishi, A., Maruyama, T. and Oshima, T. (2007) Mitochondrial Genome Structure and Evolution in the Living Fossil Vampire Squid, Vampyroteuthis infernalis, and Extant Cephalopods. Molecular Phylogenetics and Evolution, 44, 898-910.  
https://doi.org/10.1016/j.ympev.2007.05.009</mixed-citation></ref><ref id="scirp.111391-ref44"><label>44</label><mixed-citation publication-type="other" xlink:type="simple">Xu, L., Wang, X. and Du, F. (2020) The Complete Mitochondrial Genome of Loliginid Squid (Uroteuthis chinensis) from Minnan-Taiwan Bank Fishing Ground. Mitochondrial DNA. Part B, Resources, 5, 428-429.  
https://doi.org/10.1080/23802359.2019.1703599</mixed-citation></ref><ref id="scirp.111391-ref45"><label>45</label><mixed-citation publication-type="other" xlink:type="simple">Uribe, J.E. and Zardoya, R. (2017) Revisiting the Phylogeny of Cephalopoda Using Complete Mitochondrial Genomes. Journal of Molluscan Studies, 83, 133-144.  
https://doi.org/10.1093/mollus/eyw052</mixed-citation></ref><ref id="scirp.111391-ref46"><label>46</label><mixed-citation publication-type="other" xlink:type="simple">Kawashima, Y., Nishihara, H., Akasaki, T., Nikaido, M., Tsuchiya, K., Segawa, S. and Okada, N. (2013) The Complete Mitochondrial Genomes of Deep-Sea Squid (Bathyteuthis abyssicola), Bob-Tail Squid (Semirossia patagonica) and Four Giant Cuttlefish (Sepia apama, S. latimanus, S. lycidas and S. pharaonis), and Their Application to the Phylogenetic Analysis of Decapodiformes. Molecular Phylogenetics and Evolution, 69, 980-993.</mixed-citation></ref></ref-list></back></article>