Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification

Bibliographic Details
Main Author: Soto Valdés, Ana Zaida
Publication Date: 2017
Format: Master thesis
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10400.8/2854
Summary: The development of high-throughput sequencing technologies has provided ecologists with an efficient approach to assess biodiversity in benthic communities, particularly with the recent advances in metabarcoding technologies using universal primers. However, analyzing such high-throughput data is posing important computational challenges, requiring specialized bioinformatics solutions at different stages during the processing pipeline, such as assembly of paired-end reads, chimera removal, correction of sequencing errors, and clustering of obtained sequences into Molecular Operational Taxonomic Units (MOTUs). The inferred MOTUs can then be used to estimate species diversity, composition, and richness. Although a number of methods have been developed and commonly used to cluster the sequences into MOTUs, relatively little guidance is available on their relative performance. We focused our study in the benthic community from a natural CO2 vent present in the Canary Islands, as it can be used as a natural laboratory in which to investigate the impacts of chronic ocean acidification. Here, we propose a pipeline for studying this community using a fragment of the mitochondrial cytochrome c oxidase I (COI) sequence. We compared two DNA extraction methods, two clustering methods and validated a robust method to eliminate false positives. We found that we can obtain optimal results purifying DNA from 0.3 g of sample. Using the step-by-step aggregation algorithm implemented in SWARM for clustering yields similar results as using the Bayesian clustering method of CROP, in much less time. We introduced the new algorithm MINT (Multiple Intersection of N Tags), in order to eliminate false positives due to random errors produced before or after the sequencing. Our results show that a fully-automated analysis pipeline can be used for assessing biodiversity of marine benthic communities using COI as a metabarcoding marker in an objective, accurate and affordable manner.
id RCAP_16b7c6544582bdf37237c1a376708ab2
oai_identifier_str oai:iconline.ipleiria.pt:10400.8/2854
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidificationEnviromental DNA (eDNA)CO2 ventCOIpipelineclusteringMINTThe development of high-throughput sequencing technologies has provided ecologists with an efficient approach to assess biodiversity in benthic communities, particularly with the recent advances in metabarcoding technologies using universal primers. However, analyzing such high-throughput data is posing important computational challenges, requiring specialized bioinformatics solutions at different stages during the processing pipeline, such as assembly of paired-end reads, chimera removal, correction of sequencing errors, and clustering of obtained sequences into Molecular Operational Taxonomic Units (MOTUs). The inferred MOTUs can then be used to estimate species diversity, composition, and richness. Although a number of methods have been developed and commonly used to cluster the sequences into MOTUs, relatively little guidance is available on their relative performance. We focused our study in the benthic community from a natural CO2 vent present in the Canary Islands, as it can be used as a natural laboratory in which to investigate the impacts of chronic ocean acidification. Here, we propose a pipeline for studying this community using a fragment of the mitochondrial cytochrome c oxidase I (COI) sequence. We compared two DNA extraction methods, two clustering methods and validated a robust method to eliminate false positives. We found that we can obtain optimal results purifying DNA from 0.3 g of sample. Using the step-by-step aggregation algorithm implemented in SWARM for clustering yields similar results as using the Bayesian clustering method of CROP, in much less time. We introduced the new algorithm MINT (Multiple Intersection of N Tags), in order to eliminate false positives due to random errors produced before or after the sequencing. Our results show that a fully-automated analysis pipeline can be used for assessing biodiversity of marine benthic communities using COI as a metabarcoding marker in an objective, accurate and affordable manner.Rodrigues, Américo do PatrocínioWangensteen, Owen S.Repositório IC-OnlineSoto Valdés, Ana Zaida2017-11-27T17:06:01Z2017-11-082017-11-08T00:00:00Zinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/masterThesisapplication/pdfhttp://hdl.handle.net/10400.8/2854urn:tid:201764504enginfo:eu-repo/semantics/openAccessreponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiainstacron:RCAAP2025-02-25T15:09:53Zoai:iconline.ipleiria.pt:10400.8/2854Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireinfo@rcaap.ptopendoar:https://opendoar.ac.uk/repository/71602025-05-28T20:49:13.502059Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiafalse
dc.title.none.fl_str_mv Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification
title Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification
spellingShingle Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification
Soto Valdés, Ana Zaida
Enviromental DNA (eDNA)
CO2 vent
COI
pipeline
clustering
MINT
title_short Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification
title_full Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification
title_fullStr Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification
title_full_unstemmed Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification
title_sort Eukaryotic metabarcoding pipelines for biodiversity assessment of marine benthic communities affected by ocean acidification
author Soto Valdés, Ana Zaida
author_facet Soto Valdés, Ana Zaida
author_role author
dc.contributor.none.fl_str_mv Rodrigues, Américo do Patrocínio
Wangensteen, Owen S.
Repositório IC-Online
dc.contributor.author.fl_str_mv Soto Valdés, Ana Zaida
dc.subject.por.fl_str_mv Enviromental DNA (eDNA)
CO2 vent
COI
pipeline
clustering
MINT
topic Enviromental DNA (eDNA)
CO2 vent
COI
pipeline
clustering
MINT
description The development of high-throughput sequencing technologies has provided ecologists with an efficient approach to assess biodiversity in benthic communities, particularly with the recent advances in metabarcoding technologies using universal primers. However, analyzing such high-throughput data is posing important computational challenges, requiring specialized bioinformatics solutions at different stages during the processing pipeline, such as assembly of paired-end reads, chimera removal, correction of sequencing errors, and clustering of obtained sequences into Molecular Operational Taxonomic Units (MOTUs). The inferred MOTUs can then be used to estimate species diversity, composition, and richness. Although a number of methods have been developed and commonly used to cluster the sequences into MOTUs, relatively little guidance is available on their relative performance. We focused our study in the benthic community from a natural CO2 vent present in the Canary Islands, as it can be used as a natural laboratory in which to investigate the impacts of chronic ocean acidification. Here, we propose a pipeline for studying this community using a fragment of the mitochondrial cytochrome c oxidase I (COI) sequence. We compared two DNA extraction methods, two clustering methods and validated a robust method to eliminate false positives. We found that we can obtain optimal results purifying DNA from 0.3 g of sample. Using the step-by-step aggregation algorithm implemented in SWARM for clustering yields similar results as using the Bayesian clustering method of CROP, in much less time. We introduced the new algorithm MINT (Multiple Intersection of N Tags), in order to eliminate false positives due to random errors produced before or after the sequencing. Our results show that a fully-automated analysis pipeline can be used for assessing biodiversity of marine benthic communities using COI as a metabarcoding marker in an objective, accurate and affordable manner.
publishDate 2017
dc.date.none.fl_str_mv 2017-11-27T17:06:01Z
2017-11-08
2017-11-08T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.8/2854
urn:tid:201764504
url http://hdl.handle.net/10400.8/2854
identifier_str_mv urn:tid:201764504
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833598889663397888