Heuristics for multiobjective multiple sequence alignment
| Autor(a) principal: | |
|---|---|
| Data de Publicação: | 2016 |
| Outros Autores: | , |
| Tipo de documento: | Artigo |
| Idioma: | eng |
| Título da fonte: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| Texto Completo: | https://hdl.handle.net/10316/108494 https://doi.org/10.1186/s12938-016-0184-z |
Resumo: | Background: Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameters setting, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may provide an undesirable bias in further steps of the analysis and give too simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives to the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment. Methods: We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. Results and conclusions: The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio. |
| id |
RCAP_99cccfea677827c37cfff8ff4e960297 |
|---|---|
| oai_identifier_str |
oai:estudogeral.uc.pt:10316/108494 |
| network_acronym_str |
RCAP |
| network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository_id_str |
https://opendoar.ac.uk/repository/7160 |
| spelling |
Heuristics for multiobjective multiple sequence alignmentMultiple sequence alignmentIterated local searchMultiobjective optimizationAlgorithmsComputational BiologyHeuristicsSequence AlignmentBackground: Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameters setting, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may provide an undesirable bias in further steps of the analysis and give too simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives to the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment. Methods: We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. Results and conclusions: The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio.This work was financed by national funding via FCT and by the European Regional Development Fund (FEDER), through the COMPETE 2020 Operational Program for Competitiveness and Internationalization (POCI).Springer Nature2016-07-15info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/articlehttps://hdl.handle.net/10316/108494https://hdl.handle.net/10316/108494https://doi.org/10.1186/s12938-016-0184-zeng1475-925XAbbasi, MaryamPaquete, LuísPereira, Francisco B.info:eu-repo/semantics/openAccessreponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiainstacron:RCAAP2024-09-30T14:48:10Zoai:estudogeral.uc.pt:10316/108494Portal AgregadorONGhttps://www.rcaap.pt/oai/openaireinfo@rcaap.ptopendoar:https://opendoar.ac.uk/repository/71602025-05-29T05:59:51.841168Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologiafalse |
| dc.title.none.fl_str_mv |
Heuristics for multiobjective multiple sequence alignment |
| title |
Heuristics for multiobjective multiple sequence alignment |
| spellingShingle |
Heuristics for multiobjective multiple sequence alignment Abbasi, Maryam Multiple sequence alignment Iterated local search Multiobjective optimization Algorithms Computational Biology Heuristics Sequence Alignment |
| title_short |
Heuristics for multiobjective multiple sequence alignment |
| title_full |
Heuristics for multiobjective multiple sequence alignment |
| title_fullStr |
Heuristics for multiobjective multiple sequence alignment |
| title_full_unstemmed |
Heuristics for multiobjective multiple sequence alignment |
| title_sort |
Heuristics for multiobjective multiple sequence alignment |
| author |
Abbasi, Maryam |
| author_facet |
Abbasi, Maryam Paquete, Luís Pereira, Francisco B. |
| author_role |
author |
| author2 |
Paquete, Luís Pereira, Francisco B. |
| author2_role |
author author |
| dc.contributor.author.fl_str_mv |
Abbasi, Maryam Paquete, Luís Pereira, Francisco B. |
| dc.subject.por.fl_str_mv |
Multiple sequence alignment Iterated local search Multiobjective optimization Algorithms Computational Biology Heuristics Sequence Alignment |
| topic |
Multiple sequence alignment Iterated local search Multiobjective optimization Algorithms Computational Biology Heuristics Sequence Alignment |
| description |
Background: Aligning multiple sequences arises in many tasks in Bioinformatics. However, the alignments produced by the current software packages are highly dependent on the parameters setting, such as the relative importance of opening gaps with respect to the increase of similarity. Choosing only one parameter setting may provide an undesirable bias in further steps of the analysis and give too simplistic interpretations. In this work, we reformulate multiple sequence alignment from a multiobjective point of view. The goal is to generate several sequence alignments that represent a trade-off between maximizing the substitution score and minimizing the number of indels/gaps in the sum-of-pairs score function. This trade-off gives to the practitioner further information about the similarity of the sequences, from which she could analyse and choose the most plausible alignment. Methods: We introduce several heuristic approaches, based on local search procedures, that compute a set of sequence alignments, which are representative of the trade-off between the two objectives (substitution score and indels). Several algorithm design options are discussed and analysed, with particular emphasis on the influence of the starting alignment and neighborhood search definitions on the overall performance. A perturbation technique is proposed to improve the local search, which provides a wide range of high-quality alignments. Results and conclusions: The proposed approach is tested experimentally on a wide range of instances. We performed several experiments with sequences obtained from the benchmark database BAliBASE 3.0. To evaluate the quality of the results, we calculate the hypervolume indicator of the set of score vectors returned by the algorithms. The results obtained allow us to identify reasonably good choices of parameters for our approach. Further, we compared our method in terms of correctly aligned pairs ratio and columns correctly aligned ratio with respect to reference alignments. Experimental results show that our approaches can obtain better results than TCoffee and Clustal Omega in terms of the first ratio. |
| publishDate |
2016 |
| dc.date.none.fl_str_mv |
2016-07-15 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
| format |
article |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10316/108494 https://hdl.handle.net/10316/108494 https://doi.org/10.1186/s12938-016-0184-z |
| url |
https://hdl.handle.net/10316/108494 https://doi.org/10.1186/s12938-016-0184-z |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
1475-925X |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.publisher.none.fl_str_mv |
Springer Nature |
| publisher.none.fl_str_mv |
Springer Nature |
| dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
| instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| instacron_str |
RCAAP |
| institution |
RCAAP |
| reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
| repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
| repository.mail.fl_str_mv |
info@rcaap.pt |
| _version_ |
1833602542580269056 |