Bibliographic details
Year of defense: |
2024 |
Main author: |
Alves, Tiago Lubiana |
Advisor: |
Not provided by the institution |
Defense committee: |
Not provided by the institution |
Document type: |
Thesis
|
Access type: |
Open access |
Language: |
eng |
Defending institution: |
Biblioteca Digitais de Teses e Dissertações da USP
|
Graduate program: |
Not provided by the institution
|
Department: |
Not provided by the institution
|
Country: |
Not provided by the institution
|
Keywords in Portuguese: |
|
Access link: |
https://www.teses.usp.br/teses/disponiveis/95/95131/tde-24102024-173719/
|
Abstract: |
With the advancements in the Human Cell Atlas and single-cell omics technologies (such as single-cell RNA-seq), the need for strategies to systematically organize knowledge about cell types has grown. Formal representation systems are essential for tasks such as managing databases and annotating omics datasets. The Wikidata infrastructure, integrated with Wikipedia, offers a valuable resource for bioinformaticians seeking structured biocurated data. We utilized it to develop WikiORA, an interactive web platform for functional enrichment analysis. Since WikiORA and similar tools rely on Wikidata's coverage, we enhanced its content using two leading databases: PanglaoDB, for cell markers, and the Complex Portal, for protein complexes. Alongside integrating external sources, we explored how Wikidata could be enriched via de novo biocuration, creating a system to catalog cell diversity. As a result, we transformed Wikidata into the world's largest multi-species catalog of cell classes, assigning unique identifiers to over 6,000 entries. The curated data are publicly accessible through a graphical interface and a SPARQL endpoint. By adhering to the 5-star Linked Open Data standard, we enabled efficient reuse of the data, supporting the development of a multilingual Cell Ontology and powering automated Wikipedia infoboxes. In summary, this case study highlights Wikidata's value as a knowledge representation tool in the life sciences, particularly for organizing information on human cell diversity. |