TY - JOUR
T1 - A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1
AU - Reisman, Steven
AU - Hatzopoulos, Thomas
AU - Läufer, Konstantin
AU - Thiruvathukal, George K.
AU - Putonti, Catherine
N1 - Steven Reisman and Thomas Hatzopoulos and Konstantin Läufer and George K. Thiruvathukal and Catherine Putonti, A Polyglot Approach to Bioinformatics Data Integration: A Phylogenetic Analysis of HIV-1, Evolutionary Bioinformatics 2016:12 23-27, doi:10.4137/EBO.S32757.
PY - 2016/01/18/
Y1 - 2016/01/18/
N2 - As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences revealing spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated even despite increased globalization. The approach developed here can easily be customized for any species of interest.
AB - As sequencing technologies continue to drop in price and increase in throughput, new challenges emerge for the management and accessibility of genomic sequence data. We have developed a pipeline for facilitating the storage, retrieval, and subsequent analysis of molecular data, integrating both sequence and metadata. Taking a polyglot approach involving multiple languages, libraries, and persistence mechanisms, sequence data can be aggregated from publicly available and local repositories. Data are exposed in the form of a RESTful web service, formatted for easy querying, and retrieved for downstream analyses. As a proof of concept, we have developed a resource for annotated HIV-1 sequences. Phylogenetic analyses were conducted for >6,000 HIV-1 sequences revealing spatial and temporal factors influence the evolution of the individual genes uniquely. Nevertheless, signatures of origin can be extrapolated even despite increased globalization. The approach developed here can easily be customized for any species of interest.
KW - bioinformatics
KW - HIV
KW - software engineering
KW - functional programming
KW - polyglot programming
KW - RESTful web service
KW - phylogenetics
UR - https://ecommons.luc.edu/cs_facpubs/127
UR - https://ecommons.luc.edu/bioinformatics_facpub/17
U2 - 10.4137/EBO.S32757
DO - 10.4137/EBO.S32757
M3 - Article
VL - 12
JO - Evolutionary Bioinformatics
JF - Evolutionary Bioinformatics
SP - 23
EP - 27
ER -