
Google Summer of Code 2014 Ideas

From Open Bioinformatics Foundation
Revision as of 16:41, 14 February 2014 by EricTalevich (BioInterchange: Convert and Exchange Biological File Formats using RESTful web service: Copied from the other wiki page)

Interested mentors and students should subscribe to the OBF/GSoC mailing list and announce their interest - that is the way we can track what is happening.

Mentor names and project ideas are hosted on each member project's wiki on a dedicated Google Summer of Code page. See each of the member projects, linked below, for more details about any project:

Cross-project ideas

BioInterchange: Convert and Exchange Biological File Formats using a RESTful web service

Rationale
BioInterchange lets you interchange data using the Resource Description Framework (RDF): it automatically creates RDF triples from TSV, XML, GFF3, GVF, Newick and other file formats common in bioinformatics. It helps you transform your data sets into linked data for sharing and data integration via the command line, a web service, or an API. BioInterchange was conceived and designed during NBDC/DBCLS's BioHackathon 2012. Joachim Baran provided the architecture and RDF serialization implementations. Geraint Duck provided the JSON and XML deserialization implementations and contributed to architecture decisions. Kevin B. Cohen and Michel Dumontier gave guidance on ontology use and applications, and Michel brought forward and extended the Semanticscience Integrated Ontology (SIO). Jin-Dong Kim helped define ontology relationships for RDFizing DBCLS's PubAnnotation category annotations. The main idea of this project is to have a central service that can be used both as a validator and as an interchange service for the different languages.
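To make "creating RDF triples from a GFF3 file" concrete, here is a minimal Python sketch that turns one GFF3 feature line into N-Triples statements. It is an illustration only, not BioInterchange's implementation: the `example.org` namespace, the predicate names, and the function names are hypothetical, and a real converter would use an established ontology and handle the full GFF3 specification.

```python
from urllib.parse import quote, unquote

# Hypothetical namespaces for illustration; not BioInterchange's vocabulary.
BASE = "http://example.org/feature/"
VOCAB = "http://example.org/vocab#"
XSD_INT = "http://www.w3.org/2001/XMLSchema#integer"


def literal(value):
    """Format a string as an N-Triples literal (minimal escaping only)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def gff3_line_to_triples(line):
    """Convert one GFF3 feature line into a list of N-Triples statements."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated GFF3 columns")
    seqid, _source, ftype, start, end, _score, strand, _phase, attrs = cols

    # Column 9 holds URL-encoded key=value pairs separated by ';'.
    attributes = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key] = unquote(value)

    # Use the ID attribute as the subject, or fall back to the coordinates.
    feature_id = attributes.get("ID", f"{seqid}_{start}_{end}")
    subject = f"<{BASE}{quote(feature_id, safe='')}>"

    return [
        f"{subject} <{VOCAB}seqid> {literal(seqid)} .",
        f"{subject} <{VOCAB}type> {literal(ftype)} .",
        f'{subject} <{VOCAB}start> "{int(start)}"^^<{XSD_INT}> .',
        f'{subject} <{VOCAB}end> "{int(end)}"^^<{XSD_INT}> .',
        f"{subject} <{VOCAB}strand> {literal(strand)} .",
    ]
```

Each returned string is one subject-predicate-object statement, so the output for a whole file can be concatenated line by line and loaded into any triple store that reads N-Triples.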
Approach
The project will identify the most commonly used file formats across the languages currently used under OBF, design a RESTful API, and plan an implementation for all the supported languages. BioInterchange was developed in Ruby, but the aim of the project is a language-agnostic system that lets each converter be implemented in the language best suited to it. The service is expected to see high traffic, so an appropriate refactoring or reimplementation using parallel techniques, or languages devoted to parallel programming, may be considered.
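One possible shape for such a RESTful conversion service is sketched below in Python, using only the standard library. Everything here is an assumption made for illustration: the `POST /convert/<source>/<target>` route, the `CONVERTERS` registry, and the sample TSV-to-JSON converter are hypothetical and are not part of BioInterchange. The registry is the point of the sketch: each format pair maps to an independent function, which is the hook where a converter written in another language could be wrapped.

```python
import json

# Registry: (source_format, target_format) -> (converter function, media type).
CONVERTERS = {}


def converter(source, target, media_type):
    """Decorator that registers a converter for one format pair."""
    def register(func):
        CONVERTERS[(source, target)] = (func, media_type)
        return func
    return register


@converter("tsv", "json", "application/json")
def tsv_to_json(text):
    """Turn a TSV table with a header row into a JSON array of objects."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return "[]"
    header = lines[0].split("\t")
    rows = [dict(zip(header, line.split("\t"))) for line in lines[1:]]
    return json.dumps(rows)


def respond(start_response, status, media_type, payload):
    """Send a complete response with the given status and body."""
    start_response(status, [("Content-Type", media_type),
                            ("Content-Length", str(len(payload)))])
    return [payload]


def app(environ, start_response):
    """WSGI app: POST /convert/<source>/<target> with the file as the body."""
    parts = environ.get("PATH_INFO", "").strip("/").split("/")
    is_post = environ.get("REQUEST_METHOD") == "POST"
    if not is_post or len(parts) != 3 or parts[0] != "convert":
        return respond(start_response, "404 Not Found", "text/plain",
                       b"use POST /convert/<source>/<target>\n")

    entry = CONVERTERS.get((parts[1], parts[2]))
    if entry is None:
        return respond(start_response, "400 Bad Request", "text/plain",
                       b"unsupported conversion\n")

    func, media_type = entry
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length).decode("utf-8")
    return respond(start_response, "200 OK", media_type,
                   func(body).encode("utf-8"))
```

To try it locally, the app can be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. Because it is a plain WSGI callable, it can also be run under a multi-process server to address the expected high traffic.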
Difficulty and needed skills
The project is of medium to high difficulty and is aimed at talented students. Previous knowledge of Ruby or another scripting language is preferred, and flexibility in learning other languages is required.
The project requires
Knowledge of advanced programming languages and metaprogramming, and some familiarity with parallelization and web service design.
Mentors
Raoul J.P. Bonnal, Francesco Strozzi, Toshiaki Katayama, Joachim Baran

Native and JVM-based support for the Systems Biology Markup Language (SBML)

BioPerl


NGS-friendly BioPerl code

Convert BioPerl-DB to use DBIx::Class

Major BioPerl Reorganization (Part II)

Perl Run Wrappers for External Programs in a Flash

Lightweight BioPerl modules

Modern BioPerl: BioPerl 2.0 and beyond

Bio::Assembly

Semantic Web Support

BioPython


Indexing & Lazy-loading Sequence Parsers

BioRuby


An ultra-fast scalable RESTful API to query large numbers of genomic variations

BioHaskell

Optimizing a novel, very sensitive alignment method