
Codefest 2010

OpenBio Codefest 2010 will take place July 7th and 8th, 2010 in conjunction with BOSC 2010. This is an opportunity for OpenBio developers from projects like BioPerl, BioJava, Biopython, BioRuby, and EMBOSS to work collaboratively on improving Open Source Bioinformatics code.

Goals

OpenBio projects are typically coordinated remotely, with users from all over the world contributing and organizing themselves through mailing lists and IRC chats. Additionally, contributors work on these projects in their spare time, balancing project work with their day jobs and life away from the computer. The objective of the Codefest is to give these talented developers a chance to be fully focused on the projects for a few days, interacting in real time. Previous Hackathons have been immensely successful at producing new high-quality code and innovative project developments.

The general aim of the Codefest is to improve the accessibility, functionality and interoperability of the existing libraries. The specific goals are determined by the interests of attending members and the input of sponsors. Some current topics of discussion are:

Cloud computing

Improving the presence of OpenBio libraries in distributed computing environments like Amazon Elastic Compute Cloud and Eucalyptus. Ntino has written up an excellent project proposal, available for download in PDF format.

Initial work has started to develop an automated build environment that incorporates the Cloud BioLinux and bioperl-max efforts. See the blog post for full details. Code and configuration files are available from a GitHub repository. The post outlines several areas for improvement that could be targets for focused work at the Codefest.

Semantic Web

The 3rd DBCLS BioHackathon focused on Semantic Web technologies in bioinformatics. As a result, in addition to UniProt, several database providers including DDBJ, PDBj and KEGG have started to generate their data in RDF. This Linked Data can be queried with SPARQL, and the Biopython and BioRuby groups made initial attempts to provide a high-level library for biological queries. We propose to continue this challenge with all OpenBio projects to make a standard interface (query builder, ontology mapping, etc.) for querying major biological SPARQL endpoints and handling RDF files.

To achieve this goal, we also need to develop an integrated/distributed triple store such as BioGateway. In our experience, generating and storing large-scale RDF triples is still a major issue even with standard triple stores. Additionally, we will try to convert biological queries in natural language to SPARQL using NLP technology.
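As a rough illustration of the "query builder" idea mentioned above, here is a minimal sketch in Python of how a library might assemble a SPARQL SELECT query from triple patterns. Everything in it is an assumption made for illustration: the `SelectQuery` class, its method names, and the `ex:` vocabulary under `http://example.org/bio#` are hypothetical and are not part of any existing OpenBio library or real database schema. It only builds the query text and does not contact an endpoint.

```python
"""Hypothetical sketch of a tiny SPARQL SELECT query builder.

Not an existing OpenBio API; names and vocabulary are illustrative only.
"""


class SelectQuery:
    """Collects prefixes, variables and triple patterns, then renders SPARQL text."""

    def __init__(self):
        self.prefixes = {}       # prefix name -> namespace URI
        self.variables = []      # projected variable names, without the leading '?'
        self.patterns = []       # (subject, predicate, object) strings
        self.limit_value = None  # optional LIMIT

    def prefix(self, name, uri):
        self.prefixes[name] = uri
        return self

    def select(self, *variables):
        self.variables.extend(variables)
        return self

    def where(self, subject, predicate, obj):
        self.patterns.append((subject, predicate, obj))
        return self

    def limit(self, n):
        self.limit_value = int(n)
        return self

    def build(self):
        """Render the collected parts as a SPARQL query string."""
        lines = [f"PREFIX {name}: <{uri}>" for name, uri in self.prefixes.items()]
        # With no variables selected, fall back to SELECT *.
        projection = " ".join(f"?{v}" for v in self.variables) or "*"
        lines.append(f"SELECT {projection}")
        lines.append("WHERE {")
        for s, p, o in self.patterns:
            lines.append(f"  {s} {p} {o} .")
        lines.append("}")
        if self.limit_value is not None:
            lines.append(f"LIMIT {self.limit_value}")
        return "\n".join(lines)


def example_query():
    """Build a query over a made-up 'ex:' vocabulary (an assumption, not a real schema)."""
    return (
        SelectQuery()
        .prefix("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#")
        .prefix("ex", "http://example.org/bio#")
        .select("protein", "name")
        .where("?protein", "rdf:type", "ex:Protein")
        .where("?protein", "ex:name", "?name")
        .limit(10)
        .build()
    )


if __name__ == "__main__":
    print(example_query())
```

A shared interface along these lines would let each Bio* project expose the same query-building vocabulary in its own language, with ontology mapping layered on top to translate project-level terms into the predicates each endpoint actually uses.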

Location

Resources

Sponsorship

Space and internet for the Codefest are kindly provided by the Harvard School of Public Health Bioinformatics Core and Massachusetts General Hospital. We are actively seeking sponsors to help supplement the travel, lodging and meal costs for developers. If you’re interested in contributing to Open Source development in Bioinformatics and helping to direct the focus of the Codefest, please contact Brad.

ToDo List

Add your goals and plans for the Codefest here. This is a brainstorming section to help us organize ourselves.

Cloud computing

Work for the current community bioinformatics image (framework on GitHub):

Suggested Additions for Cloud computing image

Perl

Ruby

Python

Java

Data

Semantic Web

Bioperl/BioSQL

Key signing

Attendees

Feel free to add yourself if you are interested. We are happy to have you.

BBQ

After two days of hard work, there will be a celebratory BBQ at Brad’s house in Somerville the evening of July 8th. All are welcome for drinks and whatever magic I can whip up on my little charcoal grill.

The easiest way to get there is via cab. From Mass General Hospital, walk up Cambridge Street a few blocks to the Liberty Hotel where there is a cab stand. Ask the cab driver to take you to Medford Street in Somerville, via McGrath Highway. Partridge Avenue is located off Medford Street, on the right a few blocks after Central Street. I’ll pass out my cell phone number to everyone during the coding sessions in case more directions are needed en route.

Discussion

We welcome any thoughts from interested participants. Please direct discussion to the OpenBio mailing list: open-bio-l@lists.open-bio.org.

For short-lived coordination tasks during the hackathon, an IRC channel has been set up on FreeNode: #codefest

Please use the hashtag #bosc2010 on Twitter to help remote folks follow the discussion.