Open Data, Open Doors
A PhD student’s perspective on attending the 17th Annual International Biocuration Conference in Faridabad, India
The Open Bioinformatics Foundation (OBF) Event Fellowship program aims to promote diverse participation at events promoting open-source bioinformatics software development and open science practices in the biological research community. Thea Fennell, a PhD Researcher at the MRC Laboratory of Molecular Biology, was awarded an OBF Event Fellowship to attend the 17th Annual International Biocuration Conference.
“The best use for your data will be thought of by someone else”: Rufus Pollock’s maxim encapsulates the promise of open data in all its power and humility. It is at the core of open science and biocuration. Almost by definition, FAIR data is non-proprietary, yet the work done to generate, curate, and share it is invaluable. Recent proceedings at the 17th Annual International Biocuration Conference (AIBC), hosted at the Indian Biological Data Centre (IBDC) in Faridabad, were a testament to this duality.
Opening the conference, Professor Saurabh Raghuvanshi emphasised the altruistic nature of the field and its principles: integrity, accessibility, and usability. He went on to highlight the milestones reached thanks to this public-minded approach to biodata within India: development of the ZyCoV-D vaccine, widespread HPV vaccine distribution, and establishment of the world’s only Crop Phenome Database. These positive achievements prove what open bioinformatics can do – when there is sufficient political will.
This necessary symbiosis between institutional support and open science principles can also be observed at the international level. In the first conference session, covering the International Sequence Database Collaboration (INSDC), Dr Guy Cochrane highlighted how international data sharing has facilitated breakthroughs worldwide: from global virome discoveries to COVID vaccines to – of course – AlphaFold. This initial session also revealed how data standards depend upon institutional policy at multiple levels.
Professor Masanori Arita touched not only on how academic publishing companies have begun to mandate data submission to the INSDC – connecting primary researchers to data sharing by default – but also on growing links to industry data, such as through liaison with the Japanese and Korean Patent Offices. Meanwhile, the constituent members of the INSDC – the DNA Data Bank of Japan (DDBJ), the European Bioinformatics Institute (EBI), and the National Center for Biotechnology Information (NCBI) – are themselves governmental institutions with a corresponding role to play in shaping data policy.
International collaboration remained a key theme throughout the proceedings in the context of various global concerns, from food security to epidemiological modelling. These collaborations are not without their challenges, as was evident from the conference’s session on global biodata. Decentralised collaboration may be an asset in linking efforts more directly to the coalface of primary research. However, it can also lead to conflict regarding scientific or political priorities. These competing interests are writ large in the competing policies on FAIR data – from the Convention on Biological Diversity (CBD) to the World Health Organisation (WHO) to country-specific guidelines.
Data accessibility appeared to be an area of particular contention. Professor Arita initiated the discussion, delineating how FAIR principles can conflict with guidelines on indigenous data governance – the CARE principles – reflecting an ambiguity in negotiating accessibility for sensitive data over which specific individuals or groups may reasonably claim authority. This thread was taken up again by Dr Ilene Mizrachi, whose US-based work at NCBI is subject to fewer legal restrictions regarding data ownership, and then again by Dr Harpreet Singh, whose work with the Indian Council of Medical Research necessitates policy compromise in the form of “responsible data sharing”.
Despite these difficult and necessary conversations, the overall feeling of the conference was one of optimism. Perhaps, as the speakers took to the stage in the newly built IBDC, they thought of the public good – vaccines, plant breeding, drug development – already enabled by biocuration and the wider world of open bioinformatics. Perhaps, too, they were bolstered by the profession’s reputation for handling complexity, facing challenges not only with intellectual rigour but with informed compassion; what Dr Sarah Davies terms “care-ful work”. Indeed, Dr Davies’ talk – on her ethnographic study of biocurators – provided vital context for the conference (and profession).
Outlining the discipline, Dr Davies described biocuration as “being precise and exact […] to support the people in our community”, which aligned well with the technical scholarship and ethical concern showcased throughout the conference. However, she also observed economic precarity among curators, exacerbated by epistemic exclusion and “fauxtomation”. In the words of one of her study’s subjects, many database users see biocurators as “some kind of gnome, that fills the database at night”, diminishing or erasing their scientific contribution and labour without necessarily intending to do so.
Whilst Dr Davies’ talk was the only one to take biocuration itself as the object of study, her findings were consistent with many of the themes that emerged at this year’s AIBC. Economic sustainability, in particular, was a recurring focus – with environmental and social dimensions acknowledged but shelved in the face of even more pressing matters. It transpires that biocuration is precarious not only in terms of individual curators “living on those short-term contracts”, as demonstrated by Dr Davies, but also at the institutional level. Dr Lynn Schriml, Professor Arita, Dr Frederic Bastian, and Dr Cochrane focused on the latter scale of economic insecurity and how best to ameliorate it. Dr Schriml, specifically, stressed the importance of database longevity – along with reliability and availability/access – as a pillar of effective biodata sharing and re-use.
Altogether, this proved an enlightening exchange for me as an early-career scientist, illuminating the peaks and pitfalls of the funding landscape. It shed light, too, on the transient availability of institutional funding and support. This can, in turn, cause the casualisation of curational labour, contributing to institutions undervaluing biocurators’ expertise and employment. No wonder, then, that so many of the pioneering collaborations covered in this conference began either as solidifications of informal researcher networks – e.g. the Global Biodata Coalition – or under the aegis of more specialised funding bodies – e.g. Bgee, as a Swiss initiative “of national importance”.
As a young researcher – with all the idealism and enthusiasm that implies – I am inclined to hope that biocuration will receive the recognition due to it in time. Certainly, I see no reason why a career dedicated to enabling ethical, open bioinformatics should be any less deserving of professional recognition – let alone a living wage – than any industry researcher or governmental policymaker. That said, shifting the culture of science (not to mention society!) will hardly happen overnight. I am also aware that the issues of ethical conundra and economic precarity are widespread in academia. However, biocuration as a discipline is especially unsung, offering fewer research grants and individual awards than other biological fields. Nor is it a coincidence that most curators were and are women – invisible women, doing invisible work.
Fittingly enough, for a female-dominated field, International Women’s Day fell during the conference itself. Whilst AIBC’s speaker lists have yet to achieve gender parity, the hosts at IBDC nevertheless took the opportunity to give women their flowers, literally. I can truthfully say I was touched by the gesture (and the beautiful bouquet of roses and baby’s breath). As discussed, though, professional accreditation is far more crucial to effect change. It is a first step towards ensuring economic security for databases and their curators, and towards cultivating mutual respect between curators and the communities they serve.
I am immeasurably grateful to this year’s organisers for awarding “Best Flash Talk” to my presentation on the application of biocuration standards for innovation in direct cell reprogramming. My thanks must also go to the Open Bioinformatics Foundation – without being awarded an OBF fellowship, I could never have afforded to present my doctoral work at AIBC. So, in the best tradition of open bioinformatics – epitomised in every speaker and delegate I had the pleasure to meet – I shall do my best to pay it forward. After all, to paraphrase Dr Pollock, the best use for my success is to open doors for someone else. And isn’t that what open science is all about?