UC Santa Barbara hosts dozens of collections, many of which fall under the purview of the Cheadle Center for Biodiversity and Ecological Restoration (CCBER). The center is applying techniques from computer and information science to revolutionize the use of the immense datasets in its research collections. The center’s director, Katja Seltmann, is incorporating these methods into her work on a $4.3 million National Science Foundation (NSF) initiative for investigating terrestrial parasites.
Seltmann is leading the biodiversity informatics component of NSF’s new Terrestrial Parasite Tracker project, which involves 27 research institutions. Arthropods are major carriers of human disease worldwide, but scientists do not know how they will respond to a changing environment. Seltmann and her colleagues are working to structure the available information on these arthropods using ontologies, treating the established system of classifying organisms by scientific name and class as an ontology in its own right. This structure will enable researchers to take advantage of powerful statistical techniques and natural language processing.
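The idea of treating a taxonomic classification as an ontology can be illustrated with a minimal sketch. It is not the project's actual data model: the specimen identifiers and the helper names `lineage`, `is_a` and `specimens_within` are hypothetical, and the taxonomy is simplified by leaving out intermediate ranks. Each taxon points to its parent with an "is a" link, which lets a query for a broad group also pick up records filed under any narrower name.

```python
# Simplified "is a" links from each taxon to its parent taxon.
# Intermediate ranks are omitted to keep the illustration short.
PARENT = {
    "Ixodes scapularis": "Ixodes",
    "Ixodes": "Ixodidae",
    "Ixodidae": "Ixodida",
    "Ixodida": "Arachnida",
    "Arachnida": "Arthropoda",
    "Pulex irritans": "Pulex",
    "Pulex": "Pulicidae",
    "Pulicidae": "Siphonaptera",
    "Siphonaptera": "Insecta",
    "Insecta": "Arthropoda",
}

# Hypothetical specimen records; identifiers are invented for illustration.
SPECIMENS = [
    {"id": "A-001", "taxon": "Ixodes scapularis"},
    {"id": "B-002", "taxon": "Pulex irritans"},
    {"id": "B-003", "taxon": "Ixodes"},
]


def lineage(taxon):
    """Return the taxon followed by each of its ancestors, nearest first."""
    chain = [taxon]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_a(taxon, group):
    """True if the taxon is the group itself or falls anywhere beneath it."""
    return group in lineage(taxon)


def specimens_within(group):
    """Return the ids of all specimens whose taxon belongs to the group."""
    return [s["id"] for s in SPECIMENS if is_a(s["taxon"], group)]


if __name__ == "__main__":
    print(lineage("Pulex irritans"))
    print(specimens_within("Arachnida"))
```

Because the hierarchy is explicit, a record identified only to genus, such as the one filed under Ixodes, is still returned by a search for arachnids, without anyone having to list every name in advance.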
Seltmann plans to use the ontologies on Ontobee, an online data server designed for ontologies. These ontologies were developed for diverse projects and have been used extensively for annotating genomes and understanding model organisms, and hence have a wealth of terms and relationships. Further, tools such as Global Biotic Interactions, along with other databases for managing natural history collections, will be used to overcome the challenge of annotating and sharing complex ideas.
The Terrestrial Parasite Tracker project spans 1.3 million specimens of parasitic arthropods. Much of the important information is qualitative, especially the relationships between parasites and hosts. The ontology and informatics Seltmann is working on will open these collections to new methodologies and allow institutions to easily link their resources for large-scale studies.
The ways in which researchers use data have evolved over the past few decades, requiring more advanced methods to search and share information. Rather than simply observing individual specimens, researchers are also beginning to analyze information across many collections. Some of this information is in text format, but a great deal is less concrete, which makes it difficult to incorporate into a searchable database. Researchers often include qualitative information on a specimen’s record, but differences in word usage between individuals mean this information generally defies analysis using conventional big-data tools.
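The word-usage problem can be made concrete with a small sketch. It assumes a hand-built table of synonyms, which is hypothetical and far simpler than a real ontology; the relationship names, the phrases and the function name `normalize` are all invented for illustration. Mapping each free-text note onto one shared term is what makes notes written by different people countable together.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary: each shared term lists free-text
# phrasings that different collectors might use for the same relationship.
SYNONYMS = {
    "parasite of": ["parasite of", "parasitizing", "feeding on", "attached to"],
    "collected near": ["collected near", "found near"],
}


def normalize(note):
    """Return the shared term matching a free-text note, or None if no phrase matches."""
    text = note.lower()
    for term, phrases in SYNONYMS.items():
        for phrase in phrases:
            # Word boundaries keep a phrase from matching inside a longer word.
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                return term
    return None


# Invented example notes, as they might appear on specimen records.
NOTES = [
    "Tick attached to deer",
    "parasitizing white-tailed deer",
    "Feeding on dog",
    "found near rodent burrow",
    "swept from grass",
]

# Tally the notes by shared term; unmatched notes are counted under None.
COUNTS = Counter(normalize(note) for note in NOTES)

if __name__ == "__main__":
    for term, count in COUNTS.items():
        print(term, count)
```

Three differently worded notes end up under the same term, while the note that matches nothing is kept under None rather than guessed at, so it can be reviewed by a person.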
Seltmann’s goal is to devise methods to make this sort of information accessible to computers as well as humans. Judging by a recently published report detailing how networking specimen data is the next evolution of natural history collections, ontologies will play a large role in enabling researchers to manage and share research collections.
The original article was published in The Current.