Blogging Birds: Telling Informative Stories About the Lives of Birds from Telemetric Data
Communications of the ACM, March 2019, Vol. 62 No. 3, Pages 68-77
By Advaith Siddharthan, Kapila Ponnamperuma, Chris Mellish, et al.
Blogging Birds is a novel artificial intelligence program that generates creative texts to communicate telemetric data derived from satellite tags fitted to red kites (a medium-size bird of prey) as part of a species reintroduction program in the U.K. We address the challenge of communicating telemetric sensor data in real time by enriching it with meteorological and cartographic data, codifying ecological knowledge to allow creative interpretation of the behavior of individual birds with respect to such enriched data, and dynamically generating informative and engaging data-driven blogs aimed at the general public.
The Blogging Birds system shows that raw satellite tag data can be transformed into fluent, engaging, and informative texts directed at members of the public and in support of nature conservation.
We demonstrated that computers can compete with human experts in generating creative stories from numerical data. Unlike natural language generation systems that generate texts for news reporting or for decision making in the workplace, Blogging Birds' narratives are not entirely factual. Though the system is constrained by the observed data and its ecological domain model, the red kites' reported foraging and social behaviors are only imagined to have taken place. However, including these behaviors in the narratives allows us to communicate red kite ecology to the reader, and the blogs are better appraised as a consequence. Our work thus simultaneously addresses the societal challenges of communicating data effectively and engaging the general public with scientific research.
Blogging Birds composes blogs by combining texts produced through three different types of analysis. The first is a generic factual summarization of telemetric data enriched with location-specific information about weather conditions, habitat type, and geographic features; it can be readily adapted for use in other domains. The second is the processing and ecological interpretation of movement data in the context of home range use. The third is the exploitation of domain knowledge encoded as a collection of rules that help the system imagine possible foraging and social behaviors from environmental and geographic parameters. Much of what is creative and interesting about the blogs derives from the latter two, domain-specific, types of analysis. Although the developed principles apply more broadly, new applications would require construction of knowledge bases pertinent to the domain of use. While this is a clear limitation of our approach, note that our ecological interpretation of movement data in particular would be applicable to several other species. For example, we have already developed a version of Blogging Birds for golden eagles (Aquila chrysaetos) for use by RSPB conservation officers, successfully reusing the second, as well as the first, type of analysis.
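The third type of analysis, rules that map environmental and geographic parameters to imagined behaviors, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Observation` fields, the habitat labels, the rule conditions, and the behavior phrases are invented here and are not the rules or data model used by Blogging Birds.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Observation:
    """One enriched tag reading (hypothetical fields, not the system's schema)."""
    habitat: str    # e.g. "farmland", "woodland" (illustrative labels)
    rain_mm: float  # rainfall around the time of the reading


@dataclass(frozen=True)
class Rule:
    """A condition on an observation paired with a behavior phrase."""
    name: str
    applies: Callable[[Observation], bool]
    behavior: str


# Hypothetical rules, invented for this example only.
RULES = [
    Rule("wet_farmland",
         lambda o: o.habitat == "farmland" and o.rain_mm > 0.0,
         "may have been looking for earthworms brought up by the rain"),
    Rule("dry_farmland",
         lambda o: o.habitat == "farmland" and o.rain_mm == 0.0,
         "may have been searching the fields for carrion"),
    Rule("woodland",
         lambda o: o.habitat == "woodland",
         "may have been roosting with other kites"),
]


def imagine_behavior(obs: Observation,
                     rules: Sequence[Rule] = RULES) -> Optional[str]:
    """Return the behavior of the first matching rule, or None if none match."""
    for rule in rules:
        if rule.applies(obs):
            return rule.behavior
    return None


def describe(name: str, obs: Observation) -> str:
    """Combine the factual part with an imagined behavior, if a rule fires."""
    base = f"{name} spent time in {obs.habitat} habitat"
    behavior = imagine_behavior(obs)
    if behavior is None:
        return base + "."
    return f"{base} and {behavior}."


if __name__ == "__main__":
    print(describe("Kite A", Observation(habitat="farmland", rain_mm=2.0)))
```

The hedged wording ("may have been") mirrors the point made above that such behaviors are imagined rather than observed. When no rule applies, the sketch falls back to the purely factual statement.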
During the course of the project, we also discovered ecologists had limited knowledge of the foraging behavior of red kites in Scotland, as they had not been studied extensively following their relatively recent reintroduction. We could thus encode only a limited number of rules per habitat type. The absence of any large-scale corpus of texts in this domain also meant we could not apply the deep learning methods that are rapidly gaining popularity for generating linguistic variation in computer-generated texts [27]. In future work, we plan to invite Blogging Birds' users to contribute behavioral observations from across the U.K., enabling us to simultaneously curate a larger set of rules and further public engagement.
Finally, the ideas demonstrated here are applicable more generally. Telemetric data is ubiquitous, captured by smartphones and other mobile devices, as well as through GPS sensors embedded in vehicles used by the transportation industry and others. Even albums of time-stamped and geo-tagged photos provide data similar to what we used here. The nature of the blogs, along with the information sources used for data enrichment, would depend on the application, for example, blogging about a holiday or revealing the provenance and journey of a food item in a supermarket. In effect, we have demonstrated it is possible to blog about such data through a process of data enrichment and natural language generation, opening up new avenues for using AI to engage people through data.
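As a rough sketch of how the same two steps, data enrichment followed by text generation, might look for generic time-stamped, geo-tagged records, consider the following. It is not the Blogging Birds pipeline: the `Fix` type, the two-entry `GAZETTEER` (with approximate coordinates), and the sentence templates are all assumptions introduced for this example. Enrichment here is limited to naming the nearest known place and computing great-circle distance with the haversine formula.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class Fix:
    """A time-stamped position, e.g. from a tag or a geo-tagged photo."""
    time: datetime
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Hypothetical gazetteer with approximate coordinates (illustration only).
GAZETTEER: Dict[str, Tuple[float, float]] = {
    "Inverness": (57.4778, -4.2247),
    "Aberdeen": (57.1497, -2.0943),
}


def nearest_place(fix: Fix,
                  gazetteer: Dict[str, Tuple[float, float]] = GAZETTEER) -> str:
    """Enrich a fix with the name of the closest gazetteer entry."""
    return min(gazetteer,
               key=lambda name: haversine_km(fix.lat, fix.lon, *gazetteer[name]))


def summarize(subject: str, fixes: Sequence[Fix]) -> str:
    """Generate one sentence from the enriched fixes using simple templates."""
    if not fixes:
        raise ValueError("at least one fix is required")
    ordered = sorted(fixes, key=lambda f: f.time)
    total_km = sum(haversine_km(a.lat, a.lon, b.lat, b.lon)
                   for a, b in zip(ordered, ordered[1:]))
    start, end = nearest_place(ordered[0]), nearest_place(ordered[-1])
    if start == end:
        return f"{subject} stayed near {start}, covering about {total_km:.0f} km."
    return (f"{subject} travelled from near {start} to near {end}, "
            f"covering about {total_km:.0f} km.")
```

A fuller system would draw on richer sources for enrichment, such as the weather and habitat data described earlier, and on more varied language than two fixed templates.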
About the Authors:
Advaith Siddharthan is a reader in the Knowledge Media Institute at The Open University, Milton Keynes, U.K.
Kapila Ponnamperuma is the lead natural language engineer at Arria NLG plc, Aberdeen, Scotland, U.K.
Chris Mellish, now retired, was a professor of computer science at the University of Aberdeen, Scotland, U.K., at the time this research was conducted.
Chen Zeng was a research assistant on the Blogging Birds Project at the time this research was conducted.
Daniel Heptinstall is a senior international biodiversity adviser on the U.K. government's Joint Nature Conservation Committee.
Annie Robinson was a research fellow on the Blogging Birds Project at the time this research was conducted.
Stuart Benn is a communications officer for the Royal Society for the Protection of Birds in North Scotland.
René van der Wal is a professor of ecology at the University of Aberdeen, Scotland, U.K.