AI-enabled scientific discovery in natural world imagery
Institute Seminar by Sara Beery
- Date: Jul 16, 2024
- Time: 10:30 AM - 11:30 AM (Local Time Germany)
- Speaker: Sara Beery
- Dr. Sara Beery is the Homer A. Burnell Career Development Professor in the MIT Faculty of Artificial Intelligence and Decision-Making. She was previously a visiting researcher at Google, working on large-scale urban forest monitoring as part of the Auto Arborist project. She received her PhD in Computing and Mathematical Sciences at Caltech in 2022, where she was advised by Pietro Perona and awarded the Amori Doctoral Prize for her thesis. Her research focuses on building computer vision methods that enable global-scale environmental and biodiversity monitoring across data modalities, tackling real-world challenges including geospatial and temporal domain shift, learning from imperfect data, fine-grained categories, and long-tailed distributions. She partners with industry, nongovernmental organizations, and government agencies to deploy her methods in the wild worldwide. She works toward increasing the diversity and accessibility of academic research in artificial intelligence through interdisciplinary capacity building and education. She founded the AI for Conservation Slack community, serves as the Biodiversity Community Lead for Climate Change AI, founded and directs the Workshop on Computer Vision Methods for Ecology, and co-leads the NSF Global Climate Center on AI and Biodiversity Change.
- Location: University of Konstanz
- Room: ZT1202 + online
- Host: Max Planck Institute of Animal Behavior
- Contact: blair.costelloe@ab.mpg.de
Natural world images collected by communities of enthusiast volunteers provide a vast and largely uncurated source of data. For instance, iNaturalist has over 180 million images tagged with species labels, which already contribute immensely to research such as biodiversity monitoring and have been cited in over 4,000 scientific papers. Yet these images are also known to contain a wealth of "secondary data" that was captured unintentionally or otherwise included in the images and is not properly reflected in the image labels. Although this data contains crucial insights into interactions, animal social behavior, morphology, habitat, co-occurrence, and many other questions, the costly, time-consuming, or expert-dependent analysis needed to extract such information prevents breakthroughs. Advances in deep learning methods for language and computer vision have the potential to enable the efficient and automated processing techniques needed to unlock the "hidden treasure" in such datasets. Being able to directly search large image collections for these concepts would enable richer analyses that go beyond species identification.
We introduce INQUIRE, a new benchmark for expert-level text-to-image retrieval on natural world images. INQUIRE includes hundreds of scientifically motivated retrieval tasks, each composed of a text query paired with the set of all relevant image matches, which we have comprehensively labeled over a new iNaturalist subset comprising 5 million natural world images from 10,000 species classes (iNat2024). The tasks collected and labeled for INQUIRE come from discussions and interviews with a range of experts including ecologists, biologists, ornithologists, entomologists, and oceanographers. Our benchmark provides a rigorous evaluation that challenges models to demonstrate advanced knowledge and visual reasoning. We perform a comprehensive evaluation of state-of-the-art multimodal models.
Our evaluations show that INQUIRE poses a significant challenge, one that calls for next-generation image retrieval models able to accelerate and automate scientific discovery within large collections. A key finding from our experiments is that reranking, a technique that has typically been used for text retrieval, offers a significant avenue of improvement for image retrieval.
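As a rough illustration of what reranking means in a retrieval setting, the sketch below shows a generic two-stage pipeline: a cheap embedding similarity shortlists candidate images, and a costlier scorer reorders only that shortlist. This is not the INQUIRE pipeline or any model evaluated in the talk. The image ids, the two-dimensional vectors, and the `expensive_scores` table are made-up stand-ins for real embeddings and for the output of a stronger multimodal model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec, image_vecs, k):
    """Stage 1: rank every image by embedding similarity and keep the top k ids."""
    ranked = sorted(
        image_vecs,
        key=lambda image_id: cosine(query_vec, image_vecs[image_id]),
        reverse=True,
    )
    return ranked[:k]


def rerank(candidates, score_fn):
    """Stage 2: reorder only the shortlisted ids with a costlier scorer."""
    return sorted(candidates, key=score_fn, reverse=True)


# Toy stand-ins (assumptions): real systems would use learned image/text embeddings.
image_vecs = {
    "img_a": [0.9, 0.1],
    "img_b": [0.8, 0.3],
    "img_c": [0.1, 0.9],
    "img_d": [0.7, 0.2],
}
query_vec = [1.0, 0.0]

# Hypothetical relevance scores, standing in for a stronger but slower model.
expensive_scores = {"img_a": 0.2, "img_b": 0.9, "img_c": 0.95, "img_d": 0.6}

shortlist = retrieve_top_k(query_vec, image_vecs, k=3)
final = rerank(shortlist, lambda image_id: expensive_scores[image_id])

print(shortlist)  # ['img_a', 'img_d', 'img_b']
print(final)      # ['img_b', 'img_d', 'img_a']
```

In this toy run, `img_c` never reaches the second stage even though the costlier scorer rates it highest. Reranking can only improve the ordering of what the first stage retrieved, so the shortlist size trades cost against recall.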
The MPI-AB Seminar Series is open to members of MPI and Uni Konstanz. The Zoom link is published each week in the MPI-AB newsletter.