Welcome to the 9th edition of "Arxiv Weekly Insights," where we delve into the latest groundbreaking research and developments from the arXiv repository.
This newsletter is brought to you by SmartXiv, the AI-powered personalized arXiv digest designed to enhance your research experience. With over 1000 research papers uploaded daily on arXiv, it's easy to miss important updates. Let SmartXiv deliver personalized recommendations so you never miss what truly matters to you.
Get started today and save 30% with your annual subscription.
Computer Vision and Pattern Recognition
Bridging Episodes and Semantics: A Novel Framework for Long-Form Video Understanding
Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen, Hung-Ting Su, Winston H. Hsu, Shang-Hong Lai
This paper introduces BREASE: BRidging Episodes And SEmantics for Long-Form Video Understanding, a model that simulates episodic memory accumulation to capture action sequences and reinforces them with semantic knowledge dispersed throughout the video. The work makes two key contributions: an Episodic COmpressor (ECO) that efficiently aggregates crucial representations from micro to semi-macro levels, and a Semantics reTRiever (SeTR) that enhances these aggregated representations with semantic information by focusing on the broader context, dramatically reducing feature dimensionality while preserving relevant macro-level information. The proposed method achieves state-of-the-art performance across multiple long video understanding benchmarks in both zero-shot and fully-supervised settings.
Computation and Language
CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models
Jonathan Bourne
This paper introduces Context Leveraging OCR Correction (CLOCR-C), which uses the infilling and context-adaptive abilities of transformer-based language models (LMs) to improve OCR quality. The study asks three questions: whether LMs can perform post-OCR correction, whether the corrections improve downstream NLP tasks, and what value providing socio-cultural context adds to the correction process. Experiments were conducted using seven LMs on three datasets, and the results show that some LMs can significantly reduce error rates, with the top-performing model achieving over a 60% reduction in character error rate on the NCSE dataset. The OCR improvements extend to downstream tasks such as Named Entity Recognition, where Cosine Named Entity Similarity increases. The study also shows that providing socio-cultural context in the prompts improves performance, while misleading prompts lower it.
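For readers unfamiliar with the metric, here is a minimal sketch of how a character error rate (CER) and its relative reduction are commonly computed. It assumes the usual definition (Levenshtein edit distance divided by reference length); the paper's exact implementation may differ, and the function names and sample strings are invented for illustration rather than taken from the NCSE dataset.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance normalised by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the error rate dropped, relative to the starting rate."""
    if before <= 0:
        raise ValueError("starting error rate must be positive")
    return (before - after) / before


if __name__ == "__main__":
    # Hypothetical strings, made up for this example.
    reference = "the quick brown fox"
    raw_ocr = "tbe qu1ck hrown f0x"
    corrected = "the quick brown f0x"

    before = cer(reference, raw_ocr)
    after = cer(reference, corrected)
    print(f"CER before correction: {before:.3f}")
    print(f"CER after correction:  {after:.3f}")
    print(f"Relative reduction:    {relative_reduction(before, after):.0%}")
```

Under this definition, a "60% reduction" is relative to the starting error rate: a CER that falls from 0.50 to 0.20 has been reduced by 60%, even though the absolute drop is 30 percentage points.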
Image and Video Processing
Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes
Li Zhang, Basu Jindal, Ahmed Alaa, Robert Weinreb, David Wilson, Eran Segal, James Zou, Pengtao Xie
This paper introduces a generative deep learning framework that uniquely generates high-quality paired segmentation masks and medical images, serving as auxiliary data for training robust models in data-scarce environments. Unlike traditional generative models that treat data generation and segmentation model training as separate processes, the proposed method employs multi-level optimization for end-to-end data generation. This approach allows segmentation performance to directly influence the data generation process, ensuring that the generated data is specifically tailored to enhance the performance of the segmentation model. The method demonstrated strong generalization performance across 9 diverse medical image segmentation tasks and 16 datasets in ultra-low-data regimes, spanning various diseases, organs, and imaging modalities. When applied to various segmentation models, it achieved performance improvements of 10-20% (absolute) in both same-domain and out-of-domain scenarios. Notably, it requires 8 to 20 times less training data than existing methods to achieve comparable results.
Databases
Empowering Open Data Sharing for Social Good: A Privacy-Aware Approach
Tânia Carvalho, Luís Antunes, Cristina Costa, Nuno Moniz
This paper uses a dataset of COVID-19 cases from the second-largest hospital in Portugal to show that it is feasible to ensure data privacy while improving the quality and maintaining the utility of the data. The paper highlights the significance of modifying human rights frameworks for the digital era, pointing out gaps in existing research and offering recommendations for future investigations.
Computers and Society
GeoAI in resource-constrained environments
Marc Böhlen, Gede Sughiarta, Atiek Kurnianingsih, Srikar Reddy Gopaladinne, Sujay Shrivastava, Hemanth Kumar Reddy Gorla
This paper describes GeoAI in resource-constrained environments, focusing on the use of small, low-cost drones for data collection and the development of lightweight, efficient AI models for data processing. The paper discusses the challenges and opportunities of using GeoAI in these environments and presents a case study of using GeoAI for flood mapping in a rural area of Indonesia.
Databases
BioBricks.ai: A Versioned Data Registry for Life Sciences Data Assets
Yifan Gao, Zakariyya Mughal, Jose A. Jaramillo-Villegas, Marie Corradi, Alexandre Borrel, Ben Lieberman, Suliman Sharif, John Shaffer, Karamarie Fecho, Ajay Chatrath, Alexandra Maertens, Marc A. T. Teunis, Nicole Kleinstreuer, Thomas Hartung, Thomas Luechtefeld
This paper introduces BioBricks.ai, a centralized data repository and suite of developer-friendly tools to simplify access to scientific data. The platform currently delivers over ninety biological and chemical datasets and provides a package manager-like system for installing and managing dependencies on data sources. The paper highlights the potential of BioBricks.ai to accelerate data science workflows and facilitate the creation of novel data assets.
Thank you for joining us this week. Stay tuned for more insights in our next edition. Until then, happy researching!