#7 Arxiv Weekly Insights

Welcome to the 7th edition of "Arxiv Weekly Insights," where we highlight notable recent research from the arXiv repository.

This newsletter is brought to you by SmartXiv, the AI-powered personalized arXiv digest designed to enhance your research experience. With over 1,000 research papers uploaded to arXiv daily, it's easy to miss important updates. Let SmartXiv deliver personalized recommendations so you never miss what truly matters to you.
Get started today and save 30% with your annual subscription.

Artificial Intelligence
Differentiable Logic Programming for Distant Supervision
Akihiro Takemura, Katsumi Inoue

This paper introduces a new method for integrating neural networks with logic programming in Neural-Symbolic AI (NeSy), aimed at learning with distant supervision. The method evaluates logical implications and constraints in a differentiable manner by embedding both neural network outputs and logic programs into matrices, facilitating more efficient learning under distant supervision.
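
For readers who want a feel for what "differentiable evaluation of rules" can look like, here is a minimal sketch. It is a generic soft-logic penalty and not the authors' formulation: the atoms, probabilities, rule encoding, and function names are all hypothetical, chosen only to show rule bodies stored as a matrix and scored against network outputs.

```python
# Illustrative only: a generic soft-logic penalty, NOT the paper's method.
# All atoms, values, and names below are hypothetical.

atoms = ["rain", "wet", "dry"]
probs = [0.9, 0.6, 0.3]  # stands in for neural network outputs, one per atom

# Each row is one rule body over the atoms (1 = atom appears in the body).
# Rule 0:       wet <- rain
# Constraint 1: <- wet, dry   (wet and dry should not both hold)
bodies = [[1, 0, 0],
          [0, 1, 1]]
heads = [1, None]  # index of the head atom; None marks a constraint


def body_truth(row, probs):
    """Soft conjunction: product of the probabilities of atoms in the body."""
    value = 1.0
    for in_body, p in zip(row, probs):
        if in_body:
            value *= p
    return value


def program_loss(bodies, heads, probs):
    """Sum of soft rule violations; zero when every rule is satisfied."""
    loss = 0.0
    for row, head in zip(bodies, heads):
        b = body_truth(row, probs)
        h = 0.0 if head is None else probs[head]
        # An implication body -> head is violated to the extent b exceeds h.
        loss += max(0.0, b - h)
    return loss


loss = program_loss(bodies, heads, probs)
print(round(loss, 2))
```

Because the penalty is built from products, differences, and a max, it is piecewise differentiable in the network outputs, which is the property that lets a training signal flow from the rules back into the network.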

Computer Vision and Pattern Recognition
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations
Can Qin, Congying Xia, Krithika Ramakrishnan, Michael Ryoo, Lifu Tu, Yihao Feng, Manli Shu, Honglu Zhou, Anas Awadalla, Jun Wang, Senthil Purushwalkam, Le Xue, Yingbo Zhou, Huan Wang, Silvio Savarese, Juan Carlos Niebles, Zeyuan Chen, Ran Xu, Caiming Xiong

This paper presents xGen-VideoSyn-1, a text-to-video (T2V) generation model capable of producing realistic scenes from textual descriptions. The model uses a latent diffusion model (LDM) architecture and introduces a video variational autoencoder (VidVAE) to compress video data both spatially and temporally, significantly reducing the length of visual tokens and the computational demands associated with generating long-sequence videos.
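
To see why compressing along time as well as space matters, a quick back-of-the-envelope calculation helps. The clip size and compression factors below are assumptions picked for illustration, not values reported in the paper, and `visual_token_count` is a hypothetical helper.

```python
# Hypothetical numbers throughout: none of these values come from the paper.

def visual_token_count(frames, height, width, t_stride, s_stride):
    """Count latent positions after temporal and spatial downsampling."""
    latent_frames = -(-frames // t_stride)  # ceiling division
    latent_h = -(-height // s_stride)
    latent_w = -(-width // s_stride)
    return latent_frames * latent_h * latent_w


# Assumed clip: 96 frames at 480x720, with an assumed 8x spatial downsampling.
spatial_only = visual_token_count(96, 480, 720, t_stride=1, s_stride=8)

# Same clip with an assumed additional 4x temporal downsampling.
spatial_and_temporal = visual_token_count(96, 480, 720, t_stride=4, s_stride=8)

reduction = spatial_only / spatial_and_temporal
print(spatial_only, spatial_and_temporal, reduction)
# prints: 518400 129600 4.0
```

Under these assumptions the token sequence shrinks by the temporal factor, and since attention cost typically grows faster than linearly in sequence length, the compute savings for long videos can be larger than the token reduction alone suggests.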

Computation and Language
RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment
Xiaohan Wang, Xiaoyan Yang, Yuqi Zhu, Yue Shen, Jian Wang, Peng Wei, Lei Liang, Jinjie Gu, Huajun Chen, Ningyu Zhang

This paper introduces RuleAlign, a framework designed to align LLMs with specific diagnostic rules. The framework develops a medical dialogue dataset comprising rule-based communications between patients and physicians and designs an alignment learning approach through preference learning.

Machine Learning
A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language
Ekdeep Singh Lubana, Kyogo Kawaguchi, Robert P. Dick, Hidenori Tanaka

This paper proposes a phenomenological definition of emergence in the context of neural networks. The definition attributes sudden performance growth on specific, narrower tasks to the acquisition of general structures underlying the data-generating process.

Artificial Intelligence
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind
Haojun Shi, Suyu Ye, Xinyu Fang, Chuanyang Jin, Layla Isik, Yen-Ling Kuo, Tianmin Shu

This paper introduces MuMA-ToM, a Multi-modal Multi-Agent Theory of Mind benchmark and the first multi-modal Theory of Mind benchmark to evaluate mental reasoning in embodied multi-agent interactions. The benchmark provides video and text descriptions of people's multi-modal behavior in realistic household environments and asks questions about people's goals, beliefs, and beliefs about others' goals.

Computers and Society
Contextual Stochastic Optimization for School Desegregation Policymaking
Hongzhao Guan, Nabeel Gillani, Tyler Simko, Jasmine Mangat, Pascal Van Hentenryck

This paper develops a joint redistricting and choice modeling framework, called redistricting with choices (RWC), to estimate how redrawing elementary school boundaries in a large US public school district might realistically impact levels of socioeconomic segregation.


Thank you for joining us this week. Until the next edition, happy researching!