molmo by allenai

Description: Code for the Molmo Vision-Language Model

Summary Information

Updated 19 minutes ago

Added to GitGenius on August 4th, 2025

Created on December 5th, 2024

Open Issues & Pull Requests: 34 (+0)

Number of forks: 96

Total Stargazers: 918 (+0)

Total Subscribers: 14 (+0)

Issue Activity (beta)

Open issues: 31

New in 7 days: 0

Closed in 7 days: 0

Avg open age: 238 days

Stale 30+ days: 31

Stale 90+ days: 30

Recent activity

Opened in 7 days: 0

Closed in 7 days: 0

Comments in 7 days: 0

Events in 7 days: 0

Top labels

No label distribution available yet.

Most active issues this week

No issue events were indexed in the last 7 days.

Repository Insights (GitGenius)

Median issue/PR response: 2.2 days

Mean response time: 29.1 days

90th percentile: 60.1 days

Tracked items: 34
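
The gap between the median and mean response times above is typical of a skewed distribution: most items get a quick reply while a few wait for months. The following is a minimal sketch of how such figures can be computed. The sample values are hypothetical and are not the repository's real data, and the nearest-rank percentile used here is an assumption, since the page does not say which percentile method GitGenius uses.

```python
import math
import statistics

# Hypothetical response times in days (illustrative only, not real repository data).
response_days = [0.5, 1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 45.0, 90.0, 150.0]


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the data is less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


median = statistics.median(response_days)
mean = statistics.mean(response_days)
p90 = percentile(response_days, 90)

print(f"median: {median} days")
print(f"mean:   {mean} days")
print(f"p90:    {p90} days")
```

In this sample, three slow responses out of ten are enough to pull the mean far above the median, which is why the median is usually the better indicator of a typical response.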

Most active contributors

chrisc36 - 32 events, 16 issues
sangho-vision - 4 events, 2 issues
ayylemao - 2 events, 1 issue
dprokhorov17 - 2 events, 2 issues
joshmyersdean - 2 events, 1 issue

Detailed Description

Molmo is a family of open vision-language models from the Allen Institute for AI (Ai2). As the repository description states, this repository holds the code for the Molmo Vision-Language Model. The models take an image together with a text prompt and generate text in response, covering tasks such as image captioning and visual question answering. They can also point to locations in an image to ground an answer.

Architecturally, Molmo pairs a pre-trained vision encoder with a decoder-only language model, joined by a connector that maps image features into the language model's input space. The models were trained on PixMo, a collection of image-text datasets assembled by Ai2 that includes detailed image captions and pointing annotations.

Ai2 has released Molmo checkpoints in several sizes, so the models can be used without retraining, and the code here is intended for training and evaluating them. The project is open source, and researchers can build on it or report problems through the issue tracker summarized above.
