Description: Code for running RFdiffusion
Repository: rosettacommons/rfdiffusion (GitHub)
RFdiffusion is an open-source method for protein structure generation, released through the RosettaCommons GitHub organization. It designs novel protein structures either from scratch (unconditional generation) or guided by input information (conditional generation), which lets researchers address design problems such as motif scaffolding, binder design, and design diversification. At its core is a diffusion model: during training, noise is added to protein structures step by step and the network learns to reverse that corruption; at generation time, the model starts from noise and denoises it iteratively into a new structure.
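To make the noising half of that process concrete, here is a toy sketch of the closed-form forward step used in generic DDPM-style diffusion, applied to a handful of 3D points. It is an illustration only and is not RFdiffusion's implementation, which also has to handle residue orientations; the function name `forward_noise`, the `alpha_bar` values, and the four-point "backbone" are all assumptions made for the example.

```python
import math
import random

def forward_noise(coords, alpha_bar, rng):
    """Noise 3D points in one shot: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps.

    alpha_bar = 1.0 keeps the input unchanged; alpha_bar = 0.0 yields pure
    Gaussian noise. Values in between mix signal and noise.
    """
    if not 0.0 <= alpha_bar <= 1.0:
        raise ValueError("alpha_bar must lie in [0, 1]")
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(signal * c + noise * rng.gauss(0.0, 1.0) for c in xyz)
        for xyz in coords
    ]

rng = random.Random(0)  # fixed seed so repeated runs give the same noise

# Hypothetical 4-point trace standing in for a protein backbone.
x0 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

unchanged = forward_noise(x0, alpha_bar=1.0, rng=rng)        # no noise added
slightly_noised = forward_noise(x0, alpha_bar=0.99, rng=rng)  # early step
fully_noised = forward_noise(x0, alpha_bar=0.0, rng=rng)      # pure noise
```

A trained denoiser would be asked to walk from something like `fully_noised` back toward a plausible structure, one small step at a time.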
RFdiffusion supports several protein design tasks. Unconditional generation creates entirely new protein structures without a template, and symmetric unconditional generation extends this to designs with cyclic, dihedral, or tetrahedral symmetry. Motif scaffolding lets users build a new protein around a specific structural motif (for example, one taken from a known protein) by specifying where the motif sits within the overall chain. Binder design aims to create a protein that binds a given target protein. Design diversification, also known as "partial diffusion," samples variations around a starting structure so that users can explore a range of related designs.
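The arrangement of a motif within a design is written as a contig string (the `contigmap.contigs` parameter), for example `10-40/A163-181/10-40`: 10 to 40 new residues, then residues 163 to 181 of chain A from the input structure, then another 10 to 40 new residues. The sketch below parses that simple form and computes the possible total length. It is a hypothetical helper, not code from the repository, and it deliberately does not handle chain breaks or other extended syntax.

```python
import re
from typing import List, NamedTuple, Optional, Tuple

class Segment(NamedTuple):
    kind: str              # "motif" (copied from the input) or "free" (newly built)
    chain: Optional[str]   # chain ID for motif segments, otherwise None
    start: int             # first residue number, or minimum length if free
    end: int               # last residue number, or maximum length if free

# Optional chain letter, a number, and an optional "-number" upper bound.
_SEGMENT = re.compile(r"^([A-Za-z]?)(\d+)(?:-(\d+))?$")

def parse_contig(contig: str) -> List[Segment]:
    """Split a '/'-separated contig string into motif and free segments."""
    segments = []
    for token in contig.strip().strip("[]").split("/"):
        match = _SEGMENT.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised contig segment: {token!r}")
        chain = match.group(1) or None
        start = int(match.group(2))
        end = int(match.group(3)) if match.group(3) is not None else start
        if end < start:
            raise ValueError(f"range runs backwards: {token!r}")
        segments.append(Segment("motif" if chain else "free", chain, start, end))
    return segments

def length_bounds(segments: List[Segment]) -> Tuple[int, int]:
    """Return the (minimum, maximum) total number of residues in the design."""
    shortest = longest = 0
    for seg in segments:
        if seg.kind == "motif":
            fixed = seg.end - seg.start + 1   # motif length is fixed
            shortest += fixed
            longest += fixed
        else:
            shortest += seg.start
            longest += seg.end
    return shortest, longest

segments = parse_contig("[10-40/A163-181/10-40]")
bounds = length_bounds(segments)
```

Here the motif contributes a fixed 19 residues, so the design can be between 39 and 99 residues long.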
RFdiffusion is intended as a versatile, accessible tool that lets researchers create novel protein structures with desired properties and functions. The repository gives installation instructions, including required dependencies such as the SE3-Transformer library, and provides pre-trained model weights. Further documentation and worked examples are available on a dedicated website and cover the software's capabilities and usage.
The software is run from the command line and uses Hydra configurations to manage its parameters. The main entry point is `run_inference.py`, which reads a configuration and accepts overrides for individual settings on the command line. Key parameters include the protein length, the output location, and the number of designs to generate. The `contigmap.contigs` parameter defines the layout of the design: it can specify just the length of the protein, or it can combine newly generated segments with motifs taken from an input PDB file. The `contigmap.inpaint_seq` option masks the sequence identity of selected residues, leaving the model free to choose residues at those positions, which can help the motif pack well against the rest of the design.
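As a rough illustration of how these Hydra overrides fit together, the sketch below assembles a command line without running it. The override names (`inference.output_prefix`, `inference.num_designs`, `inference.input_pdb`) and the script path are given as recalled from the repository README and should be checked against the installed version; `build_inference_command` itself and the file paths are hypothetical.

```python
import shlex
from typing import List, Optional

def build_inference_command(
    contigs: str,
    output_prefix: str,
    num_designs: int,
    input_pdb: Optional[str] = None,
    script: str = "./scripts/run_inference.py",  # assumed location of the script
) -> List[str]:
    """Return an argv list of Hydra overrides; nothing is executed here."""
    if num_designs < 1:
        raise ValueError("num_designs must be at least 1")
    args = [
        script,
        f"contigmap.contigs=[{contigs}]",
        f"inference.output_prefix={output_prefix}",
        f"inference.num_designs={num_designs}",
    ]
    if input_pdb is not None:
        # Only needed when the contig refers to residues of an existing structure.
        args.append(f"inference.input_pdb={input_pdb}")
    return args

# Unconditional monomer of exactly 150 residues.
unconditional = build_inference_command("150-150", "test_outputs/test", 10)

# Motif scaffolding: chain A residues 163-181 flanked by 10-40 new residues.
scaffold = build_inference_command(
    "10-40/A163-181/10-40",
    "example_outputs/design_motifscaffolding",
    10,
    input_pdb="input_pdbs/5TPN.pdb",
)

# shlex.join quotes the bracketed contig so the shell passes it through intact.
print(shlex.join(unconditional))
```

Building the arguments as a list rather than a single string avoids shell-quoting mistakes around the square brackets; the list can be handed to `subprocess.run` once the paths point at a real installation.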
RFdiffusion's more advanced features include partial diffusion and binder design. Partial diffusion generates diverse variants by adding noise to a starting structure for only part of the diffusion trajectory and then denoising it, so the outputs stay related to the input. Binder design works by providing the target protein and specifying the residues on the target that should form the binding interface. Specialized model checkpoints are available for particular tasks, such as scaffolding small motifs and designing binders, and can be selected with the `inference.ckpt_override_path` parameter. The repository includes example scripts for common scenarios, including unconditional monomer generation, motif scaffolding, and partial diffusion. Together, these features make RFdiffusion a flexible tool for structural biology and protein engineering.
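The two advanced modes differ mainly in a few extra overrides, which the sketch below collects as plain strings. The names `diffuser.partial_T` and `ppi.hotspot_res`, and the `/0 ` chain-break notation in the binder contig, are given as recalled from the repository README and should be verified before use; the helper functions, file paths, residue numbers, and checkpoint filename are hypothetical.

```python
from typing import List, Optional, Sequence

def partial_diffusion_overrides(input_pdb: str, length: int, partial_t: int) -> List[str]:
    """Overrides for diversifying an existing design of a known length."""
    if partial_t < 1:
        raise ValueError("partial_t must be at least 1")
    return [
        f"inference.input_pdb={input_pdb}",
        # The contig length is set to match the length of the input structure.
        f"contigmap.contigs=[{length}-{length}]",
        # Fewer noising steps keep the outputs closer to the starting structure.
        f"diffuser.partial_T={partial_t}",
    ]

def binder_overrides(
    target_pdb: str,
    target_segment: str,
    binder_length: str,
    hotspots: Sequence[str] = (),
    checkpoint: Optional[str] = None,
) -> List[str]:
    """Overrides for designing a new chain against a fixed target chain."""
    overrides = [
        f"inference.input_pdb={target_pdb}",
        # "/0 " marks a chain break between the target and the new binder.
        f"contigmap.contigs=[{target_segment}/0 {binder_length}]",
    ]
    if hotspots:
        # Target residues that the binder should contact.
        overrides.append("ppi.hotspot_res=[" + ",".join(hotspots) + "]")
    if checkpoint is not None:
        # Optionally swap in a specialized set of model weights.
        overrides.append(f"inference.ckpt_override_path={checkpoint}")
    return overrides

diversify = partial_diffusion_overrides("my_design.pdb", length=79, partial_t=10)
binder = binder_overrides(
    "target.pdb",
    target_segment="A1-150",
    binder_length="70-100",
    hotspots=["A59", "A83", "A91"],
    checkpoint="models/example_binder_ckpt.pt",  # placeholder filename
)
```

Either list can be appended to the `run_inference.py` invocation alongside the output prefix and number of designs.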