graph LR
Application_Orchestration["Application Orchestration"]
Boltz_Core_Models["Boltz Core Models"]
Neural_Network_Architecture["Neural Network Architecture"]
Data_Processing_Feature_Engineering["Data Processing & Feature Engineering"]
Data_Management_Augmentation["Data Management & Augmentation"]
Training_Evaluation_Framework["Training & Evaluation Framework"]
General_Utilities["General Utilities"]
Application_Orchestration -- "orchestrates" --> Boltz_Core_Models
Application_Orchestration -- "processes inputs using" --> Data_Processing_Feature_Engineering
Boltz_Core_Models -- "utilizes" --> Neural_Network_Architecture
Boltz_Core_Models -- "trained and evaluated by" --> Training_Evaluation_Framework
Boltz_Core_Models -- "consumes data from" --> Data_Management_Augmentation
Neural_Network_Architecture -- "provides building blocks for" --> Boltz_Core_Models
Neural_Network_Architecture -- "uses" --> General_Utilities
Data_Processing_Feature_Engineering -- "prepares data for" --> Data_Management_Augmentation
Data_Processing_Feature_Engineering -- "provides processed data to" --> Application_Orchestration
Data_Management_Augmentation -- "supplies data to" --> Boltz_Core_Models
Data_Management_Augmentation -- "receives data from" --> Data_Processing_Feature_Engineering
Training_Evaluation_Framework -- "optimizes and evaluates" --> Boltz_Core_Models
Training_Evaluation_Framework -- "uses data from" --> Data_Management_Augmentation
General_Utilities -- "supports" --> Neural_Network_Architecture
General_Utilities -- "supports" --> Data_Processing_Feature_Engineering
click Application_Orchestration href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/boltz/Application Orchestration.md" "Details"
click Boltz_Core_Models href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/boltz/Boltz Core Models.md" "Details"
click Neural_Network_Architecture href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/boltz/Neural Network Architecture.md" "Details"
click Data_Processing_Feature_Engineering href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/boltz/Data Processing & Feature Engineering.md" "Details"
click Data_Management_Augmentation href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/boltz/Data Management & Augmentation.md" "Details"
click Training_Evaluation_Framework href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/boltz/Training & Evaluation Framework.md" "Details"
click General_Utilities href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/boltz/General Utilities.md" "Details"
Boltz is a molecular modeling framework for predicting and generating molecular structures, together with associated outputs such as binding affinity and confidence estimates. Its core consists of two neural network models, Boltz1 and Boltz2, which use diffusion processes for structure generation. The data pipeline parses raw molecular data, performs feature engineering, and manages data loading and augmentation for training. The training and evaluation framework supplies loss functions, physical potentials, and optimization utilities. An orchestration layer manages the flow of data and execution across these modules, and general utilities support operations throughout the pipeline.
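The relationships in the diagram can also be written as a plain adjacency map, which makes questions such as "what depends on this component?" easy to answer. The sketch below copies the edges from the diagram; the helper names `incoming` and `reachable` are illustrative and not part of Boltz.

```python
from collections import deque

# Directed edges copied from the diagram: source component -> target components.
COMPONENT_EDGES = {
    "Application Orchestration": [
        "Boltz Core Models",
        "Data Processing & Feature Engineering",
    ],
    "Boltz Core Models": [
        "Neural Network Architecture",
        "Training & Evaluation Framework",
        "Data Management & Augmentation",
    ],
    "Neural Network Architecture": ["Boltz Core Models", "General Utilities"],
    "Data Processing & Feature Engineering": [
        "Data Management & Augmentation",
        "Application Orchestration",
    ],
    "Data Management & Augmentation": [
        "Boltz Core Models",
        "Data Processing & Feature Engineering",
    ],
    "Training & Evaluation Framework": [
        "Boltz Core Models",
        "Data Management & Augmentation",
    ],
    "General Utilities": [
        "Neural Network Architecture",
        "Data Processing & Feature Engineering",
    ],
}


def incoming(component):
    """Return the components that have an edge pointing at `component`."""
    return sorted(
        src for src, targets in COMPONENT_EDGES.items() if component in targets
    )


def reachable(start):
    """Return every other component reachable from `start` by following edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        for nxt in COMPONENT_EDGES[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}
```

Because the diagram contains cycles (for example, the core models and the network architecture point at each other), a breadth-first walk with a `seen` set is used rather than a topological sort.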
Application Orchestration is the central control unit of the Boltz application. It manages the overall workflow from input processing to model prediction and coordinates data flow and execution across the other modules.
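A minimal sketch of such a workflow, assuming three stages (filter the inputs, process each one, then predict). The stage order echoes the function names listed for this component, but every signature, class, and toy callable below is hypothetical and does not reflect the actual code in `boltz.src.boltz.main`.

```python
from dataclasses import dataclass


@dataclass
class ProcessedInput:
    """Hypothetical container for a processed input record."""
    name: str
    sequence: str


@dataclass
class Prediction:
    """Hypothetical container for a model output."""
    name: str
    num_residues: int


def run_workflow(raw_inputs, keep, process, predict):
    """Filter raw inputs, process the survivors, and run prediction on each."""
    predictions = []
    for raw in raw_inputs:
        if not keep(raw):
            continue  # skipped inputs never reach the model
        predictions.append(predict(process(raw)))
    return predictions


# Toy stand-ins for the three stages (assumptions, for illustration only).
def toy_keep(raw):
    return bool(raw.get("sequence"))


def toy_process(raw):
    return ProcessedInput(raw["name"], raw["sequence"].upper())


def toy_predict(item):
    return Prediction(item.name, len(item.sequence))


results = run_workflow(
    [{"name": "a", "sequence": "mkv"}, {"name": "b", "sequence": ""}],
    toy_keep,
    toy_process,
    toy_predict,
)
```

Passing the stages in as callables keeps the orchestration loop independent of how parsing or inference is implemented.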
Related Classes/Methods:
boltz.src.boltz.main:filter_inputs_structure (311:352)
boltz.src.boltz.main:process_input (477:605)
boltz.src.boltz.main:predict (933:999)
Boltz Core Models encapsulates the primary Boltz neural network architectures (Boltz1 and Boltz2), including their forward passes and their training, validation, and prediction steps. It integrates diffusion, confidence, and affinity prediction capabilities.
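To illustrate what diffusion-based structure generation involves, here is a generic deterministic Euler sampler in the EDM style: start from Gaussian noise and repeatedly step toward the denoiser's prediction as the noise level decreases. This is a textbook sketch, not the `AtomDiffusion` implementation; the function name, the noise schedule, and the toy denoiser are all assumptions.

```python
import numpy as np


def sample_coords(denoise, num_atoms, sigmas, rng):
    """Generate (num_atoms, 3) coordinates with a deterministic Euler sampler.

    `denoise(x, sigma)` must return an estimate of the clean coordinates.
    `sigmas` is a decreasing noise schedule whose last entry is 0.
    """
    x = sigmas[0] * rng.standard_normal((num_atoms, 3))
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        denoised = denoise(x, sigma)
        direction = (x - denoised) / sigma       # points from the estimate to x
        x = x + (sigma_next - sigma) * direction  # move toward the estimate
    return x


# Toy denoiser (assumption): always predicts the same fixed structure.
target = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]])
rng = np.random.default_rng(0)
coords = sample_coords(lambda x, sigma: target, 3, [10.0, 5.0, 1.0, 0.1, 0.0], rng)
```

Because the final noise level is 0, the last step lands on the denoiser's prediction. A real model would replace the toy denoiser with a network conditioned on the input features.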
Related Classes/Methods:
boltz.src.boltz.model.models.boltz2.Boltz2 (16:300)
boltz.src.boltz.model.models.boltz1.Boltz1 (16:300)
boltz.src.boltz.model.modules.diffusionv2.AtomDiffusion (179:677)
boltz.src.boltz.model.modules.confidencev2.ConfidenceModule (19:237)
boltz.src.boltz.model.modules.affinity.AffinityModule (34:135)
Neural Network Architecture provides the building blocks for the Boltz models: input embedding, MSA and pairformer modules, encoders (single, pairwise, and atom attention), transformer blocks, and core layers such as attention mechanisms and triangular multiplications.
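As an illustration of one of these layers, the sketch below shows the core contraction of an "outgoing" triangular multiplication on a pair representation: the update for pair (i, j) sums over a third index k the product of projections of edges (i, k) and (j, k). It deliberately omits the layer normalization, gating, and output projection that full implementations use, and the names and shapes are assumptions rather than the Boltz layer's interface.

```python
import numpy as np


def triangle_multiply_outgoing(pair, w_a, w_b):
    """Core of an outgoing triangular multiplicative update.

    pair: (N, N, C) pair representation
    w_a, w_b: (C, H) projection matrices
    returns: (N, N, H) update where out[i, j] = sum_k a[i, k] * b[j, k]
    """
    a = pair @ w_a  # (N, N, H) projection of every edge
    b = pair @ w_b  # (N, N, H) second projection of every edge
    return np.einsum("ikh,jkh->ijh", a, b)


rng = np.random.default_rng(0)
n, c, h = 4, 8, 5
pair = rng.standard_normal((n, n, c))
w_a = rng.standard_normal((c, h))
w_b = rng.standard_normal((c, h))
update = triangle_multiply_outgoing(pair, w_a, w_b)
```

The `einsum` expresses the sum over k directly; an "incoming" variant would contract over the first index of both operands instead.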
Related Classes/Methods:
boltz.src.boltz.model.modules.trunk.InputEmbedder (24:113)
boltz.src.boltz.model.modules.trunk.MSAModule (116:289)
boltz.src.boltz.model.modules.encoders.AtomAttentionEncoder (288:540)
boltz.src.boltz.model.modules.transformersv2.DiffusionTransformer (68:137)
boltz.src.boltz.model.layers.triangular_attention.attention.TriangleAttention (33:162)
Data Processing & Feature Engineering parses raw input data (MMCIF, FASTA, YAML, CSV, A3M) into internal data structures, then featurizes and tokenizes it. This includes generating token, atom, MSA, template, and symmetry features, and handling molecular geometry and constraints.
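The parse-then-tokenize flow can be sketched with a generic FASTA reader and a one-token-per-residue mapping. This is not the Boltz parser or `BoltzTokenizer`: the header handling, the 20-letter vocabulary, and the unknown-residue id are assumptions chosen to keep the example small.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[header].append(line)  # sequences may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical vocabulary: the 20 standard amino acids plus one "unknown" id.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_IDS = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
UNKNOWN_ID = len(AMINO_ACIDS)


def tokenize(sequence):
    """Map each residue letter to an integer token id."""
    return [TOKEN_IDS.get(residue, UNKNOWN_ID) for residue in sequence.upper()]


records = parse_fasta(">A\nMK\nV\n>B\nAC\n")
tokens = {name: tokenize(seq) for name, seq in records.items()}
```

A production featurizer would go further and attach atom-level, MSA, and template features to each token.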
Related Classes/Methods:
boltz.src.boltz.data.types.Structure (169:319)
boltz.src.boltz.data.parse.mmcif_with_constraints:parse_mmcif (200:300)
boltz.src.boltz.data.feature.featurizer.BoltzFeaturizer:process (100:200)
boltz.src.boltz.data.tokenize.boltz.BoltzTokenizer:tokenize (35:195)
boltz.src.boltz.data.msa.mmseqs2:run_mmseqs2 (20:254)
Data Management & Augmentation handles the loading, batching, sampling, and cropping of prepared data for training, validation, and inference. It includes several sampling strategies (cluster, random, distillation) and mechanisms to filter and augment data subsets.
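Two of these ideas, cluster-based sampling and cropping, can be sketched in a few lines. The version below picks a cluster uniformly and then a member within it, so large clusters do not dominate, and crops a token list to a contiguous window. The actual `ClusterSampler` and `BoltzCropper` may weight and select differently; both functions here are simplified assumptions.

```python
import random


def sample_by_cluster(clusters, num_samples, rng):
    """Draw samples by choosing a cluster uniformly, then one of its members."""
    cluster_ids = sorted(clusters)  # fixed order keeps seeded runs reproducible
    picks = []
    for _ in range(num_samples):
        cluster_id = rng.choice(cluster_ids)
        picks.append(rng.choice(clusters[cluster_id]))
    return picks


def crop_contiguous(tokens, max_tokens, rng):
    """Return a random contiguous window of at most `max_tokens` tokens."""
    if len(tokens) <= max_tokens:
        return list(tokens)
    start = rng.randrange(len(tokens) - max_tokens + 1)
    return list(tokens[start:start + max_tokens])


rng = random.Random(0)
clusters = {"c1": ["a", "b", "c"], "c2": ["d"]}
picks = sample_by_cluster(clusters, 5, rng)
crop = crop_contiguous(list(range(10)), 4, rng)
```

Cropping bounds the cost of each training example, while cluster-level sampling reduces redundancy among similar entries.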
Related Classes/Methods:
boltz.src.boltz.data.module.training.BoltzTrainingDataModule (491:684)
boltz.src.boltz.data.module.inference.BoltzInferenceDataModule:predict_dataloader (249:271)
boltz.src.boltz.data.sample.cluster.ClusterSampler:sample (204:283)
boltz.src.boltz.data.crop.boltz.BoltzCropper:crop (150:296)
boltz.src.boltz.data.filter.static.polymer.ClashingChainsFilter:filter (202:299)
Training & Evaluation Framework provides the infrastructure for training Boltz models: loss functions (confidence, diffusion) and validation metrics, physical potential functions that guide molecular generation, and optimization utilities such as learning rate schedulers and Exponential Moving Average (EMA). It also writes out predicted structures and evaluates model performance.
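The EMA idea can be shown with a small, framework-free sketch: keep a "shadow" copy of each parameter and blend in the current value after every optimizer step. The class name, the dict-of-floats parameter format, and the default decay are assumptions; the `EMA` class referenced for this component operates on real model weights and is more involved.

```python
class SimpleEMA:
    """Track an exponential moving average of named scalar parameters."""

    def __init__(self, params, decay=0.999):
        if not 0.0 <= decay < 1.0:
            raise ValueError("decay must be in [0, 1)")
        self.decay = decay
        self.shadow = dict(params)  # averaged copy, initialised to the params

    def update(self, params):
        """Blend the current parameter values into the running average."""
        for name, value in params.items():
            self.shadow[name] = (
                self.decay * self.shadow[name] + (1.0 - self.decay) * value
            )


ema = SimpleEMA({"w": 0.0}, decay=0.9)
ema.update({"w": 1.0})  # shadow moves 10% of the way toward 1.0
first = ema.shadow["w"]
ema.update({"w": 1.0})
second = ema.shadow["w"]
```

A decay close to 1 makes the average change slowly, which is why averaged weights are often used for evaluation and inference.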
Related Classes/Methods:
boltz.src.boltz.model.loss.confidencev2:confidence_loss (8:87)
boltz.src.boltz.model.potentials.potentials:get_potentials (417:482)
boltz.src.boltz.model.optim.ema.EMA (14:389)
boltz.scripts.train.train:train (80:235)
boltz.src.boltz.data.write.mmcif:to_mmcif (17:305)
boltz.scripts.eval.aggregate_evals:eval_models (297:505)
General Utilities is a collection of utility functions used across the project, such as default value handling, random augmentations (rotations, quaternions), and centering operations.
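A minimal sketch of a random-rotation augmentation built from these pieces: draw a unit quaternion, convert it to a rotation matrix, center the coordinates, and rotate them. The function names are illustrative and do not correspond to the Boltz utilities; the quaternion is assumed to be in (w, x, y, z) order.

```python
import numpy as np


def random_unit_quaternion(rng):
    """Draw a uniformly random unit quaternion (w, x, y, z)."""
    q = rng.standard_normal(4)
    return q / np.linalg.norm(q)


def quaternion_to_rotation(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def center(coords):
    """Shift (N, 3) coordinates so that their centroid is at the origin."""
    return coords - coords.mean(axis=0, keepdims=True)


def randomly_rotate(coords, rng):
    """Center the coordinates, then apply a random rotation about the origin."""
    rotation = quaternion_to_rotation(random_unit_quaternion(rng))
    return center(coords) @ rotation.T


rng = np.random.default_rng(0)
rotation = quaternion_to_rotation(random_unit_quaternion(rng))
coords = rng.standard_normal((5, 3)) + 10.0
augmented = randomly_rotate(coords, rng)
```

Normalizing a 4-dimensional Gaussian sample gives a uniform distribution over rotations, and rotating after centering leaves all pairwise distances unchanged.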
Related Classes/Methods: