graph LR
HTSlib_Bindings["HTSlib Bindings"]
Genomic_Data_Models["Genomic Data Models"]
Indexing_Querying["Indexing & Querying"]
Pileup_Analysis["Pileup Analysis"]
External_Tool_Wrappers["External Tool Wrappers"]
HTSlib_Bindings -- "provides raw data for" --> Genomic_Data_Models
Genomic_Data_Models -- "encapsulate data read by" --> HTSlib_Bindings
HTSlib_Bindings -- "performs low-level indexed file operations for" --> Indexing_Querying
Indexing_Querying -- "orchestrates indexed access via" --> HTSlib_Bindings
HTSlib_Bindings -- "provides alignment data streams for" --> Pileup_Analysis
Pileup_Analysis -- "consumes alignment data from" --> HTSlib_Bindings
Indexing_Querying -- "returns instances of" --> Genomic_Data_Models
Genomic_Data_Models -- "are the structured output of indexed queries from" --> Indexing_Querying
Genomic_Data_Models -- "provide structured records for" --> Pileup_Analysis
Pileup_Analysis -- "generates objects based on" --> Genomic_Data_Models
Indexing_Querying -- "optimizes data retrieval for" --> Pileup_Analysis
Pileup_Analysis -- "leverages indexed access for efficiency from" --> Indexing_Querying
click Genomic_Data_Models href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pysam/Genomic_Data_Models.md" "Details"
click Indexing_Querying href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pysam/Indexing_Querying.md" "Details"
click Pileup_Analysis href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pysam/Pileup_Analysis.md" "Details"
click External_Tool_Wrappers href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/pysam/External_Tool_Wrappers.md" "Details"
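The graph shows how the core components of pysam relate to one another. HTSlib Bindings sits at the base and reads and writes the underlying genomic files. Genomic Data Models wraps the parsed records in Python objects, Indexing & Querying uses file indices to retrieve records by genomic region, and Pileup Analysis consumes alignment data to build per-position summaries. External Tool Wrappers has no edges to the other components in the graph; it exposes the samtools and bcftools command-line tools to Python.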
HTSlib Bindings
This foundational component provides the direct, low-level Cython bindings to the HTSlib C library. It is responsible for efficient reading, writing, and indexing of common genomic file formats: SAM/BAM/CRAM, VCF/BCF, FASTA/FASTQ, and Tabix-indexed generic text files. It is the bridge between the Python API and the C code that does the heavy lifting for large-scale genomic data operations.
Related Classes/Methods:
Genomic Data Models
This component defines Pythonic data structures and classes that represent individual genomic records parsed by the underlying HTSlib Bindings. These abstractions (e.g., aligned reads, variant calls, tabix entries) let developers access, manipulate, and interpret the biological information in the files without interacting directly with C pointers or low-level data structures.
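To make the idea of a record model concrete, here is a minimal sketch that turns one SAM text line into a typed Python object. The `ReadRecord` class and its `from_sam_line` helper are hypothetical and exist only for illustration; they are not pysam's classes. The field names are chosen to resemble attributes that pysam exposes on aligned reads, and the 1-based SAM `POS` column is converted to a 0-based start, which is the convention pysam uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadRecord:
    """Hypothetical stand-in for an aligned-read model (not pysam's own class)."""

    query_name: str
    flag: int
    reference_name: str
    reference_start: int  # 0-based, converted from SAM's 1-based POS column
    mapping_quality: int
    cigarstring: str
    query_sequence: str

    @property
    def is_reverse(self) -> bool:
        # SAM flag bit 0x10: read is aligned to the reverse strand
        return bool(self.flag & 0x10)

    @property
    def is_unmapped(self) -> bool:
        # SAM flag bit 0x4: read is unmapped
        return bool(self.flag & 0x4)

    @classmethod
    def from_sam_line(cls, line: str) -> "ReadRecord":
        """Parse the 11 mandatory, tab-separated SAM columns."""
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            raise ValueError("a SAM record needs at least 11 columns")
        qname, flag, rname, pos, mapq, cigar, _, _, _, seq, _ = fields[:11]
        return cls(qname, int(flag), rname, int(pos) - 1, int(mapq), cigar, seq)


read = ReadRecord.from_sam_line("read1\t16\tchr1\t101\t60\t4M\t*\t0\t0\tACGT\tIIII")
print(read.reference_name, read.reference_start, read.is_reverse)
```

The point of the sketch is the separation of concerns: parsing happens once, and callers work with named attributes and flag properties instead of raw columns or bit masks.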
Related Classes/Methods:
Indexing & Querying
This component manages the creation, loading, and use of genomic indices (e.g., BAM index, Tabix index, BCF index) for efficient region-based data retrieval. It provides iterators and methods to query specific genomic regions or retrieve records by their coordinates, so a query on a large file reads only the parts that can overlap the requested region instead of scanning the whole file.
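The benefit of an index can be sketched with a simplified, in-memory "linear index". This is an illustration only and not how pysam or HTSlib implement it: real BAM and Tabix indices store compressed-file offsets and combine a binning scheme with a linear index. The names `Feature` and `ToyLinearIndex` are hypothetical. The sketch keeps, for each 16 kbp window, the position of the first record overlapping that window, so a region query can skip everything before it.

```python
from dataclasses import dataclass

WINDOW = 16_384  # 16 kbp, the window size of the BAI linear index


@dataclass(frozen=True)
class Feature:
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    name: str


class ToyLinearIndex:
    """Hypothetical in-memory index over features on a single contig."""

    def __init__(self, features, window=WINDOW):
        self.window = window
        self.features = sorted(features, key=lambda f: f.start)
        # first[w] = list position of the first feature overlapping window w
        self.first = []
        for i, feat in enumerate(self.features):
            last = (feat.end - 1) // window
            self.first.extend([None] * (last + 1 - len(self.first)))
            for w in range(feat.start // window, last + 1):
                if self.first[w] is None:
                    self.first[w] = i

    def fetch(self, start, end):
        """Return features overlapping the half-open region [start, end)."""
        if end <= start:
            return []
        last = min((end - 1) // self.window, len(self.first) - 1)
        candidates = (
            self.first[w]
            for w in range(start // self.window, last + 1)
            if self.first[w] is not None
        )
        begin = next(candidates, None)
        if begin is None:
            return []
        hits = []
        for feat in self.features[begin:]:
            if feat.start >= end:  # sorted by start, so nothing later can overlap
                break
            if feat.end > start:
                hits.append(feat)
        return hits


index = ToyLinearIndex([
    Feature(100, 200, "a"),
    Feature(150, 20_000, "b"),
    Feature(16_394, 16_434, "c"),
    Feature(40_000, 40_100, "d"),
])
print([f.name for f in index.fetch(16_000, 16_400)])
```

A query near the end of the contig, such as `index.fetch(40_050, 40_060)`, starts scanning at feature `d` directly rather than walking through `a`, `b`, and `c` first.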
Related Classes/Methods:
Pileup Analysis
This component is dedicated to generating and analyzing pileup data from alignment files. It handles the complex logic of iterating through genomic positions, identifying aligned reads, and detecting variations like indels and substitutions. It provides Pythonic objects to represent pileup columns and reads for further analysis.
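A pileup column is simply the set of read bases stacked over one reference position. The sketch below shows that idea in a deliberately reduced form: the function name `toy_pileup` is hypothetical, reads are given as `(reference_start, sequence)` pairs, and every read is assumed to align without gaps. It ignores CIGAR operations, indels, base qualities, and filtering, all of which a real pileup engine has to handle.

```python
from collections import Counter, defaultdict


def toy_pileup(reads):
    """Stack gapless reads into per-position base counts.

    reads: iterable of (reference_start, sequence) pairs, 0-based starts.
    Returns a dict mapping reference position -> Counter of observed bases.
    """
    columns = defaultdict(Counter)
    for start, sequence in reads:
        for offset, base in enumerate(sequence):
            columns[start + offset][base] += 1
    return dict(sorted(columns.items()))


reads = [(0, "ACGT"), (2, "GTTA"), (3, "CT")]
pileup = toy_pileup(reads)
for position, bases in pileup.items():
    depth = sum(bases.values())
    print(position, depth, dict(bases))
```

In this toy input, position 3 is covered by three reads that disagree (two `T` and one `C`), which is the kind of column a downstream caller would inspect as a candidate substitution.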
Related Classes/Methods:
External Tool Wrappers
This component provides a Pythonic wrapper and dispatch mechanism for running the external bioinformatics command-line tools samtools and bcftools. Users can call these C-based utilities directly from their Python scripts, without managing subprocesses or constructing command-line arguments by hand.
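One way to picture a dispatch mechanism of this kind is an object that turns attribute access into a subcommand and positional arguments into a command line. The sketch below is an assumption made for illustration, not pysam's implementation: the classes `Tool` and `ToolCommand` are hypothetical, and this version shells out to an executable that is assumed to be on `PATH`, which may differ from how pysam invokes the tools internally.

```python
import subprocess


class ToolCommand:
    """Hypothetical callable wrapping one subcommand of a command-line tool."""

    def __init__(self, tool, subcommand):
        self.tool = tool
        self.subcommand = subcommand

    def build_argv(self, *args):
        # Every argument is passed as its own list item, so no shell quoting is needed.
        return [self.tool, self.subcommand, *(str(a) for a in args)]

    def __call__(self, *args):
        # Assumes the tool is installed and on PATH; raises CalledProcessError on failure.
        result = subprocess.run(
            self.build_argv(*args), capture_output=True, text=True, check=True
        )
        return result.stdout


class Tool:
    """Hypothetical dispatcher: tool.sort(...) runs '<name> sort ...'."""

    def __init__(self, name):
        self.name = name

    def __getattr__(self, subcommand):
        # Only called for attributes that are not found normally.
        if subcommand.startswith("_"):
            raise AttributeError(subcommand)
        return ToolCommand(self.name, subcommand)


samtools = Tool("samtools")
print(samtools.sort.build_argv("-o", "sorted.bam", "input.bam"))
```

Calling `samtools.sort("-o", "sorted.bam", "input.bam")` would then execute the command and return its standard output as a string, provided the executable is available.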
Related Classes/Methods: