graph LR
User_Interface_Workflow_Orchestration["User Interface & Workflow Orchestration"]
Model_Management_Core_AI_Services["Model Management & Core AI Services"]
Intermediate_Data_Transformation_Refinement["Intermediate Data Transformation & Refinement"]
Output_Generation_Persistence["Output Generation & Persistence"]
Core_Utilities_Configuration["Core Utilities & Configuration"]
User_Interface_Workflow_Orchestration -- "Initiates processing" --> Model_Management_Core_AI_Services
User_Interface_Workflow_Orchestration -- "Orchestrates data flow" --> Intermediate_Data_Transformation_Refinement
User_Interface_Workflow_Orchestration -- "Manages output" --> Output_Generation_Persistence
Model_Management_Core_AI_Services -- "Provides model outputs" --> Intermediate_Data_Transformation_Refinement
Intermediate_Data_Transformation_Refinement -- "Sends processed data" --> Output_Generation_Persistence
Core_Utilities_Configuration -- "Provides configuration" --> User_Interface_Workflow_Orchestration
Core_Utilities_Configuration -- "Provides utilities" --> Model_Management_Core_AI_Services
Core_Utilities_Configuration -- "Provides utilities" --> Intermediate_Data_Transformation_Refinement
Core_Utilities_Configuration -- "Provides utilities" --> Output_Generation_Persistence
click User_Interface_Workflow_Orchestration href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//MinerU/User_Interface_Workflow_Orchestration.md" "Details"
click Model_Management_Core_AI_Services href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//MinerU/Model_Management_Core_AI_Services.md" "Details"
click Intermediate_Data_Transformation_Refinement href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//MinerU/Intermediate_Data_Transformation_Refinement.md" "Details"
click Output_Generation_Persistence href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//MinerU/Output_Generation_Persistence.md" "Details"
click Core_Utilities_Configuration href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main//MinerU/Core_Utilities_Configuration.md" "Details"
This architecture analysis of MinerU consolidates the findings of the Control Flow Graph (CFG) and source code analysis into five fundamental components.
The User Interface & Workflow Orchestration component is the primary entry point for users, handling command-line interface (CLI) interactions and server-side API requests. It parses user inputs, manages file paths, and orchestrates the document analysis workflow by selecting and starting either the traditional or the VLM-based processing pipeline. It is the central coordinator of overall system execution.
Related Classes/Methods:
mineru.cli.client (1:1)
mineru.cli.common (1:1)
mineru.cli.vlm_sglang_server (1:1)
mineru.backend.pipeline.pipeline_analyze (1:1)
mineru.backend.vlm.vlm_analyze (1:1)
mineru.server (1:1)
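The backend selection described above can be pictured as a small dispatch table. This is a minimal sketch under stated assumptions, not MinerU's actual code: `run_pipeline_backend`, `run_vlm_backend`, `BACKENDS`, and `analyze_document` are hypothetical names, and the returned dictionaries are placeholders for real analysis results.

```python
from pathlib import Path
from typing import Callable, Dict


def run_pipeline_backend(pdf_path: Path) -> dict:
    # Hypothetical stand-in for the traditional multi-model pipeline.
    return {"backend": "pipeline", "source": pdf_path.name}


def run_vlm_backend(pdf_path: Path) -> dict:
    # Hypothetical stand-in for the VLM-based pipeline.
    return {"backend": "vlm", "source": pdf_path.name}


# Map each user-facing backend name to the function that runs it.
BACKENDS: Dict[str, Callable[[Path], dict]] = {
    "pipeline": run_pipeline_backend,
    "vlm": run_vlm_backend,
}


def analyze_document(path: str, backend: str = "pipeline") -> dict:
    """Normalize the input path and hand it to the selected backend."""
    if backend not in BACKENDS:
        raise ValueError(
            f"unknown backend {backend!r}; expected one of {sorted(BACKENDS)}"
        )
    return BACKENDS[backend](Path(path))
```

Keeping the choice in one table means the CLI and the server can share the same entry function and differ only in how they collect `path` and `backend`.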
The Model Management & Core AI Services component manages the lifecycle of every AI model used in MinerU. It handles downloading, configuration, and local storage of the OCR, Layout, MFR, Table, and VLM models. It initializes each model once and hands out that singleton instance, so a model is not loaded repeatedly. It also encapsulates the direct inference capabilities of these "atomic" AI models.
Related Classes/Methods:
mineru.cli.models_download (1:1)
mineru.utils.models_download_utils (1:1)
mineru.backend.pipeline.model_init (1:1)
mineru.model.ocr.paddleocr2pytorch.pytorch_paddle (1:1)
mineru.model.ocr.paddleocr2pytorch.pytorchocr.tools.infer.predict_system (1:1)
mineru.model.reading_order.xycut (1:1)
mineru.model.reading_order.layout_reader (1:1)
mineru.model.table.rapid_table (1:1)
mineru.model.mfr.unimernet.Unimernet (1:1)
mineru.model.mfr.unimernet.unimernet_hf.modeling_unimernet (1:1)
mineru.model.layout.doclayout_yolo (1:1)
mineru.model.mfd.yolo_v8 (1:1)
mineru.backend.vlm.predictor (1:1)
mineru.backend.vlm.hf_predictor (1:1)
mineru.backend.vlm.sglang_client_predictor (1:1)
mineru.backend.vlm.sglang_engine_predictor (1:1)
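The singleton behavior mentioned above can be illustrated with a small registry that loads each model at most once per key. This is a sketch of the general pattern only: `ModelRegistry`, its `get` method, and the `(name, device)` cache key are assumptions made for illustration, not the interface of `mineru.backend.pipeline.model_init`.

```python
import threading
from typing import Any, Callable, Dict, Tuple


class ModelRegistry:
    """Process-wide cache holding one model instance per (name, device)."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Create the registry itself only once.
        with cls._lock:
            if cls._instance is None:
                cls._instance = super().__new__(cls)
                cls._instance._models = {}
        return cls._instance

    def get(self, name: str, device: str, loader: Callable[[], Any]) -> Any:
        """Return the cached model, calling `loader` only on first use."""
        key: Tuple[str, str] = (name, device)
        with self._lock:
            if key not in self._models:
                self._models[key] = loader()
            return self._models[key]
```

A caller would pass a loader that builds the model, for example `ModelRegistry().get("ocr", "cpu", load_ocr)` with a hypothetical `load_ocr` function; later calls with the same key reuse the loaded instance.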
The Intermediate Data Transformation & Refinement component processes and refines the raw outputs of Model Management & Core AI Services. It pre-processes blocks and spans (resolving overlaps, merging), applies "magic models" for heuristic refinement and categorization, and standardizes diverse model outputs into a consistent intermediate JSON representation. It also splits paragraphs and identifies structural elements.
Related Classes/Methods:
mineru.utils.block_pre_proc (1:1)
mineru.utils.span_pre_proc (1:1)
mineru.utils.span_block_fix (1:1)
mineru.backend.pipeline.pipeline_magic_model (1:1)
mineru.backend.vlm.vlm_magic_model (1:1)
mineru.backend.pipeline.model_json_to_middle_json (1:1)
mineru.backend.vlm.token_to_middle_json (1:1)
mineru.backend.pipeline.para_split (354:368)
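One common way to handle overlapping spans is to keep the higher-scoring span whenever two spans cover nearly the same area. The sketch below shows that idea under stated assumptions: the `bbox` and `score` field names, the 0.9 threshold, and both function names are hypothetical, and MinerU's own pre-processing may use different rules.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def overlap_ratio_of_smaller(a: BBox, b: BBox) -> float:
    """Intersection area divided by the area of the smaller box."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    smaller = min(area_a, area_b)
    return inter / smaller if smaller > 0 else 0.0


def drop_overlapping_spans(spans: List[Dict], threshold: float = 0.9) -> List[Dict]:
    """Visit spans from highest to lowest score and keep a span only if it
    does not heavily overlap a span that was already kept."""
    kept: List[Dict] = []
    for span in sorted(spans, key=lambda s: s["score"], reverse=True):
        if all(
            overlap_ratio_of_smaller(span["bbox"], k["bbox"]) < threshold
            for k in kept
        ):
            kept.append(span)
    return kept
```

Dividing by the smaller area, rather than by the union, catches the case where a small span sits entirely inside a larger one.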
The Output Generation & Persistence component takes the standardized intermediate JSON representation and transforms it into final, human-readable formats such as Markdown or structured content lists. It also reads input files (PDFs, images) and writes all processed output to storage backends, including the local file system and cloud storage (e.g., S3).
Related Classes/Methods:
mineru.backend.pipeline.pipeline_middle_json_mkcontent (1:1)
mineru.backend.vlm.vlm_middle_json_mkcontent (1:1)
mineru.data.data_reader_writer.base (1:1)
mineru.data.data_reader_writer.filebase (1:1)
mineru.data.data_reader_writer.multi_bucket_s3 (1:1)
mineru.data.data_reader_writer.s3 (1:1)
mineru.data.io.base (1:1)
mineru.data.io.http (1:1)
mineru.data.io.s3 (1:1)
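Writing to several storage backends is typically done behind one abstract interface, so the code that produces Markdown does not need to know where it ends up. The following is a minimal sketch of that pattern; `Writer` and `LocalWriter` are hypothetical names chosen for illustration and are not the classes defined in `mineru.data.data_reader_writer`.

```python
import os
from abc import ABC, abstractmethod


class Writer(ABC):
    """Abstract destination for processed output."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None:
        """Store raw bytes under a backend-relative path."""

    def write_string(self, path: str, text: str) -> None:
        # Shared convenience method: encode text, then defer to write().
        self.write(path, text.encode("utf-8"))


class LocalWriter(Writer):
    """Writes below a root directory on the local file system."""

    def __init__(self, root: str) -> None:
        self.root = root

    def write(self, path: str, data: bytes) -> None:
        full_path = os.path.join(self.root, path)
        # Create missing parent directories before writing.
        os.makedirs(os.path.dirname(full_path) or ".", exist_ok=True)
        with open(full_path, "wb") as handle:
            handle.write(data)
```

An S3-backed writer would subclass `Writer` in the same way and implement only `write`, leaving callers unchanged.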
The Core Utilities & Configuration component provides essential utility functions and manages application-wide configuration settings. It includes geometric calculations for bounding boxes, PDF and image manipulation tools, and a centralized configuration reader. Nearly all other components rely on these utilities, which keeps common behavior consistent across the system.
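A centralized configuration reader usually means a single function that merges a user-supplied file over built-in defaults, so every component sees the same values. The sketch below assumes a JSON file; `DEFAULTS`, its keys, and `load_config` are hypothetical and do not describe MinerU's actual configuration schema.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Hypothetical defaults; the real configuration keys may differ.
DEFAULTS: Dict[str, Any] = {"device": "cpu", "models_dir": "models"}


def load_config(path: str) -> Dict[str, Any]:
    """Return the defaults, overridden by values from a JSON file if it exists."""
    config = dict(DEFAULTS)
    config_file = Path(path)
    if config_file.is_file():
        config.update(json.loads(config_file.read_text(encoding="utf-8")))
    return config
```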
Related Classes/Methods: