graph LR
    CLI_Orchestrator["CLI Orchestrator"]
    Core_Pipeline_Executor["Core Pipeline Executor"]
    Gallery_Generation_Module["Gallery Generation Module"]
    External_Visualization_Exporter["External Visualization Exporter"]
    Data_Curation_Manager["Data Curation Manager"]
    Clustering_Interface["Clustering Interface"]
    CLI_Orchestrator -- "invokes" --> Core_Pipeline_Executor
    CLI_Orchestrator -- "invokes" --> Gallery_Generation_Module
    CLI_Orchestrator -- "invokes" --> External_Visualization_Exporter
    CLI_Orchestrator -- "invokes" --> Data_Curation_Manager
    CLI_Orchestrator -- "invokes" --> Clustering_Interface
    Clustering_Interface -- "invokes" --> Core_Pipeline_Executor

Details

The CLI & Public API subsystem comprises the top-level functions of the fastdup package, primarily defined in fastdup/engine.py. These functions are the direct user-facing interface to the fastdup core engine, reachable both from the command line and through programmatic API calls.

CLI Orchestrator

The primary command-line interface entry point. It parses user arguments, validates inputs, and dispatches control to other high-level fastdup functions based on the specified command.
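The parse, validate, and dispatch flow described above can be sketched with the standard library's `argparse`. This is only an illustration of the pattern: the program name `demo-cli`, the subcommands, and the handlers `run_pipeline` and `build_gallery` are hypothetical stand-ins, not fastdup's actual command-line surface.

```python
import argparse

# Hypothetical handlers standing in for the high-level functions the
# orchestrator dispatches to; they only report what they would do.
def run_pipeline(args):
    return f"run pipeline on {args.input_dir} -> {args.work_dir}"

def build_gallery(args):
    return f"build {args.kind} gallery from {args.work_dir}"

def build_parser():
    parser = argparse.ArgumentParser(prog="demo-cli")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="execute the core pipeline")
    run.add_argument("input_dir")
    run.add_argument("work_dir")
    run.set_defaults(handler=run_pipeline)

    gallery = sub.add_parser("gallery", help="generate an HTML gallery")
    gallery.add_argument("work_dir")
    # argparse rejects values outside `choices`, which covers input validation.
    gallery.add_argument("--kind", choices=["duplicates", "outliers"],
                         default="duplicates")
    gallery.set_defaults(handler=build_gallery)
    return parser

def main(argv):
    # Parse the arguments, then hand control to the handler bound to the command.
    args = build_parser().parse_args(argv)
    return args.handler(args)

print(main(["run", "images/", "out/"]))
print(main(["gallery", "out/", "--kind", "outliers"]))
```

Binding each handler with `set_defaults` keeps the dispatch table next to the argument definitions, so adding a command does not require touching `main`.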

Related Classes/Methods:

Core Pipeline Executor

Executes the main fastdup data processing and analysis pipeline. This function is the primary programmatic interface to the core engine, orchestrating the underlying computations.

Related Classes/Methods:

Gallery Generation Module

A collection of functions responsible for generating various interactive HTML galleries (duplicates, outliers, components, statistics, similarity) to visualize fastdup's analysis results.
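To make the idea of a results gallery concrete, here is a minimal sketch that renders duplicate pairs as a static HTML table. It assumes the analysis results are already available as `(path_a, path_b, similarity)` tuples. The function name `render_duplicates_gallery` and this input shape are illustrative assumptions, not fastdup's gallery API or file format.

```python
import html

def render_duplicates_gallery(pairs, title="Duplicates"):
    """Render (path_a, path_b, similarity) rows as a static HTML table."""
    rows = []
    # Show the most similar pairs first.
    for path_a, path_b, score in sorted(pairs, key=lambda r: r[2], reverse=True):
        rows.append(
            "<tr>"
            f'<td><img src="{html.escape(path_a, quote=True)}" width="160"></td>'
            f'<td><img src="{html.escape(path_b, quote=True)}" width="160"></td>'
            f"<td>{score:.3f}</td>"
            "</tr>"
        )
    return (
        f"<html><head><title>{html.escape(title)}</title></head><body>"
        f"<h1>{html.escape(title)}</h1><table>{''.join(rows)}</table>"
        "</body></html>"
    )

# Hypothetical input: two candidate duplicate pairs with similarity scores.
page = render_duplicates_gallery(
    [("a.jpg", "b.jpg", 0.91), ("c.jpg", "d.jpg", 0.99)]
)
print(page)
```

Escaping paths and titles with `html.escape` keeps unusual file names from breaking the generated markup.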

Related Classes/Methods:

External Visualization Exporter

Prepares and exports fastdup's feature vectors and metadata into a format compatible with external visualization tools like TensorBoard Projector, enabling deeper exploration of embeddings.
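As a rough illustration of what a projector-compatible format can look like, the sketch below writes the two tab-separated files that the standalone Embedding Projector can load: one vector per line, plus a single-column metadata file with one label per line and no header. The helper `export_for_projector` and its arguments are hypothetical. fastdup's own exporter may use a different mechanism, such as TensorBoard log directories.

```python
import csv
import tempfile
from pathlib import Path

def export_for_projector(vectors, labels, out_dir):
    """Write vectors.tsv and metadata.tsv in the Embedding Projector's TSV layout."""
    if len(vectors) != len(labels):
        raise ValueError("need exactly one label per vector")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    vectors_path = out / "vectors.tsv"
    metadata_path = out / "metadata.tsv"
    # One embedding per line, components separated by tabs.
    with open(vectors_path, "w", newline="") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerows(vectors)
    # Single-column metadata: one label per line and no header row.
    metadata_path.write_text("".join(f"{label}\n" for label in labels))
    return vectors_path, metadata_path

# Hypothetical two-dimensional embeddings, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    vec_path, meta_path = export_for_projector(
        [[0.1, 0.2], [0.3, 0.4]], ["cat.jpg", "dog.jpg"], tmp
    )
    vectors_text = vec_path.read_text()
    metadata_text = meta_path.read_text()

print(vectors_text)
print(metadata_text)
```

The row order is what ties the two files together: line `i` of `metadata.tsv` labels line `i` of `vectors.tsv`, which is why the length check comes first.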

Related Classes/Methods:

Data Curation Manager

Manages and modifies identified components (e.g., duplicate clusters, outlier groups) in the dataset or in fastdup's internal representation, supporting data curation workflows.
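One common curation step is to keep a representative from each duplicate component and flag the rest for removal. The sketch below plans such a cleanup as a dry run, assuming component membership is available as `(path, component_id)` pairs. The function `plan_component_cleanup` and that input shape are assumptions made for illustration, not fastdup's interface.

```python
from collections import defaultdict

def plan_component_cleanup(assignments, keep_per_component=1):
    """Split files into keep/remove lists; nothing is deleted here."""
    groups = defaultdict(list)
    for path, component_id in assignments:
        groups[component_id].append(path)

    keep, remove = [], []
    for component_id in sorted(groups):
        # Sort members so the choice of representative is deterministic.
        members = sorted(groups[component_id])
        keep.extend(members[:keep_per_component])
        remove.extend(members[keep_per_component:])
    return keep, remove

# Hypothetical assignments: component 1 holds two near-duplicates.
keep, remove = plan_component_cleanup(
    [("b.jpg", 1), ("a.jpg", 1), ("c.jpg", 2)]
)
print("keep:", keep)
print("remove:", remove)
```

Returning a plan instead of deleting files directly leaves room for review before any destructive action is taken.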

Related Classes/Methods:

Clustering Interface

Initiates and manages the KMeans clustering process on image embeddings, providing a high-level API for grouping similar images.
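For readers unfamiliar with KMeans, the following pure-Python sketch shows the underlying idea (Lloyd's algorithm) on toy two-dimensional "embeddings". It is not how fastdup performs clustering, which the diagram above delegates to the Core Pipeline Executor. The function `kmeans`, the naive seeding from the first `k` points, and the sample data are all illustrative assumptions.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centroids

# Toy embeddings forming two well-separated groups.
embeddings = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9)]
labels, centroids = kmeans(embeddings, k=2)
print(labels)
print(centroids)
```

Real image embeddings have hundreds of dimensions and far more points, so production implementations rely on optimized native code and better seeding than this sketch uses.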

Related Classes/Methods: