graph LR
    Data_Processing_Pipeline["Data Processing Pipeline"]
    Machine_Learning_Core["Machine Learning Core"]
    HPC_Workflow_Orchestration["HPC Workflow Orchestration"]
    Reporting_Results_Management["Reporting & Results Management"]
    Core_Utilities["Core Utilities"]
    Data_Processing_Pipeline -- "provides processed data to" --> Machine_Learning_Core
    Data_Processing_Pipeline -- "outputs intermediate feature data to" --> Reporting_Results_Management
    Machine_Learning_Core -- "outputs classification results and trained models to" --> Reporting_Results_Management
    HPC_Workflow_Orchestration -- "initiates" --> Data_Processing_Pipeline
    HPC_Workflow_Orchestration -- "initiates" --> Machine_Learning_Core
    Core_Utilities -- "used by" --> Data_Processing_Pipeline
    Core_Utilities -- "used by" --> Machine_Learning_Core
    click Data_Processing_Pipeline href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/neuro-forestwalk/Data_Processing_Pipeline.md" "Details"
    click Machine_Learning_Core href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/neuro-forestwalk/Machine_Learning_Core.md" "Details"
    click HPC_Workflow_Orchestration href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/neuro-forestwalk/HPC_Workflow_Orchestration.md" "Details"
    click Reporting_Results_Management href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/neuro-forestwalk/Reporting_Results_Management.md" "Details"

Details

The neuro-forestwalk project is a machine learning tool for behavioral phenotyping, built around a data pipeline and workflow architecture designed for High-Performance Computing (HPC) environments. Analysis of its Control Flow Graph (CFG) and source code identifies five central components that together carry raw DeepLabCut (DLC) tracking data through to classified behavioral phenotypes.

Data Processing Pipeline

This component is responsible for the initial ingestion of raw DeepLabCut (DLC) tracking data and associated metadata. It then transforms these raw coordinates into a rich set of quantitative behavioral features (e.g., distances, angles, event detections) and aggregates these features from multiple experimental trials into a structured dataset suitable for machine learning.
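The transformation from coordinates to features can be illustrated with a minimal sketch. It assumes a hypothetical input layout (one dict per video frame mapping a body-part name to `(x, y)` coordinates) and hypothetical body-part names (`nose`, `body_center`, `tail_base`); none of these names come from the repository, and the project's actual feature set is likely much richer.

```python
import math
from statistics import mean

# ASSUMPTION: each frame is a dict {body_part: (x, y)}; this layout and the
# body-part names used below are illustrative, not taken from the project.

def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_deg(a, vertex, c):
    """Unsigned angle (0 to 180 degrees) at `vertex` between rays to a and c."""
    v1 = (a[0] - vertex[0], a[1] - vertex[1])
    v2 = (c[0] - vertex[0], c[1] - vertex[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    return abs(math.degrees(math.atan2(cross, dot)))

def trial_features(frames):
    """Summarize one trial's frames into a flat feature dict."""
    nose_tail = [distance(f["nose"], f["tail_base"]) for f in frames]
    body_angle = [
        angle_deg(f["nose"], f["body_center"], f["tail_base"]) for f in frames
    ]
    return {
        "mean_nose_tail_dist": mean(nose_tail),
        "mean_body_angle_deg": mean(body_angle),
    }

def aggregate(trials):
    """Build one row per trial, ready to be loaded into a table."""
    return [
        {"trial": trial_id, **trial_features(frames)}
        for trial_id, frames in trials.items()
    ]
```

Each row returned by `aggregate` corresponds to one experimental trial, which matches the "one structured dataset across trials" shape described above.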

Related Classes/Methods:

Machine Learning Core

This component encompasses the core machine learning functionalities. It identifies the most discriminative behavioral features, optimizes the hyperparameters of the classification model (specifically Random Forest), trains the model using the selected features, performs predictions on new data, and evaluates the overall classification accuracy for behavioral phenotyping.
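The sequence described above (select features, tune, train, predict, evaluate) can be sketched with scikit-learn. This is an illustration under assumptions, not the project's implementation: ranking by impurity-based importance stands in for whatever selection method the repository uses, and the function name `select_train_evaluate`, the parameter grid, and the split ratio are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

def select_train_evaluate(X, y, n_keep=3, seed=0):
    """Hypothetical end-to-end sketch: select, tune, train, predict, score."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # 1. Feature selection (ASSUMPTION: importance ranking from a
    #    preliminary forest; the project may use a different criterion).
    ranker = RandomForestClassifier(n_estimators=100, random_state=seed)
    ranker.fit(X_train, y_train)
    keep = np.argsort(ranker.feature_importances_)[::-1][:n_keep]

    # 2. Hyperparameter optimization over a small illustrative grid;
    #    GridSearchCV refits the best model on the full training split.
    search = GridSearchCV(
        RandomForestClassifier(random_state=seed),
        param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
        cv=3,
    )
    search.fit(X_train[:, keep], y_train)

    # 3. Predict on held-out data and evaluate accuracy.
    y_pred = search.best_estimator_.predict(X_test[:, keep])
    return {
        "selected": sorted(keep.tolist()),
        "best_params": search.best_params_,
        "accuracy": accuracy_score(y_test, y_pred),
    }
```

Selecting features on the training split only, as done here, keeps the held-out accuracy from being inflated by information leaking from the test data.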

Related Classes/Methods:

HPC Workflow Orchestration

This component manages the entire execution flow of the neuro-forestwalk pipeline within a High-Performance Computing (HPC) environment. It handles the submission, scheduling, and monitoring of computational tasks (data processing, feature engineering, feature selection, and classification) using LSF (Load Sharing Facility) bsub scripts.
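As a rough sketch of how the four stages could be chained on LSF, the snippet below builds `bsub` command lines in which each job waits for the previous one through a `-w "done(...)"` dependency. The stage order follows the text above, but the queue name, core count, log paths, and script locations are assumptions, as is the idea that the project chains its jobs this way. The code only constructs and prints commands; it does not submit anything.

```python
import shlex

# Stage names follow the task list in the text; everything else is assumed.
STAGES = [
    "data_processing",
    "feature_engineering",
    "feature_selection",
    "classification",
]

def build_bsub_chain(stages, queue="normal", cores=4, script_dir="scripts"):
    """Return one bsub argument list per stage, each waiting on the previous.

    ASSUMPTIONS: queue name, core count, log paths, and the
    `<script_dir>/<stage>.sh` layout are hypothetical placeholders.
    """
    commands = []
    previous = None
    for stage in stages:
        cmd = [
            "bsub",
            "-J", stage,                      # job name
            "-q", queue,                      # target queue
            "-n", str(cores),                 # number of job slots
            "-o", f"logs/{stage}.%J.out",     # %J expands to the LSF job ID
        ]
        if previous is not None:
            # Start only after the previous named job finishes successfully.
            cmd += ["-w", f"done({previous})"]
        cmd.append(f"{script_dir}/{stage}.sh")
        commands.append(cmd)
        previous = stage
    return commands

for cmd in build_bsub_chain(STAGES):
    print(shlex.join(cmd))
```

Building argument lists rather than shell strings keeps quoting explicit; the lists could be passed to `subprocess.run` on a cluster where `bsub` is available.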

Related Classes/Methods:

Reporting & Results Management

This component is responsible for the persistent storage, organization, and accessibility of all intermediate and final outputs generated by the analysis pipeline. This includes raw feature dataframes, lists of selected features, optimized model hyperparameters, classification predictions, and performance metrics.
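One way to organize these outputs is a directory per run with one file per artifact. The sketch below uses only the standard library; the function name `save_results`, the file names, and the directory layout are hypothetical, and the values written in the usage section are placeholders rather than real results.

```python
import csv
import json
import tempfile
from pathlib import Path

def save_results(root, run_name, selected_features, best_params,
                 predictions, metrics):
    """Write each artifact of one run to its own file under root/run_name.

    ASSUMPTION: file names and layout are illustrative, not the project's.
    """
    run_dir = Path(root) / run_name
    run_dir.mkdir(parents=True, exist_ok=True)

    # JSON for small structured artifacts.
    for name, payload in [
        ("selected_features.json", selected_features),
        ("best_params.json", best_params),
        ("metrics.json", metrics),
    ]:
        (run_dir / name).write_text(json.dumps(payload, indent=2))

    # CSV for row-oriented predictions.
    with open(run_dir / "predictions.csv", "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["trial", "predicted_label"])
        writer.writerows(predictions)
    return run_dir

# Usage with placeholder values, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = save_results(
        tmp,
        "example_run",
        selected_features=["feature_a", "feature_b"],
        best_params={"n_estimators": 100, "max_depth": None},
        predictions=[("trial_1", "group_x"), ("trial_2", "group_y")],
        metrics={"accuracy": 0.9},
    )
    written = sorted(p.name for p in run_dir.iterdir())
    reloaded_metrics = json.loads((run_dir / "metrics.json").read_text())

print(written)
```

Keeping every artifact in a plain-text format makes intermediate outputs easy to inspect between pipeline stages.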

Related Classes/Methods:

Core Utilities

This component provides general-purpose helper functions shared across the project, covering data manipulation, list processing (e.g., ripristinate_lists), ranking (get_ranks, get_final_ranks), and statistical helpers (mode_with_none).

Related Classes/Methods:

  • ripristinate_lists
  • get_ranks
  • get_final_ranks
  • mode_with_none