Research Software

Software projects developed to support my research activities, often accompanying published papers.

Active Projects

🔬 6DIMCOCO: Multi-dimensional CLIP Training Framework 

Status: Active development | Language: Python | Type: Machine Learning Research

Research framework for training CLIP models with novel n-dimensional loss functions, plus tools for analysing the resulting models and embeddings.

Key Features:

Multi-dimensional CLIP Training: 3D, 4D, 6D, and custom dimensional configurations
Novel Loss Functions: 18+ loss function variants, implemented with attention to numerical stability
CKA Analysis: Deep model comparison and understanding through Centered Kernel Alignment
Cross-modal Learning: Image-text and multilingual capabilities (Chinese-English translation)
GPU Acceleration: CuPy implementation for large tensor operations
Type-safe Configuration: Reproducible experiment management with dataclass-based configs
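
As a rough illustration of the dataclass-based configuration idea, here is a minimal sketch using only the Python standard library. The class name `TrainConfig` and every field in it are hypothetical and are not taken from the 6DIMCOCO codebase; the runtime type check is one possible way to approximate "type-safe" behaviour, since plain dataclasses do not enforce annotations on their own.

```python
import json
from dataclasses import asdict, dataclass
from typing import get_type_hints


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical experiment config; field names are illustrative only."""

    dims: int = 6                  # number of embedding axes compared in the loss
    loss_variant: str = "baseline"  # placeholder name, not a real 6DIMCOCO loss
    learning_rate: float = 1e-4
    batch_size: int = 64
    seed: int = 42

    def __post_init__(self):
        # Dataclasses do not check annotations at runtime, so do it explicitly.
        for name, expected in get_type_hints(type(self)).items():
            value = getattr(self, name)
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, got {type(value).__name__}"
                )
        if self.dims < 2:
            raise ValueError("dims must be at least 2")

    def to_json(self) -> str:
        # Sorted keys give a stable serialisation that can be logged or hashed.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "TrainConfig":
        return cls(**json.loads(text))
```

Storing the JSON string next to each run's outputs lets the same configuration be rebuilt later with `TrainConfig.from_json`, which is the reproducibility property the feature list refers to.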

Technical Highlights:

Test suite with 95%+ coverage, aimed at verifying mathematical correctness
Advanced visualization tools for 6D embedding analysis with hypercube representations
PyTorch Lightning integration for distributed training across multiple GPUs
Weights & Biases integration for experiment tracking and hyperparameter optimization

Installation:

git clone https://github.com/st7ma784/6DIMCOCO.git
cd 6DIMCOCO
pip install -r requirements.txt
pip install -e .

Links:

Repository: github.com/st7ma784/6DIMCOCO
Documentation: Complete Sphinx documentation with API reference

🛡️ PGDVisualisation: Adversarial Attack Visualization 

Status: Active development | Language: Python/JavaScript | Type: Security Research

Interactive visualization framework for understanding Projected Gradient Descent (PGD) adversarial attacks on machine learning models.

Key Features:

Advanced Attack Implementation: Multi-modal PGD attacks on CLIP models with text and image perturbations
Real-time Visualization: Interactive, draggable web interface for exploring attacks
Comprehensive Evaluation: Linear probe analysis of clean vs. adversarial features
Attack Variants: Support for PGD, C&W, AutoAttack, and custom text-based attacks
Performance Analysis: Batch processing with GPU acceleration for large-scale experiments
Docker Integration: Containerized visualization environment
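
For readers unfamiliar with PGD, the core loop can be sketched in a few lines. The example below is a toy under stated assumptions: it attacks a linear logistic classifier with an analytic gradient instead of a CLIP model, uses an L-infinity constraint, and omits the clipping to a valid pixel range that an image attack would need. The function names are hypothetical and this is not the PGDVisualisation implementation.

```python
import math


def logistic_loss_grad(x, w, b, y):
    """Gradient w.r.t. x of log(1 + exp(-y * (w.x + b))), with y in {-1, +1}."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    coeff = -y / (1.0 + math.exp(margin))  # equals -y * sigmoid(-margin)
    return [coeff * wi for wi in w]


def sign(value):
    return (value > 0) - (value < 0)


def pgd_linf(x0, w, b, y, epsilon=0.1, alpha=0.02, steps=10):
    """L-infinity PGD: signed gradient ascent on the loss, then projection."""
    x = list(x0)
    for _ in range(steps):
        grad = logistic_loss_grad(x, w, b, y)
        # Ascent step of size alpha in the direction that increases the loss.
        x = [xi + alpha * sign(g) for xi, g in zip(x, grad)]
        # Project back into the epsilon-ball around the clean input x0.
        x = [min(max(xi, oi - epsilon), oi + epsilon) for xi, oi in zip(x, x0)]
    return x
```

Here `epsilon` bounds the total perturbation, `alpha` is the per-step size, and `steps` is the number of iterations. For example, `pgd_linf([1.0, -0.5], [2.0, -1.0], 0.0, 1)` returns a point within 0.1 of the input in every coordinate whose classifier score is lower than that of the clean input.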

Technical Highlights:

PyTorch Lightning-based training with multi-GPU support and gradient accumulation
Thread-safe result collection using queue-based architecture for concurrent processing
Advanced attack parameterization with configurable alpha, epsilon, and step parameters
Classifier analysis comparing the performance of clean, dirty (adversarially perturbed), and general models
Web-based interface with Flask backend for real-time attack visualization
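
The queue-based result collection mentioned above follows a common pattern, sketched here with the standard library only. The helper name `run_workers` and its arguments are hypothetical; the actual project may organise its workers differently.

```python
import queue
import threading


def run_workers(tasks, worker_fn, num_workers=4):
    """Run worker_fn over tasks in threads and collect results through a queue."""
    task_q = queue.Queue()
    result_q = queue.Queue()
    for task in tasks:
        task_q.put(task)

    def worker():
        while True:
            try:
                item = task_q.get_nowait()
            except queue.Empty:
                return  # no work left for this thread
            # queue.Queue does its own locking, so no explicit lock is needed.
            result_q.put((item, worker_fn(item)))

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All producers have finished, so draining until empty is safe here.
    results = {}
    while not result_q.empty():
        item, output = result_q.get()
        results[item] = output
    return results
```

Each result is stored together with the task that produced it, so the final dictionary does not depend on the order in which the threads finish.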

Research Applications:

Adversarial robustness evaluation for vision-language models
Attack transferability analysis across different model architectures
Security assessment of multimodal AI systems

Installation:

git clone https://github.com/st7ma784/PGDVisualisation.git
cd PGDVisualisation
pip install -r requirements.txt
# For web interface
docker pull st7ma784/vis

⚡ ML-SLURM-Template: HPC Machine Learning Framework 

Status: Stable release | Language: Python | Type: Research Infrastructure

Comprehensive template for deploying machine learning experiments on SLURM-based high-performance computing clusters with automated hyperparameter optimization.

Key Features:

HPC Integration: Seamless SLURM cluster deployment with automated job submission and resource management
Experiment Tracking: Native integration with Weights & Biases, Neptune, and Azure ML for comprehensive experiment monitoring
PyTorch Lightning Framework: Professional ML training pipeline with multi-GPU and multi-node distributed training support
Hyperparameter Optimization: Automated hyperparameter sweeps with intelligent trial generation and resource allocation
Cloud Deployment: Azure Machine Learning integration with notebook-based experiment management
CKA Analysis: Built-in Centered Kernel Alignment tools for model comparison and representation analysis
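
To make the CKA feature concrete, below is a minimal sketch of linear Centered Kernel Alignment in pure Python. It assumes two activation matrices given as lists of rows, one row per example, and uses the linear-kernel form; the function names are illustrative and are not the template's API. A real implementation would use NumPy, CuPy, or PyTorch for speed.

```python
import math


def _center_columns(matrix):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - mu for v, mu in zip(row, means)] for row in matrix]


def _cross_sq_norm(a, b):
    """Squared Frobenius norm of a^T b for matrices with matching row counts."""
    cols_a = list(zip(*a))
    cols_b = list(zip(*b))
    return sum(
        sum(u * v for u, v in zip(ca, cb)) ** 2 for ca in cols_a for cb in cols_b
    )


def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    x = _center_columns(x)
    y = _center_columns(y)
    numerator = _cross_sq_norm(y, x)
    denominator = math.sqrt(_cross_sq_norm(x, x) * _cross_sq_norm(y, y))
    # The denominator is zero if either input has no variance across examples.
    return numerator / denominator
```

The score lies between 0 and 1 and is unchanged by isotropic scaling or orthogonal transformation of either representation, which is why it is convenient for comparing layers of different models.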

Technical Highlights:

Automated data module system with COCO dataset integration and smart downloading capabilities
FSDP (Fully Sharded Data Parallel) strategy for training large models across multiple nodes
Comprehensive SLURM environment detection and configuration for different HPC systems (BEDE, N8, etc.)
Type-safe argument parsing with test-tube integration for systematic hyperparameter exploration
Docker and Conda environment management with reproducible dependency specification
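
SLURM environment detection usually amounts to reading the variables that SLURM sets inside a job. The sketch below shows one way to do that; the `SlurmInfo` class and `detect_slurm` function are hypothetical names, and the defaults chosen for missing variables are assumptions, not the template's behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class SlurmInfo:
    job_id: str
    num_nodes: int
    num_tasks: int
    rank: int        # global task index within the job
    local_rank: int  # task index on the current node


def detect_slurm(env: Optional[Mapping[str, str]] = None) -> Optional[SlurmInfo]:
    """Return job details when running under SLURM, otherwise None."""
    env = os.environ if env is None else env
    if "SLURM_JOB_ID" not in env:
        return None  # not inside a SLURM allocation
    # SLURM_NNODES is the older alias of SLURM_JOB_NUM_NODES.
    nodes = env.get("SLURM_JOB_NUM_NODES", env.get("SLURM_NNODES", "1"))
    return SlurmInfo(
        job_id=env["SLURM_JOB_ID"],
        num_nodes=int(nodes),
        num_tasks=int(env.get("SLURM_NTASKS", "1")),
        rank=int(env.get("SLURM_PROCID", "0")),
        local_rank=int(env.get("SLURM_LOCALID", "0")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to test off-cluster, for example `detect_slurm({"SLURM_JOB_ID": "123", "SLURM_NTASKS": "8"})`.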

Research Applications:

Large-scale deep learning experiments requiring distributed training
Systematic hyperparameter optimization across HPC resources
Model architecture comparison using CKA analysis
Cloud-to-cluster workflow integration for academic research

Installation:

git clone https://github.com/st7ma784/MLslurmtemplate.git
cd MLslurmtemplate
pip install -r requirements.txt
# Configure SLURM parameters in Launch.py
python Launch.py --dir /path/to/data --num_trials 10

Links:

Repository: github.com/st7ma784/MLslurmtemplate
Documentation: Comprehensive README with step-by-step HPC deployment guide

🌌 JERICHO: Hybrid Plasma Simulation Framework 

Status: Active development | Language: C++/Python | Type: Scientific Computing

Hybrid plasma simulation framework for studying magnetospheric plasma dynamics in planetary systems, particularly the magnetospheres of Jupiter and Saturn.

Key Features:

Hybrid Plasma Modeling: Ions treated kinetically with full particle dynamics, electrons treated as a massless fluid governed by MHD equations
Multi-Language Implementation: Production C++ core with comprehensive Python implementation (PyJericho) for accessibility
MPI Parallelization: Distributed computing support with automatic domain decomposition and load balancing
GPU Acceleration: CUDA/CuPy integration for large-scale simulations, with automatic fallback to CPU
Advanced Physics: Self-consistent electromagnetic fields with predictor-corrector magnetic field evolution
Web API Interface: Flask-based REST API for remote simulation management and configuration
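
For context, the field equations commonly used in hybrid models of this kind are shown below. This is the textbook formulation, assuming quasi-neutrality, singly charged ions of number density n and bulk velocity u_i, a scalar electron pressure p_e, and no resistivity; the exact form used in JERICHO may differ.

```latex
\begin{aligned}
\mathbf{J} &= \frac{1}{\mu_0}\,\nabla \times \mathbf{B}, \\
\mathbf{E} &= -\mathbf{u}_i \times \mathbf{B}
             + \frac{\mathbf{J} \times \mathbf{B}}{e\,n}
             - \frac{\nabla p_e}{e\,n}, \\
\frac{\partial \mathbf{B}}{\partial t} &= -\nabla \times \mathbf{E}.
\end{aligned}
```

The electric field follows from the electron momentum equation in the limit of zero electron mass, so it is computed from the ion moments and the magnetic field and is not advanced in time itself.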

Technical Highlights:

Multiple particle-pushing algorithms (Boris A/B, Wiggs), selectable for numerical stability
Comprehensive boundary condition support (periodic, hard wall, outflow, inflow, MPI)
Real-time visualization and analysis tools with HDF5 output format
Production-quality scientific computing with extensive validation against theoretical models
Sophisticated field interpolation using particle-in-cell methods with ghost point handling
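
As an illustration of what a particle pusher does, here is a minimal sketch of the textbook non-relativistic Boris scheme in pure Python. It is not necessarily either of the "Boris A/B" variants named above, and the function signature is an assumption made for this example.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def add(a, b):
    return tuple(ai + bi for ai, bi in zip(a, b))


def scale(a, factor):
    return tuple(ai * factor for ai in a)


def boris_push(x, v, e_field, b_field, q_over_m, dt):
    """Advance position x and velocity v by one time step dt."""
    half = 0.5 * q_over_m * dt
    # First half of the electric acceleration.
    v_minus = add(v, scale(e_field, half))
    # Magnetic rotation, built from two cross products so that |v| is preserved.
    t = scale(b_field, half)
    t_sq = sum(ti * ti for ti in t)
    s = scale(t, 2.0 / (1.0 + t_sq))
    v_prime = add(v_minus, cross(v_minus, t))
    v_plus = add(v_minus, cross(v_prime, s))
    # Second half of the electric acceleration.
    v_new = add(v_plus, scale(e_field, half))
    x_new = add(x, scale(v_new, dt))
    return x_new, v_new
```

With a zero electric field the rotation step leaves the particle speed unchanged up to rounding error, which is the stability property that makes Boris-type schemes popular in particle-in-cell codes.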

Research Applications:

Analysis of plasma escape mechanisms in Saturn's magnetosphere
Modeling of magnetospheric dynamics in the Jovian system
Plasma-moon interaction studies (Enceladus, Io)
Comparative studies of planetary magnetospheres

Installation:

# C++ version
git clone https://github.com/st7ma784/Jericho.git
cd Jericho
python download-dependencies.py  # Auto-install dependencies
make

# Python version (PyJericho)
cd pythonver
./pyjericho.sh setup
# Start web API
./pyjericho.sh api

Links:

Repository: github.com/st7ma784/Jericho
Documentation: Complete Doxygen documentation with API reference
Web Interface: Interactive configuration and job management system

Archived Projects

🗄️ Legacy Research Tools 

Status: Archived | Language: MATLAB | Type: Historical Reference

Early research tools developed during graduate studies for prototype development and proof-of-concept implementations.

Note: These projects are no longer actively maintained but remain available for reference and historical purposes.

Associated Publications:

Historical research papers from graduate work
