Research Software

Software projects developed to support my research activities, often accompanying published papers.

Active Projects

🔬 6DIMCOCO: Multi-dimensional CLIP Training Framework

Status: Active development | Language: Python | Type: Machine Learning Research

Research framework for training CLIP models with n-dimensional loss functions, with tooling for analysing the learned representations.

Key Features:

  • Multi-dimensional CLIP Training: 3D, 4D, 6D, and custom dimensional configurations

  • Novel Loss Functions: More than 18 loss function variants, implemented with numerical stability in mind

  • CKA Analysis: Deep model comparison and understanding through Centered Kernel Alignment

  • Cross-modal Learning: Image-text and multilingual capabilities (Chinese-English translation)

  • GPU Acceleration: CuPy implementation for large tensor operations

  • Type-safe Configuration: Reproducible experiment management with dataclass-based configs
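As an illustration of the dataclass-based configuration item above, here is a minimal sketch of how an experiment config could be declared, validated, and serialized. The class name `ExperimentConfig` and all of its fields are hypothetical and are not taken from the 6DIMCOCO code. Note that dataclasses do not enforce type hints at runtime; the "type-safe" part assumes a static checker such as mypy is run over the code.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical experiment settings; field names are illustrative only."""

    dims: int = 6                    # number of embedding axes compared in the loss
    loss_variant: str = "baseline"   # placeholder label, not a real 6DIMCOCO option
    learning_rate: float = 1e-4
    batch_size: int = 64
    seed: int = 0

    def __post_init__(self) -> None:
        # Fail early on values that would make a run meaningless.
        if self.dims < 2:
            raise ValueError("dims must be at least 2")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")

    def to_json(self) -> str:
        # Sorted keys give a stable string that can be hashed or diffed.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "ExperimentConfig":
        return cls(**json.loads(text))


# Round trip: the restored config compares equal to the original.
config = ExperimentConfig(dims=4, seed=123)
restored = ExperimentConfig.from_json(config.to_json())
```

Freezing the dataclass means a config cannot be mutated after a run starts, so the JSON stored alongside the results describes exactly what was executed.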

Technical Highlights:

  • Comprehensive test suite with 95%+ coverage, aimed at verifying mathematical correctness

  • Advanced visualization tools for 6D embedding analysis with hypercube representations

  • PyTorch Lightning integration for distributed training across multiple GPUs

  • Weights & Biases integration for experiment tracking and hyperparameter optimization

Installation:

git clone https://github.com/st7ma784/6DIMCOCO.git
cd 6DIMCOCO
pip install -r requirements.txt
pip install -e .

πŸ›‘οΈ PGDVisualisation: Adversarial Attack Visualization

Status: Active development | Language: Python/JavaScript | Type: Security Research

Interactive visualization framework for understanding Projected Gradient Descent (PGD) adversarial attacks on machine learning models.

Key Features:

  • Advanced Attack Implementation: Multi-modal PGD attacks on CLIP models with text and image perturbations

  • Real-time Visualization: Interactive web interface for draggable attack visualization

  • Comprehensive Evaluation: Linear probe analysis of clean vs. adversarial features

  • Attack Variants: Support for PGD, C&W, AutoAttack, and custom text-based attacks

  • Performance Analysis: Batch processing with GPU acceleration for large-scale experiments

  • Docker Integration: Containerized visualization environment
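For readers unfamiliar with PGD, the core loop can be sketched on a toy model. The example below is an assumption-laden stand-in, not the project's implementation: it attacks a three-feature logistic-regression model with an analytic gradient instead of a CLIP model, and the function name `pgd_linf` and all numeric values are made up for illustration. Clipping to a valid input range (for example pixel bounds) is omitted.

```python
import numpy as np


def pgd_linf(x, y, w, b, epsilon=0.1, alpha=0.02, steps=10):
    """L-infinity PGD against p = sigmoid(w . x + b), with label y in {0, 1}."""
    x_adv = x.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))
        # Gradient of the binary cross-entropy loss with respect to the input.
        grad = (p - y) * w
        # Ascent step of size alpha along the sign of the gradient.
        x_adv = x_adv + alpha * np.sign(grad)
        # Projection back onto the epsilon-ball around the clean input.
        x_adv = np.clip(x_adv, x - epsilon, x + epsilon)
    return x_adv


# Toy model and input (all values are illustrative).
w = np.array([1.5, -2.0, 0.5])
b = 0.1
x_clean = np.array([0.2, -0.1, 0.4])
y_true = 1.0

x_adv = pgd_linf(x_clean, y_true, w, b)
clean_logit = float(x_clean @ w + b)
adv_logit = float(x_adv @ w + b)
```

Here `epsilon` bounds the total perturbation, `alpha` is the per-step size, and `steps` is the iteration count. With these values the logit for the true class drops from 0.8 to about 0.4 while every feature stays within 0.1 of its clean value.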

Technical Highlights:

  • PyTorch Lightning-based training with multi-GPU support and gradient accumulation

  • Thread-safe result collection using queue-based architecture for concurrent processing

  • Advanced attack parameterization with configurable alpha, epsilon, and step parameters

  • Classifier analysis comparing clean, dirty, and general model performance

  • Web-based interface with Flask backend for real-time attack visualization
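The queue-based result collection mentioned above can be sketched with the standard library alone. This is a generic pattern under stated assumptions, not the project's code: `run_attack` is a hypothetical placeholder for one attack evaluation, and the worker count is arbitrary.

```python
import queue
import threading


def run_attack(job_id: int) -> dict:
    # Hypothetical stand-in for evaluating one attack configuration.
    return {"job": job_id, "score": job_id * job_id}


def worker(jobs: queue.Queue, results: queue.Queue) -> None:
    # Pull jobs until the queue is drained; Queue handles the locking.
    while True:
        try:
            job_id = jobs.get_nowait()
        except queue.Empty:
            return
        results.put(run_attack(job_id))


def collect(n_jobs: int, n_workers: int = 4) -> list:
    jobs: queue.Queue = queue.Queue()
    results: queue.Queue = queue.Queue()
    for job_id in range(n_jobs):
        jobs.put(job_id)

    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(n_workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    # All workers have finished, so draining the queue here is race-free.
    collected = []
    while not results.empty():
        collected.append(results.get())
    # Completion order is nondeterministic; sort for a stable result.
    return sorted(collected, key=lambda record: record["job"])


collected = collect(8)
```

Workers never touch a shared list directly, so no explicit lock is needed; `queue.Queue` serializes access on both the job side and the result side.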

Research Applications:

  • Adversarial robustness evaluation for vision-language models

  • Attack transferability analysis across different model architectures

  • Security assessment of multimodal AI systems

Installation:

git clone https://github.com/st7ma784/PGDVisualisation.git
cd PGDVisualisation
pip install -r requirements.txt
# For web interface
docker pull st7ma784/vis

⚡ ML-SLURM-Template: HPC Machine Learning Framework

Status: Stable release | Language: Python | Type: Research Infrastructure

Comprehensive template for deploying machine learning experiments on SLURM-based high-performance computing clusters with automated hyperparameter optimization.

Key Features:

  • HPC Integration: Seamless SLURM cluster deployment with automated job submission and resource management

  • Experiment Tracking: Native integration with Weights & Biases, Neptune, and Azure ML for comprehensive experiment monitoring

  • PyTorch Lightning Framework: Professional ML training pipeline with multi-GPU and multi-node distributed training support

  • Hyperparameter Optimization: Automated hyperparameter sweeps with intelligent trial generation and resource allocation

  • Cloud Deployment: Azure Machine Learning integration with notebook-based experiment management

  • CKA Analysis: Built-in Centered Kernel Alignment tools for model comparison and representation analysis
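To make the CKA item above concrete, here is a minimal NumPy sketch of linear Centered Kernel Alignment between two feature matrices. It is the standard linear form of the measure and is not necessarily how the template's built-in tools compute it; the function name `linear_cka` and the random test data are assumptions for illustration.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA between two (samples, features) matrices with equal row counts."""
    # Centre each feature column.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # CKA = ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


rng = np.random.default_rng(0)
feats_a = rng.normal(size=(256, 32))

# A rotated and rescaled copy of feats_a: same representation up to
# an orthogonal transform and a global scale.
rotation, _ = np.linalg.qr(rng.normal(size=(32, 32)))
feats_b = 3.0 * feats_a @ rotation

# Unrelated random features with a different width.
feats_c = rng.normal(size=(256, 16))

cka_same = linear_cka(feats_a, feats_b)
cka_diff = linear_cka(feats_a, feats_c)
```

Linear CKA is invariant to orthogonal transforms and isotropic scaling, so `cka_same` is 1 up to floating-point error, while the unrelated features score lower. The two layers being compared may have different widths, which is what makes the measure convenient for comparing architectures.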

Technical Highlights:

  • Automated data module system with COCO dataset integration and smart downloading capabilities

  • FSDP (Fully Sharded Data Parallel) strategy for training large models across multiple nodes

  • Comprehensive SLURM environment detection and configuration for different HPC systems (BEDE, N8, etc.)

  • Type-safe argument parsing with test-tube integration for systematic hyperparameter exploration

  • Docker and Conda environment management with reproducible dependency specification
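SLURM environment detection of the kind described above usually comes down to reading the variables that SLURM exports into each task. The sketch below is an assumption about how such detection might look, not the template's code: the `SlurmInfo` class and `detect_slurm` function are hypothetical names, while the `SLURM_*` variables themselves are standard.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SlurmInfo:
    """Hypothetical container for the values a launcher typically needs."""

    job_id: Optional[str]
    num_nodes: int
    world_size: int
    global_rank: int
    local_rank: int

    @property
    def in_slurm(self) -> bool:
        # SLURM_JOB_ID is only set inside an allocation.
        return self.job_id is not None


def detect_slurm(env: Mapping[str, str] = os.environ) -> SlurmInfo:
    # Defaults describe a single local process when no scheduler is present.
    return SlurmInfo(
        job_id=env.get("SLURM_JOB_ID"),
        num_nodes=int(env.get("SLURM_NNODES", "1")),
        world_size=int(env.get("SLURM_NTASKS", "1")),
        global_rank=int(env.get("SLURM_PROCID", "0")),
        local_rank=int(env.get("SLURM_LOCALID", "0")),
    )


# Example with a fake environment, so it runs outside a cluster too.
fake_env = {
    "SLURM_JOB_ID": "4242",
    "SLURM_NNODES": "2",
    "SLURM_NTASKS": "8",
    "SLURM_PROCID": "5",
    "SLURM_LOCALID": "1",
}
info = detect_slurm(fake_env)
local = detect_slurm({})
```

Passing the environment as a mapping keeps the function testable on a laptop; on a cluster it would be called with no arguments and read `os.environ`.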

Research Applications:

  • Large-scale deep learning experiments requiring distributed training

  • Systematic hyperparameter optimization across HPC resources

  • Model architecture comparison using CKA analysis

  • Cloud-to-cluster workflow integration for academic research

Installation:

git clone https://github.com/st7ma784/MLslurmtemplate.git
cd MLslurmtemplate
pip install -r requirements.txt
# Configure SLURM parameters in Launch.py
python Launch.py --dir /path/to/data --num_trials 10

🌌 JERICHO: Hybrid Plasma Simulation Framework

Status: Active development | Language: C++/Python | Type: Scientific Computing

Hybrid plasma simulation framework for studying magnetospheric plasma dynamics in planetary systems, particularly the magnetospheres of Jupiter and Saturn.

Key Features:

  • Hybrid Plasma Modeling: Ions are treated kinetically with full particle dynamics; electrons are treated as a massless fluid using MHD equations

  • Multi-Language Implementation: Production C++ core with comprehensive Python implementation (PyJericho) for accessibility

  • MPI Parallelization: Distributed computing support with automatic domain decomposition and load balancing

  • GPU Acceleration: CUDA/CuPy integration with automatic fallback to CPU for large-scale simulations

  • Advanced Physics: Self-consistent electromagnetic fields with predictor-corrector magnetic field evolution

  • Web API Interface: Flask-based REST API for remote simulation management and configuration
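The automatic CPU fallback listed under GPU Acceleration is commonly written as a guarded import. The following is a sketch of that general pattern, assuming nothing about PyJericho's actual module layout; the helper `kinetic_energy` is a made-up example of array code that runs unchanged on either backend.

```python
try:
    import cupy as xp   # GPU arrays, if CuPy is installed

    xp.zeros(1)         # touch the device; raises if no usable GPU is present
    ON_GPU = True
except Exception:
    import numpy as xp  # same array API on the CPU

    ON_GPU = False


def kinetic_energy(mass: float, velocities) -> float:
    """Total kinetic energy of equal-mass particles, on whichever backend loaded."""
    v = xp.asarray(velocities, dtype=xp.float64)
    # .item() converts a NumPy scalar or a CuPy 0-d array to a Python float.
    return (0.5 * mass * xp.sum(v * v)).item()


# Two particles with speeds 1 and 2: 0.5 * 2.0 * (1 + 4)
energy = kinetic_energy(2.0, [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
```

Catching a broad `Exception` is deliberate here: CuPy can import successfully and still fail on first use when no CUDA device or driver is available.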

Technical Highlights:

  • Multiple particle pushing algorithms (Boris A/B, Wiggs) for optimal numerical stability

  • Comprehensive boundary condition support (periodic, hard wall, outflow, inflow, MPI)

  • Real-time visualization and analysis tools with HDF5 output format

  • Production-quality scientific computing with extensive validation against theoretical models

  • Sophisticated field interpolation using particle-in-cell methods with ghost point handling
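As background for the particle pushers listed above, this is a sketch of the textbook Boris scheme for one particle in given fields. It is not JERICHO's implementation and does not represent the Boris A/B or Wiggs variants; the function name, the normalized units, and the field values are assumptions for illustration.

```python
import numpy as np


def boris_push(x, v, e_field, b_field, q_over_m, dt):
    """Advance position and velocity by one step with the standard Boris scheme."""
    # First half of the electric acceleration.
    v_minus = v + 0.5 * q_over_m * dt * e_field
    # Rotation about the magnetic field direction.
    t = 0.5 * q_over_m * dt * b_field
    s = 2.0 * t / (1.0 + np.dot(t, t))
    v_prime = v_minus + np.cross(v_minus, t)
    v_plus = v_minus + np.cross(v_prime, s)
    # Second half of the electric acceleration.
    v_new = v_plus + 0.5 * q_over_m * dt * e_field
    x_new = x + v_new * dt
    return x_new, v_new


# Gyration in a uniform magnetic field with no electric field
# (normalized units, illustrative values).
x = np.zeros(3)
v = np.array([1.0, 0.0, 0.0])
e_field = np.zeros(3)
b_field = np.array([0.0, 0.0, 1.0])

for _ in range(1000):
    x, v = boris_push(x, v, e_field, b_field, q_over_m=1.0, dt=0.1)

speed = float(np.linalg.norm(v))
```

The magnetic step is a pure rotation, so with no electric field the speed stays at its initial value up to rounding error even over many steps. That property is the usual reason for preferring this scheme over a naive explicit update.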

Research Applications:

  • Analysis of plasma escape mechanisms in Saturn's magnetosphere

  • Modeling of magnetospheric dynamics in the Jovian system

  • Plasma-moon interaction studies (Enceladus, Io)

  • Comparative studies of magnetospheres across planets

Installation:

# C++ version
git clone https://github.com/st7ma784/Jericho.git
cd Jericho
python download-dependencies.py  # Auto-install dependencies
make

# Python version (PyJericho)
cd pythonver
./pyjericho.sh setup
# Start web API
./pyjericho.sh api

Links:

  • Repository: github.com/st7ma784/Jericho

  • Documentation: Complete Doxygen documentation with API reference

  • Web Interface: Interactive configuration and job management system


Archived Projects

πŸ—„οΈ Legacy Research Tools

Status: Archived | Language: MATLAB | Type: Historical Reference

Early research tools developed during graduate studies for prototype development and proof-of-concept implementations.

Note: These projects are no longer actively maintained but remain available for reference and historical purposes.

Associated Publications:

  • Historical research papers from graduate work


  • Your Name. “Early Research Paper.” Conference Proceedings, 2022.



Development Practices

Testing

All research software includes:

  • Unit tests with >90% coverage

  • Integration tests for key workflows

  • Continuous integration (GitHub Actions)

  • Automated testing on multiple platforms

Documentation

  • API documentation (Sphinx/pkgdown)

  • User guides and tutorials

  • Installation instructions

  • Usage examples and notebooks

Reproducibility

  • Docker containers for computational environments

  • Environment specification files (requirements.txt, environment.yml)

  • Version pinning for dependencies

  • Reproducible build processes

Performance

  • Profiling and optimization

  • Benchmarking against alternatives

  • Memory efficiency considerations

  • Parallel processing where applicable

Usage Statistics

Note: Download statistics are updated monthly from package repositories.

Project          Downloads/Month   Total Downloads   Active Users
Project Alpha    XXX               X,XXX             ~XXX
Data Toolkit     XXX               X,XXX             ~XXX
Exp Framework    XX                XXX               ~XX

Contributing

How to Contribute

  1. Issues: Report bugs or request features

  2. Pull Requests: Code contributions welcome

  3. Documentation: Help improve docs and examples

  4. Testing: Add tests or test on different platforms

Contributor Guidelines

  • Follow coding standards (PEP 8 for Python, tidyverse for R)

  • Include tests for new features

  • Update documentation

  • Add yourself to contributors list

Acknowledgments

Special thanks to contributors and collaborators:

  • Collaborator A (Institution): Major algorithmic contributions

  • Collaborator B (Company): Performance optimizations

  • Community contributors: Bug fixes and feature requests


For analysis and visualization tools related to these projects, see Data Analysis.