Projects
Here are a few projects that reflect my work across machine learning systems, generative AI, security, and applied research.
Arcane – Distributed Machine Learning CLI
Currently building Arcane, a command-line tool for managing and monitoring distributed machine learning workflows across multiple machines or GPUs. Arcane supports easy configuration, real-time resource tracking, and scalable orchestration of training tasks, and it is designed to simplify large-scale ML experimentation and deployment.
CoDSPy – AI-Powered Code Optimization System
An intelligent code optimization platform built with DSPy and Gradio. The system performs syntax inspection, semantic analysis, performance tuning, and test case generation using multi-agent reasoning with chain-of-thought (CoT) and ReAct strategies. It demonstrates a real-world application of local LLMs in developer tooling.
Billion-Scale ANN Benchmarking Dataset – Google Summer of Code 2025 (UC Santa Cruz)
Developed a billion-scale vector embedding dataset sourced from Wikipedia using open-source language models. The dataset spans dimensions of 1024, 4096, and 8192, offering a realistic benchmark for Approximate Nearest Neighbor (ANN) algorithms. This project addresses limitations in existing low-dimensional benchmarks and improves evaluation for high-dimensional retrieval systems.
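For context, ANN benchmarks of this kind are commonly scored by recall@k: the fraction of the exact nearest neighbors that an approximate index returns. The sketch below is a minimal illustration on toy 2-D data, assuming Euclidean distance; the function names and the sample vectors are hypothetical and are not taken from the project itself.

```python
import math

def exact_neighbors(query, vectors, k):
    """Return indices of the k vectors closest to query (brute force, Euclidean)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))
    return order[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Toy data standing in for high-dimensional embeddings (assumption for illustration).
vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
query = (0.1, 0.0)

truth = exact_neighbors(query, vectors, k=2)  # exact ground truth
approx = [0, 2]                               # pretend output of an ANN index
score = recall_at_k(approx, truth)
print(truth, score)
```

At billion scale the brute-force ground truth is the expensive part, which is one reason precomputed benchmark datasets are useful.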
Multi-Agent Code Evaluation System – PES Innovation Lab
Architected a multi-agent system powered by Mistral 7B for automated code review, optimization, and test case generation. Fine-tuned models for task-specific behavior and orchestrated interactions between optimizer, reviewer, and tester agents. Achieved a CodeBLEU score of 80+ and developed a live demo showcasing end-to-end automation.
KASPER – Kernel Adaptive Spline-based PDF Attack Recognizer
A deep learning architecture for PDF malware detection, designed to defend against adversarial attacks like FGSM and PGD. Achieved 98.9% accuracy and strong resilience through custom malware injection pipelines and spline-based kernel layers. Submitted to the MLCS Workshop at ECML-PKDD 2025.
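For readers unfamiliar with FGSM, the attack perturbs an input by a small step in the direction of the sign of the loss gradient. The sketch below shows the idea on a toy logistic-regression classifier with hand-picked weights; it is an assumption-laden illustration, not the KASPER architecture or its attack pipeline, and all names and values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Probability of the positive class under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(x, y, w, b, eps):
    """One FGSM step: x_adv = x + eps * sign(d loss / d x).

    For binary cross-entropy on logistic regression, the input gradient
    is (p - y) * w, where p is the predicted probability.
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Toy model and input (assumed values for illustration only).
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1

x_adv = fgsm(x, y, w, b, eps=0.25)
print(predict(x, w, b), predict(x_adv, w, b))
```

The perturbed input lowers the model's confidence in the true label. PGD repeats this step several times and projects the result back into an allowed perturbation range after each step.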