About

I’m Prathamesh Devadiga, an undergraduate researcher and AI engineer at PES University, Bangalore. I’m fascinated by the intersection of machine learning security, production AI systems, and the challenges of building trustworthy AI in the real world.

My research focuses on understanding how large language models can be broken—and more importantly, how to defend them. I investigate jailbreak vulnerabilities, data extraction risks, and the surprising ways that safety training can create new security problems. Currently, I’m working with Prof. Shawn Shan at Dartmouth College on privacy-alignment tradeoffs in LLMs, and at Lossfunk I’m investigating methods to induce robust world models in Transformers by modifying training objectives and architectural biases.

I also lead Ādhāra AI Labs, an independent research lab where we explore everything from compiler optimization with small language models to efficient neural architectures for resource-constrained environments. Our work has been presented at NeurIPS, ICIAI, and various workshops.


What I’m Working On Link to heading

Right now, I’m particularly excited about a few areas:

LLM Security & Privacy — I’m investigating how jailbreak attacks can force models to regurgitate training data, revealing critical vulnerabilities that challenge our assumptions about model scale and security. This work is showing that smaller models can actually leak more sensitive information than larger ones—a counterintuitive finding that’s reshaping how we think about model security.

World Models in Transformers — At Lossfunk, I’m investigating how to induce robust world models in Transformers by modifying training objectives and architectural biases. The goal is to mitigate structural decay in learned representations and force models toward stable, causally grounded internal states. This involves designing experiments with long-horizon future prediction and action-conditioned counterfactuals.

Low-Resource Language Modeling — Building AI systems for languages with almost no training data (like Tulu, with ~0.001% of typical training data) is incredibly challenging. I’ve explored how hard constraints and structured learning can help models learn these languages without catastrophic interference from dominant languages.

Production AI Systems — I build end-to-end LLM training and serving systems, focusing on making them fast, efficient, and secure. This includes everything from distributed training frameworks to real-time jailbreak prevention systems that can detect attacks in under 5ms.


Background Link to heading

I’m currently pursuing my B.Tech in Computer Science at PES University (GPA: 8.91/10), where I also teach as a Teaching Assistant for Machine Learning and Deep Learning courses. I’ve had the privilege of working with researchers at Dartmouth College, IIT Indore, and UC Santa Cruz (through Google Summer of Code 2025), and I’ve interned at companies like Nokia and Lossfunk.

I’ve been fortunate to receive recognition like the Amazon AI-ML Scholar award, win the Cisco ThingQbator Hackathon, and attend programs like the Oxford Machine Learning School and Cohere AI Summer School. But what I’m most proud of is the community work—mentoring 50+ teams in hackathons, organizing technical workshops, and helping build the open-source AI community in Bangalore.


Beyond Research Link to heading

When I’m not coding or writing papers, I’m usually teaching, mentoring, or organizing community events. I’ve co-instructed a 30-hour Deep Learning course, served as Head of Technology for the Entrepreneurship Club at PES, and regularly speak at FOSS United and GDSC events about topics like DSPy and LLM fine-tuning.

I believe strongly in open-source research and making AI accessible. That’s why Ādhāra AI Labs focuses on building tools that bridge the gap between cutting-edge research and production-ready applications.


Research Interests Link to heading

  • Machine Learning Security: Adversarial robustness, LLM jailbreaking, data extraction vulnerabilities, constraint-based defenses
  • World Models & Representation Learning: Inducing robust world models in Transformers, causal grounding, structural representation stability
  • Production AI Systems: Low-latency inference, distributed training, efficient serving infrastructure
  • Alignment Safety: Privacy-alignment tradeoffs, safety training vulnerabilities, defense mechanisms
  • Low-Resource NLP: Extremely low-resource languages, structured learning, Indic LLMs

Want to collaborate, discuss research, or just chat about AI? Feel free to reach out or connect on LinkedIn.