Welcome

Hi, I'm Milan Varghese

|

Master's student in AI and Machine Learning at Drexel University (GPA: 4.0). Currently working as a Research Assistant developing novel neural network initialization algorithms and extending BioMedBERT pretraining.

About Me

Current Focus

Activation Steering for LLM Safety

Using mechanistic interpretability to extract harmful direction vectors from Llama 3.1 8B, then training auxiliary MLPs to steer activations away from unsafe directions.

Mech. Interp. LLM Safety Llama

Bayesian Neural Network Initialization

Extended Bayesian Initialization to convolutional architectures using unfold operations and product quantization. 81% on MNIST with 20% data, 65% on CIFAR-10 without gradient descent.

PyTorch Neural Networks Bayesian Methods

Extended Pretraining of BioMedBERT

Extended BioMedBERT pretraining on PubMed abstracts (2020-2024) with robust BLURB evaluation (13 tasks, 10 seeds), exceeding baselines on BioASQ and PubMedQA.

NLP BERT Biomedical

I'm an AI/ML Researcher and Engineer pursuing a Master's in AI and Machine Learning at Drexel University. My research spans Bayesian neural network initialization, mechanistic interpretability for LLM safety, and biomedical NLP. Previous experience includes building enterprise RAG pipelines and deepfake systems at Ernst & Young, and backend infrastructure at Neuflo Solutions.

Expertise

Tech Stack

Machine Learning

Python PyTorch TensorFlow Keras Scikit-learn LangChain

Data & Cloud

PostgreSQL Oracle SQL Azure AWS GCP Postman

Tools & Frameworks

Docker FastAPI Flask Git Power BI Tableau

AI Development Toolkit

Claude Code Antigravity

Certifications

Azure AI Fundamentals (AI-900)
Azure Fundamentals (AZ-900)
Background

Experience & Education

Education

M.S. Artificial Intelligence and Machine Learning

Drexel University, Philadelphia, PA

Expected June 2026 | Drexel CCI Merit Scholarship Recipient

GPA: 4.0

M.Sc. Computer Science - Data Analytics

Mahatma Gandhi University, India

Graduated 2022 | Best Outgoing Student

GPA: 3.77

B.Sc. Physics

University of Kerala, India

Graduated 2020

GPA: 3.39

Work Experience

Researcher - Prof. Rezvaneh Rezapour (COOP)

Drexel University · Social NLP Lab

Current Oct 2025 - Present
  • Contributed to longitudinal study of self-stigma among Reddit users: benchmarked 7 LLMs (F1=0.856), engineered prompts for 10 stigma indicators
  • Extended BioMedBERT pretraining on PubMed abstracts (2020-2024), exceeding baselines on BioASQ & PubMedQA
  • Resolved MongoDB indexing bottleneck, migrated datasets from JSON to Parquet (2x compression)

Research Assistant - Prof. Jake R. Williams

Drexel University · CODED Labs

Current Oct 2024 - Present
  • Extended Bayesian Initialization to convolutional architectures (81% MNIST with 20% data, 65% CIFAR-10 without gradient descent)
  • Conducted multi-month hyperparameter ablation for 1.5B-parameter LLM (SAFFU architecture) with Bayesian Parameter Initialization
  • Managed GPU server operations and built training monitoring visualizations

Backend Developer

Neuflo Solutions

Sep 2023 - Aug 2024
  • Delivered backend infrastructure for NEET coaching MVP using FastAPI, PostgreSQL, Docker, and Azure
  • Curated and built comprehensive NEET question bank: collected questions, answers, and images in LaTeX format, manually verified end-to-end

AI/ML Engineer (Associate)

Ernst & Young

Jul 2022 - Aug 2023
  • Built enterprise document search using Azure Cognitive Services, Blob Storage, Form Recognizer, and GPT-3.5
  • Developed digital twin characters using LLMs and deepfake lip sync for metaverse project
  • Delivered LLM-powered chatbot and prototyped Stable Diffusion text-to-image POC

AI/ML Research Intern

Zoho Technologies

Nov 2021 - May 2022
  • Developed YOLOv3-based form-field detection models for scanned documents
  • Conducted research on LayoutLM for transformer-based document understanding
Portfolio

Featured Projects

Research Projects

Bayesian Neural Network Initialization

Research

Extended Bayesian Initialization to convolutional architectures using unfold operations and product quantization. Achieved 81% accuracy on MNIST with 20% data, 65% on CIFAR-10 without gradient descent.

PyTorch Neural Networks Bayesian Methods

Information Activation Steering for LLM Safety

Ongoing

Used mechanistic interpretability to extract harmful direction vectors from Llama 3.1 8B's residual stream, then trained lightweight auxiliary MLPs to steer activations away from unsafe directions.

Mech. Interp. LLM Safety Llama

Extended Pretraining of BioMedBERT

Ongoing

Extended BioMedBERT pretraining on PubMed abstracts (2020-2024) with robust BLURB evaluation (13 tasks, 10 seeds), exceeding published baselines on BioASQ and PubMedQA.

NLP BERT Biomedical

Academic Projects

Adversarial Attacks on Vision Models

2025

Evaluated 6 attack methods on ResNet-18 and YOLOv5: white-box PGD reduced accuracy from 83% to 0.58%; black-box patch attacks dropped classification by 17.85% and detection by 43%.

Computer Vision Security PyTorch

LLM-based Tool-Calling Agent

2025

Multi-step agentic loop where GPT-4 uses chain-of-thought reasoning to autonomously select and invoke external tools (Wikipedia, geosearch, live Google News RSS) via JSON schema-defined interfaces.

Agents GPT-4 Pydantic

LSTM from Scratch for Song Lyrics

2025

Complete LSTM network from scratch using only NumPy — gated forward/backward passes with BPTT, custom BPE tokenizer, trained on song lyrics with top-k sampling.

Deep Learning NumPy NLP

Classical ML in C++ from Scratch

2024

Multi-class Logistic Regression with softmax and SVMs with one-vs-all classification from scratch in C++ using Eigen, benchmarked on CIFAR-10 (50K train, 10K test).

C++ Eigen ML

Deep Learning Sentiment Analysis

Published

COVID-19 news video sentiment analysis using LSTM, Bi-LSTM, CNN, GRU. Published at ICITA-2021.

Deep Learning NLP
Get in Touch

Let's Work Together

Interested in AI/ML research collaboration, consulting, or just want to chat about machine learning? I'd love to hear from you.

Built with Astro, TailwindCSS, and DaisyUI.
© 2026 Milan Varghese. All rights reserved.