Welcome

Hi, I'm Milan Varghese

Master's student in AI and Machine Learning at Drexel University (GPA: 4.0). Currently working as a Research Assistant developing novel neural network initialization algorithms and extending BioMedBERT pretraining.

About Me

Current Focus

Activation Steering for LLM Safety

Using mechanistic interpretability to extract harmful direction vectors from Llama 3.1 8B, then training auxiliary MLPs to steer activations away from unsafe directions.

Mech. Interp., LLM Safety, Llama

Bayesian Neural Network Initialization

Extended Bayesian Initialization to convolutional architectures using unfold operations and product quantization. Achieved 81% accuracy on MNIST with 20% of the data and 65% on CIFAR-10 without gradient descent.

PyTorch, Neural Networks, Bayesian Methods

Extended Pretraining of BioMedBERT

Extended BioMedBERT pretraining on PubMed abstracts (2020-2024) with robust BLURB evaluation (13 tasks, 10 seeds), exceeding baselines on BioASQ and PubMedQA.

NLP, BERT, Biomedical

I'm an AI/ML Researcher and Engineer pursuing a Master's in AI and Machine Learning at Drexel University. My research spans Bayesian neural network initialization, mechanistic interpretability for LLM safety, and biomedical NLP. Previous experience includes building enterprise RAG pipelines and deepfake systems at Ernst & Young, and backend infrastructure at Neuflo Solutions.

Expertise

Tech Stack

Machine Learning

Python, PyTorch, TensorFlow, Keras, Scikit-learn, LangChain

Data & Cloud

PostgreSQL, Oracle SQL, Azure, AWS, GCP, Postman

Tools & Frameworks

Docker, FastAPI, Flask, Git, Power BI, Tableau

Agentic AI Toolkit

Claude Code, Antigravity, OpenClaw

Certifications

Azure AI Fundamentals (AI-900)
Azure Fundamentals (AZ-900)

Background

Experience & Education

Education

M.S. Artificial Intelligence and Machine Learning

Drexel University, Philadelphia, PA

Expected May 2026 | Drexel CCI Merit Scholarship Recipient

GPA: 4.0

M.Sc. Computer Science - Data Analytics

Mahatma Gandhi University, India

Graduated 2022 | Best Outgoing Student

GPA: 3.77

B.Sc. Physics

University of Kerala, India

Graduated 2020

GPA: 3.39

Work Experience

Researcher - Prof. Rezvaneh Rezapour (COOP)

Drexel University

Oct 2025 - Present
  • Contributed to a longitudinal study of self-stigma among Reddit users: benchmarked 7 LLMs (F1=0.856) and engineered prompts for 10 stigma indicators
  • Extended BioMedBERT pretraining on PubMed abstracts (2020-2024), exceeding baselines on BioASQ & PubMedQA
  • Resolved MongoDB indexing bottleneck, migrated datasets from JSON to Parquet (2x compression)

Research Assistant - Dr. Jake R. Williams

Drexel University

Oct 2024 - Present
  • Extended Bayesian Initialization to convolutional architectures (81% MNIST with 20% data, 65% CIFAR-10 without gradient descent)
  • Conducted multi-month hyperparameter ablation for 1.5B-parameter LLM (SAFFU architecture) with Bayesian Parameter Initialization
  • Managed GPU server operations and built training monitoring visualizations

AI/ML Engineer (Associate)

Ernst & Young

Jul 2022 - Aug 2023
  • Built enterprise document search using Azure Cognitive Services, Blob Storage, Form Recognizer, and GPT-3.5
  • Developed digital twin characters using LLMs and deepfake lip sync for metaverse project
  • Delivered LLM-powered chatbot and prototyped Stable Diffusion text-to-image POC

Backend Developer

Neuflo Solutions Pvt Ltd

Sep 2022 - Aug 2023
  • Delivered backend for NEET coaching MVP using FastAPI, PostgreSQL, Docker, and Azure
  • Led technical onboarding for engineering interns, reducing ramp-up time by ~30%

AI/ML Research Intern

Zoho Technologies

Nov 2021 - May 2022
  • Developed YOLOv3-based form-field detection models for scanned documents
  • Conducted research on LayoutLM for transformer-based document understanding

Portfolio

Featured Projects

Research Projects

Bayesian Neural Network Initialization

Research

Extended Bayesian Initialization to convolutional architectures using unfold operations and product quantization. Achieved 81% accuracy on MNIST with 20% data, 65% on CIFAR-10 without gradient descent.

PyTorch, Neural Networks, Bayesian Methods
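
For readers unfamiliar with unfold operations: an unfold (im2col) step flattens every sliding patch of an image into a row, so a convolution can be written as a plain matrix product. The following is a minimal, dependency-free sketch of that general idea only; it is not the project's code, it does not cover the Bayesian Initialization or product quantization steps, and the names `unfold2d` and `conv2d_via_unfold` are hypothetical.

```python
def unfold2d(image, k):
    """Return every k x k patch of a 2D list, flattened row-major.

    Assumes stride 1 and no padding (illustrative simplification).
    """
    h, w = len(image), len(image[0])
    patches = []
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            patches.append(
                [image[i + di][j + dj] for di in range(k) for dj in range(k)]
            )
    return patches


def conv2d_via_unfold(image, kernel):
    """Cross-correlate image with kernel as (patch rows) x (flattened kernel)."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    out_w = len(image[0]) - k + 1
    flat = [
        sum(p * q for p, q in zip(patch, flat_kernel))
        for patch in unfold2d(image, k)
    ]
    # Fold the flat outputs back into a 2D feature map.
    return [flat[r:r + out_w] for r in range(0, len(flat), out_w)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
result = conv2d_via_unfold(image, kernel)
print(result)  # [[6, 8], [12, 14]]
```

Once the patches are rows of a matrix, a convolutional layer can be treated like a dense layer applied to each patch.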

Activation Steering for LLM Safety

Ongoing

Used mechanistic interpretability to extract harmful direction vectors from Llama 3.1 8B's residual stream, then trained lightweight auxiliary MLPs to steer activations away from unsafe directions.

Mech. Interp., LLM Safety, Llama
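
As a rough illustration of what a "direction vector" can mean here, one common simplification is the difference between the mean activation on harmful prompts and the mean activation on harmless prompts. The sketch below uses that difference-of-means approach on toy 2-D vectors and then removes the component along the direction by projection. This is an assumed stand-in: the project itself trains auxiliary MLPs on Llama 3.1 8B activations, which is not shown, and all names and numbers below are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def unit_direction(harmful, harmless):
    """Normalized difference of means between two sets of activations."""
    diff = [a - b for a, b in zip(mean_vector(harmful), mean_vector(harmless))]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]


def ablate(activation, direction):
    """Remove the component of an activation along a unit direction."""
    dot = sum(a * d for a, d in zip(activation, direction))
    return [a - dot * d for a, d in zip(activation, direction)]


# Toy activations (made up for illustration, not model outputs).
harmful = [[2.0, 1.0], [4.0, 1.0]]
harmless = [[0.0, 1.0], [0.0, 1.0]]

direction = unit_direction(harmful, harmless)
steered = ablate([5.0, 3.0], direction)
print(direction, steered)  # [1.0, 0.0] [0.0, 3.0]
```

Projection is the simplest possible intervention; a learned steering module can make the adjustment depend on the input.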

Extended Pretraining of BioMedBERT

Ongoing

Extended BioMedBERT pretraining on PubMed abstracts (2020-2024) with robust BLURB evaluation (13 tasks, 10 seeds), exceeding published baselines on BioASQ and PubMedQA.

NLP, BERT, Biomedical

Academic Projects

Adversarial Attacks on Vision Models

2025

Comprehensive white-box (PGD) and black-box (FGSM, patch, noise, frequency-domain) attacks on ResNet-18 and YOLOv5 to quantify vulnerabilities.

Computer Vision, Security, PyTorch
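
To give a flavor of one attack named above, FGSM moves each input feature by a fixed step `eps` in the direction of the sign of the loss gradient. The sketch below applies it to a toy logistic-regression classifier rather than ResNet-18 or YOLOv5, so the gradient can be written by hand; the weights, input, and function names are hypothetical and not taken from the project.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def predict(x, w, b):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(x, y, w, b, eps):
    """One FGSM step: x + eps * sign(d loss / d x).

    For binary cross-entropy with a logistic model, d loss / d x = (p - y) * w.
    """
    p = predict(x, w, b)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Toy model and input (illustrative values only).
w, b = [2.0, -1.0], 0.0
x, y = [1.0, 0.5], 1

x_adv = fgsm(x, y, w, b, eps=0.5)
print(x_adv)  # [0.5, 1.0]
print(predict(x, w, b) > predict(x_adv, w, b))  # True: confidence in the true label drops
```

PGD can be viewed as repeating this step several times with a smaller `eps`, projecting back into an allowed perturbation range after each step.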

Classical ML in C++ from Scratch

2024

Implemented Logistic Regression and SVMs from scratch in C++ using the Eigen library, applied to CIFAR-10 image classification.

C++, Eigen, ML

Deep Learning Sentiment Analysis

Published

Sentiment analysis of COVID-19 news videos using LSTM, Bi-LSTM, CNN, and GRU models. Published at ICITA-2021.

Deep Learning, NLP

Get in Touch

Let's Work Together

Interested in AI/ML research collaboration, consulting, or just want to chat about machine learning? I'd love to hear from you.

Built with Astro, TailwindCSS, and DaisyUI.
© 2026 Milan Varghese. All rights reserved.