Welcome

Hi, I'm Milan Varghese

Master's student in AI and Machine Learning at Drexel University (GPA: 4.0). Currently working as a Research Assistant developing novel neural network initialization algorithms and extending BioMedBERT pretraining.

About Me

Current Focus

Activation Steering for LLM Safety

Using mechanistic interpretability to extract harmful direction vectors from Llama 3.1 8B, then training auxiliary MLPs to steer activations away from unsafe directions.

Mech. Interp. · LLM Safety · Llama

Bayesian Neural Network Initialization

Extended Bayesian Initialization to convolutional architectures using unfold operations and product quantization. Achieved 81% accuracy on MNIST with 20% of the data and 65% on CIFAR-10 without gradient descent.

PyTorch · Neural Networks · Bayesian Methods

Extended Pretraining of BioMedBERT

Extended BioMedBERT pretraining on PubMed abstracts (2020-2024) with robust BLURB evaluation (13 tasks, 10 seeds), exceeding baselines on BioASQ and PubMedQA.

NLP · BERT · Biomedical

I'm an AI/ML Researcher and Engineer pursuing a Master's in AI and Machine Learning at Drexel University. My research spans Bayesian neural network initialization, mechanistic interpretability for LLM safety, and biomedical NLP. Previous experience includes building enterprise RAG pipelines and deepfake systems at Ernst & Young, and backend infrastructure at Neuflo Solutions.

Expertise

Tech Stack

Machine Learning

Python · PyTorch · TensorFlow · Keras · Scikit-learn · LangChain

Data & Cloud

PostgreSQL · Oracle SQL · Azure · AWS · GCP · Postman

Tools & Frameworks

Docker · FastAPI · Flask · Git · Power BI · Tableau

AI Development Toolkit

Claude Code · Antigravity

Certifications

Azure AI Fundamentals (AI-900)

Azure Fundamentals (AZ-900)

Background

Experience & Education

Education

M.S. Artificial Intelligence and Machine Learning

Drexel University, Philadelphia, PA

Expected June 2026 | Drexel CCI Merit Scholarship Recipient

GPA: 4.0

M.Sc. Computer Science - Data Analytics

Mahatma Gandhi University, India

Graduated 2022 | Best Outgoing Student

GPA: 3.77

B.Sc. Physics

University of Kerala, India

Graduated 2020

GPA: 3.39

Work Experience

Researcher - Prof. Rezvaneh Rezapour (COOP)

Drexel University · Social NLP Lab

Oct 2025 - Present

Contributed to longitudinal study of self-stigma among Reddit users: benchmarked 7 LLMs (F1=0.856), engineered prompts for 10 stigma indicators
Extended BioMedBERT pretraining on PubMed abstracts (2020-2024), exceeding baselines on BioASQ & PubMedQA
Resolved MongoDB indexing bottleneck, migrated datasets from JSON to Parquet (2x compression)

Research Assistant - Prof. Jake R. Williams

Drexel University · CODED Labs

Oct 2024 - Present

Extended Bayesian Initialization to convolutional architectures (81% MNIST with 20% data, 65% CIFAR-10 without gradient descent)
Conducted multi-month hyperparameter ablation for 1.5B-parameter LLM (SAFFU architecture) with Bayesian Parameter Initialization
Managed GPU server operations and built training monitoring visualizations

Backend Developer

Neuflo Solutions

Sep 2023 - Aug 2024

Delivered backend infrastructure for NEET coaching MVP using FastAPI, PostgreSQL, Docker, and Azure
Curated and built comprehensive NEET question bank: collected questions, answers, and images in LaTeX format, manually verified end-to-end

AI/ML Engineer (Associate)

Ernst & Young

Jul 2022 - Aug 2023

Built enterprise document search using Azure Cognitive Services, Blob Storage, Form Recognizer, and GPT-3.5
Developed digital twin characters using LLMs and deepfake lip sync for metaverse project
Delivered LLM-powered chatbot and prototyped Stable Diffusion text-to-image POC

AI/ML Research Intern

Zoho Technologies

Nov 2021 - May 2022

Developed YOLOv3-based form-field detection models for scanned documents
Conducted research on LayoutLM for transformer-based document understanding

Portfolio

Featured Projects

Research Projects

Bayesian Neural Network Initialization

Research

Extended Bayesian Initialization to convolutional architectures using unfold operations and product quantization. Achieved 81% accuracy on MNIST with 20% of the data and 65% on CIFAR-10 without gradient descent.

PyTorch · Neural Networks · Bayesian Methods
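
As a rough illustration of the unfold step named in this project, the sketch below shows how unfolding an image into flattened patches turns a convolution into a set of dot products, which lets a convolutional layer be handled like a dense one. It covers only the generic unfold (im2col) idea with stride 1 and no padding, in plain Python. It does not reproduce the Bayesian initialization or product quantization, and the function names `unfold2d` and `conv_via_unfold` are assumptions made for this example.

```python
def unfold2d(image, kernel_size):
    """Return every kernel_size x kernel_size patch of a 2-D list, flattened row-major.

    Hypothetical helper: stride 1, no padding.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(height - kernel_size + 1):
        for left in range(width - kernel_size + 1):
            patch = [
                image[top + dy][left + dx]
                for dy in range(kernel_size)
                for dx in range(kernel_size)
            ]
            patches.append(patch)
    return patches


def conv_via_unfold(image, kernel):
    """Cross-correlate image with a square kernel, one dot product per patch."""
    flat_kernel = [w for row in kernel for w in row]
    return [
        sum(p * w for p, w in zip(patch, flat_kernel))
        for patch in unfold2d(image, len(kernel))
    ]


# Toy 3x3 image (values made up for illustration).
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
patches = unfold2d(image, 2)
responses = conv_via_unfold(image, [[1, 0], [0, 1]])
```

Each row of `patches` plays the role of one input vector to a dense layer, so a method defined for dense weights can, under this assumption, be applied to the flattened kernel.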

Activation Steering for LLM Safety

Ongoing

Used mechanistic interpretability to extract harmful direction vectors from Llama 3.1 8B's residual stream, then trained lightweight auxiliary MLPs to steer activations away from unsafe directions.

Mech. Interp. · LLM Safety · Llama
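
To make "steering activations away from a direction" concrete, here is a minimal sketch under stated assumptions. It estimates a direction as the difference between mean activations on harmful and harmless prompts, one common approach that may not be the one used in this project. It also replaces the trained auxiliary MLPs with a plain projection. All function names and the toy 3-dimensional vectors are hypothetical.

```python
import math


def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def harmful_direction(harmful_acts, harmless_acts):
    """Unit vector pointing from the harmless mean to the harmful mean."""
    harmful_mean = mean_vector(harmful_acts)
    harmless_mean = mean_vector(harmless_acts)
    diff = [h - s for h, s in zip(harmful_mean, harmless_mean)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("the two activation sets have identical means")
    return [d / norm for d in diff]


def steer_away(activation, direction, strength=1.0):
    """Subtract `strength` times the component of `activation` along `direction`."""
    coeff = sum(a * d for a, d in zip(activation, direction))
    return [a - strength * coeff * d for a, d in zip(activation, direction)]


# Toy "activations" (made up for illustration, not real model outputs).
harmful = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]
harmless = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
direction = harmful_direction(harmful, harmless)
steered = steer_away([3.0, 5.0, 1.0], direction)
```

With `strength=1.0` the component along the direction is removed entirely and the other components are left unchanged. A learned module, such as the auxiliary MLPs described above, could instead decide how much to subtract for each input.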

Extended Pretraining of BioMedBERT

Ongoing

Extended BioMedBERT pretraining on PubMed abstracts (2020-2024) with robust BLURB evaluation (13 tasks, 10 seeds), exceeding published baselines on BioASQ and PubMedQA.

NLP · BERT · Biomedical

Academic Projects

Adversarial Attacks on Vision Models

2025

Evaluated 6 attack methods on ResNet-18 and YOLOv5: white-box PGD reduced accuracy from 83% to 0.58%; black-box patch attacks dropped classification by 17.85% and detection by 43%.

Computer Vision · Security · PyTorch

LLM-based Tool-Calling Agent

2025

Multi-step agentic loop where GPT-4 uses chain-of-thought reasoning to autonomously select and invoke external tools (Wikipedia, geosearch, live Google News RSS) via JSON schema-defined interfaces.

Agents · GPT-4 · Pydantic

LSTM from Scratch for Song Lyrics

2025

Complete LSTM network built from scratch using only NumPy: gated forward/backward passes with BPTT and a custom BPE tokenizer, trained on song lyrics and sampled with top-k sampling.

Deep Learning · NumPy · NLP
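
The top-k sampling used at generation time can be sketched as follows. This is a minimal plain-Python illustration rather than the project's NumPy code. The function name `top_k_sample` and its parameters are assumptions made for this example.

```python
import math
import random


def top_k_sample(logits, k, rng=None):
    """Sample a token index from the k highest-scoring logits.

    Hypothetical helper for illustration; not taken from the project.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    rng = rng or random.Random()
    # Keep the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Unnormalized softmax weights over the kept logits (max subtracted for stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # random.choices normalizes the weights internally.
    return rng.choices(top, weights=weights, k=1)[0]
```

With `k=1` this reduces to greedy decoding. Larger values of `k` let lower-ranked tokens through, which tends to make generated lyrics less repetitive at the cost of coherence.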

Classical ML in C++ from Scratch

2024

Multi-class Logistic Regression with softmax and SVMs with one-vs-all classification from scratch in C++ using Eigen, benchmarked on CIFAR-10 (50K train, 10K test).

C++ · Eigen · ML

Deep Learning Sentiment Analysis

Published

Sentiment analysis of COVID-19 news videos using LSTM, Bi-LSTM, CNN, and GRU models. Published at ICITA-2021.

Deep Learning · NLP

Publications

Research

Ongoing · Mar 2026

Longitudinal Analysis of Self-Stigma: A Cognitive-Affective-Behavioral Characterization of Reddit User Timelines

Ongoing research contributing to a longitudinal study of internalized self-stigma among people who use drugs on Reddit, using LLM benchmarking and prompt engineering for stigma indicator classification.

#NLP #Computational Social Science #LLM #Reddit

Aug 2021

Deep learning-based sentiment analysis on COVID-19 News Videos.

Published research on analyzing public sentiment from YouTube comments on COVID-19 news videos using deep learning models.

#Deep Learning #Sentiment Analysis #Computational Social Science #LSTM

Thoughts

Latest Posts

Sep 11, 2022

Understanding Transformers and Attention Mechanisms

A deep dive into the transformer architecture and self-attention mechanism that powers modern NLP models.

#Transformers #NLP #Deep Learning

Sep 10, 2022

Getting Started with PyTorch for Deep Learning

A beginner-friendly introduction to PyTorch, covering tensors, autograd, and building your first neural network.

#PyTorch #Deep Learning #Tutorial

Get in Touch

Let's Work Together

Interested in AI/ML research collaboration, consulting, or just want to chat about machine learning? I'd love to hear from you.
