Barath Velmurugan

Data Science | Massachusetts Institute of Technology


Hi! I am a Master of Business Analytics student at the Massachusetts Institute of Technology, working with Professor Vivek F. Farias. Previously, I received my B.Math in Statistics & CS from the University of Waterloo. My work spans software engineering, machine learning, optimization, and cloud infrastructure!

I am on the job market this year (2025-2026).

Research

My research focuses on generative modeling of digital twins and behavioral simulation using large language models. I'm interested in how large language models can reason about human behavior! My recent work applies these ideas to synthetic customer modeling at Anheuser-Busch InBev through MIT.

Work Experience

Data Scientist (MIT Analytics Lab) · Mira Intel | Sept 2025 – Dec 2025

Computer vision & multi-agentic workflow for wind-turbine damage detection and forecasting.

  • Implemented a YOLOv8 vision model to classify wind-turbine damage with 90% accuracy in bounding-box detection
  • Built a multi-agentic workflow with OpenAI’s Agent Builder achieving 75% interpretive accuracy in damage forecasting

Graduate Research Assistant · Massachusetts Institute of Technology | Sept 2025 – Dec 2025

LLM-based digital twins and behavioral simulation for Anheuser-Busch InBev’s Zé Delivery platform (5M users).

  • Engineered frontier LLM methods to build digital twins of customers from Anheuser-Busch InBev’s Zé Delivery app (5M users)
  • Generated synthetic personas from 350 customer interview transcripts using Python, Databricks, and OpenAI models
  • Leveraged insights from leading research, including Twin-2K-500, to inform architecture and evaluation methods

Software Engineer Intern (Platform) · PointClickCare | Sept 2024 – Dec 2024

A full-stack platform to support the Senior Living Resident Management market (React & Python).

  • Built a feature in Groovy that lets an Azure OpenAI GPT-3.5 Turbo model review lengthy SQL files and flag syntax deviations in pull requests, providing real-time quality checks without interrupting the Jenkins build
  • Took ownership of complex React UI work, including debugging production issues, building a multi-select search filter, and enabling editable time fields for “move-out” operations

Software Engineer Intern (Infra.) · Manulife | Jan 2024 – Apr 2024

Qualys Cloud Agent automation in Ansible using Azure Metadata and YAML.

  • Proved out Qualys Cloud Agent functionality in Ansible by leveraging Ansible Modules, Azure Metadata, and YAML
  • Built a Chef feature using Ruby to validate access control policies (Sudo, HBAC) during the provisioning of a server, strengthening security and reducing server deployment time by 15%

Software Engineer Intern (Data) · Manulife | May 2023 – Aug 2023

PySpark and Databricks pipelines for large-scale database migration (80K records).

  • Transformed data fields using PySpark and Databricks to support large-scale database migration of 80,000 records
  • Refactored a mission-critical React project by consolidating logic, modularizing code, and deleting files
  • Built API-driven table rendering functionality (< 3s response) using TypeScript, React, and Material UI
  • Streamlined client data entry by adding button functionality and 5 conditional fields using React Hooks and JSX

Software Engineer Intern (Infra.) · Manulife | Sept 2022 – Dec 2022

Azure Logic Apps integration of Manulife Bank (SOAP) and Salesforce (REST).

  • Integrated Manulife Bank (SOAP) with Salesforce (REST) using Azure Logic Apps, improving performance by 42% (less than 1s latency) with near-zero downtime compared to the previously deployed PCF model
  • Solved authentication using Azure App Service to allow the Logic App to hit the bank’s on-prem SOAP APIs
  • Strengthened Logic App security by implementing OAuth 2.0 authentication using Azure Active Directory

Software Engineer Intern (Backend) · PointClickCare | Jan 2022 – Apr 2022

Azure Event Hub deployment & Python multithreading for large-scale (22M+) data ingestion.

  • Provisioned an Azure Event Hub using Java and Terraform to enable parallel ingestion of 22 million patient records
  • Leveraged Python multithreading to cut script validation time by 56% for OHDSI datasets with 150,000 rows
  • Programmed an Azure Function using Java to monitor the health of a Redis cache and PostgreSQL application

Software Engineer Intern (Backend) · PointClickCare | May 2021 – Aug 2021

API development and unit testing using Spring and JUnit.

  • Engineered a new API in 1 week (half a sprint) using Spring to inform 10,000+ vendors of ancillary charge statuses
  • Integrated 20+ comprehensive unit tests with JUnit for public APIs to improve API robustness and maintainability

Projects

Sparse Regression for Foundation Model Pruning (ongoing)

Applying L1 (lasso) and group lasso methods to prune DistilGPT2 (82M parameters) using the WikiText-2 dataset, reducing parameter count while maintaining text generation quality.

Deterministic Voice Intelligence Agent

Built a deterministic voice intelligence agent from scratch using FastAPI, OpenAI Whisper, Meta Llama 3.3 70B (via OpenRouter), LangGraph, and ElevenLabs for end-to-end speech recognition and reasoning.

Hi-Expense

Built a personal finance management application using Python, JavaScript, Chart.js, and PostgreSQL.

ShareNote

Developed a real-time online word processor using JavaScript, MongoDB, React, Socket.IO, and Quill API.

Extracurricular

Clubs

MIT AI & ML Club, MIT Tech Club, MIT Project Management Club, UW Data Science Club, Stanford ASES

Conferences & Events

CODE@MIT (2025), GAI World (2025), Hack the North (2024), Jane Street FTTP (2021)

Interests

Lo-fi music, long road trips, watching sports, and cooking!