Founding Machine Learning & Backend Engineer @ VelocitiPM LLC
Machine Learning ⁃ Agentic AI ⁃ Large Language Models ⁃ Computer Vision ⁃ Generative AI
Building AI systems that bridge models, data, and infrastructure into measurable, reliable products. Focused on product and impact, taking ideas from 0 to 1 and shaping intelligent systems. Open to relocation.
Founding Machine Learning & Backend Engineer at VelocitiPM LLC (2025-06-01 - Present)
Architected a multi-agent system using LangGraph for cyclic state management and CrewAI for role-based task execution. The backend is a FastAPI microservice deployed on AWS App Runner. To handle high concurrency, implemented an event-driven architecture using AWS Kinesis for stream processing and SNS/SQS for asynchronous task decoupling. Managed agent memory and state persistence using Redis (ElastiCache), specifically optimizing for low-latency retrieval during multi-step reasoning loops. Established MLOps workflows using Amazon SageMaker Pipelines for automated model retraining and MLflow for experiment tracking and model versioning. Monitoring and distributed tracing were handled via AWS X-Ray and CloudWatch to map request flows across the 30-agent swarm.
Skills: Python, Go, C++, LangGraph, CrewAI, FastAPI, Kafka, AWS Lambda, AWS Kinesis, AWS SQS, Redis, MLOps, System Design, Observability.
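A minimal sketch of the checkpointed reasoning loop described above, assuming a simple key-value interface for agent state. The `InMemoryStateStore` class, the `agent_state:` key prefix, and the `draft_then_review` step are hypothetical stand-ins for illustration; they are not the LangGraph, CrewAI, or Redis APIs, and a real deployment would swap the dict for a Redis client.

```python
import json
from typing import Callable, Dict, Optional


class InMemoryStateStore:
    """Dict-backed stand-in for a Redis-like key-value store (hypothetical)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def save(self, session_id: str, state: dict) -> None:
        # Serialize to JSON, as one would before writing to an external cache.
        self._data[f"agent_state:{session_id}"] = json.dumps(state)

    def load(self, session_id: str) -> Optional[dict]:
        raw = self._data.get(f"agent_state:{session_id}")
        return json.loads(raw) if raw is not None else None


def run_reasoning_loop(
    store: InMemoryStateStore,
    session_id: str,
    step: Callable[[dict], dict],
    max_steps: int = 10,
) -> dict:
    """Apply `step` until the state is marked done, checkpointing every turn."""
    state = store.load(session_id) or {"turn": 0, "notes": [], "done": False}
    while not state["done"] and state["turn"] < max_steps:
        state = step(state)
        state["turn"] += 1
        store.save(session_id, state)  # a restart can resume from this turn
    return state


def draft_then_review(state: dict) -> dict:
    """Toy step: alternate between two roles and finish after three notes."""
    role = "drafter" if state["turn"] % 2 == 0 else "reviewer"
    notes = state["notes"] + [f"{role} turn {state['turn']}"]
    return {**state, "notes": notes, "done": len(notes) >= 3}


store = InMemoryStateStore()
final_state = run_reasoning_loop(store, "session-1", draft_then_review)
print(final_state["notes"])
```

Because the state is saved after every turn, calling `run_reasoning_loop` again with the same session ID returns the stored result without repeating any step.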
Founding AI Engineer at TimelyHero, Dimes Inc. (2024-08-01 - 2025-06-01)
Led the transition from a monolithic architecture to a gRPC-based microservice environment on Azure Kubernetes Service (AKS). Built RAG infrastructure using Apache Airflow for DAG orchestration, Pinecone as the vector database, and MongoDB for metadata storage. To handle real-time user matching, implemented a high-throughput ingestion pipeline using Apache Kafka, focusing on back-pressure strategies and partition tuning to sustain 5,000 messages/sec. Infrastructure was managed entirely through Terraform modules. Observability was achieved by deploying a Prometheus and Grafana stack within the K8s cluster to monitor pod health and WebSocket session persistence.
Skills: Java, Flask, gRPC, Kafka, Airflow, RAG, Pinecone, MongoDB, Terraform, Kubernetes (AKS), Azure, System Design.
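The back-pressure idea mentioned above can be illustrated with a bounded buffer between a producer and a consumer. This is a sketch using only the Python standard library, not Kafka client code; `BUFFER_SIZE` and `TOTAL_MESSAGES` are hypothetical values chosen for the example.

```python
import queue
import threading

BUFFER_SIZE = 100        # hypothetical bound on in-flight messages
TOTAL_MESSAGES = 1_000   # hypothetical workload size
_SENTINEL = object()     # marks the end of the stream


def producer(buf: queue.Queue, n: int) -> None:
    for i in range(n):
        # put() blocks while the buffer is full, so a slow consumer
        # throttles the producer instead of letting memory grow unbounded.
        buf.put(i)
    buf.put(_SENTINEL)


def consumer(buf: queue.Queue, out: list, max_depth: list) -> None:
    while True:
        # Record the deepest the buffer ever gets (qsize is approximate).
        max_depth[0] = max(max_depth[0], buf.qsize())
        item = buf.get()
        if item is _SENTINEL:
            break
        out.append(item)


buf = queue.Queue(maxsize=BUFFER_SIZE)
received = []
max_depth = [0]
threads = [
    threading.Thread(target=producer, args=(buf, TOTAL_MESSAGES)),
    threading.Thread(target=consumer, args=(buf, received, max_depth)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(received), "messages consumed; peak buffer depth", max_depth[0])
```

The buffer depth never exceeds `BUFFER_SIZE`, which is the property a back-pressure strategy is meant to guarantee under bursty load.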
Machine Learning Intern at Endimension Inc. (2024-04-01 - 2024-08-01)
Focused on high-performance Computer Vision (CV) model training and deployment for medical diagnostics. Utilized TensorFlow and Keras to train on 8TB of DICOM imagery hosted on S3. Optimized inference performance by converting models to ONNX format and applying 8-bit quantization for deployment on SageMaker multi-model endpoints. Built serverless data preprocessing workflows using AWS Glue and Kinesis for real-time image ingestion. Implemented cost-saving measures using SageMaker Spot Instances for distributed training jobs. Model drift and telemetry were tracked using MLflow and integrated into a CloudWatch dashboard for proactive monitoring.
Skills: TensorFlow, Keras, Computer Vision, Model Optimization, Quantization, ONNX Runtime, AWS SageMaker, CUDA, AWS Kinesis, AWS Glue, MLflow.
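For readers unfamiliar with 8-bit quantization, the arithmetic behind it can be sketched in a few lines. This is a generic affine (scale and zero-point) scheme written in plain Python under the assumption of per-tensor quantization; it is not the ONNX Runtime quantization API, and the sample weights are made up.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float, int]:
    """Affine-quantize floats to int8; returns (codes, scale, zero_point)."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # keep 0.0 exactly representable
    scale = (hi - lo) / 255.0 or 1.0      # avoid a zero scale for all-zero input
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize_int8(codes: List[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 codes back to approximate float values."""
    return [(c - zero_point) * scale for c in codes]


weights = [-1.0, -0.25, 0.0, 0.5, 1.3, 2.0]   # made-up sample values
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, f"max error {max_error:.5f} (scale {scale:.5f})")
```

Each value is stored in one byte instead of four, and the round-trip error stays within half a quantization step, which is why accuracy loss is often small.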
Graduate Research Assistant at JLiang Lab, Arizona State University (2023-09-01 - 2024-05-01)
Conducted large-scale medical imaging research using PyTorch and JAX. Implemented distributed training via PyTorch Fully Sharded Data Parallel (FSDP) on an EKS cluster with A100 GPUs to handle heavy transformer architectures like Swin and DINOv2. Engineered a custom Slurm-based scheduler on AWS EC2 Batch to manage multi-tenant GPU access. Data engineering involved building high-throughput pipelines with AWS Glue to process multi-terabyte datasets stored in S3. Experimentation was strictly versioned using MLflow and SageMaker Experiments to ensure reproducibility in a research environment.
Skills: PyTorch, JAX, Computer Vision, Distributed Training (FSDP), Slurm, AWS EKS, AWS Glue, AWS S3, SageMaker, MLflow.
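The core idea behind sharded data-parallel training is that each worker stores only a slice of the flattened parameters and the full set is gathered only when needed. The sketch below shows that idea in plain Python, assuming a single flat parameter list; `shard_flat_params` and `all_gather` are hypothetical helper names, and this is not the PyTorch FSDP API.

```python
from typing import List


def shard_flat_params(params: List[float], world_size: int) -> List[List[float]]:
    """Pad the flat parameter list and split it into equal per-rank shards."""
    shard_len = -(-len(params) // world_size)  # ceiling division
    padded = params + [0.0] * (shard_len * world_size - len(params))
    return [padded[r * shard_len:(r + 1) * shard_len] for r in range(world_size)]


def all_gather(shards: List[List[float]], numel: int) -> List[float]:
    """Rebuild the full parameter list from every rank's shard, dropping padding."""
    full = [x for shard in shards for x in shard]
    return full[:numel]


params = [float(i) for i in range(10)]   # toy "model" with 10 parameters
shards = shard_flat_params(params, world_size=4)
print([len(s) for s in shards])          # each rank holds 3 values instead of 10
print(all_gather(shards, len(params)) == params)
```

With four workers, each one keeps 3 of the 10 values at rest, which is where the per-device memory saving comes from.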
Machine Learning Researcher at SRM Advanced Electronics Laboratory (2021-12-01 - 2023-07-31)
Developed a distributed signal processing and regression system for IoT-based medical sensors. The core engine utilized Apache Spark (MLlib) for distributed kernel regression. Engineered the edge-to-cloud bridge using AWS Greengrass and AWS IoT Core, optimizing MQTT protocols for low-latency transmission in constrained network environments. Built robust ETL pipelines using Spark SQL to maintain data integrity from raw sensor signals. The work focused on high-accuracy time-series forecasting and was validated through peer-reviewed publication in Scientific Reports.
Skills: Apache Spark, SQL, Java, AWS Greengrass, AWS IoT Core, AWS MQTT, Regression Modeling, ETL, Research.
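To make the terms above concrete, here is a single-machine sketch of kernel regression and of the MARD metric (mean absolute relative difference). It assumes a Nadaraya-Watson estimator with a Gaussian kernel, which may differ from the kernel method used in the published work, and every number in it is a toy value rather than study data.

```python
import math
from typing import Sequence


def gaussian_kernel_regression(
    x_train: Sequence[float],
    y_train: Sequence[float],
    x_query: float,
    bandwidth: float,
) -> float:
    """Nadaraya-Watson estimate: a kernel-weighted average of training targets."""
    weights = [math.exp(-0.5 * ((x_query - x) / bandwidth) ** 2) for x in x_train]
    return sum(w * y for w, y in zip(weights, y_train)) / sum(weights)


def mard_percent(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Mean absolute relative difference between predictions and references, in %."""
    rel = [abs(p - r) / r for p, r in zip(predicted, reference)]
    return 100.0 * sum(rel) / len(rel)


# Toy values only: sensor feature -> reference reading.
x_train = [0.0, 2.0]
y_train = [1.0, 3.0]
midpoint_estimate = gaussian_kernel_regression(x_train, y_train, 1.0, bandwidth=1.0)
example_mard = mard_percent([110.0, 90.0], [100.0, 100.0])
print(midpoint_estimate, example_mard)
```

A query halfway between two training points receives equal weights, so the estimate is the mean of their targets; a lower MARD means predictions sit closer to the reference readings.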
Master of Science in Information Technology (AI/ML) from Arizona State University (2023-08-01 - 2025-05-01)
Location: Tempe, AZ, USA. Coursework: Digital Image Processing, Foundations of Statistical Machine Learning, Fundamentals of Machine Learning, Operationalizing Deep Learning, Image Analytics and Informatics, Advanced Operating Systems, Social Media Mining, Knowledge Representation and Reasoning, Cloud Computing, Statistical Machine Learning, Data-Intensive Distributed Systems for Machine Learning. Activities: SoDA: Software Developers Association, ACM Student Chapter, Linux Users Group, The AI Society at ASU, Hindu YUVA.
Bachelor of Technology in Computer Science from SRM University (2019-06-01 - 2023-06-01)
Location: Amaravathi, AP, India. Coursework: Biology, Chemistry, Calculus I, Calculus II, Basic Electronics, Digital Electronics, DSA in C, Physics, Statistics, Discrete Mathematics, OOPs in Java, Linear Algebra, Database Management Systems, Full Stack & Web Technologies, Formal Languages & Automata Theory, Economics, Computer Organization and Architecture, Introduction to Quantum Computations, Differential Equations, Operating Systems, Compiler Design, Data Warehousing and Data Mining, Computer Networks, Data Science, Software Engineering, Fundamentals of Neuro Linguistics Programming, Supply Chain Management, Managing Innovation and Startups, Cloud Computing, Big Data Analytics, Machine Learning. Activities: Founder @ Inventors Village, Founder @ Research Clan, Board Member @ SRM Student Council, Board Member @ SRM Entrepreneurship Cell, Member @ GDSC.
Toolkit
Languages
- Python
- C++
- Go
- Java
- Rust
- JavaScript
- SQL
- Bash
- Git
- Protobuf
AI/ML
- PyTorch
- TensorFlow
- Keras
- JAX
- HuggingFace Transformers
- ONNX
- ONNX Runtime
- OpenCV
- scikit-learn
- NumPy
- Pandas
- Computer Vision
- Model Optimization
- Distributed Training
- LangGraph
- LiteLLM
- Semantic Caching
LLMs & Agentic Systems
- LangChain
- LangGraph
- CrewAI
- A2A
- RAG Architecture
- Vector Databases
- Pinecone
- Prompt Engineering
- Fine-Tuning
- Agentic AI
- Inference Gateways
- Multi-Provider Routing
- MCP (Model Context Protocol)
Backend & Orchestration
- FastAPI
- Flask
- gRPC
- Kafka
- Airflow
- Pydantic
- Apache Spark
- Celery
- Redis
- WebSockets
- Node.js
- Parallel I/O
- Asynchronous Programming
- GraphQL (Strawberry)
- SSE (Server-Sent Events)
- Process Management
Cloud & IaC
- AWS (CDK, Lambda, SageMaker, Kinesis, S3, ElastiCache, SQS, SNS)
- Azure
- Terraform
- GCP
- Docker
- Kubernetes
- Vertex AI
- BigQuery
- Cloud Functions
Databases & MLOps
- MySQL
- PostgreSQL
- DynamoDB
- MongoDB
- Redis
- Elasticsearch
- Pinecone
- VectorDB
- RDS
- S3
- MLflow
- CI/CD
- Prometheus
- Grafana
- Monitoring
- SQLModel
- Alembic
- Distributed Tracing (Jaeger)
- OpenTelemetry
- Performance Benchmarking
- System Telemetry
Experience
VelocitiPM LLC
Jun 2025 - Present
Architected a 30-agent AI orchestration engine (LangGraph, CrewAI) on AWS App Runner, automating 80% of PM workflows and increasing engineering throughput 4×.
Engineered an event-driven, async pipeline using AWS Kinesis, SNS, and Lambda, reducing P95 latency by 85% and ensuring 99.98% message delivery reliability.
Built a production-grade MLOps platform with SageMaker Pipelines and MLflow, automating retraining gates and boosting release cadence 8×.
Implemented a robust observability suite (CloudWatch, X-Ray) that cut critical incident resolution time from 2 hours to ~10 minutes via distributed tracing.
Designed a Redis-based memory layer for agentic state persistence, slashing inter-agent latency by 60% and supporting 2.5× higher concurrent user throughput.
TimelyHero, Dimes Inc.
Aug 2024 - Jun 2025
Spearheaded the migration of a legacy monolith to a Java/Flask/gRPC microservice architecture on AKS, scaling to support 100,000+ concurrent WebSocket sessions.
Designed and deployed Airflow-orchestrated RAG pipelines (Pinecone, OpenAI, MongoDB), reducing data staleness from 48 hours to 30 minutes and driving $250K+ in new enterprise contracts.
Standardized Infrastructure-as-Code (IaC) using Terraform and Kubernetes, reducing deployment downtime by 35% and quadrupling the team's release velocity.
Optimized high-throughput ingestion pipelines with Kafka back-pressure handling, stabilizing throughput at 5,000 msgs/sec under peak loads of 1M+ events/hour.
Implemented a comprehensive monitoring stack (Prometheus, Grafana) that improved proactive incident detection by 70%, ensuring high availability.
Endimension Inc.
Apr 2024 - Aug 2024
Trained and optimized large-scale vision models (TensorFlow, Keras) on 8TB of medical imaging data, improving diagnostic mAP by 13% and IoU by 17%.
Deployed ONNX-quantized inference endpoints on SageMaker, reducing P95 latency by 40% with negligible AUC drift (<0.5%) for real-time diagnostics.
Leveraged SageMaker Spot Instances and distributed training strategies to reduce GPU training costs by 25% while accelerating iteration cycles by 1.8×.
Architected a serverless inference pipeline using AWS Kinesis and Lambda, ensuring 24/7 fault tolerance and scalable processing for high-volume image streams.
Configured a robust model monitoring stack (MLflow, Prometheus) to track drift and performance, reducing debugging time for production models by 60%.
JLiang Lab, Arizona State University
Sep 2023 - May 2024
Developed state-of-the-art multi-modal CV models (Swin Transformer, DINOv2) on SageMaker GPU clusters, achieving a 24% increase in rare-disease recall.
Implemented Fully Sharded Data Parallel (FSDP) training on Amazon EKS (8× A100s), reducing model training time by 3.2× and cutting cloud compute costs by 40%.
Engineered a multi-node GPU job scheduler using Slurm and EC2 Batch, automating provisioning and boosting cluster utilization by 35% across shared workloads.
Built distributed data pipelines with AWS Glue and S3 to ingest and preprocess 8TB+ of multimodal medical datasets, improving data throughput by 2.6×.
Designed reproducible experiment tracking workflows with SageMaker Experiments and MLflow, reducing hyperparameter tuning time by 45% and ensuring 100% reproducibility.
SRM Advanced Electronics Laboratory
Dec 2021 - Jul 2023
Developed a distributed kernel regression pipeline on Apache Spark (Java + MLlib), achieving a clinical-grade MARD of 8.86% for non-invasive glucose monitoring.
Engineered a real-time IoT streaming system using AWS Greengrass and IoT Core, enabling ultra-low latency ingestion (<200ms) for 2,000+ daily sensor readings.
Designed production-grade ETL and data quality workflows with Spark SQL, ensuring 95%+ signal integrity across distributed edge devices.
Optimized cloud-to-edge messaging protocols via MQTT, ensuring reliable data transmission and reducing packet loss by 15% in unstable network environments.
Co-authored and published this novel research in Scientific Reports (Nature Portfolio), a Q1 journal, validating the system's clinical accuracy and architectural robustness.
Education

Master of Science in Information Technology (AI/ML)
Aug 2023 - May 2025

Bachelor of Technology in Computer Science
Jun 2019 - Jun 2023
Projects
Here, you'll find 31 of my best works in the fields of machine learning, computer science, automation, and more.
2026 Q1 (Jan - Mar)
4 Projects

Protocol Battle Arena: High-Performance Benchmarking Suite

Zerobrew: Open Source Rust Systems Contribution

HimmiRouter: Enterprise LLM Gateway & Workbench

SuperSay: High-Performance Local AI Speech Engine
2025 Q4 (Oct - Dec)
4 Projects

IngestIQ: Enterprise Multi-Tenant RAG Platform

Agentum-Framework

High-Velocity Clickstream Analysis

CollabWrite
2025 Q3 (Jul - Sep)
4 Projects

_AI (Underscore AI)

FraudDetectX

Doppelgangerify

Gemma-3 Reasoning Training with GRPO
2025 Q2 (Apr - Jun)
4 Projects

SayItOut

SonicSherlock

Forkast

Beast Watch
2025 Q1 (Jan - Mar)
3 Projects

ChessAI

LLao1

LogicMind
2024 Q4 (Oct - Dec)
2 Projects

Ensemble Uncertainty Quantification for LLMs

MastoGraph - Mastodon
2024 Q3 (Jul - Sep)
2 Projects

TriPendulum Dynamics

x-of-Thought Reasoning
2024 Q2 (Apr - Jun)
2 Projects

FoR Audio: Fake or Real Speech Detection

OpenForensics-DeepFake
2024 Q1 (Jan - Mar)
1 Project

Llama-Bots
2023
2 Projects

Classification & Localization Benchmarker

Otsu-Thresholding
2022-2021
2 Projects

NeuroLearn

PopOS! Shell & Android AOSP ROM Development
Pre 2021
1 Project

sCrAPTCHA & Archcraft Linux Contributions
Honors
Here are a few of my honors, awards, scholarships and certifications.
Scholarships and Fellowships
Herbold ASU Graduate Scholarship
Aug 2024 - Aug 2025
ASU Engineering Graduate Fellowship
Jul 2023 - Jul 2024
SRM Merit Scholarship
Jun 2019 - Jun 2023
Awards
Gold Medalist: Research Day
Apr 2023
Certifications
Here is a comprehensive list of all 18 of my professional certifications, showcasing my commitment to continuous learning and expertise in various technologies.
AI & Machine Learning Foundations
Supervised Machine Learning: Regression and Classification
Aug 2024
Neural Networks and Deep Learning
Aug 2024
Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
Aug 2024
Structuring Machine Learning Projects
Sep 2024
Industry Specializations & MLOps
Google AI Essentials
Aug 2024
Generative AI for Everyone
Oct 2024
MLOps Essentials: Model Development and Integration
Feb 2025
MLOps Essentials: Monitoring Model Drift and Bias
Feb 2025
Professional Development
AI Model Development
Feb 2025
Cross Functional Collaboration
Feb 2025
Accelerate Your Learning with ChatGPT
Oct 2024
Data or Specimens Only Research
Sep 2023 - Sep 2026
Foundational Skills & Legacy
AI Foundations: Machine Learning
Aug 2024
Machine Learning Foundations: Linear Algebra
Aug 2024
Machine Learning Foundations: Statistics
Aug 2024
Getting Started with Enterprise-grade AI
Jul 2021
Getting Started with Enterprise Data Science
Jul 2021
Getting Started with Cloud for the Enterprise
Jul 2021
Publications
Here is a list of all my publications.
Values
These are the principles that guide my work and life. They're not just words on a page—they're the compass that helps me navigate complex challenges, build meaningful relationships, and create technology that truly serves people. I believe that when we align our actions with our values, we can make a genuine difference in the world around us.
Mastery
I raise the standard in everything I touch. I value depth, precision and the pursuit of world-class craft.
Relentless Growth
Every moment is data. I learn fast, adapt fast and constantly sharpen my mind, skills and character.
Resilience
Pressure clarifies. I stay steady, reset quickly and come back stronger every single time.
Impact
I care about meaningful progress. I direct effort where it moves systems, teams and outcomes forward.
Service
Strength increases when shared. I help, uplift and enable others to operate at their best.
Clarity & Wisdom
I think deeply, choose consciously and act with alignment. I make decisions anchored in truth, not noise.
Freedom
I design my life around growth, curiosity and joy. I choose direction intentionally, not reactively.
Discipline
Consistency builds power. I show up every day and move forward with deliberate action.
- by Himansh Mudigonda