Darsh Vora | ML/AI Engineer

The Philosophy

Always in Beta.
Always Compounding.

Two books that changed how I work - not what I do, but why I do it the way I do.

Atomic Habits — James Clear

"You do not rise to the level of your goals. You fall to the level of your systems."

James Clear

Getting 1% better every day compounds to 37x growth in a year. The magic isn't the sprint. It's the system that keeps running.
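
The 37x figure is plain compounding arithmetic. A minimal sketch, assuming a 1% change applied once per day for 365 days (the function name `compound_growth` is hypothetical and only for illustration):

```python
def compound_growth(daily_rate: float, days: int) -> float:
    """Return the growth multiple after compounding `daily_rate` once per day."""
    return (1.0 + daily_rate) ** days


if __name__ == "__main__":
    better = compound_growth(0.01, 365)   # 1% better every day
    worse = compound_growth(-0.01, 365)   # 1% worse every day
    print(f"1% better daily for a year: {better:.1f}x")
    print(f"1% worse daily for a year:  {worse:.3f}x")
```

The same function run with a negative rate shows the flip side: small daily losses compound toward zero just as reliably.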

The Start-up of You — Reid Hoffman

"The best version of yourself is the one that stays permanently in Beta."

Reid Hoffman

Finish lines are illusions. The engineer who deploys, monitors, retrains and iterates is the one who compounds.

🔁

Atomic Habits

Beta Version

Never Finished

Every model is a habit loop: cue, train, deploy, reward, retrain

📈

1% Daily Gain

Marginal Gains

Compound Effect

Small improvements in latency, accuracy and throughput compound into massive value

⚙️

Systems over Goals

Production over Notebooks

Ship the System

A model improving 1% weekly beats a perfect model that never ships

🧪

Identity-Based

Build v26.3

Who I Am

Not an engineer who shipped models. An engineer who ships models — present tense, always

🛡️

Habit Stacking

MLOps Pipelines

Stack the Gains

Train, validate, deploy, monitor, retrain. Each step stacked, each step compounding

🌊

Aggregation of Gains

60x · 3x · $250K+

The Proof

Results are the documented outcome of systems built to compound

My Story

The Engineer Behind the Iterations

I started in Electronics Engineering because I wanted to understand how things work at the level where you can't abstract the problem away. That instinct never left. When I moved into ML, I wasn't chasing the field - I was following a way of thinking. You build something, you watch it fail in a specific way, and that failure tells you exactly what to fix.

Reid Hoffman's idea of staying permanently in Beta resonated because it isn't motivational - it's structural. A system that's never finished is a system that keeps getting better. James Clear's framing of identity over goals landed the same way: you don't aim for a deployment, you become the kind of engineer who ships things that hold up. Not the demo. The thing that's still running six months later.

The credentials are in the tags below. What they don't capture is simpler: I genuinely enjoy the part where the model is live and something unexpected happens. That's where the real engineering starts.

San Francisco, CA
MS @ Northeastern · 3.94
AWS ML Certified
Electronics to ML/AI
v26.3 · Compounding
Open to Roles

🔁

Never Just a Model

Pipelines, monitoring, retraining triggers, failure modes. The model is 20% of the work. The system around it is the other 80%.

📈

Constraints Over Comfort

Give me a tight latency budget, limited memory, and no cloud dependency. Pressure like that forces decisions that open-ended projects never do.

📡

Edge to Cloud

A model that only works in a Jupyter notebook isn't finished. I build for wherever it needs to run — on-device, containerized, or serving thousands of requests a second.

💼

Business ROI

$250K+ documented value. Every system I ship is tied to a business outcome, not just a metric.

Career Path

Where I've Made an Impact

AI Software Engineer

● Current

Tatum Robotics

August 2025 to Present

Deployed production ASR service processing 500+ daily utterances at 95%+ accuracy and under 200ms latency by architecting a Whisper ASR pipeline with automated quality validation and CI/CD version control.
Enabled real-time ASL translation across 3,000+ phrases by building a gesture mapping engine on a C# (.NET) backend, supporting 26 hand configurations and diverse signing contexts.
Reduced ASL interpretation latency by 40% by redesigning the gesture-to-phrase mapping pipeline, improving response consistency across varying input conditions.
Accelerated on-device inference 3x with 70% model compression via post-training quantization (FP32 to INT8), benchmarking GPU (CUDA) vs. CPU latency profiles with less than 1% accuracy loss.
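
The FP32-to-INT8 step in the last bullet can be illustrated with a toy example. This is a minimal sketch of symmetric per-tensor weight quantization in plain Python, not the pipeline used at Tatum Robotics. The function names and sample weights are hypothetical, and a real deployment would normally rely on a framework's quantization tooling and also calibrate activations.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step, then clamp to the INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


if __name__ == "__main__":
    fp32_weights = [0.42, -1.27, 0.003, 0.88, -0.51]  # hypothetical values
    q, scale = quantize_int8(fp32_weights)
    restored = dequantize(q, scale)
    max_err = max(abs(a - b) for a, b in zip(fp32_weights, restored))
    print("quantized:", q)
    print("scale:", scale)
    print("max round-trip error:", max_err)
```

Each INT8 value takes 1 byte against 4 bytes for FP32, so 75% is the ceiling on size reduction for fully quantized tensors. The round-trip error per weight is bounded by half the scale.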

ML Engineer Intern

Crewasis AI

January 2025 to June 2025

Powered marketing intelligence across 5K+ daily multimodal social media assets by fine-tuning BLIP-2 with LoRA adapters and deploying a RAG system over audio, video, and text, containerized with Docker.
Scaled batch preprocessing 60x (30 min to 30 s), saving $19K+ annually, by deploying Python workers on AWS Lambda with Airflow triggers and automated data quality checks.
Constructed a search system across 1.6M+ records by integrating REST APIs (YouTube, Instagram, TikTok) with FAISS vector retrieval at sub-3s query latency, orchestrated with Kubernetes for reliable scaling.
Validated a 29% cost advantage across 20+ A/B experiments by evaluating multimodal pipeline variants with MLflow tracking and translating results into deployment decisions.

Jr. Data Scientist

Red Moments Pvt Ltd, Mumbai, India

June 2022 to May 2023

Improved production planning by 23% by developing time-series forecasting models (Prophet, XGBoost) on 75K+ transactions with feature engineering in SQL, deployed as a scoring pipeline for business planning.
Generated $100K annually with 16% inventory reduction by designing A/B testing frameworks translating business questions into structured recommendations for senior stakeholders.
Lifted margins by 9% and produced $80K revenue by constructing ETL pipelines with CI/CD workflows enforcing schema consistency across all reporting layers.
Slashed reporting from 3 days to real-time, saving $30K annually, by building Power BI dashboards surfacing KPI definitions for cross-functional stakeholders.

Portfolio

8 Domains. One Compound System.

Independent projects across every domain, each built to production standards.

🤖

Agentic AI / LLMs

NeuroDigestAI — LLM Content Intelligence Pipeline

End-to-end GenAI pipeline aggregating 10-25+ daily AI sources (YouTube, OpenAI, Anthropic) into structured PostgreSQL digests. LLM-powered ranking cut manual curation by 80-90%, running fully automated at under 20 seconds latency with zero duplicate deliveries.

OpenAI API · PostgreSQL · Docker · ETL · RAG

🤖 80-90% curation saved · 20s latency · Zero duplicates

💬

NLP / RAG

FinSight RAG — Financial Document Analysis

Hybrid RAG pipeline over 10 S&P 500 SEC 10-K filings using MiniLM embeddings, dense/sparse retrieval, and semantic reranking. Achieved 94% query success and 4.25/5 relevance across 200 queries, cutting retrieval latency 42% and API costs 40%.

LangChain · FAISS · ChromaDB · FinBERT · RAG

💬 94% query success · 42% faster retrieval · 40% cost reduction

🎙️

ML / Deep Learning

Speech Emotion Recognition System

CNN-LSTM architecture with multimodal audio feature extraction (MFCC, mel-spectrogram, chroma) across 15K+ audio samples. Achieved 90.5% accuracy and a 90.4% F1-score across 8 emotion classes, outperforming an InceptionV3 baseline by 3% while training 25% faster.

PyTorch · CNN-LSTM · Librosa · HuggingFace

🎙️ 90.5% accuracy · 8 emotion classes · 15K+ samples

👁️

Computer Vision

Smart Traffic Management — Vision + Tracking

YOLOv8 + ByteTrack pipeline on 4,500+ augmented traffic images achieving 94.5% mAP and 89.4% MOTA. Integrated Tesseract OCR for license plate recognition at 96% character accuracy, optimized for real-time edge deployment at 30fps.

YOLOv8 · ByteTrack · OpenCV · Tesseract OCR

👁️ 94.5% mAP · 89.4% MOTA · 96% plate recognition

🏥

Healthcare AI

TreatLive Telemedicine Platform

Consolidated 50+ disparate EHR systems serving 10,000+ users by building scalable FastAPI ingestion backends, reducing manual data reconciliation by 20%. Accelerated query performance 30% via ETL workflows with Redis caching across distributed clinical data sources.

FastAPI · MySQL · MongoDB · Redis · ETL

🏥 50+ EHR systems · 10K+ users · 30% faster queries

🧬

Healthcare AI

Medical QA System — RAG + Fine-tuned GPT-2

Fine-tuned GPT-2 (124M params) using LoRA and PEFT on 200K+ medical Q&A pairs, reducing validation loss from 1.87 to 1.74 with 30% faster GPU training. Deployed RAG over 500K+ clinical document embeddings via FAISS with safety guardrails.

GPT-2 · LoRA · PEFT · FAISS · RAG

🧬 200K+ QA pairs · 500K+ embeddings · Safety guardrails

💳

FinTech

Credit Risk Assessment — Bondora

Ensemble credit risk model (XGBoost, LightGBM, Random Forest) on 100K+ P2P loan applications with engineered financial ratios and SMOTE for class imbalance. Achieved 96.5% AUC translating to $250K+ impact through improved loan loss prediction.

XGBoost · LightGBM · SHAP · PostgreSQL

💳 96.5% AUC · 100K+ applications · $250K+ impact

⚙️

MLOps

Customer Churn Prediction + MLOps Pipeline

End-to-end MLOps system on 5,000+ customer records with XGBoost, 50+ MLflow experiments, SHAP explainability, and drift detection via KS and PSI statistical checks. Delivered 12% churn reduction with automated retraining triggers and Tableau stakeholder dashboards.

XGBoost · MLflow · SHAP · dbt · Tableau

⚙️ 12% churn reduction · 50+ experiments · Drift detection
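
The KS and PSI drift checks named above can be sketched in a few lines. This is a minimal, dependency-free illustration under stated assumptions, not the project's actual code: the function names, the equal-width binning, the epsilon floor for empty bins, and the sample data are all hypothetical choices.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], live: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate case: all reference values identical

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[max(0, min(bins - 1, idx))] += 1  # clamp out-of-range values
        # Floor empty bins at a small epsilon so the log term stays defined.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(live)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(xs) and j < len(ys):
        x = min(xs[i], ys[j])
        while i < len(xs) and xs[i] <= x:
            i += 1
        while j < len(ys) and ys[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(xs) - j / len(ys)))
    return gap


if __name__ == "__main__":
    baseline = [i / 100 for i in range(100)]       # hypothetical training feature
    shifted = [0.5 + i / 200 for i in range(100)]  # hypothetical drifted feature
    print("PSI:", psi(baseline, shifted))
    print("KS: ", ks_statistic(baseline, shifted))
```

A commonly cited rule of thumb treats PSI above roughly 0.25 as significant drift, but any threshold that fires a retraining trigger should be tuned per feature rather than taken from this sketch.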

📊

Analytics / Research

Customer and Sales Analytics — Published Research

Kimball dimensional modeling with dbt-powered data marts and SCD Type 2 tracking across 100K+ retail transactions. Boosted forecast accuracy 76%, designed 20+ KPI metrics with self-serve Tableau dashboards. Published peer-reviewed research.

Snowflake · dbt · Tableau · SQL · XGBoost

📖 Published · ISBN: 978-93-5777-300-3 · 76% forecast accuracy

Technical Stack

80+ Tools. One System.

Every skill is a habit. Every habit compounds.

🧠 ML / Deep Learning

PyTorch · TensorFlow / TFLite · Keras · Scikit-learn · XGBoost / LightGBM · CNN / LSTM / RNN · CUDA · SHAP · Hugging Face · BLIP-2

⚙️ MLOps and Infrastructure

Docker · Kubernetes · MLflow · Apache Airflow · AWS SageMaker · Azure · GCP · Terraform · CI/CD · FastAPI · Kafka

💻 Languages and Data

Python · SQL · C# (.NET) · R · Go · React / TypeScript · Git · PostgreSQL · MongoDB · Spark / PySpark · BigQuery · Snowflake · Kafka · ETL Pipelines · dbt

🔬 NLP and LLMs

LangChain · LangGraph · LlamaIndex · RAG Systems · BERT / RoBERTa · Whisper ASR · LoRA / QLoRA / PEFT · spaCy · NLTK · Vector DBs · Agentic AI

👁️ Computer Vision

YOLO v5/v8 · OpenCV · ResNet / EfficientNet · MediaPipe · BLIP-2 · PyTorch Geometric · TFLite

📊 Analytics and Viz

Pandas / NumPy · Tableau · Power BI · Plotly · Streamlit · A/B Testing · statsmodels

Social Proof

What My Teams Say

Darsh doesn't just ship models. He ships systems. The Whisper ASR service he built runs at sub-200ms and has held that standard through thousands of daily interactions. That kind of production discipline is genuinely rare at his experience level.

Samantha Johnson

CEO and Founder

Tatum Robotics

What Darsh achieved with our A/V processing pipeline was extraordinary. A 60x speedup isn't incremental improvement. It's a fundamental rethinking of the system. He understands that every architectural decision compounds, and he makes them accordingly.

Mehdi Abtahi

Senior AI/ML Manager

Crewasis AI

Darsh brought a production mindset from day one. Most ML engineers worry about accuracy. He worried about accuracy, latency, monitoring and failure modes simultaneously. Building Crewasis's core pipeline around his architecture was one of the best decisions we made.

Sharon Joseph

CEO and Founder

Crewasis AI

The demand forecasting models Darsh built didn't just improve our inventory metrics. They changed how we made decisions. He has the rare ability to translate ML output into language that operations teams actually act on. The $100K+ impact speaks for itself.

Vishal Rambhia

Ecommerce Head

Red Moments

Recognition

Achievements and Milestones

🎓

MS Data Analytics Engineering

Northeastern University. Graduated December 2025 with academic distinction.

GPA: 3.94 / 4.0

☁️

AWS Certified ML Engineer

Validated expertise in building, training and deploying ML models on AWS at production scale.

Amazon Web Services

📖

Published Research Author

Peer-reviewed research on Customer and Sales Analytics with predictive modeling applications.

ISBN: 978-93-5777-300-3

💰

$250K+ Business Value

Verified impact across production ML systems, data engineering pipelines and AI deployments.

Across roles and projects

🏆

80+ Technical Tools

Full-stack ML expertise spanning Python, SQL, TensorFlow, PyTorch, AWS/Azure/GCP and MLOps.

Train, Deploy, Monitor, Iterate

🤝

Collaborative ML and Robotics Portfolio

Joint research and engineering portfolio spanning robotics, computer vision and production ML.

Production ML · Robotics · CV

Let's Connect

Always Open to a Conversation

Actively seeking ML/AI Engineering roles where production matters and systems compound. Whether it's a role, an interesting problem or just talking shop - I'm in.

✉️

darshvora.mk@gmail.com

💼

linkedin.com/in/darsh-vora29

When I'm not compounding in code

⚽ Football

🏎️ Formula 1

🥊 MMA and Boxing

⌚ Watches

🌍 Geopolitics

🛍️ Shopping

☕ Food and Coffee

☕

If you're in San Francisco or passing through, I'm always down for a coffee. Some of the best conversations about AI, F1 or geopolitics happen over a good flat white. Reach out and let's make it happen.
