Welcome!
I'm a Data Scientist with nearly two years of experience. I work with Python, SQL, PySpark, and cloud technologies across machine learning, deep learning, generative AI, and big data engineering to build systems that process and analyze information at scale.
I'm interested in solving problems where data can make a real difference.
Bachelor of Technology | Aug 2019 – May 2023 | GPA: 8.99 / 10
Roni Analytics, Chennai
Mar 2025 – Mar 2026
LTIMindtree, Remote
May 2024 – Aug 2024
Indian Institute of Technology Madras
Jan 2023 – Aug 2023
Centre for Stem Cell and Cancer Genomics, AMI Bioscience, Coimbatore
Aug 2022 – Sep 2022
Python, SQL (PostgreSQL, MySQL)
LLM Integration, AI Agent Development, RAG (Retrieval Augmented Generation), MCP (Model Context Protocol), AWS Bedrock, Prompt Engineering, Langfuse, OpenTelemetry
Classification, Logistic Regression, Decision Tree, XGBoost, Deep Learning (CNNs), Transfer Learning, Hyperparameter Tuning, Feature Engineering, Imbalanced Data Handling, Model Evaluation & Monitoring
PySpark, Databricks, AWS (S3, Lambda, RDS, EC2, SNS), ETL/ELT Pipelines, Data Warehousing, Medallion Architecture, Delta Lake, Docker, Batch Processing, Partitioning, Caching, Git
Pandas, NumPy, Time Series Analysis, Statistical Analysis, A/B Testing, Hypothesis Testing, Anomaly Detection, API Integration, Data Validation
Matplotlib, Seaborn, Plotly, Power BI, Streamlit
AWS Bedrock · Claude 3.7 Sonnet · MCP · RAG · Langfuse · OpenTelemetry
A production-grade AI agent that lets users query live crypto market data in plain English, backed by 17 live data integrations, retrieval-augmented knowledge, and per-session memory.
Python · XGBoost · Platt Scaling · GPT-4o mini · Polymarket/Kalshi APIs · Streamlit
An end-to-end platform that aggregates 500+ prediction markets, enriches them with live news and sentiment, and applies ML to forecast outcomes with 4 automated trading strategies.
PySpark · Databricks · AWS · Medallion Architecture · Delta Lake
A scalable data platform that ingests raw on-chain events from Ethereum, Base, and Solana and transforms them into analytics-ready tables, handling 20M+ records daily.
Python · Multi-threading · Backtesting · Telegram Bot API
A live signal engine scanning 500+ DEX pairs across 4+ blockchains for volatility breakouts, generating buy/sell alerts pushed to subscribers via Telegram.
CNN · TensorFlow · Keras · VGG16 · Transfer Learning
A clinical deep learning model trained to detect early-stage lung cancer in CT scans by fine-tuning VGG16 with strategic layer unfreezing, then evaluated on a held-out test set.
XGBoost · Logistic Regression · Decision Tree · scikit-learn · Python
A binary classification model that predicts customer defaults on a heavily imbalanced dataset (78:22), with the imbalance addressed through class weighting and targeted feature engineering.
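To illustrate the class-weighting idea, here is a minimal sketch rather than the project's actual code. It uses the common "balanced" heuristic, n_samples / (n_classes * class_count), which is the same formula scikit-learn applies for class_weight="balanced". The label list is synthetic and only mirrors the 78:22 split; the function name is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Synthetic labels mirroring a 78:22 split (assumption: 0 = no default, 1 = default).
labels = [0] * 78 + [1] * 22
weights = balanced_class_weights(labels)

# The minority class gets the larger weight, so both classes
# contribute equally to the total weighted loss.
print(weights)
```

With these weights, each class carries the same total weight (78 × w0 equals 22 × w1), which is what keeps the majority class from dominating training.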
Logistic Regression · XGBoost · Decision Tree · scikit-learn
An automated loan decisioning system with a three-tier approval framework that routes clear approvals and rejections automatically and reserves human review for borderline cases.
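The three-tier routing can be sketched roughly as below. This is an illustration only: the function name, the tier labels, and both thresholds (0.20 and 0.60) are assumptions for the example, not the values used in the project.

```python
def route_application(p_default, approve_below=0.20, reject_above=0.60):
    """Route a loan application based on its predicted default probability.

    The thresholds are illustrative assumptions, not tuned values.
    """
    if not 0.0 <= p_default <= 1.0:
        raise ValueError("p_default must be a probability in [0, 1]")
    if p_default < approve_below:
        return "auto_approve"   # clearly low risk
    if p_default > reject_above:
        return "auto_reject"    # clearly high risk
    return "manual_review"      # borderline: send to a human reviewer


for p in (0.05, 0.40, 0.90):
    print(p, route_application(p))
```

Widening the gap between the two thresholds sends more cases to human review; narrowing it automates more decisions at the cost of more borderline calls being made by the model.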
Python · Pandas · Asyncio · Statistical Analysis · Telegram Bot
A monitoring system that continuously validates DeFi protocol data across multiple blockchains, running anomaly detection concurrently and firing alerts as soon as an anomaly is detected.
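As a rough sketch of the concurrent anomaly checks, the example below runs a simple z-score test per protocol with asyncio.gather. Everything specific here is assumed for illustration: the protocol names, the sample values, the z-score rule, and the threshold of 3.0. The `asyncio.sleep(0)` call stands in for a real network fetch.

```python
import asyncio
import statistics


def is_anomalous(history, latest, z_threshold=3.0):
    """Flag `latest` if it lies more than z_threshold sample std devs from the mean."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return latest != mean
    return abs(latest - mean) / stdev > z_threshold


async def check_protocol(name, history, latest):
    await asyncio.sleep(0)  # placeholder for an awaitable data fetch
    return name, is_anomalous(history, latest)


async def run_checks(snapshots):
    """Run all protocol checks concurrently and return the flagged names."""
    tasks = [check_protocol(n, hist, latest) for n, (hist, latest) in snapshots.items()]
    results = await asyncio.gather(*tasks)  # results keep the input order
    return [name for name, flagged in results if flagged]


# Hypothetical snapshots: (recent history, latest observed value).
snapshots = {
    "protocol_a": ([100.0, 101.0, 99.0, 100.5, 100.2], 100.4),
    "protocol_b": ([50.0, 50.5, 49.8, 50.1, 50.2], 75.0),
}
alerts = asyncio.run(run_checks(snapshots))
print(alerts)
```

In a real monitor the returned names would be passed to an alerting step, such as a Telegram message, instead of being printed.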
Advanced SQL certification demonstrating mastery of complex queries and performance optimization.
Completed LeetCode's Top 50 SQL problems covering joins, subqueries, and window functions.
Coursera certification in Python, Bash scripting, and SQL essentials for data engineering workflows.
Awarded for leadership as Rotaract Club President, driving community and professional initiatives.
Recognized at the International Conference on Infectious Diseases for a research poster presentation.
Whether you're looking to collaborate on a project, explore a new opportunity, or just talk data science — I'd love to hear from you.