Aravinda Raman Jatavallabha

Machine Learning Engineer & Data Scientist

Currently a Machine Learning Engineer at SmartProtect, building scalable AI solutions for public safety, legal automation, and demand forecasting.
Actively seeking full-time opportunities as an AI/ML Engineer, Data Scientist, Data Analyst, or Data Engineer starting June 2025.

About Me

Full-Stack AI/ML and Data Science Engineer with a strong track record of building scalable intelligent systems that drive real-world results. I completed my Master's in Computer Science (Data Science Track) at NC State with a 4.0 GPA and am actively seeking full-time roles in Data Science, Machine Learning, AI Engineering, or Analytics. I'm available to start immediately and aim to secure a position by June 2025.

My experience spans the full ML pipeline, from developing LSTM-based forecasting systems and optimizing ETL workflows using AWS and Snowflake to integrating LLMs for NLP applications and deploying models with FastAPI and SageMaker. I've led impactful projects across public safety, legal automation, and customer intelligence, combining Python, SQL, and cloud technologies to deliver results at scale.

Alongside applied work, I've contributed to peer-reviewed research in areas like graph neural networks, language modeling, and fairness in AI, with multiple publications across conferences and journals.

I'm looking to bring this blend of hands-on engineering and deep analytical thinking to a high-impact team. Let's connect.

Where I've Delivered Impact

Healthcare

Built CNN models for X-ray and diabetes prediction.

Legal Tech

Built GPT-powered legal assistants with RAG & vector DBs.

Public Safety

Forecasted 911 call volume and optimized staffing using ML.

Retail & Marketing

Boosted campaign ROI via LSTM & ETL optimization.

Autonomous Systems

Built lane detection and GNN-based driver behavior forecasting.

IoT & Networks

Built LSTM models for IoT network health prediction at scale.

Technical Skills

Programming & Scripting

Python, SQL, C++, React, Angular, Bash, HTML/CSS, JavaScript, Flask, FastAPI, REST API, Git

Data Handling & Visualization

Pandas, NumPy, SciPy, OpenCV, Matplotlib, Seaborn, Plotly, Power BI, Tableau, Excel

Machine Learning & Deep Learning

Supervised & Unsupervised Learning, Scikit-learn, PyTorch, TensorFlow, Keras, NLTK, spaCy, PyG
Models: CNNs, RNNs, LSTMs, GNNs, Recommender Systems, Anomaly Detection, Ranking Models, Time Series Forecasting

Generative AI & LLMs

LLMs, LangChain, RAG, Prompt Engineering, OpenAI API, Hugging Face, LLaMA, Pinecone

Cloud, MLOps & Big Data

AWS (S3, ECR, SageMaker), Azure, Apache Spark, Hadoop, Snowflake, Apache Airflow, Docker

Statistics & Experimentation

Statistical Modeling, Hypothesis Testing, A/B Testing, Regression, Classification, Clustering, Time Series Analysis

Professional Certifications

Data Science Specialization
NC State University

Machine Learning Specialization
Stanford University

Deep Learning Specialization
DeepLearning.AI

AI Summer School
IIIT Hyderabad

Education

Master of Computer Science

Data Science Track

North Carolina State University, Raleigh, NC

Aug 2023 - May 2025

4.0/4.0 GPA

Academic Focus Areas

AI & Machine Learning

Deep Learning, Neural Networks, NLP

Data Science

Analytics, Mining, Visualization

Systems & Architecture

Databases, Algorithms, Software Engineering

Relevant Coursework

Neural Networks and Deep Learning
Foundations of Data Science
Automated Learning and Data Analysis
Database Management Systems
Machine Learning with Graphs
Natural Language Processing
Generative AI
Privacy in Artificial Intelligence
Design and Analysis of Algorithms

Bachelor of Technology

Information Technology

Manipal Institute of Technology, Manipal, India

Jun 2019 - Jul 2023

8.64/10.0 GPA

Academic Focus Areas

Big Data Analytics

Minor Specialization

Software Engineering

Core Focus

AI & ML

Technical Electives

Relevant Coursework

Data Mining
Machine Learning
Pattern Recognition
Algorithms
Big Data Analytics
Data Structures

Work Experience

Data Scientist - ML, Analytics & Full-Stack Systems

SmartProtect Public Safety Solutions

May 2024 - Present · Part-time

Wilmington, Delaware, United States · Remote

Performance Metrics

  • Forecast Accuracy: +20% (Model Performance)
  • Operational Efficiency: 18% (Overtime Reduction)
  • Processing Speed: 35% (Faster Retraining)

ML & Forecasting
  • Improved hourly and daily call volume forecasting accuracy by 20% through A/B/n-tested ARIMA, Prophet, LSTM, and Logistic Regression models deployed on AWS SageMaker.
  • Built scalable ETL pipelines using Apache Airflow, Flask APIs, and Snowflake to automate ingestion and transformation of 1.2M+ dispatch records across regions.
  • Deployed real-time drift detection and monitoring with AWS Lambda and S3 versioning, embedding auto-retraining triggers and reducing ML update latency by 35%.
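
A minimal sketch of what a drift check with a retraining trigger could look like, assuming a Population Stability Index (PSI) comparison between a reference window and a recent window. The drift statistic, threshold, and AWS wiring used in production are not described here, so `psi`, `should_retrain`, the bin count, and the 0.2 threshold are all illustrative assumptions.

```python
import math


def psi(reference, current, bins=5):
    """Population Stability Index between two numeric samples (assumed drift metric)."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if the reference sample is constant.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def should_retrain(reference, current, threshold=0.2):
    """Hypothetical trigger: flag retraining when PSI exceeds the threshold."""
    return psi(reference, current) > threshold
```

In a setup like the one described, a check of this kind might run on a schedule against recent call-volume features, with a positive result kicking off a retraining job.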

Graduate Teaching Assistant

North Carolina State University

Aug 2024 - May 2025 · Part-time

Raleigh, North Carolina, United States

Teaching Impact

  • Course Coverage: 2 advanced ML courses
  • Teaching Duration: 10 months of instruction

CSC 522: Automated Learning and Data Analysis

Jan 2025 - May 2025 · Under Prof. Thomas Price

  • Reviewed and evaluated student assignments and projects focused on data mining, supervised and unsupervised learning, predictive modeling, and Python-based machine learning workflows.

Machine Learning Engineer

Defence Research and Development Organisation (DRDO)

Jan 2023 - Jun 2023 · Research Internship

Bengaluru, Karnataka, India · On-site

Research Highlights

  • Model Performance: 3.19 language model perplexity (SOTA)
  • Processing Efficiency: 40% reduction in retraining latency

Core Research Contributions

  • Designed and implemented a Temporal Graph Neural Network (TGNN) architecture at the Centre for Artificial Intelligence & Robotics (CAIR) Lab, using PyTorch Geometric (PyG) to model dynamic user interactions over time for semantic-aware behavioral forecasting. This improved prediction accuracy by 2% compared to state-of-the-art baselines.
  • Developed a streaming NLP pipeline that uses BERT to learn contextualized word embeddings incrementally. This addressed semantic drift by dynamically updating token representations without retraining from scratch.

Data Science Intern - ML & Analytics

Merkle

May 2022 - Jul 2022 · Internship

Bengaluru, Karnataka, India · Hybrid

Impact Metrics

  • Campaign Profitability: +10% (Revenue Optimization)
  • Query Performance: 40% latency reduction
  • Data Scale: 16M+ records processed

Key Achievements

  • Led a team of 4 to develop and deploy predictive models including XGBoost, LightGBM, and LSTM on 10M+ retail records, resulting in a 10% uplift in campaign profitability for Home Depot through advanced revenue optimization techniques.
  • Engineered scalable ETL workflows using PySpark, SQL, and Snowflake, optimizing 16M+ rows of transactional data with partitioning, caching, and indexing strategies that reduced query latency by 40%.
  • Integrated LLM-based vector embeddings to cluster product descriptions, enabling personalized product targeting and enhancing recommendation precision across retail segments.

Machine Learning Engineer

Manipal Institute of Technology

Mar 2021 - Jun 2022 · Part-time

Udupi, Karnataka, India · On-site

Technical Impact

  • Model Accuracy: 99.4% (IoT Network Prediction)
  • Automation Impact: 60% reduction in manual tasks

Key Projects

Medical Imaging Analysis
  • Developed a Bone Age Assessment system using medical imaging data of pediatric patients (ages 1–288 months), customizing and benchmarking VGG-16, MobileNet, InceptionV3, and XceptionNet architectures for bone maturity prediction.
  • Conducted comparative analysis using mean absolute error (MAE) metrics, improving interpretability and model selection for clinical applications.
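
As a rough illustration of MAE-based model comparison (not the project's actual code), the sketch below scores two hypothetical models against made-up bone ages in months and picks the one with the lower error. The model names and all values are placeholders.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical bone ages in months (illustrative values only).
true_ages = [120, 60, 200]
predictions = {
    "model_a": [110, 66, 190],
    "model_b": [125, 58, 204],
}

# Score every candidate and select the one with the lowest MAE.
scores = {
    name: mean_absolute_error(true_ages, preds)
    for name, preds in predictions.items()
}
best_model = min(scores, key=scores.get)
```

Because MAE is expressed in the same unit as the target (months here), it is straightforward to interpret when comparing architectures.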

Projects

CoveredAI - Health Insurance Document Assistant

A full-stack AI-powered app that simplifies health insurance documents using LLMs and Retrieval-Augmented Generation (RAG). Users can upload plans, ask natural-language questions, view smart summaries, compare multiple plans side-by-side, and export personalized PDF reports. Built with end-to-end semantic search, summarization, and secure document handling.

React, TypeScript, TailwindCSS, Flask, LangChain, OpenAI GPT, FAISS

Wolf Parking Management System

A campus-wide parking management system that tracks lot availability, zoning rules, permit assignments, and citations. It allows administrators to efficiently manage parking resources, issue fines, and generate reports to support data-driven decisions for better traffic control and user experience.

Java, SQL, MariaDB, Database Design

Cold Email Generator

An automated cold outreach tool that combines LangChain's ChatGroq + LLaMA3 with ChromaDB to extract job descriptions, match user skills, and generate personalized emails using RAG. Includes an interactive Streamlit UI for seamless job-to-email generation.

LangChain, LLaMA3, ChromaDB, RAG, Streamlit

Legal Query AI Assistant

A chatbot powered by LLMs (OpenAI GPT / LLaMA) and RAG, designed to retrieve, summarize, and answer complex legal queries from document repositories with high accuracy and fast vector-based search.

GPT, LangChain, RAG, Vector Search

Customer Churn Prediction

Built a full ML pipeline for predicting customer churn using Apache Airflow, AWS (S3, SageMaker, ECR), and Dockerized Flask APIs, enabling scalable deployment and real-time churn inference.

AWS, Airflow, Docker, Flask

Lane Detection for Autonomous Vehicles

Implemented a hybrid SegNet + LSTM deep learning model to detect lane lines, compute lane curvature, and measure vehicle offset using OpenCV-based image processing.

SegNet, LSTM, OpenCV, Computer Vision

COVID-19 Detection using Chest X-rays

Developed a CNN-based classifier to detect COVID-19 pneumonia from chest X-ray images, achieving 95.28% training accuracy and 89.52% validation accuracy using preprocessed radiology data.

CNN, TensorFlow, Medical Imaging

Store Demand Forecasting using Time-Series and Neural Networks

Built a robust hybrid forecasting model combining CNN and BiLSTM architectures to predict daily item-level sales across 10 stores. Leveraged the Kaggle Store-Item Demand Forecasting dataset (2013–2017) and benchmarked against models like XGBoost, ANN, and ARIMA. The hybrid model achieved the lowest MSE, improving forecasting precision and enabling optimized retail inventory decisions.

CNN, BiLSTM, Time Series, XGBoost

Image-to-Image Translation using CycleGAN

Implemented CycleGAN to translate images between domains without paired datasets, such as Monet paintings to real photographs and human faces to zombies. Trained on publicly available datasets and deployed the model for real-time translation, showcasing the power of unsupervised generative learning in computer vision tasks.

CycleGAN, Computer Vision, Deep Learning

Brain Tumor Segmentation using MRI Scans

Used a U-Net architecture to perform semantic segmentation on brain MRI images, detecting and outlining tumor regions. The project utilized the LGG MRI Segmentation dataset from Kaggle and focused on pixel-level mask prediction using FLAIR MRI sequences. Achieved high segmentation accuracy and visual interpretability for potential medical diagnostics.

U-Net, Medical Imaging, Segmentation

Movie Recommendation System using Collaborative Filtering

Designed a hybrid recommender system that combines cosine similarity with sentiment analysis to suggest movies tailored to user preferences. Scraped metadata from TMDB and IMDB, offering dynamic updates, cast bios, trailers, and review sentiment. Upgraded from a static recommendation system to an interactive, emotionally aware movie discovery experience.

Collaborative Filtering, NLP, Sentiment Analysis
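
The cosine-similarity step of a recommender like the one above can be sketched as follows. This is an illustrative example only: the feature vectors, the titles, and the `recommend` helper are hypothetical, and the sentiment-analysis component is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(title, features, top_k=2):
    """Return the top_k titles most similar to `title` (hypothetical helper)."""
    target = features[title]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in features.items()
        if other != title
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in scored[:top_k]]


# Placeholder feature vectors (e.g. genre indicators); not real movie data.
movies = {
    "movie_a": [1, 0, 1],
    "movie_b": [1, 0, 0.9],
    "movie_c": [0, 1, 0],
    "movie_d": [1, 1, 1],
}

suggestions = recommend("movie_a", movies)
```

In a fuller system, the similarity ranking could be combined with review sentiment scores before the final suggestions are shown.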

Research Papers

Enhancing Privacy in Large Language Models: A Comparative Study on Input Retention and Sanitization Techniques

Novel approach to privacy-preserving language models in healthcare settings using prompt-induced sanitization techniques to reduce PII/PHI leakage while maintaining contextual utility.

LLMs & NLP, Privacy & Fairness, Healthcare AI

Learning Dynamic Representations in Large Language Models for Evolving Data Streams

Investigation of dynamic representation learning in large language models using graph-based attention mechanisms for improved context understanding.

LLMs & NLP, Graph Learning

Dynamic Graph Representation Learning using Temporal and Topological Information

A Temporal Dynamic Graph Neural Network (TDGNN) framework designed to model real-world dynamic graphs by integrating time-aware message passing, graph topology, and point process theory to enhance prediction in social and interaction networks.

Graph Learning, Recommender Systems, Time Series & Forecasting

SDN-Based Multipath Data Offloading Scheme Using Link Quality Prediction for LTE and WiFi Networks

Time series analysis and prediction framework for optimizing network traffic offloading in 5G networks using software-defined networking.

Time Series & Forecasting, 5G Networks

Improving Fairness in Visual Recognition through Feature Distillation and Adversarial Debiasing

Novel approach to mitigating bias in computer vision models through adversarial debiasing and balanced representation learning.

Computer Vision, Privacy & Fairness

Multimodal Conversation Derailment Detection: An Integrated Framework for Early Risk Assessment

Integration of visual and textual cues for early detection of conversation derailment in multimodal AI systems.

Computer Vision, LLMs & NLP, Multimodal AI

Graph Contrastive Learning for Optimizing Sparse Data in Recommender Systems with LightGCL

Development of a lightweight graph contrastive learning framework for efficient recommendation systems.

Recommender Systems, Graph Learning

Tesla's Autopilot: Ethical Implications and Policy Considerations in Autonomous Vehicle Systems

Analysis of ethical considerations and policy implications in autonomous vehicle systems, focusing on Tesla's Autopilot implementation.

AI Ethics & Policy, Autonomous Systems

Deciphering Air Travel Disruptions: A Machine Learning Approach to Flight Delay Prediction

Time series forecasting model for predicting flight delays using weather data and historical flight performance.

Time Series & Forecasting, Aviation

Diabetes Prognosis using Machine Learning: A Comparative Analysis of Classification Algorithms

Machine learning approach to diabetes prognosis using patient data and clinical markers for early detection.

Healthcare AI, Classification Models

Pediatric Bone Age Assessment using Deep Learning Models: A Comparative Study of CNN Architectures

Deep learning approach to automated bone age assessment using X-ray images for pediatric growth evaluation.

Computer Vision, Medical Imaging, Deep Learning

Let's Connect

Actively seeking full-time opportunities as an AI/ML Engineer, Data Scientist, Data Analyst, or Data Engineer. Whether you're building innovative systems or solving real-world problems with data, I'd love to be a part of it. Let's chat about how I can help your team move faster with intelligent, scalable solutions.