I didn't do a Master's degree to study data. I did it to build things that work in the real world. Two years of graduate-level projects, churn models, automated pipelines, and SQL systems, all built to one standard: "would this hold up in production?" Now I'm bringing that into industry. Ready.

I don't just build models. I build business answers.
Off the screen, I'm grounded by faith, family, and movement. As a Christian, integrity isn't a value I list; it's how I make decisions when no one's watching. As a Barcelona fan, I understand that the best teams don't just have talented individuals; they have a system. I bring that same thinking to how I build pipelines and collaborate with teams. When I'm not in the data, I'm on a trail or surrounded by people I love. That balance is what keeps the work sharp.
A model is only as good as the decision it changes. I just completed my M.S. in Data Analytics at Catholic University, and along the way I built churn models at 80%+ accuracy, automated pipelines that cut reporting errors by 15%, and migrated fragmented records into SQL systems running at 99.9% uptime. None of that matters to me in isolation; what matters is whether someone made a better call because of it.
My core stack: Python · SQL · AWS · GCP · Power BI · Tableau · XGBoost · Scikit-Learn · MLflow · Docker. The full engine from raw data to boardroom decision.
My work focuses on the intersection of Predictive Systems, Revenue Optimization, and Business Intelligence.
Bike-sharing platforms lose revenue daily from a core imbalance: fleets sit idle during off-peak hours while surge periods go underserved. Without forward-looking demand signals, pricing and fleet decisions are reactive, always one step behind the customer.
Built an end-to-end forecasting pipeline to production standards, predicting city-wide demand 12 hours ahead. Engineered a 3-stage modeling approach: baseline Random Forest → lag feature engineering → CatBoost with Bayesian Optimization via Optuna. Containerized with Docker and deployed to cloud infrastructure, giving operations teams real-time inference to inform dynamic pricing and proactive fleet distribution.
51% reduction in MAE (84.78 → 41.17), 44% reduction in RMSE (116.13 → 64.80), and 22% reduction in MAPE (34.63 → 27.10) over a no-feature-engineering baseline, achieved through iterative lag feature engineering and CatBoost hyperparameter tuning with Bayesian Optimization. Delivered as a Dockerized inference pipeline ready for real-world deployment.
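The lag-feature stage of that pipeline can be sketched roughly like this. This is a minimal illustration in pandas, not the project's exact configuration: the column name, lag horizons, and rolling window are assumptions chosen for an hourly demand series.

```python
import pandas as pd

def add_lag_features(df, target="demand", lags=(1, 12, 24)):
    """Append lagged copies of the target so the model can see recent history."""
    out = df.copy()
    for lag in lags:
        out[f"{target}_lag_{lag}"] = out[target].shift(lag)
    # A rolling mean over the previous 24 hours smooths short-term noise;
    # shift(1) keeps the window strictly in the past, avoiding target leakage.
    out[f"{target}_roll_24"] = out[target].shift(1).rolling(24).mean()
    # Rows whose lags reach before the start of the series are dropped.
    return out.dropna()

hours = pd.date_range("2024-01-01", periods=48, freq="h")
df = pd.DataFrame({"demand": range(48)}, index=hours)
features = add_lag_features(df)
```

The key design point is that every feature is built only from values the model could have known at prediction time, which is what makes the 12-hour-ahead forecast honest.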
Telecom providers lose customers they could have kept because they only identify at-risk users after a cancellation request. By then, the decision is already made. The real opportunity is predicting churn before it happens, during the window when intervention still works.
Built a full classification pipeline using CatBoost and Optuna on a real-world Vodafone dataset. Addressed a severe 73/27 class imbalance through minority upsampling. Engineered features around customer tenure and usage patterns, then used Optuna for systematic hyperparameter tuning. Validated model behavior through SHAP analysis to identify which signals actually drive churn risk.
Identified the Tenure Effect as the primary churn driver: customers in their first 12 months represent the highest-risk, highest-ROI intervention window. This finding reframes retention strategy from reactive damage control to a structured early-engagement program. Achieved 0.64 precision on the churn class despite a heavily imbalanced dataset.
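The minority-upsampling step can be sketched as follows. This is a simplified pandas version under assumed column names, not the Vodafone pipeline itself; it resamples the minority class with replacement until the two classes match.

```python
import pandas as pd

def upsample_minority(df, label_col="churn", random_state=42):
    """Duplicate minority-class rows (with replacement) until classes balance."""
    counts = df[label_col].value_counts()
    majority_label = counts.idxmax()
    minority_label = counts.idxmin()
    minority = df[df[label_col] == minority_label]
    extra = minority.sample(
        n=counts[majority_label] - counts[minority_label],
        replace=True,
        random_state=random_state,
    )
    return pd.concat([df, extra], ignore_index=True)

# Toy frame mirroring a roughly 70/30 imbalance.
data = pd.DataFrame({"tenure": range(10), "churn": [0] * 7 + [1] * 3})
balanced = upsample_minority(data)
```

One caveat worth noting: upsampling belongs inside the training fold only, after the train/test split, so duplicated minority rows never leak into evaluation.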
Financial institutions lose billions to undetected fraud, but the real challenge isn't catching fraud; it's catching it without drowning analysts in false alarms. With fraud representing only 0.17% of transactions, standard models either miss anomalies entirely or flag so many legitimate transactions that they become useless.
Engineered a high-precision scoring engine using CatBoost and Optuna. An empirical ablation study revealed that extreme outliers in features V1–V28 were critical fraud signals, not noise, so the preprocessing strategy was revised to retain them rather than clip them. This single insight was the turning point for the model's precision.
0.93 Precision and 0.82 Recall on the fraud class, meaning 93% of flagged transactions are genuine fraud cases. Prevented a total recall collapse by retaining critical outliers, a finding that came directly from systematic ablation testing rather than assumption.
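A toy illustration of why clipping destroyed the signal in that ablation. The numbers below are synthetic, not the actual dataset: legitimate transactions sit in a tight distribution while the rare fraud rows live far out in the tail, so quantile clipping collapses them into the legitimate range.

```python
import numpy as np

def clip_to_quantiles(x, lower=0.01, upper=0.99):
    """The 'clip outliers' preprocessing variant tested in the ablation."""
    lo, hi = np.quantile(x, [lower, upper])
    return np.clip(x, lo, hi)

rng = np.random.default_rng(0)
legit = rng.normal(0, 1, size=9_983)   # bulk of transactions
fraud = rng.normal(8, 1, size=17)      # rare, extreme values (~0.17%)
v1 = np.concatenate([legit, fraud])

clipped = clip_to_quantiles(v1)
# Because fraud is far below 1% of rows, the 99th percentile falls inside the
# legitimate distribution, and clipping pulls every fraud value down into it.
```

After clipping, the fraud rows are indistinguishable from legitimate ones on this feature, which is the recall collapse the ablation exposed and retaining outliers prevented.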
Python, SQL, AWS & Machine Learning
Professional Impact & Results
Let's turn your Data into Business Decisions