Hi, I'm John

Bridging the gap between complex data pipelines and executive strategy. I leverage SQL, Python, and advanced BI frameworks to clean large-scale transactional data and build predictive analytics infrastructure.

John Analyst Profile
About
2+
Years On Data

About Me

I am an Advanced Data Analyst and Applied Statistician specializing in transforming large-scale, unstructured data into high-impact business intelligence. My work bridges the gap between complex statistical models and executive-level strategy.

With a focus on the commercial, retail, and financial sectors, my expertise spans the full data analytics lifecycle: engineering public data extraction pipelines, designing robust ETL workflows, modeling customer behavior via unsupervised machine learning, and delivering enterprise-grade interactive dashboards.

Data Quality & Validation

Relational Databases & Data Modeling

Enterprise BI & Dashboards

Automated Data Pipelines (ETL)

What I Do

Data Collection & Survey Architecture

I design intelligent mobile surveys using ODK (Open Data Kit) and KoboToolbox, integrating them directly with automated Google Sheets trackers for real-time field monitoring and research analytics. I also extract large public datasets from scratch, whether mining market intelligence or syncing with live APIs, to deliver the raw data needed to drive strategy.

Data Quality & Wrangling

Using Python and SQL, I handle deep data engineering. I clean, transform, and reshape complex transactional records into dependable, optimized database schemas built for fast query execution.

Exploratory Analytics

I apply rigorous statistical summaries and correlation checks to uncover the underlying patterns within your historical performance. This bridges the gap between raw numbers and clear, foundational business intelligence.

Enterprise BI & Visualization

I design high-impact, interactive dashboards using Power BI and Python visualization libraries. I translate complex analytics into dynamic visual narratives tailored to category managers and corporate stakeholders.

Pipeline Automation & ETL

I write production-ready code to fully automate your data workflows, establishing scheduled ETL pipelines that run seamlessly in the cloud. This transforms manual work into a repeatable, hands-off system.

Market Research

I execute large-scale web scraping and sentiment analysis pipelines to track market trends, consumer behavior, and competitive landscapes.

Predictive Analytics & Modeling

I deploy advanced statistical and machine learning models (such as K-Means clustering and predictive algorithms) to forecast demand, evaluate credit risk, and segment consumer behavior with high precision.
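
To make the segmentation idea concrete, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python. The customer features (monthly visits, monthly spend), the sample values, and the function names are illustrative assumptions, not taken from a real project. In practice a library such as scikit-learn, with feature scaling applied first, would normally be used.

```python
from math import dist

def assign_labels(points, centroids):
    """Return the index of the nearest centroid (Euclidean) for each point."""
    return [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points]

def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(float(v) for v in c) for c in initial_centroids]
    for _ in range(max_iter):
        labels = assign_labels(points, centroids)
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # New centroid is the per-dimension mean of the cluster members.
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                # Keep the previous centroid if a cluster ends up empty.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    return assign_labels(points, centroids), centroids

# Hypothetical customers: (visits per month, spend per month).
customers = [(1, 100), (2, 120), (1, 110), (10, 900), (12, 950), (11, 1000)]
labels, centroids = kmeans(customers, initial_centroids=[customers[0], customers[3]])
```

Here the starting centroids are passed in explicitly so the result is reproducible. Random or k-means++ initialization is the more common choice when the segments are not known in advance.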

Featured Projects

SQL Project

Retail Intelligence Analysis

Processed 260K+ transaction records, using text-mining and regex engines to extract structural features and profile consumer cohorts. Developed an algorithmic similarity matching engine leveraging Pearson Correlation (r) and Min-Max Magnitude Distance to establish pre-trial historical baselines. Applied automated two-sample t-tests to evaluate store layout trials, successfully validating net revenue expansion while executing critical diagnostic audits on localized traffic anomalies to protect corporate profit margins.

Python, Statistics, A/B Testing, Data Engineering, Executive Strategy Reporting
View Project
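
As an illustration of the similarity matching described in this project, the sketch below ranks candidate control stores against a trial store by combining Pearson correlation with a min-max scaled magnitude distance. The exact scoring used in the project is not shown on this page, so the equal weighting, the per-month scaling across candidates, the store IDs, and the sales figures are all assumptions.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm = sqrt(sum((a - mean_x) ** 2 for a in x)) * sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / norm if norm else 0.0

def magnitude_scores(trial, candidates):
    """For each period, min-max scale |trial - candidate| across all candidates,
    convert to a closeness score (1 = closest), then average over periods."""
    totals = {store: 0.0 for store in candidates}
    for t, trial_value in enumerate(trial):
        diffs = {store: abs(trial_value - series[t]) for store, series in candidates.items()}
        lo, hi = min(diffs.values()), max(diffs.values())
        for store, d in diffs.items():
            totals[store] += 1.0 if hi == lo else 1 - (d - lo) / (hi - lo)
    return {store: total / len(trial) for store, total in totals.items()}

def rank_controls(trial, candidates, weight=0.5):
    """Blend correlation and magnitude closeness; the highest score is the best match."""
    magnitude = magnitude_scores(trial, candidates)
    combined = {
        store: weight * pearson_r(trial, series) + (1 - weight) * magnitude[store]
        for store, series in candidates.items()
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)

# Hypothetical pre-trial monthly sales for one trial store and three candidates.
trial_sales = [100, 110, 120, 130]
candidate_sales = {
    "A": [102, 111, 119, 131],
    "B": [200, 190, 180, 170],
    "C": [50, 55, 60, 65],
}
ranked = rank_controls(trial_sales, candidate_sales)
```

Correlation alone would rate store C as a perfect match because it follows the same trend at half the volume. The magnitude term penalizes that gap, which is why the two measures are combined.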
Power BI Dashboard

Power BI Executive Dashboard

Constructed a dynamic, KPI-focused executive reporting suite tracking metrics for an e-commerce operation ($323K+ total sales across 15K+ transactions). Developed advanced, custom DAX measures to calculate rolling averages, year-over-year revenue velocity, and average transaction size bounds. Engineered interactive geospatial heat maps and cohort distribution matrices, giving stakeholders an immediate, scannable overview of regional performance and sales density.

Power BI, DAX, Data Visualization, Business Intelligence, Executive Reporting
View Project
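
The DAX measures themselves are not reproduced on this page. As a rough Python equivalent of two of the calculations named above, the sketch below computes a rolling average and a year-over-year change over a series of period totals. Treating "revenue velocity" as a simple year-over-year growth rate is an assumption, and the function names are illustrative.

```python
def rolling_average(values, window):
    """Trailing mean over `window` periods; None until enough history exists."""
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            result.append(sum(values[i + 1 - window:i + 1]) / window)
    return result

def year_over_year(values, periods=12):
    """Fractional change versus the same period one cycle earlier.
    Returns None where there is no prior value or the prior value is zero."""
    result = []
    for i, current in enumerate(values):
        if i < periods or values[i - periods] == 0:
            result.append(None)
        else:
            previous = values[i - periods]
            result.append((current - previous) / previous)
    return result

smoothed = rolling_average([10, 20, 30, 40], window=3)        # [None, None, 20.0, 30.0]
growth = year_over_year([100, 200, 150, 300], periods=2)      # [None, None, 0.5, 0.5]
```

With monthly data the default `periods=12` compares each month with the same month a year earlier. The short `periods=2` call above is only there to keep the example small.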
Python Analysis

Python Retail Performance Analysis

Exploratory analysis using Seaborn's MPG dataset to uncover trends in retail performance.

Python, Pandas, Seaborn
View Project
Loan Default Prediction

Loan Default Prediction

Automated credit risk model using machine learning techniques.

Python, ML, Scikit-learn
View Project
Excel Analysis

Excel Data Cleaning & EDA

Data cleaning, pivoting, and calculations with dashboard visuals.

Excel Pivot Analysis
View Project
Statistical Analysis

Statistical Analysis Report

KPI-focused reports built in Python, with role-based access and automated refresh.

Python, ML, Statistics
View Project

Technical Proficiency

Technical Skills

Python 80%
SQL 85%
Tableau 65%
Excel 92%
Power BI 87%

Professional Achievements

2+
Years On Data
14+
Data Solutions
10+
Freelance Clients
5+
Working Tools

Get in Touch

Let's Connect

Location
Nairobi, Kenya