Project Summary
Interactive Jupyter Notebook collection showcasing classical machine learning, statistical analysis, and visualization workflows — the foundational layer behind Drake Talley's production AI systems.
Technical deep dive
The Data Science Portfolio repository is an interactive showcase of classical machine learning, statistical modeling, and exploratory data analysis workflows built in Jupyter Notebook. It documents the foundational techniques that underpin my later production AI work: hypothesis-driven analysis, reproducible notebooks, clear visualization, and model evaluation discipline. For recruiters and search engines evaluating a senior data scientist's breadth, this repo demonstrates end-to-end competency across supervised learning, feature engineering, time series, clustering, and business-facing analytics — not just LLM demos.
What the portfolio covers
- Supervised learning workflows: regression, classification, and model comparison with cross-validation
- Exploratory data analysis with pandas, NumPy, and matplotlib/seaborn visualization patterns
- Feature engineering and preprocessing pipelines suitable for tabular business data
- Statistical testing and interpretation — not just accuracy scores, but why a model behaves as it does
- Reproducible Jupyter notebooks with narrative markdown explaining each analytical decision
- Portfolio-ready presentation of results for stakeholders who need context, not just code
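As a rough illustration of the first and third items in the list above, the sketch below compares two classifiers with stratified 5-fold cross-validation behind a shared preprocessing pipeline. The data is synthetic, and the column names (`tenure_months`, `monthly_spend`, `plan`), the two models, and the ROC AUC metric are assumptions made for the example; none of them is taken from the repository's notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic tabular data (hypothetical columns, not from the repository).
rng = np.random.default_rng(42)
n = 400
df = pd.DataFrame({
    "tenure_months": rng.integers(1, 72, size=n),
    "monthly_spend": rng.gamma(shape=2.0, scale=30.0, size=n),
    "plan": rng.choice(["basic", "plus", "pro"], size=n),
})

# The target depends on the features plus noise, so there is signal to find.
signal = (0.03 * df["monthly_spend"]
          - 0.04 * df["tenure_months"]
          + 0.8 * (df["plan"] == "pro"))
y = (signal + rng.normal(0, 1, size=n) > signal.median()).astype(int)

# Scale numeric columns and one-hot encode the categorical column.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["tenure_months", "monthly_spend"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["plan"]),
])

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Preprocessing lives inside the pipeline, so it is refit on each training
# fold and never sees the held-out fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_results = {}
for name, model in models.items():
    pipe = Pipeline([("preprocess", preprocess), ("model", model)])
    scores = cross_val_score(pipe, df, y, cv=cv, scoring="roc_auc")
    cv_results[name] = (scores.mean(), scores.std())
    print(f"{name}: ROC AUC {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the fold-to-fold standard deviation alongside the mean makes it easier to judge whether a gap between two models is larger than the noise in the estimate.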
Why this repo still matters in the GenAI era
Generative AI gets headlines, but most enterprise value still flows through structured data, tabular models, and rigorous evaluation. This portfolio anchors my profile in that reality. It shows I can build the statistical and ML foundations that production systems depend on: the same discipline I later applied to fraud scoring (SentinelAI), customer segmentation (GameEdge), and MLOps dashboards. Search terms like "senior data scientist Jupyter portfolio", "machine learning case studies", and "reproducible data science workflows" map directly to this repository.
Tech stack
| Layer | Tools | Purpose |
|---|---|---|
| Analysis | Jupyter Notebook, IPython | Interactive exploration and narrative documentation |
| Data | pandas, NumPy | Tabular manipulation and numerical computation |
| ML | scikit-learn, statsmodels | Classical ML and statistical modeling |
| Viz | matplotlib, seaborn, plotly | Exploratory and presentation-quality charts |
| Language | Python 3.x | Primary implementation language |
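A quick way to confirm that an environment provides the stack in the table is to query installed package metadata. This is a minimal sketch using only the standard library. The distribution names are assumptions based on the table (for example, `jupyterlab` for Jupyter Lab and `scikit-learn` for the `sklearn` import); the repository's `requirements.txt` remains the authoritative list.

```python
from importlib import metadata

# PyPI distribution names assumed from the tech stack table.
STACK = [
    "jupyterlab", "ipython", "pandas", "numpy", "scikit-learn",
    "statsmodels", "matplotlib", "seaborn", "plotly",
]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(STACK)
for name, version in report.items():
    print(f"{name:<14} {version or 'MISSING'}")
```

Running this inside the activated virtual environment shows at a glance which packages still need to be installed.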
Getting started
Clone the repository, create a virtual environment, install requirements, and launch Jupyter Lab or Notebook. Each project folder contains a self-contained notebook with data (or data download instructions) and a README explaining the business question being answered.
git clone https://github.com/cdtalley/Data-Science-Portfolio
cd Data-Science-Portfolio
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
jupyter lab
Related production work
Techniques demonstrated here extend directly into my featured repos: XGBoost fraud scoring in SentinelAI, RFM segmentation in GameEdge Intelligence, and drift monitoring patterns in enterprise MLOps dashboards. Start here for foundational ML; follow the featured project deep dives for production architecture.
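For readers unfamiliar with RFM (recency, frequency, monetary) segmentation, the sketch below shows the basic idea with pandas on a tiny hand-made transaction log. The column names, the three-bin scoring, and the summed total are illustrative assumptions. This is not the GameEdge Intelligence implementation.

```python
import pandas as pd

# Hypothetical transaction log; column names are illustrative.
tx = pd.DataFrame({
    "customer_id": ["a", "a", "b", "c", "c", "c", "d"],
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-03-01", "2023-11-20",
        "2024-02-10", "2024-02-25", "2024-03-10", "2023-09-01",
    ]),
    "amount": [40.0, 60.0, 25.0, 100.0, 80.0, 120.0, 15.0],
})

# Measure recency relative to the day after the last observed order.
snapshot = tx["order_date"].max() + pd.Timedelta(days=1)

rfm = tx.groupby("customer_id").agg(
    recency_days=("order_date", lambda d: (snapshot - d.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)


def score(series, higher_is_better=True, bins=3):
    """Map values to scores 1..bins, where a higher score is better."""
    # Ranking first breaks ties, so qcut never sees duplicate bin edges.
    ranks = series.rank(method="first", ascending=higher_is_better)
    return pd.qcut(ranks, q=bins, labels=list(range(1, bins + 1))).astype(int)


rfm["r_score"] = score(rfm["recency_days"], higher_is_better=False)
rfm["f_score"] = score(rfm["frequency"])
rfm["m_score"] = score(rfm["monetary"])
rfm["rfm_total"] = rfm["r_score"] + rfm["f_score"] + rfm["m_score"]

print(rfm.sort_values("rfm_total", ascending=False))
```

Summing the three scores is only one way to combine them. Segment labels are often assigned from the individual scores instead, depending on the business question.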
Key Features & Capabilities
- Supervised learning with cross-validation and model comparison
- Exploratory data analysis with pandas, NumPy, and seaborn
- Feature engineering and preprocessing for tabular business data
- Statistical testing with narrative markdown documentation
- Reproducible notebooks suitable for portfolio and stakeholder review
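The statistical testing item above can be illustrated with a two-sided permutation test for a difference in means, written with NumPy only. The two groups are simulated, and the names `control` and `variant`, the sample sizes, and the effect size are assumptions made for the example rather than results from the notebooks.

```python
import numpy as np


def permutation_test(a, b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference mean(a) - mean(b) and an estimated
    p-value under the null hypothesis that group labels are exchangeable.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    count = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)
        diff = shuffled[:a.size].mean() - shuffled[a.size:].mean()
        if abs(diff) >= abs(observed):
            count += 1

    # The add-one correction keeps the estimated p-value strictly positive.
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value


# Simulated metric for two groups (illustrative values only).
rng = np.random.default_rng(1)
control = rng.normal(loc=50.0, scale=10.0, size=200)
variant = rng.normal(loc=53.0, scale=10.0, size=200)

diff, p = permutation_test(variant, control)
print(f"mean difference = {diff:.2f}, permutation p-value = {p:.4f}")
```

A permutation test makes no normality assumption, which makes it a convenient default for skewed business metrics. The p-value still needs to be read alongside the size of the difference, not instead of it.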
Getting Started
1. Clone and install
Requires Python 3.x and Jupyter Lab or Notebook.
git clone https://github.com/cdtalley/Data-Science-Portfolio
cd Data-Science-Portfolio
pip install -r requirements.txt
jupyter lab
Frequently asked questions
- What is the Data Science Portfolio repository?
- An interactive Jupyter Notebook collection showcasing classical machine learning, statistical analysis, and data visualization workflows. It demonstrates foundational data science skills that underpin Drake Talley's production AI systems.
- Does this portfolio include deep learning or LLM projects?
- This repo focuses on classical ML and statistical methods. Deep learning, RAG, and multi-agent systems are covered in separate featured repositories (DocuMind, AutoFlow, SentinelAI) with dedicated architecture articles on draketalley.ai/blog.
- How do I run the notebooks?
- Clone the repo, install requirements.txt in a Python virtual environment, and launch Jupyter Lab. Each project folder includes its own README with data sources and expected outputs.
- Is there a live demo?
- Selected work is published at chandlerdraketalley.com. The GitHub repository contains the full notebook source for local execution and review.
