Browse by use case
Not sure where to start? Jump straight to the chapters most relevant to your product area. Each section maps a real PM problem to the concepts you'll need.
Search & Ranking
Improving search relevance, ranking results by predicted click or conversion, and evaluating ranking quality.
Ch 7
Ch 9
Ch 11
Ch 5
Supervised Learning
Ranking is a supervised problem — the model learns which result is most relevant from labeled data. Learn classification vs regression and how gradient boosting powers most ranking systems.
Evaluating ML Models
Precision@K, NDCG, and MRR are ranking-specific metrics built on the precision/recall intuition covered here. Learn to translate model metrics into product decisions.
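To make Precision@K concrete, here is a minimal sketch (the function name and toy data are ours, not the book's): of the top K results a query returns, what fraction are actually relevant?

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k ranked results that are relevant."""
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    return hits / k

# A query returns five results; judges marked d1 and d3 as relevant.
ranked = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3"}
print(precision_at_k(ranked, relevant, 3))  # 2 of the top 3 are relevant -> 0.666...
```

NDCG and MRR refine the same idea by also rewarding results for appearing higher in the list, not just for appearing in the top K at all.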
NLP & Large Language Models
Semantic search uses embeddings to match intent, not just keywords. Covers tokenization, embedding similarity, and how RAG retrieval works at a product level.
Metrics Design
Optimizing for clicks can hurt satisfaction. Learn how to design north star, input, and guardrail metrics that don't create perverse incentives in your ranking system.
Recommendations
Personalizing feeds, surfacing relevant content or products, and balancing exploration vs exploitation.
Ch 8
Ch 11
Ch 7
Ch 4
Unsupervised Learning
Collaborative filtering and user segmentation are the backbone of rec systems. Learn how clustering finds user archetypes and how patterns emerge without labels.
NLP & Large Language Models
Embedding models turn items and users into vectors — enabling "find similar" recommendations. Covers how embeddings encode meaning and enable similarity search.
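"Find similar" usually means cosine similarity between embedding vectors. A minimal sketch, with made-up 3-dimensional vectors (real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 = more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy item embeddings: two similar movies and one very different one.
movie_a = [0.9, 0.1, 0.3]
movie_b = [0.8, 0.2, 0.4]
movie_c = [0.1, 0.9, 0.1]
print(cosine_similarity(movie_a, movie_b) > cosine_similarity(movie_a, movie_c))  # True
```

The same comparison powers both "users like you" and "items like this": everything becomes a vector, and recommendation becomes nearest-neighbor search.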
Supervised Learning
Two-tower neural networks train on implicit behavioral signals like clicks and purchases: supervised learning applied to recommendations. Learn how models generalize from past behavior to new items.
A/B Testing & Experimentation
Rec systems are hard to A/B test — network effects, novelty bias, and position effects all distort results. Learn how to run experiments that actually tell you if a change is working.
Fraud Detection, Trust & Safety
Flagging suspicious transactions, detecting fake accounts, and building classifiers that survive adversarial behavior.
Ch 7
Ch 9
Ch 8
Ch 12
Supervised Learning
Fraud classifiers are supervised models trained on labeled fraud examples. Covers classification, decision thresholds, and why gradient boosting dominates tabular fraud data.
Evaluating ML Models
In fraud, false negatives (missed fraud) and false positives (blocking real users) have very different costs. Covers precision/recall tradeoffs and how to set decision thresholds when costs are asymmetric.
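One way to handle asymmetric costs is to price each error type and pick the threshold that minimizes total expected cost. A toy sketch (the scores, labels, and costs below are invented for illustration):

```python
def expected_cost(scores, labels, threshold, cost_fn, cost_fp):
    """Total cost of decisions at a given threshold.
    cost_fn: cost of missed fraud (false negative)
    cost_fp: cost of blocking a legitimate user (false positive)
    """
    cost = 0.0
    for score, is_fraud in zip(scores, labels):
        flagged = score >= threshold
        if is_fraud and not flagged:
            cost += cost_fn
        elif flagged and not is_fraud:
            cost += cost_fp
    return cost

# Toy model scores (higher = more fraud-like); a miss costs 100x a false block.
scores = [0.95, 0.80, 0.60, 0.40, 0.20, 0.05]
labels = [True, True, False, True, False, False]
best = min((t / 100 for t in range(0, 101)),
           key=lambda t: expected_cost(scores, labels, t, cost_fn=100.0, cost_fp=1.0))
print(best)  # 0.21: catches every fraud while blocking one legitimate user
```

Change the cost ratio and the optimal threshold moves; that ratio is a product decision, not a modeling one.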
Unsupervised Learning
Anomaly detection (isolation forests, autoencoders) catches novel fraud patterns before you have labels for them. Learn how unsupervised models flag unusual behavior.
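Real systems use models like isolation forests; as a much simpler stand-in that shows the core idea of "flag what sits far from normal," here is a z-score check on transaction amounts (data and function invented for illustration):

```python
import statistics

def zscore_outliers(values, threshold=2.0):
    """Flag values more than `threshold` standard deviations from the mean.
    A crude stand-in for real anomaly detectors like isolation forests."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Typical transaction amounts, plus one wildly unusual one.
amounts = [20, 35, 18, 42, 25, 30, 22, 5000]
print(zscore_outliers(amounts, threshold=2.0))  # [5000]
```

Note that no fraud labels were needed: the outlier is flagged purely because it looks unlike everything else, which is exactly why unsupervised detection catches novel fraud patterns first.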
AI Product Decisions
Covers AI security risks including adversarial inputs and prompt injection — relevant if your trust & safety system uses LLMs for moderation or policy enforcement.
Forecasting & Prediction
Predicting demand, churn, revenue, or user behavior — and knowing when your forecast is trustworthy.
Ch 7
Ch 2
Ch 3
Ch 6
Supervised Learning
Forecasting is regression — predicting a number from features. Covers linear regression, gradient boosting for tabular data, and overfitting traps that make forecasts look better than they are.
Stats Foundations
A forecast is only as useful as your uncertainty estimate. Distributions, variance, and why averages mislead — the foundation for knowing what your prediction interval actually means.
Probability & Uncertainty
Confidence intervals, p-values, and thinking in distributions instead of point estimates. Essential for presenting forecasts honestly to stakeholders.
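A point forecast plus an interval is more honest than the point alone. A minimal sketch using the normal approximation (the signup numbers are invented):

```python
import math

def forecast_interval(samples, z=1.96):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    half_width = z * math.sqrt(variance / n)
    return mean - half_width, mean + half_width

# Daily signups over two weeks: report a range, not just the average.
signups = [120, 135, 110, 140, 125, 130, 118, 122, 138, 127, 133, 115, 129, 124]
low, high = forecast_interval(signups)
print(f"mean daily signups likely between {low:.0f} and {high:.0f}")
```

"About 126, likely between 122 and 131" sets stakeholder expectations very differently than "126" alone, and that difference is the point of this chapter.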
How ML Actually Works
Overfitting is the silent killer of forecasting models — a model that memorizes past data fails on new data. Learn training vs test splits, overfitting, and what makes a model generalize.
Content Moderation
Classifying harmful content, managing human review queues, and balancing false positives against platform safety.
Ch 7
Ch 9
Ch 11
Ch 12
Supervised Learning
Content moderation classifiers are trained on labeled examples of violating vs non-violating content. Learn classification, decision thresholds, and what "accuracy" hides when classes are heavily imbalanced.
Evaluating ML Models
The cost of a miss (harmful content stays up) vs a false alarm (real content removed) varies by policy. Covers precision/recall, ROC curves, and how to set thresholds for different risk tolerances.
NLP & Large Language Models
LLM-based moderation can catch nuanced policy violations that keyword filters miss — but introduces new attack surfaces. Covers prompting, fine-tuning, and evaluation metrics for text classifiers.
AI Product Decisions
Prompt injection and jailbreaking are active threats to LLM-based moderators. Covers AI-specific security risks and the AI vs rules vs human decision framework.
Generative AI Features
Building LLM-powered features like copilots, summarization, Q&A, and AI assistants into your product.
Ch 11
Ch 12
Ch 10
Ch 9
NLP & Large Language Models
The foundation: tokenization, embeddings, attention, and the real trade-offs between prompting, RAG, and fine-tuning. What each approach costs and when to use which.
AI Product Decisions
Build vs buy, model selection, latency/cost/quality tradeoffs, and prompt injection risks — everything you need to make smart architectural decisions for an AI feature.
Neural Networks & Deep Learning
LLMs are large neural networks. Understanding layers, weights, and why deep learning needs so much data gives you the intuition to have better conversations with your ML team.
Evaluating ML Models
Evaluating LLM outputs is hard — there's no single accuracy number. Covers how to think about evaluation when ground truth is subjective, including BLEU, ROUGE, and human eval.
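To see what a metric like ROUGE actually measures, here is a deliberately simplified ROUGE-1 recall: the fraction of reference words that show up in the candidate summary (no stemming, no clipping of repeats; the example sentences are invented):

```python
def rouge1_recall(reference, candidate):
    """Fraction of reference words appearing in the candidate (simplified ROUGE-1 recall)."""
    ref = reference.lower().split()
    cand = set(candidate.lower().split())
    return sum(1 for w in ref if w in cand) / len(ref)

reference = "the model summarizes the report in two sentences"
candidate = "the model gives a two sentence summary of the report"
print(rouge1_recall(reference, candidate))  # 0.625
```

Notice the failure mode: "sentence" vs "sentences" counts as a miss, and a fluent summary in different words scores poorly. This is why word-overlap metrics are a weak proxy and human evaluation still matters.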
Churn & Retention
Predicting which users are at risk of churning, segmenting by engagement, and measuring the impact of retention interventions.
Ch 7
Ch 8
Ch 4
Ch 2
Supervised Learning
Churn prediction is a binary classification problem — will this user churn in the next 30 days? Covers logistic regression, gradient boosting, and setting probability thresholds for interventions.
Unsupervised Learning
User segmentation reveals which cohorts are at risk before you have churn labels. Learn how k-means and hierarchical clustering surface user archetypes with different retention profiles.
A/B Testing & Experimentation
Testing retention interventions (onboarding flows, nudges, paywalls) requires understanding long time horizons, selection bias, and when to stop an experiment early.
Stats Foundations
Churn rates are averages that hide bimodal distributions — power users and lurkers behave very differently. Learn why the mean misleads and how to think about the shape of your user population.
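A tiny illustration of the "mean misleads" point, with invented cohort data:

```python
import statistics

# Two cohorts: daily power users and occasional lurkers (sessions per month).
power_users = [28, 30, 25, 29, 27]
lurkers = [1, 2, 0, 1, 2]
everyone = power_users + lurkers

avg = statistics.mean(everyone)
print(avg)  # 14.5, yet not a single user has between 3 and 24 sessions
```

The average describes a user who does not exist. Retention interventions targeted at the "average user" miss both cohorts, which is why looking at the distribution's shape comes before any churn model.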
Want to start from the beginning?
View all 14 chapters →