Courses
Premium, text-based courses for senior engineers who want depth, not fluff.
Bash & Shell Scripting for Engineers
A free course covering Bash for engineers who copy-paste shell from Stack Overflow and want to stop. Parsing order, quoting, arrays, set -euo pipefail, traps, input handling, ShellCheck, and the Makefile patterns you see in every serious repo.
Docker & Container Fundamentals
A free course covering what containers actually are, how Docker images work, and how to run and debug them in production. Builds on the Linux namespaces and cgroups work, so nothing is left as magic.
etcd Operations Masterclass
The complete production guide to etcd, the storage engine behind every Kubernetes cluster. Internals, sizing, backups, monitoring, disaster recovery, and the troubleshooting playbook for the failures that actually happen in production. If your cluster runs on etcd, this course keeps it running.
Git Internals for Engineers
A free course covering what Git actually is underneath (a content-addressable filesystem) and the workflows senior engineers use every day: rebasing cleanly, recovering lost work, signing commits, and debugging with blame and bisect.
GPU Cost Optimization on Kubernetes
GPUs are the most expensive line item in any cluster running ML workloads. Most teams overspend by 40-70% on the wrong GPU types, idle capacity, and missed autoscaling. This course is the operational playbook for cutting GPU spend in half without breaking production: right-sizing GPU types, MIG partitioning, GPU-aware autoscaling, spot and reserved capacity strategies, and cost attribution that makes engineers actually care.
Identity and Trust for DevOps Engineers
A scenario-driven course on identity and trust for DevOps and platform engineers. Covers cryptographic primitives, TLS and PKI, OAuth 2.0 and OpenID Connect, SAML, LDAP and Active Directory, IdP-as-a-service (Okta), JWTs and key rotation, mTLS and service identity, authorization patterns (RBAC, ABAC, ReBAC), operational SSO, debugging identity and TLS flows, threat modeling, and a full enterprise-identity capstone.
Kubernetes Architecture & Chaos
How Kubernetes actually works under the hood, from the API server request lifecycle to etcd Raft to the scheduler framework, paired with chaos engineering reasoning that turns architectural knowledge into operational confidence. Built for the interview question "walk me through what happens when you create a pod" and the production question "how do we test resilience without breaking customers?"
Kubernetes System Design Interview Prep
Master Kubernetes system design interviews with a structured framework, real-world scenarios, and quantitative reasoning. Covers HA, multi-tenancy, networking, security, and cost: the topics asked about at senior and staff-level interviews.
Kubernetes Cluster Upgrades with kubeadm
The complete guide to upgrading Kubernetes clusters in production using kubeadm. From planning and validation to control plane upgrades, worker node rollouts, and automation.
Kubernetes Debugging for SREs
The systematic debugging playbook for Kubernetes in production. From the layered debugging mental model (App → Pod → Node → Cluster → Cloud) to the 3 AM incident playbook. Pod failures, node issues, networking problems, storage debugging, control plane diagnostics, and incident response, built from real production incident experience.
Kubernetes Performance Optimization
A deep, scenario-driven course on making Kubernetes clusters faster, leaner, and properly tuned. Covers control plane tuning, workload optimization, network and storage performance, autoscaling, and cloud-managed cluster optimization. Every lesson starts with a real performance problem, diagnoses the root cause, and implements the fix with measurable results. Built for DevOps engineers, SREs, and platform engineers who need to squeeze every last bit of performance out of their clusters and ace the interview question 'your cluster is slow, what do you do?'
Actively expanding · existing students get all new lessons free
Kubernetes Security for DevOps Engineers
A scenario-driven course that teaches Kubernetes security through real attack paths and defense architectures. Covers the Kubernetes API, RBAC, threat modeling with STRIDE, network policies, runtime security, and compliance, all through the lens of real breaches and FAANG-level interview questions.
Linux Fundamentals for Engineers
A free course covering the Linux fundamentals that senior engineers are expected to know. Filesystems, processes, systemd, cgroups, and namespaces: the foundation everything else is built on.
Production LLM Inference on Kubernetes
Deep production knowledge for engineers running LLM inference on self-managed Kubernetes. vLLM optimization, gateway architecture, observability, debugging, and cost modeling, all from real H100 production deployments. Lifetime updates included.
LLM Operations for MLOps Engineers
A comprehensive course covering 31 essential LLM concepts through the lens of MLOps engineering. Every lesson teaches the concept, then shows you how to operationalize it at scale, with real interview scenarios and system design questions from FAANG+ companies.
Actively expanding · existing students get all new lessons free
Networking Fundamentals for Engineers
The networking knowledge every DevOps engineer needs: OSI model, DNS, TCP/IP, HTTP, load balancing, and the troubleshooting toolkit that turns 'the app is down' into a root cause in minutes. Completely free.
Observability Fundamentals for Engineers
A free course covering the observability knowledge that separates engineers who debug incidents in minutes from those who stare at Grafana for hours. The three pillars, PromQL, OpenTelemetry, SLOs with error budgets, and the dashboard/alert patterns that actually work in production.
Production GPU Infrastructure on Kubernetes
The complete guide to running GPU workloads on Kubernetes in production. From NVIDIA drivers to vLLM serving at scale.
Production Kubernetes Operations
The Day 2 playbook for production Kubernetes. Identity, storage, networking, scaling, monitoring, upgrades, cost management, and disaster recovery, across self-managed and managed clusters. Not cert prep. Not tutorial happy-path. The knowledge teams learn the hard way, packaged before the outage.
SSL/TLS & Certificate Management for Kubernetes Engineers
From encryption fundamentals to production cert management on Kubernetes. Master TLS handshakes, X.509 certificates, cert-manager, mTLS with service mesh, and the 3 AM cert expiry runbook.