GitHub Actions Type Checking for Python

Wiring mypy and pyright into GitHub Actions turns type annotations from a local courtesy into an enforced contract: every pull request runs the same checker, against the same pinned versions, across the same Python matrix, and the build fails when a type error slips through. This guide builds a complete typecheck.yml workflow — matrix over interpreter versions, dependency and .mypy_cache caching, inline PR annotations, and a gate job — then explains the choices that keep it fast and reproducible. It complements the broader Static Analysis Tools & CI Integration standards and the local mirror provided by pre-commit hooks.

Each PR runs checkout → install → cache restore → mypy/pyright → gate; a non-zero checker exit fails the gate.

A complete typecheck.yml

The workflow below is self-contained: it triggers on pushes to main and on every pull request, builds a matrix over three interpreter versions, restores caches, and runs mypy; the pyright job is added in the next section. Pinning matters: actions/checkout, actions/setup-python, and actions/cache are pinned to major versions, and the checkers themselves are pinned in pyproject.toml (or a lockfile) so that a silent upstream release never turns a green build red.

# .github/workflows/typecheck.yml — GitHub Actions, mypy 1.x + pyright 1.1.x
name: typecheck

on:
  push:
    branches: [main]
  pull_request:

permissions:
  contents: read

concurrency:
  group: typecheck-${{ github.ref }}
  cancel-in-progress: true

jobs:
  mypy:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.10", "3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
          cache: pip                       # caches the pip download wheelhouse

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -e ".[dev]"          # mypy + pyright pinned in pyproject

      - name: Restore mypy incremental cache
        uses: actions/cache@v4
        with:
          path: .mypy_cache
          key: mypy-${{ matrix.python-version }}-${{ hashFiles('**/pyproject.toml', '**/uv.lock', '**/requirements*.txt') }}
          restore-keys: |
            mypy-${{ matrix.python-version }}-

      - name: Run mypy
        run: >
          mypy --strict
          --python-version ${{ matrix.python-version }}
          --show-error-codes
          --output=json
          .

mypy --strict is the policy switch; see mypy configuration & strictness for what it bundles. --python-version sets the analysis target explicitly and overrides any python_version in pyproject.toml, so each matrix leg checks the version-specific branches for its own interpreter (for example, code guarded by sys.version_info >= (3, 11), or imports such as tomllib that exist only on 3.11 and later).
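As a small illustration of version-specific analysis, the sketch below uses a hypothetical helper, describe, whose guarded branch refers to BaseExceptionGroup, a builtin that exists only on Python 3.11 and later. mypy evaluates sys.version_info comparisons against the target version, so a 3.10 leg should skip the guarded branch while the 3.11 and 3.12 legs analyze it.

```python
import sys


def describe(exc: BaseException) -> str:
    """Summarize an exception, using exception groups where the target has them."""
    if sys.version_info >= (3, 11):
        # Checked only when the analysis target is 3.11 or later;
        # BaseExceptionGroup is not a builtin before that.
        if isinstance(exc, BaseExceptionGroup):
            return f"{len(exc.exceptions)} grouped errors"
    return type(exc).__name__


print(describe(ValueError("boom")))
```

A mistake inside the guarded branch would therefore surface only on the legs that target 3.11 or later, which is the kind of divergence the matrix exists to reveal.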

Failing the build on type errors

Both checkers communicate through their exit code, which is exactly what GitHub Actions reads to decide pass/fail. mypy exits 0 when clean, 1 when it finds type errors, and 2 on a fatal or usage error; pyright exits 0 when clean, 1 when it reports at least one error, and a higher code on fatal or configuration failures. Because each runs as the final command in its step, no extra scripting is needed: any non-zero exit marks the step failed and the job red.

Do not mask the exit code. Patterns like mypy . || true or trailing continue-on-error: true silently convert a real [arg-type] regression into a passing build. If you want type errors to be advisory during an early migration, prefer scoping which code is checked (via overrides) over discarding the exit status.
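To see why these patterns hide failures, the sketch below runs three shell lines and reports the exit status a CI step would observe. It assumes a POSIX shell is available; step_status is a hypothetical helper, and the false command stands in for a checker that found errors.

```python
import subprocess


def step_status(shell_line: str) -> int:
    """Return the exit status a CI step would see for this shell line."""
    # shell=True runs the line through /bin/sh on POSIX systems.
    return subprocess.run(shell_line, shell=True).returncode


# `false` stands in for a checker that exits 1 after finding errors.
print(step_status("false"))          # the failing command on its own
print(step_status("false || true"))  # `|| true` replaces the failure with success
print(step_status("false | cat"))    # a pipeline reports its last command's status
```

Only the first line reports a failure. The third case matters for steps that pipe checker output into another command such as tee: without pipefail, the pipeline's status is that of the last command, so it is worth confirming which shell options your run steps actually use.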

# .github/workflows/typecheck.yml (pyright job) — GitHub Actions, pyright 1.1.x
  pyright:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip
      - run: pip install -e ".[dev]"
      - name: Run pyright
        run: pyright --outputjson > pyright-report.json || true
        # capture JSON for annotations; real gate is the next step
      - name: Fail on pyright errors
        run: pyright            # exits 1 on reportGeneralTypeIssues, reportArgumentType, etc.

The two-step split lets you keep a machine-readable report (--outputjson) for annotations while the bare pyright invocation provides the authoritative gate. Pyright’s error categories — reportArgumentType, reportGeneralTypeIssues, reportOptionalMemberAccess, reportMissingImports — are the names readers grep for; surface them verbatim so CI logs stay searchable.

Inline annotations and problem matchers

A red check is useful; an inline comment on the offending line is better. mypy has no built-in GitHub annotation format, so its text output needs a problem matcher, while pyright has a community action that creates the annotations for you.

# .github/workflows/typecheck.yml (annotation steps): problem matcher, mypy 1.x
      - name: Register mypy problem matcher
        run: echo "::add-matcher::.github/mypy-matcher.json"   # hypothetical path; you supply this file
      - name: Run mypy with annotations
        run: mypy --strict .
        # the matcher turns each file:line: error: message [code] line into an annotation

The --output flag was added in mypy 1.11, but as of the 1.13.0 release pinned in this guide it accepts only json, so mypy cannot print ::error workflow commands itself. Instead, register a problem matcher JSON that parses file:line: error: message [code], so that each [no-untyped-def] or [return-value] lands as an annotation on the reported line. Pyright users can adopt jakebailey/pyright-action, which renders reportX diagnostics as annotations without custom parsing.
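The file:line: error: message [code] shape can be prototyped in Python before it is translated into a problem matcher. The sketch below is an assumption-laden starting point: MYPY_LINE and parse_mypy_line are hypothetical names, the pattern ignores Windows drive-letter paths, and a real matcher file uses JavaScript-style regular expressions with numbered rather than named groups.

```python
import re
from typing import Optional

# file:line[:column]: severity: message  [code]
MYPY_LINE = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+):(?:(?P<column>\d+):)? "
    r"(?P<severity>error|warning|note): "
    r"(?P<message>.*?)(?:\s+\[(?P<code>[a-z0-9-]+)\])?$"
)


def parse_mypy_line(line: str) -> Optional[dict[str, Optional[str]]]:
    """Return the captured fields, or None for lines that are not diagnostics."""
    match = MYPY_LINE.match(line.rstrip())
    return match.groupdict() if match else None


example = (
    'src/app.py:12: error: Incompatible return value type '
    '(got "int", expected "str")  [return-value]'
)
print(parse_mypy_line(example))
print(parse_mypy_line("Found 1 error in 1 file (checked 3 source files)"))
```

The column group is optional so that the same pattern also covers output produced with --show-column-numbers; summary lines such as the "Found 1 error" footer fall through as None.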

# .github/workflows/typecheck.yml — pyright annotations via action, pyright 1.1.x
      - uses: jakebailey/pyright-action@v2
        with:
          version: 1.1.389          # pin the pyright version explicitly
          python-version: "3.12"

Pinning version: on the action is the same discipline as pinning in the lockfile: pyright’s bundled type stubs change between releases, and an unpinned bump can introduce reportMissingTypeStubs diagnostics that did not exist yesterday.

Caching for speed

mypy’s incremental mode writes a per-module fingerprint cache to .mypy_cache; restoring it across runs skips re-analysis of unchanged modules and is usually the largest CI speedup available for mypy. The cache key must include the Python version and a hash of the dependency lockfile, because a dependency upgrade can change inferred types. Pyright has no equivalent on-disk type cache, so the win there is caching the dependency install instead. Both strategies are covered in depth in caching mypy and pyright in GitHub Actions.

For large repositories, you can additionally scope a PR-only job to changed files — see running mypy only on changed files — while keeping a full mypy . on main.

Version pinning

Three things must be pinned for reproducibility: the action versions (actions/checkout@v4, actions/setup-python@v5, actions/cache@v4), the interpreter versions (the matrix list), and the checker versions. Pin the checkers in pyproject.toml or a lockfile, never on a floating pip install mypy — a minor mypy release routinely tightens inference and surfaces new [unreachable] or [truthy-bool] diagnostics.

# pyproject.toml — pin checkers so CI is reproducible
[project.optional-dependencies]
dev = [
  "mypy==1.13.0",
  "pyright==1.1.389",
  "types-requests==2.32.0.20240914",
]

[tool.mypy]
python_version = "3.10"
strict = true
show_error_codes = true

Common pitfalls

Masking the exit code. mypy . || true and continue-on-error: true turn a hard gate into a no-op; a [arg-type] regression merges green. Scope the checked code instead of discarding the status.
Unpinned checkers. A floating pip install mypy pyright makes the build a moving target. A new release emitting [unreachable] will redden an unrelated PR. Pin in the lockfile.
Caching without the lockfile in the key. A stale .mypy_cache keyed only on Python version can hide errors a dependency bump introduced. Always hash the lockfile into the key.
Forgetting --python-version in a matrix. With python_version = "3.10" set in pyproject.toml, every matrix leg checks against that same target unless the command line overrides it, so the matrix tests nothing useful: only the runner’s interpreter varies, not the analysis target.

FAQ

Should I run mypy and pyright in the same job or separate jobs? Separate jobs. They have different caching needs and run in parallel, so total wall-clock time is bounded by the slower one rather than their sum. Separate jobs also produce two distinct, individually re-runnable checks.

Do I need both checkers in CI? Many teams run one as the gate and the other advisory. They disagree on edge cases — see pyright vs mypy — so running both catches more, at the cost of reconciling divergent diagnostics.

How do I keep CI consistent with local checks? Pin identical checker versions in pyproject.toml and run them through pre-commit hooks locally. Identical versions and flags eliminate “works on my machine” type drift.

Why does the matrix use fail-fast: false? So one failing interpreter version does not cancel the others. You want to see whether a [no-untyped-def] error is version-specific or universal, which requires all legs to finish.
