Debugging Guide
Debug Dagster asset code with full breakpoint support, variable inspection, and Data Wrangler integration.
Quick Start
Start Dagster with debugging enabled by running DAGSTER_DEBUG=1 uv run dg dev. Then in VS Code:
- Select "Dagster: Attach to dg dev" from the debug dropdown
- Press F5 to attach
- Set breakpoints in any asset file
- Materialize from the Dagster UI at http://localhost:3000
Setup
Two pieces make this work: the debugpy hook in definitions.py and the VS Code launch configuration.
1. The debugpy Hook
The debug hook lives at the top of definitions.py, before any asset imports. It runs when Dagster loads the code location and again in each step subprocess:
import os

# Enable remote debugging when DAGSTER_DEBUG=1
# Start: DAGSTER_DEBUG=1 uv run dg dev
# Then attach VS Code debugger ("Dagster: Attach to dg dev")
if os.environ.get("DAGSTER_DEBUG"):
    # Silence the frozen-modules warning; must be set before debugpy is imported
    os.environ["PYDEVD_DISABLE_FILE_VALIDATION"] = "1"
    import debugpy

    try:
        # Code server (parent): open the listener that VS Code attaches to
        debugpy.listen(("localhost", 5678))
        print("debugpy listening on port 5678 — attach your IDE debugger")
    except RuntimeError:
        # Subprocess: pause briefly so debugpy can sync breakpoints from VS Code
        debugpy.trace_this_thread(True)
        import time

        time.sleep(0.5)
This handles two scenarios:
| Process | What happens |
|---|---|
| Code server (parent) | debugpy.listen() succeeds, binds to port 5678. VS Code attaches here. |
| Step subprocess | debugpy.listen() raises RuntimeError (port taken). We call trace_this_thread(True) and sleep 0.5s so the debug adapter can propagate breakpoints from VS Code before asset code runs. |
Why the sleep?
Dagster's multiprocess executor spawns a new subprocess for each step. debugpy auto-attaches to these subprocesses, but VS Code needs time to send breakpoint locations. Without the 0.5s pause, the subprocess starts executing asset code before breakpoints are set — a race condition that causes breakpoints to be silently skipped.
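Because the right pause length depends on the machine, one option is to make it configurable. The following is a minimal sketch, not part of the hook above: the environment variable DAGSTER_DEBUG_SYNC_SECONDS and the helper breakpoint_sync_delay are hypothetical names introduced for illustration.

```python
import math
import os

DEFAULT_SYNC_SECONDS = 0.5  # same pause the hook uses


def breakpoint_sync_delay(env=None) -> float:
    """Return the subprocess pause in seconds.

    Reads DAGSTER_DEBUG_SYNC_SECONDS (a hypothetical variable name) and
    falls back to the default when it is unset or not a usable number.
    """
    env = os.environ if env is None else env
    raw = env.get("DAGSTER_DEBUG_SYNC_SECONDS", "")
    try:
        delay = float(raw)
    except ValueError:
        return DEFAULT_SYNC_SECONDS
    # Reject negative, NaN, and infinite values; time.sleep cannot use them.
    if not math.isfinite(delay) or delay < 0:
        return DEFAULT_SYNC_SECONDS
    return delay
```

In the except branch of the hook, time.sleep(0.5) would then become time.sleep(breakpoint_sync_delay()), so a slower machine can raise the pause without a code change.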
2. The Launch Configuration
The attach configuration in .vscode/launch.json connects VS Code to the debugpy listener:
{
    "name": "Dagster: Attach to dg dev",
    "type": "debugpy",
    "request": "attach",
    "connect": { "host": "localhost", "port": 5678 },
    "justMyCode": true
}
3. The debugpy Dependency
debugpy is declared as a dev dependency in pyproject.toml.
Debugging Workflow
With the Dagster UI
This is the primary workflow — use the UI to select and materialize specific assets while breakpoints are active:
- Run DAGSTER_DEBUG=1 uv run dg dev and wait for "debugpy listening on port 5678" in the terminal
- Press F5 with "Dagster: Attach to dg dev" selected
- Set breakpoints in asset files (e.g. src/honey_duck/defs/polars/assets.py)
- Open http://localhost:3000, navigate to an asset, and click Materialize
- VS Code pauses at the breakpoint with full variable inspection
Without the UI
For quick iteration without a browser, the launch configurations run dagster job execute directly in a single process:
{
    "name": "Dagster: polars_pipeline",
    "type": "debugpy",
    "request": "launch",
    "module": "dagster",
    "args": ["job", "execute", "-m", "honey_duck.defs.definitions", "-j", "polars_pipeline"],
    "python": "${workspaceFolder}/.venv/bin/python3",
    "justMyCode": true
},
Select the configuration from the debug dropdown and press F5. The job executes with debugpy attached — no subprocess complications.
When to use each approach
Use attach to dg dev when you want to pick specific assets to materialize from the UI. Use the direct launch configs when you want to run a full pipeline and don't need the UI.
Variable Inspection
While paused at a breakpoint, the Variables panel shows all local variables including Polars DataFrames and LazyFrames.
Data Wrangler
Right-click a DataFrame variable and select Open in Data Wrangler for a tabular view with filtering, sorting, and summary statistics.
Requirements
Data Wrangler requires the ms-toolsai.datawrangler and ms-toolsai.jupyter VS Code extensions, plus ipykernel in your Python environment.
Slow Attribute Warning
You may see a warning about slow attribute resolution when inspecting Polars DataFrames.
This is harmless: debugpy is resolving the .plot attribute, which triggers Altair lazy loading. Suppress it by setting PYDEVD_WARN_SLOW_RESOLVE_TIMEOUT=2 in your environment or launch config, which raises the warning threshold to 2 seconds.
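The variable can also be set from Python. This is a sketch only: it assumes the variable is read when debugpy is first imported, which is the same pattern the hook relies on for PYDEVD_DISABLE_FILE_VALIDATION. The helper name relax_slow_resolve_warning is hypothetical.

```python
import os


def relax_slow_resolve_warning(seconds: float = 2.0) -> str:
    """Set PYDEVD_WARN_SLOW_RESOLVE_TIMEOUT unless it is already set.

    Returns the value that ends up in the environment.
    """
    # setdefault keeps any value already exported in the shell or launch config.
    return os.environ.setdefault("PYDEVD_WARN_SLOW_RESOLVE_TIMEOUT", str(seconds))
```

Under that assumption, the call would go inside the DAGSTER_DEBUG block, before the import debugpy line.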
Debugging in Containers
Limited support
The DAGSTER_DEBUG=1 uv run dg dev + attach workflow does not work reliably in devcontainers. The debugpy subprocess auto-attach mechanism fails with "Client not authenticated" errors when VS Code is running via the Remote Containers extension.
For debugging inside containers, use the direct launch configurations (dagster job execute) which run everything in a single process.
Technical Details
Architecture
VS Code (debugpy client)
           │
       port 5678
           │
dg dev ──► code server (debugpy.listen)
  │
  ├── run process
  │   ├── step subprocess (auto-attach, 0.5s sync)
  │   ├── step subprocess (auto-attach, 0.5s sync)
  │   └── ...
  │
  └── webserver (port 3000)
Frozen Modules (Python 3.12+)
Python 3.12+ uses frozen modules for stdlib imports. debugpy warns this may cause missed breakpoints:
Debugger warning: It seems that frozen modules are being used,
which may make the debugger miss breakpoints.
We suppress this with PYDEVD_DISABLE_FILE_VALIDATION=1. User code breakpoints are not affected — the warning only applies to stdlib frozen modules.
In-Process Executor Does Not Help
Setting executor=dg.in_process_executor on Definitions prevents step-level subprocesses, but Dagster's DefaultRunLauncher still spawns a separate process for each run. The executor controls how steps within a run execute, not how runs are launched. The debugpy listener runs in the code server process, not the run process — so the in-process executor alone doesn't solve the debugging problem.
Port Conflicts
The debugpy listener binds to localhost:5678. If another process is using this port, dg dev starts normally but debugging is silently unavailable — the RuntimeError is caught. Check for port conflicts if the "debugpy listening" message doesn't appear on startup.
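A quick way to check for a conflict before starting dg dev is to probe the port from Python. This is a minimal sketch using only the standard library; the helper name port_in_use is hypothetical.

```python
import socket


def port_in_use(port: int = 5678, host: str = "localhost") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

If port_in_use() returns True before dg dev is running, another process already holds port 5678. Once dg dev is running with DAGSTER_DEBUG=1, True is the expected result.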
Programmatic Breakpoints
If UI breakpoints aren't hitting (e.g. the 0.5s sleep isn't enough on a slow machine), you can add a programmatic breakpoint directly in asset code:
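A minimal sketch, assuming debugpy.breakpoint() is the call in question. The function transform_sales is a hypothetical stand-in for an asset body, and the guarded import only keeps the snippet usable where debugpy is not installed.

```python
try:
    import debugpy
except ImportError:  # debugpy is a dev dependency and may be absent elsewhere
    debugpy = None


def transform_sales(rows: list[dict]) -> list[dict]:
    """Hypothetical stand-in for the body of an asset function."""
    if debugpy is not None:
        # Pauses here when a debug client is attached; does nothing otherwise.
        debugpy.breakpoint()
    return [row for row in rows if row.get("price", 0) > 0]
```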
This always works regardless of propagation timing. Remove it when you're done debugging.