Troubleshooting Guide - Common Issues & Solutions
Solutions to common problems when working with Dagster in honey-duck.
Table of Contents
- Asset Errors
- Dependency Issues
- Data Validation Errors
- IO Manager Issues
- Performance Issues
- UI/Server Issues
- Environment Issues
Asset Errors
Error: "DagsterInvalidDefinitionError: Cannot annotate context parameter"
Problem: Some Dagster decorators don't allow typed context parameters.
Solution:
import dagster as dg
import polars as pl

# For regular assets - use type annotation
@dg.asset
def my_asset(context: dg.AssetExecutionContext) -> pl.DataFrame:
    pass

# For dynamic assets - remove type annotation
@dg.asset(deps=["other_asset"])
def notification_asset(context) -> dict:  # ← No type annotation
    pass
Error: "Asset 'sales_transform' has no materializations"
Problem: You're trying to use an asset that hasn't been materialized yet.
Solution:
# Materialize the missing asset first
uv run dg launch --assets sales_transform
# Or materialize it together with its upstream dependencies
uv run dg launch --assets "+sales_output"
In the UI: click the "Materialize" button on the asset page.
Error: "No such asset: 'sales_data'"
# WRONG - Parameter name doesn't match asset name
@dg.asset
def my_output(context, sales_data: pl.DataFrame):  # Looking for 'sales_data'
    pass
Problem: Parameter name must exactly match the asset name.
Solution:
# CORRECT - Parameter matches asset name
@dg.asset
def my_output(context, sales_transform: pl.DataFrame):  # Matches 'sales_transform'
    pass
Check available assets by listing the project's definitions:
uv run dg list defs
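A mismatched parameter name can also be caught before launching a run. The sketch below is not part of honey-duck or Dagster: `unmatched_params` is a hypothetical helper, and `KNOWN_ASSETS` is a hand-written list you would replace with your project's real asset names. It compares a function's parameter names against that list and suggests close matches.

```python
import difflib
import inspect


def unmatched_params(func, known_assets, ignore=("context",)):
    """Map each parameter that matches no known asset to its closest asset names."""
    problems = {}
    for name in inspect.signature(func).parameters:
        if name in ignore or name in known_assets:
            continue
        # A low cutoff keeps loosely similar names in the suggestions.
        problems[name] = difflib.get_close_matches(name, known_assets, n=3, cutoff=0.4)
    return problems


# Hypothetical asset names; replace with the ones defined in your project.
KNOWN_ASSETS = ["sales_transform", "artworks_transform", "sales_output"]


def my_output(context, sales_data):  # 'sales_data' is not a known asset
    pass


def fixed_output(context, sales_transform):  # matches a known asset
    pass


print(unmatched_params(my_output, KNOWN_ASSETS))
print(unmatched_params(fixed_output, KNOWN_ASSETS))
```

An empty dictionary means every non-context parameter matches a name in the list.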
Dependency Issues
Error: "Circular dependency detected"
Problem: Assets depend on each other in a loop, for example when A depends on B and B depends on A.
Solution:
- Find the cycle:
- Break the cycle by refactoring:
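To find the cycle, it can help to write the dependencies down as a plain mapping and walk it with a depth-first search. This is a minimal sketch that works on an ordinary dictionary rather than on Dagster objects; `find_cycle` and the asset names in the example are hypothetical.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None if there is none.

    deps maps each asset name to the names it depends on.
    """
    IN_PROGRESS, DONE = 1, 2
    state = {}
    path = []

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for upstream in deps.get(node, ()):
            if state.get(upstream) == IN_PROGRESS:
                # Reached a node that is still on the current path: a cycle.
                return path[path.index(upstream):] + [upstream]
            if upstream not in state:
                cycle = visit(upstream)
                if cycle:
                    return cycle
        path.pop()
        state[node] = DONE
        return None

    for node in deps:
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical dependency map with a two-asset loop.
example = {
    "sales_transform": ["sales_summary"],
    "sales_summary": ["sales_transform"],
}
print(find_cycle(example))
```

The returned list starts and ends with the same name, so it reads as the loop itself. Breaking the cycle usually means moving the shared logic into a third asset that both of the others depend on.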
Error: "ImportError: cannot import name 'my_asset'"
Problem: Missing import or circular import.
Solution:
- Check the import:
- Avoid circular imports:
- Verify file exists:
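The import check can be done from a Python shell. The helper below is a hypothetical sketch, not something honey-duck ships: it reports whether the module fails to import at all (a missing file and a circular import both surface here) or whether it imports but does not define the name.

```python
import importlib


def check_import(module_name, attr_name):
    """Describe whether attr_name can be imported from module_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Covers a missing module as well as failures raised while it loads.
        return f"cannot import module {module_name!r}: {exc}"
    if not hasattr(module, attr_name):
        public = [name for name in dir(module) if not name.startswith("_")]
        return f"{module_name!r} has no attribute {attr_name!r}; it defines: {public[:10]}"
    return "ok"


# Standard-library modules stand in for your own modules here.
print(check_import("json", "loads"))
print(check_import("json", "my_asset"))
print(check_import("no_such_module_xyz", "my_asset"))
```

Run it in the project environment (for example with uv run python) so that it sees the same installed packages as Dagster does.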
Data Validation Errors
Error: "MissingTableError: Table 'sales_raw' not found in schema 'raw'"
Problem: Harvest data hasn't been loaded or table name is wrong.
Solution:
- Check if harvest data exists:
- Materialize harvest assets first:
- Check table names:
Error message shows available tables:
MissingTableError: Table 'sales_raw' not found.
Available tables: ['artworks_raw', 'artists_raw', 'media']
Error: "MissingColumnError: Column 'sale_price' not found in table 'sales_raw'"
Problem: Column name is wrong or doesn't exist in data.
Solution:
- Check actual columns:
- Fix column name:
- Use helper validation:
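The idea behind helper validation can be sketched without any project code. `require_columns` below is a hypothetical stand-in, not honey-duck's actual helper, and the column `sale_price_usd` is invented for the example. It checks the required names against the available ones and suggests the closest match for each missing column.

```python
import difflib


def require_columns(available, required, table="<table>"):
    """Raise KeyError naming any missing columns, with close-match suggestions."""
    missing = [col for col in required if col not in available]
    if not missing:
        return
    parts = []
    for col in missing:
        hint = difflib.get_close_matches(col, available, n=1)
        parts.append(f"{col!r} (did you mean {hint[0]!r}?)" if hint else repr(col))
    raise KeyError(
        f"{table}: missing columns {', '.join(parts)}; available: {sorted(available)}"
    )


# Hypothetical column list for the 'sales_raw' table.
columns = ["sale_id", "sale_price_usd", "sale_date"]
require_columns(columns, ["sale_id"], table="sales_raw")  # passes silently

try:
    require_columns(columns, ["sale_price"], table="sales_raw")
except KeyError as exc:
    print(exc.args[0])
```

With Polars, the available names would typically come from `df.columns` on a DataFrame, or from `lf.collect_schema().names()` on a LazyFrame in recent Polars versions.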
Error: "ColumnNotFoundError" (Polars)
Problem: Typo in column name or column was dropped in earlier operation.
Solution:
- Add debug logging:
- Check for typos:
IO Manager Issues
Error: "Failed to pickle DataFrame"
Problem: Trying to use wrong IO manager for data type.
Solution:
Ensure asset returns Polars DataFrame for PolarsParquetIOManager:
@dg.asset
def my_asset(context) -> pl.DataFrame:  # ← Polars, not Pandas
    # If using DuckDB
    result = conn.sql("SELECT * FROM table").pl()  # ← Use .pl() not .df()
    return result
Error: "FileNotFoundError: No such file or directory: 'data/output/storage/...'"
Problem: IO manager trying to load asset that doesn't exist.
Solution:
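- Materialize upstream asset: uv run dg launch --assets <upstream_asset>
- Or materialize with dependencies: uv run dg launch --assets "+<asset_name>"
- Check file system: confirm that the file named in the error exists under the storage directory.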
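For the file-system check, a short script can show what the IO manager has actually written. This is a sketch under the assumption that materialized assets are stored as files under one directory; `list_stored_files` is a hypothetical helper, and the directory is whatever prefix your error message shows.

```python
from pathlib import Path


def list_stored_files(storage_dir):
    """Return (relative_path, size_in_bytes) for every file under storage_dir."""
    root = Path(storage_dir)
    if not root.is_dir():
        return []  # Nothing has been materialized here yet.
    return sorted(
        (str(path.relative_to(root)), path.stat().st_size)
        for path in root.rglob("*")
        if path.is_file()
    )


# Directory prefix taken from the error message above.
for name, size in list_stored_files("data/output/storage"):
    print(f"{size:>12,d}  {name}")
```

An empty result means the directory is missing or empty, which points back to materializing the upstream asset.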
Performance Issues
Issue: "Asset taking too long to materialize"
Symptoms: Asset runs for minutes/hours.
Solutions:
- Use lazy evaluation (Polars):
    # SLOW - Reads the whole file eagerly, then filters in memory
    df = pl.read_parquet(path)  # ← Loads all
    df = df.filter(pl.col("price") > 1000)  # Works on full dataset
    # FAST - Push down filters
    df = pl.scan_parquet(path)  # ← Lazy
    df = df.filter(pl.col("price") > 1000)  # Filter planned
    df = df.collect()  # ← Only loads filtered data
- Profile your code:
- Check data size:
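For a first pass at profiling, timing each step of an asset is often enough to find the slow one. The context manager below is a hypothetical sketch that uses only the standard library. Its `log` argument defaults to `print`; inside an asset you could pass `context.log.info` instead.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, log=print):
    """Report how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        log(f"{label}: {time.perf_counter() - start:.3f}s")


# Stand-in workload; wrap each step of your asset the same way.
with timed("sum of squares"):
    total = sum(i * i for i in range(100_000))
```

Wrapping the read, the transform, and the write separately usually shows whether the time goes into I/O or into computation.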
Issue: "Out of memory errors"
Solutions:
- Stream data with lazy operations:
- Process in batches:
- Use DuckDB for large datasets:
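Batch processing boils down to handling a bounded number of records at a time, so that only one batch is in memory at once. This is a generic sketch rather than honey-duck code: `batched` is a hypothetical helper, and the integer range stands in for rows read from a file or query.

```python
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while batch := list(islice(iterator, size)):
        yield batch


# Aggregate batch by batch instead of materializing every row at once.
running_total = 0
for batch in batched(range(10), 4):
    running_total += sum(batch)
print(running_total)
```

The same pattern applies to file paths or date ranges: process each batch, write or aggregate its result, then move on to the next.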
UI/Server Issues
Error: "Failed to connect to Dagster"
Problem: Dagster server not running or wrong port.
Solution:
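- Start the server: uv run dg dev
- Check the port: the UI is served at http://localhost:3000 by default.
- Check process: confirm that dg dev is still running in its terminal.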
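Whether anything is listening on the port can be checked from Python. This sketch assumes the server runs locally on port 3000, Dagster's default; pass a different `port` if you started it on another one. `port_is_open` is a hypothetical helper.

```python
import socket


def port_is_open(host="127.0.0.1", port=3000, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


print("Dagster port reachable:", port_is_open())
```

A `True` result only shows that some process is listening on the port, not that the process is Dagster.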
Error: "Code location failed to load"
Problem: Python syntax error or import error in definitions.
Solution:
- Check error in UI: Look at code location error message.
- Test import directly: uv run python -c "from honey_duck.defs.definitions import defs"
- Common causes:
Issue: "UI is slow or unresponsive"
Solutions:
- Hard-reload the page to bypass the browser cache: Ctrl+Shift+R (Chrome/Firefox)
- Restart Dagster: stop the server with Ctrl+C, then run uv run dg dev again.
- Check for too many runs:
Environment Issues
Error: "ModuleNotFoundError: No module named 'polars'"
Problem: Dependencies not installed.
Solution:
# Install dependencies
uv sync
# Verify installation
uv run python -c "import polars; print(polars.__version__)"
Error: "Permission denied" when writing files
Problem: No write permissions to output directory.
Solution:
- Check permissions:
- Fix permissions:
- Or create directory:
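The permission check and the directory creation can be combined in one helper. `ensure_writable_dir` is a hypothetical sketch, not part of honey-duck. The demo uses a temporary directory so it is safe to run anywhere; in practice you would pass your output directory.

```python
import os
import tempfile
from pathlib import Path


def ensure_writable_dir(path):
    """Create path (and parents) if needed and confirm it is writable."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    # Writing into a directory needs both write and search permission.
    if not os.access(directory, os.W_OK | os.X_OK):
        raise PermissionError(f"no write access to {directory}")
    return directory


with tempfile.TemporaryDirectory() as tmp:
    out = ensure_writable_dir(Path(tmp) / "data" / "output")
    print(out.is_dir())
```

If `mkdir` itself raises `PermissionError`, the problem is in a parent directory, and that is where the permissions need fixing.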
Error: "Database is locked" (DuckDB)
Problem: Multiple processes trying to write to DuckDB at once.
Solution:
In honey-duck, output assets have deps=["other_output"] to enforce ordering:
@dg.asset(
    deps=["artworks_output"],  # ← Wait for artworks_output first
)
def sales_output(context, sales_transform: pl.DataFrame):
    # Won't run until artworks_output completes
    pass
Why: DuckDB only allows one writer at a time.
Getting More Help
Enable Debug Logging
# Set log level
export DAGSTER_CLI_LOG_LEVEL=DEBUG
uv run dg dev
# Or in code
import logging

@dg.asset
def my_asset(context):
    context.log.setLevel(logging.DEBUG)
    context.log.debug("Detailed debug info")
Check Dagster Logs
# View recent run logs
uv run dagster run logs <run_id>
# Server logs when running dg dev
# → Check terminal output
Validate Your Code
# Check Python syntax
uv run python -m py_compile src/honey_duck/defs/polars/my_assets.py
# Check imports
uv run python -c "from honey_duck.defs.definitions import defs"
# Run tests
uv run pytest -xvs
Still Stuck?
- Check Dagster Docs: https://docs.dagster.io
- Check honey-duck docs: Getting Started
- View example implementations: src/honey_duck/defs/polars/assets.py
- Read error messages carefully: They often include helpful suggestions!
Common Error Message Patterns
Pattern 1: "Expected X, got Y"
→ Type mismatch: Check return type annotations
Pattern 2: "Asset 'X' not found"
→ Typo or missing import: Check asset name spelling
Pattern 3: "Table/Column 'X' not found"
→ Data validation error: Use helper functions for auto-validation
Pattern 4: "Failed to load"
→ Missing materialization: Materialize upstream assets first
Pattern 5: "Circular dependency"
→ Design issue: Refactor to break the cycle
Pro Tip: Most errors have helpful messages. Read them carefully -- they often tell you exactly what's wrong.