CLI Reference - Quick Guide¶
Essential CLI commands for honey-duck development.
CLI Overview¶
honey-duck uses two CLIs:
| CLI | Purpose | Commands |
|---|---|---|
| dg | Modern CLI for project management | dev, launch, list defs, check |
| dagster | Full operational CLI | runs, schedules, sensors, asset management |
Starting Dagster¶
Development Server¶
# Start Dagster UI (recommended for development)
uv run dg dev
# Start on specific port
uv run dg dev -p 3001
# Validate definitions before starting
uv run dg check defs
Opens: http://localhost:3000
What it does:
- Starts web UI for visualization
- Auto-reloads when code changes
- Shows real-time logs
- Enables interactive materialization
Production Server¶
# Start Dagster daemon (for schedules/sensors)
uv run dagster-daemon run
# Start webserver separately
uv run dagster-webserver -p 3000
Asset Operations¶
Materialization¶
# Materialize single asset
uv run dg launch --assets sales_transform
# Materialize multiple assets
uv run dg launch --assets sales_transform,artworks_transform
# List all definitions (assets, jobs, etc.)
uv run dg list defs
Asset Wipe (Delete Materializations)¶
Requires dagster CLI:
# Wipe specific asset
uv run dagster asset wipe -a sales_transform
# Wipe multiple assets
uv run dagster asset wipe -a sales_transform -a artworks_transform
Warning: This deletes materialization history and cached data!
Job Execution¶
Running Jobs¶
# Execute complete job
uv run dg launch --job polars_pipeline
# Execute with config
uv run dg launch --job polars_pipeline --config config.yaml
# List all jobs (and other definitions)
uv run dg list defs
Available Jobs in honey-duck¶
# Original implementation (processor classes)
uv run dg launch --job processors_pipeline
# Polars implementation (split assets)
uv run dg launch --job polars_pipeline
# Polars ops implementation (graph-backed)
uv run dg launch --job polars_ops_pipeline
# DuckDB SQL implementation
uv run dg launch --job duckdb_pipeline
# DuckDB + Soda validation implementation
uv run dg launch --job duckdb_soda_pipeline
# Polars FilesystemIOManager implementation
uv run dg launch --job polars_fs_pipeline
# Polars multi-asset implementation
uv run dg launch --job polars_multi_pipeline
Code Locations¶
Reloading Code¶
# Reload code location (when dev server running)
# → Just save your file, auto-reload happens
# Or manually reload
uv run dagster code-location reload
Code Location Info¶
# List code locations
uv run dagster code-location list
# Show code location details
uv run dagster code-location info
Schedules & Sensors¶
# List all schedules
uv run dagster schedule list
# Start a schedule
uv run dagster schedule start daily_pipeline
# Stop a schedule
uv run dagster schedule stop daily_pipeline
# List sensors
uv run dagster sensor list
# Start sensor
uv run dagster sensor start file_sensor
# Stop sensor
uv run dagster sensor stop file_sensor
Run Operations¶
Listing Runs¶
# List recent runs
uv run dagster run list
# List runs for specific job
uv run dagster run list -j polars_pipeline
# Show run details
uv run dagster run show <run_id>
Run Management¶
# Delete specific run
uv run dagster run delete <run_id>
# Delete all runs for a job
uv run dagster run delete --all -j polars_pipeline
# Cancel running job
uv run dagster run cancel <run_id>
Debugging & Inspection¶
Logs¶
# View logs for a run
uv run dagster run logs <run_id>
# Follow logs in real-time
uv run dagster run logs <run_id> --follow
Asset Checks¶
# Execute asset checks
uv run dagster asset check -a expensive_artworks
# Execute all checks for asset
uv run dagster asset check --select expensive_artworks+
Advanced Selection Syntax¶
Selection Patterns¶
# Select by prefix/suffix
uv run dagster asset materialize --select "sales_*" # All starting with sales_
uv run dagster asset materialize --select "*_output" # All ending with _output
# Select by group
uv run dagster asset materialize --select "group:transform_polars"
# Select by tag
uv run dagster asset materialize --select "tag:critical"
# Combine selections
uv run dagster asset materialize --select "+sales_output,+artworks_output"
Graph Traversal¶
# A leading + includes upstream layers (one layer per +)
--select +asset_name     # Asset + direct upstream dependencies
# A trailing + includes downstream layers (one layer per +)
--select asset_name+     # Asset + direct downstream dependents
# Both
--select +asset_name+    # Asset + one layer in each direction
# Repeat + for multiple hops
--select asset_name++    # Asset + 2 levels downstream
--select ++asset_name    # Asset + 2 levels upstream
# * selects the full lineage in a direction
--select *asset_name     # Asset + all upstream dependencies
--select asset_name*     # Asset + all downstream dependents
Environment Variables¶
# Set Dagster home directory
export DAGSTER_HOME=/path/to/dagster_home
# Set log level
export DAGSTER_CLI_LOG_LEVEL=DEBUG
# Disable colored output
export NO_COLOR=1
# Set database path (honey-duck specific)
export HONEY_DUCK_DB_PATH=/path/to/dagster.duckdb
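Scripts that consume HONEY_DUCK_DB_PATH can fall back to a local default when the variable is unset. A minimal sketch (the fallback path below is an assumption, not honey-duck's documented default):

```shell
# Read HONEY_DUCK_DB_PATH, falling back to a local path when unset
# (fallback path is an illustrative assumption).
DB_PATH="${HONEY_DUCK_DB_PATH:-data/dagster.duckdb}"
echo "Using database at: $DB_PATH"
```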
Configuration Files¶
Project Configuration¶
# Validate dagster.yaml
uv run dagster instance info
# Show instance config
cat $DAGSTER_HOME/dagster.yaml
Run Configuration¶
Create a config.yaml containing run config for your job, then pass it when launching:
# Launch the job with the config file
uv run dg launch --job polars_pipeline --config config.yaml
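A run-config file for --config might look like the following sketch. The op name and config fields here are hypothetical; substitute the actual ops and config schema of the job you are launching:

```yaml
# Hypothetical run config; op names and fields depend on your job.
ops:
  sales_transform:
    config:
      output_path: data/output/json/sales.json
```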
Testing¶
# Run all tests
uv run pytest
# Run specific test file
uv run pytest tests/test_integration.py
# Run with verbose output
uv run pytest -xvs
# Run tests matching pattern
uv run pytest -k "test_sales"
# Run tests with coverage
uv run pytest --cov=honey_duck --cov-report=html
Utility Commands¶
Instance Management¶
# Show instance info
uv run dagster instance info
# Migrate instance schema
uv run dagster instance migrate
# Concurrency limits
uv run dagster instance concurrency info
Health Check¶
# Check if Dagster is running
curl http://localhost:3000/server_info
# Check code location health
uv run dagster code-location list
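The curl probe above can be wrapped so a script reports status without aborting when the server is down. A sketch (adjust host/port to match your dg dev settings):

```shell
# Probe the Dagster webserver; never abort the script on failure.
if curl --fail --silent --max-time 2 "http://localhost:3000/server_info" >/dev/null 2>&1; then
  dagster_status="up"
else
  dagster_status="down"
fi
echo "Dagster webserver is $dagster_status"
```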
Common Workflows¶
Development Workflow¶
# 1. Start dev server
uv run dg dev
# 2. Make code changes
# ... edit files ...
# 3. Auto-reload happens
# → Check browser for reload notification
# 4. Test in UI
# → Click asset → Materialize
# 5. Run tests
uv run pytest
CI/CD Workflow¶
# 1. Install dependencies
uv sync
# 2. Run tests
uv run pytest
# 3. Execute pipeline
uv run dg launch --job polars_pipeline
# 4. Check exit code
echo $? # 0 = success, non-zero = failure
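Rather than inspecting $? after the fact, a CI script can branch on the exit code directly. A sketch, with `true` standing in for the `uv run dg launch` step:

```shell
# Branch on the command's exit status directly; `true` is a stand-in
# for: uv run dg launch --job polars_pipeline
if true; then
  result="success"
else
  result="failure"
fi
echo "Pipeline: $result"
```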
Debugging Workflow¶
# 1. Check recent runs (requires dagster CLI)
uv run dagster run list
# 2. View logs for failed run
uv run dagster run logs <run_id>
# 3. Materialize with verbose logging
uv run dg launch --assets my_asset --verbose
# 4. Check asset data
ls -la data/output/storage/my_asset/
Keyboard Shortcuts (UI)¶
When Dagster UI is open:
- ? - Show help
- g then a - Go to Assets
- g then j - Go to Jobs
- g then r - Go to Runs
- / - Search
Tips & Tricks¶
1. Quick Asset Test¶
# Materialize single asset to test changes
uv run dg launch --assets my_new_asset
# Check the output file
cat data/output/json/my_output.json
2. Pipeline Comparison¶
# Run different implementations side-by-side
uv run dg launch --job polars_pipeline &
uv run dg launch --job duckdb_pipeline &
wait
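Note that a bare `wait` does not surface either pipeline's failure. Waiting on explicit PIDs captures each exit code separately; a sketch with `sleep 0` standing in for the two `dg launch` commands:

```shell
# Capture each background job's exit code; `sleep 0` is a stand-in
# for the two `uv run dg launch --job ...` commands.
sleep 0 & pid_a=$!
sleep 0 & pid_b=$!
wait "$pid_a"; rc_a=$?
wait "$pid_b"; rc_b=$?
echo "exit codes: $rc_a $rc_b"
```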
3. Validate Before Launch¶
# Catch import and definition errors before launching
uv run dg check defs
Troubleshooting Commands¶
# Clear Dagster cache
rm -rf $DAGSTER_HOME/storage/*
# Reset instance
uv run dagster instance info
uv run dagster instance migrate
# Check Python environment
uv run python --version
uv run which python
# Validate code loads
uv run python -c "from honey_duck.defs.definitions import defs; print(defs)"
# Check for import errors
uv run dg check defs
Quick Reference Card¶
# MOST COMMON COMMANDS
# Start UI
uv run dg dev
# Materialize asset
uv run dg launch --assets ASSET_NAME
# Run complete pipeline
uv run dg launch --job JOB_NAME
# List definitions
uv run dg list defs
# Run tests
uv run pytest
# View logs (requires dagster CLI)
uv run dagster run logs RUN_ID
Pro Tip: Bookmark http://localhost:3000 and use the UI for 90% of operations. The CLI is great for automation and CI/CD!