What You'll Build
An AI-powered data pipeline development system that uses Continue's AI agent with the dlt MCP to inspect pipeline execution, retrieve schemas, analyze datasets, and debug load errors, all through simple natural language prompts.
Prerequisites
Before starting, ensure you have:
- Continue account with Hub access
- Read: Understanding Agents — How to get started with Hub agents
- Python 3.8+ installed locally
- A dlt pipeline project (or create one during this guide)
- Basic understanding of data pipelines
1. Install Continue CLI
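A typical install, assuming Node.js is available (check Continue's docs if the package name has changed):

```bash
npm i -g @continuedev/cli
```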
2. Install dlt
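dlt installs from PyPI:

```bash
pip install dlt
```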
To use agents in headless mode, you need a Continue API key.
dlt MCP Workflow Options
🚀 Fastest Path to Success
Skip the manual setup and use our pre-built dlt Assistant agent, which includes the dlt MCP and optimized data pipeline workflows for more consistent results. You can remix this agent to customize it for your specific needs.
- ⚡ Quick Start (Recommended)
- 🛠️ Manual Setup
1. Load the Pre-Built Agent
Navigate to your pipeline project directory and run the command sketched below. This agent includes:
- dlt MCP pre-configured and ready to use
- Pipeline-focused rules for data engineering best practices
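A minimal sketch, using the `dlthub/dlt-assistant` slug referenced later in this guide:

```bash
cn --config dlthub/dlt-assistant
```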
2. Run Your First Pipeline Inspection
Start with a comprehensive pipeline check, such as the prompt sketched below. That's it! The agent handles everything automatically.
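A hypothetical first prompt, typed in the TUI session (or passed with `-p` in headless mode):

```
Inspect my dlt pipeline: summarize the last load, the current schema, and any load errors
```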
Why Use the Agent? The pre-built dlt Assistant agent provides consistent pipeline development workflows and handles MCP configuration automatically, making it easier to get started with AI-powered data engineering. You can remix and customize this agent later to fit your team’s specific workflow.
Agent Requirements
To use the pre-built dlt Assistant agent, you need either:
- Continue CLI Pro Plan with the models add-on, OR
- Your own API keys added to Continue Hub secrets (same as manual setup)
dlt MCP vs dlt+ MCP
Understanding the Difference
dlt MCP is focused on local pipeline development and inspection. It provides tools to:
- Inspect pipeline execution and load information
- Retrieve schema metadata from your local pipelines
- Query dataset records from destination databases
- Analyze load errors, timings, and file sizes
dlt+ MCP is geared toward team and production use. It provides tools to:
- Connect to dlt+ Projects and manage deployments
- Monitor pipeline runs across multiple environments
- Access centralized logging and observability
- Collaborate with team members on pipeline development
Pipeline Development Recipes
Now you can use natural language prompts to develop and debug your dlt pipelines. The Continue agent automatically calls the appropriate dlt MCP tools.
You can add prompts to your agent's configuration for easy access in future sessions. Go to your agent in the Continue Hub, click Edit, and add prompts under the Prompts section.
Where to run these workflows:
- IDE Extensions: Use Continue in VS Code, JetBrains, or other supported IDEs
- Terminal (TUI mode): Run `cn` to enter interactive mode, then type your prompts
- CLI (headless mode): Use `cn -p "your prompt"` for headless commands
About the `--auto` flag: The `--auto` flag enables tools to run continuously without manual confirmation. This is essential for headless mode, where the agent needs to execute multiple tools automatically to complete tasks like pipeline inspection, schema retrieval, and error analysis.
Pipeline Inspection
Inspect Pipeline Execution
Review pipeline execution details, including load timing and file sizes.
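Example prompts, assuming a hypothetical pipeline named `my_pipeline` (swap in your own):

TUI Mode Prompt:
```
Inspect the last run of my_pipeline and summarize load timings and file sizes
```

Headless Mode Prompt:
```bash
cn -p "Inspect the last run of my_pipeline and summarize load timings and file sizes" --auto
```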
Schema Management
Retrieve Schema Metadata
Get detailed schema information for your pipeline's tables.
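A sketch with the same hypothetical `my_pipeline` name:

TUI Mode Prompt:
```
Retrieve the schema for my_pipeline and list each table with its columns and data types
```

Headless Mode Prompt:
```bash
cn -p "Retrieve the schema for my_pipeline and list each table with its columns and data types" --auto
```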
Data Exploration
Query Dataset Records
Retrieve and analyze records from your destination database.
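A sketch assuming a hypothetical `customers` table in your destination dataset:

TUI Mode Prompt:
```
Query the first 20 rows of the customers table from my_pipeline's dataset and describe any anomalies
```

Headless Mode Prompt:
```bash
cn -p "Query the first 20 rows of the customers table from my_pipeline's dataset and describe any anomalies" --auto
```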
Error Debugging
Analyze Load Errors
Investigate and understand pipeline load errors.
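For example:

TUI Mode Prompt:
```
The last load of my_pipeline failed. Analyze the load errors and suggest concrete fixes
```

Headless Mode Prompt:
```bash
cn -p "The last load of my_pipeline failed. Analyze the load errors and suggest concrete fixes" --auto
```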
Pipeline Creation
Build New Pipeline
Create a new dlt pipeline from an API or data source.
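A sketch using a hypothetical source (the GitHub API) and destination (DuckDB):

TUI Mode Prompt:
```
Create a dlt pipeline that loads repository issues from the GitHub API into DuckDB
```

Headless Mode Prompt:
```bash
cn -p "Create a dlt pipeline that loads repository issues from the GitHub API into DuckDB" --auto
```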
Schema Evolution
Handle Schema Changes
Review and manage schema evolution in your pipelines.
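For example:

TUI Mode Prompt:
```
Compare my_pipeline's current schema to the previous schema version and explain what changed and why
```

Headless Mode Prompt:
```bash
cn -p "Compare my_pipeline's current schema to the previous schema version and explain what changed and why" --auto
```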
Continuous Data Pipelines with GitHub Actions
This example demonstrates a Continuous AI workflow where data pipeline validation runs automatically in your CI/CD pipeline in headless mode using the dlt Assistant agent. Consider remixing this agent to add your organization's specific validation rules.
Add GitHub Secrets
Navigate to Repository Settings → Secrets and variables → Actions and add:
- `CONTINUE_API_KEY`: Your Continue API key from hub.continue.dev/settings/api-keys
- Any required database credentials for your destination
The workflow uses the pre-built dlt Assistant agent with `--config dlthub/dlt-assistant`. This agent comes pre-configured with the dlt MCP and optimized rules for pipeline operations. You can remix this agent to customize the validation rules and prompts for your specific pipeline requirements.
Create Workflow File
This workflow automatically validates your dlt data pipelines on pull requests using the Continue CLI in headless mode. It inspects pipeline schemas, checks for errors, and posts a summary report as a PR comment. The workflow can also be triggered manually via `workflow_dispatch`.
Create `.github/workflows/dlt-pipeline-validation.yml` in your repository:
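A minimal sketch of such a workflow. The prompt text and the `report.md` handoff are assumptions to adapt to your repo; the Continue CLI flags (`--config`, `-p`, `--auto`) are the ones used throughout this guide:

```yaml
name: dlt Pipeline Validation

on:
  pull_request:
  workflow_dispatch:

permissions:
  contents: read
  pull-requests: write

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install Continue CLI and dlt
        run: |
          npm i -g @continuedev/cli
          pip install dlt

      - name: Validate pipelines with the dlt Assistant agent
        env:
          CONTINUE_API_KEY: ${{ secrets.CONTINUE_API_KEY }}
        run: |
          cn --config dlthub/dlt-assistant -p "Inspect the dlt pipelines in this repository: check schemas, look for load errors, and write a short validation summary to report.md" --auto

      - name: Post summary as PR comment
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            // Fall back gracefully if the agent did not produce a report
            const body = fs.existsSync('report.md')
              ? fs.readFileSync('report.md', 'utf8')
              : 'Pipeline validation ran, but no report.md was produced.';
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body,
            });
```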
The dlt MCP works with your local pipeline state. Make sure your CI environment has access to the necessary pipeline configuration and credentials.
Pipeline Development Best Practices
Implement automated pipeline quality checks using Continue's rule system. See the Rules deep dive for authoring tips. Useful rule areas include:
- Schema Validation
- Error Handling
- Performance Monitoring
- Data Quality
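A minimal sketch of a schema-validation rule, assuming Continue's markdown rule format and a hypothetical `.continue/rules/schema-validation.md` path:

```md
---
name: dlt Schema Validation
---

When changing dlt pipeline code:
- Declare explicit data types for business-critical columns instead of relying on inference
- Call out any schema contract looser than "freeze" on production tables
- Run a schema inspection through the dlt MCP and summarize changes before merging
```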
Troubleshooting
Common issues include:
- Pipeline Not Found
- Destination Connection Issues
- Schema Inference Problems
Verification Steps:
- dlt MCP is installed via Continue Hub
- Pipeline directory is accessible
- Destination database credentials are configured
- Pipeline has been run at least once
What You’ve Built
After completing this guide, you have a complete AI-powered data pipeline development system that:
- ✅ Uses natural language: simple prompts instead of complex pipeline commands
- ✅ Debugs automatically: AI analyzes errors and suggests fixes
- ✅ Runs continuously: automated validation in CI/CD pipelines
- ✅ Ensures quality: pipeline checks prevent bad data from shipping
Continuous AI
Your data pipeline workflow now operates at Level 2 Continuous AI: AI handles routine pipeline inspection and debugging, with human oversight through review and approval of changes.
Next Steps
- Inspect your first pipeline - Try the pipeline inspection prompt on your current project
- Debug load errors - Use the error analysis prompt to fix any issues
- Set up CI pipeline - Add the GitHub Actions workflow to your repo
- Create new pipelines - Use AI to scaffold new data sources
- Monitor performance - Track pipeline execution metrics over time