Crawl
Pre-migration intelligence for enterprise data infrastructure.
Catalog tools inventory your data. Crawl tells you what breaks when you migrate.
Extract business logic from stored procedures, ETL jobs, and warehouse views: the undocumented rules buried in your data stack that block every migration project. Crawl is open-source, vendor-neutral, and local-first, so the LLM can run inside your own environment.
What Crawl Does
Input: a 200-line stored procedure that nobody on the team wrote.
CLI Commands
The Problem
Every cloud migration hits the same wall: thousands of stored procedures and ETL jobs encoding business rules in vendor-specific dialects that nobody documented. Migration tools can translate your SQL, but they can't tell you what it means — or whether it's even still relevant.
Crawl is Step 0: the pre-migration intelligence layer that runs before you use Datafold, Lakebridge, dbt, or SnowConvert.
Questions Crawl Answers
What do we have? — Inventory with auto-generated business-rule summaries
What does it do? — Human-readable logic, not just column lineage
Is it still alive? — Dead code detection, contradiction flagging
What should we migrate first? — Triage by criticality, complexity, risk
What breaks if we move? — Vendor-specific logic that won't survive a platform change
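To make the last question concrete, here is a minimal sketch of one way to flag constructs that are not standard SQL and so may not survive a platform change. It is an illustration only, not Crawl's method: the pattern table `NONSTANDARD_PATTERNS` and the function `flag_nonstandard` are hypothetical names, the list of constructs is deliberately tiny, and a plain regex scan does not skip comments or string literals.

```python
import re

# Hypothetical, deliberately small table of non-standard SQL constructs.
# A real inventory would be far larger and dialect-aware.
NONSTANDARD_PATTERNS = {
    "ROWNUM pseudo-column (Oracle)": re.compile(r"\bROWNUM\b", re.IGNORECASE),
    "@@ROWCOUNT (T-SQL)": re.compile(r"@@ROWCOUNT\b", re.IGNORECASE),
    "ILIKE operator": re.compile(r"\bILIKE\b", re.IGNORECASE),
    ":: cast shorthand": re.compile(r"::\s*[A-Za-z_]"),
}


def flag_nonstandard(sql: str) -> list[tuple[int, str]]:
    """Return (line_number, label) for each pattern found on each line."""
    findings = []
    for lineno, line in enumerate(sql.splitlines(), start=1):
        for label, pattern in NONSTANDARD_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


if __name__ == "__main__":
    sample = (
        "SELECT * FROM orders\n"
        "WHERE status ILIKE 'open%'\n"
        "  AND created_at > now() - '30 days'::interval"
    )
    for lineno, label in flag_nonstandard(sample):
        print(f"line {lineno}: {label}")
```

A scan like this only says where to look. Deciding whether the flagged logic still matters is the part that needs a human-readable summary of what the procedure does.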
How It Works
Design Principles
Step 0, not Step 1. Crawl doesn't migrate your code — it tells you what you have so migration tools can do their job.
Vendor-neutral. Works with any source database, any target platform. No lock-in.
Local-first LLM. Enterprise code never needs to leave your environment. Supports Ollama and vLLM out of the box.
Open-source (Apache 2.0). Your understanding of your data belongs to you, not a vendor.
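As a rough illustration of the local-first principle, the sketch below sends a stored procedure to an Ollama server running on the same machine, so the source code never leaves the host. It uses Ollama's `/api/generate` endpoint on its default port. The function names, the prompt wording, and the model name `llama3` are assumptions made for this example, not Crawl's actual interface.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing is sent off the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_summary_request(procedure_sql: str, model: str = "llama3") -> urllib.request.Request:
    """Build (but do not send) a request asking a local model to summarize a procedure."""
    prompt = (
        "Summarize the business rules in this stored procedure "
        "in plain English:\n\n" + procedure_sql
    )
    # stream=False asks Ollama for a single JSON object instead of a token stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(procedure_sql: str, model: str = "llama3") -> str:
    """Send the request to the local server and return the generated text."""
    request = build_summary_request(procedure_sql, model)
    with urllib.request.urlopen(request, timeout=120) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

Calling `summarize(...)` requires a running Ollama server with the chosen model already pulled. `build_summary_request` can be inspected without any server at all.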
Enterprise Safety
Crawl is designed to connect to enterprise databases safely.
Read-only, always. No writes, no DDL, no DML. Read-only transaction mode enforced.
Catalog-only access. Reads stored procedure source code from system catalogs. Never queries user table contents.
Non-production recommended. Stored procedure source code is normally the same in staging as in production, so there is rarely a reason to connect to prod.
No hammering. Single connection, rate-limited, batched queries, configurable timeouts.
Query allowlisting. Every SQL query is hardcoded and auditable. No dynamic SQL.
Full audit trail. Every query logged for DBA review.
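The allowlisting, catalog-only, and audit-trail points above can be sketched together in a few lines. This is a minimal sketch under stated assumptions, not Crawl's implementation: `ALLOWED_QUERIES`, `run_allowlisted`, and the logger name are hypothetical, the single catalog query targets PostgreSQL 11 or later, and `cursor` stands for any DB-API cursor (for example one from psycopg2). Read-only transaction mode would be set on the connection separately and is not shown here.

```python
import logging
import time

# Hypothetical audit logger; a deployment would route this to a file for DBA review.
audit_log = logging.getLogger("crawl_sketch.audit")

# Every query is a fixed string chosen by name. Nothing is built from user input.
# This one reads routine source from the system catalogs only (PostgreSQL 11+).
ALLOWED_QUERIES = {
    "list_routines": (
        "SELECT n.nspname, p.proname, pg_catalog.pg_get_functiondef(p.oid) "
        "FROM pg_catalog.pg_proc AS p "
        "JOIN pg_catalog.pg_namespace AS n ON n.oid = p.pronamespace "
        "WHERE p.prokind IN ('f', 'p') "
        "AND n.nspname NOT IN ('pg_catalog', 'information_schema')"
    ),
}


def run_allowlisted(cursor, name: str, min_interval: float = 0.5):
    """Run one allowlisted query by name, log it, and pause before returning."""
    if name not in ALLOWED_QUERIES:
        raise ValueError(f"query {name!r} is not on the allowlist")
    sql = ALLOWED_QUERIES[name]
    audit_log.info("executing allowlisted query %s: %s", name, sql)
    cursor.execute(sql)
    rows = cursor.fetchall()
    time.sleep(min_interval)  # crude rate limit between consecutive queries
    return rows
```

Because callers pass a query name and never SQL text, a reviewer can audit the complete set of statements by reading one dictionary.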
Supported Sources
| Source | Status |
|---|---|
| PostgreSQL stored procedures | In Development |
| Snowflake (views, UDFs, procs, tasks) | Planned |
| Informatica PowerCenter / IICS | Planned |
| SQL Server stored procedures | Planned |
| Oracle PL/SQL | Planned |
| dbt models | Planned |
Built By
Digital Rain Technologies. Founded by Augustin Chan, former Development Architect at Informatica (12 years, Fortune 500 data integration across APAC/MENA/Europe).
Follow the Build
Crawl is in early development. Get monthly updates on what shipped, what's next, and lessons from building open-source migration intelligence.