This benchmark is under development. When published, it will include raw data, full methodology, and peer-reviewed reproduction memos, so you can verify every claim yourself.
The corpus comprises 500+ real-world SQL statements drawn from DataHub GitHub issues, production BigQuery audit logs, and Snowflake/Databricks workloads. Each statement is tagged by dialect and construct, then parsed by sqlglot, sqllineage, openlineage-sql, and GSP.
Does the parser return a valid AST or fall back to an opaque Command?
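One way to score this question is to sort every parse result into three outcomes: a structured AST, an opaque fallback node, or an outright error. The sketch below is a minimal illustration, not the benchmark's harness. `classify_parse`, `parse_success_rate`, and the toy parser are hypothetical names, and the code assumes the fallback node is an instance of a class named `Command` (as with sqlglot's `exp.Command`); other parsers may signal fallback differently.

```python
from collections import Counter
from typing import Callable, Iterable


def classify_parse(parse: Callable[[str], object], sql: str) -> str:
    """Return 'ast', 'command', or 'error' for a single statement."""
    try:
        tree = parse(sql)
    except Exception:
        # Any exception raised by the parser counts as a parse failure.
        return "error"
    if tree is None:
        return "error"
    # Assumption: the opaque fallback is a node whose class is named "Command".
    if type(tree).__name__ == "Command":
        return "command"
    return "ast"


def parse_success_rate(parse: Callable[[str], object],
                       statements: Iterable[str]) -> float:
    """Fraction of statements that yield a structured AST."""
    outcomes = Counter(classify_parse(parse, s) for s in statements)
    total = sum(outcomes.values())
    return outcomes["ast"] / total if total else 0.0


# Toy stand-ins so the sketch runs without any parser installed.
class Select:
    pass


class Command:
    pass


def toy_parse(sql: str) -> object:
    words = sql.strip().split()
    head = words[0].upper() if words else ""
    if head == "SELECT":
        return Select()
    if head == "EXEC":
        return Command()
    raise ValueError("unparseable statement")


if __name__ == "__main__":
    corpus = ["SELECT 1", "EXEC some_proc", "SELECT 2", "???"]
    print(parse_success_rate(toy_parse, corpus))
```

To try this against a real parser, `toy_parse` could be swapped for something like `lambda s: sqlglot.parse_one(s, read="bigquery")`, assuming sqlglot is installed.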
Table-level and column-level relationships detected vs. ground truth.
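Comparing detected relationships against ground truth can be expressed as set overlap between lineage edges. The benchmark's actual scoring method has not been published, so the sketch below assumes a conventional precision/recall/F1 score; the function name `score_edges` and the example table and column names are hypothetical.

```python
from typing import Dict, Set, Tuple

# A lineage edge is a (source, target) pair, e.g. column to column.
Edge = Tuple[str, str]


def score_edges(detected: Set[Edge], truth: Set[Edge]) -> Dict[str, float]:
    """Score detected lineage edges against a ground-truth edge set."""
    true_positives = len(detected & truth)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical column-level edges for one INSERT ... SELECT statement.
    truth = {
        ("orders.amount", "daily_revenue.total"),
        ("orders.order_date", "daily_revenue.day"),
        ("orders.region", "daily_revenue.region"),
    }
    detected = {
        ("orders.amount", "daily_revenue.total"),
        ("orders.order_date", "daily_revenue.day"),
        ("orders.customer_id", "daily_revenue.region"),  # wrong source column
    }
    print(score_edges(detected, truth))
```

The same function works for table-level lineage if each edge is a pair of table names instead of column names.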
Stored procedures, MERGE, recursive CTEs, dynamic SQL, temp tables, window functions.
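Tagging a statement by construct can be approximated with keyword patterns. This is a rough sketch under stated assumptions, not the benchmark's tagger: the tag names, the patterns, and `tag_constructs` are hypothetical, and plain keyword matching will misfire on keywords inside string literals or comments and will miss dialect variants (for example, T-SQL recursive CTEs carry no `RECURSIVE` keyword).

```python
import re
from typing import List

# Hypothetical keyword patterns, one per construct listed above.
CONSTRUCT_PATTERNS = {
    "stored_procedure": r"\bCREATE\s+(?:OR\s+(?:REPLACE|ALTER)\s+)?PROC(?:EDURE)?\b",
    "merge": r"\bMERGE\b",
    "recursive_cte": r"\bWITH\s+RECURSIVE\b",
    "dynamic_sql": r"\bEXECUTE\s+IMMEDIATE\b|\bsp_executesql\b",
    "temp_table": (
        r"\bCREATE\s+(?:OR\s+REPLACE\s+)?(?:GLOBAL\s+|LOCAL\s+)?"
        r"TEMP(?:ORARY)?\s+TABLE\b"
    ),
    "window_function": r"\bOVER\s*\(",
}

_COMPILED = {
    tag: re.compile(pattern, re.IGNORECASE)
    for tag, pattern in CONSTRUCT_PATTERNS.items()
}


def tag_constructs(sql: str) -> List[str]:
    """Return the sorted construct tags whose pattern appears in the SQL."""
    return sorted(tag for tag, rx in _COMPILED.items() if rx.search(sql))


if __name__ == "__main__":
    sql = (
        "CREATE TEMP TABLE ranked AS "
        "SELECT a, ROW_NUMBER() OVER (PARTITION BY b ORDER BY c) AS rn FROM t"
    )
    print(tag_constructs(sql))
```

A tokenizer-based tagger would be more reliable; the regex version is only meant to show the shape of the tagging step.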
Issue counts from the DataHub issue tracker, sourced from the gap analysis.
| Dialect / construct | Tier | Total mentions | Open issues |
|---|---|---|---|
| BigQuery | Tier 1 | 90 | 16 |
| Snowflake | Tier 1 | 96 | 9 |
| Databricks | Tier 1 | 69 | 12 |
| MSSQL T-SQL | Tier 2 | 17 | 6 |
| Oracle | Tier 2 | 47 | — |
| MERGE | Tier 2 | 42 | 9 |
Data from materials/oss-gap-analysis/master-sheet.md, collected 2026-04-16.
gudusoft/sql-parser-benchmark (MIT)
While the full benchmark is being assembled, you can test GSP against your own SQL right now.
Try the Quick Start →