22 May, 2024 - About 3 minutes
SQLGlot
Intro
SQLGlot is a no-dependency SQL parser, transpiler, optimizer, and engine. It can be used to format SQL or translate between 21 different dialects like DuckDB, Presto / Trino, Spark / Databricks, Snowflake, and BigQuery. It aims to read a wide variety of SQL inputs and output syntactically and semantically correct SQL in the targeted dialects.
Install
pip3 install "sqlglot[rs]" |
Examples
Create the following python script to check translation of datafunctions from duckdb to hive
import sqlglot |
You should get the following output when running this code
SELECT FROM_UNIXTIME(1618088028295 / POW(10, 3)) |
SQLGlot can even translate custom time formats:
import sqlglot |
You should get
SELECT DATE_FORMAT(x, 'yy-M-ss') |
The official README file from the repo have other interesting examples
Optimize
It also can rewrite queries into an “optimized” form like on the following example:
import sqlglot |
Which will provide the following output
SELECT |
SQL Execution
SQLGlot is able to interpret SQL queries, where the tables are represented as Python dictionaries. The engine is not supposed to be fast, but it can be useful for unit testing and running SQL natively across Python objects.
from sqlglot.executor import execute |
You should get the output
user_id price |
Conclusion
This tool is very interesting if you are considering migrating between technologies in order to validate your SQL syntax or even speed up the process.
The optimization component is also something to take in to consideration.
If you are looking for some way to emulate Bigquery interfaces to test your application the following two projects could also be something to consider bigquery-emulator and tinyquery
Another important tool with similar functionality is Google interactive-sql-translator in case you are specifically searching on ways to speed up the migration to BigQuery could be a better approach.