Writing a Python SQL engine from scratch

Oct 20, 2023

https://github.com/tobymao/sqlglot/blob/main/posts/python_sql_engine.md

When I first started writing SQLGlot in early 2021, my goal was just to translate SQL queries from SparkSQL to Presto and vice versa. However, over the last year and a half, I've ended up with a full-fledged SQL engine. SQLGlot can now parse and transpile between 18 SQL dialects and can execute all 24 TPC-H SQL queries. The parser and engine are all written from scratch using Python.

This post will cover why I went through the effort of creating a Python SQL engine and how a simple query goes from a string to actually transforming data. The following steps are briefly summarized:

compilers

↑ up