TL;DR: DuckDB is now able to directly query data stored in PostgreSQL and speed up complex analytical queries without duplicating data.
PostgreSQL is the world's most advanced open source database (self-proclaimed). From its interesting beginnings as an academic DBMS, it has evolved over the past 30 years into a fundamental workhorse of our digital environment.
PostgreSQL is designed for traditional transactional use cases, OLTP, where rows in tables are created, updated and removed concurrently, and it excels at them. But this design decision makes PostgreSQL far less suitable for analytical use cases, OLAP, where large chunks of tables are read to create summaries of the stored data. Yet there are many use cases where both transactional and analytical access are required, for example when trying to gain the latest business intelligence insights into transactional data.
There have been some attempts to build database management systems that perform well on both workloads, HTAP, but in general many design decisions between OLTP and OLAP systems are hard trade-offs, making this endeavour difficult. Accepting that one size does not fit all after all, systems are often separated, with the transactional application data living in a purpose-built system like PostgreSQL, and a copy of the data being stored in an entirely different DBMS. Using a purpose-built analytical system speeds up analytical queries by several orders of magnitude.
Unfortunately, maintaining a copy of the data for analytical purposes can be problematic: the copy will immediately be outdated as new transactions are processed, requiring a complex and non-trivial synchronization setup, and storing two copies of the database requires twice the storage space. Moreover, the two kinds of systems represent data differently: OLTP systems like PostgreSQL traditionally use a row-based data representation, while OLAP systems tend to favor a chunked-columnar data representation. You can't have both without maintaining a copy of the data, with all the issues that brings with it. Also, the SQL syntax of whatever OLAP system you're using may differ quite significantly from that of Postgres.
But the design space is not as black-and-white as it seems. For example, the OLAP performance of systems like DuckDB does not come solely from a chunked-columnar on-disk data representation. Much of DuckDB's performance comes from its vectorized query processing engine that is custom-tuned for analytical queries. What if DuckDB could somehow read data stored in PostgreSQL? While it seems daunting, we have embarked on a quest to make exactly this possible.
To allow for fast and consistent analytical reads of Postgres databases, we designed and implemented the Postgres Scanner. This scanner leverages the binary transfer mode of the Postgres client-server protocol (see the Implementation section below for more details), allowing us to efficiently transform and use the data directly in DuckDB.
Among other things, DuckDB's design differs from conventional data management systems because DuckDB's query processing engine can run on nearly arbitrary data sources without needing to copy the data into its own storage format. For example, DuckDB can currently directly run queries on Parquet files, CSV files, SQLite files, Pandas, R and Julia data frames as well as Apache Arrow sources. This new extension adds the capability to directly query PostgreSQL tables from DuckDB.
The Postgres Scanner DuckDB extension source code is available on GitHub, but it is directly installable through DuckDB's new binary extension installation mechanism. To install, just run the following SQL query once:
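The installation is a single statement:

```sql
INSTALL postgres_scanner;
```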
Then, whenever you want to use the extension, you need to first load it:
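Loading is likewise a single statement:

```sql
LOAD postgres_scanner;
```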
To make a Postgres database accessible to DuckDB, use the POSTGRES_ATTACH command:
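In the simplest case, the call looks like this (an empty connection string uses libpq's environment defaults):

```sql
CALL postgres_attach('');
```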
postgres_attach takes a single required string parameter, which is the libpq connection string. For example, you can pass 'dbname=myshinydb' to select a different database name. In the simplest case, the parameter is just ''. There are three additional named parameters to the function:
- source_schema – the name of a non-standard schema in Postgres to get tables from. Default is public.
- overwrite – whether we should overwrite existing views in the target schema. Default is false.
- filter_pushdown – whether filter predicates that DuckDB derives from the query should be forwarded to Postgres. Default is false. See below for a discussion of what this parameter controls.
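For example, a call that attaches a specific database with filter pushdown enabled might look like this (the connection string is illustrative; the named-parameter syntax follows DuckDB's usual conventions for table functions):

```sql
CALL postgres_attach('dbname=myshinydb', source_schema='public', filter_pushdown=true);
```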
The tables in the database are registered as views in DuckDB; you can list them with
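DuckDB's standard pragma:

```sql
PRAGMA show_tables;
```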
Then you can query those views normally using SQL. Again, no data is being copied; these are merely virtual views on the tables in your Postgres database.
If you prefer not to attach all tables, but rather query a single table, that is possible using the postgres_scan and postgres_scan_pushdown table-producing functions directly, e.g.

SELECT * FROM postgres_scan('dbname=myshinydb', 'public', 'mytable');
SELECT * FROM postgres_scan_pushdown('dbname=myshinydb', 'public', 'mytable');
Both functions take three unnamed string parameters: the libpq connection string (see above), a Postgres schema name and a table name. The schema name is often public. As the name suggests, the variant with "pushdown" in the name will perform selection pushdown as described below.
The Postgres Scanner will only be able to read actual tables; views are not supported. However, you can of course recreate such views within DuckDB, the syntax should be very similar!
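For instance, a simple Postgres view could be re-created on the DuckDB side on top of the attached tables like this (the view, table and column names here are illustrative):

```sql
-- Re-create a Postgres view in DuckDB on top of an attached table.
CREATE VIEW recent_orders AS
SELECT o_orderkey, o_custkey, o_orderdate
FROM orders
WHERE o_orderdate >= DATE '1998-01-01';
```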
From an architectural perspective, the Postgres Scanner is implemented as a plug-in extension for DuckDB that provides a so-called table scan function (postgres_scan) in DuckDB. There are many such functions in DuckDB and its extensions, such as the Parquet and CSV readers, the Arrow reader, etc.
The Postgres Scanner uses the standard libpq library, which it statically links in. Ironically, this makes the Postgres Scanner easier to install than the other Postgres clients. However, Postgres' normal client-server protocol is quite slow, so we spent quite some time optimizing this. As a note, DuckDB's SQLite Scanner does not face this issue, as SQLite is also an in-process database.
We actually implemented a prototype direct reader for Postgres' database files, but while performance was great, there is the issue that committed but not yet checkpointed data would not be stored in the heap files yet. Furthermore, if a checkpoint was currently running, our reader would frequently overtake the checkpointer, causing additional inconsistencies. Since we want to be able to query an actively used Postgres database and believe that consistency is important, we abandoned that approach. Another architectural option would have been to implement a DuckDB Foreign Data Wrapper (FDW) for Postgres similar to duckdb_fdw, but while this could improve the protocol situation, deploying a Postgres extension is quite risky on production servers, so we expect few people would be able to do so.
Instead, we use the rarely-used binary transfer mode of the Postgres client-server protocol. This format is quite similar to the on-disk representation of Postgres data files and avoids some of the otherwise expensive to-string and from-string conversions. For example, to read a normal int32 from a protocol message, all we need to do is swap the byte order from network (big-endian) to native order (ntohl).
The Postgres Scanner connects to PostgreSQL and issues a query to read a particular table using the binary protocol. In the simplest case (see the optimizations below), to read a table called lineitem, we internally run the query:
COPY (SELECT * FROM lineitem) TO STDOUT (FORMAT binary);
This query will start reading the contents of lineitem and write them directly to the protocol stream in binary format.
DuckDB supports automatic intra-query parallelization through pipeline parallelism, so we also want to parallelize scans on Postgres tables: our scan operator opens multiple connections to Postgres, and reads subsets of the table from each. To efficiently split up reading the table, we use Postgres' rather obscure TID Scan (Tuple ID) operator, which allows a query to surgically read a specified range of tuple IDs from a table. Tuple IDs have the form (page, tuple). We parallelize our scan of a Postgres table based on database page ranges expressed in TIDs. Each scan task currently reads 1000 pages. For example, to read a table consisting of 2500 pages, we would start three scan tasks with the TID ranges [(0,0),(1000,0)], [(1000,0),(2000,0)] and [(2000,0),(UINT32_MAX,0)]. Leaving the bound open for the last range is important because the number of pages (relpages) for a table in the pg_class catalog is merely an estimate. For a given page range (P_MIN, P_MAX), our query from above is thus extended to look like this:
COPY (SELECT * FROM lineitem WHERE ctid BETWEEN '(P_MIN,0)'::tid AND '(P_MAX,0)'::tid) TO STDOUT (FORMAT binary);
This way, we can efficiently scan the table in parallel without relying on the schema in any way. Because the page size is fixed in Postgres, this also has the added bonus of equalizing the effort to read a subset of the table, independent of the number of columns in each row.
"But wait!", you will say, "according to the documentation the tuple ID is not stable and may be changed by operations such as VACUUM FULL. How can you use it to synchronize parallel scans?" This is true and could be problematic, but we found a solution:
Of course, a transactional database such as Postgres is expected to run transactions while we run our table scans for analytical purposes. We therefore need to address concurrent changes to the table we are scanning in parallel. We solve this by first creating a new read-only transaction in DuckDB's bind phase, where query planning happens. We leave this transaction running until we are completely done reading the table. We use another little-known Postgres feature, pg_export_snapshot(), which allows us to get the current transaction context of one connection, and then import it into our parallel read connections using SET TRANSACTION SNAPSHOT .... This way, all connections belonging to a single table scan will see the table state exactly as it appeared at the start of our scan, throughout the potentially lengthy read process.
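At the SQL level, the mechanism looks roughly like the following sketch (the snapshot identifier shown is illustrative; the real one is whatever pg_export_snapshot() returns on the main connection):

```sql
-- On the main connection: open a read-only transaction with a stable
-- snapshot, then export the snapshot's identifier.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ READ ONLY;
SELECT pg_export_snapshot();  -- returns an identifier, e.g. '00000003-0000001B-1'

-- On each parallel read connection: adopt that same snapshot before
-- scanning. SET TRANSACTION SNAPSHOT must be the first statement in
-- a REPEATABLE READ (or SERIALIZABLE) transaction.
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ READ ONLY;
SET TRANSACTION SNAPSHOT '00000003-0000001B-1';
-- ... COPY (SELECT ...) TO STDOUT (FORMAT binary) ...
COMMIT;
```

This gives all reader connections an identical, consistent view of the table for the duration of the scan.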
DuckDB's query optimizer moves selections (filters on rows) and projections (removal of unused columns) as far down as possible in the query plan (push down), and even instructs the lowermost scan operators to already perform those operations if they support them. For the Postgres Scanner, we have implemented both pushdown variants. Projections are rather straightforward: we can immediately instruct Postgres to only retrieve the columns the query is using. This of course also reduces the number of bytes that need to be transferred, which speeds up queries. For selections, we construct a SQL filter expression from the pushed-down filters. For example, if we run a query like
SELECT l_returnflag, l_linestatus FROM lineitem WHERE l_shipdate < '1998-09-02' through the Postgres Scanner with filter pushdown enabled, it would run the following queries:
COPY (
  SELECT "l_returnflag", "l_linestatus"
  FROM "public"."lineitem"
  WHERE ctid BETWEEN '(0,0)'::tid AND '(1000,0)'::tid
    AND ("l_shipdate" < '1998-09-02' AND "l_shipdate" IS NOT NULL)
) TO STDOUT (FORMAT binary);
-- and so on
As you can see, the projection and selection pushdown has expanded the queries run against Postgres accordingly. Using selection pushdown is optional. There can be cases where running a filter in Postgres is actually slower than transferring the data and running the filter in DuckDB, for example when filters are not very selective (many rows match).
To investigate the performance of the Postgres Scanner, we ran the well-known TPC-H benchmark on DuckDB using its internal storage format, on Postgres using its internal format, and with DuckDB reading from Postgres using the new Postgres Scanner. We used DuckDB 0.5.1 and Postgres 14.5; all experiments were run on a MacBook Pro with an M1 Max CPU. The experiment script is available. We ran scale factor 1 of TPC-H, creating a dataset of roughly 1 GB with ca. 6 M rows in the largest table, lineitem. Each of the 22 TPC-H benchmark queries was run 5 times, and we report the median run time in seconds. The time breakdown is given in the following table.
Stock Postgres is unable to finish queries 17 and 20 within a one-minute timeout, because of correlated subqueries containing a query on the lineitem table. For the other queries, we can see that DuckDB with the Postgres Scanner not only finished all queries, it was also faster than stock Postgres on roughly half of them, which is astonishing given that DuckDB has to read its input data from Postgres through the client/server protocol as described above. Of course, stock DuckDB is still 10x faster with its own storage, but as discussed at the very beginning of this post this requires the data to be imported there first.
The Postgres Scanner can also be used to combine live Postgres data with pre-cached data in creative ways. This is especially effective when dealing with an append-only table, but it can also be used if a modified-date column is present. Consider the following SQL template:
INSERT INTO my_table_duckdb_cache
SELECT * FROM postgres_scan('dbname=myshinydb', 'public', 'my_table')
WHERE incrementing_id_column > (SELECT MAX(incrementing_id_column) FROM my_table_duckdb_cache);

SELECT * FROM my_table_duckdb_cache;
This provides faster query performance with fully up-to-date query results, at the expense of some data duplication. It also avoids complex data replication technologies.
DuckDB has built-in support for writing query results to Parquet files. The Postgres Scanner provides a rather simple way to write Postgres tables to Parquet files; it can even directly write to S3 if desired. For example:
COPY (SELECT * FROM postgres_scan('dbname=myshinydb', 'public', 'lineitem')) TO 'lineitem.parquet' (FORMAT PARQUET);
DuckDB's new Postgres Scanner extension can read PostgreSQL's tables while PostgreSQL is running, and compute the answers to complex OLAP SQL queries often faster than PostgreSQL itself can, all without the need to duplicate data. The Postgres Scanner is currently in preview, and we are curious to hear what you think. If you find any issues with the Postgres Scanner, please report them.