Saturday, April 08, 2017

New Features Coming in PostgreSQL 10

The list of new features coming in PostgreSQL 10 is extremely impressive.  I've been involved in the PostgreSQL project since the 8.4 release cycle (2008-2009), and I've never seen anything like this.  Many people have already blogged about these features elsewhere; my purpose here is just to bring together a list of the features that, in my opinion, are the biggest new things that we can expect to see in PostgreSQL 10.  [Disclaimers: (1) Other people may have different opinions.  (2) It is not impossible that some patches could be reverted prior to release.  (3) The list below represents the work of the entire PostgreSQL community, not specifically me or EnterpriseDB, and I have no intention of taking credit for anyone else's work.]

Headline Features

Declarative Partitioning.  In previous versions of PostgreSQL, PostgreSQL supported only table inheritance, which could be used to simulate table partitioning, but it was complicated to set up and the performance characteristics were not that great.  In PostgreSQL 10, it's possible to do list or range partitioning using dedicated syntax, and INSERT performance has been greatly improved.  There is still a lot more work to do in future releases to improve performance and add missing features, but even what we have in v10 is already a major step forward (IMHO, anyway).

Logical Replication.  PostgreSQL has had physical replication -- often called streaming replication -- since version 9.0, but this requires replicating the entire database, cannot tolerate writes in any form on the standby server, and is useless for replicating across versions or database systems.  PostgreSQL has had logical decoding -- basically change capture -- since version 9.4, which has been embraced with enthusiasm, but it could not be used for replication without an add-on of some sort.  PostgreSQL 10 adds logical replication which is very easy to configure and which works at table granularity, clearly a huge step forward.  It will copy the initial data for you and then keep it up to date after that.

Improved Parallel Query.  While PostgreSQL 9.6 offers parallel query, this feature has been significantly improved in PostgreSQL 10, with new features like Parallel Bitmap Heap Scan, Parallel Index Scan, and others.  Speedups of 2-4x are common with parallel query, and these enhancements should allow those speedups to happen for a wider variety of queries.

SCRAM Authentication.  PostgreSQL offers a remarkable variety of different authentication methods, including methods such as Kerberos, SSPI, and SSL certificate authentication, which are intended to be highly secure.  However, sometimes users just want to use passwords managed by the PostgreSQL server itself.  In existing releases, this can be done either using the password authentication type, which just sends the user-supplied password over the wire, or via the md5 authentication type, which sends a hashed and salted version of the password over the wire.  In the latter approach, stealing the hashed password from the database or sniffing it on the wire is equivalent to stealing the password itself, even if you can't compute a preimage.  PostgreSQL 10 introduces scram authentication, specifically SCRAM-SHA-256, which is much more secure.  Neither the information which the server stores on disk nor the contents of an authentication exchange suffice for the server to impersonate the client.  Of course, the substitution of SHA-256 for MD5 is also a substantial improvement.  See also Michael Paquier's blog on this topic. One point to note is that, unless you are using libpq, you will not be able to use this feature unless your particular client driver has been updated with SCRAM support, so it may be a while before this feature is universally available.

Executor Speedups.  Substantial parts of PostgreSQL's executor have been rewritten to make expression and targetlist projection faster; just-in-time compilation will be added in a future release.  Hash aggregation has been rewritten to use a more efficient hash table and store narrower tuples in it, and work has also been done to speed up queries that compute multiple aggregates and joins where one side can be proven unique.  Grouping sets now support hash aggregation.  While all PostgreSQL releases typically contain at least some performance improvements, the rewrite of expression and targetlist projection is a particularly large and significant improvement which will benefit many users.

Durable Hash Indexes.  Hash indexes in PostgreSQL have suffered from years of long neglect; the situation will be noticeably improved in v10.  The most notable change is that changes to a hash index now write WAL, which means that they are crash-safe and that they are properly replicated to standbys.  However, a good deal of other work has been done, including the necessary prerequisite step of revamping the bucket split algorithm to improve performance and concurrency, caching the metapage for better performance, adding page-at-a-time vacuuming, and expanding them more gradually.  Amit Kapila even writes about a case where they outperformed btree indexes.  While there's certainly more work to be done here, I'm excited about these improvements.

ICU Collation Support.  In current releases, PostgreSQL relies exclusively on the collations supplied by the operating system, but this sometimes causes problems: collation behavior often differs between operating systems, especially between Linux and Windows, and it isn't always easy to find a collation for one operating system whose behavior matches that of some collation available on another system.  Furthermore, at least on Red Hat, glibc regularly whacks around the behavior of OS-native collations in minor releases, which effectively corrupts PostgreSQL's indexes, since the index order might no longer match the (revised) collation order.  To me, changing the behavior of a widely-used system call in a maintenance release seems about as friendly as locking a family of angry racoons in someone's car, but the glibc maintainers evidently don't agree.  (In fact, there's one discussion where it's suggested that you not use some of those interfaces at all.)  libicu, on the other hand, says they care about this.

But Wait, There's More!

In my estimation, the features listed above are the most exciting things that users can expect in PostgreSQL 10, which is expected to be released in September.  However, there are quite a few other significant features as well which could easily have qualified as headline features in a release less jam-packed than this one.  Here are some of them:

Extended Statistics (ndistinct, functional dependencies).  If the query planner makes a bad row count estimate resulting in a terrible plan, how do you fix it?  With extended statistics, you can tell the system to gather additional statistics according to parameters that you specify, which may help it get the plan right.

FDW Aggregate Pushdown.  In previous releases, SELECT COUNT(*) FROM foreign_table operated by fetching every row form the foreign table and counting them locally.  That was terrible, so now it doesn't.

Transition Tables.  It is now possible to write a PL/pgsql AFTER STATEMENT trigger which can access all rows modified by the statement.  This can be both faster and more convenient than writing an AFTER ROW trigger that is called once per row.

Improved Wait Events.  PostgreSQL 9.6 introduced wait event monitoring in pg_stat_activity, but only for a limited range of events.  In PostgreSQL 10, you'll be able to see latch waits and I/O waits, even for auxiliary processes and unconnected background workers.

New Integrity Checking Tools.  You can now validate the integrity of your btree indexes using the new amcheck module.  If you're a developer adding write-ahead logging to a new storage form, or a user who thinks the developers may have introduced a bug, you'll be pleased to be able to test with wal_consistency_checking. pg_dump now has better test coverage.

Smarter Connection Handling.  Connections through libpq can now specify multiple hosts, and you can even tell it to find you the server that is currently accepting write connections.

Quorum-Based Synchronous Replication.  You can now specify that a commit must be acknowledged by any K of N standby synchronous servers, improving flexibility and performance.

Other Cool Stuff

Many other things have also been significantly improved in this release.  XMLTABLE makes querying XML data faster and easier.  You can now interrogate the commit status of a transaction directly, and we've got better tracking of replication lag.  psql now supports \if ... \elseif ... \else ... \endif to make scripting easier, and there are new functions and new roles to allow monitoring tools to run without superuser privileges.  Encoding conversions are now faster, and so is sorting. You can compress the transaction log while streaming it.  And there's more, but this blog post is too long already.  If you're interested in reading even more about new features that will be coming with PostgreSQL 10, depesz blogs frequently on this topic, and so does Michael Paquier.  Both have additional details on some of the features mentioned here, as well as others that may be of interest.

This final note: we have had chronic problems with users erroneously believing that the pg_xlog or pg_clog directory is non-critical data, possibly because the directory names include the word "log".  Those directories have been renamed to pg_wal and pg_xact, which we hope will be clearer.  All SQL functions and utility names that formerly included the string "xlog", meaning the transaction log or write-ahead log, have been renamed to use "wal" instead.  Conversely, the default log directory is now called log rather than pg_log so that it is looks less like an internal name.  These changes will probably cause a bit of upgrade pain for some users, but we hope that they will also help users to avoid catastrophic mistakes.

If you're still reading this, thanks!  I think it's going to be a great release.

21 comments:

  1. There is a lot of great stuff in there! The 10 roadmap on the wiki also mentioned column stores. Is that still in being worked on for a 10.X release? I'm hoping so, because the combination of parallel query and column storage sounds like a giant leap for analytical workloads.

    ReplyDelete
    Replies
    1. Nope, no column stores in v10. Sorry!

      Delete
    2. Columnar storage would be huge, seriously YUGE.

      Is anyone working on this and is there a timeline for it's release?

      Thanks for the work you are doing on postgres.

      Delete
    3. Several people have worked on it, but nobody has yet produced a committable patch (at least, IMHO). I think that to get a big benefit out of column store, we would need some executor changes first - we don't for example have any real support for postponing the fetches of individual columns until the latest possible stage. Without that sort of executor support, a column store isn't likely to be very good. So there is probably some other work to do first.

      Thanks for the complements, and the feedback on what you'd like to see in the future is great, too!

      Delete
    4. Citus publishes an open source extension for columnar storage. Haven't used it so I can't say how good it is, though... https://github.com/citusdata/cstore_fdw

      Delete
    5. Or just use Vertica

      Delete
  2. Amazing and beautiful. Thank you for bringing this list together. Postgresql continues to be the best database in the world.

    ReplyDelete
  3. I used mostly oracle and ms sql but this looks interesting.

    ReplyDelete
  4. Hey Robert, are you sure the release date is September or is it somewhere in August? Thank you.

    ReplyDelete
    Replies
    1. Well, it's hard to be sure at this point. I'm estimating.

      Delete
  5. My wish list for PG11, and development rant in general :)

    I read/see a lot of development gone down the wrong path before one of the committers points out the various errors.. for big ticket items, ideally the senior Committers get together and nut out / agree the HL designs for the prerequisites mentioned above in this blog and others that must exist that are not mentioned above like PGStrom new record size, so they can land early (e/g/ all dependencies in CF1/2) and the dependent items like Columnar Store, PGStrom, JIT and anything else that can speed up performance for nodes with just a few cpu, etc can be worked much earlier and have a chance for same release cycle, so we don't have to wait for another release etc.

    JSON Table and SQL 2016, more jsonb goodies and performance.

    The community rallies and finally moves the remaining postgresql xl stuff into 11 so everyone can focus on improving ONE product. Disappointed to watch one company speaker / +XL benefiting company say it will take many years for that to happen. Assume its just sharding to go?

    A FDW for orientdb or some other apache equiv. db.

    Some compression similar to the PG Pro co. has, with similar performance characteristics etc.

    No bugs get moved to next PG release/CF. extra 1 month window after final CF ending just for bugs only to close them out, where EVERYONE incl committers (e.g. design challenge/issue) help get it through. Could say the same for any big ticket items that are almost there but not quite at the end of the final CF..

    Better OOB settings for small companies/start-ups. parallel thread set to 2, replication/DR etc. Better performance on AWS/Google is the theme here for low number of resources available.

    Worth providing a build for smaller bit size? cant remember if PG was 32/64.. thinking of db size impact as well as replication file size over the wire etc.

    Focus on getting those big gain potential items/refactor/redesign done asap, rather than effort required to get the 1,2,5% gain items through..

    Someone goes through everyone's blogs where someone said they'd go to PG if it had X,Y,Z.. and that list gets split up between companies/developers so come 11, everyone can go back and say hey, its in 11, so go migrate :P

    Learn from others.. maybe create specialised sub-teams.. a Performance group who investigate new performance opportunities, bleeding edge algorithms, and lead the development direction for those etc etc. A dedicated replication team, bug team (M.P. seems like the king of fixes), etc. maybe 1 rep from each of the core PG companies/committers as a minimum.

    Never understood why when one company like Citus release an add-on like their own columnar store, that it isn't embraced by the wider PG dev community/team and worked on and put into PG10/11 core etc.. rather than have nothing at the end of PG10 and also now have dependencies todo as well in 11.. seems like there is a bit of pride to roll your own rather than leverage / or cut your nose off to spite your face.. Even after everyone posted roadmaps people still seemed to be silo'd in there development efforts..I am assuming there is no license restriction on citus columnar store so could be way off here, but if there is a solution out there, then use it, improve it, get it into core, rather than having zilch and waiting 2+ releases by looks of it now based on above blog..

    ReplyDelete
    Replies
    1. I'm the guy that asked the opening question about column stores, and I'm eager to see that happen. But not so eager as to advocate that the PostgreSQL devs abandon their commitment to getting design right the first time. Refraining from deploying whatever code is handy has clearly been a factor in how Postgres became so good.

      There's a discussion of this post at Hacker News, and a comment there addressed this point really well: "They also really care about code quality, getting the design right early, and covering all possible edge cases. As a result, Postgres [is] solid, clean, has unusually few legacy oddities, and almost never any subtle, surprising breaking changes."

      I definitely want columnar storage to happen in Postgres, but not as much as I want Postrges to maintain its amazingly high level of quality. The ratio of moments where the care and craft of the devs shines through versus any unforeseeable gotcha approaches infinity. And a single feature, no matter how shiny, is not worth compromising that.

      As for Citus, I'm glad those folks are doing their work, and I've been keeping an interested eye on them (and MonetDB). But I'm a long way away from urging the Postgres devs to just integrate whatever code comes over the fence.

      Robert (and everyone else), thank you for all the work and care you put into PostgreSQL. It shows every time.

      Delete
  6. Something rather small but that some people may care about is the fact that pg_dump and pg_basebackup flush by default the data generated if possible to make it durable on disk with Postgres 10. There is as well a --no-sync option to get the previous default behavior. Of course nothing can be done with something like pg_dump > file.sql.

    ReplyDelete
  7. There's an incorrect link in your post. In the quote: "Extended Statistics (ndistinct, functional dependencies)" both two links point to the same commit.

    I believe the second link should go to: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=2686ee1b7ccfb9214064d4d2a98ea77382880306

    ReplyDelete
  8. > If you're still reading this, thanks! I think it's going to be a great release.

    thank you!

    Robert, let me do one comment about Logical replication. Do you think "Logical" is true name for the newest replication type? IMHO all our builtin replication is binary-physical and is not logical actualy:

    1) all builtin replications are based on binary WAL stream
    2) moreover there is no separated buffer instead of WAL

    3) all slots hold on one and whole WAL binary stream
    4) all consumers consume one WAL binary stream
    5) if any consumers stop or kiled or paused it affected whole WAL stream and master node has to store all WALs
    6) all WALs include binary stream from vaccuum/index/reindex -- from not logical operations
    7) and if one wants to replicate only some table from only some database then we have to mesh our goals with all cluster (all users and databases) and whole WAL binary (sql/vac/reindex/etc) stream

    8) there is no any form of batching so one has to consume in one thread stuff produced by many threads

    and some other points similar to listed above or conducted with them.

    so does we start game "phisical/logical or semilogical" replications as we have got in mysql already?

    why we do not separate binary stream and logical buffers/queus? this separations (maybe as a option) could solve all that issues.

    thanks in advance for understanding!

    ReplyDelete
    Replies
    1. Logical decoding works by converting the write-ahead log stream (which has to be written anyway for crash recovery) into a stream of logical changes; logical replication arranges to apply those changes on another system. It would be possible to make the logical replication log completely separate from the ordinary write-ahead log, but this would writing every change to three places (the data file, the write-ahead log, and the logical replication log) instead of only two, which would in some scenarios be much less efficient.

      There may also be cases where such a design is very desirable, as when only a single lightly-trafficked table is being replicated; a separate log wouldn't cost much, and would avoid the need to decode the write-ahead log most of which isn't relevant. However, I believe that would be designing for the exceptional case rather than the usual one.

      Delete
  9. Very cool feature list.

    Do you think we'll see DDL replication included in postgres logical replication anytime soon? I'm interested in using postgres LR, but I'm not sure how to manage the DDL synchronization.

    Thanks!

    ReplyDelete
  10. Love the advances in replication. I'm currently migrating a company from SQL Server 2008 to PostgreSQL and the only hiccup is not being able to use multi-master replication right out of the box. We'll be using BDR or xDB once we get to that part, but it would be nice to see that capability in the core someday.

    ReplyDelete
  11. Nice article with all features listed at a place.

    ReplyDelete