Gunther's law models two effects. First, in a concurrent system, some percentage of the work must be done serially rather than in parallel. For example, if we imagine a number of PostgreSQL backends all performing a workload where 5% of the work can't be parallelized, then, no matter how many processes we add, the overall throughput can never be more than 20 times the throughput of a single process. This is the parameter called alpha in the above link to Wikipedia, and sigma in the Percona white paper on this topic.
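To make that arithmetic concrete, here's a minimal Python sketch (my own illustration, not anything from the benchmark harness) of how a fixed serial fraction caps speedup:

```python
def max_speedup(n, serial_fraction):
    # Amdahl-style bound: the serial fraction never speeds up, so the
    # speedup with n processes is n / (1 + serial_fraction * (n - 1)).
    return n / (1 + serial_fraction * (n - 1))

# With 5% of the work serialized, no process count can push the
# speedup past 1 / 0.05 = 20x.
print(max_speedup(32, 0.05))         # only about 12.5x on 32 processes
print(max_speedup(1_000_000, 0.05))  # creeps up toward the 20x ceiling
```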
Second, the "law" also models the effect of cache coherency. As I understand it, this means that even in the absence of lock contention, operations that access shared state are going to become slower as the number of processors or threads or whatever increases, because the overhead of maintaining cache coherency is going to go up. This parameter is called beta in the above link to Wikipedia, and kappa in the Percona white paper.
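Putting the two parameters together, the model's throughput curve can be sketched in a few lines of Python; the alpha and beta values here are arbitrary illustrative numbers, not fitted to any benchmark:

```python
def usl_throughput(n, alpha, beta):
    # Universal Scalability Law: alpha is the serialization (contention)
    # term, beta the coherency term; the n*(n-1) coherency cost is what
    # eventually makes throughput decline rather than merely flatten.
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

# Illustrative parameters: throughput rises, peaks, then falls.
alpha, beta = 0.05, 0.001
for n in (16, 31, 64):
    print(n, round(usl_throughput(n, alpha, beta), 2))
```

With these numbers the peak falls near sqrt((1 - alpha) / beta) ≈ 31 clients, which is why the predicted throughput at 64 clients comes out lower than at 31.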
I happen to have a graph that I recently made, showing how well PostgreSQL scales on a read-only workload consisting of lots and lots of little tiny queries. In the graph below, the blue line is PostgreSQL 9.1, the green line is PostgreSQL 9.2 just after I committed my patch to add a fast-path for relation locking, and the red line is a recent snapshot of the development tree.
Now, there are a couple of interesting things about this graph, aside from the obvious fact that the green line looks a lot better than the blue line, and the red line looks better than the green line. First, of course, both the green and red lines flatten off at 32 cores and gradually descend thereafter. Since these results were collected on a 32-core machine, this isn't surprising. The blue line peaks around 20 cores, then drops and levels off. Second, if you look, you can see that the green line is actually descending reasonably quickly after hitting its peak, whereas the red line - and, even more, the blue line - decline more slowly.
Something that's a little harder to see on this graph is that even at 1 client, performance on the latest 9.2devel sources is about 2.9% better than on 9.1, and at 4 clients the difference grows to 13%. Because of the scale of the graph, these improvements at lower concurrencies are hard to see, but they're nothing to sneeze at. I'm wondering whether some of the single-client performance improvement may be related to Tom Lane's recent rewrite of the plan cache, but I haven't had a chance to test that theory yet.
Anyway, after listening to Baron's talk, I got to wondering how well or poorly this data would fit the Universal Scalability Law. As it turns out, Baron has written a tool called "usl" which does just that. To avoid confusing the tool, I fed it only the data points up to N=32, since what's going on above 32 clients is a completely different phenomenon that the tool, not knowing we're dealing with a 32-core server, won't be able to cope with. For PostgreSQL 9.1, the curve fits pretty well:
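I haven't looked at usl's internals, but the kind of fit it performs can be sketched with a crude grid search in Python; the data points below are made-up illustrative numbers, not the benchmark results:

```python
def usl(n, alpha, beta, c1):
    # Throughput predicted by the USL, scaled by the single-client
    # throughput c1.
    return c1 * n / (1 + alpha * (n - 1) + beta * n * (n - 1))

def fit_usl(points):
    # Crude grid search minimizing squared error over (alpha, beta);
    # a real tool would use nonlinear regression instead.
    c1 = dict(points)[1]  # anchor the scale to the 1-client measurement
    best = None
    for ai in range(200):
        for bi in range(200):
            alpha, beta = ai / 1000.0, bi / 100000.0
            err = sum((tps - usl(n, alpha, beta, c1)) ** 2
                      for n, tps in points)
            if best is None or err < best[0]:
                best = (err, alpha, beta)
    return best[1], best[2]

# Made-up numbers shaped roughly like a read-only run on a 32-core box.
points = [(1, 1000), (2, 1950), (4, 3700), (8, 6600),
          (16, 10400), (32, 13000)]
alpha, beta = fit_usl(points)
print(alpha, beta)
```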
What's going on here? Clearly, peak throughput is not at 0 clients, so the tool is confused. But if you look at the graph, you might start to get confused, too: at the higher client counts, performance appears to be increasing more than linearly as we add clients. And surely the cache coherency overhead can't be negative. But in fact, the underlying data shows the same super-linear scaling -- the "% scale" columns in the following table show how the performance compares to a linear multiple of the single-client performance.
| Clients | PG 9.1 | PG 9.1 Scale % | PG 9.2 Fast Locks | PG 9.2 Fast Locks Scale % | PG 9.2 Current | PG 9.2 Current Scale % |
| ------- | ------ | -------------- | ----------------- | ------------------------- | -------------- | ---------------------- |
The numbers in red are the dramatically odd cases: at 32 clients, the latest code is more than 50% faster than what you'd expect given perfect linear scalability. Your first intuition might be to suspect that these results are a fluke, but I've seen similar numbers in many other test runs, so I think the effect is real, even though I have no idea what causes it.
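The "% scale" figure itself is just observed throughput divided by a perfect linear multiple of the single-client number; a quick sketch with hypothetical values:

```python
def scale_pct(tps_n, n, tps_1):
    # Throughput at n clients as a percentage of perfect linear
    # scaling, i.e. n times the single-client throughput.
    return 100.0 * tps_n / (n * tps_1)

# Hypothetical values: anything over 100% is super-linear scaling.
print(scale_pct(48000, 32, 1000))  # 150.0
```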