Tony Bain

April 27, 2005 | Tony Bain

Ed Sim has blogged about Greenplum, a start-up dipping its toe into the enterprise RDBMS market. Excluding legacy platforms, this market is currently dominated by Oracle and SQL Server and, to a lesser extent, DB2 and Teradata. Greenplum has an impressive array of founders, with backgrounds at Teradata, Oracle, Tandem and others.

Some queries (sorry!) I have for Greenplum in response to some of the statements made on their website:

“Faster reporting and analytics – 10 to 50 times faster than traditional data warehouse technologies”

Are you really claiming that on the same infrastructure, with the same database, queries against your product will be 10 to 50 times faster than on Oracle, SQL Server or DB2? If so, this is a claim that you should back up ASAP. The TPC is the closest thing we have to an independent industry benchmark assessment for RDBMS platforms. Are you working with hardware partners to get yourselves to the top of the TPC-H clustered benchmark? I think this would be the single biggest thing you could do to get attention (regardless of the actual relevance of the TPC scores, many organisations reference them as part of their platform justification).

“DeepGreen MPP’s shared nothing architecture, which leverages a number of self-contained, parallel processing units, is a proven and effective solution to support large-scale data warehousing demands. In shared-nothing architectures like DeepGreen MPP’s, each unit acts as a self-contained database management system that owns and manages a distinct portion of the overall data.”

This sounds like federation, which both Microsoft SQL Server and IBM DB2 have done for a number of years. This architecture introduces its own set of challenges, availability being one example. How do you differ from existing federation technologies? How do you address the availability issues associated with a shared-nothing architecture? And why is your scale-out model preferable to scale-up?
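To make the availability question concrete, here is a minimal Python sketch of a shared-nothing layout. It is not Greenplum's implementation (I have not seen it); the `Node` and `Cluster` classes, the CRC32-based placement and the sample order data are all hypothetical and exist only to illustrate the idea that each unit owns a distinct slice of the data.

```python
import zlib


class Node:
    """Hypothetical self-contained unit that owns a distinct slice of the rows."""

    def __init__(self, name):
        self.name = name
        self.rows = []
        self.online = True

    def local_sum(self, column):
        # A node can only answer for the rows it owns.
        if not self.online:
            raise RuntimeError(f"{self.name} is offline")
        return sum(row[column] for row in self.rows)


class Cluster:
    """Hypothetical coordinator: routes rows by key and fans queries out."""

    def __init__(self, node_count):
        self.nodes = [Node(f"node{i}") for i in range(node_count)]

    def owner(self, key):
        # Stable hash, so a given key always lands on the same node.
        index = zlib.crc32(str(key).encode()) % len(self.nodes)
        return self.nodes[index]

    def insert(self, key, row):
        self.owner(key).rows.append(row)

    def total(self, column):
        # Scatter the aggregate to every node, then combine the partial sums.
        return sum(node.local_sum(column) for node in self.nodes)


cluster = Cluster(4)
for order_id in range(1000):
    cluster.insert(order_id, {"order_id": order_id, "amount": 10})

print(cluster.total("amount"))  # 10000

# With no replica of node2's slice, one failure breaks every full-table query.
cluster.nodes[2].online = False
try:
    cluster.total("amount")
except RuntimeError as exc:
    print("query failed:", exc)
```

In this toy version a single offline unit makes any query that spans the whole table fail, because nothing else holds a copy of that unit's slice. A real product would presumably mirror or replicate each slice somehow, which is exactly what I would like to hear more about.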

“Why Open Source? Because it works.”

How do you differ from other open source RDBMS initiatives, such as MySQL and Firebird, which, while having some success, have made few inroads into the enterprise data centre?