Five reasons why I choose PostgreSQL |

I’ve worked with most of the major SQL databases over the last several years, including Oracle, MS SQLserver, MySQL, and PostgreSQL, both as an admin and (more often) as a developer. They’ve all got their ups and downs (some more of one than the other…), but when it comes to choosing a database for my own projects, my choice these days is nearly always PostgreSQL. This amazing, powerful server is unfortunately one of the hidden gems of the relational database world, so I thought I’d share a few reasons why it’s my personal favorite.

Elegant SQL syntax

There’s just something about SQL itself that seems very throwback; maybe it’s just the convention of typing commands in caps, its odd rules about quotes and character escaping, or just the sometimes-convoluted approaches to problem solving its declarative syntax forces you to take. Every time I start typing “SELECT …” I feel a little like I’m trying to hack the MCP or talk WOPR out of launching nukes. Especially if I’m forced to use some byzantine implementation of SQL like, say, Oracle.

So any little bit of syntactic sugar that streamlines the process and makes creating SQL queries a bit more elegant is greatly appreciated. PostgreSQL’s dialect of SQL is full of little bits of niceness here. PostgreSQL’s typecasting syntax is a great example; instead of having to type some verbose nonsense like:


SELECT CAST (mycolumn AS VARCHAR(30)), 
    CAST (myothercolumn AS DATE) FROM mytable;

…or, worse, some ridiculously specific function like “TO CHAR()”, PostgreSQL allows you to typecast like so:


SELECT mycolumn::VARCHAR(30), myothercolumn::DATE FROM mytable;

Another example of nice PostgreSQL syntax: dollar-sign quoted strings.

In most SQL databases, string literals can only be delimited with single quotes. This is a bit of an annoyance, given that the single quote also happens to be the apostrophe – a rather popular bit of punctuation in the English language. Dollar quoting lets you use double dollar signs ($$) (with an optional tag in between the dollar signs) to delimit a string. This is awesome if you need to quote a long piece of text (e.g. a crosstab query) and don’t want to worry about escaping all the single quotes; to my knowledge, PostgreSQL is one of the only databases to offer an alternate string delimiter.

Or, take the way PostgreSQL does case-insensitive string comparisons. If you want this in SQL Server, the correct way is to specify a case-insensitive collation for the query; if you want it in Oracle or MySQL, you’d do something like:


SELECT * FROM sometable WHERE UPPER(somefield) LIKE UPPER('searchterm');

In PostgreSQL, you get the “ILIKE” operator – exactly like “LIKE”, but case-insensitive, so the query becomes:


SELECT * FROM sometable WHERE somefield ILIKE 'searchterm';

Things like the “::” operator, dollar-sign quoting, or ILIKE don’t seem that big of a deal; but when you get into writing 3-page SQL queries with complex conditions, cross-tabs, and other such things, these little niceties add up to a far cleaner piece of code.

Many useful data types

Most databases give you various types of character, number, date, and binary data types. PostgreSQL gives you this, plus several interesting and cool data types that you may not have considered before:

Enumerated types (for an arbitrary list of fixed values)
Network address types (IP, CIDR, MAC)
Geometric/Spatial types (points, lines, polygons, etc.)
XML and JSON types
BOOLEAN (yes, some very expensive and over-hyped databases still don’t have these…)

Sure, some of these things can be represented with more conventional data types, but having them already built in gives you a standard representation, data type enforcement, and a battery of supporting functions and operators¹.

And unlike some other databases, these data types are well supported and sanely implemented. Take the ENUM type for example; while MySQL also has the ENUM type, it’s implementation is awkward: simply getting a list of possible ENUM values from the database (which you’d want for, say, populating a drop-down list in the application) requires querying the information_schema for a column description and using a regex on the results². PostgreSQL, on the other hand, has a built-in function to do this.

Arrays

One of the best data types that PostgreSQL adds to the mix is the array type, which lets you have arrays of any other type of data in a single field. Now, you might think having an array inside a database column is redundant, if not flat out dangerous; but I can assure you that, used sparingly, arrays in a SQL database are pretty amazingly useful.

Consider a situation where you want to flatten a couple of tables with a many-to-one relationship; for instance, you have an HR application where you want to flatten the “Departments” and “Employees” tables to a table containing the department name in one column and a list of employees in the other. In most databases, about all you can do is either (a) create a table with the department in one column and one employee per row in the next, then aggregate the employees in your application code, or (b) aggregate the employees to a string delimited by commas or somesuch. In PostgreSQL, you could aggregate the employees to a VARCHAR array; if you’re using a decent driver (like Python’s PsycoPG driver), the array will come into your application as a native array/list/vector/etc. Very convenient.

Arrays are also useful when you may need multiple values in a field, but it doesn’t quite justify yet another table join.

PostgreSQL features a pretty robust set of operators and functions for testing, comparing, manipulating, and converting arrays. It does take a bit of mental gymnastics now and then to figure out how to get an array into the format you need to accomplish various things, but used correctly and sparingly they’re an awesome tool.

Stored procedures: pick your language

All the database engines I’ve been talking about have some form of “stored procedure”, or the ability to create custom functions stored in the database itself. This is quite useful if, for example, you’ve got a complex database schema and want to abstract some common read or write operations that are otherwise a real pain to do in raw SQL, or to create custom data-manipulation functions that are specific to your data.

In MySQL, your only option for writing these is plain old SQL, which means a stored procedure is more of a “named query”. It’s useful, but not all that powerful.

Oracle gives you its own PL/SQL, an extension of SQL that adds procedural logic constructs like flow control and variables. If you can imagine the unholy union of SQL and COBOL, you’ve got a good picture of PL/SQL. Ugly and archaic though it may appear, PL/SQL does make a lot of procedures possible that mere SQL cannot accomplish. SQL Server’s TSQL dialect contains similar procedural constructs that are available when creating user functions or stored procedures³.

What PostgreSQL does here is a little unique. Instead of restricting you to a single procedural language when defining stored procedures, it offers support for a variety of popular languages; out-of-the-box it supports Python, Perl, Tcl, and PL/pgSQL⁴, but there are optional modules for Java, R, PHP, Ruby, Scheme, and Unix shell. This means that you can construct your procedural logic in a syntax that you’re comfortable with, or that best lends itself to your task.

Now, I’m not a huge fan of putting tons of application logic into the database; but there are times when it just fits, and being able to do it in the same language as the actual application is takes a bit of mental overhead out of the equation.

Excellent clients

If I had to pick a database GUI toolset that was the most powerful yet easy to use, I’d have to concede that to SQL Server Management Studio. After that, PostgreSQL’s PGAdmin tool is pretty awesome and fairly complete when it comes to configuring and interacting with your server(s). To be fair, PostgreSQL is just a simpler database system⁵, and doesn’t need a complex configuration tool a la SSMS; I can’t think of too many things PostgreSQL is capable of that you can’t do in PGAdmin⁶. As a bonus for developers like me who run a non-Windows OS on the desktop, PGAdmin is available for most major platforms.

If you prefer a web-based client, there’s also phppgadmin, which is not quite as powerful but lets you get the most common jobs done in a browser.

Of course, if you’re old-school like me and prefer to dig into the command-line now and then, PostgreSQL is a winner here. The psql command-line client has nice features like readline support, piping, and tab completion that make it very comfortable to anyone who uses modern unixy operating systems. Contrast this to Oracle’s sqlplus client, which likes to barf out weird code sequences whenever you hit the arrow or tab keys and has no facility to edit the previously typed line; or with sqlcmd, which only runs in the Windows cmd shell and thus inherits its many limitations.

The Perfect database?

Is PostgreSQL the perfect database? Probably not, and I’m not holding my breath for the .NET ISVs of the world to suddenly embrace it. But if you’re a developer whose always eyed it with uncertainty since it lacked the hype of MySQL, the marketing dollars of SQL Server, or the sheer veneration of Oracle, it’s time to get past that and check it out.

Footnotes:

¹ I just gotta say, it’s nice to be able to sort IP addresses properly, so that for example 192.168.2.0 comes before 192.168.100.0.

² http://stackoverflow.com/questions/4644220/mysql-select-enum-values

³ SQL Server distinguishes between “user functions” and “stored procedures”; the difference isn’t important here, they’re just different methods of storing custom code in the database.

⁴ this is PostgreSQL’s take on PL/SQL, which is pretty similar to Oracle’s and TSQL’s

⁵ For example, it doesn’t have its own job scheduling or email systems, probably because it’s typically run on unix-like systems that normally have a perfectly good job scheduler and MTA built into the OS.

⁶ most of what I can think of is of interest to DBAs more than devepers, and this post is speaking from a developer perspective.

4 Thoughts on “Five reasons why I choose PostgreSQL”

Craig Skinner says:

November 6, 2013 at 9:58 am

After many years working as an SQL developer on Oracle, Sybase, DB2, MySQL & PostgreSQL, I am in confident agreement.

Personally, I hesitate to call MySQL a database at all. MS Access is more robust.

1. Alan says:
  
  November 6, 2013 at 11:29 pm
  
  Ouch! I wouldn’t go that far, but I do think it’s a shame that MySQL gets more hype and attention. PostgreSQL seems to be a hidden gem among databases, unfortunately.
  
trisky says:

January 23, 2014 at 7:18 am

Heh, Postgresql is the poor man ¿R?DBMS choice :). I doubt any serious client or project would choose it instead Oracle.

Anyways, Oracle’s playing in a whole another league.

1. Alan says:
  
  January 23, 2014 at 10:24 am
  
  Their loss, I guess.

Five reasons why I choose PostgreSQL