Why I think MySQL is a toy.

A commentor on my previous post asked why I think MySQL is a toy.

I've actually blogged about that a number of times, but when wanting to point that out, I found that most of those posts point out just one thing, rather than having one post that enumerates them all. So let's remedy that, shall we?

There are many things wrong with MySQL, including, but not limited to:

  • In my personal opinion, the whole storage engine thing is a bad idea. It leads to dilution of effort (the MySQL developers need to implement stuff more than once), and makes users choose between disjoint feature sets, which will almost always result in suboptimal results. It also neatly allows PR people to claim that MySQL is 'fast' and 'ACID compliant', without mentioning that you can't combine both.
  • One of MySQL's default settings is to truncate string values if they are longer than the field in which they're supposed to fit. This eats data. Hint: it's a database. While truncating strings might be appropriate in extreme corner cases, making it the default is so wrong it's not even funny.
  • One of MySQL's default settings is to use MyISAM as a storage engine, which doesn't do transactions. This means you don't have atomicity, which eats data. Hint: it's a database. While telling a user "if the system crashes, I might write your entire data set to disk, or I might write just half of it, or I might corrupt the data set and I'm not going to tell you until you try to read again" might be appropriate in extreme corner cases, making it the default is so wrong it's not even funny. Update: as many people pointed out to me in the comments, apparently one of the first things Oracle did when they took over MySQL is to change the default from MyISAM to InnoDB, making this point no longer valid. I still think the replaceable storage engine thing is a bad idea, but with InnoDB as default, it's not as much of a pain anymore.
  • If those two weren't enough 'mysql eats your data' arguments, note that most distributions routinely run a data recovery tool at MySQL startup, because not doing so caused problems in the past.
  • I've seen MySQL crash and burn and segfault reproducibly when it encountered a corrupt table. Now I'm not saying that the database should be able to read data as if nothing happened from a corrupt file, but it should not crash and burn and segfault; instead, it should produce an error message.
  • Fetching a result set can be done in two ways: either you call mysql_use_result() before calling mysql_fetch_row() which tells MySQL you'll be fetching the result one row at a time, or you call mysql_store_result(), which will read the entire result set into memory. So far so good. The problem, however, is that if you use mysql_use_result(), you're locking the tables from which you're fetching data, and no other client will be able to update any data in those tables until you're finished. On the other hand, if you need to process a large amount of data that can't be processed on the server for some reason, you may need to run a query that returns more data than you have memory. In that case, running mysql_store_result() is plain impossible. This isn't just a theoretical thing; I've seen cases in data warehouse situations where a database client needed to process multi-gigabyte query results. Trying this on MySQL is a pain.
  • When compared to PostgreSQL, the MySQL feature set is immature. For instance, here's a number of useful[1] features which PostgreSQL has but MySQL, to the best of my knowledge, does not (corrections are welcome):
    • table inheritance
    • asynchronous notification
    • full ACID compliance, in all cases. (MySQL only offers full ACID compliance if you pick a particular storage engine)
    • asynchronous command processing.
    • Very flexible authentication system, and real actual users. For instance, PostgreSQL supports Kerberos authentication, and understands that users may actually log in from different hosts (gasp!)
    • SELinux extensions, called SE-PostgreSQL.
    • Server-side languages are implemented using a plugin scheme, allowing stored procedures to be written bascially in any possible language. Someone wrote a PL/LOLCODE which, while not very useful, shows the flexibility; MySQL only supports one language for stored procedures and triggers—if the storage engine supports triggers, which not all do.
    • Sequences (an AUTO_INCREMENT column is a reasonable workaround, but still not a sequence)
    • window functions
    • extensible data type system; of those, MySQL only supports enumerated types.
  • Against that list, MySQL can only pit "multi-master clustering". While I'm sure that's useful for some use cases, I remain unconvinced that it's a useful enough feature to have to deal with the administrative overhead that MySQL's multi-master clustering imposes upon you, or that it is worth losing all the above over.

So it's my opinion that any database which fails to store data correctly in its default settings can't be anything but a toy; or that a database which has a comparatively small feature set can't be anything but a toy. But maybe that's just me.

[1] No, I haven't used all those features; but I have used asynchronous notification, sequences (other than for primary keys), kerberos auth, custom data types, and (obviously) I have enjoyed the extra peace of mind of knowing that my database is ACID compliant, meaning that it will either accept my transaction as a whole, or reject it as a whole (but usually the former). In addition, I've seen customers use the table inheritance feature.