Thu, 08 Sep 2011
A commentor on my
previous post asked why I think MySQL is a toy.
I've actually blogged about that a number of times, but when
wanting to point that out, I found that most of those posts point out
just one thing, rather than having one post that enumerates them all. So
let's remedy that, shall we?
There are many things wrong with MySQL, including, but not limited
to:
- In my personal opinion, the whole storage engine thing is a bad
idea. It leads to dilution of effort (the MySQL developers need to
implement stuff more than once), and makes users choose between disjoint
feature sets, which will almost always result in suboptimal results. It
also neatly allows PR people to claim that MySQL is 'fast' and 'ACID
compliant', without mentioning that you can't combine both.
- One of MySQL's default settings is to truncate string values if they
are longer than the field in which they're supposed to fit. This eats
data. Hint: it's a database. While truncating strings might be
appropriate in extreme corner cases, making it the default is so wrong
it's not even funny.
- One of MySQL's default settings is to use MyISAM as a storage
engine, which doesn't do transactions. This means you don't have
atomicity, which eats data. Hint: it's a database. While telling a user
"if the system crashes, I might write your entire data set to disk, or I
might write just half of it, or I might corrupt the data set and I'm not
going to tell you until you try to read again" might be
appropriate in extreme corner cases, making it the default is so wrong
it's not even funny. Update: as many people pointed
out to me in the comments, apparently one of the first things Oracle did
when they took over MySQL is to change the default from MyISAM to
InnoDB, making this point no longer valid. I still think the replaceable
storage engine thing is a bad idea, but with InnoDB as default, it's not
as much of a pain anymore.
- If those two weren't enough 'mysql eats your data' arguments, note
that most distributions routinely run a data recovery tool at MySQL
startup, because not doing so caused problems in the past.
- I've seen MySQL crash and burn and segfault
reproducibly when it encountered a corrupt table. Now I'm not saying
that the database should be able to read data as if nothing happened
from a corrupt file, but it should not crash and burn and
segfault; instead, it should produce an error message.
- Fetching a result set can be done in two ways: either you call
mysql_use_result() before calling mysql_fetch_row()
which tells MySQL you'll be fetching the result one row at a time, or
you call mysql_store_result(), which will read the entire
result set into memory. So far so good. The problem, however, is that if
you use mysql_use_result(), you're locking the tables from
which you're fetching data, and no other client will be able to update
any data in those tables until you're finished. On the other hand, if
you need to process a large amount of data that can't be processed on
the server for some reason, you may need to run a query that returns
more data than you have memory. In that case, running
mysql_store_result() is plain impossible. This isn't just a
theoretical thing; I've seen cases in data warehouse situations where a
database client needed to process multi-gigabyte query results. Trying
this on MySQL is a pain.
- When compared to PostgreSQL, the MySQL feature set is immature. For
instance, here's a number of useful[1] features which PostgreSQL has but
MySQL, to the best of my knowledge, does not (corrections are welcome):
- table
inheritance
- asynchronous
notification
- full ACID
compliance, in all cases. (MySQL only offers full ACID
compliance if you pick a particular storage engine)
- asynchronous
command processing.
- Very flexible authentication system, and real actual users. For
instance, PostgreSQL supports Kerberos authentication, and understands
that users may actually log in from different hosts (gasp!)
- SELinux extensions, called SE-PostgreSQL.
- Server-side languages are implemented using a plugin scheme,
allowing stored procedures to be written bascially in any possible
language. Someone wrote a PL/LOLCODE which,
while not very useful, shows the flexibility; MySQL only supports one
language for stored procedures and triggers—if the storage engine
supports triggers, which not all do.
- Sequences
(an AUTO_INCREMENT column is a reasonable workaround, but still not a
sequence)
- window
functions
- extensible
data type system; of those, MySQL only supports enumerated
types.
Against that list, MySQL can only pit "multi-master clustering".
While I'm sure that's useful for some use cases, I remain unconvinced
that it's a useful enough feature to have to deal with the
administrative overhead that MySQL's multi-master clustering imposes
upon you, or that it is worth losing all the above over.
So it's my opinion that any database which fails to store data
correctly in its default settings can't be anything but a toy; or that a
database which has a comparatively small feature set can't be anything
but a toy. But maybe that's just me.
[1] No, I haven't used all those features; but I have used
asynchronous notification, sequences (other than for primary keys),
kerberos auth, custom data types, and (obviously) I have enjoyed the
extra peace of mind of knowing that my database is ACID compliant,
meaning that it will either accept my transaction as a whole, or reject
it as a whole (but usually the former). In addition, I've seen customers
use the table inheritance feature.
/en/computer/cluebat
PermaLink