Why I think MySQL is a toy.
A commentor on my previous post asked why I think MySQL is a toy.
I've actually blogged about that a number of times, but when wanting to point that out, I found that most of those posts point out just one thing, rather than having one post that enumerates them all. So let's remedy that, shall we?
There are many things wrong with MySQL, including, but not limited to:
- In my personal opinion, the whole storage engine thing is a bad idea. It leads to dilution of effort (the MySQL developers need to implement stuff more than once), and makes users choose between disjoint feature sets, which will almost always result in suboptimal results. It also neatly allows PR people to claim that MySQL is 'fast' and 'ACID compliant', without mentioning that you can't combine both.
- One of MySQL's default settings is to truncate string values if they are longer than the field in which they're supposed to fit. This eats data. Hint: it's a database. While truncating strings might be appropriate in extreme corner cases, making it the default is so wrong it's not even funny.
- One of MySQL's default settings is to use MyISAM as a storage engine, which doesn't do transactions. This means you don't have atomicity, which eats data. Hint: it's a database. While telling a user "if the system crashes, I might write your entire data set to disk, or I might write just half of it, or I might corrupt the data set and I'm not going to tell you until you try to read again" might be appropriate in extreme corner cases, making it the default is so wrong it's not even funny. Update: as many people pointed out to me in the comments, apparently one of the first things Oracle did when they took over MySQL is to change the default from MyISAM to InnoDB, making this point no longer valid. I still think the replaceable storage engine thing is a bad idea, but with InnoDB as default, it's not as much of a pain anymore.
- If those two weren't enough 'mysql eats your data' arguments, note that most distributions routinely run a data recovery tool at MySQL startup, because not doing so caused problems in the past.
- I've seen MySQL crash and burn and segfault reproducibly when it encountered a corrupt table. Now I'm not saying that the database should be able to read data as if nothing happened from a corrupt file, but it should not crash and burn and segfault; instead, it should produce an error message.
- Fetching a result set can be done in two ways: either you call mysql_use_result() before calling mysql_fetch_row() which tells MySQL you'll be fetching the result one row at a time, or you call mysql_store_result(), which will read the entire result set into memory. So far so good. The problem, however, is that if you use mysql_use_result(), you're locking the tables from which you're fetching data, and no other client will be able to update any data in those tables until you're finished. On the other hand, if you need to process a large amount of data that can't be processed on the server for some reason, you may need to run a query that returns more data than you have memory. In that case, running mysql_store_result() is plain impossible. This isn't just a theoretical thing; I've seen cases in data warehouse situations where a database client needed to process multi-gigabyte query results. Trying this on MySQL is a pain.
- When compared to PostgreSQL, the MySQL feature set is immature. For
instance, here's a number of useful[1] features which PostgreSQL has but
MySQL, to the best of my knowledge, does not (corrections are welcome):
- table inheritance
- asynchronous notification
- full ACID compliance, in all cases. (MySQL only offers full ACID compliance if you pick a particular storage engine)
- asynchronous command processing.
- Very flexible authentication system, and real actual users. For instance, PostgreSQL supports Kerberos authentication, and understands that users may actually log in from different hosts (gasp!)
- SELinux extensions, called SE-PostgreSQL.
- Server-side languages are implemented using a plugin scheme, allowing stored procedures to be written bascially in any possible language. Someone wrote a PL/LOLCODE which, while not very useful, shows the flexibility; MySQL only supports one language for stored procedures and triggers—if the storage engine supports triggers, which not all do.
- Sequences (an AUTO_INCREMENT column is a reasonable workaround, but still not a sequence)
- window functions
- extensible data type system; of those, MySQL only supports enumerated types.
Against that list, MySQL can only pit "multi-master clustering". While I'm sure that's useful for some use cases, I remain unconvinced that it's a useful enough feature to have to deal with the administrative overhead that MySQL's multi-master clustering imposes upon you, or that it is worth losing all the above over.
So it's my opinion that any database which fails to store data correctly in its default settings can't be anything but a toy; or that a database which has a comparatively small feature set can't be anything but a toy. But maybe that's just me.
[1] No, I haven't used all those features; but I have used asynchronous notification, sequences (other than for primary keys), kerberos auth, custom data types, and (obviously) I have enjoyed the extra peace of mind of knowing that my database is ACID compliant, meaning that it will either accept my transaction as a whole, or reject it as a whole (but usually the former). In addition, I've seen customers use the table inheritance feature.
And then there's the MySQL geospatial fun, which is quite literally broken beyond repair. It's really quite lovely, you can do all sorts of magic, such as "do these two sets of coordinates overlap?". Only it doesn't work, as MySQL uses "minimal bounding rectangles" instead of the actual shape of your coordinate set. It's plain wrong and broken. There are many outstanding bugs against the geospatial features, and they're all being ignored because it's "too much work" to fix them. "Maybe in MySQL 6 or 7 or after hell freezes over.".
The geospatial extensions in postgres (PostGIS) work like a charm. No cutting corners, no incorrect results. It Just Works(tm).
InnoDB is the default storage in MySQL since 5.5.
http://dev.mysql.com/doc/refman/5.5/en/storage-engine-setting.html
From MySQL 5.5 on InnoDB ist the default engine.
http://dev.mysql.com/doc/refman/5.5/en/innodb-default-se.html
All good points, I have some additions though.
This isn't true anymore since 5.5 (or 5.1, don't remember exact version anymore), InnoDB is now the default.
I would have agreed about the multi-master administration woes a couple months ago, but Codership has nice product called Galera that make it's a lot easier. No having to deal with the increment offsets per server, it does it for you. Only down side I see at the moment is that it forces people to use InnoDB if you want the replication.
http://codership.com/products/mysql_galera
BTW: Can you do something about the background of your site, It's really hard to read stuff on it.
MySQL permissions scheme is indeed a toy.
There isn't notion of ownership. Writing an application that manages database in a MySQL server without giving complete control of your server to the user running the application is impossible. To allow an unprivileged user to reset users passwords (a reasonably common task) the documentation says you need to give update privileges on the mysql administrative database to the user.
Well, MySQL is not the best choice for every occasion. That said, I use it a lot and it does the job well. I guess I use MariaDB these days but it is the same for the purposes of this kind of debate.
Opinions vary but I like the storage engine concept. For one thing, it allows me to drop something like this in if I need to:
http://www.tokutek.com/products/tokudb-for-mysql/
MySQL is both fast and ACID compliant in it's default configuration as others have pointed out (with InnoDB as the storage engine). I really, really wish I would stop reading new posts from people that have not figured this out. Lack of ACID compliance was a serious and valid complaint for a long time but not for a while now.
MySQL has many problems and for some workloads they might even matter in the real world. I suspect that most of the MySQL critics are not running into them though.
So, while I might be running SAP on PostgreSQL, I am not about to stop using MySQL for my web apps (even the big ones). Then again, I am low-brow enough even to run SQLite sometimes (like on phones) and thumbdrive based apps.
A while ago a different frustrated author listed his gripes with MySQL: http://angryhacker.com/blog/archive/0001/01/01/mysql-eye-for-an-microsoft-sql-server-guy.aspx
I'm a long time postgres user and I love every aspect of it. Coming from Microsoft SQL server, the only thing that Postgres lacks is 'merge replication' that MSSQL has native, which is excellent. But I do use Mysql where I need high availability setup. Setting HA with postgres is PITA - you are either stuck with postgres streaming replication, and then invent-the-wheel for failover (the best thing today is to use Pacemaker, but that is also PITA for it self), or use something like PgPool which is also unreliable and complicated to setup. With MySQL you can set up master-master replication with ease, you can target either server and make sure your data will be replicated (there are drawbacks, but if you plan it well, you won't have any issues). InnoDB is a must, also 'strict mode' is a must, and you, well, don't write stored procedures, because those suck in mysql. You don't use triggers anyway, those are just sign of a poor database design. Until PostgreSQL team offers easy-to-set-up HA solution, MySQL (as toyish as it is) will be around, because, there really aren't any options.
IMO table inheritance brings in more complexity to the developers as once has to remember about all the children when querying the parent. Views in MySQL can basically do what inheritance does in your example.
To your list, I would add: - columns can't be defaulted to a builtin function (e.g. in Oracle you declare a column "create_datetime date default sysdate not null", in MySQL you... can't) - BEFORE triggers do not process before not-null constraints are applied, so you can't, for example, have a modify_datetime column be both trigger-populated and not-null - there is an arbitrarily small size limit on unique indexes, leaving you unable to properly enforce uniqueness on long varchar fields
IIRC, all of these have outstanding bugs logged, and all of those bugs have been open for years. Some of them have amusing missing-the-point comments from MySQL apologists.