Software RAID

Russell blogs about write-intent bitmaps, an option in the Linux Software RAID subsystem that works somewhat like a journal at the RAID level: every time you write to the array, the regions you're about to write to are first marked as dirty, then written, and then marked as clean again. After a system crash, the RAID subsystem then only has to check the regions that are still marked dirty, where normally it would have to run a full array rebuild.
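To make the idea a bit more concrete, here is a toy model in Python. It is only a sketch of the principle and not how the md driver implements it: the class name, the method names and the chunk size are all made up for illustration, and it ignores overlapping concurrent writes as well as the fact that the real bitmap has to be stored on disk to survive a crash.

```python
class ToyWriteIntentBitmap:
    """Toy model of a write-intent bitmap (hypothetical, not the md code)."""

    def __init__(self, size_blocks, chunk_blocks):
        # One flag per chunk of `chunk_blocks` blocks; round up for a partial last chunk.
        self.chunk_blocks = chunk_blocks
        n_chunks = -(-size_blocks // chunk_blocks)
        self.dirty = [False] * n_chunks

    def _chunks_for(self, start, length):
        # Indices of all chunks touched by the block range [start, start + length).
        first = start // self.chunk_blocks
        last = (start + length - 1) // self.chunk_blocks
        return range(first, last + 1)

    def begin_write(self, start, length):
        # Step 1: mark the affected chunks dirty before touching the data.
        for chunk in self._chunks_for(start, length):
            self.dirty[chunk] = True

    def end_write(self, start, length):
        # Step 3: the data has reached all disks, so mark the chunks clean again.
        for chunk in self._chunks_for(start, length):
            self.dirty[chunk] = False

    def chunks_to_resync(self):
        # After a crash, only chunks still marked dirty need to be checked.
        return [i for i, is_dirty in enumerate(self.dirty) if is_dirty]
```

If the system crashes between begin_write and end_write, chunks_to_resync returns just the chunks that were being written at the time, rather than every chunk in the array. The extra bookkeeping in steps 1 and 3 is where the continuous performance cost comes from.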

He'd suggested this before on the debian-boot mailing list, and when I read that post, it seemed to make sense. However, after reading his blog post, I'm not so sure anymore. In his words:

The down-side to this feature is that it will slightly reduce performance. But when comparing the possibility of a few percent performance loss all the time and the possibility of a massive performance loss for an hour or two after a crash it seems that losing a few percent all the time is almost always the desired option.

I vehemently disagree there. A few percent of performance is irrelevant if you have a large server park; in that case, adding another server to the park is relatively cheap (you don't run hundreds of servers on a small budget), and besides, in these days of virtualization, migrating a service from one physical server to another often isn't very hard.

However, this isn't true when you're talking about small businesses, or (especially) home servers. When I have a choice between a high loss of performance in a situation which occurs only rarely, or a small but continuous performance loss on my home server, there is no doubt in my mind that I will choose the former. (In my experience, the Software RAID subsystem is pretty good at recovering from a power loss without having to go through the RAID rebuild, which leaves only kernel crashes and similar.)

First, my home server is a Thecus N2100, which, while powerful enough to run a number of services for my home network, is not a very fast system and has somewhat low resources in comparison to some other systems; even a small loss of performance is probably noticeable.

Second, the speed at which the RAID subsystem rebuilds an array (and, hence, the performance impact of a rebuild) is manageable through the files /proc/sys/dev/raid/speed_limit_max and /proc/sys/dev/raid/speed_limit_min. Obviously, lowering the speed of the RAID rebuild will make the process take longer; but if performance matters that much to you, then lowering the rebuild speed can be an option.

Finally, sometimes the RAID subsystem chooses to go through a lengthy check of the entire array; it would be interesting to know whether using the write-intent bitmap feature disables this too. I suspect this is not the case, and if so, it would seem that enabling this feature costs some performance for little benefit: in normal situations these checks happen far more often than actual RAID rebuilds, so the most important source of performance loss would remain, and you'd pay an extra performance loss on top of it.
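For what it's worth, here is a small Python sketch of how the two speed limit files mentioned above could be read and changed. The function names are my own invention, and I'm assuming the values are in KiB per second per device, which is how I understand the kernel documentation; writing to these files needs root.

```python
from pathlib import Path

# Directory holding the md speed limit files named in the text above.
RAID_SYSCTL_DIR = Path("/proc/sys/dev/raid")


def read_speed_limits(base=RAID_SYSCTL_DIR):
    """Return the current limits as a dict (assumed unit: KiB/s per device)."""
    names = ("speed_limit_min", "speed_limit_max")
    return {name: int((Path(base) / name).read_text().strip()) for name in names}


def set_speed_limit_max(kib_per_sec, base=RAID_SYSCTL_DIR):
    """Cap the rebuild speed; a lower cap means less impact but a longer rebuild."""
    (Path(base) / "speed_limit_max").write_text(f"{kib_per_sec}\n")


def estimate_rebuild_hours(device_size_kib, kib_per_sec):
    """Rough lower bound on rebuild time if the rebuild runs at exactly this speed."""
    return device_size_kib / kib_per_sec / 3600
```

Calling read_speed_limits() shows what the system currently uses, and set_speed_limit_max(5000) would throttle a rebuild to roughly 5 MB/s per device. The estimate function only illustrates the trade-off: halving the speed doubles the time, and a real rebuild will rarely run at exactly the configured limit.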

In closing, I guess the right answer to this question is that it's a trade-off: choosing the right defaults should be done by upstream (to avoid confusion), and the user should be given the possibility (somehow) of enabling or disabling this option in defiance of the defaults at install time (perhaps only in d-i's expert mode).