Using NBD to get rid of an LVM installation.
LVM can be a virtue. However, it can also be a pain at times; for instance, there's loss of disk space due to LVM's metadata, which needs to be stored as well. This amounts to approximately 5% of the disk space (if I'm not mistaken), which can be quite a bit. Also, LVM is an extra abstraction layer, which requires memory and CPU time to process. This will be negligible on most modern computers, but sometimes it isn't.
I had a system with quite a few things on LVM, and wanted to get rid of them. Unfortunately, removing the LVM physical volumes would mean getting rid of the data, which would mean storing it somewhere else first, which in turn would probably mean that I wouldn't be able to use the system for quite a long time. That wasn't acceptable.
So, instead, what I did was to export a whole hard disk (one with enough space to hold the logical volumes on that machine) using NBD, and migrate stuff that way:
- First, set up the NBD export. The config file should look something like this:
[generic]
user = nbd
group = disk

[export]
exportname = /dev/hde
port = 1234
This will export all of /dev/hde on port 1234. Obviously you wouldn't want to do this on an untrusted network, but that's always true with NBD (in my case, the link was a 1m crossover cable -- quite safe).
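To make the server pick this configuration up, restart it. Assuming the Debian packaging, which keeps the configuration in /etc/nbd-server/config, either of these should work (the -C option simply points nbd-server at a config file):

/etc/init.d/nbd-server restart
nbd-server -C /etc/nbd-server/config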
- Then, on the client, connect to the export:

nbd-client <server> 1234 /dev/nbd0
... assuming /dev/nbd0 is not in use yet. Provided the client machine is running kernel 2.6.26 or newer and nbd-client 2.9.10 or later (e.g., Debian lenny or later), and you're running udev, this will create a number of /dev/nbd0pX device nodes, with X starting at 1, one for each partition on the disk. Of course, one can also export a specific partition rather than the whole disk, or just one file on a filesystem, or create a physical volume of the entire disk rather than just a partition -- your choice.
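As a quick sanity check, you can verify that the kernel actually noticed the partitions, either by listing the device nodes or by reading the partition table straight off the NBD device:

ls -l /dev/nbd0p*
fdisk -l /dev/nbd0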
- Next, turn one of those partitions into a physical volume, and add it to the existing volume group:

pvcreate /dev/nbd0p2
vgextend /dev/<vg> /dev/nbd0p2
If you now run 'vgdisplay', you'll see that the size of the volume group has increased considerably.
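A more compact way to check is pvs, which lists every physical volume along with its size and free space (the exact set of available columns may differ between LVM2 versions):

pvs -o pv_name,vg_name,pv_size,pv_free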
- Now, move the data off the local hard disk and onto the NBD device:

pvmove -v /dev/sda7
This will show progress information every 15 seconds while the data is being moved from the local hard disk to the NBD device. You'll be able to continue using the system as usual--although of course the system will gradually get slower and slower as more and more data is only available through a comparatively slow network connection.
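Conveniently, pvmove records its progress in the LVM metadata, so an interrupted move isn't fatal; running pvmove without arguments should resume it, and there's an escape hatch if you change your mind:

pvmove
pvmove --abort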
- Finally, once the move is complete, remove the (now empty) local physical volume from the volume group, and repartition the local disk:

vgreduce -a /dev/<vg>
cfdisk /dev/sda
...
This final step might not be as simple as it sounds; e.g., if your /home filesystem is currently on LVM and you want to move it out of LVM, you'll have to make sure only root is logged on; if /var is on LVM, too, you may have to go to single-user mode; etc. Figuring out the details is left as an exercise for the reader, but the sketch below shows the general idea.
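For a filesystem that can simply be unmounted, the procedure would look roughly like this; all the device names and the filesystem type here are only examples:

umount /home
mkfs.ext3 /dev/sda2                  # the freshly created plain partition
mount -o ro /dev/<vg>/home /mnt     # the LV, by now living on the NBD disk
mount /dev/sda2 /home
cp -a /mnt/. /home/
umount /mnt

After that, update /etc/fstab to point /home at the new partition, and lvremove the old logical volume.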
However, I will note that root-on-NBD has been possible since somewhere between etch and lenny; and if you're running unstable, then root-on-LVM-on-NBD should be possible, too, although I would suggest a little caution before trying that.
In any case, using NBD in this way just saved me a whole lot of no-computer-time here. Which is great.
I think you're wrong about the LVM metadata size. AFAIK it's not even relative to the volume size; it's something like a couple of extents that are reserved.
Are you perhaps confusing it with the root-reserved space (tune2fs -m) on ext filesystems?
I might be wrong. However, in any case, LVM needs to store something that is relative to the number of extents; at the very least, it will need to record somewhere that 'this extent is assigned to this LV'. That does take up quite some disk space, and I've seen it somewhere between 5 and 10% of the total disk space.
But, hey, I didn't look at the LVM source, and it's not the main point of my post, so...
All the on-disk metadata[0][1] is stored in the metadata header of each PV. This header is fixed in size when the PV is created and is usually much smaller than a single extent[2][3]. Each PV in a VG has a copy of the metadata for the entire VG.
[0] Excluding the backups made by userspace in /etc/lvm.
[1] The in-memory metadata may be considerable, especially for snapshots.
[2] Typical defaults are 192512 bytes of metadata (following a 4096 byte fixed-size header), with the rest of the volume available for data. The typical default for extent size is 4M, but this is a property of the VG, not the PV.
[3] The metadata size can be set with the --metadatasize option to pvcreate; doing so is advisable on a striped RAID volume, or if you expect to create many LVs or snapshots.
On my laptop, the partition containing my only PV has 303,998,123,008 bytes (a little over 283GB). The LVM VG contains 72478 extents of 4MB each for a total of 303,994,765,312 usable bytes; I make that just over 3MB of overhead, or around 0.001% (so I suspect it's O(1) rather than O(size)).
Inside that VG, lvdisplay reports that my main filesystem occupies exactly 200GB (51200 extents), and blockdev(8) reports that the block device exposed for the filesystem is indeed exactly that size.
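Reproducing this kind of measurement is easy enough: compare the raw size of the underlying partition with what the VG exposes (a sketch; the device name is an example, and the exact field names depend on the LVM2 version):

blockdev --getsize64 /dev/sda2
vgs -o vg_name,vg_extent_count,vg_extent_size,vg_size
pvs -o pv_name,pe_start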
So I think you're wrong about the space cost of LVM...
-- smcv.pseudorandom.co.uk
I doubt it.
First, I presume you meant to say RAID1 rather than RAID0. The latter does just striping, no mirroring, so you probably won't be able to use that.
Second, RAID1 also needs some metadata, so you need to 'reformat' a partition as an MD device, on which you can then store your actual data. It's not possible to take a 'regular' partition and turn it into a RAID1 member in place.
At least not without tricks like using LVM; but since this was about getting rid of LVM, that's hardly helpful. The usual workaround is sketched below.
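For completeness: the standard MD approach is to create the array in degraded mode on a fresh partition, copy the data over, and only then add the original partition as the second mirror. Roughly like this, with all device names illustrative:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 missing
mkfs.ext3 /dev/md0
# ... copy the data from the old partition onto /dev/md0 ...
mdadm --add /dev/md0 /dev/sda1

The old partition's contents are overwritten as the mirror resyncs, so make sure the copy is complete first.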