diff6

There are six servers. They're meant to serve the same purpose and are mostly identical, but not entirely. That's largely because of how they were maintained in the past: mostly with cssh, and with that approach it's inevitable that differences creep in eventually.

So we're in the process of migrating them to puppet now. One of the things I need to do along the way is figure out what the differences are, and get rid of them where applicable. Differences in the installed packages, say: some of these servers have extra things installed that others don't, for reasons that aren't clear anymore.

So how does one figure out what the difference is? If this were two servers, I'd create a list of installed packages and use 'diff' on them. If there were three, I'd use 'diff3' instead. But if there are 6?

Turns out that isn't too hard. First, of course, you need a list of the packages installed on each individual host. Both rpm and dpkg can produce one with a simple command line invocation. Note that comm expects its input to be sorted, so sort the lists before comparing them. Assuming that's done, and the output is stored in files called installed-host1 through installed-host6, the procedure is:
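For the "list of packages" step, something along these lines should do. It's only a sketch: the function name list_packages is made up for this post, and printing bare package names (no versions) is my assumption about what you want to compare. The sort is forced to the C locale so that lists made on hosts with different locale settings still line up; comm should then be run with LC_ALL=C as well.

```shell
# Hypothetical helper: print one installed package name per line, sorted.
list_packages() {
    if command -v dpkg-query >/dev/null 2>&1; then
        # Note: this also lists packages that were removed but not purged.
        dpkg-query -W -f='${Package}\n' | LC_ALL=C sort -u
    elif command -v rpm >/dev/null 2>&1; then
        rpm -qa --qf '%{NAME}\n' | LC_ALL=C sort -u
    else
        echo "neither dpkg-query nor rpm found" >&2
        return 1
    fi
}
```

Running "list_packages > installed-host1" on the first host, and so on for the others, gives you files in the shape the rest of this post assumes.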

$ comm -12 installed-host1 installed-host2 > installed-common
$ for i in $(seq 3 6)
> do
>   comm -12 installed-common installed-host$i > installed-tmp
>   mv installed-tmp installed-common
> done

And there, you now have a file "installed-common" containing everything which is installed everywhere. If you use "comm" or "diff" to compare that file against the installed-hostX files, you can easily see what the difference is.
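To see what each host has on top of the common set, comm -13 does the job: it prints only the lines that are unique to the second file. A minimal sketch, with show_extras being a name I made up, and assuming all files are sorted the same way:

```shell
# Hypothetical helper: for each host list, print the packages that are
# not in the common list.
show_extras() {
    common=$1
    shift
    for f in "$@"
    do
        echo "== $f =="
        # -1 drops lines only in the common file, -3 drops lines in both,
        # leaving the lines found only in the host file.
        comm -13 "$common" "$f"
    done
}
```

Called as "show_extras installed-common installed-host?", it prints one short section per host.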

Granted, that doesn't give you the overview in just one file, but usually that's not really necessary. The 'installed-common' file contains the intersection of all six lists (the lowest common denominator, if you like), which is really all you need as a baseline.
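If you do want a rough one-file overview, counting on how many hosts each package shows up is one way to get it. This is a different technique from the comm loop, and only a sketch: count_hosts is a made-up name, and it assumes each list names a package at most once.

```shell
# Hypothetical helper: print "count package" for every package that is
# missing from at least one of the given host lists.
count_hosts() {
    total=$#
    cat "$@" | LC_ALL=C sort | uniq -c |
        awk -v total="$total" '$1 < total { print $1, $2 }'
}
```

Called as "count_hosts installed-host?", it lists only the packages that aren't installed everywhere, along with the number of hosts that have them. It doesn't tell you which hosts those are; for that you still need comm.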