Munin is a great tool. If you can script it, you can monitor it with munin. Unfortunately, however, munin is slow; that is, it will take snapshots once every five minutes, and not look at systems in between. If you have a short load spike that takes just a few seconds, chances are pretty high munin missed it. It also comes with a great webinterfacefrontendthing that allows you to dig deep in the history of what you've been monitoring.
By the time munin tells you that your Kerberos KDCs are all down, you've probably had each of your users call you several times to tell you that they can't log in. You could use nagios or one of its brethren, but it takes about a minute before such tools will notice these things, too.
Maybe use CollectD then? Rather than check once every several minutes, CollectD will collect information every few seconds. Unfortunately, however, due to the performance requirements to accomplish that (without causing undue server load), writing scripts for CollectD is not as easy as it is for Munin. In addition, webinterfacefrontendthings aren't really part of the CollectD code (there are several, but most that I've looked at are lacking in some respect), so usually if you're using CollectD, you're missing out on something.
And collectd doesn't do the nagios thing of actually telling you when things go down.
So what if you could see it when things go bad?
At one customer, I came in contact with Frank, who wrote ExtreMon, an amazing tool that allows you to visualize the CollectD output as things are happening, in a full-screen, fully customizable visualization of the data. The problem is that ExtreMon is rather... complex to set up. When I tried to talk Frank into helping me get things set up for myself so I could play with it, I got a reply along the lines of...
well, extremon requires a lot of work right now... I really want to fix foo and bar and quux before I start documenting things. Oh, and there's also that part which is a dead end, really. Ask me in a few months?
which is fair enough (I can't argue with some things being suboptimal), but the code exists, and (as I can see every day at $CUSTOMER) actually works. So I decided to just figure it out by myself. After all, it's free software, so if it doesn't work I can just read the censored code.
As the manual explains, ExtreMon is a plugin-based system; plugins can add information to the "coven", read information from it, or both. A typical setup will run several of them; e.g., you'd have the from_collectd plugin (which parses the binary network protocol used by collectd) to get raw data into the coven; you'd run several aggregator plugins (which take that raw data and interpret it, allowing you to express things along the lines of "if the system's load gets above X, set load.status to warning"); and you'd run at least one output plugin so that you can actually see the damn data somewhere.
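To make the aggregator idea a bit more concrete, here is a rough Python sketch of the rule quoted above. It is only an illustration of the logic, not ExtreMon's actual plugin interface: the function names, the thresholds, the label layout and every status string other than "warning" are assumptions made up for this example.

```python
def load_status(load, warn_above=4.0, crit_above=8.0):
    """Map a load value onto a status string (thresholds are made up)."""
    if load > crit_above:
        return "critical"
    if load > warn_above:
        return "warning"
    return "ok"


def aggregate(record):
    """Derive <prefix>.status labels from raw <prefix>.shortterm values.

    `record` is a plain dict of label -> numeric value; the label layout
    used here is hypothetical.
    """
    suffix = ".shortterm"
    derived = {}
    for label, value in record.items():
        if label.endswith(suffix):
            prefix = label[: -len(suffix)]
            derived[prefix + ".status"] = load_status(value)
    return derived


if __name__ == "__main__":
    print(aggregate({"com.example.myhost.load.shortterm": 5.0}))
```

A real aggregator would read its records from the coven and write the derived labels back; that wiring is deliberately left out here.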
While setting up ExtreMon as is isn't as easy as one would like, I did manage to get it to work. Here's what I had to do.
You will need:
- A monitor with a FullHD (or better) resolution. Currently, the display frontend of ExtreMon assumes it has a FullHD display at all times, even if you have a lower resolution. Or a higher one.
- Python3
- OpenJDK 6 (or better)
First, we clone the ExtreMon git repository:
git clone https://github.com/m4rienf/ExtreMon.git extremon
cd extremon
There's a README there which explains the bare necessities of getting the coven to work. Read it. Do what it says. It's not wrong. It's not entirely complete, though; it fails to mention that you need to
- install CollectD (which is required for its types.db)
- Configure CollectD to have a line like
Hostname "com.example.myhost"
rather than the (usual) FQDNLookup true. This is because extremon uses the java-style reverse hostname, rather than the internet-style FQDN.
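If you're unsure what the reversed name should look like for your machine, the conversion is simple enough to sketch in Python. The function name reverse_fqdn is invented for this example; it is not part of ExtreMon or CollectD.

```python
def reverse_fqdn(fqdn):
    """Turn an internet-style FQDN into the java-style reversed form."""
    # Drop surrounding whitespace and an optional trailing root dot,
    # then reverse the order of the dot-separated labels.
    labels = fqdn.strip().rstrip(".").split(".")
    return ".".join(reversed(labels))


if __name__ == "__main__":
    print(reverse_fqdn("myhost.example.com"))
```

The result is what goes into the Hostname line of the CollectD configuration.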
Make sure the dump.py script outputs something from collectd. You'll know when it shows something not containing "plugin" or "plugins" in the name. If it doesn't, fiddle with the #x3. lines at the top of the from_collectd file until it does. Note that ExtreMon uses inotify to detect whether a plugin has been added to or modified in its plugins directory; so you don't need to do anything special when updating things.
Next, we build the java libraries (which we'll need for the display thing later on):
cd java/extremon
mvn install
cd ../client/
mvn install
This will download half the Internet, build some java sources, and drop the precompiled .jar files in your $HOME/.m2/repository.
We'll now build the display frontend. This is maintained in a separate repository:
cd ../..
git clone https://github.com/m4rienf/ExtreMon-Display.git display
cd display
mvn install
This will download the other half of the Internet, and then fail, because Frank forgot to add a few repositories. Patch (and pull request) on github.
With that patch, it will build, but things will still fail when trying to sign a .jar file. I know of four ways to fix that particular problem:
- Add your passphrase for your java keystore, in cleartext, to the pom.xml file. This is a terrible idea.
- Pass your passphrase to maven, in cleartext, by using some command line flags. This is not much better.
- Ensure you use the maven-jarsigner-plugin 1.3.something or above, and figure out how the maven encrypted passphrase store thing works. I failed at that.
- Give up on trying to have maven sign your jar file, and do it manually. It's not that hard, after all.
If you're going with 1 through 3, you're on your own. For the last option, however, here's what you do. First, you need a key:
keytool -genkeypair -alias extremontest
After you enter all the information that keytool will ask for, it will generate a self-signed code signing certificate called extremontest, valid for 90 days (keytool's default when no -validity option is given). Producing a code signing certificate with longer validity and/or one which is signed by an actual CA is left as an exercise for the reader.
Now, we will sign the .jar file:
jarsigner target/extremon-console-1.0-SNAPSHOT.jar extremontest
There. Who needs help from the internet to sign a .jar file? Well, apart from this blog post, of course.
You will now want to copy your freshly-signed .jar file to a location served by HTTPS. Yes, HTTPS, not HTTP; ExtreMon-Display will fail on plain HTTP sites.
Download this SVG file, and open it in an editor. Find all references to be.grep as well as those to barbershop and replace them with your own prefix and hostname. Store it along with the .jar file in a useful directory.
Download this JNLP file, and store it in the same location (or you might want to actually open it with "javaws" to see the very basic animated idleness of my system). Open it in an editor, and replace any references to barbershop.grep.be by the location where you've stored your signed .jar file.
Add the chalice_in_http plugin from the plugins directory. Make sure to configure it correctly (by way of its first few comment lines) so that its input and output filters are set up right.
Add the configuration snippet in section 2.1.3 of the manual (or something functionally equivalent) to your webserver's configuration. Make sure to have authentication; chalice_in_http is an input mechanism.
Add the chalice_out_http plugin from the plugins directory. Make sure to configure it correctly (by way of its first few comment lines) so that its input and output filters are set up right.
Add the configuration snippet in section 2.2.1 of the manual (or something functionally equivalent) to your webserver's configuration. Authentication isn't strictly required for the output plugin, but you might wish for it anyway if you care whether the whole internet can see your monitoring.
Now run javaws https://url/x3console.jnlp to start ExtreMon-Display.
At this point, I got stuck for several hours. Whenever I tried to run x3mon, this java webstart thing would tell me simply that things failed. When clicking on the "Details" button, I would find an error message along the lines of "Could not connect (name must not be null)". It would appear that the Java people believe this to be a proper error message for a fairly large number of constraints, all of which are slightly related to TLS connectivity. No, it's not the keystore. No, it's not an API issue, either. Or any of the loads of other rabbit holes that I dug myself into.
Instead, you should simply make sure you have Server Name Indication enabled. If you don't, the defaults in Java will cause it to refuse to even try to talk to your webserver.
The ExtreMon github repository comes with a bunch of extra plugins; some are special-case for the place where I first learned about it (and should therefore probably be considered "examples"), others are general-purpose plugins which implement things like "is the system load within reasonable limits". Be sure to check them out.
Note also that while you'll probably be getting most of your data from CollectD, you don't actually need to do that; you can write your own plugins, completely bypassing collectd. Indeed, the from_collectd thing we talked about earlier is, simply, also a plugin. At $CUSTOMER, for instance, we have one plugin which simply downloads a file every so often and checks it against a checksum, to verify that a particular piece of nonlinear software hasn't gone astray yet again. That doesn't need collectd.
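The core of such a checksum plugin is small. The following Python sketch shows only the comparison step, under a few assumptions: SHA-256 is picked as the checksum (the setup at $CUSTOMER may well use something else), the download is left out (something like urllib.request.urlopen would do), and the function names and the 0/1 status convention are invented for this example. How the result is fed into the coven is not shown.

```python
import hashlib


def checksum_matches(data, expected_sha256):
    """Return True if the SHA-256 digest of `data` equals the expected hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected_sha256.strip().lower()


def status_value(data, expected_sha256):
    """Hypothetical convention: 0 means the file is intact, 1 means it has changed."""
    return 0 if checksum_matches(data, expected_sha256) else 1


if __name__ == "__main__":
    payload = b"contents of the downloaded file"
    known_good = hashlib.sha256(payload).hexdigest()
    print(status_value(payload, known_good))
    print(status_value(payload + b"tampered", known_good))
```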
The example above will get you a small white bar, the width of which is defined by the cpu "idle" statistic, as reported by CollectD. You probably want more. The manual (chapter 4, specifically) explains how to do that.
Unfortunately, in order for things to work right, you need to pretty much manually create an SVG file with a fairly strict structure. This is the one thing which Frank tells me is a dead end and needs to be pretty much rewritten. If you don't feel like spending several days manually drawing a schematic representation of your network, you probably want to wait until Frank's finished. If you don't mind, or if you're like me and you're impatient, you'll be happy to know that you can use inkscape to make the SVG file. You'll just have to use the dialog behind ctrl+shift+X. A lot.
Once you've done that though, you can see when your server is down. Like, now. Before your customers call you.
I've owned a Logitech Wingman Gamepad Extreme since pretty much forever, and although it's been battered over the years, it's still mostly functional. As a gamepad, it has 10 buttons. What's special about it, though, is that the device also has a mode in which a gravity sensor kicks in and produces two extra axes, allowing me to pretend I'm really talking to a joystick. It looks a bit weird though, since you end up playing your games by wobbling the gamepad around a bit.
About 10 years ago, I first learned how to write GObjects by writing a GObject-based joystick API. Unfortunately, I lost the code at some point due to an overzealous rm -rf call. I had planned to rewrite it, but that never really happened.
About a year back, I needed to write a user interface for a customer where a joystick would be a major part of the interaction. The code there was written in Qt, so I wrote an event-based joystick API in Qt. As it happened, I also noticed that jstest would output names for the actual buttons and axes; I had never noticed this, because due to my 10 buttons and 4 axes, which by default produce a lot of output, the jstest program would just scroll the names off my screen whenever I plugged it in. But the names are there, and it's not too difficult.
Refreshing my memory on the joystick API made me remember how much fun it is, and I wrote the beginnings of what I (at the time) called "libgjs", for "Gobject JoyStick". I didn't really finish it though, until today. I did notice in the mean time that someone else released GObject bindings for javascript and also called that gjs, so in the interest of avoiding confusion I decided to rename my library to libjoy. Not only will this allow me all kinds of interesting puns like "today I am releasing more joy", it also makes for a more compact API (compare joy_stick_open() against gjs_joystick_open()).
The library also comes with a libjoy-gtk that creates a GtkListStore* which is automatically updated as joysticks are added to and removed from the system; and a joytest program, a graphical joystick test program which also serves as an example of how to use the API.
Still TODO:
- Clean up the API a bit. There's a bit too much use of GError in there.
- Improve the UI. I suck at interface design. Patches are welcome.
- Differentiate between JS_EVENT_INIT kernel-level events, and normal events.
- Improve the documentation to the extent that gtk-doc (and, thus, GObject-Introspection) will work.
What's there is functional, though.
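For those unfamiliar with the JS_EVENT_INIT item: when a joystick device is opened, the Linux joystick interface first sends synthetic events describing the current state of every button and axis, flagged with JS_EVENT_INIT in the event type. The sketch below decodes one raw event in Python and makes that distinction explicit. It is not libjoy code (libjoy is C with GObject); the constants and the struct layout follow linux/joystick.h, while the function name and the shape of the returned dict are invented for this example.

```python
import struct

# Event type bits from linux/joystick.h
JS_EVENT_BUTTON = 0x01
JS_EVENT_AXIS = 0x02
JS_EVENT_INIT = 0x80

# struct js_event { __u32 time; __s16 value; __u8 type; __u8 number; }
JS_EVENT_STRUCT = struct.Struct("IhBB")


def decode_js_event(raw):
    """Decode one 8-byte js_event and flag synthetic init events."""
    time_ms, value, ev_type, number = JS_EVENT_STRUCT.unpack(raw)
    kinds = {JS_EVENT_BUTTON: "button", JS_EVENT_AXIS: "axis"}
    return {
        "time_ms": time_ms,
        "value": value,
        "number": number,
        "is_init": bool(ev_type & JS_EVENT_INIT),
        "kind": kinds.get(ev_type & ~JS_EVENT_INIT, "unknown"),
    }


if __name__ == "__main__":
    # A fabricated init event for button 3, as a device might send on open.
    raw = JS_EVENT_STRUCT.pack(1000, 1, JS_EVENT_BUTTON | JS_EVENT_INIT, 3)
    print(decode_js_event(raw))
```

On a real system the raw bytes would come from reading JS_EVENT_STRUCT.size bytes at a time from a device node such as /dev/input/js0.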
Update: if you're going to talk about code, it's usually a good idea to link to said code. Thanks, Emanuele, for pointing that out.
After yesterday's late night accomplishments, today I fixed up the UI of joytest a bit. It's still not quite what I think it should look like, but at least it's actually usable with a 27-axis, 19-button "joystick" (read: a PS3 controller). Things may disappear off the edge of the window, but you can scroll towards it. Also, I removed the names of the buttons and axes from the window, and installed them as tooltips instead. Few people will be interested in the factoid that "button 1" is a "BaseBtn4", anyway.
The result now looks like this:
If you plug in a new joystick, or remove one from the system, then as soon as udev finishes up creating the necessary device node, joytest will show the joystick (by name) in the treeview to the left. Clicking on a joystick will show that joystick's data to the right. When one pushes a button, the relevant checkbox will be selected; and when one moves an axis, the numbers will start changing.
I really should have some widget to actually show the axis position, rather than some boring numbers. Not sure how to do that.
I just watched a CCC talk in which the speaker claims Perl is horribly broken. Watching it was fairly annoying however, since I had to restrain myself from throwing things at the screen.
If you're going to complain about the language, better make sure you actually understand the language first. I won't deny that there are a few weird constructions in there, but hey. The talk boils down to a claim that perl is horrible, because the list "data type" is "broken".
First of all, Netanel, in Perl, lists are not arrays. Yes, that's confusing if you haven't done more than a few hours of Perl, but hear me out. In Perl, a list is an enumeration of values. A variable with an '@' sigil is an array; a construct consisting of an opening bracket ('(') followed by a number of comma- or arrow-separated values (',' or '=>'), followed by a closing bracket, is a list. Whenever you assign more than one value to an array or a hash, you need to use a list to enumerate the values. Subroutines in perl also use lists as arguments or return values. Yes, that last bit may have been a mistake.
Perl has a concept of "scalar context" and "list context". A scalar context is what a sub is in when you assign the return value of your sub to a scalar; a list context is when you assign the return value of your sub to an array or a hash, or when you use the list construct (the thing with brackets and commas) with sub calls (instead of hardcoded values or variables) as the individual values. This works as follows:
sub magic {
    if (wantarray()) {
        # called in list context
        print "You're asking for a list!\n";
        return ('a', 'b', 'c');
    } else {
        # called in scalar context
        print "You're asking for a scalar!\n";
        return 'a';
    }
}
print ("list: ", magic(), "\n");
print "scalar: " . magic() . "\n";
The above example will produce the following output:
You're asking for a list!
list: abc
You're asking for a scalar!
scalar: a
What happens here? The first print line creates a list (because things are separated by commas); the second one does not (the '.' is perl's string concatenation operator; as you can only concatenate scalars, the result is that you call the magic() sub in scalar context).
Yes, seeing as how arrays are not lists, the name of the wantarray() sub is horribly chosen. Anyway.
It is documented that lists cannot be nested. Lists can only be one-dimensional constructs. If you create a list, and add another list as an element (or something that can be converted to a list, like an array or a hash), then the result is that you get a flattened list. If you don't want a flattened list, you need to use a reference instead. A reference is a scalar value that, very much like a pointer in C, contains a reference to another variable. This other variable can be an array, a hash, or a scalar. But it cannot be a list, because it must be a variable -- and lists cannot be variables.
If you need to create multi-dimensional constructs, you need to use references. Taking a reference is done by prepending a backslash to whatever it is you're trying to take a reference of; or, in the case of arrays or hashes, one can create an anonymous array or hash with [] or {} respectively. E.g., if you want to add a non-flattened array to a list, you instead create a reference to an array, like so:
$arrayref = [ 'this', 'is', 'an', 'anonymous', 'array'];
You can now create a multi-dimensional construct:
@multiarray = ('elem1', $arrayref);
Or you can do that in one go:
@multiarray = ('elem1', [ 'this', 'is', 'an', 'anonymous', 'array']);
Alternatively, you can create a non-anonymous array first:
@onedimarray = ('this', 'is', 'not', 'an', 'anonymous', 'array');
@multiarray = ('elem1', \@onedimarray);
In perl, curly brackets can be used to create a reference to anonymous hashes, whereas square brackets can be used to create a reference to anonymous arrays. This is all a basic part of the language; if you don't understand that, you simply don't understand Perl. In other words, whenever you see someone doing this:
%hash = {'a' => 'b'};
or
@array = [ '1', '2' ];
you can say that they don't understand the language. For reference, the assignment to %hash will result in an (unusable) hash with a single key that is the stringified form of a reference to an anonymous hash (which cannot be accessed anymore) and a value of undef; the assignment to @array will result in a two-dimensional array with one element in the first dimension, and two elements in the second.
The CGI.pm fix which Netanel dismisses in the Q&A part of the talk as a "warning" which won't help (because it would be too late) is actually a proper fix, which should warn people in all cases. That is, if you do this:
%hash = ( 'name' => $name, 'password' => $cgi->param('password') );
then CGI.pm's param() sub will notice that it's being called in list context, and issue a warning -- regardless of whether the user is passing one or two password query-parameters. It uses the wantarray() sub, and produces a warning if that returns true.
In short, Perl is not the horribly broken construct that Netanel claims it to be. Yes, there are a few surprises (most of which exist for historical reasons), and yes, those should be fixed. This is why the Perl community has redone much of perl for Perl 6. But the fact that there are a few surprises doesn't mean the whole language is broken. There are surprises in most languages; that is a fact of life.
Yes, the difference between arrays and hashes on the one hand, and lists on the other hand, is fairly confusing; it took me a while to understand this. But once you get the hang of it, it's not all that difficult. And then these two issues that Netanel found (which I suppose could be described as bugs in the core modules) aren't all that surprising anymore.
So, in short:
- Don't stop using Perl. However, do make sure that whenever you use a language, you understand the language first, so you don't get bitten by its historical baggage. This is true for any language, not just Perl.
- Don't assume that just because you found issues with core modules, the whole language is suddenly broken.
What I do agree with is that if you want to use a language, you should understand its features. Unfortunately, this single line in the final slide of Netanel's talk is just about the only thing in the whole talk that sort of made sense to me.
Ah well.