Nagios: Watching Clustered environments (the other way)

Well, recently I stepped up to watch our cluster environments … Michael has a good howto on how to watch Windows Cluster environments in the NSclient++ wiki.

Now, this has it’s own perks … Which I stumbled upon when trying to write a Linux-HA OCF resource agent for the Nagios NRPE server. Combining that Linux-HA with SLES10 is a good thing generally, but using startproc in that resource agent is not such a good idea.

Apparently Novell (or SuSE GmbH) thought it might be wise to include some additional logic into the wrapper. startproc, checkproc and killproc do check for the name of the executable. So if you try to start an additional process with the same name, you need to dig a bit deeper.

For this to work, you need two additional things (quotations directly from man 8 startproc):

-p pid_file
(Former option -f changed due to the LSB specification.) Use an alternate pid file instead of the default (/var/run/<basename>.pid). The pid read from this file is being matched against the pid of running processes that have an executable with specified path of the program. In order to avoid confusion with stale pid files, a not up-to-date pid will be ignored.

Now, then apparently this isn’t enough. startproc is still refusing to start a second process.

-i ignore_file
The pid found in this file is used as session id of the same binary program which should be ignored by startproc.

Read More

Linux-HA: Creating a random authkey

I just looked over the slides of a presentation one of my trainees bought back from Chemnitz, and there was this nifty one-line command, with which you can generate a random sha1sum for your authkeys file.

Now, since I’m a bit lazy here’s the full command line to fill /etc/ha.d/authkeys for you:

TSM client: Backing up files with umlauts on SLES

In the past, I always had problems with SLES and our Tivoli Storage Manager client’s when backing up files with german umlauts. Well, today I looked a bit harder, and quite quickly found a solution.

As you can see from the above, SLES9/10 ain’t setting LANG or LC_ALL (which I searched for first), but is setting LC_CTYPE.

So, simply changing the LC_CTYPE in the init-script and/or prepending the dsmc command line with a new LC_CTYPE fixes my umlauts problems!

Well, I had a long’ish talk with one of my trustworthy IBM senior consultants the day after writing this …

He told me something along the lines of this:

If you would like to back up files with names containing characters with a code > 127 please ensure that you have chosen a SBCS character set for your locale. The default code page C or the code page POSIX supports characters up to 127 only. Files whose names contain special characters will be skipped if C or POSIX is used. It is strongly recommended to perform a system backup by using a SBCS character set to prevent any file or directory from being skipped. This behavior for different locales is intended.

And this:

The UTF-8 locale is default on some Linux platforms. However, TSM Client currently does not support running under UTF-8 locales (such as en_US.UTF-8 and ja_JP.UTF-8). Export your LANG and LC_ALL environment variables to the iso8859-1 or EUC versions of your locale and then start a new xterm (or mlterm) session prior to running TSM Client.

That basically means, at least for using the TSM Client Java Interface (dsmj) and the scheduler/client acceptor daemon you have to switch your locales to something _not_ UTF-8 capable.

He also mentioned, that IBM doesn’t have a real solution for this problem, as well that there is no real workaround. You need to invest some time into figuring out the “right” locale setting for your system(s), since after writing the above I came to the result that it ain’t enough ..

You need to do the following:

After doing so, the scheduler and the command-line client works …