Linux-HA and Tivoli Storage Manager (Finito!)

As I previously said, I was writing my own OCF resource agent for IBM’s Tivoli Storage Manager Server. And I just finished it yesterday evening (it took me about two hours to write this post).

Trac revision log (shortened)
Trac revision log (shortened)

Only took me about four work days (that is roughly four hours each, which weren’t recorded in that subversion repository) plus most of this week at home (which is 10 hours a day) and about one hundred subversion revisions. The good part about it is, that it actually just works πŸ˜€ (I was amazed on how good actually). Now you’re gonna say, “but Christian, why didn’t you use the included Init-Script and just fix it up, so it is actually compilant to the LSB Standard ?”

The answer is rather simple: Yeah I could have done that, but you also know that wouldn’t have been fun. Life is all about learning, and learn something I did (even if I hit the head against the wall from time to time πŸ˜‰ during those few days) … There’s still one or two things I might want to add/change in the future (that is maybe next week), like

  • adding support for monitor depth by querying the dsmserv instance via dsmadmc (if you read through the resource agent, I already use it for the shutdown/pre-shutdown stuff)
  • I still have to properly test it (like Alan Robertson mentioned in his one hour thirty talk on Linux-HA 2.0 and on his slides, Page 100-102) in a pre-production environment
  • I’m probably configure the IBM RSA to act as a stonith device (shoot the other node in the head) – just for the case one of them ever gets stuck in a case, where the box is still up, but doesn’t react to any requests anymore

Read More

Bloody cluster solutions (continued)

So, as the previous try on getting the teamix people to fix the bloody LoadBalancer (as in sending at least an identification string for the SSH check) didn’t work so well (they told me, I should configure MASQuerading/ROUTEing on the PacketPro (which is kinda icky), I went on today and looked at what SLES10 installs as default logger.

Surprisingly they install a rather new syslog-ng (well, syslog-ng-1.6.8 is what they ship) so it was rather easy to workaround the situation.

Here’s what already was in the (more on that later):

which I just extended with the following:

Afterwards just a quick SuSEconfig -module syslog-ng, restart the syslog daemon and the messages were gonse. Sure I know it’s a rather ugly hack πŸ˜† , but since they refused to provide a “true” fix and it seemed like that question has been asked more than once it works for me, so *shrug* πŸ˜›

But now you’d ask why ? Simply because Novell figured it would be too easy to just invent things like CONFIG_PROTECT for RPM/YaST, so they placed yet another file in there; from which the syslog-ng.conf files is generated every time SuSEconfig is being executed (that’s like every time you install a package using YaST).

Bloody cluster solutions

In preparation to get our website (and all those other websites – like or clustered, someone bought the cluster version of the PacketPro 450. These things are nice, especially considering you don’t need to fiddle around with LVS yourself (which is a *real* pain in the ass).

The only problem I have currently with them is that they scan the database and web nodes every 30 seconds, and since we have an active node and a hot-standby both do this and producing this:

That’s only the logs from three minutes … now figure you have it running for like four days and figure what the average log size due to such crap is … But at least it looks solvable, though I gonna have to call them tomorrow and ask for a patch/update to get their ssh-scan to send some banner when performing the service check.