syslog-ng – BAFM

Linux-HA and Tivoli Storage Manager (Finito!)

October 5, 2008August 8, 2014 Christian 1 Comment

As I previously said, I was writing my own OCF resource agent for IBM’s Tivoli Storage Manager Server. And I just finished it yesterday evening (it took me about two hours to write this post).

Only took me about four work days (that is roughly four hours each, which weren’t recorded in that subversion repository) plus most of this week at home (which is 10 hours a day) and about one hundred subversion revisions. The good part about it is, that it actually just works 😀 (I was amazed on how good actually). Now you’re gonna say, “but Christian, why didn’t you use the included Init-Script and just fix it up, so it is actually compilant to the LSB Standard ?”

The answer is rather simple: Yeah I could have done that, but you also know that wouldn’t have been fun. Life is all about learning, and learn something I did (even if I hit the head against the wall from time to time 😉 during those few days) … There’s still one or two things I might want to add/change in the future (that is maybe next week), like

adding support for monitor depth by querying the dsmserv instance via dsmadmc (if you read through the resource agent, I already use it for the shutdown/pre-shutdown stuff)
I still have to properly test it (like Alan Robertson mentioned in his one hour thirty talk on Linux-HA 2.0 and on his slides, Page 100-102) in a pre-production environment
I’m probably configure the IBM RSA to act as a stonith device (shoot the other node in the head) – just for the case one of them ever gets stuck in a case, where the box is still up, but doesn’t react to any requests anymore

Bloody cluster solutions (continued)

July 12, 2007August 16, 2014 Christian 1 Comment

So, as the previous try on getting the teamix people to fix the bloody LoadBalancer (as in sending at least an identification string for the SSH check) didn’t work so well (they told me, I should configure MASQuerading/ROUTEing on the PacketPro (which is kinda icky), I went on today and looked at what SLES10 installs as default logger.

Surprisingly they install a rather new syslog-ng (well, syslog-ng-1.6.8 is what they ship) so it was rather easy to workaround the situation.

Here’s what already was in the syslog-ng.conf.in (more on that later):

  filter f_iptables { facility(kern) and match("IN=") and match("OUT="); };
  filter f_messages { not facility(news, mail) and not filter(f_iptables) };

1 2	filter f_iptables { facility(kern) and match("IN=") and match("OUT="); }; filter f_messages { not facility(news, mail) and not filter(f_iptables) };

which I just extended with the following:

  filter f_iptables { facility(kern) and match("IN=") and match("OUT="); };
  filter f_messages { not facility(news, mail) and not filter(f_iptables)
                      and not match ("Did not receive identification string from 172.16.(123|234)");
  };

filter f_iptables { facility(kern) and match("IN=") and match("OUT="); };

filter f_messages { not facility(news, mail) and not filter(f_iptables)

and not match ("Did not receive identification string from 172.16.(123|234)");

};

Afterwards just a quick SuSEconfig -module syslog-ng, restart the syslog daemon and the messages were gonse. Sure I know it’s a rather ugly hack 😆 , but since they refused to provide a “true” fix and it seemed like that question has been asked more than once it works for me, so *shrug* 😛

But now you’d ask why syslog-ng.conf.in ? Simply because Novell figured it would be too easy to just invent things like CONFIG_PROTECT for RPM/YaST, so they placed yet another file in there; from which the syslog-ng.conf files is generated every time SuSEconfig is being executed (that’s like every time you install a package using YaST).

Bloody cluster solutions

July 4, 2007June 21, 2013 Christian 1 Comment

In preparation to get our website (and all those other websites – like www.fh-neubrandenburg.de or www.hmt-rostock.de) clustered, someone bought the cluster version of the PacketPro 450. These things are nice, especially considering you don’t need to fiddle around with LVS yourself (which is a *real* pain in the ass).

The only problem I have currently with them is that they scan the database and web nodes every 30 seconds, and since we have an active node and a hot-standby both do this and producing this:

Jul  4 18:27:29 dbc-mysql1 sshd[7313]: Did not receive identification string from 172.16.234.11
Jul  4 18:27:30 dbc-mysql1 sshd[7350]: Did not receive identification string from 172.16.234.12
Jul  4 18:27:59 dbc-mysql1 sshd[7363]: Did not receive identification string from 172.16.234.11
Jul  4 18:28:01 dbc-mysql1 sshd[7364]: Did not receive identification string from 172.16.234.12
Jul  4 18:28:31 dbc-mysql1 sshd[7393]: Did not receive identification string from 172.16.234.11
Jul  4 18:28:33 dbc-mysql1 sshd[7394]: Did not receive identification string from 172.16.234.12
Jul  4 18:29:04 dbc-mysql1 sshd[7417]: Did not receive identification string from 172.16.234.11
Jul  4 18:29:05 dbc-mysql1 sshd[7418]: Did not receive identification string from 172.16.234.12
Jul  4 18:29:36 dbc-mysql1 sshd[7419]: Did not receive identification string from 172.16.234.11
Jul  4 18:29:37 dbc-mysql1 sshd[7420]: Did not receive identification string from 172.16.234.12
Jul  4 18:30:06 dbc-mysql1 sshd[7419]: Did not receive identification string from 172.16.234.11
Jul  4 18:30:07 dbc-mysql1 sshd[7420]: Did not receive identification string from 172.16.234.12

Jul 4 18:27:29 dbc-mysql1 sshd[7313]: Did not receive identification string from 172.16.234.11

Jul 4 18:27:30 dbc-mysql1 sshd[7350]: Did not receive identification string from 172.16.234.12

Jul 4 18:27:59 dbc-mysql1 sshd[7363]: Did not receive identification string from 172.16.234.11

Jul 4 18:28:01 dbc-mysql1 sshd[7364]: Did not receive identification string from 172.16.234.12

Jul 4 18:28:31 dbc-mysql1 sshd[7393]: Did not receive identification string from 172.16.234.11

Jul 4 18:28:33 dbc-mysql1 sshd[7394]: Did not receive identification string from 172.16.234.12

Jul 4 18:29:04 dbc-mysql1 sshd[7417]: Did not receive identification string from 172.16.234.11

Jul 4 18:29:05 dbc-mysql1 sshd[7418]: Did not receive identification string from 172.16.234.12

Jul 4 18:29:36 dbc-mysql1 sshd[7419]: Did not receive identification string from 172.16.234.11

Jul 4 18:29:37 dbc-mysql1 sshd[7420]: Did not receive identification string from 172.16.234.12

Jul 4 18:30:06 dbc-mysql1 sshd[7419]: Did not receive identification string from 172.16.234.11

Jul 4 18:30:07 dbc-mysql1 sshd[7420]: Did not receive identification string from 172.16.234.12

That’s only the logs from three minutes … now figure you have it running for like four days and figure what the average log size due to such crap is … But at least it looks solvable, though I gonna have to call them tomorrow and ask for a patch/update to get their ssh-scan to send some banner when performing the service check.