Adapter bonding on Linux

Well, today I had a rather weird error. I was testing the adapter bonding on one of the boxen designated as Tivoli Storage Manager servers, when I noticed that the bonding wasn’t working as expected when simulating an error (that is, unplugging one of the TP cables of the bond).

Now, the bond had “mode=6 miimon=100” as options. After running “linux bond debug” through Google (which turned up nothing useful, besides one document on the Oracle Wiki about IOS/Linux adapter teaming), I figured “Hey, let’s just test switching the arguments.” And guess what ?

Afterwards, it just works when you unplug one of the cables of the bond, while it didn’t work before … *shrug*
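For repeating that cable-pull test, here is a rough Python sketch that reads the bonding driver's status text and reports the link state of each slave. It assumes the bond is called bond0 (the name isn't given above) and that the status file lives under /proc/net/bonding/, so treat both as assumptions rather than a description of the actual box.

```python
from pathlib import Path

# Assumption: the bond is named bond0; adjust to match your setup.
BOND_STATUS_FILE = Path("/proc/net/bonding/bond0")


def slave_link_states(status_text):
    """Map each slave interface to its MII status (e.g. 'up' or 'down')."""
    states = {}
    current_slave = None
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current_slave = value
        elif key == "MII Status" and current_slave is not None:
            # The first "MII Status" line belongs to the bond itself and
            # is skipped because no slave has been seen yet.
            states[current_slave] = value
    return states


if __name__ == "__main__":
    if BOND_STATUS_FILE.exists():
        report = slave_link_states(BOND_STATUS_FILE.read_text())
        for name, state in report.items():
            print(f"{name}: {state}")
    else:
        print(f"{BOND_STATUS_FILE} not found; is the bonding module loaded?")
```

Run it once with both cables plugged in and once with one pulled; the unplugged slave should flip to "down" within roughly the miimon interval if link monitoring is doing its job.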

Windows Server 2003 Terminal services

Well, just when you think you don’t have any more problems, another one pops up. I’m currently bashing my head against the wall over why the hell the forwarded (or is it redirected ?) drives are not shown in the “My Computer” explorer view. I’m pretty sure I have an idea why (basically, HKEY_CURRENT_USER\Software\Classes isn’t writeable, but that’s where Windows, or rather Terminal Services, or whatever it is that creates the associations, wants to write them), I just don’t know a clever way around it.

It’s basically a dead end. The user has no access to that particular subkey, and I can’t change the permissions by changing it in ntuser.dat apparently. Neither do the inherited permissions apply, so I’m basically stuck. 🙁

Linux-HA and Tivoli Storage Manager (Finito!)

As I previously said, I was writing my own OCF resource agent for IBM’s Tivoli Storage Manager Server. And I just finished it yesterday evening (it took me about two hours to write this post).

Trac revision log (shortened)

Only took me about four work days (that is roughly four hours each, which weren’t recorded in that subversion repository) plus most of this week at home (which is 10 hours a day) and about one hundred subversion revisions. The good part about it is that it actually just works 😀 (I was amazed at how well, actually). Now you’re gonna say, “but Christian, why didn’t you use the included Init-Script and just fix it up, so it is actually compliant with the LSB Standard ?”

The answer is rather simple: Yeah I could have done that, but you also know that wouldn’t have been fun. Life is all about learning, and learn something I did (even if I hit my head against the wall from time to time 😉 during those few days) … There are still one or two things I might want to add/change in the future (that is maybe next week), like

  • adding support for monitor depth by querying the dsmserv instance via dsmadmc (if you read through the resource agent, I already use it for the shutdown/pre-shutdown stuff)
  • I still have to properly test it (like Alan Robertson mentioned in his one hour thirty talk on Linux-HA 2.0 and on his slides, pages 100-102) in a pre-production environment
  • I’ll probably configure the IBM RSA to act as a stonith device (shoot the other node in the head) – just in case one of them ever gets stuck in a state where the box is still up, but doesn’t react to any requests anymore


Setting up Linux-HA

Well, initially I thought writing the OCF resource agent for Tivoli Storage Manager was the hard part. But as it turns out, it really ain’t. The hard part is getting the resources into the heartbeat agent (or whatever you wanna call it). The worst part about it is that the hb_gui is completely worthless if you want to do a configuration without quorum.

First of all, we need to set up the main Linux-HA configuration file (/etc/ha.d/ha.cf). Configuring that is rather simple. For me, as I do have two network devices over which both nodes see each other (one is an adapter bond comprising two simple, plain, old 1G copper ports; the other is the 1G fibre cluster port), the configuration looks like this:

After configuring the service itself is done, one just needs to start the heartbeat daemon on both nodes. Afterwards, we should be able to configure the cluster resources.

I find it a lot easier to just update the corresponding sections with cibadmin (the man-page really has some good examples). So here are my configuration files for two resource groups (crm_mon doesn’t differentiate between resources and grouped resources, it’ll just show you that you configured two resources).

Subversion via HTTP(s) and mod_rewrite

Well, I just finished my wild-goose chase with Apache and subversion regarding a rather weird error. I recently reinstalled our subversion box, and ever since then I was unable to commit anything new to any of the repositories.
Subversion told me this:

Apache didn’t say much about it either, besides this particular line:

Today I sat down and thought really hard, what exactly was different from before.

  1. Installed Trac instead of Redmine, but that can’t have anything to do with the error
  2. Configured URL rewriting …

As it turns out, the following RewriteRule was the cause:

After changing the RewriteRule (as shown below, compare the difference yourself 😛 ), it works just like a charm.

Hint to self: whenever encountering HTTP 302 in conjunction with Subversion, check the RewriteRules ❗

Linux-HA and Tivoli Storage Manager

Well, since we received part of our shipment on Wednesday, I finally looked at how we’re gonna deploy our active/active Tivoli Storage Manager configuration. Right now, we do have a single pSeries box hosting ~100 client nodes which we’re looking to split in two (since we do have two x366 for that purpose now).
Now, as there ain’t no solution for this scenario yet (neither from International Business Machines nor from anyone in the open source community), I sat down and started writing an OCF resource agent for dsmserv (that is the Tivoli Storage Manager server).
At first I had a bit of trouble adjusting to how stupid/non-standard dsmserv is, but after reading through the Storage Manager Installation handbook (on multiple installations on a single server) and through some people’s notes on multiple deployments of Tivoli Storage Manager on the same server, I think I managed to get my head around it.
I still think the resource agent lacks some real testing (I put a two node cluster online on Tuesday, but that is non-productive), but that’ll happen soon.

As you can see, I reworked the “stop” phase to first terminate all running processes and then dismount all tapes in order to avoid data corruption (that was advice from our friendly IBM systems engineer); if that fails, try terminating it by a “friendly” kill (SIGTERM); and if that ain’t helping, kill it the “Die Hard Way”™ (SIGKILL).
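The last two steps of that escalation (SIGTERM first, SIGKILL only if the process refuses to go away) can be sketched in a few lines of Python. This is only an illustration of the pattern, not the resource agent itself: the function name and the grace period are made up for the example, and the clean shutdown through the administrative client that comes first is left out entirely.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=10.0):
    """Stop a child process: SIGTERM first, SIGKILL if it does not exit.

    Returns "SIGTERM" or "SIGKILL" depending on which step was needed.
    """
    proc.terminate()  # sends SIGTERM on POSIX systems
    try:
        proc.wait(timeout=grace_seconds)
        return "SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX systems
        proc.wait()  # reap the process so no zombie is left behind
        return "SIGKILL"


if __name__ == "__main__":
    # A sleeping Python child stands in for the real server process.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("stopped via", stop_process(child, grace_seconds=5.0))
```

The grace period is the knob that matters: too short and a server that is still flushing its state gets killed hard, too long and the cluster waits forever on a hung node.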

Defragmenting all fragmented MyISAM tables

I just had another look at what I wrote the week before last (you know, being home-sick/on vacation has its advantages) and additionally read up on “OPTIMIZE TABLE” again. The comments in the manual mention “SHOW TABLE STATUS“, which gives you a complete list, but it doesn’t allow you to filter certain kinds of things out (like I only wanted to see MyISAM tables in the list, and I only wanted database and table).

So I went ahead and looked around in MySQL’s own databases and if you look closely at information_schema, it’s got a list of all databases/tables with an additional pointer to whether or not a table is fragmented, the column Data_free. I only found this because I looked at how the mysqltuner figures out whether or not you have fragmented tables.

So, without further ado, here’s the final script I’m gonna torture for the next week:

I know it ain’t completely bullet proof and it sure as hell isn’t neat, but I think it does the job. Also, if you don’t want to paste it, here’s the file download.
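For illustration, the information_schema lookup described above can be sketched in Python. This is not the script from this post, just the idea behind it: the query picks MyISAM tables with a non-zero DATA_FREE, and a helper turns the resulting rows into OPTIMIZE TABLE statements. The function names and the sample rows are made up, and actually running the query needs a MySQL connection, which is deliberately left out.

```python
# Query against MySQL's information_schema: MyISAM tables with unused
# space inside the data file (DATA_FREE > 0) are treated as fragmented.
FRAGMENTED_MYISAM_QUERY = (
    "SELECT TABLE_SCHEMA, TABLE_NAME "
    "FROM information_schema.TABLES "
    "WHERE ENGINE = 'MyISAM' AND DATA_FREE > 0"
)


def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def optimize_statements(rows):
    """Turn (schema, table) rows into OPTIMIZE TABLE statements."""
    return [
        f"OPTIMIZE TABLE {quote_identifier(schema)}.{quote_identifier(table)};"
        for schema, table in rows
    ]


if __name__ == "__main__":
    # Hypothetical sample rows standing in for the query result.
    sample_rows = [("blog", "wp_posts"), ("trac", "ticket")]
    for statement in optimize_statements(sample_rows):
        print(statement)
```

Keep in mind that OPTIMIZE TABLE locks a MyISAM table while it runs, so the generated statements are better fed to the server during a quiet hour.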

High Definition Multimedia Interface

Well, I bought myself a new TV last week (and it was finally delivered on Monday), which came with an HDM Interface. I also bought myself a DVD player some time between New Year’s Eve and now (I don’t really remember when, I guess I could look at the invoice, but I’m lazy) with an HDM Interface.

Up till now, I had a SCART cable to interface the DVD player with my TV (the old one was really old, didn’t have any other interface). So when my brother said he was gonna go shopping, I told him he should bring me an HDMI cable. And he did that without any whining or complaining. He only said I owe him 40 EUR …

At that point, I basically went WTF … Forty euros for a FUCKING 2 meter cable ?! Even if it’s completely molded out of gold, it’s still a rip-off.

What’s up, dude ?

Some people might ask, “Gosh, what’s up with you ?” .. Well I’ve been incredibly busy with work the last month (sheeesh, another month already passed by). We finally finished the public tender; gonna get some of the components next week (that is the library extension, the LTO4 drives and cartridges, two brand new Cisco MDS9134 and some Fibre channel hard disks for the existing DS4700). After work, I made my usual visits to the gym and well, once I was home, it was some relaxing, kicking up my feet and doing nothing (as in zip).

I’ve visited Munich in late August for a job interview (though they handed me the “Dude, you’re good. But you only finished second” answer by now), had a long weekend for a change (as in five days), which I used to visit my aunt and my cousin in Stuttgart.

Spent some more time figuring out TYPO3 stuff for work, made a terrible mistake (copied a vHost configuration and changed only the ServerName, left the ServerAlias‘es in place). I’m currently writing a review (as well as a Nagios impacted howto) for the MessPC Ethernetbox, hopefully once I’m back at work I’ll be able to finish it.

Right now, I’m home sick (yup, again) with a cold (or something similar, still having paranasal occlusion). Well, I guess I’m gonna be fine next week (at least I hope so …). Cheerio.

Custom macros in host definitions

Well, I was playing with the hostgroup inheritance earlier. One problem with that is, if you define a duplicate service, Nagios is really unpredictable or rather inconsistent about which definition wins. Now, as Thomas Guyot-Sionnest told me, I should try custom macros for the check definition. So what I did was the following:

templates/host-windows.cfg

hostgroups/windows.cfg

hosts/terminal1.cfg

As you can see, the default RDP port is 3389 (as defined in the host template), but for some systems you might want to “change” the port (for example, if you’re having a Citrix farm and you changed the RDP port to something else and still want to be able to check whether or not the RDP service is active). That’s why the check uses the macro and a single host redefines it, which gives you a bit more flexibility.
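To make the override behaviour a bit more tangible, here is a toy model of it in Python. It is not how Nagios is implemented, only a sketch of the two rules at play: a value set on the host beats the one inherited from the template, and a custom host variable named _RDP_PORT is referenced in a command as $_HOSTRDP_PORT$. The variable name, the overridden port 3390 and the command line are assumptions made for the example.

```python
import re

# Hypothetical stand-ins for the host template and a host that overrides it.
HOST_TEMPLATE = {"_RDP_PORT": "3389"}
TERMINAL1 = {"host_name": "terminal1", "_RDP_PORT": "3390"}  # port is made up


def effective_vars(template, host):
    """Host-level values win over values inherited from the template."""
    merged = dict(template)
    merged.update(host)
    return merged


def expand_host_macros(command, host_vars):
    """Replace $_HOST<NAME>$ with the host's custom variable _<NAME>."""
    def lookup(match):
        return host_vars["_" + match.group(1)]

    return re.sub(r"\$_HOST([A-Z0-9_]+)\$", lookup, command)


if __name__ == "__main__":
    command = "check_tcp -p $_HOSTRDP_PORT$"
    plain_host = {"host_name": "web1"}  # hypothetical host without an override
    print(expand_host_macros(command, effective_vars(HOST_TEMPLATE, plain_host)))
    print(expand_host_macros(command, effective_vars(HOST_TEMPLATE, TERMINAL1)))
```

The point of the design is that the service check is defined exactly once, so there is no duplicate service for Nagios to trip over; only the data attached to the host differs.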