Monitoring the IBM BladeCenter chassis with Nagios

Today I ended up working out the details on what we want to monitor regarding our BladeCenter. The most interesting details (for us that is) are these:

  • Fan speeds for Chassis Cooling/Power Module Cooling Bay(s)
  • Temperature
  • Power Domain utilization

It wasn’t *that* hard to implement. The only troubles I ran into were: (1) IBM did a really shitty job with the MIBs. If you look closely at mmblade.mib, you’ll notice that not a single OID is specified for the events. (2) As the MIBs weren’t documented anywhere, I had to look the OIDs up via snmpwalk (which I had never used before). So as a reminder (to myself), here’s how it is done:

This gets you a list with a lot of output (5154 lines, to be exact). Luckily, the management module’s web interface and SSH interface are rather verbose, so all you need to do is compare the values shown there with the snmpwalk output until you find what you are looking for.
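Searching 5154 lines by eye gets old quickly, so the comparison can be scripted. The following is a minimal Python sketch, assuming the walk was saved with numeric OIDs (`snmpwalk -On`); the sample lines, OIDs and values are made-up placeholders, not real BladeCenter output.

```python
import re

# Matches one line of `snmpwalk -On` output, e.g.
#   .1.3.6.1.2.1.1.5.0 = STRING: "mm01"
LINE_RE = re.compile(
    r'^(?P<oid>\.[0-9.]+) = (?:(?P<type>[A-Za-z0-9-]+): )?(?P<value>.*)$'
)


def parse_walk(text):
    """Return a list of (oid, type, value) tuples from snmpwalk -On output."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            value = match.group("value").strip().strip('"')
            rows.append((match.group("oid"), match.group("type"), value))
    return rows


def find_value(rows, needle):
    """Return the OIDs whose value contains `needle` (case-insensitive)."""
    needle = needle.lower()
    return [oid for oid, _type, value in rows if needle in value.lower()]


# Placeholder sample: these OIDs and values are invented for illustration.
SAMPLE = '''\
.1.3.6.1.4.1.99999.1.1.0 = STRING: "54% of maximum"
.1.3.6.1.4.1.99999.1.2.0 = STRING: "23.50 Centigrade"
.1.3.6.1.4.1.99999.1.3.0 = INTEGER: 3
'''

if __name__ == "__main__":
    # Look for a value as displayed in the web interface.
    for oid in find_value(parse_walk(SAMPLE), "Centigrade"):
        print(oid)
```

Feed it the text you see in the web interface (a temperature, a fan percentage) and it hands back the candidate OIDs to put into the Nagios check.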

So, for myself (and anyone interested), read on for the list of checks we are currently running on the management module.

VMware vCenter: is not connected

Well, today I once again had the case where a virtual machine (in my case a Virtual Machine Template) was kinda stuck. You couldn’t remove the template (the “Remove from inventory” entry was grayed out) and you couldn’t re-add the Virtual Machine’s VMX from the datastore browser either.

VI Client - Disconnected templates

Simply putting the host into maintenance mode and rebooting it fixed the problem, though. Maybe there is a simpler solution for this; I just don’t know about it.

Thanks to Sven in comment #1, I now know the simple solution to my problem!

Half a minute and a heart-stopping moment later (all VMs on that host turn grey after the first update), the VMs are accessible again. Thanks again to Sven!

Installing SLES10 via network with no DHCP available

In our ongoing fight against the BladeCenter switches, we’re currently facing the problem that the blades can’t send or receive DHCP traffic.

So in order to move forward, we had to use static IP addresses. And since SLES10 isn’t straightforward about that, I had to look it up. Now, here’s for me (and everyone else tired of searching) how to do it:

Setting up the BladeCenter H

Well, we finally had our maintenance window today, in which we planned the hardware exchange for our current Dell Blade Chassis (don’t ask!). The exchange went fine, but as we started exploring the components (like the IBM BladeCenter SAN switches, which are in fact Cisco MDS 9100s) we hit a few roadblocks.

First, the default user name/password combo for the Cisco MDS 9100 for the BladeCenter is USERID/PASSW0RD (just like the rest of the default credentials).

Next, we started tinkering around with the Catalyst Switch modules. A hint to myself:

Whenever setting up the switch via the WebGUI, make sure you set up both passwords. For the switch itself (when prompted by the WebGUI), enter “admin” as well as the password you just entered.

Now you should be able to connect to the switch via telnet and access EXEC mode (unlike me, who struggled for ~30 minutes until one of my trainees suggested, out of curiosity, entering a switch password).

Now, here is the list of commands I needed to set up the switch’s “basics”:

Updating path information for TSM

As I did some switching today (between the new lin_tape version by IBM and our own lin_tape version), I ended up writing those lines a dozen times. Here is (just for me; if you don’t care, skip ahead) how to generate the list of commands:

which should get you a list like this:

Sidenote: Number of Slots per Virtual Tape Library

Well, I just stumbled upon this again today (and I don’t know right now whether or not this is documented in a Redbook), so I thought I’d write it down.

Slot-Amount Property of a Virtual Tape Library

Please keep in mind, when creating the virtual library, to think hard about the number of slots you might need. It isn’t that bad, you just can’t decrease the number afterwards. So if you think about creating 50 different virtual tape libraries with 500 slots each on your TS7530, think again. The current software level only supports 25,000 slots on a global level.
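To make the arithmetic explicit: 50 libraries with 500 slots each use up the whole global budget, and since slot counts can only grow, nothing is left for later. A tiny Python sketch, where the only number taken from the text is the 25,000 limit and the helper name is made up:

```python
GLOBAL_SLOT_LIMIT = 25_000  # global limit quoted above for the current software level


def slots_remaining(libraries, slots_per_library, limit=GLOBAL_SLOT_LIMIT):
    """Return how many slots are left under the global limit (negative if over)."""
    return limit - libraries * slots_per_library


if __name__ == "__main__":
    print(slots_remaining(50, 500))  # 0: the budget is completely used up
    print(slots_remaining(50, 400))  # 5000: some headroom to grow a library later
```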

Working with IBM’s Virtualization Engine Console

Recently, we got the recommendation from our system partner to use statically allocated tape cartridges instead of dynamically allocated ones. Apparently, dynamically allocated cartridges come with a performance penalty if more than a few nodes are backing up a large amount of data at once.

And yet again, I noticed that the IBM Virtualization Engine Console (aka FalconStor software) is really error-prone.

In order to change the allocation type, we had to shred the old cartridges first (500 x ~100 MB up till now), change the allocation type at the virtual tape library level, and then recreate the 500 cartridges with a fixed size (500 x 102400 MB). Now, as I was kinda optimistic, I decided to create all 500 cartridges at once.

Failure during the initiation of 500 virtual cartridges

So I ended up creating the 500 cartridges in steps of 45. Which isn’t that big of a deal. But, as we do have two separate logical virtual tape libraries (basically the whole TS7530 is partitioned into two tape libraries), we had to do it for the second one too. But I told myself “Come on, try the maximum amount again!” … And guess what:

Successful initiation of 510 virtual cartridges

That worked ❓ Please, don’t ask me why the hell it’s working for one virtual tape library on the same system (well, different virtualization engine), but ain’t for the other one … 😯
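For the record, creating 500 cartridges in steps of 45 means a dozen rounds in the console. A small Python sketch of the split (the function name is made up; the console itself has no such helper as far as I know):

```python
def batch_sizes(total, step):
    """Split `total` items into consecutive batches of at most `step` items."""
    if total < 0 or step <= 0:
        raise ValueError("total must be >= 0 and step must be > 0")
    full, rest = divmod(total, step)
    return [step] * full + ([rest] if rest else [])


if __name__ == "__main__":
    sizes = batch_sizes(500, 45)
    # 11 full batches of 45 plus one final batch of 5 cartridges
    print(len(sizes), sizes[-1])
```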

Sunday afternoon playtime

Well, it’s yet again Sunday afternoon. And I had (again I might add) the urge to play around with all the stuff I have at home.

So at first, I “fixed” the ground wire of my NAS box.

Ground wire for the CPU fan

Afterwards, I went back upstairs. Hooked my Philips up to my notebook and figured I *really* need a wireless keyboard. Because typing with the Windows on-screen keyboard is a huge pain in the ass (as well as a pain for the mouse-hand)!

Samsung R70 hooked up to my Philips TFT

Automatic updates on SUSE Linux Enterprise 10

I had the problem that the automatic update function of YaST doesn’t work the way I want it to. I wanted it to install only those updates that aren’t interactive, don’t need a service restart and don’t need a reboot.

YaST only offers an online update that skips “interactive” updates (I’ve never even encountered an interactive update up till now). So I went ahead and wrote a (hackish) script that achieves what I need.

And just for me, the crontab entry:
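The selection rule itself boils down to three checks. Here is a minimal Python sketch of that rule only, assuming each patch is available as a record with three boolean flags; the `Patch` class, its field names and the sample patch names are made up for illustration and don’t correspond to any YaST output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    # Hypothetical record; the field names are illustrative only.
    name: str
    interactive: bool = False
    restart_needed: bool = False
    reboot_needed: bool = False


def safe_to_autoinstall(patch):
    """True if the patch needs no interaction, no service restart and no reboot."""
    return not (patch.interactive or patch.restart_needed or patch.reboot_needed)


def select_patches(patches):
    """Return the names of all patches that can be installed unattended."""
    return [p.name for p in patches if safe_to_autoinstall(p)]


if __name__ == "__main__":
    # Invented sample data, just to exercise the filter.
    pending = [
        Patch("example-lib-update"),
        Patch("example-kernel-update", reboot_needed=True),
        Patch("example-daemon-update", restart_needed=True),
        Patch("example-license-update", interactive=True),
    ]
    print(select_patches(pending))
```

Everything that gets filtered out stays pending for the next manual maintenance window.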