Nagios: Integrating Cisco switches

Well, as I wrote recently, we received a new BladeCenter a few weeks back. Now that we are slowly taking it into service, I wanted to watch the utilization of the backplanes as well as the CPU utilization of the Cisco Catalyst 3012 network switches.

The first mistake I made was to trust Cisco's guide on how to get the utilization from the device using SNMP. It lists some OIDs, which I tried with snmpwalk, and I did get a result.

Now, as I tried retrieving the SNMP data by means of the check_snmp plugin, I got some flaky results:

Those of you who read the excerpts carefully will notice the difference between the OID snmpwalk returned and the OID I passed on to check_snmp.

The point being, the OIDs Cisco gives in their design tech notes are either old or just not accurate at all. After appending .0 to each OID given by Cisco, check_snmp is all hunky-dory and integrated into Nagios.
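
As far as I understand it, the mismatch comes from how the two tools ask: snmpwalk walks the subtree below the OID you give it, so it happily returns the instance underneath, while check_snmp does a plain GET and needs the full instance OID. For scalar objects that instance is .0. A minimal Python sketch of the fix-up, where the helper name and the sample OID (sysUpTime from the standard MIB-II, not one of the Cisco OIDs) are my own illustration:

```python
def scalar_instance(oid: str) -> str:
    """Return the instance OID of a scalar object by appending ".0".

    Hypothetical helper for illustration. It does not check whether the
    OID really is a scalar (table columns need their row index instead).
    """
    oid = oid.strip().rstrip(".")
    return oid + ".0"


# sysUpTime (1.3.6.1.2.1.1.3) is used as a well-known example only.
print(scalar_instance(".1.3.6.1.2.1.1.3"))
```

The result is what you would hand to check_snmp with its -o option.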

As usual, the Nagios definitions are further down, for those interested.

Linux: Getting information about an EXT3 filesystem

You know, I’m not getting any younger. It’s getting harder remembering every damn command … so here is how you get information out of your EXT3 filesystem:
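
One common way to do this (an assumption on my part, not necessarily the exact command meant here) is tune2fs -l, which prints the superblock as "Key: value" lines. Below is a small Python sketch that runs it and turns the output into a dictionary. The function names and the sample text are made up for illustration:

```python
import subprocess


def parse_superblock(text: str) -> dict:
    """Turn "Key:   value" lines (as printed by tune2fs -l) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def ext3_info(device: str) -> dict:
    """Run tune2fs -l on a device (usually needs root) and parse the result."""
    out = subprocess.run(
        ["tune2fs", "-l", device],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_superblock(out)


# Made-up sample in the shape tune2fs prints, so the parser can be tried
# without touching a real device.
sample = """Filesystem volume name:   <none>
Block count:              2621440
Block size:               4096
"""
info = parse_superblock(sample)
print(info["Block size"])
```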

Restarting the NSclient++ service without the management applet

For people who are as point-and-click lazy as me, here is how you restart the service without using the service management applet.

MySQL: Setting up an InnoDB raw device

Well, since I had to brood about this (again I might add), I’m gonna write it down this time …

Setting up the InnoDB raw device isn’t that hard, just make sure the device has proper permissions (either add mysql to the disk group or create a udev rule).

Now after that (and a reboot/udevcontrol reload_rules later), you should be able to initialize the InnoDB device. Yes, the InnoDB device needs initializing.

When you create a new data file, you must put the keyword newraw immediately after the data file size in innodb_data_file_path.

The next time you start the server, InnoDB notices the newraw keyword and initializes the new partition.

After that is done, you should be able to start the MySQL service for the first time. The start is gonna fail (at least according to the init script), but if you take a closer look at /var/log/mysqld.log, it ultimately succeeds.

After that, change “newraw” to “raw” in your /etc/my.cnf. Otherwise, MySQL is gonna reinitialize the volume all over again, as the handbook states.

However, do not create or change any InnoDB tables yet. Otherwise, when you next restart the server, InnoDB reinitializes the partition and your changes are lost.

After InnoDB has initialized the new partition, stop the server, change newraw in the data file specification to raw.
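
To make the newraw/raw switch concrete, here is a minimal Python sketch that rewrites the relevant my.cnf line. The function name, the device and the size are hypothetical; only the innodb_data_file_path option and the two keywords come from the handbook text above:

```python
import re


def switch_newraw_to_raw(line: str) -> str:
    """Replace the newraw keyword with raw in an innodb_data_file_path line."""
    if not line.lstrip().startswith("innodb_data_file_path"):
        return line  # leave every other config line untouched
    return re.sub(r"newraw\b", "raw", line)


# Hypothetical device and size, for illustration only.
before = "innodb_data_file_path=/dev/sdb1:3Gnewraw"
after = switch_newraw_to_raw(before)
print(after)
```

Doing the edit by hand is just as good; the point is only that the keyword changes while the device and size stay as they are.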

Monitoring the IBM BladeCenter chassis with Nagios

Today I ended up working out the details on what we want to monitor regarding our BladeCenter. The most interesting details (for us that is) are these:

  • Fan speeds for Chassis Cooling/Power Module Cooling Bay(s)
  • Temperature
  • Power Domain utilization

It wasn’t *that* hard to implement. The only troubles I ran into were: (1) IBM did a real shitty job with the MIBs. If you look closely at mmblade.mib, you’re gonna notice that not a single OID is specified for the events. (2) As the MIBs weren’t documented anywhere, I had to look the OIDs up via snmpwalk (which I had never used before). So as a reminder (to myself), here’s how it is done:

This will get you a lot of output (5154 lines to be exact). Lucky for me, the web interface and the SSH interface of the management module are rather verbose, so all you need to do is compare the values shown there with the snmpwalk output to find what you are looking for.
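
The comparing part can be scripted. A minimal Python sketch, assuming the walk was saved as text in the usual "OID = TYPE: value" shape; the function name, the OIDs and the values in the sample are made up:

```python
def find_oids(walk_output: str, needle: str) -> list:
    """Return (oid, value) pairs whose value contains the given text."""
    hits = []
    for line in walk_output.splitlines():
        oid, sep, value = line.partition(" = ")
        if sep and needle.lower() in value.lower():
            hits.append((oid.strip(), value.strip()))
    return hits


# Made-up lines in the usual "OID = TYPE: value" shape of snmpwalk output.
sample = '''.1.3.6.1.4.1.99999.1.1.0 = STRING: "24.00 Centigrade"
.1.3.6.1.4.1.99999.1.2.0 = STRING: "58% of maximum"
.1.3.6.1.4.1.99999.1.3.0 = INTEGER: 3
'''
for oid, value in find_oids(sample, "centigrade"):
    print(oid, value)
```

Search for a value you can see in the web interface (a temperature, a fan speed) and the matching OID falls out.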

So for myself (and anyone interested), read ahead for the list of checks we are currently running on the management module.

VMware vCenter: is not connected

Well, today I once again had the case where a virtual machine (in my case a virtual machine template) was kinda stuck. You couldn’t remove the template (the entry “Remove from inventory” was grayed out), and you couldn’t re-add the virtual machine’s VMX from the datastore browser either.

VI Client – Disconnected templates

Simply putting the host into maintenance mode and rebooting it fixed the problem, though. Maybe there is a simpler solution for this; I just don’t know about it.

Thanks to Sven in #1, I now know that simple solution for my problem!

Half a minute and a heart-stopping moment later (all VMs on that host turn grey after the first update), the VMs are accessible again. Thanks again to Sven!

Installing SLES10 via network with no DHCP available

In our current fight against the BladeCenter switches, we’re facing the problem that the blades ain’t able to send or receive DHCP traffic.

So in order to move forward, we had to use static IP addresses. And since SLES10 ain’t straightforward on that, I had to look it up. Now, here’s (for me and everyone else tired of searching) how to do it:
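
As a rough sketch of the idea: the installer can be given the network settings as boot options instead of asking DHCP. The option names below (hostip, netmask, gateway, nameserver, install) are what I understand linuxrc to accept, so treat them as an assumption and check them against the SLES documentation. The function name, the addresses and the install URL are made up:

```python
import ipaddress


def static_install_options(host_ip, netmask, gateway, install_url, nameserver=None):
    """Build a boot-prompt option string for a network install without DHCP.

    The option names are an assumption about what linuxrc accepts.
    """
    for addr in (host_ip, netmask, gateway):
        ipaddress.IPv4Address(addr)  # raises ValueError on a malformed address
    opts = [f"hostip={host_ip}", f"netmask={netmask}", f"gateway={gateway}"]
    if nameserver:
        ipaddress.IPv4Address(nameserver)
        opts.append(f"nameserver={nameserver}")
    opts.append(f"install={install_url}")
    return " ".join(opts)


# Documentation-range addresses and a hypothetical NFS install source.
options = static_install_options(
    "192.0.2.10", "255.255.255.0", "192.0.2.1",
    "nfs://192.0.2.5/install/sles10",
)
print(options)
```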

Setting up the BladeCenter H

Well, we finally had our maintenance window today, in which we planned the hardware exchange for our current Dell Blade Chassis (don’t ask!). The exchange went fine, but as we started exploring the components (like the IBM BladeCenter SAN switches, which are in fact Cisco MDS 9100), we hit a few roadblocks.

First, the default user name/password combo for the Cisco MDS 9100 for the BladeCenter is USERID/PASSW0RD (just as the rest of the password combinations).

Next, we started tinkering around with the Catalyst Switch modules. A hint to myself:

Whenever setting up the switch via the WebGUI, make sure you set up both passwords, including the password for the switch itself (when prompted by the WebGUI, enter “admin” as well as the password you just entered).

Now you should be able to connect to the switch with telnet and access the EXEC mode (unlike me, who struggled for ~30 minutes until one of my trainees told me to enter a switch password, just out of curiosity).

Now, here is the list of commands I needed to set up the switch’s “basics”:

Updating path information for TSM

As I did some switching today (between the new lin_tape version by IBM and our own lin_tape version), I ended up writing those lines a dozen times. Here is (just for me; if you don’t care, skip ahead) how to generate a list of commands:
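
A minimal Python sketch of such a generator, assuming the lines in question are TSM UPDATE PATH commands for drives. The server, library and drive names are hypothetical, and the device files only follow the usual lin_tape naming:

```python
def update_path_commands(server, library, drives, online=True):
    """Yield one UPDATE PATH command per (drive name, device file) pair."""
    state = "YES" if online else "NO"
    for drive, device in drives:
        yield (
            f"UPDATE PATH {server} {drive} SRCTYPE=SERVER DESTTYPE=DRIVE "
            f"LIBRARY={library} DEVICE={device} ONLINE={state}"
        )


# Hypothetical names: server TSM1, library LIB1, two drives on lin_tape devices.
drives = [("DRIVE01", "/dev/IBMtape0"), ("DRIVE02", "/dev/IBMtape1")]
commands = list(update_path_commands("TSM1", "LIB1", drives))
print("\n".join(commands))
```

The printed lines can be pasted into an administrative client session or saved as a macro file.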

which should get you a list like this:

Sidenote: Amount of Slots per Virtual Tape Library

Well, I just stumbled upon this again today (and I don’t know right now whether it is documented in a Redbook), so I thought I’d write it down.

Slot-Amount Property of a Virtual Tape Library

Please keep in mind to think hard about the number of slots you might need when creating the virtual library. It ain’t that bad, you just can’t decrease the number afterwards. So if you think about creating 50 different virtual tape libraries with 500 slots each on your TS7530, think again. The current software level only supports 25,000 slots on a global level.
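
The arithmetic is worth spelling out, since the example plan uses up the whole budget on its own. A tiny Python sketch, where the function name is made up and the limit is the one quoted above:

```python
GLOBAL_SLOT_LIMIT = 25_000  # limit quoted above for the current software level


def remaining_slots(libraries: int, slots_each: int) -> int:
    """Slots left under the global limit; negative means the plan does not fit."""
    return GLOBAL_SLOT_LIMIT - libraries * slots_each


print(remaining_slots(50, 500))  # 50 * 500 = 25000, so nothing is left
print(remaining_slots(50, 600))  # negative: over the limit
```

And because slots can’t be taken back, every library created too large eats into that budget for good.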