SUSE Linux Enterprise Server 10 on VMware ESX (finished)

As I pointed out in my past posts, I was having some weird errors with SLES10 when mounting CD images inside the guest (and, as it turned out, on physical hardware as well).

Now, after about a week or so, Novell finally released an updated kernel package today. So the error per se is fixed: I can use my CD drives again, as well as update my virtual machines by means of VirtualCenter, without the trickery of copying the linux.iso from /vmimages/tools to the guest and mounting it by means of mount -o loop.

Yet another VMware error

Today I was moving a pretty standard SLES10 virtual machine to another host, when the migration dialog showed me this:


And if you now think the virtual machine is something special, take a look at these settings:

Virtual machine configuration

I don’t know what to think about that error message. Googling for it doesn’t reveal much about it. If anyone out there has an idea, I’m open to suggestions.

Fixing vmkernel symlinks

Since I pretty often end up in the situation where the kernel inside a VM is newer than what VMware currently has in their tools (as in, the SUSE kernel is newer than the binary modules built by VMware), here’s a quick reminder for myself on how to fix the .ko symlinks.
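The idea can be sketched in a few lines of Python. This is only a rough sketch under assumptions: the function name is made up for illustration, and the two directories (where the prebuilt modules live and where the running kernel expects the .ko links) are placeholders you would have to fill in for your own tools version and kernel.

```python
from pathlib import Path


def relink_modules(source_dir, target_dir):
    """Point a same-named symlink in target_dir at every .ko file in source_dir.

    Both directories are assumptions supplied by the caller; nothing here
    knows where VMware Tools or the kernel actually keep their modules.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for module in sorted(source_dir.glob("*.ko")):
        link = target_dir / module.name
        # Remove stale or dangling links (is_symlink catches broken ones).
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(module.resolve())
        linked.append(link.name)
    return linked
```

After relinking, the module dependency information most likely needs refreshing (depmod -a) before the modules can be loaded.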

Relaxing weekend

There I was, enjoying our new patio and trying to get a tan outside (we had plenty of great weather during the weekend).

The new patio

Now, all of a sudden there was this *weird* noise I couldn’t classify (which I heard while listening to my iPod completely cranked up …), so I stood up.

Combine harvester mowing down rape

They were finally mowing the rapeseed field behind our house … hah!

SUSE Linux Enterprise Server 10 on VMware ESX (continued)

Well, after some searching today (we applied the VMware Update 2 today, thus the VMware Tools update too), I finally found out what is causing that problem.

Though the problem doesn’t seem to be limited to virtual systems, I just browsed through this Novell Forum thread, which pretty much describes my problem. I found the same error in the VMs where I tried to mount a CD image.

The only difference between the behaviour I see and the one described is that the virtual machine is switched off immediately after you try to mount a CD image.

Now, this guy is saying Novell is working on it … But you’re gonna have to ask the question: why in God’s name did such an update get through QA? Or ain’t there no QA for updates? *shrug*

Microsoft Cluster Services powered by IBM

If you think back, I talked about my problems with MSCS while utilizing the IBM RDAC Multipath driver for Windows.

Everyone I talked to about this, including our IBM business partner and its systems engineers, as well as an IBM systems engineer (who in fact was a freelance guy hired by IBM), told me it had to do with how we did the zoning (stuffing every controller into a single zone), and that this would be the reason why the x3650 was seeing that many drives.

When the freelance SE came to visit us, we redid the zoning, separating each endpoint connection (each HBA port to each controller port) into a different zone.

Additionally, he told me that this was the only IBM™ supported configuration.

SAN Zoning (Overview)

As you can see, I had to create ten different zones, one per pairing of a single port of the dual-port Fibre Channel HBA with its corresponding endpoint (I guess I still have to create more, since the DS4700 has *two* ports per controller).
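To illustrate the "one zone per HBA port and controller port pair" scheme, here is a small Python sketch. The port aliases and the zone naming pattern are hypothetical and not taken from our fabric; the point is only that the number of zones is the product of both port counts, which is why it grows so quickly.

```python
from itertools import product


def single_pair_zones(hba_ports, controller_ports):
    """Return one zone per (HBA port, controller port) pair.

    The z_<hba>_<controller> naming is an assumption for illustration.
    """
    return {
        f"z_{hba}_{ctrl}": (hba, ctrl)
        for hba, ctrl in product(hba_ports, controller_ports)
    }


# Hypothetical aliases: one dual-port HBA, two controllers with one port each.
hba_ports = ["x3650_hba0", "x3650_hba1"]
controller_ports = ["ds4700_ctrlA", "ds4700_ctrlB"]

for name, members in single_pair_zones(hba_ports, controller_ports).items():
    print(name, "->", ", ".join(members))
```

With two HBA ports and two controller ports this yields four zones; every additional controller port adds one more zone per HBA port.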

SAN Zoning (Detailed)

After we finished that, we rebooted the x3650 and hoped that would have fixed it. Afterwards the IBM SE was baffled. Still seeing ~112 devices. What the heck? He ranted about how awful this was and did some mumbo jumbo with his notebook, uploaded the ds4?00 configuration files to some web interface, but shortly afterwards said the storage configuration seemed to be fine at first glance.

So we had another look at the storage configuration, and he quickly found that the other cluster ports were set to “Windows Cluster 2003 (Supporting DMP)” in the port configuration and said that’d be the reason why stuff still ain’t working (I think he guessed wildly, since he had no clue either). After I told him I just can’t change those ports right now (since the remaining part of the cluster is in full production), we agreed that I’d do it some other time and tell him about my results.

Anyways, the next day my co-workers suggested trying a newer Storage Manager version on the x3650, matching the highest firmware version among the storage systems (that being the DS4700 with v09.23). Now guess what?

That fucking works. The cluster is still behaving weird sometimes (now the other boxen seem to have trouble bringing resources online, but only sometimes).

So here’s my hint: always keep old versions of the Storage Manager around, since you can’t get them from IBM anymore *shrug*

Cascading Style Sheets are really weird

So here I was, sitting around and thinking about formatted classes for my paragraphs. Now the result is quite pleasing, but has some side effects. But see for yourself …

Messed up CSS

As you can see, the browser is reusing the background-image URL from the <p> element within the <a> element, even though the <a> element initially had none. Even putting a background-image: none; into the <a> class doesn’t get me anywhere.

The weird thing is that only Firefox displays it this messed up (not so weird when you think about how IE 6 treats standards). So if any of you CSS wizards has a suggestion, I’m listening 🙂

Thanks to the tip of Dave(?), the issue is fixed now!

Connecting to a remote console with MSTSC 6.0.6001

Well, as one can read in about every damn post you can find on that topic, the /console switch is now silently ignored, as is the rdp file option “connect to console:i:1”.

Now, what you don’t find anywhere (except in one scenario explanation) is that you are allowed to specify the mode (i.e. previously /console, now /admin) within the full address parameter.

Scenario: In the RDC client UI, you specify Computer_name /console in the Computer box (where Computer_name represents the name of the remote computer to which you want to connect), and then click Connect.

Behavior: The /console switch is silently ignored. You will be connected to a session to remotely administer the server. (For more information about the Windows Server 2008 behavior, see the “Behavior when you connect to a server that does not have Terminal Server installed” section of this article.)

So my rdp connection file basically looks like this:
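A minimal sketch of the relevant line, generated with a small Python helper. The host name is a hypothetical placeholder and the helper itself is made up for illustration; the only setting shown is the full address entry with the mode appended, as described above.

```python
def rdp_file_contents(host, mode="/admin"):
    """Build the text of a tiny .rdp file with the mode appended to the address.

    Only the "full address" entry is emitted; a real connection file
    usually carries more settings, which are left out on purpose.
    """
    lines = [
        f"full address:s:{host} {mode}",
    ]
    # .rdp files are plain text, one setting per line.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical host name, purely for illustration.
print(rdp_file_contents("server01"), end="")
```

Writing the returned string to a file with the .rdp extension and opening it with mstsc should then connect in admin mode, assuming the client honours the switch in the address field as described in the quoted scenario.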

SUSE Linux Enterprise Server 10 on VMware ESX

We’re currently having a *really* weird problem with our VMs. Sometime last week, SUSE released a kernel update. Now, once you install it and reboot the VM in question with a DVD/CD image present, you’re gonna see this:


The only workaround so far has been to remove *any* CD drives attached to the VM. And yes, this is reproducible; even reinstalling from scratch doesn’t change the fact that after installing the patch the VM quits working.

I also know SLES10 SP2 ain’t officially supported by VMware yet, but I’d still expect it to just work and not produce such weird errors. The only thing I found so far is this VMTN thread ..

Lucky us, VMware just today released Update 2 for VirtualCenter and ESX, wherein SLES10SP2 should be officially supported!

Nagios Hostgroup Inheritance (continued)

Well, it turns out that my idea was ultimately flawed. When defining the hostgroup_members in the lower tiers, Nagios associates the checks from the lower tier with the upper tiers, thus propagating all checks upwards, and I end up with ~250 checks instead of ~150.
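To make the effect visible, here is a toy model in Python. It is not how Nagios is implemented, just an assumed simplification: a hostgroup contains its own members plus all hosts of the groups listed in its hostgroup_members, and every service attached to a hostgroup becomes one check per host in it. All host, group and service names are made up.

```python
def expand_hostgroup(name, hostgroups, _seen=None):
    """Return all hosts of a hostgroup, including those pulled in via hostgroup_members."""
    seen = _seen if _seen is not None else set()
    if name in seen:  # guard against circular definitions
        return set()
    seen.add(name)
    group = hostgroups[name]
    hosts = set(group.get("members", []))
    for child in group.get("hostgroup_members", []):
        hosts |= expand_hostgroup(child, hostgroups, seen)
    return hosts


def count_checks(hostgroups, services):
    """One check per (service, host) pair; services maps service name -> hostgroup."""
    return sum(len(expand_hostgroup(group, hostgroups)) for group in services.values())


# Hypothetical setup: "linux" is the upper tier, "webservers" the lower tier.
services = {"SSH": "linux", "HTTP": "webservers"}

lower_tier_refers_up = {
    "linux": {"members": ["web1", "db1", "mail1"]},
    "webservers": {"members": ["web1"], "hostgroup_members": ["linux"]},
}
upper_tier_refers_down = {
    "linux": {"members": ["db1", "mail1"], "hostgroup_members": ["webservers"]},
    "webservers": {"members": ["web1"]},
}

print(count_checks(lower_tier_refers_up, services))    # HTTP ends up on every linux host
print(count_checks(upper_tier_refers_down, services))  # HTTP stays on web1 only
```

In this toy model the first layout yields six checks and the second four, which is at least consistent with the inflated check count; whether reversing the definition behaves the same way in a real Nagios configuration is something that still needs testing.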

Gonna have to try to define the dependency backwards, maybe that’ll help. But that’s a topic for Monday. Guess I’ll finish viewing Ghost in the Shell – Stand Alone Complex first.