Retiring people

I’m not sure whether or not I blogged about this before, but here it is just for me to actually remember what, in which order I need to do. If you got the list in form of a csv file, simply do the following:

That’ll give you a detailed list of which metadata.xml need to be changed.

IBM RDAC and Windows Cluster Service

Okay, so we received a brand new x3650 the other day entitled to replace one (or better two) of our NAS frontend servers. We installed Windows on it the other day (had to create a custom Windows Server 2003 CD first, since the default one doesn’t recognize the integrated ServeRAID), and we prepped the box during the week with the usual things.

On Monday I started installing the “IBM StorageManager RDAC” MultiPath driver (since the box got two single port PCIe FC-HBA’s) and figured I’d be nice if we had this. I asked a IBM Systems Engineer of one of our partners, which told me generally there wouldn’t be a problem with Microsoft Cluster Services (MSCS) and the IBM MPIO driver. Only requirement would be that I’d install the new storport.sys driver (version 5.2.3790.4021) first (as in Microsoft KB932755).

Now, yesterday I finished the zoning, did the mappings on the storage arrays and then figured the box should see the hard disks. So I started adding another node to our existing Microsoft Cluster.

Result: Zip (as in MSCS telling me not all nodes could see the quorum disk)

Reason: a combination of two things. First, said IBM Storage Manager RDAC. The first time I installed it, I forgot about the storage mappings, thus the box seeing zero disks. After uninstalling it, I was seeing 121 (that’s right, one hundred and twenty one) new devices.

Visible volumes previous to installing the RDAC driver
Visible volumes previous to installing the RDAC driver

That is basically a result of the zoning I did for this particular device, which has *all* controllers present in a single SAN zone, thus the HBA’s seeing devices eight (or nine) times .. Update: yes, I’m missing one controller … 😀

SAN zoning for the box
SAN zoning for the box

Now, as I reinstalled the RDAC *after* the host discovered the volumes, it’s showing only a dozen drives.

Visible volumes after installing the RDAC driver
Visible volumes after installing the RDAC driver

Now, as I figured this out, I told myself “Hey, adding the third node to the Windows Cluster should now work without a clue …” … guess what ?

It’s Microsoft and it doesn’t. Now why doesn’t it work ? ‘Cause the Cluster Setup Wizard is getting confused in Typical mode, as it’s creating a “local quorum disk” which naturally isn’t present in the cluster it’s joining. Now, switching the wizard to “Advanced (minimum) configuration” as suggested in Q331801, just works … *shrug*

Windows XP Embedded and GPO settings (continued)

Well, as I said in my previous post, I do have some weird things happening. Apparently adding the domain user to the local group “Administrators” makes everything just works fine, yet he can’t do administrator like stuff (like turning off the write protection, changing local user accounts, …).

Also, if you’re looking for a smart way of how to add a certain global group (as in Active Directory group) to a local group, try this:

That simple, doesn’t even need the usual credentials to lookup the object, it apparently bypassed that step *shrug*.

And yet another weird thing is: if I run a certain command from a deployment script, it gives me different result as a manual execution of said script would give me .. *shrug*

If I put that into a rsp (that is Wyse Device Manager script), it ain’t working. Would I be executing it myself without the WDM, everything works like a charm … *yuck*

Rescuing a rebooting machine that’s hanging

One of my co-worker approached me today with a weird problem. Yesterday he had a disk in a 900GiB array failing which he replaced. After that, he run a rebuild/verification, fsck’ed the file system and tried to mount the volume again.

Apparently the mount produced a kernel oops (guess what, the 900GiB is running reiserfs), thus leaving the kernel tainted (or what do they call it ?). So he tried to reboot the box but it didn’t reboot. It started rebooting but then hung (as in not continuing the reboot). He tried to ssh back to the box, and it worked just fine.

This is where sysrq comes in handy.

That’ll restart the box, and cha-ching .. 😀

IBM (Tivoli) Integrated Solutions Console

Here I am, preparing our environment for the first (of hopefully many) tester for our upcoming VTL project. So I ended up installing the ISC and Administration Center for Tivoli Storage Manager on a 64bit guest (that is SLES10 for AMD64), just because I forgot to include support for later versions with our current running one. Guess what, nana na na na. Exactly, didn’t work, the same errors I got while trying it before in a virtual environment. “Portlet is not available.”

So I ended up redoing the whole thing on a 32bit guest and guess what … bada-bing. works … *shrug* I don’t know whether or not that’s a surprising thing .. but what surprises me, is that I do have a working 64bit Integrated Solutions Console and Administration Center running, only difference is that one is running on real hardware.

Anyway, after looking on how the Integrated Solutions Console (that is the Websphere environment – yes *yuck*) did it’s own start up after boot (you know, I’d like to restart an application if it’s hanging without the need to reboot the whole box), I found this particular line of code:

And since I was lazy (and it was already Friday afternoon), I ended up writing a small init script which rids you of the need of such a ugly way to start a service.

And the corresponding sysconfig file:

Et voilá, it’s done. Now just a `chkconfig -a isc‘ and it’s gonna startup nice and easy (when it really should) via the normal service startup and not get spawned from the inittab.

Windows XP Embedded and GPO settings

We’re currently having a weird issue (which we had before); the Windows XP Embedded powering our Wyse V90’s isn’t applying any GPO settings if you log on with a user that has a configured profile.

Googling (is that a valid word yet ?!) for it, only resulted in one useful link, which is apparently a guy with the exact same problem … *shrug* I’m completely out of ideas by now, as I don’t even have a place to start (as in where the reason might be located).

Well, I do have a place to start with (that’s the local Events Viewer), which indeed lists some errors, but only such errors which ain’t making any sense. For example I see this:

  • Userenv:1086 – “Windows cannot do loopback processing for downlevel or local users. Loopback processing will be disabled.
  • SceCli:1704 – “Security policy in the Group policy objects has been applied successfully.
  • Userenv:1085 – “The Group Policy client-side extension Folder Redirection failed to execute. Please look for any errors reported earlier by that extension.

Getting a FC HBA to rescan it’s attached devices

If you’re using a 2.6 based distribution, the FC HBA (or more correctly the corresponding driver) should create entries in /sys/class/scsi_host. Now you only need to get the host-number (basically the # of the host bus adapter) and you can get started ..

Simply doing this, is going to tell the FC HBA “rescan” and discover new devices ..

That should do the trick, and you should be able to get udev to recognize the new devices attached via FibreChannel without the need to reboot the whole box (which might be a bit tricky).