August 2009

Tivoli Storage Manager Server 5.5.3

August 28, 2009June 21, 2013 Christian Leave a comment

I spent yesterday afternoon upgrading our TS7530, and in my fad I also upgraded TSM to 5.5.3. Now, once I started TSM it quickly started complaining about the paths to the drives.

ANR8873E The path from source TSM1 to destination VTL1_DR03
(/dev/lin_tape/IBMtape03) is taken offline.
ANR8873E The path from source TSM1 to destination VTL1_DR03
(/dev/lin_tape/IBMtape03) is taken offline.
HBA_LoadLibrary: previously unfreed libraries exist, call HBA_FreeLibrary().
ANR8873E The path from source TSM1 to destination VTL1_DR07
(/dev/lin_tape/IBMtape07) is taken offline.
ANR8873E The path from source TSM1 to destination VTL1_DR07
(/dev/lin_tape/IBMtape-07) is taken offline.

ANR8873E The path from source TSM1 to destination VTL1_DR03

(/dev/lin_tape/IBMtape03) is taken offline.

ANR8873E The path from source TSM1 to destination VTL1_DR03

(/dev/lin_tape/IBMtape03) is taken offline.

HBA_LoadLibrary: previously unfreed libraries exist, call HBA_FreeLibrary().

ANR8873E The path from source TSM1 to destination VTL1_DR07

(/dev/lin_tape/IBMtape07) is taken offline.

ANR8873E The path from source TSM1 to destination VTL1_DR07

(/dev/lin_tape/IBMtape-07) is taken offline.

I thought maybe this is a mere device problem (we have had them before), so I rebooted the boxes. But still no luck and I went home after about an hour of trying without any luck. In the morning, my co-worker called our trustworthy IBM service partner, and the TSM consultant said he had the exact, same problem yesterday. We would have two options:

Enable the option SANDISCOVERY, with the (completely undocumented) Passive setting (setopt SANDISCOVERY PASSIVE)
Downgrade back to 5.5.2

For now, we implemented the first option, in the hope that’ll solve our troubles. And it actually does.

Mass-updating Tivoli Storage Manager drive status

August 28, 2009June 21, 2013 Christian 2 Comments

I was fighting with our VTL again, and TSM was thinking all the drives were offline. In order to update the drive status, you’d need to go into the ISC and select each drive and set them to ONLINE. Since I’m a bit click-lazy, I wrote a simple nested for-loop, which gives me the output to update all the drives at once:

for i in 1 2; do
  for k in $( seq -w 1 32 ); do
    echo &quot;UPDATE DRIVE VTL$i VTL${i}_DR${k} ONLINE=YES&quot;
  done
done

for i in 1 2; do

for k in $( seq -w 1 32 ); do

echo "UPDATE DRIVE VTL$i VTL${i}_DR${k} ONLINE=YES"

done

Result is a list like this:

...
UPDATE DRIVE VTL1 VTL1_DR31 ONLINE=YES
UPDATE DRIVE VTL1 VTL1_DR32 ONLINE=YES
UPDATE DRIVE VTL2 VTL2_DR01 ONLINE=YES
UPDATE DRIVE VTL2 VTL2_DR02 ONLINE=YES
UPDATE DRIVE VTL2 VTL2_DR03 ONLINE=YES
...

...

UPDATE DRIVE VTL1 VTL1_DR31 ONLINE=YES

UPDATE DRIVE VTL1 VTL1_DR32 ONLINE=YES

UPDATE DRIVE VTL2 VTL2_DR01 ONLINE=YES

UPDATE DRIVE VTL2 VTL2_DR02 ONLINE=YES

UPDATE DRIVE VTL2 VTL2_DR03 ONLINE=YES

...

The same goes for mass-updating the path status:

for i in 1 2; do
  for k in $( seq -w 1 32 ); do
    echo &quot;UPDATE PATH TSM$i VTL${i}_DR${k} SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL$i ONLINE=YES&quot;
  done
done

for i in 1 2; do

for k in $( seq -w 1 32 ); do

echo "UPDATE PATH TSM$i VTL${i}_DR${k} SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL$i ONLINE=YES"

done

Result is a list like this:

...
UPDATE PATH TSM1 VTL1_DR31 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL1 ONLINE=YES
UPDATE PATH TSM1 VTL1_DR32 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL1 ONLINE=YES
UPDATE PATH TSM2 VTL2_DR01 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL2 ONLINE=YES
UPDATE PATH TSM2 VTL2_DR02 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL2 ONLINE=YES

...

UPDATE PATH TSM1 VTL1_DR31 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL1 ONLINE=YES

UPDATE PATH TSM1 VTL1_DR32 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL1 ONLINE=YES

UPDATE PATH TSM2 VTL2_DR01 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL2 ONLINE=YES

UPDATE PATH TSM2 VTL2_DR02 SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=VTL2 ONLINE=YES

IBM RSA II adapter and Java RE (fini)

August 19, 2009August 8, 2014 Christian Leave a comment

If you remember back to July, I looked into some troubles I had with the IBM RSA II adapter’s Java interface and the latest JRE updates. I just noticed, that IBM released a new firmware yesterday for the RSA. The ChangeLog states this:

Version 1.13, GFEP35A
Problem(s) Fixed:

* Suggested
o Fix for Remote Control General Exception in JRE 1.6 update 12 and above.
o Corrected a problem that DHCP renew/release may fail after a long time.
o Corrected a problem that remote control preference link disapears after creating new key buttons.
o Corrected a problem that cause event number shows only from 0 to 255 when views RSA log via telnet session.

As you can see, IBM finally decided that it isn’t a Sun problem but rather their own! Finally, after about 4 months a fix, yay!

Even if the fix is just for the x3550 for now, but that puts a light to the end of the tunnel and puts up hope, that they are gonna fix it for the other RSA adapters too!

rpc.statd starting before portmap

August 4, 2009August 8, 2014 Christian Leave a comment

One problem gone, another one turns up. When rpc.statd (nfs-common) tries to start before portmap, it’s gonna result in failure. Now, the logfile (/var/log/daemon.log) is gonna print a rather cryptic error message:

Aug  4 15:54:25 xen2 rpc.statd[3419]: Version 1.1.2 Starting
Aug  4 15:54:25 xen2 rpc.statd[3419]: unable to register (statd, 1, udp).

1 2	Aug 4 15:54:25 xen2 rpc.statd[3419]: Version 1.1.2 Starting Aug 4 15:54:25 xen2 rpc.statd[3419]: unable to register (statd, 1, udp).

After fixing the start order (I really hate *SUSE*/Debian* for not having init-script dependencies — like Gentoo’s baselayout/Roy’s openrc does have), everything is like it should be and I’m able to put the /srv/xen mount into the fstab …

for i in /etc/rc2.d/ /etc/rc3.d/ /etc/rc4.d/ /etc/rc5.d/; do
 cd $i; rm S20nfs-common; ln -s ../init.d/nfs-common S21nfs-common;
done

for i in /etc/rc2.d/ /etc/rc3.d/ /etc/rc4.d/ /etc/rc5.d/; do

cd $i; rm S20nfs-common; ln -s ../init.d/nfs-common S21nfs-common;

done

portmap hanging on shutdown

August 4, 2009August 4, 2009 Christian 1 Comment

Here’s yet another post about my compute cluster. It’s (obviously) running NFS and that works quite well. Up till now, I would always have trouble with portmap hanging on shutdown/reboot. After spending some time thinking about the problem, looking at the init script and googling, I stumbled upon this Ubuntu bug on portmap.

As noted in the bug, a pmap_dump would hang indefinitely. After taking another look at our nfs-root configuration (in regard to the first comment on the bug), it turns out it’s exactly that. We didn’t setup lo which seems vital for some things.

After adding the lines

auto lo
iface lo inet loopback

1 2	auto lo iface lo inet loopback

to /etc/network/interfaces, portmap stops just fine …

OFED packages for Debian

August 4, 2009August 8, 2014 Christian Leave a comment

As I mentioned yesterday, I’m currently doing some project work. Said project includes InfiniBand technology.

Apparently we bought a “cheap” InfiniBand switch, which comes without a subnet manager. So, in order to communicate between the nodes, you need to install the subnet manager (opensm in my case) on each node.

echo 'deb http://pkg-ofed.alioth.debian.org/apt/ofed ./' &gt; 
  /etc/apt/sources.list.d/openfabrics.list; 
  aptitude update; 
  aptitude install opensm

echo 'deb http://pkg-ofed.alioth.debian.org/apt/ofed ./' >

/etc/apt/sources.list.d/openfabrics.list;

aptitude update;

aptitude install opensm

In order to utilize the InfiniBand interface you need to do a few things first though:

Obviously install the opensm package
Add ib_umad and ib_ipoib to /etc/modules

After installing opensm on the host as well as the NFS root, opensm comes up just fine and the network starts automatically. Only trouble right now is, that ISC’s DHCP doesn’t support InfiniBand, otherwise I could even utilize DHCP to distribute the IP addresses.

Xen dom0 failing with kernel panic

August 3, 2009June 21, 2013 Christian 1 Comment

I’m building a 6-node cluster, using Xen at the moment. For the last few days, I tried my setup in a virtual machine, simply because VM’s boot much faster than the real hardware. However, certain things you can only replicate on the real hardware (for example, the InfiniBand interfaces, as well as certain nfs-stuff).

So I spent most of the day to replicate my configurations onto the hardware. After getting all done, the moment of the first boot … kaput! Doesn’t boot, just keeps hanging before booting the real kernel. Now what ? I removed the Xen vga parameters and rebooted (waited ~2 minutes in the process) until I finally saw the root cause for my trouble:

Low bootmem alloc of 67108864 bytes failed!
Kernel panic - not syncing: Out of low memory

1 2	Low bootmem alloc of 67108864 bytes failed! Kernel panic - not syncing: Out of low memory

I was like *wtf* … My tftp setup _worked_ inside the VM’s, why ain’t it working here ? Quick look at the pxelinux.cfg for the mac address revealed this:

label   Xen AMD64
  MENU LABEL      ^xen-3.2-1-amd64
  KERNEL mboot.c32
  APPEND xen/xen-3.2-1-amd64.gz console=vga vga=gfx-1024x768x16 dom0_mem=64M noreboot ---
          xen/vmlinuz-2.6.26-2-xen-amd64 rw console=tty0 rootdelay=5 root=/dev/nfs nfsroot=172.30.10.1:/srv/nfs/xen2 ip=dhcp ---
          xen/initrd.img-2.6.26-2-xen-amd64

label Xen AMD64

MENU LABEL ^xen-3.2-1-amd64

KERNEL mboot.c32

APPEND xen/xen-3.2-1-amd64.gz console=vga vga=gfx-1024x768x16 dom0_mem=64M noreboot ---

xen/vmlinuz-2.6.26-2-xen-amd64 rw console=tty0 rootdelay=5 root=/dev/nfs nfsroot=172.30.10.1:/srv/nfs/xen2 ip=dhcp ---

xen/initrd.img-2.6.26-2-xen-amd64

As you can see, I had devised 64M for the dom0, which apparently wasn’t enough. After tuning the memory limit to 256M, everything is honky-dory!

TS7530 authentification failure

August 1, 2009August 8, 2014 Christian Leave a comment

Today, I had a rather troublesome morning. Once I got to work, Nagios was already complaining about the lin_taped on one of our TSM servers, which apparently failed due to too many SCSI resets. Additionally, I can’t login using the VE console (I can login however using SSH) so I ended up opening up a IBM Electronic Service Call (ESC+).

Using SSH, I can get some information on the VE’s status:

vetapeservice@VTL-A:~&gt; sudo /ve/bin/ve status

IBM VE for Tape Server v3.00 (Build 1465)
Copyright (c) 2001-2008 FalconStor Software. All Rights Reserved.

Status of VE for Tape SNMPD Module                         [RUNNING]
Status of VE for Tape Configuration Module                 [RUNNING]
Status of VE for Tape Base Module                          [RUNNING]
Status of VE for Tape HBA Module                           [RUNNING]
Status of VE for Tape Authentication Module                [RUNNING]
Status of VE for Tape Server (Compression) Module          [RUNNING]
Status of VE for Tape Server (Hifn HW Compression) Module  [RUNNING]
Status of VE for Tape Server (Application Upcall) Module   [RUNNING]
Status of VE for Tape Server (FSNBase) Module              [RUNNING]
Status of VE for Tape Server (Upcall) Module               [RUNNING]
Status of VE for Tape Server (Application) Module          [RUNNING]
Status of VE for Tape Server (Application IOCTL) Module    [RUNNING]
Status of VE for Tape Server (User)                        [STOPPED]
Status of VE for Tape Target Module                        [RUNNING]
Status of VE for Tape Server IMA Daemon                    [RUNNING]
Status of VE for Tape Server RDE Daemon                    [RUNNING]
Status of VE for Tape Communication Module                 [RUNNING]
Status of VE for Tape Logger Module                        [RUNNING]
Status of VE for Tape Call Home Module                     [RUNNING]
Status of VE for Tape Local Client (VBDI)                  [RUNNING]
Status of VE for Tape Self Monitor Module                  [RUNNING]
Status of VE for Tape Failover Module                      [RUNNING]

vetapeservice@VTL-B:~&gt; sudo /ve/bin/ve status

IBM VE for Tape Server v3.00 (Build 1465)
Copyright (c) 2001-2008 FalconStor Software. All Rights Reserved.

Status of VE for Tape SNMPD Module                         [RUNNING]
Status of VE for Tape Configuration Module                 [RUNNING]
Status of VE for Tape Base Module                          [RUNNING]
Status of VE for Tape HBA Module                           [RUNNING]
Status of VE for Tape Authentication Module                [RUNNING]
Status of VE for Tape Server (Compression) Module          [RUNNING]
Status of VE for Tape Server (Hifn HW Compression) Module  [RUNNING]
Status of VE for Tape Server (Application Upcall) Module   [RUNNING]
Status of VE for Tape Server (FSNBase) Module              [RUNNING]
Status of VE for Tape Server (Upcall) Module               [RUNNING]
Status of VE for Tape Server (Application) Module          [RUNNING]
Status of VE for Tape Server (Application IOCTL) Module    [RUNNING]
Status of VE for Tape Server (User)                        [RUNNING]
Status of VE for Tape Target Module                        [RUNNING]
Status of VE for Tape Server IMA Daemon                    [RUNNING]
Status of VE for Tape Server RDE Daemon                    [RUNNING]
Status of VE for Tape Communication Module                 [RUNNING]
Status of VE for Tape Logger Module                        [RUNNING]
Status of VE for Tape Call Home Module                     [RUNNING]
Status of VE for Tape Local Client (VBDI)                  [RUNNING]
Status of VE for Tape Self Monitor Module                  [RUNNING]
Status of VE for Tape Failover Module                      [RUNNING]

vetapeservice@VTL-A:~> sudo /ve/bin/ve status

IBM VE for Tape Server v3.00 (Build 1465)

Status of VE for Tape SNMPD Module [RUNNING]

Status of VE for Tape Configuration Module [RUNNING]

Status of VE for Tape Base Module [RUNNING]

Status of VE for Tape HBA Module [RUNNING]

Status of VE for Tape Authentication Module [RUNNING]

Status of VE for Tape Server (Compression) Module [RUNNING]

Status of VE for Tape Server (Hifn HW Compression) Module [RUNNING]

Status of VE for Tape Server (Application Upcall) Module [RUNNING]

Status of VE for Tape Server (FSNBase) Module [RUNNING]

Status of VE for Tape Server (Upcall) Module [RUNNING]

Status of VE for Tape Server (Application) Module [RUNNING]

Status of VE for Tape Server (Application IOCTL) Module [RUNNING]

Status of VE for Tape Server (User) [STOPPED]

Status of VE for Tape Target Module [RUNNING]

Status of VE for Tape Server IMA Daemon [RUNNING]

Status of VE for Tape Server RDE Daemon [RUNNING]

Status of VE for Tape Communication Module [RUNNING]

Status of VE for Tape Logger Module [RUNNING]

Status of VE for Tape Call Home Module [RUNNING]

Status of VE for Tape Local Client (VBDI) [RUNNING]

Status of VE for Tape Self Monitor Module [RUNNING]

Status of VE for Tape Failover Module [RUNNING]

vetapeservice@VTL-B:~> sudo /ve/bin/ve status

IBM VE for Tape Server v3.00 (Build 1465)

Status of VE for Tape SNMPD Module [RUNNING]

Status of VE for Tape Configuration Module [RUNNING]

Status of VE for Tape Base Module [RUNNING]

Status of VE for Tape HBA Module [RUNNING]

Status of VE for Tape Authentication Module [RUNNING]

Status of VE for Tape Server (Compression) Module [RUNNING]

Status of VE for Tape Server (Hifn HW Compression) Module [RUNNING]

Status of VE for Tape Server (Application Upcall) Module [RUNNING]

Status of VE for Tape Server (FSNBase) Module [RUNNING]

Status of VE for Tape Server (Upcall) Module [RUNNING]

Status of VE for Tape Server (Application) Module [RUNNING]

Status of VE for Tape Server (Application IOCTL) Module [RUNNING]

Status of VE for Tape Server (User) [RUNNING]

Status of VE for Tape Target Module [RUNNING]

Status of VE for Tape Server IMA Daemon [RUNNING]

Status of VE for Tape Server RDE Daemon [RUNNING]

Status of VE for Tape Communication Module [RUNNING]

Status of VE for Tape Logger Module [RUNNING]

Status of VE for Tape Call Home Module [RUNNING]

Status of VE for Tape Local Client (VBDI) [RUNNING]

Status of VE for Tape Self Monitor Module [RUNNING]

Status of VE for Tape Failover Module [RUNNING]

After looking a bit deeper, it seems that none of the two TSM server is able to see the IBMchanger devices for the first VTL. The second is perfectly visible, just not the first. After putting both VE nodes into suspended failover, gathering support data for the IBM support from both VE’s and the Brocade SAN switches, apparently everything works again. I guess the library does have “self healing” properties.