TSM – BAFM

Dealing with SnapVault replication issues

April 9, 2013 by Christian

Well, for the past two months I had a case open with NetApp to figure out a SnapVault replication issue we were seeing. The initial transfer of the SnapVault relationship would complete without a hiccup, and manual snapshot transfers also worked; only the scheduled, auto-created snapshots wouldn’t replicate.

At first I (and NetApp support) thought this was an issue with SnapVault itself. However, after being away for the last four weeks, I looked at the issue with fresh eyes. A short peek into the logs turned up the same messages I had found back when I first looked into this:

Mon Apr  1 21:16:46 CEST [fas01: lun.offline:warning]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.aux has been taken offline
Mon Apr  1 21:16:46 CEST [fas01: lun.offline:warning]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.rws has been taken offline
Mon Apr  1 23:06:35 CEST [fas01: rpl.src.lun.invalid_clone:error]: Replication source transfer failed due to an invalid LUN clone with fileid 32466 in snapid 67 in volume flex_windows_boot.
Mon Apr  1 23:06:35 CEST [fas01: replication.src.err:error]: SnapVault: source transfer from /vol/flex_windows_boot/sv/ to fas03:/vol/flex_windows_boot/sv : replication source found an invalid lun clone.
Mon Apr  1 23:08:35 CEST [fas01: rpl.src.lun.invalid_clone:error]: Replication source transfer failed due to an invalid LUN clone with fileid 32466 in snapid 67 in volume flex_windows_boot.
Mon Apr  1 23:08:36 CEST [fas01: replication.src.err:error]: SnapVault: source transfer from /vol/flex_windows_boot/sv/ to fas03:/vol/flex_windows_boot/sv : replication source found an invalid lun clone.

SnapVault would create the daily snapshot on the SnapVault primary and start the replication. However, something (or someone, it wasn’t clear at this point) then created a FlexClone of a volume … And, just like when we first encountered this, I was kinda puzzled.

But then I decided (please don’t ask me what made me look there) to look at the logs of the NetApp filer on our log server. As it turns out, back when I enabled syslogging to an external log server I seem to have enabled debug logging … and it was great to have that! Below you’ll find the log I found, and as you can see there’s at least a clue as to where that ghost snapshot is coming from.

Apr  1 21:16:46 fas01 [fas01: lun.offline:warning]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.aux has been taken offline
Apr  1 21:16:46 fas01 [fas01: lun.destroy:info]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.aux destroyed
Apr  1 21:16:46 fas01 [fas01: lun.offline:warning]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.rws has been taken offline
Apr  1 21:16:48 fas01 [fas01: lun.map:info]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.rws was mapped to initiator group viaRPC.20:00:00:25:b5:02:0a:4c.b230-5=3
Apr  1 21:16:48 fas01 [fas01: lun.map:info]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.rws was mapped to initiator group viaRPC.20:00:00:25:b5:02:0b:4c.b230-5=3
Apr  1 22:05:31 fas01 [fas01: lun.map.unmap:info]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.rws unmapped from initiator group viaRPC.20:00:00:25:b5:02:0a:4c.b230-5
Apr  1 22:05:31 fas01 [fas01: lun.map.unmap:info]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.rws unmapped from initiator group viaRPC.20:00:00:25:b5:02:0b:4c.b230-5
Apr  1 22:05:35 fas01 [fas01: lun.destroy:info]: LUN /vol/flex_windows_boot/sv/{6f3899ab-6c8a-402e-bbb7-d7b7298d254f}.rws destroyed
Apr  1 22:05:37 fas01 [fas01: wafl.snap.delete:info]: Snapshot copy {25b8da84-9351-4f20-987c-e7b02d76f15e} on volume flex_windows_boot NetApp was deleted by the Data ONTAP function zapi_snapshot_delete. The unique ID for this Snapshot copy is (64, 3471969).
Apr  1 23:06:35 fas01 [fas01: rpl.src.lun.invalid_clone:error]: Replication source transfer failed due to an invalid LUN clone with fileid 32466 in snapid 67 in volume flex_windows_boot.
Apr  1 23:06:35 fas01 [fas01: snapdiff.abnormal.abort:debug]: Encountered unexpected error while computing differences between Snapshot copies.
Apr  1 23:06:35 fas01 [fas01: replication.src.err:error]: SnapVault: source transfer from /vol/flex_windows_boot/sv/ to fas03:/vol/flex_windows_boot/sv : replication source found an invalid lun clone.
Apr  1 23:08:35 fas01 [fas01: rpl.src.lun.invalid_clone:error]: Replication source transfer failed due to an invalid LUN clone with fileid 32466 in snapid 67 in volume flex_windows_boot.
Apr  1 23:08:35 fas01 [fas01: snapdiff.abnormal.abort:debug]: Encountered unexpected error while computing differences between Snapshot copies.
Apr  1 23:08:36 fas01 [fas01: replication.src.err:error]: SnapVault: source transfer from /vol/flex_windows_boot/sv/ to fas03:/vol/flex_windows_boot/sv : replication source found an invalid lun clone.

Now, knowing which corner this issue originated from, it dawned on me: we have had a similar issue before. A quick peek into TSM Manager and I knew I was on the right track. The daily system backup starts around 21:15, and our TSM backup includes the System State backup (which in turn utilizes VSS, which triggers the NetApp snapshot!).
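
To line up filer events with the backup window, it helps to pull out everything that was logged in the hours before the first replication error. The following is only a minimal sketch, assuming the syslog format shown above; the function names, the two-hour window and the year 2013 are my own assumptions and not anything NetApp or TSM ships.

```python
import re
from datetime import datetime, timedelta

# Matches syslog lines in the format shown above, e.g.
# "Apr  1 21:16:46 fas01 [fas01: lun.offline:warning]: LUN ... has been taken offline"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+"
    r"\[\S+\s+(?P<event>[\w.]+):(?P<severity>\w+)\]:\s+(?P<msg>.*)$"
)


def parse_line(line, year=2013):
    """Return (timestamp, event, severity, message), or None if the line does not match.

    Syslog timestamps carry no year, so one has to be supplied (assumption: 2013).
    Month names are parsed with %b, which expects an English/C locale.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None  # e.g. the tail of a wrapped line
    stamp = " ".join(m["ts"].split())  # collapse the double space in "Apr  1"
    ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return ts, m["event"], m["severity"], m["msg"]


def events_before_first_error(lines, window=timedelta(hours=2)):
    """List everything logged in the `window` leading up to the first :error event."""
    parsed = [p for p in map(parse_line, lines) if p is not None]
    first_error = next((p for p in parsed if p[2] == "error"), None)
    if first_error is None:
        return []
    start = first_error[0] - window
    return [p for p in parsed if start <= p[0] < first_error[0]]
```

Fed the log above, this lists the lun.offline, lun.map, lun.destroy and wafl.snap.delete events that start at 21:16, which is exactly the window of the daily TSM backup.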

After excluding the System State from the daily backup, the SnapVault stuff worked without a hiccup. I ended up removing SnapDrive from the server in question, since we don’t really need it there. Snapshots of the boot LUN are gonna be inconsistent anyhow (doesn’t matter whether I take ’em from SnapDrive or the NetApp CLI).

That restored the default VSS handler, which enables TSM to back up the System State again.

TSM and NetApp – Another Quick Hint

May 27, 2012 (updated June 21, 2013) by Christian

Well, we’ve been trying to come up with a decent way to back up NetApp snapshots to tape (SnapMirror To Tape), so we evaluated all the available methods of using NDMP backups.

There’s Image Backup in two different variants, FULL and DIFFerential
There’s SnapMirror To Tape

So the Image Backup is one of the ways. However, the DIFFerential backup only works for CIFS and NFS shares (which we don’t use). We only have FC LUNs (or rather FCoE LUNs), so there’s only a single file (or, in the case of the boot LUNs, more than one) in each volume. Because of that, each run of the Image Backup with the DIFFerential option is gonna back up the full size of the volume (plus the deduplicated amount).

The SnapMirror To Tape option presents another problem: we intend to use SnapManager for SQL/Oracle, which creates “consistent” snapshots of the database LUNs. However, the SnapMirror To Tape backup doesn’t have an option to use an already existing snapshot; it creates another one instead. That kicks the whole SnapManager business to the curb. So we either SnapMirror to disk from one database LUN to another controller and then run the SnapMirror To Tape from the second controller, or come up with another way to back them up to TSM.

TSM and NetApp – Quick Hint

April 25, 2012 (updated June 21, 2013) by Christian

Well, to save everyone else the trouble (since it isn’t documented anywhere, and I just spent about an hour finding the cause for this): if you need to configure NDMP on your NetApp filer, make sure you also configure an interface other than e0M.

Apparently the control port NDMP needs (10000) is blocked on e0M. NDMP may be configured and running, but TSM is still gonna complain that it is unable to connect to the specified data mover:

ANR4728E Server connection to file server 172.31.76.100 failed. Please check
the attributes of the file server specified during definition of the datamover.
ANR2146E DEFINE DATAMOVER: Node NAS_DM is not registered.
ANS8001I Return code 11.
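
Before digging through the TSM side, it is worth checking whether port 10000 is reachable at all on the address you gave the data mover. Here is a minimal sketch of such a check, meant to be run from the TSM server; the function name `port_open` and the three-second timeout are my own choices, and a successful TCP connect only shows the port is open, not that NDMP authentication will work.

```python
import socket

NDMP_PORT = 10000  # the NDMP control port mentioned above


def port_open(host, port=NDMP_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts alike.
        return False


# Example (address taken from the error message above):
#   port_open("172.31.76.100")
```

If this returns False for the e0M address but True for another interface on the same filer, you have most likely hit the same problem.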

Doing TSM’s job on Windows Server 2008

January 26, 2012 (updated June 21, 2013) by Christian

Ran into another weird problem the other day … Had a few Windows boxen running out of space. Why? Well, because TSM includes a System State backup when creating the daily incremental. Now, apparently (as stated by IBM support) it isn’t TSM’s job to keep track of the VSS snapshots but rather Windows’. By default, unless you set one in the VSS properties of a Windows drive, there is no limit on the shadow storage for the volume. Thus, VSS is slowly eating up all your space.

That isn’t the worst of it, but when you want to delete it all … With Windows 2003 you would just run this:

vssadmin delete shadows /for=C:

However, as with everything Microsoft, Windows 2008 R2 does it a little bit differently. As a matter of fact, it won’t allow you to delete application-triggered snapshots (as you can see in the example below), so you’re basically shit out of luck.

Error: Snapshots were found, but they were outside of your allowed context. Try removing them with the backup
application which created them.

Well, not really … diskshadow to the rescue. Simply run diskshadow with a simple script like this:

diskshadow> delete shadows C:
......

Just for clarification: this isn’t my own work, it was someone else’s.

TSM Client: Service Script for Solaris 10

October 27, 2010 (updated June 21, 2013) by Christian

Today I’ve been fighting with Solaris 10 and the SMF manifest (others would call it an init script …). Since I wanted to do it the proper way (I could have used an “old-style” init script, but I didn’t wanna …), I ended up combing the interweb for examples. As it turns out, not even IBM has documented how to do this.

In the end this is what I’ve come up with:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>

<!--
  mkdir /var/svc/manifest/application/dsmc
  chmod 755 /var/svc/manifest/application/dsmc
  chown root:bin /var/svc/manifest/application/dsmc
  (paste the contents of this file into /var/svc/manifest/application/dsmc/dsmc.xml using vi)
  svccfg -v import /var/svc/manifest/application/dsmc/dsmc.xml
  svcadm enable application/dsmc
-->
<service_bundle type='manifest' name='TIVsmCba:dsmc'>
  <service name='application/dsmc' type='service' version='0'>
    <create_default_instance enabled='true'/>
    <single_instance/>

    <dependency name='fs-local' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/filesystem/local'/>
    </dependency>
    <dependency name='net-physical' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/network/physical'/>
    </dependency>

    <exec_method name='start' type='method' exec='/opt/tivoli/tsm/client/ba/bin/dsmc.helper' timeout_seconds='60'>
      <method_context working_directory='/opt/tivoli/tsm/client/ba/bin'/>
    </exec_method>
    <exec_method name='stop' type='method' exec='/usr/bin/pkill dsmc' timeout_seconds='60'>
      <method_context/>
    </exec_method>

    <stability value='Unstable'/>
    <template>
      <common_name>
        <loctext xml:lang='C'>Tivoli Storage Manager Client Scheduler</loctext>
      </common_name>
    </template>
  </service>
</service_bundle>

However, in order to get the scheduler client working on Solaris, I had to create a little helper script in /opt/tivoli/tsm/client/ba/bin named dsmc.helper:

#!/bin/bash

export DSM_DIR="/opt/tivoli/tsm/client/ba/bin"
export DSM_LOG="/opt/tivoli/tsm/client/ba/bin"
export DSM_CONFIG="/usr/bin/dsm.opt"

"$DSM_DIR/dsmc" sched &>/dev/null &

With that, I was able to automate the TSM Scheduler Client startup on Solaris.

VMware Consolidated Backup and TRANSPORT_MODE=”hotadd”

March 18, 2010 (updated June 21, 2013) by Christian

As the title says, I’ve been playing with vCB (inside a VM) and the TSM integration with newer (>6.0) clients for work. The result of all this work should be a feasibility study. We’re currently thinking about replacing our VMware server(s) with ESXi. But as most of you know, if you install ESXi, you simply can’t install anything on it (well, you can … on ~100KB of disk space, which is nothing compared to a TSM client weighing roughly 120MB!). As we would like to be able to back up VMs at image level, I went looking at solutions.

VMware Data Recovery
VMware Consolidated Backup
vRanger, ……

As I was looking for something that wouldn’t cost us any money (thus excluding the third), I took a look at vDR and vCB. One point I do have to give to vDR is, that it’s damn fast. Only bad thing about vDR is that it doesn’t integrate at all with TSM, and it ain’t supported to install a TSM client inside the vDR VM. So vDR was also done for.

Only remaining thing was vCB. I remember way back when TSM didn’t support vCB directly, at which time it was *quite* the hassle to configure. But with newer TSM clients (as in the newer 6.x ones), IBM decided to integrate support for it. Which makes setting things up quite easy. You may think at least.

Since I wanted to use “hotadd” as the transport mode for the vmdks (which basically creates a snapshot of the vmdk and assigns that snapshot to the vCB VM), I did have to tinker around with some JavaScript files in %ProgramFiles%\VMware\VMware Consolidated Backup. Sure, it isn’t supported by VMware (which is a bit lame since they announced the EOL for vCB with the upcoming vSphere version), but I didn’t want to open a support request. I’m lazy, yep:

Change DEFAULT_TRANSPORT_MODE in utils.js from “san” to “hotadd”. But apparently this only changes the backup method for vmdk-level backups, not for file-level backups. File-level is still gonna use nbd (network block device), which kinda sucks since the backup goes out via the network.

After doing that, the hotadd mode is still gonna fail, since apparently the designated “VMware Consolidated Backup User” (vcb-user in my case) also needs permissions on the datastore. The permissions the handbook sets for the user are okay; you just need to apply that role to the datastore(s) containing the VMs you want to back up, too! Otherwise vcbMounter is gonna fail with a rather cryptic error telling you that it doesn’t have sufficient rights to create a linked clone.

Converting TIVSM RPMs to deb

February 15, 2010 (updated June 21, 2013) by Christian

We received a preinstalled customer server the other day, for which we had declared “as-is” support only, since it is running Lucid Lynx. Now today, I started getting the TSM client to work. Was kinda weird, since at first dsmc was reporting something like this:

# ./dsmc: no such file or directory

After fiddling with it a bit more, here are the control files, as well as the prerm and postinst-scripts for TIVSM-API, TIVSM-API64 and TIVSM-BA:

tivsm-api/debian/control:

Source: tivsm-api
Section: non-free
Priority: extra
Maintainer: root <root@localhost>

Package: tivsm-api
Architecture: all
Depends: ${shlibs:Depends}
Description: IBM Tivoli Storage Manager API

tivsm-api/debian/tivsm-api.postinst:

for library in /opt/tivoli/tsm/client/api/bin/*.so; do
        ln -s $library /usr/lib/${library##*/}
done

# Automatically added by dh_makeshlibs
if [ "$1" = "configure" ]; then
        ldconfig
fi
# End automatically added section

tivsm-api/debian/tivsm-api.prerm:

for library in /opt/tivoli/tsm/client/api/bin/*.so; do
        rm -f /usr/lib/${library##*/}
done

tivsm-api64/debian/control:

Source: tivsm-api64
Section: non-free
Priority: extra
Maintainer: root <root@localhost>

Package: tivsm-api64
Architecture: amd64
Depends: ${shlibs:Depends}
Description: IBM Tivoli Storage Manager API

tivsm-api64/debian/postinst:

for library in /opt/tivoli/tsm/client/api/bin64/*.so; do
        ln -s $library /usr/lib64/${library##*/}
done

# Automatically added by dh_makeshlibs
if [ "$1" = "configure" ]; then
        ldconfig
fi
# End automatically added section

tivsm-api64/debian/prerm:

for library in /opt/tivoli/tsm/client/api/bin64/*.so; do
        rm -f /usr/lib64/${library##*/}
done

# Automatically added by dh_makeshlibs
if [ "$1" = "configure" ]; then
        ldconfig
fi
# End automatically added section

tivsm-ba/debian/control:

Source: tivsm-ba
Section: non-free
Priority: extra
Maintainer: root <root@localhost>

Package: tivsm-ba
Architecture: any
Depends: ${shlibs:Depends}, lib32stdc++6 [amd64], libc6-i386 [amd64], lib32gcc1 [amd64]
Description: IBM Tivoli Storage Manager Client

tivsm-ba/debian/tivsm-ba.postinst:

ln -s /opt/tivoli/tsm/client/lang/EN_US /opt/tivoli/tsm/client/ba/bin/EN_US

for binary in dsmadmc dsmagent dsmc dsmcad dsmj dsmswitch dsmtca dsmtrace; do
        ln -s /opt/tivoli/tsm/client/ba/bin/$binary /usr/bin/$binary
done

tivsm-ba/debian/tivsm-ba.prerm:

rm -f /opt/tivoli/tsm/client/ba/bin/EN_US

for binary in dsmadmc dsmagent dsmc dsmcad dsmj dsmswitch dsmtca dsmtrace; do
        rm -f /usr/bin/$binary
done

All that was left to do was adding a -n to the dh_makeshlibs call in each package’s debian/rules file; otherwise dh_makeshlibs would overwrite my shiny postinst/prerm actions!

TS7530 authentication failure

August 1, 2009 (updated August 8, 2014) by Christian

Today, I had a rather troublesome morning. When I got to work, Nagios was already complaining about lin_taped on one of our TSM servers, which had apparently failed due to too many SCSI resets. Additionally, I couldn’t log in using the VE console (I could log in via SSH, however), so I ended up opening an IBM Electronic Service Call (ESC+).

Using SSH, I can get some information on the VE’s status:

vetapeservice@VTL-A:~> sudo /ve/bin/ve status

IBM VE for Tape Server v3.00 (Build 1465)
Copyright (c) 2001-2008 FalconStor Software. All Rights Reserved.

Status of VE for Tape SNMPD Module                         [RUNNING]
Status of VE for Tape Configuration Module                 [RUNNING]
Status of VE for Tape Base Module                          [RUNNING]
Status of VE for Tape HBA Module                           [RUNNING]
Status of VE for Tape Authentication Module                [RUNNING]
Status of VE for Tape Server (Compression) Module          [RUNNING]
Status of VE for Tape Server (Hifn HW Compression) Module  [RUNNING]
Status of VE for Tape Server (Application Upcall) Module   [RUNNING]
Status of VE for Tape Server (FSNBase) Module              [RUNNING]
Status of VE for Tape Server (Upcall) Module               [RUNNING]
Status of VE for Tape Server (Application) Module          [RUNNING]
Status of VE for Tape Server (Application IOCTL) Module    [RUNNING]
Status of VE for Tape Server (User)                        [STOPPED]
Status of VE for Tape Target Module                        [RUNNING]
Status of VE for Tape Server IMA Daemon                    [RUNNING]
Status of VE for Tape Server RDE Daemon                    [RUNNING]
Status of VE for Tape Communication Module                 [RUNNING]
Status of VE for Tape Logger Module                        [RUNNING]
Status of VE for Tape Call Home Module                     [RUNNING]
Status of VE for Tape Local Client (VBDI)                  [RUNNING]
Status of VE for Tape Self Monitor Module                  [RUNNING]
Status of VE for Tape Failover Module                      [RUNNING]

vetapeservice@VTL-B:~> sudo /ve/bin/ve status

IBM VE for Tape Server v3.00 (Build 1465)
Copyright (c) 2001-2008 FalconStor Software. All Rights Reserved.

Status of VE for Tape SNMPD Module                         [RUNNING]
Status of VE for Tape Configuration Module                 [RUNNING]
Status of VE for Tape Base Module                          [RUNNING]
Status of VE for Tape HBA Module                           [RUNNING]
Status of VE for Tape Authentication Module                [RUNNING]
Status of VE for Tape Server (Compression) Module          [RUNNING]
Status of VE for Tape Server (Hifn HW Compression) Module  [RUNNING]
Status of VE for Tape Server (Application Upcall) Module   [RUNNING]
Status of VE for Tape Server (FSNBase) Module              [RUNNING]
Status of VE for Tape Server (Upcall) Module               [RUNNING]
Status of VE for Tape Server (Application) Module          [RUNNING]
Status of VE for Tape Server (Application IOCTL) Module    [RUNNING]
Status of VE for Tape Server (User)                        [RUNNING]
Status of VE for Tape Target Module                        [RUNNING]
Status of VE for Tape Server IMA Daemon                    [RUNNING]
Status of VE for Tape Server RDE Daemon                    [RUNNING]
Status of VE for Tape Communication Module                 [RUNNING]
Status of VE for Tape Logger Module                        [RUNNING]
Status of VE for Tape Call Home Module                     [RUNNING]
Status of VE for Tape Local Client (VBDI)                  [RUNNING]
Status of VE for Tape Self Monitor Module                  [RUNNING]
Status of VE for Tape Failover Module                      [RUNNING]
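
With twenty-odd lines per node, the one module that differs is easy to overlook. As a rough sketch, assuming the "Status of <module> [STATE]" layout shown above, a few lines of Python can list whatever is not RUNNING; the function name `not_running` is made up for this example and is not part of the VE tooling.

```python
import re

# One line of `ve status` output: "Status of <module name>   [STATE]"
STATUS_RE = re.compile(r"^Status of (?P<module>.+?)\s+\[(?P<state>[A-Z]+)\]\s*$")


def not_running(status_output):
    """Return (module, state) pairs for every module whose state is not RUNNING."""
    problems = []
    for line in status_output.splitlines():
        m = STATUS_RE.match(line.strip())
        if m and m["state"] != "RUNNING":
            problems.append((m["module"], m["state"]))
    return problems
```

Run against the two outputs above, this reports only "VE for Tape Server (User)" as STOPPED on VTL-A and nothing at all for VTL-B.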

After looking a bit deeper, it seems that neither of the two TSM servers is able to see the IBMchanger devices for the first VTL. The second is perfectly visible, just not the first. After putting both VE nodes into suspended failover and gathering support data for IBM support from both VEs and the Brocade SAN switches, apparently everything works again. I guess the library does have “self-healing” properties.

OCF agent for Tivoli Storage Manager: redux

June 5, 2009 (updated August 8, 2014) by Christian

Well, after I finished my first OCF agent back in October 2008, we have had it running in production for about ten months now. During that time, we found quite a few points where we’d like to improve how Linux-HA handles TSM.

Shutdown TSM nicely if possible (Cancel client sessions, cancel running processes and dismount mounted volumes)
Better error handling

So, after another week of writing and testing with a small instance, I present the new OCF agent for Tivoli Storage Manager. It still has one or two weak points, but they are negligible. I still need to write the documentation for it, but the script should just work …

Weird TS3500 problem: redux

June 3, 2009 (updated August 8, 2014) by Christian

Well, after yesterday’s episode with our tape library, today continued to be taxing. After restarting a few exports that were hanging yesterday due to our library problems, something similar came back: TSM was unable to locate a few (two, to be exact) tapes in the library.

ANR8300E I/O error on library LIB3584 (OP=00006C03, CC=314, KEY=05, ASC=3B,
ASCQ=0E, SENSE=70.00.05.00.00.00.00.0A.00.00.00.00.3B.0E.00.C0.00-.04.,
Description=The source slot or drive was empty in an attempt to move a
volume). Refer to Appendix C in the 'Messages' manual for recommended action.
ANR8312E Volume 000400 could not be located in library LIB3584.
ANR8358E Audit operation is required for library LIB3584.
ANR8381E LTO volume 000400 could not be mounted in drive DR6 (/dev/rmt0).
ANR1402W Mount request denied for volume 000400 - volume unavailable.
ANR1410W Access mode for volume 000400 now set to "unavailable".

Yet the library reported the tapes were still inventoried. *shrug* Here we are again, looking completely baffled. After a short while trying to figure out what to do, we went through the Data Cartridge inventory again. As it turns out, through putting the library in “Pause”-Mode and restarting TSM multiple times, TSM apparently completely forgot that it had these tapes put into drives.

After manually moving the tapes back to their home slots via the management interface of the TS3500 and setting the volume access mode back to read-write, everything is fine now and I could finish my pending exports!
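
As a side note, the KEY/ASC/ASCQ values in the ANR8300E message above are plain SCSI sense data, and they say the same thing as TSM’s own description. The following is a small sketch that pulls them out of the message and looks them up; the two tables are tiny excerpts of the standard SCSI sense key and ASC/ASCQ assignments that I typed in by hand, and the function name `decode_sense` is purely illustrative.

```python
import re

# Hand-picked excerpts from the standard SCSI tables, nowhere near complete.
SENSE_KEYS = {
    0x02: "NOT READY",
    0x04: "HARDWARE ERROR",
    0x05: "ILLEGAL REQUEST",
    0x06: "UNIT ATTENTION",
}
ASC_ASCQ = {
    (0x3B, 0x0D): "MEDIUM DESTINATION ELEMENT FULL",
    (0x3B, 0x0E): "MEDIUM SOURCE ELEMENT EMPTY",
}

# \s* also matches the line break TSM puts between ASC and ASCQ.
FIELD_RE = re.compile(
    r"KEY=([0-9A-Fa-f]{2}),\s*ASC=([0-9A-Fa-f]{2}),\s*ASCQ=([0-9A-Fa-f]{2})"
)


def decode_sense(message):
    """Return (sense key text, ASC/ASCQ text) for an ANR8300E message, or None."""
    m = FIELD_RE.search(message)
    if m is None:
        return None
    key, asc, ascq = (int(value, 16) for value in m.groups())
    return (
        SENSE_KEYS.get(key, "unknown sense key"),
        ASC_ASCQ.get((asc, ascq), "not in this excerpt"),
    )
```

For the message above (KEY=05, ASC=3B, ASCQ=0E) this comes out as ILLEGAL REQUEST with MEDIUM SOURCE ELEMENT EMPTY: the library was asked to move a cartridge from a slot that, as far as it was concerned, held nothing.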