Christian – Page 20

Firefox: Hosting Xmarks (formerly Foxmarks) on lighttpd

May 3, 2009May 3, 2009 Christian Leave a comment

Well, I am an enthusiastic user of Xmarks (or Foxmarks) and played with this again and again. So this weekend, I finally decided to do it properly. I sat down, recreated the whole WebDAV stuff (even if I cheated of this HowtoForge article).

Always redirect traffic to HTTPS, since transmitting username and passwords via HTTP ain’t that secure (MITM)

Okay, so here are the shortended setup instructions:

Enable mod_access, mod_auth, mod_redirect and mod_webdav in /etc/lighttpd/lighttpd.conf
Create the necessary directories
Create the htpasswd-file
Configure the redirections

mkdir -p /var/www/dav/{web,auth,sql}
chown -R lighttpd:lighttpd/var/www/dav/{web,sql}
htpasswd -c /var/www/dav/auth/htpasswd chrischie

mkdir -p /var/www/dav/{web,auth,sql}

chown -R lighttpd:lighttpd/var/www/dav/{web,sql}

htpasswd -c /var/www/dav/auth/htpasswd chrischie

Since we just created the necessary directories, as well as a htpasswd-file containing a user we should be able to change the configuration now:

$SERVER["socket"] == ":80" {
    $HTTP["host"] == "dav" {
        url.redirect = ( "^/(.*)" =&gt; "https://%1/$1" )
    }
}

$SERVER["socket"] == ":443" {
    $HTTP["host"] == "dav" {
        webdav.activate = "enable"
        webdav.is-readonly = "disable"
        webdav.sqlite-db-name = "/var/www/dav/sql/sqlite.db"
        auth.backend = "htpasswd"
        auth.backend.htpasswd.userfile = "/var/www/dav/auth/htaccess"
        auth.require = ( "" =&gt; ( "method" =&gt; "basic",
                                 "realm" =&gt; "webdav",
                                 "require" =&gt; "valid-user" ) )
    }
}

$SERVER["socket"] == ":80" {

$HTTP["host"] == "dav" {

url.redirect = ( "^/(.*)" => "https://%1/$1" )

}

$SERVER["socket"] == ":443" {

$HTTP["host"] == "dav" {

webdav.activate = "enable"

webdav.is-readonly = "disable"

webdav.sqlite-db-name = "/var/www/dav/sql/sqlite.db"

auth.backend = "htpasswd"

auth.backend.htpasswd.userfile = "/var/www/dav/auth/htaccess"

auth.require = ( "" => ( "method" => "basic",

"realm" => "webdav",

"require" => "valid-user" ) )

}

Now, just restart the lighttpd service and watch your WebDAV shine. Seriously, there are a couple of things you should be aware of:

When using a home-grown WebDAV server with HTTPS (meaning, custom certificate), Firefox is gonna be blocking the site at first (and Xmarks is gonna fail with a rather cryptic “Error 8172“). Navigate to the URL manually and add an Exception for the certificate.
Before changing the URL’s in Xmarks, I made the error and manually created directories named “bookmarks” and “passwords”, which I then entered in the respective dialogboxes in the settings window. That however made Xmarks cry horribly when running the synchronization.

------ Xmarks/3.1.0 (/Places) starting upload with https://dav ------
&gt;&gt;&gt; PUT https://chrischie@dav/xmarks/bookmarks
&gt;&gt;&gt; Body is: {"commands":[{"action":"insert","nid":"ROOT","args":...
&gt;&gt;&gt; Callback ({status:403, errormsg:""})
Got a 403
False alarm? ({status:403, errormsg:"", auth:(void 0)})
Returned error: Forbidden(403)
Will retry at Sun May 03 2009 16:25:41 GMT+0200

------ Xmarks/3.1.0 (/Places) starting upload with https://dav ------

>>> PUT https://chrischie@dav/xmarks/bookmarks

>>> Body is: {"commands":[{"action":"insert","nid":"ROOT","args":...

>>> Callback ({status:403, errormsg:""})

Got a 403

False alarm? ({status:403, errormsg:"", auth:(void 0)})

Returned error: Forbidden(403)

Will retry at Sun May 03 2009 16:25:41 GMT+0200

After deleting the folders, it works just fine.

TSM: Restoring the database/recovery log to a point-in-time

April 24, 2009June 21, 2013 Christian Leave a comment

Well, my co-worker just called on my cell (it’s Friday, 16:00), and asked me which start-up script he needed to change in order to restore the database. My first response was, “ummm, that’s gonna be hard, we’re using heartbeat”.

Okay, so after a bit of asking I got out of him what he wanted to achieve by changing the start-up script. Apparently he did something to crash Tivoli Storage Manager (or rather repeatedly crash it) and wanted to restore the database. He talked to one of the systems partner we do have (and I’m happy we have them most of the time), who in return told him how to do it, but forgot a minute after he hung up the phone.

So, I went digging while he still was telling me how he got Tivoli to kick his own ass … After a bit, I thought “hrrrrrm, shouldn’t this be covered in the Tivoli documentation ?”, and surprisingly it’s actually covered in the documentation.

It’s actually rather simple.

Stop the dsmserv Linux-HA cluster service (tsm-control ha stop tsm1)
Setup the environment (since we’re running multiple instances of Tivoli Storage Manager – export DSMSERV_DIR, export DSMSERV_CONFIG)
Enter the path of the server
Run dsmserv restore db
Wait some time (took about half an hour to restore the 95G database and the 10G recovery log)
Start the dsmserv Linux-HA cluster service (tsm-control ha start tsm1)
Update the server-to-server communication, since the restore db changes the communication verification token

&gt; tsm-control ha stop tsm1
  - tsm1 (dsmserv) -&gt; ha: [ OK ]
&gt; export DSMSERV_DIR=/opt/tivoli/tsm/server/bin
&gt; export DSMSERV_CONFIG=/opt/tivoli/tsm/server/tsm1/dsmserv.opt
&gt; cd /opt/tivoli/tsm/server/tsm1
&gt; /opt/tivoli/tsm/server/bin/dsmserv restore db todate=TODAY totime=08:00:00 source=dbbackup preview=no
.... wait some time ....
&gt; tsm-control ha start tsm1
  - tsm1 (dsmserv) -&gt; ha: [ OK ]

> tsm-control ha stop tsm1

- tsm1 (dsmserv) -> ha: [ OK ]

> export DSMSERV_DIR=/opt/tivoli/tsm/server/bin

> export DSMSERV_CONFIG=/opt/tivoli/tsm/server/tsm1/dsmserv.opt

> cd /opt/tivoli/tsm/server/tsm1

> /opt/tivoli/tsm/server/bin/dsmserv restore db todate=TODAY totime=08:00:00 source=dbbackup preview=no

.... wait some time ....

> tsm-control ha start tsm1

- tsm1 (dsmserv) -> ha: [ OK ]

Nagios: Service Check Timed Out

April 3, 2009June 21, 2013 Christian 6 Comments

Since I got the pleasure of watching some Windows boxen with Nagios, I took the Windows Update plugin from Michal Jankowski and implemented it. It took me some time, to initially set up the nsclient++ correctly so it just works, but up till now the check plugin sometimes reported the usual “Service Check Timed Out”.

Usually I ended up increasing the cscript timeout, or the nsclient++ socket timeout, but it still kept showing up. Since I rely heavily on my surveillance tools, I have the demand, that as few as possible false positives show up. So I ended up chasing down this error today, and after that I have to say it was quite simple.

In my case, it wasn’t cscript (that timeout is set to 300 seconds), neither nsclient++ (socket timeout is set to 300 seconds too), nor the nrpe plugin itself (that has 300 seconds as well).

As it turns out, Nagios got an additional setting controlling these things, called service_check_timeout which defaults to 60 seconds. Sadly the plugin, or rather Windows needs longer than those 60 seconds to figure out whether or not it needs updating, thus Nagios is killing the plugin and returning a CRITICAL message.

After increasing the value of service_check_timeout that’ll be fixed hopefully.

SLES10: zypper.log

April 3, 2009June 21, 2013 Christian Leave a comment

Well, I just stumbled upon something .. My Nagios at work wasn’t working anymore, and I went looking.

nagios3 ~ [0] &gt; tail -f /var/log/nagios/nagios.log
[1238658394] Error: Unable to save status file: No space left on device
[1238658403] Error: Unable to save status file: No space left on device
[1238658413] Error: Unable to save status file: No space left on device
[1238658423] SERVICE ALERT: tsm1;POWER WARN;OK;SOFT;4;-u OK - 0
[1238658423] Error: Unable to save status file: No space left on device
[1238658433] SERVICE ALERT: tsm2;LOAD;WARNING;SOFT;1;WARNING - load average: 6.25, 5.72, 5.36
[1238658433] Error: Unable to save status file: No space left on device
[1238658443] Error: Unable to save status file: No space left on device
[1238658453] Error: Unable to save status file: No space left on device
[1238658463] Error: Unable

nagios3 ~ [0] > tail -f /var/log/nagios/nagios.log

[1238658394] Error: Unable to save status file: No space left on device

[1238658403] Error: Unable to save status file: No space left on device

[1238658413] Error: Unable to save status file: No space left on device

[1238658423] SERVICE ALERT: tsm1;POWER WARN;OK;SOFT;4;-u OK - 0

[1238658423] Error: Unable to save status file: No space left on device

[1238658433] SERVICE ALERT: tsm2;LOAD;WARNING;SOFT;1;WARNING - load average: 6.25, 5.72, 5.36

[1238658433] Error: Unable to save status file: No space left on device

[1238658443] Error: Unable to save status file: No space left on device

[1238658453] Error: Unable to save status file: No space left on device

[1238658463] Error: Unable

After that, zip – nada. Next thing, check whether or not the device is really full … Okay, df ..

nagios3 ~ [130] &gt; df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             3.5G  1.2G  2.1G  37% /
udev                  506M   88K  506M   1% /dev
/dev/sdb1             7.9G  7.7G     0 100% /var

nagios3 ~ [130] > df -h

Filesystem Size Used Avail Use% Mounted on

/dev/sda2 3.5G 1.2G 2.1G 37% /

udev 506M 88K 506M 1% /dev

/dev/sdb1 7.9G 7.7G 0 100% /var

So, it is actually completely filled up. So, now we need to find who’s hogging the space. Since I had a assumption (pnp4nagios), I went straight for /var/lib …

nagios3 lib [0] &gt; du -sh *
16K     CAM
1.1M    YaST2
8.0K    acpi
4.0K    apache2
28K     autoinstall
16K     dhcpcd
4.0K    empty
96K     hardware
4.0K    logrotate.status
8.0K    misc
78M     mysql
2.1M    nagios
4.0K    net-snmp
4.0K    news
24K     nfs
8.0K    nobody
36K     ntp
4.0K    pam_devperm
824K    php5
359M    pnp4nagios
22M     rpm
28K     scpm
4.0K    smpppd
4.0K    sshd
4.0K    support
8.0K    suseRegister
4.0K    uniconf
4.0K    update-messages
4.0K    wwwrun
33M     zmd
14M     zypp

nagios3 lib [0] > du -sh *

16K CAM

1.1M YaST2

8.0K acpi

4.0K apache2

28K autoinstall

16K dhcpcd

4.0K empty

96K hardware

4.0K logrotate.status

8.0K misc

78M mysql

2.1M nagios

4.0K net-snmp

4.0K news

24K nfs

8.0K nobody

36K ntp

4.0K pam_devperm

824K php5

359M pnp4nagios

22M rpm

28K scpm

4.0K smpppd

4.0K sshd

4.0K support

8.0K suseRegister

4.0K uniconf

4.0K update-messages

4.0K wwwrun

33M zmd

14M zypp

That wasn’t it .. so heading to the next place, that’s suspicious most of the time, /var/log.

nagios3 log [0] &gt; du -sh *
5.2G    YaST2
4.0K    acpid
1.4G    apache2
28K     boot.msg
28K     boot.omsg
4.0K    cups
4.0K    dsmerror.log
148K    dsmsched.log
4.0K    faillog
4.0K    krb5
12K     lastlog
4.0K    localmessages
16K     mail
16K     mail.info
198M    messages
0       mysqld.log
14M     nagios
0       ntp
4.0K    pnp4nagios
4.0K    sa
8.0K    scpm
4.0K    vmdesched.log
16K     vmware-imc
4.0K    vmware-tools-guestd
82M     warn
348K    wtmp
115M    zmd-backend.log
24M     zmd-messages.log

nagios3 log [0] > du -sh *

5.2G YaST2

4.0K acpid

1.4G apache2

28K boot.msg

28K boot.omsg

4.0K cups

4.0K dsmerror.log

148K dsmsched.log

4.0K faillog

4.0K krb5

12K lastlog

4.0K localmessages

16K mail

16K mail.info

198M messages

0 mysqld.log

14M nagios

0 ntp

4.0K pnp4nagios

4.0K sa

8.0K scpm

4.0K vmdesched.log

16K vmware-imc

4.0K vmware-tools-guestd

82M warn

348K wtmp

115M zmd-backend.log

24M zmd-messages.log

I was like “WTF ? 5.2G for YaST2 logs ?” when I initially saw that output … As of now, I got a crontab emptying /var/log/YaST2 every 24 hours …

Nagios: SNMP OID’s for IBM’s RSA II adapter

April 1, 2009June 21, 2013 Christian Leave a comment

Well, after some poking around I finally found some OID’s for the RSA’s (only through these two links: check_rsa_fan and check_rsa_temp).

For Nagios, I dismissed the fans, since the fan speed is only passed on in percent values. So I only added this:

define hostgroup{
  hostgroup_name                  rsa-snmp
  alias                           Remote Supervisor Adapter (allowing SNMP connections)
}

define service{
  use                             generic-perfdata

  check_command                   check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.2.1.1!45!60!°C!Temperature CPU0!
  hostgroup_name                  rsa-snmp
  service_description             TEMP CPU0
}

define service{
  use                             generic-perfdata

  check_command                   check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.2.2.1!45!60!°C!Temperature CPU1!
  hostgroup_name                  rsa-snmp
  service_description             TEMP CPU1
}

define service{
  use                             generic-perfdata

  check_command                   check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.5.1.0!29!35!°C!Temperature Ambient!
  hostgroup_name                  rsa-snmp
  service_description             TEMP AMBIENT
}

define hostgroup{

hostgroup_name rsa-snmp

alias Remote Supervisor Adapter (allowing SNMP connections)

}

define service{

use generic-perfdata

check_command check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.2.1.1!45!60!°C!Temperature CPU0!

hostgroup_name rsa-snmp

service_description TEMP CPU0

}

define service{

use generic-perfdata

check_command check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.2.2.1!45!60!°C!Temperature CPU1!

hostgroup_name rsa-snmp

service_description TEMP CPU1

}

define service{

use generic-perfdata

check_command check_rsa_snmpv1_public!.1.3.6.1.4.1.2.3.51.1.2.1.5.1.0!29!35!°C!Temperature Ambient!

hostgroup_name rsa-snmp

service_description TEMP AMBIENT

}

Oh, and if anyone else is curious like me, here’s the list with the OID’s, courtesy of Gerhard Gschlad and Leonardo Calamai.

For the fans:

Fan1: .1.3.6.1.4.1.2.3.51.1.2.3.1.0
Fan2: .1.3.6.1.4.1.2.3.51.1.2.3.2.0
Fan3: .1.3.6.1.4.1.2.3.51.1.2.3.3.0
Fan4: .1.3.6.1.4.1.2.3.51.1.2.3.4.0
Fan5: .1.3.6.1.4.1.2.3.51.1.2.3.5.0
Fan6: .1.3.6.1.4.1.2.3.51.1.2.3.6.0
Fan7: .1.3.6.1.4.1.2.3.51.1.2.3.7.0
Fan8: .1.3.6.1.4.1.2.3.51.1.2.3.8.0
Fan9: .1.3.6.1.4.1.2.3.51.1.2.3.9.0
Fan10: .1.3.6.1.4.1.2.3.51.1.2.3.10.0
Fan11: .1.3.6.1.4.1.2.3.51.1.2.3.11.0
Fan12: .1.3.6.1.4.1.2.3.51.1.2.3.12.0

Fan1: .1.3.6.1.4.1.2.3.51.1.2.3.1.0

Fan2: .1.3.6.1.4.1.2.3.51.1.2.3.2.0

Fan3: .1.3.6.1.4.1.2.3.51.1.2.3.3.0

Fan4: .1.3.6.1.4.1.2.3.51.1.2.3.4.0

Fan5: .1.3.6.1.4.1.2.3.51.1.2.3.5.0

Fan6: .1.3.6.1.4.1.2.3.51.1.2.3.6.0

Fan7: .1.3.6.1.4.1.2.3.51.1.2.3.7.0

Fan8: .1.3.6.1.4.1.2.3.51.1.2.3.8.0

Fan9: .1.3.6.1.4.1.2.3.51.1.2.3.9.0

Fan10: .1.3.6.1.4.1.2.3.51.1.2.3.10.0

Fan11: .1.3.6.1.4.1.2.3.51.1.2.3.11.0

Fan12: .1.3.6.1.4.1.2.3.51.1.2.3.12.0

And for the temperatures:

CPU1: .1.3.6.1.4.1.2.3.51.1.2.1.2.1.1
CPU2: .1.3.6.1.4.1.2.3.51.1.2.1.2.2.1
CPU3: .1.3.6.1.4.1.2.3.51.1.2.1.2.3.1
CPU4: .1.3.6.1.4.1.2.3.51.1.2.1.2.4.1
Ambient: .1.3.6.1.4.1.2.3.51.1.2.1.5.1.0

CPU1: .1.3.6.1.4.1.2.3.51.1.2.1.2.1.1

CPU2: .1.3.6.1.4.1.2.3.51.1.2.1.2.2.1

CPU3: .1.3.6.1.4.1.2.3.51.1.2.1.2.3.1

CPU4: .1.3.6.1.4.1.2.3.51.1.2.1.2.4.1

Ambient: .1.3.6.1.4.1.2.3.51.1.2.1.5.1.0

I just found a proper list of OID’s for the IBM RSA adapter. That’s rather nice, since I really was looking for the OID’s for the VRM failure OID and other warning/critical events.

RPM spec: Installing a custom init-script

March 26, 2009June 21, 2013 Christian Leave a comment

Well, I’m sitting again here grinding my head on how to fix up a certain package. Now, I had to look it up again, so this time I’m writing it down!

Source1: ${name}.initd
...
install -o root -g root -m 755 %{S:1} $RPM_BUILD_ROOT/etc/init.d/ndo2db

Source1: ${name}.initd

...

install -o root -g root -m 755 %{S:1} $RPM_BUILD_ROOT/etc/init.d/ndo2db

Windows: Running msconfig as non privileged user

March 25, 2009June 21, 2013 Christian Leave a comment

Well, the title is kinda misleading since you need administrator privileges to run msconfig in it’s full scope. But this is just a hint to myself on how to execute msconfig without logging out and then logging in as administrator.

runas /user:Administrator C:WINDOWSpchealthhelpctrbinariesmsconfig.exe

1	runas /user:Administrator C:WINDOWSpchealthhelpctrbinariesmsconfig.exe

Nagios: Watching Clustered environments (the other way)

March 19, 2009June 21, 2013 Christian Leave a comment

Well, recently I stepped up to watch our cluster environments … Michael has a good howto on how to watch Windows Cluster environments in the NSclient++ wiki.

Now, this has it’s own perks … Which I stumbled upon when trying to write a Linux-HA OCF resource agent for the Nagios NRPE server. Combining that Linux-HA with SLES10 is a good thing generally, but using startproc in that resource agent is not such a good idea.

Apparently Novell (or SuSE GmbH) thought it might be wise to include some additional logic into the wrapper. startproc, checkproc and killproc do check for the name of the executable. So if you try to start an additional process with the same name, you need to dig a bit deeper.

For this to work, you need two additional things (quotations directly from man 8 startproc):

-p pid_file
(Former option -f changed due to the LSB specification.) Use an alternate pid file instead of the default (/var/run/<basename>.pid). The pid read from this file is being matched against the pid of running processes that have an executable with specified path of the program. In order to avoid confusion with stale pid files, a not up-to-date pid will be ignored.

Now, then apparently this isn’t enough. startproc is still refusing to start a second process.

-i ignore_file
The pid found in this file is used as session id of the same binary program which should be ignored by startproc.

Linux-HA: Creating a random authkey

March 18, 2009June 21, 2013 Christian Leave a comment

I just looked over the slides of a presentation one of my trainees bought back from Chemnitz, and there was this nifty one-line command, with which you can generate a random sha1sum for your authkeys file.

Now, since I’m a bit lazy here’s the full command line to fill /etc/ha.d/authkeys for you:

node2 ~ [0] &gt; echo &quot;auth 1
1 sha1 $( dd if=/dev/urandom count=4  2&gt; /dev/null | openssl dgst -sha1 )&quot;

1 2	node2 ~ [0] > echo "auth 1 1 sha1 $( dd if=/dev/urandom count=4 2> /dev/null \| openssl dgst -sha1 )"

TSM client: Backing up files with umlauts on SLES

March 2, 2009June 21, 2013 Christian Leave a comment

In the past, I always had problems with SLES and our Tivoli Storage Manager client’s when backing up files with german umlauts. Well, today I looked a bit harder, and quite quickly found a solution.

sles9 root [0] &gt; env | grep ^LC
LC_CTYPE=de_DE.UTF-8

1 2	sles9 root [0] > env \| grep ^LC LC_CTYPE=de_DE.UTF-8

As you can see from the above, SLES9/10 ain’t setting LANG or LC_ALL (which I searched for first), but is setting LC_CTYPE.

So, simply changing the LC_CTYPE in the init-script and/or prepending the dsmc command line with a new LC_CTYPE fixes my umlauts problems!

sles9 root [0] &gt; LC_CTYPE=&quot;en_US&quot; dsmc incr

1	sles9 root [0] > LC_CTYPE="en_US" dsmc incr

Well, I had a long’ish talk with one of my trustworthy IBM senior consultants the day after writing this …

He told me something along the lines of this:

If you would like to back up files with names containing characters with a code > 127 please ensure that you have chosen a SBCS character set for your locale. The default code page C or the code page POSIX supports characters up to 127 only. Files whose names contain special characters will be skipped if C or POSIX is used. It is strongly recommended to perform a system backup by using a SBCS character set to prevent any file or directory from being skipped. This behavior for different locales is intended.

And this:

The UTF-8 locale is default on some Linux platforms. However, TSM Client currently does not support running under UTF-8 locales (such as en_US.UTF-8 and ja_JP.UTF-8). Export your LANG and LC_ALL environment variables to the iso8859-1 or EUC versions of your locale and then start a new xterm (or mlterm) session prior to running TSM Client.

That basically means, at least for using the TSM Client Java Interface (dsmj) and the scheduler/client acceptor daemon you have to switch your locales to something _not_ UTF-8 capable.

He also mentioned, that IBM doesn’t have a real solution for this problem, as well that there is no real workaround. You need to invest some time into figuring out the “right” locale setting for your system(s), since after writing the above I came to the result that it ain’t enough ..

You need to do the following:

lang=&quot;de_DE@euro&quot;
export LC_CTYPE=&quot;$lang&quot;
export LANG=&quot;$lang&quot;
export LC_ALL=&quot;$lang&quot;

lang="de_DE@euro"

export LC_CTYPE="$lang"

export LANG="$lang"

export LC_ALL="$lang"

After doing so, the scheduler and the command-line client works …