My neverending lighttpd troubles

Well, after a day or so my lighttpd troubles reappeared. But this time, the lighttpd process would simply put out this:

And as the message says, PHP (or rather mod_fastcgi?) would simply stop processing requests. In the end, I tuned some of the lighttpd/mod_fastcgi parameters.

So far (I made the change on July 14th), these changes seem to have fixed the issue. I guess I’m still hoping (with the saying “Hope dies last” in mind) that it’s gonna fix my problems once and for all.

Lighttpd issues

At first, it seemed that my lighttpd issues were resolved by updating PHP/remerging lighttpd. But apparently not. After putting in a crontab entry that restarts lighttpd every 15 minutes (which completely sucks), the impact of the issue was minimized, but it wasn’t really solved.

Thanks to Michél (I guess, again), who helped me look at the strace logs, and of course Christian (aka hoffie, one of my old Gentoo buddies), the issue finally seems resolved. It turns out it was neither a PHP nor a lighttpd issue. It was a simple matter of (stale) symlinks in /etc/ssl/certs, if you can imagine that. Apparently a stale symlink forced PHP into a loop or something, from which it couldn’t recover on its own.

So the thank-you probably goes to whoever introduced these lines to the ca-certificates ebuild (I guess that would be vapier, the old code monkey):

After letting the find run through /etc/ssl/certs and restarting lighttpd in the process, everything is back to working order! Finally!
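For anyone who wants to check for the same thing, here is a rough Python sketch of the idea: list the symlinks in a directory whose target no longer exists. This is not the code from the ebuild; the function name `find_stale_symlinks` is made up for illustration, and it only reports the links instead of deleting them.

```python
import os


def find_stale_symlinks(directory):
    """Return the sorted paths of symlinks in `directory` whose target is missing."""
    stale = []
    for entry in os.scandir(directory):
        # is_symlink() looks at the entry itself, while os.path.exists()
        # follows the link, so a dangling link is a symlink that "does not exist".
        if entry.is_symlink() and not os.path.exists(entry.path):
            stale.append(entry.path)
    return sorted(stale)
```

Calling `find_stale_symlinks("/etc/ssl/certs")` would give the list of dangling links to look at before removing anything.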

Lighttpd troubles resolved

Well, after last week’s lighttpd troubles with PHP (or was it without?), they finally seem resolved. The first thing I did was upgrade to the new PHP version (5.2.10). After that, I ran revdep-rebuild, which apparently found issues with lighttpd being linked against the wrong pcre version. After remerging lighttpd, the issues seem to be gone!

Well, I guess I was too quick in saying the problem was resolved … it’s still there, just not happening as fast as it did in the past …

Weird lighttpd troubles

Well, for about a week or so I’ve been having trouble with my vHost and lighttpd. The point being, after some time (so far it’s been anywhere between minutes and days) lighttpd completely freezes and doesn’t serve any content anymore. I don’t know if this is related to PHP (might be, I did perform an update to dev-lang/php-5.2.9-r2 on Thu May 28 12:18:57 2009), but I have to figure this out since the restart cron-job is getting annoying.
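A blind restart could at least be replaced by a probe that checks whether the server still answers at all. The following is only a minimal sketch of such a check, not something I actually run: the function name `server_responds` and the default timeout are assumptions, and any HTTP status (even an error page) counts as "alive", since a frozen server sends nothing back.

```python
import http.client


def server_responds(host, port=80, path="/", timeout=5.0):
    """Return True if the server sends back any HTTP response within `timeout` seconds."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        conn.getresponse()  # any status line at all means the server is not frozen
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, reset, or a garbled reply.
        return False
    finally:
        conn.close()
```

A cron job could call this and only restart the web server when it returns False.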

Well, it seems like lighttpd is getting stuck in mod_fastcgi …

Usually the last line is followed by a line saying that it released the proc, but not always.

Zend Optimizer again

Well, I happen to be back at my favorite application. Today I stumbled upon a “nice” thing. If you turn on the Zend Optimizer (doesn’t matter whether it is 2.6.2 or 3.3.0), one of the TYPO3 back ends ain’t showing *any* content in the preview pane. Once you turn the Zend Optimizer stuff off, it works without a problem.

O RLY ?

And as Zend stated on their “Support Forum”, they don’t really support the Zend Optimizer stuff in the first place. Which is nice. So what do you need the Zend Guard shit for in the first place??

Well, so I do have two options now:

  1. Disable the one plug-in, which really needs the Zend Optimizer (as it also features the Zend De Guard engine – or whatever you want to call it)
  2. or risk some other things breaking due to the Zend Optimizer engine not working (correctly) with php-5.1.2 (which is rather old considering 5.3.0 is in development right now)

But I will see about that tomorrow …

YA RLY!

TYPO3 and MySQL replication

Apparently the TYPO3 version we are using doesn’t play too nicely with MySQL master-master replication.

Sometimes, something like this is going to happen:

Well, as you can see from the last line in the log, the slave SQL thread found a duplicate entry and thought it would be smart to just stop the thread instead of skipping the offending entry. So from then on, the two databases drift apart, since there is no replication anymore until someone kick-starts it again (someone being me).

Anyway, I think I finally tracked the fucker down: supposedly one of the problematic cases is located in t3lib/class.t3lib_tstemplate.php on line 362.

Basically what TYPO3 is doing is a DELETE and an INSERT right afterwards. But apparently, it doesn’t check whether the DELETE even succeeded. I hacked it for now, simply adding this:

Sadly, this looks more and more like a race condition between the two boxes (as in the replication / UPDATE being too slow) when users visit an edited page that hasn’t had its cache regenerated yet. The problem is, it isn’t just this single spot, but also the search indexing, the image cache and the whole page cache. For now we have switched the cluster to active/passive load balancing, until we have a chance to see if a newer TYPO3 fixes those issues.
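To illustrate the DELETE-then-INSERT pattern and one way around the duplicate-key error, here is a small sketch that swaps the two statements for a single upsert. Big caveats: this is not the hack I put into TYPO3, it uses Python’s sqlite3 as a stand-in for MySQL, and the table `cache_demo` and its columns are hypothetical. It only shows that writing the same key twice no longer fails; it makes no claim about fixing the replication race itself.

```python
import sqlite3


def store_cache_entry(conn, key, content):
    """Write a cache row in one statement instead of DELETE followed by INSERT."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO cache_demo (hash, content) VALUES (?, ?)",
            (key, content),
        )


# Hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache_demo (hash TEXT PRIMARY KEY, content TEXT)")

store_cache_entry(conn, "abc", "old page")
store_cache_entry(conn, "abc", "new page")  # same key again, no duplicate-key error

rows = conn.execute("SELECT hash, content FROM cache_demo").fetchall()
print(rows)
```

MySQL has its own statements for this kind of thing (REPLACE, or INSERT … ON DUPLICATE KEY UPDATE), so the same idea should carry over, but I haven’t tried that against the TYPO3 cache tables.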

SLES, ZendOptimizer and IBM PowerPC(4)+

What would you figure from the above ? Hopefully the rather obvious, that it’s a *really* shitty combination.

So we figured it would be a nice thing to test our new setup before going into pre-production testing or production, but we don’t have a spare box. So we took one of the power4 boxes we have mounted in the rack, basically consuming energy all day (that’s about 38 kWh a day), and installed SLES10 onto it. Which wasn’t all that bad, although at first the box repeatedly booted back into AIX, both from CD and after we convinced the SMS with a hammer to boot from the first hard disk. (SMS, also known as System Management Services, is basically the BIOS on the power* boxes.)

The real bad part started later. First the box committed suicide sometime on the weekend (the last one that is), which is rather not so good.

So we installed the ocfs2-tools (which are obviously needed if you want to do writes on a SAN volume mounted on two separate boxes), configured the o2cb thing to start automatically on boot and added the entry to /etc/fstab.

So far so good, but as we slowly activated the apache-vhosts, we finally came to what cost me about three damned hours of my life:

Now guess what … ZendOptimizer just went bye-bye … Damn and what now ? So I looked at the Knowledgebase on zend.com, even found an Article stating it’d do that from time to time

And attached was the usual crap: “Please update to the latest version”. The only problem with that is that the latest version is indeed available for x86_64 (meaning amd64 in Gentoo terms), but not for ppc (even though the product page states it should be).

So I went home, knowing what the problem is – since it was already past 4pm – swearing a short “frack that”.

Now that I’m home, have eaten something (a rather good salad), am listening to some Korn/Kid Rock/Offspring and have done some undertaker’s work, I asked myself “Why exactly do we need that crappy application anyway?” (beyond the obvious point that the ZendOptimizer is like, or is, a PHP compiler cache).

It turns out, one of my co-workers wrote a TYPO3-plugin interfacing our local research database .. and the catchy thing is, guess what …

He “guarded” it with ZendGuard, thus we need to use the ZendOptimizer thingy; otherwise we couldn’t use it either … 😯

O RLY ?