VMware design rules

I just got back from four days in Rostock over at S&N, where I attended a VMware design course. Here’s a list of questions I asked the trainer:

  1. What’s the disadvantage of having a 1016-port vSwitch?
  2. Any clues on how to replace the default certificate of VirtualCenter?
  3. Are there any tools to stress-test the virtual system?
  4. Are there any performance impacts of having more than 10 users in VirtualCenter?
  5. Any clues and/or guides on how to do time synchronization in VMware guests, especially Linux guests?
  6. What’s the preferred NIC type for Linux guests?
  7. Any clues on using Raw Device Mappings with VMotion?
  8. Is there a way of defining CPU masks on a global level?

Answers:

  1. There might be a small overhead, though it’s limited to a practically non-measurable amount
  2. He hasn’t done it yet.
  3. Yes, there are free stress-test tools like cpubusy.vbs, cpubusy.pl, iometer.exe, …
  4. Nope, you should only experience load problems starting at around 25 users
  5. Select *one* variant, either time synchronization via the VMware Tools or via NTP in the guest; if you go with NTP, select a single time source for your whole environment (see the ntp.conf sketch below this list)
  6. For ESX 3.5.0 that would be “Flexible” (as per the VMware Knowledge Base), as the vmxnet type is a leftover from ESX 3.0
  7. Raw Device Mappings are *absolutely* supported by VMware, and also work without any trouble (when mapping/zoning is correctly configured)
  8. Currently there’s no known way of doing this
  • When adjusting the CPU affinity of a VM, *always* completely power off the virtual machine afterwards
  • When trying to figure out CPU bottlenecks, check whether or not hyperthreading is enabled. The hyperthreaded (second) logical core only gives you about 15% of the performance of the first.
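
For item 5, here’s a minimal sketch of what a guest-side NTP setup could look like, assuming a single internal time source (ntp1.example.com is just a placeholder) and assuming you disable the VMware Tools time sync so only one mechanism touches the clock:

    # /etc/ntp.conf inside the Linux guest
    # let ntpd correct even large offsets instead of giving up
    tinker panic 0
    # single time source for the whole environment (placeholder hostname)
    server ntp1.example.com iburst
    driftfile /var/lib/ntp/ntp.drift

    # and in the VM's .vmx file (or via the VMware Tools toolbox),
    # turn off the Tools-based synchronization:
    # tools.syncTime = "FALSE"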

Also, here are some guidelines on how the trainer extended the defaults:

ESX Server:

  • Extend the “/” size to 10GiB
  • Extend the “swap” partition to about 1GiB
  • Extend the “/var/log” partition to about 4GiB
  • Don’t mess around with creating too many vSwitches; just keep it simple
  • Set the duplex mode manually if the ESX Server is giving you any trouble (see the esxcfg-nics example below this list)
  • Disable Traffic Shaping unless you *really* need it
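
For the duplex point, a quick sketch from the ESX 3.x service console (vmnic0 and the speed are just examples; check what you actually have first):

    # list physical NICs with their current link, speed and duplex settings
    esxcfg-nics -l
    # force vmnic0 to 1000 Mbit/s full duplex instead of auto-negotiation
    esxcfg-nics -s 1000 -d full vmnic0
    # switch back to auto-negotiation if that turns out worse
    esxcfg-nics -a vmnic0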

VirtualCenter:

  • There are two options when installing VirtualCenter: either install it on a physical box or simply put it into a virtual machine itself
  • A problem with putting it into a virtual machine is that when the VM is shut down or powered off due to isolation of the ESX host running it, an ESX Server powering up isn’t going to start any virtual machines, as that in turn requires the License Server (as Michael pointed out in #c1, the VM is still gonna start, as the HA agent is able to start virtual machines on the basis of the 14-day license grace period)
  • Only use the SQL Server Express variant if you really have to. It’s limited to a 4GB database size, so if your installation grows beyond, say, 50 hosts and 2000 VMs, it’s gonna break the limits of SQL Server Express
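
If you do end up on SQL Server Express, a simple way to keep an eye on the 4GB limit is to check the database size now and then; a sketch, assuming the common default instance and database names (adjust both to whatever your installation actually uses):

    rem check the VirtualCenter database size against the 4GB Express limit
    sqlcmd -S .\SQLEXP_VIM -d VCDB -Q "EXEC sp_spaceused"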

open-vm-tools for Debian Etch

Well, after a loooong time of trying to get the modules and all the other stuff (read: the init script for the guest daemon, plus the modules) working, I think I’m about there.

I finally fixed a long-standing issue with the postinst/prerm scripts, and the tools should be about ready. Gonna try and send it Daniel Baumann’s way (that is, the Debian maintainer) for proper inclusion into Lenny.

I (successfully) tried splitting the Xorg parts from the “normal” open-vm-tools, as I usually don’t want Xorg installed on *any* of my virtual machines. That leaves me with open-vm-tools, open-vm-modules and open-vm-toolbox (and open-vm-source) as the list of packages one could install.

Windows Server 2008

Well, as it is Saturday and I have lots of time (which I’d usually spend working), I thought I’d give Windows Server 2008 a try. What interested me most is the Windows Server 2008 Server Core installation, as it’s supposed to lower the security risk (there is *no* Internet Explorer, no Explorer, nothing running by default, only a simple cmd.exe).

As one of my co-workers asked me to upload the Standard/Enterprise/Datacenter DVD (which he got through our Microsoft Select 6.0(?) agreement) to our ISO VMFS volume, I already had the DVD at hand. For things like that, I *really* love the feature set of VMware.

Deploying a new VM (even if you have to reinstall it) is quite fast – it took me about 20 minutes, which I used to get some breakfast (it was only 6:30am). That’s about when I figured out how damn greedy Windows Server 2008 is: a 16GiB hard disk for a default installation and 2GiB of RAM for a simple server? Damn.

Been a while

Well, it’s been quite a while since most people last heard a word from me. For the last few months I’ve been extremely busy with work-related tasks (and as a side effect of that, I didn’t want to spend much time in front of the computer after nine hours of work). I also started spending more and more time in the gym, nearly two hours every Tuesday and Thursday.

  • I finally fixed our replication issues; we now have a working(!) MySQL multi-master replication setup (1st node, 2nd node – bear in mind, these boxes are *only* serving MySQL and nothing else, so don’t use these configurations on mixed setups) as the database back end for our TYPO3 vHosts (the relevant my.cnf bits are sketched after this list)
  • All the web nodes are now serving their content from a clustered, shared SAN volume (is that a good thing? 😛 – don’t know yet …)
  • Our VI environment is getting more and more acceptance (even if you hear some complaints now and then, like “awww, damn, my 4GiB RAM, 2×3.0GHz Windows 2008 VM is running soooo choppy” – simple answer: don’t use Windows Server 2008 and/or Windows Vista!)
  • I finished prepping our VM templates (at least the Windows ones)
  • We’re still putting together plans on whether or not to invest in a VDI solution.
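
Here’s roughly what the relevant my.cnf bits of such a two-node multi-master setup look like – a sketch, not our exact configuration; server IDs, offsets and the log name are just examples:

    # node 1 (node 2 uses server-id = 2 and auto_increment_offset = 2)
    [mysqld]
    server-id                = 1
    log-bin                  = mysql-bin     # binary log, required to act as a master
    # keep AUTO_INCREMENT values from colliding between the two masters
    auto_increment_increment = 2
    auto_increment_offset    = 1

Each node is then pointed at the other with CHANGE MASTER TO … and START SLAVE, so writes on either box end up on both.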

The next few weeks are gonna be as frantic as the weeks before; I still have to migrate a lot of TYPO3 installations to our new cluster (which sadly takes time, as we need to wait for DNS changes to propagate). Honestly, I might end up extending the SAN volume for the MySQL data storage, as even with only three somewhat busy sites, the binary log of the last five days is about 2GiB in size. And we still have ~20 other busy sites on a separate box.

Lucky me, I created the MySQL data storage on a logical volume, so I can easily extend the volume in the SAN manager semi-online (the file system needs to be unmounted, and thus the MySQL process stopped), then extend the physical volume (LVM2 PV) and the logical volume (LV) afterwards, and finally the underlying ext3 file system.
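
Roughly, the procedure looks like the sketch below – device, volume group and mount point names are placeholders, the +20G is just an example, and depending on the kernel you may also need to rescan the SCSI device before LVM sees the grown LUN:

    # stop MySQL and unmount the data file system
    /etc/init.d/mysql stop
    umount /var/lib/mysql

    # after growing the LUN in the SAN manager, let LVM pick up the new size
    pvresize /dev/sdb

    # grow the logical volume, then check and grow the ext3 file system
    lvextend -L +20G /dev/vg_mysql/lv_data
    e2fsck -f /dev/vg_mysql/lv_data
    resize2fs /dev/vg_mysql/lv_data

    # remount and bring MySQL back up
    mount /var/lib/mysql
    /etc/init.d/mysql start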

As some of you know by now, I am on extended leave for now. I don’t have tree access (at my own request), though I’m gonna try to keep up with Chris and 2008.0 … So long!

Deploying VM templates

OK, so after my first day back yesterday from a rather long vacation, today I wanted to look at the problem that the Administrator password isn’t changed when using VirtualCenter’s clone customization functionality (which, at least for Windows, relies on sysprep).

After a bit of googling, I stumbled upon this.

Long story short … don’t specify an Administrator password for the template. Then you should be able to change the Administrator password when cloning the template. It’s “should“, as the VMs are still updating.

And it really works. After emptying the Administrator password, the cloning works just fine. Damn sysprep bug …

Waiting

We are still waiting for the money promised by the state and the federal government for our HBFG (again, that’s “Hochschulbauförderungsgesetz”), which will hopefully reduce or eliminate the storage/SAN problem we currently have. Right now we have two Cisco MDS9216 (that’s a 16-port 2Gbps SAN switch, two of them for redundancy), which means we only have 16 SAN ports. That isn’t much, and it’s too few, as we have around 30 machines that *really* need access to the SAN, so we either end up unplugging some of them from the SAN or merging them onto some big machines (like our x366).

The other side of the problem is the storage … Currently it isn’t redundant, which means we’re fucked if the storage decides not to come up, or one of the controllers smokes … So we’re looking at two DS4700s with two enclosures each, filled with 300GB 2Gbps FC disks. That will hopefully also solve our constant lack of rack space.

Apart from that, we took a look at the terminal server market, heard someone from Citrix, and looked at 2X ourselves (and I think we are going with the 2X solution – even if they don’t support authentication passthrough yet). We might want to consider buying dedicated hardware for the terminal servers, as I currently have them running on the ESX hosts, which isn’t a permanent solution: the students alone will be working on those terminal servers from 07:00 to 22:00, which means continuous load during that time, and that isn’t good for the ESX Cluster, as the hosts are pretty loaded already.

We’re also looking into buying a third box for the ESX Cluster, probably the same as we currently have (that is, an x366 – with 2 dual-core Xeons, 16GB RAM, 2×73GB SAS, 2x dual-port Intel NIC, 2x dual-port FC HBA) to get some extra capacity.

Recently I did some experiments with Gentoo as a MySQL cluster (master<->master replication for our upcoming database servers – that’s what the blade chassis and the two blades are for) and noticed that the Gentoo VMs were sucking up RAM and not releasing it, so I had to reset them every morning to free some RAM. I guess I should poke Chris a bit about that, as he told me back at FOSDEM that he’d been doing some load testing with a similar setup not so long ago.

Mood sucks

christel, do you remember the mood swings we were talking about?

I think I’m undergoing just another one 🙁 I’m currently pretty much pissed off. Basically everything is pestering me at the moment (except #gentoo-dev and Gentoo work).

Work just ripped another piece out of me (hah, thanks VMware & Big Blue). I started the day with an ice-cold shower (and when I say ice cold, I mean ice cold); they’re currently replacing our old gas heating and unfortunately that means no warm water at all! *argh*

Life sucks (again)

Now is again one of those times in life where you have the urge or the wish to just fade away.

I’m just listening to Fort Minor – Where’d You Go and thinking about the stuff Mike Shinoda is singing …

I want you to know it’s a little fucked up,
That I’m stuck here waitin’, no longer debatin’,
Tired of sittin’ and hatin’ and makin’ these excuses,
For while you’re not around, and feeling so useless,
It seems one thing has been true all along,
You don’t really know what you got ’til it’s gone.

The track is really great; I really enjoy it. Some people I know would call it dark or even depressive, but it ain’t like that. It’s just a bit blue, like craving for someone is. 🙂

I didn’t think that telling someone you feel a bit more than friendship would change so much. So much that even simply talking to each other isn’t possible any more. Oy, that sucks.

And then you get to hear that it’s only for your sake, so it won’t hurt so much. What a lame excuse.

On the other hand, I’m really starting to enjoy my life. Watching the third season of Stargate, which DHL delivered just yesterday. Work is getting more exciting.

Hopefully the new IBM System x (well, we ordered xSeries 😉) will be delivered next week; I’m already studying the ESX documentation.

LinuxTag – part 2

I’m sitting in the S8 to Frankfurt Airport, where I’ll switch to the ICE to Stuttgart to visit my cousins and my aunt. LinuxTag was quite amazing: I finally met some of the people behind OpenVZ (Kir and Kirill), saw a bit of Andrew Morton’s kernel FAQ (Kir told us about that) and met some people of the Linux-VServer community, including Bertl, doener, derjohn, zeng, foo, … Both workshops were quite interesting and I learned a lot about OpenVZ and its userland tools and about Linux-VServer (I finally understood the CPU token bucket system).

Even though I didn’t arrive in time to watch Kir and Kirill’s presentation of OpenVZ and its features in full, I managed to watch Kir demonstrating the live migration between two different nodes. Even though Kirill needed to reboot his system due to a read-only filesystem (it was / that was the whole bugger), I have to admit it really impressed me (since that’s a feature we had to pay €3000 for with VMware ESX – and no, I don’t want to hear a single word about it). Sadly the OpenVZ stuff isn’t ported to SPARC yet, so I’ll keep VServer running on the U1 (Ultra1). I also met Hollow in person, which really was the highlight of it all. He was my mentor when I joined Gentoo and is the person with whom I’m doing most of my Gentoo / Linux-VServer / OpenVZ related work. Bertl’s talk took nearly four hours, but those four hours were quite informative and interesting. He gave a general introduction to virtualization theory (which took him two hours). After a small fifteen-minute break he demonstrated most of the things possible with Linux-VServer (including resource limits to kill certain memory/CPU hogs).

The demonstration ended at 18:10 and we went back up to the Linux-VServer booth, where I finally managed to ask Bertl about his patch naming and versioning scheme. And I finally understood it!

We also stopped by the SWsoft booth to say goodbye to Kir and Kirill and to talk about the SRPMS, but they had already left. We took some group photos of everyone present at the Linux-VServer booth. Afterwards Hollow and I grabbed our backpacks and took off to the station. On the way we had a little discussion about problems and stuff we had recently noticed. First was the /dev/console virtualization effort, since we switched from init-style Gentoo (which we removed from the utils) to plain; the virtualization would be useful if you want to be able to see what’s happening during the startup phase of a vServer. Second was the reintroduction of the fastboot bug (that’s what I call it): the util-vserver package leaves a plain, empty file in the guest’s root filesystem, which really annoys me. The third thing is the “vserver-init.$(mktemp)” file that is placed in /tmp but isn’t deleted after startup is complete. Another thing we talked about was “vserver stop”, which only waits for the vkill timeout to kick in but isn’t going to stop the vServer by itself.