As you might recall from my first article about this topic, I had some troubles with the Microsoft Cluster Services and the registration replication. Now, today as we tried switching the TSM-Server for some resources, we ran into this again.
We were using the service install tool (dsmcutil install scheduler) to set the new password as well as the GUI. Now, as we brought the resource online with the local service manager, everything was honky dory. But as soon as we brought it online using the Cluster Manager, it failed horribly. Why ?
Well, as I read the Microsoft KB the last time, I started remembering something about the replication.
- When the resource goes online, the registry keys are updated with the previously checkpointed information.
- When the resource is brought offline, all the checkpoints associated with this resource are saved.
If you manually update these registry keys while the application or service is offline, the changes may not be replicated or may be lost. To prevent this from happening, make any manual changes while the service or application resource is online.
Simply put, when you toggle the resource offline, the cluster saves the registry from the currently running node onto the quorum (checkpoints). As we changed those settings while the resource was offline, it discarded them, as we toggled it back online with the Cluster Manager.
Simple solution: just remove the registry replication parameter when the resource is offline (and click “Apply” and “OK” afterwards). After that update the registry on the cluster node currently owning the physical disk drive (either using the GUI or dsmcutil). Afterwards, re-add the registration key and you should be able to “force” the Microsoft Cluster into thinking that the registry you have on this cluster node is the valid one.