02-23-2005, 07:22 PM   #3
idx
Not sure if anyone is interested, but it's been an interesting ride (almost there).

I ended up installing Spread as the base for internal messaging between nodes and wrote a little Perl daemon to handle things like forcing backup nodes to sync, reload their daemon, shut down, etc.
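A rough sketch of the command-handling side of such a daemon, in Python rather than the Perl the post mentions. The command names (`SYNC`, `RELOAD`, `SHUTDOWN`) and the `make_dispatcher` helper are made up for illustration, the handlers only record what was requested, and the Spread connection itself is left out.

```python
from typing import Callable, Dict, List


def make_dispatcher(log: List[str]) -> Callable[[str], bool]:
    """Return a function that runs the handler for one incoming message."""
    # Hypothetical command names; the post only says the daemon can force a
    # sync, reload the daemon, or shut a node down.
    handlers: Dict[str, Callable[[], None]] = {
        "SYNC": lambda: log.append("sync requested"),
        "RELOAD": lambda: log.append("daemon reload requested"),
        "SHUTDOWN": lambda: log.append("shutdown requested"),
    }

    def dispatch(message: str) -> bool:
        handler = handlers.get(message.strip().upper())
        if handler is None:
            return False  # unknown command: ignore it rather than guess
        handler()
        return True

    return dispatch
```

In a real daemon each handler would run the actual sync or restart script, and `dispatch` would be called with the payload of every message received from the messaging layer.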

So on boot, a script checks the spread room to see if another server thinks it's the master. If so, and that server has a lower priority, it sends me an email and sits there. From there I can manually kick off a sync script so the more powerful machine syncs and then takes over as the master.
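The boot-time decision could be sketched roughly like this. Only the "lower-priority master is already running" branch comes from the post. The other two branches, the `boot_action` name, and the convention that a larger number means higher priority are assumptions.

```python
from typing import Optional


def boot_action(my_priority: int, master_priority: Optional[int]) -> str:
    """Decide what a freshly booted node should do.

    master_priority is the priority of the node currently claiming to be
    master, or None if nobody claims it. Larger number = higher priority
    (an assumption, the post does not say).
    """
    if master_priority is None:
        # Assumption: with no master around, this node takes the role.
        return "become_master"
    if master_priority < my_priority:
        # From the post: email the admin and wait for a manual sync/takeover.
        return "notify_admin_and_wait"
    # Assumption: an equal or higher-priority master stays in charge.
    return "join_as_backup"
```

Keeping the takeover manual, as described, avoids a stale machine grabbing the master role before it has caught up on data.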

As for the failover issue with MySQL, it seemed to work best to have the new master run a `reset master;`, then contact each of the other backup nodes so they change master (replication runs on a network internal to the cluster, so there's no virtual IP there) and run a `reset slave;`. Otherwise the slave SQL thread on the backup DBs kept halting due to duplicated binlog statements.
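As a sketch, the statements for one failover could be generated per host like this. `RESET MASTER`, `RESET SLAVE` and `CHANGE MASTER TO` are the statements the post names. The surrounding `STOP SLAVE` / `START SLAVE`, the order of the slave-side statements, the host names, and the `failover_statements` helper are assumptions, and credentials are left out.

```python
import re
from typing import Dict, List

# Only accept plain host names / IPs, since the value is pasted into SQL.
_HOST_RE = re.compile(r"^[A-Za-z0-9.\-]+$")


def failover_statements(new_master: str, backups: List[str]) -> Dict[str, List[str]]:
    """Map each host to the SQL statements it should run during failover."""
    for host in [new_master, *backups]:
        if not _HOST_RE.match(host):
            raise ValueError(f"unexpected host name: {host!r}")

    # The new master starts a fresh binlog so slaves do not replay old events.
    plan: Dict[str, List[str]] = {new_master: ["RESET MASTER"]}
    for host in backups:
        if host == new_master:
            continue  # the new master does not replicate from itself
        plan[host] = [
            "STOP SLAVE",   # assumption: stop replication before repointing
            "RESET SLAVE",  # forget the old master's binlog position
            f"CHANGE MASTER TO MASTER_HOST='{new_master}'",
            "START SLAVE",  # assumption: resume replication afterwards
        ]
    return plan
```

For example, `failover_statements("10.0.0.2", ["10.0.0.3"])` gives one statement for the new master and four for the backup. Each list would then be sent to its host over whatever channel the cluster already uses.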

If a node gets too far out of sync and has missed some updates, a simple `load data from master;` does the trick.
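One way to decide that a node is "too far out of sync" is to look at a row of `SHOW SLAVE STATUS` output. The sketch below assumes that row has already been fetched into a dict of strings. The 300-second threshold and the `should_reload` name are made up, and treating a stopped replication thread as grounds for a reload is an assumption. (`LOAD DATA FROM MASTER` itself was removed from later MySQL releases, so this only applies to servers of that era.)

```python
from typing import Mapping, Optional


def should_reload(status: Mapping[str, Optional[str]], max_lag_seconds: int = 300) -> bool:
    """Return True if the slave looks too far behind to catch up on its own."""
    # Assumption: a stopped IO or SQL thread means updates are being missed.
    if status.get("Slave_IO_Running") != "Yes":
        return True
    if status.get("Slave_SQL_Running") != "Yes":
        return True

    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        return True  # NULL lag: the server cannot tell how far behind it is
    return int(lag) > max_lag_seconds
```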

Also ended up installing the Hobbit monitoring system as a replacement for Big Brother. Really nice stuff, and it works well during failover.

Still think I need to get more in tune with MySQL replication, but it's working well enough to let the users loose on it.

In limited testing it seems to be pretty seamless. I set up a page to refresh every second, then shut down one node, and the next starts working without much delay. (The page increments a session var; sessions were moved to the DB, so nothing is lost during failover.) It sometimes takes 3-5s, but that's not bad for cheapo hardware and my shoddy code.
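To put a number on the failover delay, a test client could log each request as a (timestamp, succeeded) pair and compute the longest outage afterwards. This is only a sketch of that calculation. The `longest_outage` name and the sample format are assumptions, and the HTTP polling is not shown.

```python
from typing import Iterable, Optional, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Longest time between the last good request before a run of failures
    and the first good request after it. Samples must be in time order."""
    longest = 0.0
    last_ok: Optional[float] = None
    saw_failure = False
    for timestamp, ok in samples:
        if ok:
            if saw_failure and last_ok is not None:
                longest = max(longest, timestamp - last_ok)
            last_ok = timestamp
            saw_failure = False
        else:
            saw_failure = True
    return longest
```

With one request per second, three failed requests in a row show up as an outage of about four seconds, which is in line with the 3-5s seen in the manual test.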


-r