Update 2: Full Services Restored
Full services have been restored. The outage lasted about 13 hours.
Here’s a breakdown of what happened.
- For a number of reasons (the founder leaving with some broken promises, legacy software, etc.), Haloscan still runs on old hardware that our team has patched and maintained as best it can
- Out of respect for Haloscan users who love the simple system, we put plans to migrate Haloscan users to JS-Kit on hold until JS-Kit could mimic Haloscan 100%. The result has been a longer-than-expected migration process, so the upgrade to the faster, more feature-rich JS-Kit system is long overdue.
- We have 4 database servers for Haloscan
- Master server had a hard drive failure
- Normally this should only have stopped new comments from being written to the system, but the legacy code could not cope, and the result was a full failure (both read AND write)
- Read access (i.e. comments appearing on blogs, but no ability to add new comments) was fixed by our engineer Oleg within ~1 hour.
- We worked in parallel on two tracks, Plan A (rebuilding the server) and Plan B (setting up an EC2 replica), to get things up and running as fast as possible.
- In the process we also promoted one of the remaining replicas to master (and set up the remaining ones as slaves)
- From that moment on, Haloscan was back up and running
- In general two replicas are enough for service to run as expected
- We’ve got the old master server rebuilt with new drives and will add it as another replica within a day or two
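For the technically curious, the behaviour described above (a dead master should only block new comments, and promoting a replica restores writes) can be sketched as a toy model. This is not Haloscan's actual code: the `Server` and `CommentCluster` classes, the server names and the one-master-plus-three-replicas layout are all assumptions made purely for illustration.

```python
class Server:
    """A toy database server holding comments in memory (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.comments = []


class CommentCluster:
    """Hypothetical cluster: writes go to the master, reads to any healthy server."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)

    def write(self, comment):
        # Writes need a healthy master; without one, only writes should fail.
        if not self.master.healthy:
            raise RuntimeError("master is down: new comments cannot be saved")
        self.master.comments.append(comment)
        # Simplified synchronous replication to every healthy replica.
        for replica in self.replicas:
            if replica.healthy:
                replica.comments.append(comment)

    def read(self):
        # Reads fall back to any healthy server, so existing comments stay visible.
        for server in [self.master, *self.replicas]:
            if server.healthy:
                return list(server.comments)
        raise RuntimeError("no healthy server available")

    def promote(self):
        # Promote the first healthy replica to master; the rest stay replicas.
        for replica in self.replicas:
            if replica.healthy:
                self.replicas.remove(replica)
                self.master = replica
                return replica
        raise RuntimeError("no healthy replica to promote")


# Assumed layout: 4 servers, one master and three replicas.
cluster = CommentCluster(Server("db1"), [Server("db2"), Server("db3"), Server("db4")])
cluster.write("first comment")
cluster.master.healthy = False    # simulate the hard drive failure
visible = cluster.read()          # reads are still served by a replica
new_master = cluster.promote()    # one replica becomes the new master
cluster.write("second comment")   # writes work again
```

In this sketch, losing the master leaves the cluster read-only until `promote()` is called, which is the graceful degradation the legacy code failed to deliver.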
To keep this from occurring again, we are going to redouble our efforts to migrate users to the JS-Kit service. We hope that we can minimise changes to our beloved Haloscan, but change is inevitable and, at this point, a welcome relief from the older system for both users and our system admins!
Thank you all for your patience. I have been communicating with you all day via Twitter, Blogger and email, and I’d like to thank you for being gentle with me (for the most part, hah).