For the past few months, all our efforts have been concentrated on boosting uptime. We are doing this by building redundancy around every hardware and software component that can fail, and that includes the mail storage boxes. We decided to invest in building what we internally call “rabbit pairs”: powerful boxes with huge storage capacity (upwards of 18 TB raw) and lightning-fast IO, achieved through tiered storage (a mix of SSDs and slower SATA disks).
Where does the redundancy come from?
Here’s how. The rabbits are currently deployed in an Active-Passive setup in which mail data is synced (resynced, to be precise) to the slave regularly. So if we want to take a box down, we switch over to the slave with near-zero service downtime. That’s not all! We are also almost production-ready with our clustered filesystem implementation, which could theoretically make this setup Active-Active. Our first step, though, will be to deploy the clustered filesystem architecture in Active-Passive mode, so that we have a failover box that is hot and ready to take over whenever the master fails. This would bring our storage to at least the desired level of fail-safety. The net result: your mail should never go down!
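For the curious, the Active-Passive idea can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not our actual implementation: the names `StorageBox`, `ActivePassivePair`, `rabbit-a` and `rabbit-b` are hypothetical, and an in-memory dictionary stands in for real mail storage.

```python
from dataclasses import dataclass, field


@dataclass
class StorageBox:
    """Hypothetical stand-in for one mail storage box."""
    name: str
    healthy: bool = True
    mail: dict = field(default_factory=dict)  # message id -> message body


class ActivePassivePair:
    """Two boxes: one serves traffic, the other receives periodic syncs."""

    def __init__(self, master: StorageBox, slave: StorageBox):
        self.active = master
        self.passive = slave

    def store(self, message_id: str, body: str) -> None:
        # All writes go to the active box only.
        self.active.mail[message_id] = body

    def sync(self) -> None:
        # Periodic one-way copy from the active box to the passive box.
        self.passive.mail = dict(self.active.mail)

    def switch_over(self) -> None:
        # Promote the passive box; the old active box becomes the standby.
        self.active, self.passive = self.passive, self.active

    def fail_over_if_needed(self) -> bool:
        # Switch only when the active box is down and the standby is usable.
        if not self.active.healthy and self.passive.healthy:
            self.switch_over()
            return True
        return False


# Example run (assumed names and data, for illustration only).
pair = ActivePassivePair(StorageBox("rabbit-a"), StorageBox("rabbit-b"))
pair.store("msg-1", "hello")
pair.sync()
pair.store("msg-2", "written after the last sync")
pair.active.healthy = False      # simulate the master failing
pair.fail_over_if_needed()
print(pair.active.name, sorted(pair.active.mail))
```

In this toy run the standby takes over, but it only holds what the last sync copied, so `msg-2` is missing after the switch. A periodic sync always leaves a gap like this, which is one reason a shared, clustered filesystem is an attractive next step.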
Once this setup is successfully deployed in production (hopefully in 3-4 weeks), we’ll migrate accounts from the older infrastructure to the rabbits. Get ready for at least a 10-fold performance improvement once you move to the new setup. We are all excited to go live with this. We’re just waiting for our Operations team to weave their magic.
A bunch of us have also been busy automating the migration process. The cool part is that we’ve minimized the downtime required for the migrations, so you won’t even notice that your email has moved to the new infrastructure.
The next few weeks are going to be exciting for us, and hopefully this culminates in rock-solid email services for your organization.