We programmers sometimes add too much defensive code to protect ourselves from callers that do not check preconditions before making a call. For example, if we have to save a file in some directory, we first check whether the directory exists and only then create the file. NFS is not designed to work at cloud scale, and in our thread dumps we saw lots of calls stuck in file.exists. The solution was simple: some of these directories could be created at Tomcat startup, or the app node installer could create them. Other code can simply assume that the directory exists and, if it gets a FileNotFoundException, create the directory and retry the operation. Removing these defensive checks cut a lot of unnecessary stat calls on the filers and improved performance. This is just one example, but a similar pattern can be observed in other areas of the code and fixed the same way. Defensive programming is good, but too much of it is bad, and it can be reduced by making some assumptions or by documenting the API better.
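As a rough sketch of the "assume the directory exists, create it on failure and retry" idea, something like the following could work. The class and method names (`OptimisticFileWriter`, `save`) are made up for illustration and are not from our codebase. It also assumes the java.nio.file API, where a missing parent directory surfaces as NoSuchFileException rather than the FileNotFoundException that the older java.io streams throw.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class OptimisticFileWriter {

    // Hypothetical helper: write first, and only touch the directory
    // if the write fails because the parent directory is missing.
    static void save(Path file, byte[] data) throws IOException {
        try {
            // Happy path: no exists()/stat call before the write.
            Files.write(file, data);
        } catch (NoSuchFileException e) {
            Path parent = file.getParent();
            if (parent == null) {
                throw e; // nothing to create, so rethrow the original error
            }
            // Rare path: create the missing directories, then retry once.
            Files.createDirectories(parent);
            Files.write(file, data);
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("optimistic-demo");
        Path target = base.resolve("reports").resolve("out.txt");
        save(target, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println("written: " + target);
    }
}
```

The point of this shape is that the common case costs a single write, and the extra directory work happens only the first time a directory is missing.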
One of the biggest problems I have been trying to solve at our startup is putting our Tomcat nodes in HA mode. Right now, when a customer comes in, he lands on a node and remains there forever. This has three major issues: 1) We have to overprovision each node so it can handle worst-case capacity. 2) If two or three high-profile customers land on the same node, we need to move them manually. 3) We need to cut over new nodes, and we already have 100+ nodes. It's a pain managing these nodes, and I waste a lot of my time chasing node-specific issues. I loathe it when I know I have to chase one of these env issues. I really hate human intervention; if it were up to me, I would automate everything, enjoy the fruits of automation, and spend quality time on major issues rather than mundane tasks. Call me lazy, but that's a good quality. So finally I am at a stage where I can put nodes behind HAProxy in the QA env. Today we were testing the HA config and the first problem I immediately