One of the biggest problems I have been trying to solve at our startup is putting our Tomcat nodes in HA mode. Right now, when a customer comes in, he lands on one node and stays there forever. This causes three major issues:
1) We have to overprovision each node so it can handle worst-case capacity.
2) If two or three high-profile customers land on the same node, we have to move them manually.
3) We need to cut over new nodes, and we already have 100+ nodes.
It's a pain managing these nodes, and I waste a lot of my time chasing node-specific issues. I loathe it when I know I have to chase one of these env issues.
I really hate human intervention. If it were up to me, I would automate everything, enjoy the fruits of automation, and spend quality time on major issues rather than mundane tasks. Call me lazy, but that's a good quality.
So finally I am at a stage where I can put nodes behind HAProxy in the QA env. Today we were testing the HA config, and the first problem I saw is that we have to use sticky sessions because of a design issue that will take a long time to solve. We were doing sticky sessions by JSESSIONID like
appsession JSESSIONID len 32 timeout 12h request-learn
I immediately realized that two Tomcats can generate the same JSESSIONID, so there is a potential for a security breach. Thank god Tomcat has a way to add a node identifier to the JSESSIONID to solve this issue :). You can go to conf/server.xml and add a jvmRoute attribute to the Engine element.
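A minimal sketch of that Engine element, assuming the stock name and defaultHost values from a default Tomcat server.xml (yours may differ). The jvmRoute value here is just the node identifier that shows up in the sample session ID, so use whatever uniquely identifies each node:

```xml
<!-- conf/server.xml: assumed stock Engine element; only jvmRoute is added -->
<!-- jvmRoute must be unique per Tomcat node -->
<Engine name="Catalina" defaultHost="localhost" jvmRoute="192.155.4.5_8080">
  <!-- existing Realm / Host elements stay unchanged -->
</Engine>
```

Tomcat appends a dot and the jvmRoute value to every session ID it generates, so each node needs its own value and a restart to pick it up.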
This way your JSESSIONID is generated like BBF8B5EF74EAAECE0278DC92A9F1353D.192.155.4.5_8080
As a side effect, we can now tell which node is serving a request just by looking at the cookie, which will help in troubleshooting node-specific issues.
I hope to cut the 100+ nodes down to 40 after this HA work. I will keep 40 because we have pods/farms in each DC and need to overprovision each pod; otherwise this could have been reduced to 10 or 15 nodes. The pods are there to avoid a DC meltdown.