To avoid the thundering herd problem, we allow only X writes and Y reads to the MySQL database from each node. I recently introduced HA into our Tomcat stack, which reduced the number of nodes by 60%. HA is helpful, but it means one customer can now hog all the threads in the cluster. Before HA this would bring down only one node, but now it has the potential to bring down a quarter of the data centre.
To avoid this issue I looked at various alternatives, and the idea I settled on was a fair share thread pool that would put an upper bound on the number of threads per customer. It was becoming too complex, though, and I was not getting anywhere. I kept it on the back burner, and then the worst happened: yesterday we had downtime because one bad customer gobbled up all the reader threads.
So in a crunch I came up with a Java fair share thread pool approach by implementing a pool of thread pools. Each customer on our site has a random UUID called customerId, and all read/write methods take it as an argument. I used AOP to intercept all methods at one layer and created 4 pools; when a method is called, I hash the customerId and pick one of the 4 pools based on the modulus of the hash. I know it's not a foolproof solution, since a bad customer can still starve the others that hash to the same pool, but it should buy me enough time to come up with the right solution, and if it works I won't even need to think of one ;).
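Here is a minimal sketch of the pool-of-thread-pools idea, not the code that went live. The class name `ShardedPools`, its methods, and the fixed-size pools are my assumptions for illustration, and the AOP interception is left out; the caller passes the customerId directly. It uses `Math.floorMod` because `hashCode()` can be negative and a plain `%` would then give a negative index.

```java
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: routes each customer's work to one of N fixed-size pools. */
final class ShardedPools {
    private final ExecutorService[] pools;

    /** poolCount and threadsPerPool are illustrative parameters, e.g. 4 pools. */
    ShardedPools(int poolCount, int threadsPerPool) {
        pools = new ExecutorService[poolCount];
        for (int i = 0; i < poolCount; i++) {
            pools[i] = Executors.newFixedThreadPool(threadsPerPool);
        }
    }

    /** The same customerId always maps to the same index in [0, poolCount). */
    int poolIndexFor(UUID customerId) {
        return Math.floorMod(customerId.hashCode(), pools.length);
    }

    /** Runs the task on the pool that owns this customer. */
    <T> Future<T> submit(UUID customerId, Callable<T> task) {
        return pools[poolIndexFor(customerId)].submit(task);
    }

    /** Stops accepting new tasks; already submitted tasks still run. */
    void shutdown() {
        for (ExecutorService pool : pools) {
            pool.shutdown();
        }
    }
}
```

With something like `new ShardedPools(4, threadsPerPool)`, a customer that floods the system can exhaust only the pool its customerId hashes to, so the other three pools keep serving everyone else.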
The code went live last night, and so far I see calls being spread across all 4 pools.