We ran into an issue where customers complained that random logins were slow while subsequent requests were fast. It took a long time to debug because it seemed totally random.
I finally found the cause: because we give each customer a unique subdomain like XXX.yyy.com, pagespeed was caching the aggregated bundles per subdomain. When we set up pagespeed we never configured the cache size, so it was using the default of 100M. Before we did Tomcat HA, domains were pinned to a node, so each node only cached bundles for its own domains and we never ran into the issue. After HA any customer can be served from any node, so every node ends up caching bundles for every subdomain and the cache outgrows the limit. Every hour the cache was flushed, and domains whose bundles had been evicted would see this random slow login.
It took almost 2-3 hours to debug, and the only reason I was able to figure it out was that I asked myself: if I had to write pagespeed, how would I write it? I also ran du on the cache and it was 267M; luckily I then saw in the apache error logs that the cache clean had run, ran du again, and it was 97M. I feel like I accomplished something today :).
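The du check above can be scripted so it is easy to repeat before and after a cache clean. This is only a rough sketch: the function names `dir_size_bytes` and `over_limit` are made up for illustration, the cache directory is whatever path your pagespeed file cache uses, and the 100M limit is simply the default mentioned above. It sums apparent file sizes, so the number may differ a little from what du reports in disk blocks.

```python
import os
import sys


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root (apparent size, not disk blocks)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks so nothing outside the cache is counted.
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # Files can vanish mid-walk if a cache clean is running.
                pass
    return total


def over_limit(root, limit_bytes):
    """Return True when the directory is larger than the configured limit."""
    return dir_size_bytes(root) > limit_bytes


if __name__ == "__main__" and len(sys.argv) > 1:
    cache_dir = sys.argv[1]        # hypothetical: path to the pagespeed file cache
    limit = 100 * 1024 * 1024      # assumption: the 100M default from this post
    size = dir_size_bytes(cache_dir)
    status = "over" if size > limit else "within"
    print(f"{cache_dir}: {size / (1024 * 1024):.1f}M ({status} the {limit // (1024 * 1024)}M limit)")
```

Running it against the cache directory shortly before and after the hourly clean should show the same kind of drop I saw with du, which is a quick way to confirm the cache is being filled past its limit.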