
Posts

Showing posts from August, 2013

Jenkins archiving artifacts outside your workspace

It seems that in Jenkins you can't really archive artifacts that live outside your workspace. I had a requirement to start Tomcat (job1), then run WebDriver tests (job2, which runs on a slave), and finally archive the Tomcat logs (job3). But Tomcat lives outside the workspace of both job1 and job3. The solution turns out to be simple: add a shell step to your job that creates a soft link from the outside folder into your workspace (e.g. something along the lines of ln -s /path/to/tomcat/logs "$WORKSPACE/tomcat-logs"), and then archive the artifacts through that soft link.

Debugging random webdriver issues on jenkins

One of my friends was facing an issue where he had written a bunch of WebDriver tests that all work fine, but when they run on a Jenkins slave they fail randomly. The job runs hourly, and the problem is that it fails maybe 4 times in 24 hours. So how do you debug that? He was adding loggers to the tests and then plodding through logs to figure out what went wrong, which is quite a bit of guesswork, and I thought a picture is sometimes better than 1000 words. It would be nice to take a screenshot when a test errors out and save it as an artifact, and it turns out WebDriver already has an API for that. So all I needed to do was add a TestRule like this and add the screenshots directory to the published artifacts in Jenkins. I will know in a week or so whether this saves him a lot of time.     @Rule     public TestRule testWatcher = new TestWatcher() {       ...
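A minimal sketch of such a rule, assuming JUnit 4 and Selenium's TakesScreenshot API (the test class name, the driver setup, and the screenshots directory are placeholders):

    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;
    import org.junit.Rule;
    import org.junit.rules.TestRule;
    import org.junit.rules.TestWatcher;
    import org.junit.runner.Description;
    import org.openqa.selenium.OutputType;
    import org.openqa.selenium.TakesScreenshot;
    import org.openqa.selenium.WebDriver;

    public class SomeWebDriverTest {

        private WebDriver driver; // created in @Before and quit in @After (omitted here)

        @Rule
        public TestRule testWatcher = new TestWatcher() {
            @Override
            protected void failed(Throwable e, Description description) {
                try {
                    // Grab the browser state at the moment of failure and drop it into
                    // a folder that the Jenkins job publishes as an artifact.
                    File src = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
                    Files.createDirectories(Paths.get("screenshots"));
                    Files.copy(src.toPath(),
                            Paths.get("screenshots", description.getMethodName() + ".png"),
                            StandardCopyOption.REPLACE_EXISTING);
                } catch (Exception ignored) {
                    // Never let the screenshot capture hide the original test failure.
                }
            }
        };
    }

With something like this in place, pointing the job's "Archive the artifacts" pattern at screenshots/*.png attaches the failure screenshots to every build.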

Pagespeed and cache flush

Ran into an issue where customers would complain that random logins are slow and then subsequent requests are fast. It took a long time to debug because it was totally random. I finally found that because we give each customer a unique subdomain like XXX.yyy.com, PageSpeed was caching the aggregated bundles per subdomain. When we configured PageSpeed we never set the cache size, so by default it was limited to 100M. Before we did Tomcat HA, domains were pinned to a node, so we never ran into the issue; but after HA any customer can be served from any node, so every hour the cache was flushed and domains would see this random slow login. It took almost 2-3 hours to debug, and the only reason I was able to figure it out was that I kept thinking: if I had to write PageSpeed, how would I write it? I also ran a du on the cache and it was 267M, and luckily I saw in the Apache error logs that the cache clean had run, and ran du again a...

Customers are smarter than you

We just finished a call with a big customer who found 4 security issues in the product. While fixing them took only 1 hour, they were real issues. We had hired third-party consultants to monitor security issues in our product after every release, but it seems they gave the green signal and the customer still found issues. So in short, customers are always smarter than you, and there is a lot to learn from them if you keep your eyes and ears open.

Spring: manually applying an interceptor to a bean

We had a weird requirement where the same Spring bean needs to be deployed on two different node types (storage and metadata). When it's deployed on a metadata node it needs to talk to a specific shard database, and when it's deployed on a storage node it can talk to multiple shard databases. To achieve this on the metadata node we wrote our own transaction annotation and applied a TransactionInterceptor to the bean in Spring that would start a transaction before every method. Now, in order for the same bean on a storage node to talk to different databases over different transaction managers, we created a pool of beans at startup, each with its own interceptor. There we had to hand-create the beans, but now the problem was how to manually apply the same AOP interceptor. We tried using ProxyFactoryBean but it was not easy, and then my colleague landed on this, which saved the day.                 ProxyFactory proxyFactory = new ProxyFactory(sqlDirectory...
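A rough sketch of the ProxyFactory approach (the helper class and method names here are made up, and I'm assuming one PlatformTransactionManager per shard database):

    import org.springframework.aop.framework.ProxyFactory;
    import org.springframework.transaction.PlatformTransactionManager;
    import org.springframework.transaction.interceptor.MatchAlwaysTransactionAttributeSource;
    import org.springframework.transaction.interceptor.TransactionInterceptor;

    public final class ShardProxies {

        private ShardProxies() {}

        // Wraps a hand-created DAO with the same transaction advice the metadata
        // node gets declaratively, but bound to this particular shard's manager.
        @SuppressWarnings("unchecked")
        public static <T> T wrapWithTransactions(T target, PlatformTransactionManager shardTxManager) {
            TransactionInterceptor txInterceptor = new TransactionInterceptor(
                    shardTxManager, new MatchAlwaysTransactionAttributeSource());

            ProxyFactory proxyFactory = new ProxyFactory(target);
            proxyFactory.addAdvice(txInterceptor);
            proxyFactory.setProxyTargetClass(true); // CGLIB proxy, so no interface is required

            return (T) proxyFactory.getProxy();
        }
    }

The pool on the storage node then just keeps one wrapped bean per shard, each bound to that shard's own transaction manager.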

Abnormal data migration

Cloud storage is a funny and interesting field. I just analyzed one data pattern where a customer sent a 4 TB hard drive, and migrating just 1.2TB of it into the cloud created 5M files. Doing some rough calculation, the average size came out to be around 300KB, which is weird. Digging deep into the system revealed that the customer had scanned all his documents into TIF files, and the average size ranged from 10KB to 100KB to 300KB. Wth. They also had a special LFT or loft file that was 4112 bytes, and there were 2M of them. As of right now the sharding approach I implemented pins a customer to a shard, and that means if we migrate the entire 4TB we would end up with 30M+ files. Life is going to be interesting in the next few months. It seems the solution I built a year back is already reaching its limits, and I need some other solution to federate a customer's data across multiple shards and machines while still being able to do a consistent MySQL backup and replication and also do a 2 phase commit...