Monday, November 10, 2014

A well intentioned public api can bring down a server

Apis are powerful creatures and people can use them to do tons of weird things. We had exposed a public api to create a link but our UI had a capability to select multiple files and generate links for them in bulk in call, so our public api mimicked this behavior.

Today a server was running hot with full GC, I took a heap dump and restarted it. Upon analysing the heap dump in Eclipse Memory Analyzer I found that Sys log appender was choked and it had a queue of 10K messages with each being 2MB. I copied the value and found the class name in log message.

Aparently whenever a link was created a log message was written that would iterate over each file and log a line for each file. There was a bug in the log message that  it would log entire message instead of that file.

for (target in linkRequest.getTargets()) {
 logger.info("queuing preview generation for {}", event) ;
}

QA/developers cant detect this and most people in code review focuses less on logger messages.

But there was a customer who today did some automation and created a link for each file in the folder in one call,  he had 10K files in that folder. What this means is that the 2MB payload for link generation request was logged in for loop 10K times causing full GC.

The fix was simple to just change logger message in for loop but I will talk to API designer to put some upper limit on no of links created in 1 call. 

for (target in linkRequest.getTargets()) {
 logger.info("queuing preview generation for {}", target) ;
}

No comments:

Post a Comment