Programming fun at startup

Posts

Showing posts from 2010

slf4j over sysout not thread safe

I started using sysout-over-slf4j to redirect all System.out and System.err output to log files instead of catalina.out, but it is not thread safe: I started getting the exception below. For now the solution was to patch the library and make the LoggerAppenderImpl.appendAndLog and LoggerAppenderImpl.append methods synchronized, relying on the assumption that we won't have too many System.out calls in third-party libraries.

    java.lang.StringIndexOutOfBoundsException: String index out of range: 117
        at java.lang.String.<init>(String.java:212)
        at java.lang.StringBuilder.toString(StringBuilder.java:430)
        at uk.org.lidalia.sysoutslf4j.context.LoggerAppenderImpl.flushBuffer(LoggerAppenderImpl.java:62)
        at uk.org.lidalia.sysoutslf4j.context.LoggerAppenderImpl.appendAndLog(LoggerAppenderImpl.java:57)
        at uk.org.lidalia.sysoutslf4j.system.SLF4JPrintStreamDelegate.appendAndLog(SLF4JPrintStreamDelegate.java:76)
        at uk.org.lidalia.sysoutslf4j.system.SLF4JPrintStreamDelegate.delegatePrintln(SLF4JPrintStreamDelegate.java:

Jersey mapped to /* but tomcat to serve other static content

I ran into an interesting issue where Jersey had to be mapped to /* to keep the REST URLs simple: instead of http://foo.bar.com/rest/HelloWorld, the REST URLs had to be http://foo.bar.com/HelloWorld. But the JSPs, and in the local dev environment all static content, still needed to be served by Tomcat. While searching for a solution I ran into the thread "Jersey mapped to all url and tomcat to serve JSP". This thread on Nabble made my day, as I had been looking for a solution for 2-3 days but was trying all the wrong options.

Securing JSESSIONID cookie if tomcat is fronted by apache

In my previous post, Creating a custom Valve in tomcat, I described the valve solution I tried to secure the JSESSIONID cookie, but it didn't work, so finally I had to patch the Tomcat class to get it working. We are using tomcat 5.5.28, so I downloaded the source and modified the class apache-tomcat-5.5.24-src/container/catalina/src/share/org/apache/catalina/connector/Response.java. In the addCookie method I had to add this code:

    String cookieName = cookie.getName();
    if (request != null && "JSESSIONID".equals(cookieName)) {
        // X-Forwarded-For is present when the request came through a proxy
        String clientId = request.getHeader("X-Forwarded-For");
        if (clientId != null) {
            cookie.setSecure(true);
        }
    }

Then compile the code and replace catalina.jar in tomcathome/server/lib.

Tomcat creating a custom Valve

I recently tried registering an app with Salesforce and they reported a security vulnerability: the JSESSIONID cookie was not marked secure. The app uses https, but the JSESSIONID cookie is created by Tomcat. The app is fronted by Apache, and the Apache-Tomcat connector is not secure. There were various solutions:

1) Adding secure="true" on the http connector, but it didn't work; somehow it used to work in older Tomcat but not in the version of Tomcat we use.
2) Writing an Apache module to rewrite the Set-Cookie header, but that is too complex.
3) I tried implementing a filter that wraps the HttpServletResponse and overrides the setHeader method, but unfortunately by the time the call reaches the filter Tomcat has already added the cookie to the response, and if I added another one, two cookies were sent, one with the secure attribute and one without, which defeats the purpose.

Here I thought Valves would come to the rescue, so I implemented a tomcat Valve (unfortu

Can't believe you can run windows program on ubuntu

I wanted to use SQLyog on Ubuntu, and googling it I stumbled upon "wine". I couldn't believe that it would work, but it works like a charm :). Here is a link on how to run SQLyog on Ubuntu: http://andrewault.blogspot.com/2008/12/sqlyog-on-ubuntu.html

Python adding pid file

I have a thumbnail generator that launches multiple processes, and the correct way to shut it down is to send kill -HUP to the parent process. To automate this I had to write a pid file from Python, and it was a piece of cake:

    import os

    def writePidFile():
        # Record the parent process id so a script can later send it kill -HUP
        pid = str(os.getpid())
        f = open('thumbnail_rabbit_consumer.pid', 'w')
        f.write(pid)
        f.close()
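A slightly more defensive variant of the same idea, as a minimal sketch in Python 3. The helper names `write_pid_file` and `read_pid_file` and the cleanup on exit are assumptions added for illustration; they are not part of the original script.

```python
import atexit
import os


def write_pid_file(path):
    """Write the current process id to `path` and remove the file on normal exit."""
    with open(path, "w") as f:
        f.write(str(os.getpid()))

    def _remove():
        # Ignore the case where the file was already deleted by someone else.
        try:
            os.remove(path)
        except FileNotFoundError:
            pass

    atexit.register(_remove)
    return path


def read_pid_file(path):
    """Return the pid stored in `path` as an int."""
    with open(path) as f:
        return int(f.read().strip())
```

A shutdown script could then run something like kill -HUP $(cat thumbnail_rabbit_consumer.pid).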

Use Log5j to eliminate isDebugEnabled checks

We use log4j, and the code is littered with if(logger.isDebugEnabled()) checks to avoid string concatenation when the log level is INFO, but many developers forget to add this check and most of the time the problem goes unnoticed. Every penny counts when the system is under heavy load. I ran across Log5j, which solves this problem by delegating string interpolation to the log5j API, which can skip the interpolation if the log level is not met. From the Log5j website:

in log4j:

    log.debug("This thing broke: " + foo + " due to bar");

in log5j:

    log.debug("This thing broke: %s due to bar", foo);
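The same deferred-formatting idea can be observed in Python's standard logging module, which is used here only as an analogy (it is not Log5j). In this sketch the `Expensive` class and the logger name `lazy_demo` are made up for illustration; the counter shows whether the argument was ever turned into a string.

```python
import logging


class Expensive:
    """Stand-in for an object whose string form is costly to build."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive-value"


log = logging.getLogger("lazy_demo")
log.setLevel(logging.INFO)  # DEBUG messages are disabled
log.addHandler(logging.NullHandler())

foo = Expensive()

# Eager: the message is concatenated before debug() is even called.
log.debug("This thing broke: " + str(foo) + " due to bar")
eager_calls = foo.str_calls

# Deferred: the logger checks the level first and never formats the message.
log.debug("This thing broke: %s due to bar", foo)
deferred_calls = foo.str_calls - eager_calls
```

With the level at INFO, the eager call pays for the string conversion even though nothing is logged, while the deferred call does not.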

Disabling Tomcat session persistence across restart

Tomcat's default StandardManager preserves sessions across Tomcat restarts; we had a special requirement to disable this feature. The way to do this is to add a Manager tag with pathname="" under the Context element in server.xml:

    <Context path="" docBase="ROOT" debug="0" privileged="true">
        <Manager pathname="" />
        .........
    </Context>

Missing exception traces

This was an interesting issue. We were seeing exceptions in the log with empty stack traces, even though we had called logger.error(e) or e.printStackTrace(). Our CTO found out that apparently it's a JVM optimization by Sun (now Oracle, lol). From http://java.sun.com/j2se/1.5.0/relnotes.html he found:

<<< The compiler in the server VM now provides correct stack backtraces for all "cold" built-in exceptions. For performance purposes, when such an exception is thrown a few times, the method may be recompiled. After recompilation, the compiler may choose a faster tactic using preallocated exceptions that do not provide a stack trace. To disable completely the use of preallocated exceptions, use this new flag: -XX:-OmitStackTraceInFastThrow. >>>

Kudos to him, as we were thinking it was some DWR library issue eating the trace.

Little annoyances in middle of important work

I get annoyed when I am in the middle of a coding session and people ask the same question they asked a few days back. It seems that if you answer it for them, they won't put in the effort to look for the answer. Some of the questions people ask me are:

How do I sign up a new account on this branch?
How do I revert a changelist?
How do I restart tomcat on the QA env?

I have noticed that if, instead of answering the question on IM immediately, you wait 2 minutes, then about 90% of the time they will have figured out the solution themselves. Another thing that has worked for me is to create a wiki page for common questions and solutions and refer people to the page.

Looks like Google voice is going to compete big time with Skype

When Skype calling was launched it was free for the remaining part of the year, and now Google Voice is introducing free calls for the year for the US and Canada. I just tried the call quality from my Ubuntu machine and it rocks.

RabbitMQ synchronously consume messages

In my previous post I described a scenario for synchronously consuming messages from RabbitMQ. Here is a way to do it in Python. To keep things simple I just wrote a dummy program that dumps the content of a RabbitMQ queue; you can use the same program to remove a message from the queue by iterating over all messages and acknowledging the one to remove. (It's a dumb implementation, so don't judge the coding; the intent is to demonstrate synchronous consumption of queue contents.)

    import sys
    from amqplib import client_0_8 as amqp

    messageIdToRemove = None
    chan = None

    def process_message(msg):
        print "================================================="
        print "Properties ="
        print msg.properties
        print "Body="
        print msg.body
        if op == "remove_message":
            if messageIdToRemove == msg.properties['message_id']:
                print "@@@@@@removing message@@@@@@@@@@@@@@@@@@"
                chan.basic_ack(msg.del

RabbitMQ retrying failed messages

We are a hybrid cloud file server company, and recently we had a requirement to allow users to view a file with Google Docs; upon saving the file in Google Docs, we need to download the file back and create a version in the cloud file server. Upon saving the file in Google Docs we insert a message in RabbitMQ from the app nodes, and then a background GoogleDocs consumer process pulls the file and creates a version using the REST API of the cloud file server. As there are many components involved, there can be multiple failure scenarios, from Google throttling us, to app servers going down for maintenance, to app servers throttling the background jobs if they are under heavy load. The problem with RabbitMQ is that once a message is delivered to a consumer, even if the consumer doesn't acknowledge it, RabbitMQ won't redeliver the unacked message to consumers until the channel is properly closed and reopened. I tried checking rabbit transactions api to rollback the transac

IE CSS size limit

We ran into an interesting issue with IE. We aggregate CSS at build time per JSP file, and recently we introduced jQuery and ColorBox into the current release, which caused the CSS size to grow beyond 300K. Everything worked fine locally, as we don't do aggregation in development mode, but things wouldn't work in the QA environment on IE, while Firefox worked fine. Some of the images wouldn't show up in IE no matter what you did, and one of the developers found that IE ignores any CSS beyond 288 KB; even gzipping the content doesn't help, because what matters is the final size that IE has to parse. Fortunately the solution was to just create two aggregate CSS files instead of one, and that solved the issue. More details on this are at http://joshua.perina.com/africa/gambia/fajara/post/internet-explorer-css-file-size-limit
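One way a build step could decide where to split the aggregate, as a rough sketch in Python 3. The function name `split_into_bundles` and the file names in the usage note are hypothetical; the 288 KB figure is the limit reported above, and sizes are assumed to be uncompressed bytes.

```python
def split_into_bundles(files, limit_bytes=288 * 1024):
    """Group (name, size_in_bytes) pairs, in order, into bundles under limit_bytes.

    Order is preserved because CSS rules that come later can override
    earlier ones, so reordering files could change how the page renders.
    """
    bundles, current, current_size = [], [], 0
    for name, size in files:
        if size > limit_bytes:
            raise ValueError("%s alone exceeds the limit" % name)
        # Start a new bundle when adding this file would cross the limit.
        if current and current_size + size > limit_bytes:
            bundles.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        bundles.append(current)
    return bundles
```

For example, three files of 200 KB, 80 KB and 50 KB would end up as two bundles: the first two files together, and the third on its own.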

SVN revert a changelist

Dumping RabbitMQ queue contents

Apparently there is no easy way to look at the content of messages in RabbitMQ. list_queues will only give you a count; if you want to look at the content of all the messages, there is no easy way. I had a few thumbnail generation messages in the queue that the code failed to ACK because of exceptions in the code. Now I want to see which files are stuck waiting for thumbnail generation. The best way is to restart the program so that Rabbit redelivers the messages, but we can't restart a live system. So the trick is to write a Python program that consumes the messages but doesn't acknowledge them.

    import sys
    from amqplib import client_0_8 as amqp

    def process_message(msg):
        print "================================================="
        print "Properties ="
        print msg.properties
        print "Body="
        print msg.body

    if __name__ == '__main__':
        if len(sys.argv) < 5:
            print "Usage python list_queue_messages.py mq_url mq_user mq_

Parsing xml file using SAX file in python

Again, I love the simplicity of Python. I had an XML file that can range from 1KB to 100MB, so I can't use ElementTree. Python SAX seems similar to Java's, minus the verbose nature of Java. My XML file looks something like:

    <restorezipmeta messageid="UUID" outputzippath="test.zip" resultendpoint="http://XXX.7080/rest/private/RestoreRestService/1.0" userid="1">
        <restoreentry logicalpath="/Shared/kpatel/CustomRules.rtf" physicalpath="/home/kpatel/try/About_Ubuntu_[Russian].rtf"></restoreentry>
        <restoreentry logicalpath="/Shared/kpatel/BrowserPAC.doc" physicalpath="/home/kpatel/try/Derivatives_of_Ubuntu.doc"></restoreentry>
    </restorezipmeta>

Here is a small sample that parses the XML using the SAX API and creates the zip. All you need to do is create the ContentHandler, like you do in Java.

    import zipfile
    import os, sys
    import log
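Since the excerpt stops before the handler itself, here is a minimal sketch in Python 3 of what a ContentHandler for this XML shape could look like. It is not the original code: the class name `RestoreEntryHandler` and the function `parse_restore_meta` are hypothetical, and for brevity the entries are collected in a list rather than written to a zip.

```python
import xml.sax


class RestoreEntryHandler(xml.sax.ContentHandler):
    """Collects (logicalpath, physicalpath) pairs from restoreentry elements."""

    def __init__(self):
        super().__init__()
        self.output_zip_path = None
        self.entries = []

    def startElement(self, name, attrs):
        # SAX calls this once per opening tag, so memory use stays small
        # even when the document itself is large.
        if name == "restorezipmeta":
            self.output_zip_path = attrs.get("outputzippath")
        elif name == "restoreentry":
            self.entries.append((attrs["logicalpath"], attrs["physicalpath"]))


def parse_restore_meta(xml_bytes):
    """Parse the XML held in xml_bytes and return the populated handler."""
    handler = RestoreEntryHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler
```

For a 100MB input one would probably add each entry to the zip inside startElement instead of building the list, and use xml.sax.parse with a file path so the document is streamed from disk.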

creating zip file in python and compression

I am a Java programmer but sometimes program in Python. I must say Python makes it very easy to do things that require very verbose code in Java. To create a zip file in Python all you need to do is:

    import zipfile

    restoreZip = zipfile.ZipFile("test.zip", "w")
    restoreZip.write(physicalPathOfFile, logicalPathOfFileInZip)
    restoreZip.close()

Python doesn't compress by default, so you need to pass a compression mode when opening the zip file. Changing the open call to the following solved the issue:

    restoreZip = zipfile.ZipFile("test.zip", "w", zipfile.ZIP_DEFLATED)
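To see the difference between the default mode and ZIP_DEFLATED, here is a small sketch in Python 3 that zips the same file both ways and compares the archive sizes. The helper `zip_size` and the sample data are made up for illustration.

```python
import os
import tempfile
import zipfile


def zip_size(source_path, compression):
    """Zip one file with the given compression mode and return the archive size."""
    fd, zip_path = tempfile.mkstemp(suffix=".zip")
    os.close(fd)
    try:
        with zipfile.ZipFile(zip_path, "w", compression) as zf:
            zf.write(source_path, "logical/name.txt")
        return os.path.getsize(zip_path)
    finally:
        os.remove(zip_path)


# Highly repetitive sample data, so the effect of DEFLATE is easy to see.
fd, sample = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("hello zip\n" * 10000)

stored_size = zip_size(sample, zipfile.ZIP_STORED)      # the default mode
deflated_size = zip_size(sample, zipfile.ZIP_DEFLATED)  # requires zlib
os.remove(sample)
```

The stored archive is slightly larger than the input file, while the deflated one is much smaller for data this repetitive.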

Java redis client for beginners

I had to use Redis in our project from both Python and Java. In this post I will cover a basic example of generating counters using Redis. I had to use Cassandra to store data, but Cassandra doesn't have auto-increment counters at the moment, so we will use Redis until a new version of Cassandra supports them. Redis has a memcache-like API, but the advantage is that it has atomic operations and the data is saved, so it survives server restarts. The only disadvantage I see is that the Java client doesn't support consistent hashing yet, but eventually it will.

Install the Redis server by following http://code.google.com/p/redis/wiki/QuickStart
Run it using ./redis-server
Download the JRedis client from http://github.com/alphazero/jredis/downloads
Run the below program.

    import org.jredis.JRedis;
    import org.jredis.RedisException;
    import org.jredis.connector.ConnectionSpec;
    import org.jredis.ri.alphazero.JRedisService;
    import org.jredis.ri.alphazero.connection.DefaultConnectionSpe

Eclipse disable pydev CTRL+SHIFT+T and enable Java open type binding

I am a Java programmer but I can code in Python too, so for some work I installed PyDev in Eclipse, but it messed up my Open Type (CTRL+SHIFT+T) binding: when I now hit it, a popup asks whether to open the Python class browser or Java Open Type. Of course I hate it, because 99% of the time I want Java Open Type. The way to get it working again was to go to Window->Preferences->General->Keys, find the Python "Show Class Browser" binding, and instead of "In Windows" choose "Pydev editor scope", and you are all set.

Ubuntu VirtualBox and no sound in Windows7 issue

My Windows 7 crashed, and as we do most of our development in Ubuntu it made sense to install Ubuntu as the host. But when I installed Windows 7 as a VirtualBox guest there was no sound in it. Thanks to the Ubuntu community I found the solution; it was tedious due to my own mistake, but finally I did it. The way to do it is:

1) Make sure you have the proper VirtualBox version. I had the OSE version, but my Ubuntu is the Lucid release, so I needed the VirtualBox build for Lucid. If you get "Error: Conflicts with the installed package 'virtualbox-ose'", that means you have to use Applications->Ubuntu Software Centre to uninstall all virtualbox OSE packages (in my case there were 3 packages).
2) Download and install the VirtualBox Lucid version. It will automatically detect old hard disks and upgrade them.
3) Go to the VM settings in VirtualBox, go to Audio, and select the Alsa Ac97 driver.
4) Start your VM and now go to realtek AC97 driver http://www.download3k.com/System-Utilities/System-Maintenance/Download-Realtek-AC-97-Drive

Svn switch saves sync time after a branch is cut

We have an external branch containing third-party code and jars. It's nearly 2 GB in size, and every time someone cuts a branch it's a pain for me to sync, even if it has only a few changes, as I work remotely. svn switch saves the day here:

    cd $OLDBRANCH
    svn up
    cd ..
    rsync -a $OLDBRANCH/ $NEWBRANCH
    cd $NEWBRANCH
    svn switch http://abc.xyz.com/repos/branches/$NEWBRANCH

Ubuntu adding Firefox hot key

I am a keyboard guy when it comes to launching applications. On Windows I was using hotkeyplus for it, but my VM crashed, and as we use Ubuntu for development I installed it as the host. I missed my hotkeys, so here is the way to add a hotkey for Firefox:

1) Go to System->Preferences->Keyboard Shortcuts and click Add.
2) Fill in the details of the new shortcut in the dialog.
3) Assign the shortcut key to it.

Generate Tiff Thumbnails using PIL

This was tricky because not all browsers show TIFF images, so the trick is to generate the thumbnails as JPEG.

    thumb_image_format = None
    if mimeType == "image/tiff":
        thumb_image_format = "JPEG"
    ret_value = utils.create_thumbnail_pil(inputPath, outputPath, thumb_image_format)

    def create_thumbnail_pil(self, infile, outfile, thumb_image_format=None):
        import Image
        size = 100, 100
        try:
            im = Image.open(infile)
            new_image = im.copy()
            if new_image.mode == "CMYK":
                self.logger.info('converting CYMK to RGB for %s' , outfile)
                new_image = new_image.convert("RGB")
            new_image.thumbnail(size, Image.ANTIALIAS)
            if thumb_image_format == None:
                thumb_image_format = im.format
            new_image.save(outfile, t

Tika supported document types

Tika is a library to extract text out of documents. We wrote a remote document processor service that, given a streamed document, extracts the text out of it and returns it in the response. The reason for streaming documents is that we didn't want to mount all filers on that box: filers keep changing, and we don't want ops people to forget adding new filers to the box and cause issues. I needed a way to figure out whether Tika can extract text from a document before sending a request to the document processor. I had to look into the code, but if you are using the default AutoDetectParser, here is a way to find out:

    public static boolean canExtractText(String extension) {
        // Detect the mime type from a dummy file name with this extension
        String mimeType = tika.detect("a." + extension);
        return parser.getParsers().containsKey(mimeType);
    }

    private static AutoDetectParser parser = new AutoDetectParser();
    private static Tika tika = new Tika();

Java com.sun.net.httpserver.HttpServer OutOfMemory Issue

We had to extract text out of documents that people were uploading to the site so we could index them. As this was going to be just a text extraction service, I didn't want to go for a full-blown Tomcat server, so I wrote a quick http server in Java:

    this.httpServer = HttpServer.create(addr, 0);
    HttpContext context = this.httpServer.createContext("/", new DocumentProcessHandler());
    this.httpThreadPool = Executors.newFixedThreadPool(this.noOfThreads);
    this.httpServer.setExecutor(this.httpThreadPool);
    context.getFilters().add(new HttpParameterFilter());
    this.httpServer.start();

Not adding all the code, for brevity. After being live for 2-3 days the process started crashing, and it was a regular pattern. I added -verbose:gc -XX:+HeapDumpOnOutOfMemoryError to the startup script and had a crash dump after 3 days. Using the Histogram in Eclipse Memory Analyzer to look at the heap dump revealed that we had 90K HttpConnection ob

RabbitMQ for Thumbnail generation in cloud

We are a cloud-based file share and backup company. A frequent requirement for customers is to share their data with other people; e.g. some customers have people who upload stock photos, and the customer wants to look at a thumbnail first before downloading the large picture. We were earlier using disk-based queuing, with a process running at night that came around and generated thumbnails. This worked earlier, as most of the customers didn't have time-sensitive thumbnail generation requirements. Now the customer base has grown and people want their thumbnails generated instantaneously, so I came up with a new architecture. We use the new RabbitMQ with persister so that if it goes down the messages are preserved. App nodes are Tomcat servers that use the Java API to push messages to Rabbit as soon as a file is uploaded by the user, and we also queue a message to scribe in case pushing a message

BlackBerry curve wired handsfree with normal 3.5 mm jack

Blackberry is just great. My wired handsfree for the Blackberry broke and I was trying to pair my Motorola wireless handsfree with it, but couldn't get it working (maybe because it's 3-year-old tech). But you can plug in any normal 3.5 mm jack headphones and it will work like a charm, as it uses the phone's microphone.

Office 2007 and Office 2010 documents Text extraction using Tika

We were earlier using various libraries to extract text out of Word, PDF, PowerPoint and Excel files, and it was tricky to maintain. Our CTO found this cool Apache Tika project that made our life easy. Now extracting text out of various documents is a piece of cake. The beauty of the Tika library is that it can detect the mime type and other metadata automatically. Here is sample code to extract text using Tika:

    @Override
    public String getText(InputStream stream, int maxSize) {
        Tika tika = new Tika();
        tika.setMaxStringLength(maxSize);
        try {
            return tika.parseToString(stream);
        } catch (Throwable t) {
            logger.error("Error extracting text from document of type " + logIdentifier, t);
            return " ";
        }
    }

IE browsing slowness and mod_ssl

We were observing that our pages loaded very slowly in IE compared to FF, and we thought that IE was inherently slow at parsing the page; that was a wrong assumption. Using Fiddler with IE showed that the browser was somehow making too many SSL handshakes with the server compared to when we plugged Fiddler into FF or Safari. This was traced down to an Apache configuration. Apache comes with a default mod_ssl conf:

    SetEnvIf User-Agent ".*MSIE.*" \
        nokeepalive ssl-unclean-shutdown \
        downgrade-1.0 force-response-1.0

What this says is: for all IE browsers, don't use keep-alive and downgrade the HTTP response to 1.0. Apache was consuming significant time doing this when compared to the same request in FF. All of this was only required for really old browsers. Changing the conf as shown below solved the issue, and now we get good initial load performance in IE browsers:

    SetEnvIf User-Agent ".*MSIE [1-4].*" \
        nokeepalive ssl-unclean-shutdown \
        downgrade-1.0 force-respo

Applet JRE 1.6.0_19 security popup issue

We recently ran into an issue where customers using the Java applet for multi-file upload suddenly started seeing a security warning, and the worrying thing about this dialog was that "Block" was the first choice, so customers kept clicking Block. The dialog appeared because our applet was making an http call to download some properties files, and the Java plug-in treated this as a security risk because the applet jars were signed but the properties files were not, and they can't be signed. The fix for this issue was to bundle the properties files in the jar file. Temporarily you can also ask users to enable this setting

Java Applet CACHE_VERSION Mac v/s Windows

Java applets use a property called CACHE_VERSION, in a format like 4.0.5.452d: four hexadecimal values separated by ".". The applet plug-in uses this to determine whether to download new jars or not. The Sun documentation says that the applet plug-in downloads the new jars if the CACHE_VERSION is higher than the previous one. My findings on this: the Windows plug-in in IE/FF/Chrome/"Safari on Windows" will download new jars regardless of whether the jar cache version is greater or not. Earlier our version was 4.0.5.XXX; I tried updating the version to 4.0.4.XXX or even 1.2.3.XXX and Windows would happily download it. We recently ran into an issue where the applet would not work for some Mac users, and it was random. The culprit was that during one deployment our operations team had updated the jar as 4.0.6.XXX, and we use XXX as the svn changelist number of the jar file, so we never changed the "4.0.5" portion. When the new build was deployed the jar

Unix timing a command

To time any command in Unix, just prefix the command with "time ", e.g.

    time ps
    time ls
    time python scripts/dev_scripts/test_backup_upload_locally.py ../vmshare

The output would be something like:

      PID TTY          TIME CMD
    14648 pts/0    00:00:00 bash
    14778 pts/0    00:00:00 ps

    real    0m0.032s
    user    0m0.004s
    sys     0m0.024s

Tiff image thumbnails that are visible in all browsers

It seems that not all browsers show TIFF images properly. Using an img tag, the image showed up properly only in Safari (way to go, Apple) and not in Firefox, IE or Chrome. The reason I wanted to do this was to generate thumbnail images for files added to our cloud server; even though PIL was able to generate the TIFF thumbnail, it was only visible in Safari. The solution was simple: I used ImageMagick to generate thumbnails for TIFF images in JPG format ;).

    nice -n 10 convert Sample.tiff -thumbnail 100x100 -bordercolor white -border 50 -background white -gravity center -crop 100x100+0+0 +repage -limit memory 32 -limit map 32 -limit disk 500 Sample.jpg

First page thumbnail for a multi page tiff

Discovered a new thing: you can have multi-page images in TIFF format. I discovered this when I used ImageMagick to generate a thumbnail and it produced 7 images for one file, which broke the code. The fix, to generate an image of the first page only, was to add [0] after the input file name in the ImageMagick command. The beauty of the solution is that it works fine even if the image has only 1 page.

    nice -n 10 convert kp.tiff[0] -thumbnail 100x100 -bordercolor white -border 50 -background white -gravity center -crop 100x100+0+0 +repage -limit memory 32 -limit map 32 -limit disk 500 kp.jpg

Tika0.7 OutOfMemory compile issue

Not sure why they don't generate binaries and put them on the site. I was trying to compile Tika 0.7 and faced build failures, as tests were failing with OutOfMemory errors. Setting the env variable below before doing mvn install solved the issue:

    export MAVEN_OPTS="-Xmx1024m"

Python CMYK images

Recently ran into an issue when implementing thumbnail generation for our website. Some of the JPEG images were getting a blue thumbnail background, causing customer complaints. Using ImageMagick solved the issue, but ImageMagick was out of process and very slow. We use PIL for image generation and we were using PIL 1.1.6. You can check your PIL version by doing:

    import Image
    Image.VERSION

For those of you facing a similar issue: upgrading to PIL 1.1.7 fixed it. At first installing PIL 1.1.7 did not work, as Python was somehow still picking up 1.1.6; I had to remove all old references by doing "apt-get remove python-imaging", and that solved the issue.

Applet Jar download without browser restart

We use an applet in our website for uploading multiple files or a folder tree to the server. Recently our code signing certificate expired and we had to sign the jars again and publish new jars. We use CACHE_VERSION to give each jar a version, so that on each browser restart the applet doesn't go to the server to check whether a new version is available. Refer to http://java.sun.com/products/plugin/1.3/docs/appletcaching.html for more details on CACHE_VERSION. We ran into an issue where, even after uploading the new jars to the server and giving each a different CACHE_VERSION, customers were still complaining about the expired certificate dialog. Some googling showed that it's a common problem with Java plug-ins in most browsers and that a browser restart fixes it. The browser checks the cache version only once in an open browser session, and even if you render the applet tag again it won't check the cache version. Wow so many people

Update expired certificate in a signed jar

Recently our website's code signing certificate expired and we had to update a jar that we got from a third party a long time back. There was no way to get the unsigned jar back, so I had to find a trick to replace the expired certificate. The solution was elegant and simple:

Unzip the jar
Remove the META-INF folder
Use the jar command to create the jar again
Use jarsigner to sign the jar with the updated keystore
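The unzip and remove steps can also be scripted. The following is only a sketch in Python 3 of an alternative to doing them by hand, not what was actually used: the function name `strip_meta_inf` is hypothetical, and unlike the jar command it does not write a fresh manifest. Signing the result with jarsigner remains a separate step.

```python
import zipfile


def strip_meta_inf(signed_jar, unsigned_jar):
    """Copy every entry of signed_jar except those under META-INF/."""
    with zipfile.ZipFile(signed_jar) as src, \
            zipfile.ZipFile(unsigned_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename.upper().startswith("META-INF/"):
                continue  # drops the old manifest and signature files
            dst.writestr(info, src.read(info.filename))
```

If the jar relies on manifest entries such as Main-Class or Class-Path, those would need to be restored before signing, because this sketch drops the old manifest along with the signature files.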

Sending CTRL + BREAK to a java linux process

Use "kill -QUIT pid" to send the equivalent of CTRL + BREAK to a running Java process on Linux. This prints a thread dump.

Ant append to a file using echo task

Learnt a new thing: you can use the echo task and redirect its output to a file, and with append="true" it appends to the file instead of overwriting it.

    <propertyfile file="${deploy.path}/svninfo.txt" comment="File containing build version,Build Date and svn info">
        <entry key="Version" value="${revisionProperty}"/>
        <entry key="Build Date" type="date" value="now"/>
    </propertyfile>
    <echo file="${deploy.path}/svninfo.txt" append="true">
    ==============Svn Info==============
    ${svnInfoOut}
    ====================================
    </echo>

RabbitMQ purge a queue

Such a simple operation is not available in rabbitmqctl. You can list the queues but not clear them, so I wrote a Python client for it. A better solution is to install the BQL plugin, but for now this will suffice.

    import sys
    from amqplib import client_0_8 as amqp

    if __name__ == '__main__':
        if len(sys.argv) < 6:
            print "Usage python purge_queue.py mq_url mq_user mq_pass mq_vhost queue_name"
            exit()
        mq_url=sys.argv[1]
        mq_user=sys.argv[2]
        mq_pass=sys.argv[3]
        mq_vhost=sys.argv[4]
        mq_queue_name=sys.argv[5]
        conn = amqp.Connection(host=mq_url, userid=mq_user, password=mq_pass, virtual_host=mq_vhost, insist=False)
        chan = conn.channel()
        n=chan.queue_purge(mq_queue_name)
        if n==0:
            print "purged %s sucessfully" % mq_queue_name
        else:
            print "unable to purge %s. Ther

Python for else block

Found an interesting thing in Python: a "for else" block. Wow. The else block gets executed if the for loop is not terminated by a break.

    pobj = subprocess.Popen(cmd, bufsize=1024)
    for i in range(4):
        retcode = pobj.poll()
        if retcode is not None:
            break
        time.sleep(30)
    else:
        try:
            logger.warn('killing process %d' % pobj.pid)
            os.kill(pobj.pid, signal.SIGKILL)
        except:
            logger.exception('Error killing process')
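A self-contained sketch in Python 3 that isolates just the for/else behaviour, without the subprocess and sleep. The function name `wait_until_done` and the list of fake poll results are made up for illustration.

```python
def wait_until_done(poll_results):
    """Return 'finished' if any poll yields a return code, else 'timed out'."""
    for retcode in poll_results:
        if retcode is not None:
            outcome = "finished"
            break
    else:
        # Runs only when the loop was exhausted without hitting break.
        outcome = "timed out"
    return outcome
```

Calling it with [None, None, 0] takes the break path, while a list of only None values (or an empty list) falls through to the else block.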

Quartz Admin JSP

We use Quartz for scheduling jobs on Tomcat nodes. The jobs run on a frequency rather than at a set time, so it's helpful to know when a job will fire next, and for testing purposes it's good to be able to fire a job manually instead of waiting for the trigger, as some jobs run only once a day. I wrote this small JSP that allows exactly that. Here is the sample code for the JSP, for googlers like me:

    <%@page import="org.quartz.ee.servlet.QuartzInitializerServlet" %>
    <%@page import="org.quartz.impl.StdSchedulerFactory" %>
    <%@page import="org.quartz.*" %>
    <%@page import="java.util.*" %>
    <%
    String jobNameToRun=request.getParameter("jobNameToRun");
    String groupNameToRun=request.getParameter("groupNameToRun");
    String btnTrigger=request.getParameter("btnTrigger");
    StdSchedulerFactory factory = (StdSchedulerFactory) pageContext.getServletContext()

Logging DWR method parameters

Just set org.directwebremoting=debug in log4j.xml and you should see all parameters, as shown below.

    DEBUG 2010-02-19 19:41:45,981 http-4280-Processor69 org.directwebremoting.dwrp.BaseCallMarshaller - Environment: c0-e10=boolean:false, c0-e11=string:true, c0-e12=boolean:true, c0-e13=string:8, c0-e14=boolean:false, c0-e1=boolean:true, c0-e15=boolean:false, c0-e2=boolean:true, c0-e16=string:5, c0-e3=string:https%3A%2F%2Fkp.foo.com, c0-e4=string:5, c0-e6=boolean:true, c0-e5=boolean:false, c0-e8=boolean:true, c0-e7=string:true, c0-e9=string:8,

Firefox3 and Caching HTTPS content

We use HTTPS for our website because it serves data for SMBs. I primarily use Firefox, and it was not caching static content. I was ignorant because when I googled for caching HTTPS content I found various posts saying that content over HTTPS can't be cached. The surprising thing was that IE8 was caching the content, and as I don't use IE much I didn't notice it. Luckily I started using JAWR for DWR JS aggregation and recently added ExtJS and Prototype to it. After doing that, the initial load became faster by 1-2 sec, and I was surprised. Upon firing up Fiddler I saw that Firefox was not even making a call for the aggregated JS. Ultimately I figured out that it was the Cache-Control: public header that was making all the difference. JAWR was setting Cache-Control: public, max-age=315360000, post-check=315360000, pre-check=315360000 and we were setting Cache-Control: private, max-age=315360000. That's it, done!! We changed the headers and now Initial l

AtomicInteger

Found this interesting class, AtomicInteger, that lets you update an int safely in a highly concurrent environment. I ran into it when I had to implement a throttling filter that allows only a certain number of requests into the app for a particular functionality, to guarantee a certain level of QoS. My earlier code was:

    public class BackupThrottlingFilter implements Filter {
        @Override
        public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
            try {
                activeBackupRequests++;
                if (activeBackupRequests > 50) {
                    BackupFileUploadServlet.sendServiceUnavailableResponse((HttpServletResponse) response);
                    return;
                }
                chain.doFilter(request, response);
            } finally {
                activeBackupRequests--;
            }
        }

        private static int activeBackupRequests;
    }

Ran into an issue with this as its not synchroni

Should I use CDN or not?

I was comparing the home page load time of my company's website with that of a competitor. The competitor's site loads in a snap and ours takes a few seconds before it shows up. Here are some of the stats from YSlow for my site and the competitor's. The competitor is requesting more images/JS than my company's site, yet it loads in a snap. I can see three differences: (1) they are using a CDN, so images are downloaded from a local server near Dallas; (2) their total download size for JavaScript and images is 1/3 of ours; (3) our site is HTTPS and theirs is HTTP. We can't change to HTTP as we cater to SMBs and secure transmission is key for us. I am working with the team to cut the size of the JavaScript download in half. Earlier we were using YUI for the tree and ExtJS for the table. We recently moved to the ExtJS tree as it performs much better than YUI in IE and when you have a large number of child nodes. There are few rema

Killing a particular Tomcat thread

Update: This JSP does not work on a thread that is executing native code. On many occasions I had a thread stuck in JNI code and it wouldn't work. Also, in some cases Thread.stop can cause the JVM to hang. According to the javadocs, "This method is inherently unsafe. Stopping a thread with Thread.stop causes it to unlock all of the monitors that it has locked". I have used it only on rare occasions where I wanted to avoid a system shutdown, and in some cases we ended up doing a system shutdown anyway because the JVM was hung, so I had a 70-80% success rate with it. -------------------------------------------------------------------------------------------------------------------------- We had an interesting requirement. A Tomcat thread that was spawned from an ExecutorService thread pool had gone rogue and was causing a lot of disk churn. We couldn't bring down the production server as that would involve downtime. Killing this thread was harmless, but how to kill it, t
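Given the warning about Thread.stop above, a gentler first attempt is to look the thread up by name and interrupt it. The sketch below does that instead of Thread.stop, so it is not the JSP from this post; the class and method names are hypothetical. It only helps if the rogue code responds to interruption, and it will not help with a thread stuck in native code.

```java
import java.util.Optional;

/** Hypothetical helper: finds a live thread by name and asks it to stop via interrupt(). */
final class ThreadFinder {
    private ThreadFinder() {}

    /** Looks up a live thread by exact name among all threads the JVM reports. */
    static Optional<Thread> findByName(String name) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().equals(name))
                .findFirst();
    }

    /**
     * Interrupts the named thread and waits up to timeoutMillis (must be > 0) for it to exit.
     * Returns true only if the thread was found and is no longer alive afterwards.
     */
    static boolean interruptAndWait(String name, long timeoutMillis) {
        Optional<Thread> target = findByName(name);
        if (!target.isPresent()) {
            return false; // no live thread with that name
        }
        Thread t = target.get();
        t.interrupt();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status and fall through to the liveness check.
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

Naming the pool's threads (for example through a custom ThreadFactory passed to the ExecutorService) makes the lookup by name much easier.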

Partition your Rolling Fact by time or not

We are creating a data warehouse to store event logs for actions performed by users. The requirements are to keep 6 months of historical data and allow users to run audit reports. The challenge is deciding whether or not to partition the facts by the creationtime of the record. The advantage of partitioning by creationtime is that we can drop a partition within seconds when we want to purge data, and all new data is added to the current month's partition, so ETL data loads become fast. The disadvantage is that you have to include a time horizon in every query fired on the data warehouse, otherwise it will do a FULL SCAN. The alternative is to not partition by time, or to create global indexes on time-partitioned tables, but when you drop data these indexes/tables become fragmented. This is a very important decision: if you can get the user requirements and all the queries would contain time, then go ahead and partition your fact by time; else it's better to pay the performa

Managing User Perception in long running operations

It's important to manage user perception properly and give users feedback when you are doing a long-running synchronous operation. In the web world a user can get frustrated or impatient and retry the same operation, putting even more load on your server. We recently ran into an issue like this: registration was taking a long time due to some server issue; according to the Apache logs it was taking 10 sec, but users were reporting 30-40 sec. Doing a registration from a browser with an empty cache confirmed that it was not the registration but the confirmation page that was taking a long time, even though it's a plain HTML page. It was all because the confirmation page was making 50 requests to the server to download images/JS/CSS. The solution that clicked for me was simple: reduce the number of image, CSS and JS requests. The confirmation page was downloading all of these: 25 images/CSS/JS for rendering the header/footer; 5 images for rendering rounded corners; 5-6 JS/