Showing posts from June, 2010

Svn switch saves sync time after a branch is cut

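We have an external branch containing third-party code and jars. It is nearly 2 GB in size, and every time someone cuts a branch it is a pain for me to sync, even if only a few files changed, because I work remotely. svn switch is a lifesaver here: it repoints the existing working copy at the new branch and transfers only the differences.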

svn up
cd ..
svn switch $NEWBRANCH

Ubuntu adding Firefox hot key

I am a keyboard guy when it comes to launching applications. On Windows I was using hotkeyplus for this, but my VM crashed, and since we use Ubuntu for development I installed it as the host. I missed my hotkeys, so here is how to add a hotkey for Firefox:

1) Go to System -> Preferences -> Keyboard Shortcuts and click Add
2) Fill in the details as shown below

3) Add the shortcut key as shown below

Generate Tiff Thumbnails using PIL

This was tricky because not all browsers show TIFF images, so the trick is to generate the thumbnails as JPEG.

    thumb_image_format = None
    if mimeType == "image/tiff":
        thumb_image_format = "JPEG"
    ret_value = utils.create_thumbnail_pil(inputPath, outputPath, thumb_image_format)

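    def create_thumbnail_pil(self, infile, outfile, thumb_image_format=None):
        import logging
        from PIL import Image  # plain "import Image" on very old PIL installs
        size = 100, 100  # maximum thumbnail width and height in pixels
        im = Image.open(infile)
        new_image = im.copy()
        if new_image.mode == "CMYK":
            # The original logger name was lost; stdlib logging is an assumption.
            logging.info('converting CMYK to RGB for %s', outfile)
            new_image = new_image.convert("RGB")
        # Older PIL releases call this filter Image.ANTIALIAS.
        new_image.thumbnail(size, Image.LANCZOS)
        if thumb_image_format is None:
            # No format forced by the caller: keep the source image's format.
            thumb_image_format = im.format
        new_image.save(outfile, thumb_image_format)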
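
If more formats turn out to be browser-unfriendly, the TIFF check in the caller could become a small lookup. This is only a sketch: `pick_thumbnail_format` and the `BROWSER_UNFRIENDLY` table are names made up for illustration, and it guesses the type from the file extension with the standard `mimetypes` module instead of using the `mimeType` value the caller already has.

```python
import mimetypes

# Hypothetical table: source MIME types that many browsers cannot render,
# mapped to the PIL format name to use for the thumbnail instead.
BROWSER_UNFRIENDLY = {"image/tiff": "JPEG"}


def pick_thumbnail_format(path):
    """Return the format to force for the thumbnail, or None to keep
    the source image's own format."""
    # guess_type only looks at the file name; it does not open the file.
    mime_type, _encoding = mimetypes.guess_type(path)
    return BROWSER_UNFRIENDLY.get(mime_type)
```

The result can be passed straight through as the `thumb_image_format` argument, since `None` already means "keep the original format" there.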

Tika supported document types

Tika is a library for extracting text from documents. We wrote a remote document processor service that, given a streamed document, extracts the text and returns it in the response. The reason for streaming documents is that we didn't want to mount all the filers on that box: the filers keep changing, and we didn't want ops people to forget to add a new filer to the box and cause issues.

I needed a way to figure out whether Tika can extract text from a document before sending a request to the document processor. I had to look into the code, but if you are using the default AutoDetectParser, here is a way to find out:

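    public static boolean canExtractText(String extension) {
        // Detect the MIME type from a dummy file name with this extension.
        String mimeType = tika.detect("a." + extension);
        // Supported if a parser is registered for that type. Newer Tika
        // releases key this map by MediaType, so there the lookup key
        // would be MediaType.parse(mimeType).
        return parser.getParsers().containsKey(mimeType);
    }

    private static AutoDetectParser parser = new AutoDetectParser();
    private static Tika tika = new Tika();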