Skip to main content

Mysql JDBC driver and Streaming large resultset

We are moving one legacy component from Berkeley db to mysql so that we can replicate it to distribute read requests and shard it to scale it. I had to dump the entire db contents and stream it over http to some other component. the db content for a single customer can range from 100K to 3-4 Million records. We are using spring JDBC to make the task of dealing with jdbc api simple. Now I was using a ResultSetExtractor to stream the resultset like this.

            ResultSetExtractor resultSet = new ResultSetExtractor() {
                public Object extractData(ResultSet result) throws SQLException, DataAccessException {
                    return "";

getJdbcTemplate(context).getJdbcOperations().query( sql, resultSetExtractor);

But it appears that Mysql native JDBC driver loads entire resultset into memory before passing the control onto ResultSetExtractor and that was causing OOM.

By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and cannot allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.
To enable this functionality, create a Statement instance in the following manner:
stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this, any result sets created with the statement will be retrieved row-by-row.

This is explained here 

It seems SimpleJdbcTemplate doesnt have a setFetchSize method and JdbcTemplate has a setFetchSize but that doesn not work when you call the query method and if you use statement then the applySettings method applies fetchSize only if its >0.  So the solution is to use PreparedStatement directly

callback = new PreparedStatementCallback() {
                public Void doInPreparedStatement(PreparedStatement pstmt) throws SQLException, DataAccessException {
                    ResultSet rs = pstmt.executeQuery();
                    return null;

        executeStreamed(jdbcTemplate, callback, sql);

     * Unless you specify the statement settings as below the mysql driver is going to load all results in memory.
     * @param jdbcTemplate
     * @param callback
     * @param sql
    protected void executeStreamed(SimpleJdbcTemplate jdbcTemplate, PreparedStatementCallback callback, final String sql) {
        PreparedStatementCreator creator = new PreparedStatementCreator() {
            public PreparedStatement createPreparedStatement(Connection conn) throws SQLException {
                PreparedStatement pstmt = conn.prepareStatement(sql, java.sql.ResultSet.TYPE_FORWARD_ONLY,
                return pstmt;
        jdbcTemplate.getJdbcOperations().execute(creator, callback);


  1. Alternatively, we can subclass JdbcTemplate and override applySettings method and use this subclass.


Post a Comment

Popular posts from this blog

RabbitMQ java clients for beginners

Here is a sample of a consumer and producer example for RabbitMQ. The steps are
Download ErlangDownload Rabbit MQ ServerDownload Rabbit MQ Java client jarsCompile and run the below two class and you are done.
This sample create a Durable Exchange, Queue and a Message. You will have to start the consumer first before you start the for the first time.

For more information on AMQP, Exchanges, Queues, read this excellent tutorial
import com.rabbitmq.client.Connection; import com.rabbitmq.client.Channel; import com.rabbitmq.client.*; public class RabbitMQProducer { public static void main(String []args) throws Exception { ConnectionFactory factory = new ConnectionFactory(); factory.setUsername("guest"); factory.setPassword("guest"); factory.setVirtualHost("/"); factory.setHost(""); factory.setPort(5672); Conne…

Spring query timeout or transaction timeout

If you are using spring to manage transactions then you can specify default transaction timeout using

    <bean id="transactionManager"
        <property name="dataSource" ref="dataSource" />
        <property name="defaultTimeout" value="30" /> <!--30 sec--->             

or you can override the timeout in the annotation

    @Transactional(readOnly = false, timeout=30)

or if you are doing it programatic transactions then you can do

DataSourceTransactionManager transactionManager = new DataSourceTransactionManager(dataSource);

 or you can override the timeout for one particular transaction

TransactionTemplate transactionTemplate = new TransactionTemplate();

Python adding pid file

I have a thumbnail generator that launches multiple processes and the correct way to shut it down is to send kill -HUP to the parent process. To automate I had to write a pid file from python, it was a piece of cake
def writePidFile(): pid = str(os.getpid()) f = open('', 'w') f.write(pid) f.close()