
Java Spring shard DataSource with MySQL/Oracle

If you are implementing database sharding with Spring JDBC, you are out of luck on two counts: you cannot use declarative transactions as-is, and you will not find a Spring DataSource that handles sharding for you. I had to implement my own DataSource manager and my own annotations to get declarative-style transactions and hide the complexity from the average developer. It's very important to abstract away cross-cutting concerns such as sharding and transactions, so that junior developers don't get confused and start copying code left and right without understanding the global impact of their changes.

So the idea is:
1) Implement a ShardDataSourceManager, which is basically a pool of connection pools, and look up a DataSource by shard id.
2) Define your own transactional annotation and annotate methods with it.
3) Write an interceptor at the DAO layer that reads the annotation on the method along with some context info. From the context info it looks up the shard id, looks up the DataSource, and puts it into a thread local.
4) When the DAO layer needs a DataSource, it reads the thread local, constructs a JdbcTemplate from it, and executes its queries on that.
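
The four steps above can be tried out without Spring or a real connection pool. The following is a minimal sketch, not the production code: `ShardRoutingSketch`, `FakeDataSource`, `runOnShard`, and `UserDao` are hypothetical names, and a small class holding a URL stands in for a real `javax.sql.DataSource`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Minimal sketch of shard routing through a thread local. All names here are
// hypothetical; FakeDataSource stands in for a real connection pool.
final class ShardRoutingSketch {

    // Stand-in for javax.sql.DataSource: it only remembers which host it points at.
    static final class FakeDataSource {
        final String url;
        FakeDataSource(String url) { this.url = url; }
    }

    // Step 1: a "pool of pools", looked up by shard (host) id.
    static final Map<Integer, FakeDataSource> POOLS = new HashMap<>();

    // Step 3: the data source selected for the current call.
    private static final ThreadLocal<FakeDataSource> CURRENT = new ThreadLocal<>();

    // Binds the shard's data source to the thread for the duration of the work.
    static <T> T runOnShard(int hostId, Supplier<T> work) {
        FakeDataSource ds = POOLS.get(hostId);
        if (ds == null) {
            throw new IllegalArgumentException("No pool configured for hostId=" + hostId);
        }
        CURRENT.set(ds);
        try {
            return work.get();
        } finally {
            CURRENT.remove(); // do not leak the binding to the next task on this thread
        }
    }

    // Step 4: the DAO never sees a shard id; it reads the thread local.
    static final class UserDao {
        String describeConnection() {
            FakeDataSource ds = CURRENT.get();
            if (ds == null) {
                throw new IllegalStateException("No shard bound to this thread");
            }
            return "querying " + ds.url;
        }
    }

    public static void main(String[] args) {
        POOLS.put(1, new FakeDataSource("jdbc:mysql://db1"));
        POOLS.put(2, new FakeDataSource("jdbc:mysql://db2"));
        UserDao dao = new UserDao();
        System.out.println(runOnShard(2, dao::describeConnection)); // prints "querying jdbc:mysql://db2"
    }
}
```

Clearing the thread local in a `finally` block matters in any server that reuses threads, since a stale binding would silently send the next request to the wrong shard.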

Here is a sample ShardTransactional annotation, ShardTransactionInterceptor, and ShardDataSourceManager.

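import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marks a DAO method that must run in a transaction on the caller's shard.
// RUNTIME retention is required so the interceptor can read it via reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface ShardTransactional {
    boolean readOnly() default false;
}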
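
One detail matters for any annotation that is read through reflection at call time: it has to be retained at runtime, otherwise `isAnnotationPresent` returns false and the interceptor quietly skips the method. A small self-contained check, using hypothetical names (`ShardTx`, `AccountDao`, `mode`) rather than the classes in this post:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

final class AnnotationLookupSketch {

    // Hypothetical stand-in for a shard transaction marker.
    @Retention(RetentionPolicy.RUNTIME) // without this, reflection cannot see the annotation
    @Target(ElementType.METHOD)
    @interface ShardTx {
        boolean readOnly() default false;
    }

    static class AccountDao {
        @ShardTx(readOnly = true)
        public void load() { }

        @ShardTx
        public void save() { }

        public void ping() { }
    }

    // Returns "none", "read-only" or "read-write" for the named no-arg method.
    static String mode(String methodName) {
        Method m;
        try {
            m = AccountDao.class.getMethod(methodName);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
        ShardTx tx = m.getAnnotation(ShardTx.class);
        if (tx == null) {
            return "none";
        }
        return tx.readOnly() ? "read-only" : "read-write";
    }

    public static void main(String[] args) {
        System.out.println(mode("load")); // read-only
        System.out.println(mode("save")); // read-write
        System.out.println(mode("ping")); // none
    }
}
```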


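import java.lang.reflect.Method;
import javax.sql.DataSource;
import org.aopalliance.intercept.MethodInterceptor;
import org.aopalliance.intercept.MethodInvocation;
import org.springframework.transaction.TransactionStatus;
import org.springframework.transaction.support.TransactionCallback;
import org.springframework.transaction.support.TransactionTemplate;

public class ShardTransactionInterceptor implements MethodInterceptor {
    private static final AppLogger logger = AppLogger.getLogger(ShardTransactionInterceptor.class);
    // Holds the DataSource chosen for the current call so the DAO layer can pick it up.
    private static final ThreadLocal<DataSource> dataSourceThreadLocal = new ThreadLocal<DataSource>();
    private ShardDataSourceManager shardDataSourceManager;

    public ShardDataSourceManager getShardDataSourceManager() {
        return shardDataSourceManager;
    }

    public void setShardDataSourceManager(ShardDataSourceManager shardDataSourceManager) {
        this.shardDataSourceManager = shardDataSourceManager;
    }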

    public Object invoke(final MethodInvocation method) throws Throwable {
        if (method.getMethod().isAnnotationPresent(ShardTransactional.class)) {
            try {
                ShardTransactional annotation = method.getMethod().getAnnotation(ShardTransactional.class);
                // The shard is derived from the User argument, so every annotated method needs one.
                User user = getParam(method, User.class);
                if (user == null) {
                    throw new IllegalStateException("All transactional methods must have user argument");
                }
                TransactionTemplate transactionTemplate = new TransactionTemplate();
                boolean readOnly = annotation.readOnly();
                ShardInfo shardInfo = getShardInfo(user);
                // Bind the shard's DataSource to this thread so the DAO layer can find it.
                cacheDataSourceInThreadLocal(shardInfo.getHostId(), readOnly);
                transactionTemplate.setTransactionManager(shardDataSourceManager.getTransactionManagerByHostId(shardInfo.getHostId(), readOnly));
                return transactionTemplate.execute(new TransactionCallback() {
                    public Object doInTransaction(TransactionStatus transactionStatus) {
                        try {
                            return method.proceed();
                        } catch (Throwable t) {
                            // Rethrowing as an unchecked exception makes the template roll back.
                            logger.error("Rolling back transaction due to", t);
                            throw new RuntimeException(t);
                        }
                    }
                });
            } finally {
                // Clear the binding so pooled threads do not carry it into later calls.
                dataSourceThreadLocal.remove();
            }
        } else {
            return method.proceed();
        }
    }

    private ShardInfo getShardInfo(User user) {
        // ...code to lookup shard by user
        return shardInfo;
    }

    // Called by the DAO layer to obtain the DataSource bound to the current call.
    public static DataSource getDataSource() {
        return dataSourceThreadLocal.get();
    }

    private DataSource cacheDataSourceInThreadLocal(int hostId, boolean readOnly) {
        DataSource datasource = shardDataSourceManager.getDataSourceByHostId(hostId, readOnly);
        dataSourceThreadLocal.set(datasource);
        return datasource;
    }

    // Returns the first argument whose declared parameter type is assignable to clazz, or null.
    private <T> T getParam(MethodInvocation method, Class<T> clazz) {
        Method reflectMethod = method.getMethod();
        Class<?>[] parameterTypes = reflectMethod.getParameterTypes();
        int i = 0;
        for (Class<?> parameterType : parameterTypes) {
            if (clazz.isAssignableFrom(parameterType)) {
                return clazz.cast(method.getArguments()[i]);
            }
            i++;
        }
        return null;
    }
}
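
The same interception idea can be tried outside Spring AOP with a JDK dynamic proxy. The sketch below is an illustration under assumptions, not the interceptor above: `Routed`, `User`, `OrderDao`, `OrderDaoImpl`, and the `id % 2` shard rule are all hypothetical, and the "shard" is just an integer kept in a thread local instead of a DataSource inside a transaction.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

final class ProxyInterceptorSketch {

    // Hypothetical marker for methods that must be routed to a shard.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Routed { }

    static final class User {
        final int id;
        User(int id) { this.id = id; }
    }

    interface OrderDao {
        @Routed
        String findOrders(User user);

        String health();
    }

    // Shard chosen for the current call; null when nothing is bound.
    static final ThreadLocal<Integer> CURRENT_SHARD = new ThreadLocal<>();

    static final class OrderDaoImpl implements OrderDao {
        public String findOrders(User user) {
            return "orders of user " + user.id + " from shard " + CURRENT_SHARD.get();
        }

        public String health() {
            return "ok, shard bound: " + (CURRENT_SHARD.get() != null);
        }
    }

    // Finds the User argument and applies a made-up shard rule (id modulo 2).
    private static int shardFor(Object[] args) {
        if (args != null) {
            for (Object arg : args) {
                if (arg instanceof User) {
                    return ((User) arg).id % 2;
                }
            }
        }
        throw new IllegalStateException("Routed methods must have a User argument");
    }

    // Wraps the DAO so that annotated methods run with a shard bound to the thread.
    static OrderDao wrap(OrderDao target) {
        InvocationHandler handler = (proxy, method, args) -> {
            boolean routed = method.isAnnotationPresent(Routed.class);
            if (routed) {
                CURRENT_SHARD.set(shardFor(args));
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the DAO's own exception
            } finally {
                if (routed) {
                    CURRENT_SHARD.remove();
                }
            }
        };
        return (OrderDao) Proxy.newProxyInstance(
                OrderDao.class.getClassLoader(), new Class<?>[] { OrderDao.class }, handler);
    }

    public static void main(String[] args) {
        OrderDao dao = wrap(new OrderDaoImpl());
        System.out.println(dao.findOrders(new User(7))); // orders of user 7 from shard 1
        System.out.println(dao.health());                // ok, shard bound: false
    }
}
```

Unannotated methods pass straight through, which mirrors the `else` branch of `invoke` in the interceptor.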

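import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;
import javax.sql.DataSource;
import org.apache.commons.dbcp.BasicDataSource;
import org.springframework.jdbc.datasource.DataSourceTransactionManager;

public class ShardDataSourceManager {
    private static final AppLogger logger = AppLogger.getLogger(ShardDataSourceManager.class);
    private static boolean autoCommit = false;

    // One connection pool and one transaction manager per shard host, keyed by host id.
    private Map<Integer, BasicDataSource> dataSourceMap = new HashMap<Integer, BasicDataSource>();
    private Map<Integer, DataSourceTransactionManager> transactionManagerMap = new HashMap<Integer, DataSourceTransactionManager>();

    private ShardManager shardManager;

    // Pool settings applied to every shard's pool.
    private String driverClassName = "";
    private int maxActive = 20;
    private int maxIdle = 5;
    private int maxWait = 180000;
    private int minEvictableIdleTimeMillis = 300000;
    private boolean testWhileIdle = true;
    private String validationQuery = "select 1 from dual";

    private String userName;
    private String userPassword;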

    public String getDriverClassName() {
        return driverClassName;
    }

    public void setDriverClassName(String driverClassName) {
        this.driverClassName = driverClassName;
    }

    public int getMaxActive() {
        return maxActive;
    }

    public void setMaxActive(int maxActive) {
        this.maxActive = maxActive;
    }

    public int getMaxIdle() {
        return maxIdle;
    }

    public void setMaxIdle(int maxIdle) {
        this.maxIdle = maxIdle;
    }

    public int getMaxWait() {
        return maxWait;
    }

    public void setMaxWait(int maxWait) {
        this.maxWait = maxWait;
    }

    public int getMinEvictableIdleTimeMillis() {
        return minEvictableIdleTimeMillis;
    }

    public void setMinEvictableIdleTimeMillis(int minEvictableIdleTimeMillis) {
        this.minEvictableIdleTimeMillis = minEvictableIdleTimeMillis;
    }

    public boolean isTestWhileIdle() {
        return testWhileIdle;
    }

    public void setTestWhileIdle(boolean testWhileIdle) {
        this.testWhileIdle = testWhileIdle;
    }

    public String getValidationQuery() {
        return validationQuery;
    }

    public void setValidationQuery(String validationQuery) {
        this.validationQuery = validationQuery;
    }

    public String getUserPassword() {
        return userPassword;
    }

    public void setUserPassword(String userPassword) {
        this.userPassword = userPassword;
    }

    public String getUserName() {
        return userName;
    }

    public void setUserName(String userName) {
        this.userName = userName;
    }

    // Builds one pool and one transaction manager per shard host.
    public void init() throws Exception {
        for (DbHost shardInfo : shardManager.getDbHosts()) {
            String url = "jdbc:mysql://" + shardInfo.getMasterHost();
            BasicDataSource dataSource = createDataSource(url, userName);
            dataSourceMap.put(shardInfo.getHostId(), dataSource);
            DataSourceTransactionManager masterTransactionManager = new DataSourceTransactionManager(dataSource);
            transactionManagerMap.put(shardInfo.getHostId(), masterTransactionManager);
            logger.info("DataSource Created for hostid= {}, url= {}", shardInfo.getHostId(), dataSource.getUrl());
        }
    }

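    private BasicDataSource createDataSource(String url, String username) {
        logger.info("Initing {} ", url);
        logger.info("Creating Datasource {}", url);
        BasicDataSource dataSource = new BasicDataSource();
        // Apply the pool settings held on this manager (Commons DBCP 1.x setters).
        dataSource.setDriverClassName(driverClassName);
        dataSource.setUrl(url);
        dataSource.setUsername(username);
        dataSource.setPassword(userPassword);
        dataSource.setDefaultAutoCommit(autoCommit);
        dataSource.setMaxActive(maxActive);
        dataSource.setMaxIdle(maxIdle);
        dataSource.setMaxWait(maxWait);
        dataSource.setMinEvictableIdleTimeMillis(minEvictableIdleTimeMillis);
        dataSource.setTestWhileIdle(testWhileIdle);
        dataSource.setValidationQuery(validationQuery);
        return dataSource;
    }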

    private DataSource getDataSourceByHostId(int hostId) {
        DataSource dataSource = dataSourceMap.get(hostId);
        if (dataSource == null) {
            logger.warn("Could not find a data source for: {}", hostId);
            throw new IllegalArgumentException("Invalid dbname, no such pool configured: " + hostId);
        }
        return dataSource;
    }

    public DataSource getDataSourceByHostId(int hostId, boolean readOnly) {
        // Only master pools are registered in this listing, so readOnly does not change the lookup.
        logger.debug("Using Master datasource for hostid={}", hostId);
        DataSource dataSource = dataSourceMap.get(hostId);
        if (dataSource == null) {
            String msg = "Could not find a data source for hostId=" + hostId;
            throw new IllegalArgumentException(msg);
        }
        return dataSource;
    }

    public DataSourceTransactionManager getTransactionManagerByHostId(int hostId, boolean readOnly) {
        // As above, readOnly is accepted but every lookup resolves to the master.
        logger.debug("Using Master transactionmanager for hostid={}", hostId);
        DataSourceTransactionManager transactionManager = transactionManagerMap.get(hostId);
        if (transactionManager == null) {
            String msg = "Could not find a transaction manager for hostId=" + hostId;
            throw new IllegalArgumentException(msg);
        }
        return transactionManager;
    }

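    public void destroy() throws Exception {
        logger.info("destroying pools");
        destroyPool(dataSourceMap);
    }

    // Closes every pool in the map, releasing its connections.
    private void destroyPool(Map<Integer, BasicDataSource> dsMap) throws SQLException {
        if (dsMap != null) {
            for (BasicDataSource dataSource : dsMap.values()) {
                logger.info("Discarding pools: {}", dataSource);
                dataSource.close();
            }
        }
    }
}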


  1. Seems like a neat solution. However, as I have observed, sharding eventually becomes much more than just inserts in a "shard-aware" connection pool. Cross-shard queries, transaction consistency, and administration of the entire array are crucial to a good sharding solution. You can have a look at ScaleBase (disclaimer: I work there) to see how this can be your one-stop shop for all of your sharding needs, totally transparent (standard conn pool... :) ).

  2. Can I get the source code for this to play with?

  3. Except for the imports, the code pasted above is the real source code we have live in production, serving 1B+ rows from 20 MySQL servers. I haven't had a chance to put it on GitHub yet.

  4. Any GitHub project? Looks nice. I'm doing similar stuff and I'd like to fork and contribute if possible.

    1. No GitHub project right now :( as I got busy.

