21 March 2019

Basic migration framework using Spring Boot


In the previous post I listed the things you need to take into consideration when creating a migration process for your application.

In this post I will show an example of a migration framework that we developed, based on our needs, using the Spring Boot framework.

The basic idea is to register two application listeners in the Spring Boot application, so that each one is invoked to run the corresponding migration phase and blocks the application startup until it finishes.


SpringApplication application = new SpringApplication(MyApplication.class);
// addListeners takes varargs (not a List), so the listeners are passed directly
application.addListeners(
        new ApplicationStartingListener(),
        new ApplicationContextReadyListener()
);
// args are the arguments received by main(String[] args)
application.run(args);

The first listener is:
ApplicationListener<ApplicationEnvironmentPreparedEvent>. This listener is invoked when the Spring Environment is ready but before any of the beans are initialised. This is the perfect phase to make changes to configuration files.

The second listener is:
ApplicationListener<ContextRefreshedEvent>. This listener is invoked when the Spring context is refreshed: all beans are constructed and registered, but the server is still not responding to requests from the outside world. This is the phase at which we decided to make changes in the database.

Each listener looks something like this:


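// loadNewVersion() and loadPreviousVersion() are placeholder helpers that are not shown here
String newVersion = loadNewVersion();           // the version from application.properties
String previousVersion = loadPreviousVersion(); // the previous version from versions.properties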

if (!VersionsComparator.equals(newVersion, previousVersion)) {
    MigrationService migrationService = new MigrationService(newVersion, 
                                                             previousVersion);
    migrationService.runMigration();
}

* versions.properties is a file that we update each time the application starts, and it holds the currently running version. The idea is that when the system is updated, this file still holds the version from the previous build, so we can use it as a reference.
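
The handling of versions.properties is not shown in this post. A minimal sketch of how such a file could be read and updated with java.util.Properties is below. The class name VersionsFile, its method names, and the property key "version" are assumptions made for illustration, not the actual implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

class VersionsFile {
    // Assumed property name; the post does not show the real file layout.
    private static final String KEY = "version";

    /** Returns the version recorded by the previous run, or null if there is no file yet. */
    static String readPreviousVersion(Path file) {
        if (!Files.exists(file)) {
            return null;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(KEY);
    }

    /** Records the version that is running now, so the next upgrade can read it. */
    static void writeCurrentVersion(Path file, String version) {
        Properties props = new Properties();
        props.setProperty(KEY, version);
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "updated on application start");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The order of the calls matters: the previous version has to be read before the file is overwritten with the new one, otherwise the reference point for the migration is lost.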

The migration service holds all the existing migrators, and MigrationService.runMigration() invokes only the relevant ones, from the previous version up to the new version (this is why we pass both versions as parameters to the constructor).


migrators = Arrays.asList(
        new DataBaseMigratorV1_1_0(),
        new DataBaseMigratorV1_2_0(),
        new DataBaseMigratorV1_3_0(),
        new DataBaseMigratorV1_4_0(),
        new DataBaseMigratorV1_4_1()
);


where each migrator looks something like this:


public class DataBaseMigratorV1_1_0 implements DataBaseMigrator {

    @Override
    public String migratesFromVersion() {
        return "1.0.0";
    }

    @Override
    public String migratesToVersion() {
        return "1.1.0";
    }

    @Override
    public boolean migrate() {
        // perform the actual migration here
        return true;
    }
}
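
To make the selection of "relevant" migrators concrete, here is a minimal sketch of how a migration service could pick and run them. It assumes versions are dot-separated integers (such as 1.4.1) and that the migrator list is ordered from oldest to newest. The compareVersions helper, the relevantMigrators method, and the extra constructor parameter are assumptions for illustration, not our actual code.

```java
import java.util.ArrayList;
import java.util.List;

interface DataBaseMigrator {
    String migratesFromVersion();
    String migratesToVersion();
    boolean migrate();
}

class MigrationService {
    private final String newVersion;
    private final String previousVersion;
    private final List<DataBaseMigrator> migrators; // assumed to be ordered oldest first

    MigrationService(String newVersion, String previousVersion, List<DataBaseMigrator> migrators) {
        this.newVersion = newVersion;
        this.previousVersion = previousVersion;
        this.migrators = migrators;
    }

    /** Compares dot-separated numeric versions; missing parts count as 0. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Selects the migrators whose target version is after previousVersion and not after newVersion. */
    List<DataBaseMigrator> relevantMigrators() {
        List<DataBaseMigrator> result = new ArrayList<>();
        for (DataBaseMigrator m : migrators) {
            String target = m.migratesToVersion();
            if (compareVersions(target, previousVersion) > 0
                    && compareVersions(target, newVersion) <= 0) {
                result.add(m);
            }
        }
        return result;
    }

    /** Runs the selected migrators in order and stops at the first failure. */
    boolean runMigration() {
        for (DataBaseMigrator m : relevantMigrators()) {
            if (!m.migrate()) {
                return false;
            }
        }
        return true;
    }
}
```

With the migrator list from above, an upgrade from 1.1.0 to 1.3.0 would run only the 1.2.0 and 1.3.0 migrators, in that order.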


And so, when the application is starting (after the jars/binaries have been updated), Spring Boot invokes the listener, which uses the migration service to trigger all the relevant migrators.

By the end of the process, when all the migrators have executed successfully, the application can start with all the data and configuration up to date.

12 February 2019

How to prepare for your server application update?

We are working on a relatively new project, and a few months ago for the first time, we made some database and configuration changes in the new version that required a migration process.
While thinking about and designing the migration process, we had many questions to answer and decisions to make. In this post, I will try to list those questions and decisions to help you better prepare and design your migration process.

Note that this case was for an application server that is installed on the clients' servers (on-prem), but most of the points are also relevant for SaaS applications.

Also note that there is no single magic solution or framework for migration: each application has different requirements and hence a different solution.
Understanding the needs is a crucial part of designing a solution, so let's get started.

Step 1)
Identify what you need to migrate. Is it files (usually configuration files), database collections and tables, or both? This is an important step, since it may affect the time at which the migration runs. For example, configuration file migration should probably run much earlier than the database migration and must finish before the application starts (so that when the application starts after the update, it already has the updated files), while the database migration can in some cases continue running while the server is already serving user requests.

Step 2)
What type of migration do you need to run:

  • On the fly migration - when your migration process continues even after the new application version has started and is already serving user requests.
  • Lazy migration - when you migrate only the data that is currently being used.
  • Blocking migration - when you block the application boot until all data is migrated.

The decision is affected by the amount of data that is being migrated, in the current version and in the future.
If your application is expected to run on a large-scale database, you may need to use the “on the fly” or the “lazy” migration, but if your application is expected to run on a smaller database, then you may use the “blocking” migration.
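
To illustrate the lazy option, here is a minimal sketch in which a record is upgraded the first time it is read. The in-memory map stands in for a database table, and the schemaVersion marker and the mail to email rename are hypothetical examples, not something taken from our project.

```java
import java.util.HashMap;
import java.util.Map;

class LazyMigratingStore {
    // Assumed schema number of the new application version.
    static final int CURRENT_SCHEMA = 2;

    // In-memory stand-in for a database table or collection, keyed by record id.
    private final Map<String, Map<String, Object>> table = new HashMap<>();

    void put(String id, Map<String, Object> record) {
        table.put(id, record);
    }

    /** Reads a record and upgrades it first if it is still in the old format. */
    Map<String, Object> read(String id) {
        Map<String, Object> record = table.get(id);
        if (record == null) {
            return null;
        }
        // Records written before the marker existed are treated as schema 1.
        int schema = (Integer) record.getOrDefault("schemaVersion", 1);
        if (schema < CURRENT_SCHEMA) {
            record = migrate(record);
            table.put(id, record); // write back so the record is migrated only once
        }
        return record;
    }

    /** Hypothetical schema 1 to schema 2 change: the "mail" field is renamed to "email". */
    private static Map<String, Object> migrate(Map<String, Object> old) {
        Map<String, Object> migrated = new HashMap<>(old);
        if (migrated.containsKey("mail")) {
            migrated.put("email", migrated.remove("mail"));
        }
        migrated.put("schemaVersion", CURRENT_SCHEMA);
        return migrated;
    }
}
```

The trade-off is that data that is never read stays in the old format, so the migration code has to stay in the code base for as long as old records may exist.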

Step 3)
Are you running in a highly available / clustered environment?
This is where the fun begins :). You will need to ask some really hard questions here:

  1. Do we allow downtime for our entire cluster while installing new versions? 
  2. Do we allow the old nodes to operate while the upgrade is in progress? In the middle of the update process, some nodes are running the new version and some are still running the old one. Do we allow them to operate as usual, or do we enter an “upgrade in progress” mode in which some functionality may be disabled until all nodes are updated (to prevent changes to data that is currently being migrated by another updating node)?
  3. Do we allow a situation in which old nodes and new nodes work together for long periods of time?
  4. After the first node has finished upgrading and migrating the database tables, how will the next nodes know whether they should run the migration process as well, or what part of the data still needs migration? (Part of the data may still be in the old format, since it was added by a node with an older version while the first server was upgrading.)
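
One possible answer to the first half of question 4 is a shared marker that records the version of the data, which each node updates atomically before it starts migrating. The sketch below uses a ConcurrentHashMap as a stand-in for a single shared database row; in a real cluster this would be a conditional update in the database. The class and method names and the "from->to" marker format are assumptions for illustration only.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MigrationCoordinator {
    private static final String KEY = "dataVersion";

    // Stand-in for one shared row/document that all nodes can read and update.
    private final ConcurrentMap<String, String> shared = new ConcurrentHashMap<>();

    MigrationCoordinator(String initialDataVersion) {
        shared.put(KEY, initialDataVersion);
    }

    /** Atomically marks the data as being migrated; only one node gets true. */
    boolean tryStart(String from, String to) {
        return shared.replace(KEY, from, from + "->" + to);
    }

    /** Marks the migration as finished, so other nodes know they can skip it. */
    void finish(String from, String to) {
        shared.replace(KEY, from + "->" + to, to);
    }

    String dataVersion() {
        return shared.get(KEY);
    }
}
```

A node that gets false from tryStart can read dataVersion(): if it already equals the target version there is nothing to do, and if it still shows the "from->to" marker another node is migrating and this node should wait. This does not cover records written in the old format while the first node was upgrading; those still need the compatibility or lazy approaches.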

After you have asked all those questions and come up with some answers (be sure that some may change over time, and even during the development of the migration framework), here are some tips and potential pitfalls to avoid:

  1. Incremental migration - each application version should implement only the migration from the previous version. The migrations run in order and migrate the data step by step.
  2. Mandatory versions - some versions can be considered mandatory, meaning that when upgrading from an older version, you first have to update to this version and only then update to the newer one (for example, when updating from version 1.3.0 to 3.1.0, you first have to install 2.0.0 and only then 3.1.0). You should try to avoid those, as they make the update process harder, but in some cases they are unavoidable (usually when major framework changes are introduced, such as changing the database type).
  3. Don't use POJOs/entities/documents; use raw objects - if you are using an ORM framework (such as Spring Data or Hibernate), you are used to working with POJOs/documents to read data from your database. However, when you are writing the migration process you should not use them, since the code base that runs the migration will not necessarily have the same POJOs (upgrading from 1.5 to 2.6 runs with the code base of 2.6, whose POJOs may be completely different and unable to read the data from 1.5). So use basic cursors and raw database elements to overcome this.
  4. Compatibility - when changing a data object between versions, try to make the change in a way that old code can still read the new data and new code can read the old data. This way, if you have several nodes in the cluster, none of them will fail when the data begins to change.
  5. Consider using new collections/tables - if you need the other nodes to continue working while the migration is running, or you want to have a fallback or backup, consider adding a new table for the new version and migrating the data into it. The new version will work with the new table and the old version will work with the old table until it also undergoes the upgrade.
  6. Keep backward compatibility - don't ever change the type of an existing field/column in the database. For example, instead of changing an existing int field to a string field in the new version, just add a new field (see tip 4).
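
Tips 3 and 6 can be combined in a small sketch: the migration works on a raw row (a plain map instead of a POJO) and adds a new string field next to the existing int field instead of changing its type. The field names statusCode and statusName and the mapping between them are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class AddFieldMigration {

    /** Adds a string field next to the existing int field; the old field is left untouched. */
    static Map<String, Object> migrate(Map<String, Object> rawRow) {
        Map<String, Object> migrated = new HashMap<>(rawRow);
        Object code = rawRow.get("statusCode"); // existing int column, still readable by old nodes
        if (code instanceof Integer && !migrated.containsKey("statusName")) {
            // Hypothetical mapping from the old numeric code to the new string value.
            migrated.put("statusName", ((Integer) code).intValue() == 0 ? "ACTIVE" : "DISABLED");
        }
        return migrated;
    }
}
```

Because the old field keeps its type and value, nodes that still run the old version can keep reading the row, and running the migration a second time does not change the result.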

Hopefully, by the time you have gone over all those points, you will have a better understanding and a clearer picture of how to design and implement your migration process.

In the next post, I will show a high-level example for the system that we implemented.