Managing Your DB Changes
It probably happened to you as well... You develop a product, and after you successfully install it at your customers sites, you end up with dozens of versions/revisions of your product. Each revision has its specific SW, with appropriate DB schema, filled with your product initial data (here again, different for each revision) and customer data. How do you manage this DB changes ? What happen when you have dozens of customers with dozens of deployment revisions ? How to support more than one DB ?
These are some of the questions I had to answer on one of Tikal existing customers projects I'm involved with. The questions above are just example for the issues you have to deal when it comes to DB change management.
As said, the the product should be deployed, on dozens of customers sites, and in addition it will be deployed on various testing and sales persons environments. By the time, different customers and testers will have different versions of the product. At some point, they may want or need to upgrade their existing product version to a newer revision. This upgrade process, should keep the existing data of the customer, while enabling the administrator to upgrade the product to the new revision. As part of this upgrade process, there is a need to upgrade the database schema (executing DDL SQL statements) and the data stored within it (executing DML SQL statements). We may even need to apply some Java code to manipulate existing data, for example in case we may need BLOB migration. These updates depends of course on the customer/tester existing database schema and data. Nevertheless, when this upgrade process is been finished the database should have the appropriate schema and data according to the upgraded product revision. We also need to support failed resuming of failed migrations , due to some errors or network failure while the migration was running).
DB Change Management Tools
Rather than “inventing the wheel” as a solution for the problem, I chose to take one of the existing “DB migrations tool”, integrate it into the product, and adapt it to our needs. The tool should apply the DB migration on the product startup. This means that when the product will have finally been finished to load, the DB migration tool will have been finished successfully as well (in case of failure, the product will fail to load), keeping all the migration metadata in its own dedicated table/s.
Choosing The Right Tool
There are quite a few tools for DB migrations - Here is a link that compares their features. At the end f the day we ended with two choices: Liquibase and Flyway. Liquibase is more general purpose in the sense that the migration statements are embedded general “ChangeSet” XML tags in XML files. Liquibase is responsible to translate the “ChangeSet” instructions in the XML files to DB specific SQL statements. Liquibase has yet another advantage - Each and every “ChangeSet” statement is been executed as a separate transaction, reflected in a separated row in Liquibase metadata table, which can help in case the underlying DB does NOT support DDL transactions.
Nevertheless, I chose NOT to use Liquibase , and to take a new tool called Flyway instead. There were a few reasons for this decision: First, our chosen RDBMS is PostgreSQL, which does support DDL transactions - The whole file migration can be executed as one transaction (making this Liquibase advantage redundant here). Second, Liquibase does NOT support BLOB migrations, or any advanced Java code migrations. We may need to migrate our existing BLOBS data and tables , and we may have complected code that will be easier to write with Java than SQL script. Third, we also have some of the data deployed in yet another NoSQL DB (MongoDB), and its data will probably have to be migrated as well as part of the upgrade, and it will be easier to use the same tool for that.
As mentioned we chose to take Flyway as our migration tool. There is a special Spring bean in the configuration file, that is responsible to run the migration on Spring application startup (using Spring “init-method”), before initializing the Hibernate Session Factory, using “depends” keyword on the Hibernate Session Factory bean. This enables Hibernate's SessionFactory to connect after the migration had been applied, to the migrated DB, and “validate” the migrated schema against the current Hibernate configuration.
The easiest way to understand how flyway works, is to read this page. In our product I have a Maven project for the migrations, and a dedicated base directory to put all the migrations files. Each migration file will have the following file name convention:
So for example the initial file can be as follow :
The file above is SQL migration file for the product in version 18.104.22.168 and have a sequence number “A”. Here are the details for the fields place holder:
VERSION – this is the product version, following the internal version product's convention. It usually contains 4 octets with “.” as a separator between them.
SEQ – The sequence number/letter this file will be executed on behalf of the product version migration. Thus, the file V22.214.171.124_A__Desc.sql will run just before V126.96.36.199_B__Desc.sql file. However both files will run after running files from previous product versions – i.e file V188.8.131.52_B__Desc.sql will run before both files, since it belongs to a previous product version.
In practice, Flyway will run all migrations files with higher versions than the existing DB state (reflected in Flyway table), according to the version and sequence number. In case of failure while the migration is running, Flyway will roll back all changes of the current migration file, and will mark this migration step as failed. This will enable resuming and running the migration again, that will start from the last failed point.
DB Migration with Java
As mentioned, Flyway also enables Java based migrations. All Java migration classes for the product have a dedicated package “com.customername.productname.server.persistence.jpa.infra.flyway.postgresql”, and should be located in migration project.
The execution of the migration Java files is been integrated with the versioning and sequence of the SQL files , explained in previous section: The naming convention for Java files is similar to the SQL files:
The only difference is, this time the VERRSION placeholder can NOT have “.” separator field between the octets (due to Java files naming), and instead will have “_” as the separator. So the Java file V8_0_2_1_B__InitialVersion.java will run just after V184.108.40.206_A__InitialVersion.sql file execution. In case 2 migrations files (either Java or SQL file) will have an identical VERSION and the same sequence number, an in error will be omitted and the migration will NOT be applied.
When it comes to DB migration , and DB change management, there are various tools, and its better to adapt one of them to your needs, rather than build your own solution here. These tools have different characteristics, and its better to check your exact needs and constraints before choosing the tool. You can even integrate the tool (at least some of them), as part of your application startup, so it will be easier and somewhat transparent migration.