3 simple pipelines for database development with Redgate Deploy – Part 1: Setup & GitLab

Society must adapt to diamonds, diamonds don’t adapt to society.
Abhijit Naskar

The world is changed… I feel it in the water… I feel it in the earth… smell it in the air. On a totally unrelated note did you know December 2021 marks the 20th Anniversary of the Lord of the Rings films? Just in case you were looking for your reminder to go and watch those masterpieces again, this is that sign!

Seriously though – gone are the days when I would demonstrate database pipelines on 1 or two different technologies. Over the last 6 years I have walked people through database deployments using an array of CICD options; Jenkins, TeamCity, Octopus Deploy, Bamboo… and most recently I’ve spent most of my time on Azure DevOps. At times it can even feel like Azure DevOps is the only solution you’ll need, but increasingly it’s becoming obvious that isn’t the case and there are new, shiny providers who offer some amazing experiences and awesome functionality.

Now seems like the best time to explore 3 of the ones I’m coming across more and more – CircleCI, GitLab and GitHub Actions.

The interesting part of this is that I genuinely believe that this will be incredibly easy. Maybe I’m naïve but from the looks of all 3 they seem straight-forward, understandable… and of course I’ll be using Flyway in my pipeline, which is the easiest, cross-platform friendly solution to use for this.

Note: I will assume you have some familiarity with Flyway in this post, if you don’t read more about the capabilities of Redgate Deploy here.

The Setup

For this “challenge” (if I can call it that) I’m going to be using Flyway Desktop installed on my Windows laptop, GitHub as my Version Control system and 5 Azure SQL Databases: 2 for “Dev” & “Dev_Shadow” (from which I will generate 3 independent repos) and 3 environments for PROD_GitLab, PROD_CircleCI and PROD_GitHub respectively. The structure of the database will be the DMDatabase, unsurprisingly the database I use for pretty much everything I do on this blog.

Note: Everything I’m doing today uses SQL Server (well… Azure SQL Database) however everything here is cross OS – you can set up similar pipelines for everything from Oracle to PostgreSQL to CockroachDB if you would like!

5 Databases ready to go – as shown in the Azure Portal

Fortunately CTRL+C, CTRL+V exists, so I’ll only have to setup once and then I’ll just copy the files across into the other two repos; I set up a new private Repo in my GitHub specifically for GitLab but you could easily repeat these steps below separately for GitHub Actions or CircleCI:

GitLab repo in GitHub

I clone this down onto my Windows machine using Git Bash and then linked to and created my Flyway Desktop project (don’t know how? Try this!):

Link the development database and the shadow, generate the Schema Model and the Baseline Migration from DMDatabase_PROD_GitLab (I just grab the relevant JDBC connectors from the Azure Portalthis makes it much easier!), don’t forget to specify the list of Schemas, I did and it ain’t pretty (but is an easy enough fix)

Then I commit and push the schema model files and the baseline migration up into GitHub:

For good measure I also changed the DM_CUSTOMER table on the Dev environment and generated a new schema-model and migration change so I know what is going to be deployed to my “Prod” environments as part of this test:

Then after committing and pushing to my repo, I copied all of the files over to my GitHub and CircleCI repositories too:

A quick check of my other repos and everything seems good to go!

Principles

I’m setting up 3 separate pipelines in this post which will all effectively do the same thing, but for different “Prod” copies of databases, however when building and deploying in practice you will have a number of tasks you will want to accomplish in and around the process itself (such as really useful things like Unit Tests, Code Analysis etc.). To keep things simple I will be creating a 6th Database – the “Build” database which will act as our CI validation step and our process for all 3 pipelines will be:

  • Invoking a Flyway Clean against the “Builddatabase – this step will remove every object on the database leaving it empty
  • Invoking a Flyway Migrate against the “Builddatabase – this step will build the database from scratch to validate our baseline script and any further migrations build successfully
  • Invoking a further Flyway Migrate against our respective “Prod” database, to deploy the latest scripts we have generated.

GitLab

After following the Setup instructions above, in GitLab I need to create a New Project and I want it to Build/Deploy from my GitHub repo, so I pick “Run CI/CD for external repository

Fortunately it’s very easy to connect directly from GitHub, but you will have to generate a Personal Authentication Token which you can do by going to https://github.com/settings/tokens and then authorizing the main repo you want to build from – for me this is GitLab_Flyway:

Painless! From here I select CI/CD template and because I’m starting from scratch I’m going to use the starter 3 stage template:

It has a rather neat layout and is pretty darn easy to get up and running with:

I may have tried several combinations to get the Flyway Docker container up and running but essentially the code I ended up running for my pipeline was:

stages:          # List of stages for jobs, and their order of execution
  - build
  - deploy

variables:
    userName: "MyUserName"
    password: "MyPassword"
    prodJDBC: "jdbc:[TheJDBCConnectionToTheProdDBYoureUsing]"
    ciJDBC: "jdbc:[TheJDBCConnectionToTheBuildDBYoureUsing]"
    migrationPath: $CI_PROJECT_DIR

build-job:       # This job runs in the build stage, which runs first.
  image:
    name: flyway/flyway:latest-alpine
    entrypoint: [""]
  stage: build
  script:
    - flyway clean -url="$ciJDBC" -user="$userName" -password="$password" -locations="filesystem:$migrationPath"
    - flyway migrate -url="$ciJDBC" -user="$userName" -password="$password" -locations="filesystem:$migrationPath"

deploy-job:      # This job runs in the deploy stage.
  image:
    name: flyway/flyway:latest-alpine
    entrypoint: [""]
  stage: deploy  # It only runs when *both* jobs in the test stage complete successfully.
  script:
    - flyway migrate -url="$prodJDBC" -user="$userName" -password="$password" -locations="filesystem:$migrationPath" -baselineOnMigrate=true -baselineVersion="MyBaselineVersion"

It was actually quite easy to spot where I had failed in previous runs and GitLab breaks things up quite nicely for us to see:

Some interesting things I noted using this setup:

  • Because we are deploying up stream to an environment that hasn’t been deployed to with Flyway before you have to pass in the -baselineOnMigrate switch, what was interesting though is that I also needed to specify the -baselineVersion, otherwise it tried to baseline V1, which of course did not exist as my baseline was named something completely different (V001_DateTimeStamp_blahlablah)
  • The entrypoint is specified as it is because it drops you right at the Flyway executable so you can issue the Flyway commands – without this it doesn’t work
  • You can ABSOLUTELY pass your variables in (like *cough* password and username) in a much more secure way through GitHub using variables, but this was a great start for me
  • To pass in the files using a hosted repo, I had to use the environment variable $CI_PROJECT_DIR and that’s where the repo is checked out to, where your migrations are

But it ultimately ended up in what I was expecting – the database was migrated using the Flyway command steps:

Conclusion

Is it possible to setup a nice easy pipeline from Dev -> Prod with Redgate Deploy and GitLab? Yes, absolutely it is, and you can build out the pipeline in whatever fashion you want. Thankfully, the Docker container makes things much much easier!

Now, let’s see how we get on with GitHub Actions!

Flyway Desktop: Don’t be afraid of your own Shadow (DB)

Just don’t hold back. Don’t be afraid to make mistakes and stuff.
Kristen Stewart

<HolidayTalk>

Howdy folks! Welcome back! Well, I guess that should be aimed at me – it’s been a few weeks *cough* months *cough* since I last blogged anything on here and this is because I was on a sabbatical – I went to the Canary Islands with my wonderful wife for a few weeks and just spent the time doing what I do worst… relaxing. Anyway – enough about that and on to the post – but if you’re interested in seeing what we got up to while we were there, I took pictures every day and I’m PlantBasedSQL on Instagram too!

</HolidayTalk>

I’ve spent far too much time of late talking about Database Cataloging and Data Masking, and it occurred to me that it was about time for a new DevOps-y post, but the trouble was I had no idea what to write – and then something happened last week that I think could really help get people up and running, not just with Flyway Teams, but also with Flyway Desktop (formerly Redgate Change Control), which is the developer-assisting GUI found in Redgate Deploy.

Note: The problem that I’m going to describe below is universal with Flyway Desktop as of writing – whether you’re using it for SQL Server or Oracle etc. the solution I will describe is also universal, which is why I haven’t tailored this blog post to a specific RDBMS.

Flyway Desktop v5.0.682

The Tech

Flyway Desktop uses a principle called the Shadow Database; you have your dedicated development database (DEV) which you make database-first changes too, and the Shadow, which is and entirely separate database constructed by running your Baseline script against an empty database. Your Baseline is the script generated from an upstream environment like PROD or TEST containing… well, everything. All objects and the entire state of that Database at that point in time. It’s useful because once you’ve created that baseline and run it to create the Shadow, a comparison is carried out to detect pending changes in DEV (so you don’t have to throw any work away that’s not in PROD) and if some are found your initial V001 migration script is generated into your local repo. It’s pretty neat.

What is also really neat is that in certain situations (like swapping branches and resolving local migrations against migrations on the database), Flyway Desktop cleans the Shadow DB and builds it again from scratch including everything from the baseline up, all the way to your new migration script – this is awesome because you’re effectively doing a full database build every time you generate a migration and testing that the script is buildable and the database deployable.

My Oracle Dev environment in SQL Developer & the Shadow DB
My SQL Server Dev environment in SSMS & the Shadow DB

The Problem

What is not awesome though, is that if you have a REALLY REALLY big database, with dozens, if not hundreds of thousands of objects you might not want to have this baseline script run every single time you create a new migration script – it could take minutes, even hours!

Following along with that radical school of thought that shockingly “not all databases are perfect” there will also be other occasions where the size of the baseline or number of objects is irrelevant; one example of this might be if you use 3 or 4 part naming conventions in your SQL Server databases. A backup and restore will work, but if you try to actively create a view, for instance, that references a cross server or database object that doesn’t exist then the script cannot run against the Shadow and instead of hanging or taking forever it simply won’t work. Caveat: If you’re using Azure SQL Database then obviously this isn’t going to be that much of a problem for you for obvious reasons, but invalid objects can still cause major problems both with the Shadow and your own databases later down the line!

The Solution

In previous situations like this, such as with SQL Change Automation, we were able to use a SQL Clone as Baseline, instead of a full baseline script and I have no doubt that that kind of functionality will be available in the future (you’re possibly already reading this laughing to yourself thinking “Chris, it’s been a feature for months now!”, but right now it isn’t a feature, so indulge me for now!).

But it got me thinking: is there a way to get around the “big or invalid Shadow DB” problem… now?

Running Flyway Desktop there’s quite few things happening under the hood, but the method of “cleaning” and “updating” the Shadow database is, you guessed it… Flyway. Flyway has a number of callbacks which can run at any time during the cycle: between migrations, after cleaning etc. whatever you want.

Definition of Callbacks from the Flyway website

My assumption when first approaching the problem was that I could use a beforeMigrate or afterClean callback within my local repository to effectively swap out the database I was using for the Shadow – however in my initial testing with Oracle (that I have since also proven in SQL Server) this turned out to be a big no-no. The reason? When Flyway runs ANY command, it initializes the JDBC connection first, even with callbacks and even if it’s just running a script that does nothing in the context of the database. This means I’m effectively trying to drop the Shadow whilst connected to it – so depending on the RDBMS I experienced 1 of 2 scenarios:

  1. The command runs successfully, the Shadow database is replaced causing the JDBC connection to crash out, causing Flyway to stop and not migrate the new Shadow.
  2. The command doesn’t run successfully because “the database is in use” so nothing happens

This was… annoying and it was thanks to a member of the development team we were able to establish exactly what was happening. Fortunately, I work with some amazing people and we were able to come up with an ingeniously simple solution that all of a sudden creates a brand new realm of possiblities.

A new callback: beforeConnect

The new callback ‘beforeConnect‘ delivered by my rather excellent colleague was released into Flyway 8.0.5 and will (as of writing) soon be released into Flyway Desktop, meaning that if you want to use a SQL Clone Database as a Shadow for your SQL Server Flyway Desktop project, you can! If you want to use Pluggable Database as a Shadow for your Oracle Flyway Desktop project, you can!

Note: These are the two I’ve tested however beforeConnect will run any and all scripts you give it prior to the JDBC connection being established meaning you can use any methods you like for replacing the Shadow, and also include it within your pipeline upstream in case you have any preparation steps you require pre-deployment.

One (well two) Solution(s)

Like I said I initially tried this out with an Oracle pluggable database to test if it was feasible – PDBs have been available for a long time and have only gained popularity. I have a copy of the ACCEPTANCE PDB which has the Flyway_Schema_History table on it and I’m using it as a base PDB from which I copy each time – this script will only run though if it doesn’t detect the Flyway_Schema_History table on the Shadow, because this means it has been cleaned – if the table is still present, there is no need to replace it.

Same story with SQL Server, though I’m using SQL Clone to “reset” the cloned Shadow database – this will work like a backup/restore but faster AND solve any pesky 3-part naming convention errors you might have with your baseline!

Oracle

beforeConnect.cmd

cd C:\[Your Local Repo]
echo @clone-shadow.sql | sqlplus -L -S [User]/[Password]@localhost:1521/ORCL AS SYSDBA

clone-shadow.sql

DECLARE
    HistoryTable INT;
BEGIN
    SELECT COUNT(*) INTO HistoryTable FROM CDB_TABLES t LEFT JOIN DBA_PDBS p ON p.CON_ID = t.CON_ID
  WHERE p.PDB_NAME = [ShadowDBName]
	AND OWNER = 'HR'
    AND t.TABLE_NAME = 'flyway_schema_history';
IF HistoryTable = 0 THEN
   execute immediate 'alter pluggable database [ShadowDBName] close immediate';
   execute immediate 'drop pluggable database [ShadowDBName] including datafiles';
   execute immediate 'create pluggable database [ShadowDBName] from [BasePDBName]';
   execute immediate 'alter pluggable database [ShadowDBName] open';
END IF;
END;
/

SQL Server

beforeConnect.ps1 (uses DBATools PowerShell module)

#Set Variables

$instance = '[Machine Name]'
$instanceName = '[Instance Name]'
$machinePlusInstance = $instance + "\" + $instanceName
$cloneServer = "http://" + $machinePlusInstance + ":14145"

# Query the Shadow DB to see if it has been cleaned

$SqlQuery = "SELECT COUNT(*) FROM [YourShadowDatabase].INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Flyway_Schema_History'"
$result = Invoke-DbaQuery -SqlInstance $machinePlusInstance -Query $SqlQuery

# If it has been cleaned, replace it

If($result.Column1 -eq 0) {
    Connect-SqlClone -Server $cloneServer
    $SqlServerInstance = Get-SqlCloneSqlServerInstance -MachineName $instance -InstanceName $instanceName
    $CloneToReset = Get-SqlClone -Name '[YourShadowDatabase]' -Location $SqlServerInstance
    Reset-SqlClone -Clone $CloneToReset
}

Setting it all up

The above callback works wonders when you’re simply replacing your copy of the Shadow DB instead of running the baseline every time, but how do you set it up in the first place? How do we set up the full project from scratch? Well it’s actually pretty easy step by step:

1 – Create the Dev and Shadow Databases

2 – Developer creates local git repo & creates new FWD project

3 – Developer links DEV database & commits schema model

4 – Developer “sets up” shadow database and generates Baseline migration

5 – !IMPORTANT! User does NOT hit finish (else you get an ugly error)

5 – Developer changes Flyway.conf file in local repo to a) baseline on migrate and b) baseline with the script just generated

6 – Developer hits finish and Flyway_Schema_History table is created and baseline marked as applied (no clean is ever run)

7 – Changes can now be made, scripts generated and put into Version Control as expected

8 – Add your callbacks (as above) to your repo to replace the Shadow and filter this file out with your gitignore*

*SQL Server Example given above

9 – Each user that pulls down the project will need their own beforeConnect callback to recreate their own Shadow DB, but once they’ve created it, the .gitignore will filter it out by default and they won’t need to create it again

Done