Producing human readable SQL script descriptions with the OpenAI APIs and PowerShell (a.k.a ChatGPT: I’m here for the hype)

Learning should be a joy and full of excitement. It is life’s greatest adventure; it is an illustrated excursion into the minds of the noble and the learned.
Taylor Caldwell

Perhaps I should have instead started with a better quote like “rumors of my demise have been greatly exaggerated” or something like that – you look like you’ve all seen a ghost! Where have I been you ask? Places. What have I been doing? Things.

I needed a break from blogging, and may still take more time, but I really enjoy playing around with new things and when I saw the hype around ChatGPT I had to get in on the action; it has produced some rather fine results thus far…

Magnificent.

But people have been building out examples of how they can use ChatGPT in wider life, whether it's producing the easy 80% of content which they then sharpen up and add their own unique ideas to, producing the wireframes needed to get some code started, or even just looking for a recipe for a vegan Shakshouka; it can pretty much do it all. As someone who has never really played around with APIs all THAT much, I wanted to see what I could do with it.

I failed dramatically – it turns out writing REST calls is not my favorite thing ever, so I turned back to my good friend PowerShell, as it's much easier to play with and I already understood the syntax – and this is the result.

The Idea

A while back I was given a SQL script by someone who wasn't quite sure what it did or how, and they asked me to look at it to see if I could figure it out – I did, but it took me a little while. I wanted to see how well ChatGPT handles the same task, and instead of copying and pasting I wanted to see if I could just pass the file across to the OpenAI APIs. All you need for this to work is a directory with SQL files in it and an OpenAI API key (a free account will do unless you want to pour some money in – keep an eye on your usage so you don't go crazy on the free tier and end up spending loads), and this is the code I used:

# Authenticate to the OpenAI API
Write-Host "Authenticating to OpenAI API"
$apiKey = "YourAPIKey"

# Set the path to the directory containing the SQL scripts
$sqlScriptDirectory = "YourSQLDirectoryFullyQualified"

# Get the list of SQL scripts in the directory
Write-Host "Getting list of SQL scripts"
$sqlScriptFiles = Get-ChildItem $sqlScriptDirectory -Filter *.sql

# Loop through each SQL script
Write-Host "Parsing SQL scripts and generating descriptions"
foreach ($sqlScript in $sqlScriptFiles) {
  try {
    # Read the contents of the SQL script as a single string (-Raw keeps the line breaks intact for the prompt)
    $sqlScriptContents = Get-Content $sqlScript.FullName -Raw

    # Use the OpenAI API to generate a description of the script
    $requestBody = @{
      model = "text-davinci-003"
      prompt = "Describe the purpose of the following SQL script: $($sqlScriptContents)"
      max_tokens = 256
    }
    $jsonBody = ConvertTo-Json $requestBody
    $description = Invoke-RestMethod -Method Post -Uri "https://api.openai.com/v1/completions" -Headers @{
      "Authorization" = "Bearer $apiKey"
      "Content-Type" = "application/json"
    } -Body $jsonBody

    # Write the description to the console
    Write-Host "Description for $($sqlScript.Name): $($description.choices[0].text)"
  } catch {
    Write-Host "An error occurred while processing $($sqlScript.Name): $($_.Exception.Message)"
  }
}

It does a pretty good job – I passed it a script from SQL Server: Sample Scripts – TechNet Articles – United States (English) – TechNet Wiki (microsoft.com) to get all the indexes in a database (Clustered and Non-Clustered) to see how it interpreted it, and you know what, it did pretty well:

> Authenticating to OpenAI API
> Getting list of SQL scripts
> Parsing SQL scripts and generating descriptions
> Description for V001__StoredProc.sql: This SQL script has the purpose of retrieving all constraints in a database and creating a script to drop them. The script uses an inner join to join the sys.indexes, sys.objects, and sys.schemas tables on their object_id and schema_id columns, respectively. It also includes a WHERE clause to filter the results by type, primary key, index_id, schema name, and unique constraint. Once the query is executed, it will generate a script that can be used to drop the desired constraints.

The Application

I've thought quite a bit about what options we have here – the script itself was a 15/20-minute lunch project, is very simple and doesn't even recursively scan the directory for files… but the potential is so clear.
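(As a quick aside, if you did want the script above to walk sub-folders as well, swapping out the Get-ChildItem line for a recursive version should be all that's needed:

# Pick up .sql files in the directory and in any sub-directories
$sqlScriptFiles = Get-ChildItem $sqlScriptDirectory -Filter *.sql -Recurse

Everything else in the script stays the same.)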

In my mind the most obvious options to extend this in the future are:

  1. Code Readability -> often we leave a legacy of code behind us (or certainly people I've worked with do) and it's uncommented or unformatted and we're not entirely sure what it does – this can go part of the way to making that headache go away so you can focus on value-added tasks
  2. Data Lineage -> pairing this capability together with transformations and existing code could help give you a better idea of where your change may be disruptive in your pipelines
  3. DevOps pipelines -> as you deploy changes upstream, maybe with a technology like Flyway, you accumulate a list of scripts to be deployed that have only a short description in the filename. You can do some automated testing, but sometimes you just want the changes boiled down into a single, easy-to-understand summary – you could use this capability to pass in all of the scripts pending deployment and produce a human-readable deployment summary (see the sketch just after this list), making it easier to catch bad behaviors like people dropping tables, because it's spelled out in plain English.
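To make that third idea a little more concrete, here's a rough sketch that reuses the same completions endpoint and model from the script above, but rolls every pending script into a single prompt – the pending-scripts folder is a hypothetical path, and you'd want to watch the combined prompt length (and your token spend) before throwing big deployments at it:

# Sketch only: summarize all pending deployment scripts in one request
$apiKey = "YourAPIKey"
$pendingFolder = "YourPendingScriptsFolder"   # hypothetical folder of not-yet-deployed scripts

# Label each script with its file name and join them into a single block of text
$combinedScripts = Get-ChildItem $pendingFolder -Filter *.sql | ForEach-Object {
    "-- File: $($_.Name)`n$(Get-Content $_.FullName -Raw)"
} | Out-String

$requestBody = @{
    model      = "text-davinci-003"
    prompt     = "Summarize in plain English what deploying the following SQL scripts will change: $combinedScripts"
    max_tokens = 256
} | ConvertTo-Json

$summary = Invoke-RestMethod -Method Post -Uri "https://api.openai.com/v1/completions" -Headers @{
    "Authorization" = "Bearer $apiKey"
    "Content-Type"  = "application/json"
} -Body $requestBody

Write-Host "Deployment summary: $($summary.choices[0].text)"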

So maybe give it a try yourself and let me know where you see this capability coming in most handy for you in the future!

Moving from Redgate SQL Source Control pipelines to Flyway Desktop with Redgate Deploy

“Like all magnificent things, it’s very simple.”
Natalie Babbitt

There has been a lot of change over the years in the Redgate solutions – I hasten to add this is a good thing. Back in my day it was SQL Source Control to store your database in Version Control; at the time it was probably a 50/50 split between people who used Git and people who used other systems like SVN, TFVC (TFS/VSTS) and Vault or Mercurial etc. and you could then use DLM Automation to build and deploy this state-based database project to Test, Prod and so on.

SQL Source Control and DLM Automation (later SQL Change Automation) have formed the basis for many a pipeline for many many years, and they have been reliable, in some cases life changing for those who have used them… but the times, they are-a changing!

These technologies are still a great option and are still present in Redgate Deploy for those they work for; however, with the rise of ever more distributed computing topologies and the dominance of cloud-hosted architecture and PaaS databases in today's world, something new is needed.

Enter Flyway Desktop.

As you’ve seen in some of my previous posts, Flyway Desktop is really really easy to get up and running with, not only that but it combines the State and Migrations models together creating one repo with ALL the benefits, and none of the deciding which model is best for you. It was architected from the ground up to be 3 things:

  • Ingeniously simple: to set up, to use, to everything.
  • Cloud ready: designed for use with IaaS and PaaS database options
  • A combination of the best of the best: all of the benefits of previous Redgate solutions, few to none of the drawbacks

...but what if you’re already using Redgate?

Yes Flyway Desktop and Redgate Deploy in general are super easy to get up and running with for new databases, even difficult, monolithic databases (thank you Clone as shadow!), but what about projects you already have under source control? Like I mentioned, SQL Source Control has been around for years and is beloved by many, and SQL Change Automation is still in use by thousands too. We want to maintain the history of our changes for reference, and we don’t want to simply disregard the whole pipeline. So the big question is how do we upgrade our state-based pipeline? Let’s find out together!

Note: This post is for people who want to or are interested in moving to a newer solution (and to give them an idea of what to expect) and in no way reflects any level of urgency you should be feeling – I’m certainly not pushing you to move any of your pipelines now, especially if you’re happy with what you have!

Setup

For starters I set up an end to end SQL Source Control and SQL Change Automation pipeline in Azure DevOps – my understanding of the approach I’m going to take is that this should work wherever your pipeline is (TeamCity & Octopus Deploy, Bamboo, whatever) so don’t feel that this post is not for you just because I used Azure DevOps.

I set up a copy of the DMDatabase on my local SQL Developer Instance, and then created an Azure DevOps repo and cloned it down to my machine:

I linked my database to the repo, created a filter to filter out users and committed it to my repo – then I set up the YAML for the build, and the Release steps for SQL Change Automation:

My SQL Source Control Project in Azure DevOps (Git)
The YAML to build my SQL Source Control Project
Release Steps in Azure DevOps
Deployment Steps

Everything seems to be deploying ok, I've even set up an Azure SQL Database as the target for my database changes. Now that we have this SQL Source Control -> SQL Change Automation pipeline running, let's investigate replacing it.

SQL Source Control

The first thing I did was to open Flyway Desktop and create a new project – I pointed the project at my Dev DB and at the same local repo that I host my SQL Source Control files in:

and without committing the state to my schema-model folder, only linking to the Dev database, we end up with our repo looking like this:

I'm going to delete the Redgate.ssc file, because we're no longer using SQL Source Control, and I'm going to move every other file into the schema-model folder that now sits under my project name (DMDatabase) – full-on copy-paste style:
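In PowerShell terms the shuffle looks something like this – a minimal sketch, assuming the repo lives at C:\Repos\DMDatabase and the Flyway Desktop project folder is called DMDatabase (adjust the paths and the excluded folders to suit your own layout):

# Assumed paths - change these to match your own repo
$repo        = "C:\Repos\DMDatabase"
$schemaModel = Join-Path $repo "DMDatabase\schema-model"

# Remove the SQL Source Control linkage file - we no longer need it
Remove-Item (Join-Path $repo "Redgate.ssc") -ErrorAction SilentlyContinue

# Move the existing object folders (Tables, Views, Security etc.) into the schema-model folder
Get-ChildItem $repo -Directory |
    Where-Object { $_.Name -notin @("DMDatabase", ".git") } |
    Move-Item -Destination $schemaModel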

…and then hit refresh in the Schema Model tab of Flyway Desktop:

and… nothing should happen. Absolutely nothing, because the state of your project, the Schema-Model folder should now exactly match the state of your development database (assuming you had everything committed to SQL Source Control!) – so now we come across to the version control tab aaaand…

WAIT!

If we commit now it will break our CI build, because when we trigger with a new push my YAML will be expecting $(Pipeline.Workspace)/s/Database as the input, but now that we have a slightly altered project we want to build from a slightly different path. I'm going to temporarily disable my CI trigger in the YAML pipeline:

and now I’m going to Pull (to get the YAML file in my local repo) and then commit and push my changes:

Now I’m going to change my build YAML file to $(Pipeline.Workspace)/s/Database/DMDatabase/schema-model then save and re-enable Continuous Integration:

et voila!

SQL Change Automation sees it as a regular state based repo and builds and deploys it with no issues whatsoever:

and just like that! SQL Source Control is replaced – our teams can now pull down the latest copy of the Repo with the Flyway Desktop project in and open it. All they will need to do is re-specify their Dev Database Connection. If you are only using SQL Source Control or you’re using SQL Source Control with the SQL Compare GUI for more manual deployments currently then you’re done! When you want to extend your pipeline, you can read below.

SQL Change Automation

This is the step where we have to fundamentally change the way the pipeline works. It's easy to switch across from SQL Source Control to Flyway Desktop, and doing so gives us immediate upgrades in speed, reliability and stability in our development process, especially where we're working with cloud-hosted databases.

With Redgate Deploy though, we're fundamentally leveraging the Flyway command line capability for smooth, incremental deployments, and this is always a migrations-only deployment – so to move across to using Flyway we're going to need to make a few alterations to how the pipeline works.

First things first: we need some migrations – more specifically, THE migration. When you create a Flyway Desktop project you usually create a Baseline script. This script is the state of your Production environment(s), or a copy of them, and basically acts as the starting point for your incremental migration scripts in the pipeline. The Baseline, once generated, is run against an empty database referred to in Flyway Desktop as the Shadow Database, although this can of course be a Clone too. Not every developer necessarily needs this – only the ones who will be generating the deployable artifacts (the migrations themselves) and putting them into source control – but it is definitely needed for deployments.

Note: I have some clients I’m working with who want every developer to affect schema changes and then immediately generate the migration for this and share with the team, but equally I have others who want 10 or so developers to share the responsibility of schema changes, and then once they’ve reviewed at the end of a sprint, they generate the Migration for the changes, source control it and approve it.

So in Flyway Desktop we set up our erasable database, our Shadow DB:

I use an empty database I stood up quickly in the Azure Portal:

and on the Generate Migrations tab I’m now prompted to create a baseline script:

I’m going to create the Baseline from my “Prod” environment that I’ve been using for my SQL Source Control deployments and hit baseline:

When you save and finish, this will run the baseline against the Shadow DB to recreate everything – and it gives you a chance to catch anything you still have outstanding in the schema model: Flyway Desktop will compare the environments and detect any pending Dev changes, allowing you to produce a migration for those too.

Note: If your plan is to use this process to capture any outstanding code in a V002 "Delta" script to bring all environments back into line, you absolutely can, but I would advise you to make the script idempotent – if you add all the necessary IF EXISTS checks to the deployment, you should be OK and it will only create or alter the objects that need it in order to sync all the environments up.

First, pull any pending changes from your repo, then commit and push this into your Git remote:

and it should look a little like this:

Now for second-things-second, the build. This is actually going to be a very simple step, perhaps the easiest to change. We’re already using YAML, and as you know from previous posts it’s really very easy to leverage the Flyway command line as part of your YAML pipeline, so I’m going to simply swap out the SQL Change Automation build YAML with an updated version of the Flyway YAML from that post:

trigger:
- main

pool:
  vmImage: 'ubuntu-latest'
 
steps:
- task: DockerInstaller@0
  inputs:
    dockerVersion: '17.09.0-ce'
  displayName: 'Install Docker'
 
- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run -v $(locations):/flyway/sql flyway/flyway clean -url=$(JDBC) -user=$(userName) -password=$(password)
  displayName: 'Clean build schema'
 
- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run -v $(locations):/flyway/sql flyway/flyway migrate -url=$(JDBC) -user=$(userName) -password=$(password)
  displayName: 'Run flyway for build'

My password and username I shall hold back, but the JDBC connection variable needs to be encapsulated in quotes, to prevent it being escaped or only partially passed along because of the semi-colon:

"jdbc:sqlserver://dmnonproduction.database.windows.net:1433;database=DMDatabase_Build"

and the locations variable was my newly created migrations folder:

$(Pipeline.Workspace)/s/Database/DMDatabase/migrations

Fortunately these few changes mean that I now have a green build where I’m cleaning my Build DB and then building all of my files from there:

Deploying to Production is the only thing left. There’s a decision to be made here – because we’re just invoking the Flyway Docker Container, and we already have the YAML pipeline set up for the build we can:

  • As part of the build, zip up the migrations from the repo and publish them as an artifact, which we can then hand off to the Release portion of Azure DevOps, or indeed any other solution such as Octopus Deploy and run Flyway command line from there
  • OR we can simply expand out the YAML file – discard the “Release” pipeline and go FULL pipeline as code (which is also easier to audit changes on).

Given that we’re modernizing our deployment pipeline and introducing lean deployments of these incremental migration scripts, I’m opting for the latter, so I disable and archive my Release pipeline specifically and simply expand my YAML file with an additional step and an additional variable for the ProdJDBC instead of the Build DB:

- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run -v $(locations):/flyway/sql flyway/flyway migrate -url=$(ProdJDBC) -user=$(userName) -password=$(password) -baselineOnMigrate=true -baselineVersion=001.20211210091210
  displayName: 'Deploy to Prod'

and of course in that YAML not forgetting the all-important -baselineOnMigrate and -baselineVersion switches (which I've always been forgetting) – these matter because we'll be marking the baseline script as deployed against our target rather than actually running it; we don't want to try to recreate all of the objects that already exist there.

This is the result:

Successful deployment to Prod, successful move to Flyway Desktop

Pre- and Post- Deployment Scripts

You might leverage pre- and post-deployment scripts in your SQL Source Control pipeline – something that has to happen each time before or after a deployment. If you want to maintain these in your new repo moving forwards you'll need to make use of the Flyway callback functionality: take your pre-deployment scripts and turn them into a beforeMigrate callback, and turn your post-deployment scripts into an afterMigrate callback (there's a small sketch of this after the two points below). These can sit in your migrations folder, but:

  1. You may not need these now – because you have access to the migrations-first deployment model, most changes can now be tailor-made to your deployment needs, such as injecting DML statements in with your DDL scripts
  2. They will also run every time against your Shadow DB when you generate a new migration – just something to be aware of.
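As a rough illustration of that conversion – the file names and paths here are hypothetical, and only the beforeMigrate/afterMigrate prefixes matter, as that's what Flyway keys on:

# Copy the old pre/post-deployment scripts into the migrations folder as Flyway callbacks
$migrations = "C:\Repos\DMDatabase\DMDatabase\migrations"   # assumed project layout

Copy-Item "C:\Repos\DMDatabase\Pre-Deployment.sql"  (Join-Path $migrations "beforeMigrate__legacy-pre-deployment.sql")
Copy-Item "C:\Repos\DMDatabase\Post-Deployment.sql" (Join-Path $migrations "afterMigrate__legacy-post-deployment.sql")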

Final Word

It was much much easier than I thought it would be to move across, but I by no means believe that this will be as easy for everyone who needs or wants to move in the medium-long term. I am always an advocate of testing things out prior to setting them up in earnest, and would encourage you to try this workflow out for yourself first, perhaps in tandem with your SQL Source Control pipeline against a dummy Prod DB temporarily to see how comfortable your team is with the process, and to give yourself the time to ask the questions you might have.

3 simple pipelines for database development with Redgate Deploy – Part 3: CircleCI

“There is no place to reach.. only places to rest to carry on.”
Jaya Bhateja

SPOILER ALERT – This is part 3 of a 3 part series on enabling database deployments using Redgate Deploy, so if you have not read at least the Setup and Principles section of my previous post (Part 1 which you can find here, and if you’re interested Part 2 here for GitHub Actions) then I would strongly advise you do so! Thanks!

In my setup post we managed to get 3 Flyway Desktop repositories set up: 1 for each CICD system we'll be using, and a number of Azure SQL Databases to use as "Dev", "Build", "PROD" etc. – I have never used CircleCI before, so this will be a new experience as I try to figure it out at the same time as setting up a database deployment pipeline… but just to recap the principles of what we're trying to achieve:

Principles

I'm setting up 3 separate pipelines in this post which will all effectively do the same thing, but for different "Prod" copies of databases; however, when building and deploying in practice you will have a number of tasks you'll want to accomplish in and around the process itself (really useful things like Unit Tests, Code Analysis etc.). To keep things simple I will be creating a 6th database – the "Build" database – which will act as our CI validation step, and our process for all 3 pipelines will be:

  • Invoking a Flyway Clean against the "Build" database – this step will remove every object on the database, leaving it empty
  • Invoking a Flyway Migrate against the "Build" database – this step will build the database from scratch to validate that our baseline script and any further migrations build successfully
  • Invoking a further Flyway Migrate against our respective "Prod" database, to deploy the latest scripts we have generated (a rough local equivalent of these three steps is sketched just below).
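Before wiring any of this into CircleCI, it can be worth sanity-checking the three steps locally – here's roughly what each pipeline will end up running, expressed as plain Flyway Docker calls from PowerShell (the migrations path and credentials are placeholders for your own values; the JDBC URLs are the ones used later in this post):

# Assumed local path to the migrations folder and placeholder credentials
$migrations = "C:\Repos\CircleCI-Flyway\migrations"
$user       = "username"
$password   = "password"
$buildJdbc  = "jdbc:sqlserver://dmnonproduction.database.windows.net:1433;database=DMDatabase_Build"
$prodJdbc   = "jdbc:sqlserver://dmproduction.database.windows.net:1433;database=DMDatabase_PROD_CircleCI"

# 1 - Clean the Build database so it is completely empty
docker run --rm -v "${migrations}:/flyway/sql" flyway/flyway clean -url=$buildJdbc -user=$user -password=$password

# 2 - Build the database from scratch to validate the baseline and any further migrations
docker run --rm -v "${migrations}:/flyway/sql" flyway/flyway migrate -url=$buildJdbc -user=$user -password=$password

# 3 - Deploy the latest scripts to the respective "Prod" database
docker run --rm -v "${migrations}:/flyway/sql" flyway/flyway migrate -url=$prodJdbc -user=$user -password=$password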

CircleCI

Ok I made my way into CircleCI and it was really easy to get up and running with (the free tier that is) and OHMYGOSH will you look at this sleek beauty:

So far so good – CircleCI seems to be even easier to understand so far than GitLab (and CONSIDERABLY easier than GitHub Actions) – I’m sure there are a lot of major differences (and GitLab was really easy to use) but I’m hoping for a similar experience here by the looks of it!

I create a new project pipeline where it asks me to select a repo for this “project”:

So I hit “Set UP Project” and then “build my own yml script” – now you would think this might just give me a blank script but no, just like GitLab they give us the option of a starter pipe:

I’m going to go ahead and choose the “Hello World” pipeline because normally that’s the easiest to cannibalize!

Much like GitLab it has an indicator to let us know whether our YAML is valid or not (I’m looking at YOU Azure DevOps!!!!) which is a massive help, and in general it’s just pretty easy to see what each step is doing. I built out an example YAML file using similar commands to my GitLab pipeline like so:

version: 2.1
parameters:
  ciJDBC:
    type: string
    default: jdbc:sqlserver://dmnonproduction.database.windows.net:1433;database=DMDatabase_Build
  prodJDBC:
    type: string
    default: jdbc:sqlserver://dmproduction.database.windows.net:1433;database=DMDatabase_PROD_CircleCI
  userName:
    type: string
    default: username
  password:
    type: string
    default: password
  migrationPath:
    type: string
    default: .\

jobs:
  clean:
      docker:
        - image: flyway/flyway:latest-alpine
      steps:
        - checkout
        - run:
            name: "Clean Build Database"
            command: "flyway clean -url=${ciJDBC} -user=${userName} -password=${password} -locations=filesystem:${migrationPath}"

  build:
    docker:
      - image: flyway/flyway:latest-alpine
    steps:
      - checkout
      - run:
          name: "Migrate to Build Database"
          command: "flyway migrate -url=${ciJDBC} -user=${userName} -password=${password} -locations=filesystem:${migrationPath}"

  deploy:
    docker:
      - image: flyway/flyway:latest-alpine
    steps:
      - checkout
      - run:
          name: "Deploy to Prod"
          command: "flyway migrate -url=${prodJDBC} -user=${userName} -password=${password} -locations=filesystem:${migrationPath}"

workflows:
  database-deploy-workflow:
    jobs:
      - clean
      - build
      - deploy

and also used the variables reference from the CircleCI documentation which was pretty helpful. But it resulted in this:

Turns out I made a few boo-boos along the way. I was passing variables in like this: ${Variable}, but Circle only really seemed to like it when I used << pipeline.parameters.variable >>, because I had defined them at the beginning of the YAML file under parameters.

I also had the jobs running in parallel because I hadn’t defined in my workflow which steps were dependent on which – a lesson I SHOULD really have remembered from GitHub… but oh well. I corrected that:

workflows:
  database-deploy-workflow:
    jobs:
      - clean
      - build:
          requires: 
            - clean
      - deploy:
          requires: 
            - build

Interestingly everything was still failing and although everything was being passed through correctly, the only thing that was ACTUALLY making it to the Flyway Docker container was the first part of the JDBC connection:

Guess what? I had my quote marks in the wrong place.


It’s ok though because 2 other things failed:

  1. The Prod deploy failed because it found a non-empty schema, a problem I seem to fall over EVERY SINGLE TIME, but which is easily remedied by providing the 2 switches to the Prod deployment: -baselineOnMigrate=true and -baselineVersion=[YourBaselineScriptVersion]
  2. The filepath specified wasn’t a valid path:

Yes, before anyone tells me I know my top level repo folder is still called “GitLab-Flyway“, I figured that out in the last post and I’m still face-palming. So I’m going to quickly alter the Prod Flyway migrate command and then play around with the filesystem locations first to see if I can find a value it likes…

Oh. It just needed a “.”… neat. Well here is the finished YAML that seems to work a treat:

version: 2.1
parameters:
  ciJDBC:
    type: string
    default: "jdbc:sqlserver://dmnonproduction.database.windows.net:1433;database=DMDatabase_Build"
  prodJDBC:
    type: string
    default: "jdbc:sqlserver://dmproduction.database.windows.net:1433;database=DMDatabase_PROD_CircleCI"
  userName:
    type: string
    default: "username"
  password:
    type: string
    default: "password"
  migrationPath:
    type: string
    default: "./GitLab-Flyway/migrations"

jobs:
  clean:
      docker:
        - image: flyway/flyway:latest-alpine
      steps:
        - checkout
        - run:
            name: "Clean Build Database"
            command: flyway clean -url="<< pipeline.parameters.ciJDBC >>" -user=<< pipeline.parameters.userName >> -password=<< pipeline.parameters.password >> -locations=filesystem:<< pipeline.parameters.migrationPath >>

  build:
    docker:
      - image: flyway/flyway:latest-alpine
    steps:
      - checkout
      - run:
          name: "Migrate to Build Database"
          command: flyway migrate -url="<< pipeline.parameters.ciJDBC >>" -user=<< pipeline.parameters.userName >> -password=<< pipeline.parameters.password >> -locations=filesystem:<< pipeline.parameters.migrationPath >>

  deploy:
    docker:
      - image: flyway/flyway:latest-alpine
    steps:
      - checkout
      - run:
          name: "Deploy to Prod"
          command: flyway migrate -url="<< pipeline.parameters.prodJDBC >>" -user=<< pipeline.parameters.userName >> -password=<< pipeline.parameters.password >> -locations=filesystem:<< pipeline.parameters.migrationPath >> -baselineOnMigrate=true -baselineVersion=001.20211130101136

workflows:
  database-deploy-workflow:
    jobs:
      - clean
      - build:
          requires: 
            - clean
      - deploy:
          requires: 
            - build

and we have ourselves one nice, lean CircleCI build and deployment pipeline:

Baseline Script successfully marked as Deployed, and 2nd migration successfully deployed as shown by Flyway_Schema_History table on DMDatabase_Prod_CircleCI

Conclusion

Was the purpose of these three blog posts for me to build 3 perfect pipelines, with impeccable secrets handling, automated testing, code analysis and all the best practices that mean they can all be rolled out into Production deployment pipelines tomorrow with no editing?

No. No way. Far from it.

But the purpose was to prove something else – that it can be done. This is the bare-bones approach to enabling your database pipelines with Redgate Deploy and the Flyway Docker container in 3 different CICD systems: GitLab, GitHub and CircleCI, and what we hoped to observe was that they can all in fact be used, with Redgate Deploy, to deploy schema changes to any of the supported RDBMSs.

That is indeed what we did. Happy Migrating!

Thank you to everyone who has stuck it out through all 3 parts – trust me, I did an awful lot of learning here myself and made COUNTLESS YAML mistakes – and although I don't class myself as a Level 20 Warlock-slash-CICD-Pipeline-Guru it has been thoroughly interesting, and I hope you can use these posts as the basis for success with your own pipelines! If you do – let me know, I love to hear from anyone who reads my posts!

Flyway Desktop: Don’t be afraid of your own Shadow (DB)

Just don’t hold back. Don’t be afraid to make mistakes and stuff.
Kristen Stewart

<HolidayTalk>

Howdy folks! Welcome back! Well, I guess that should be aimed at me – it’s been a few weeks *cough* months *cough* since I last blogged anything on here and this is because I was on a sabbatical – I went to the Canary Islands with my wonderful wife for a few weeks and just spent the time doing what I do worst… relaxing. Anyway – enough about that and on to the post – but if you’re interested in seeing what we got up to while we were there, I took pictures every day and I’m PlantBasedSQL on Instagram too!

</HolidayTalk>

I’ve spent far too much time of late talking about Database Cataloging and Data Masking, and it occurred to me that it was about time for a new DevOps-y post, but the trouble was I had no idea what to write – and then something happened last week that I think could really help get people up and running, not just with Flyway Teams, but also with Flyway Desktop (formerly Redgate Change Control), which is the developer-assisting GUI found in Redgate Deploy.

Note: The problem that I'm going to describe below is universal with Flyway Desktop as of writing – whether you're using it for SQL Server or Oracle etc. – and the solution I will describe is also universal, which is why I haven't tailored this blog post to a specific RDBMS.

Flyway Desktop v5.0.682

The Tech

Flyway Desktop uses a principle called the Shadow Database; you have your dedicated development database (DEV) which you make database-first changes to, and the Shadow, which is an entirely separate database constructed by running your Baseline script against an empty database. Your Baseline is the script generated from an upstream environment like PROD or TEST containing… well, everything. All objects and the entire state of that Database at that point in time. It's useful because once you've created that baseline and run it to create the Shadow, a comparison is carried out to detect pending changes in DEV (so you don't have to throw any work away that's not in PROD), and if some are found your initial V001 migration script is generated into your local repo. It's pretty neat.

What is also really neat is that in certain situations (like swapping branches and resolving local migrations against migrations on the database), Flyway Desktop cleans the Shadow DB and builds it again from scratch including everything from the baseline up, all the way to your new migration script – this is awesome because you’re effectively doing a full database build every time you generate a migration and testing that the script is buildable and the database deployable.

My Oracle Dev environment in SQL Developer & the Shadow DB
My SQL Server Dev environment in SSMS & the Shadow DB

The Problem

What is not awesome though, is that if you have a REALLY REALLY big database, with dozens, if not hundreds of thousands of objects you might not want to have this baseline script run every single time you create a new migration script – it could take minutes, even hours!

Following along with that radical school of thought that shockingly "not all databases are perfect", there will also be other occasions where the size of the baseline or number of objects is irrelevant; one example of this might be if you use 3- or 4-part naming conventions in your SQL Server databases. A backup and restore will work, but if you try to actively create a view, for instance, that references a cross-server or cross-database object that doesn't exist, then the script cannot run against the Shadow and, instead of hanging or taking forever, it simply won't work. Caveat: If you're using Azure SQL Database then obviously this isn't going to be that much of a problem for you for obvious reasons, but invalid objects can still cause major problems both with the Shadow and your own databases later down the line!

The Solution

In previous situations like this, such as with SQL Change Automation, we were able to use a SQL Clone as Baseline, instead of a full baseline script and I have no doubt that that kind of functionality will be available in the future (you’re possibly already reading this laughing to yourself thinking “Chris, it’s been a feature for months now!”, but right now it isn’t a feature, so indulge me for now!).

But it got me thinking: is there a way to get around the “big or invalid Shadow DB” problem… now?

Running Flyway Desktop, there are quite a few things happening under the hood, but the method of "cleaning" and "updating" the Shadow database is, you guessed it… Flyway. Flyway has a number of callbacks which can run at any time during the cycle: between migrations, after cleaning etc. – whatever you want.

Definition of Callbacks from the Flyway website

My assumption when first approaching the problem was that I could use a beforeMigrate or afterClean callback within my local repository to effectively swap out the database I was using for the Shadow – however in my initial testing with Oracle (that I have since also proven in SQL Server) this turned out to be a big no-no. The reason? When Flyway runs ANY command, it initializes the JDBC connection first, even with callbacks and even if it’s just running a script that does nothing in the context of the database. This means I’m effectively trying to drop the Shadow whilst connected to it – so depending on the RDBMS I experienced 1 of 2 scenarios:

  1. The command runs successfully, the Shadow database is replaced causing the JDBC connection to crash out, causing Flyway to stop and not migrate the new Shadow.
  2. The command doesn’t run successfully because “the database is in use” so nothing happens

This was… annoying, and it was thanks to a member of the development team that we were able to establish exactly what was happening. Fortunately, I work with some amazing people and we were able to come up with an ingeniously simple solution that all of a sudden creates a brand new realm of possibilities.

A new callback: beforeConnect

The new callback 'beforeConnect', delivered by my rather excellent colleague, was released in Flyway 8.0.5 and will (as of writing) soon be released into Flyway Desktop, meaning that if you want to use a SQL Clone database as a Shadow for your SQL Server Flyway Desktop project, you can! If you want to use a Pluggable Database as a Shadow for your Oracle Flyway Desktop project, you can!

Note: These are the two I've tested; however, beforeConnect will run any and all scripts you give it prior to the JDBC connection being established, meaning you can use any method you like for replacing the Shadow, and you can also include it within your pipeline upstream in case you have any preparation steps you require pre-deployment.

One (well two) Solution(s)

Like I said I initially tried this out with an Oracle pluggable database to test if it was feasible – PDBs have been available for a long time and have only gained popularity. I have a copy of the ACCEPTANCE PDB which has the Flyway_Schema_History table on it and I’m using it as a base PDB from which I copy each time – this script will only run though if it doesn’t detect the Flyway_Schema_History table on the Shadow, because this means it has been cleaned – if the table is still present, there is no need to replace it.

Same story with SQL Server, though I’m using SQL Clone to “reset” the cloned Shadow database – this will work like a backup/restore but faster AND solve any pesky 3-part naming convention errors you might have with your baseline!

Oracle

beforeConnect.cmd

cd C:\[Your Local Repo]
echo @clone-shadow.sql | sqlplus -L -S [User]/[Password]@localhost:1521/ORCL AS SYSDBA

clone-shadow.sql

DECLARE
    HistoryTable INT;
BEGIN
    SELECT COUNT(*)
      INTO HistoryTable
      FROM CDB_TABLES t
      LEFT JOIN DBA_PDBS p ON p.CON_ID = t.CON_ID
     WHERE p.PDB_NAME = '[ShadowDBName]'
       AND OWNER = 'HR'
       AND t.TABLE_NAME = 'flyway_schema_history';

    IF HistoryTable = 0 THEN
        execute immediate 'alter pluggable database [ShadowDBName] close immediate';
        execute immediate 'drop pluggable database [ShadowDBName] including datafiles';
        execute immediate 'create pluggable database [ShadowDBName] from [BasePDBName]';
        execute immediate 'alter pluggable database [ShadowDBName] open';
    END IF;
END;
/

SQL Server

beforeConnect.ps1 (uses the dbatools and SQL Clone PowerShell modules)

#Set Variables

$instance = '[Machine Name]'
$instanceName = '[Instance Name]'
$machinePlusInstance = $instance + "\" + $instanceName
$cloneServer = "http://" + $machinePlusInstance + ":14145"

# Query the Shadow DB to see if it has been cleaned

$SqlQuery = "SELECT COUNT(*) FROM [YourShadowDatabase].INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'Flyway_Schema_History'"
$result = Invoke-DbaQuery -SqlInstance $machinePlusInstance -Query $SqlQuery

# If it has been cleaned, replace it

If($result.Column1 -eq 0) {
    Connect-SqlClone -Server $cloneServer
    $SqlServerInstance = Get-SqlCloneSqlServerInstance -MachineName $instance -InstanceName $instanceName
    $CloneToReset = Get-SqlClone -Name '[YourShadowDatabase]' -Location $SqlServerInstance
    Reset-SqlClone -Clone $CloneToReset
}

Setting it all up

The above callback works wonders when you’re simply replacing your copy of the Shadow DB instead of running the baseline every time, but how do you set it up in the first place? How do we set up the full project from scratch? Well it’s actually pretty easy step by step:

1 – Create the Dev and Shadow Databases

2 – Developer creates local git repo & creates new FWD project

3 – Developer links DEV database & commits schema model

4 – Developer “sets up” shadow database and generates Baseline migration

5 – !IMPORTANT! User does NOT hit finish yet (else you get an ugly error)

6 – Developer changes the Flyway.conf file in the local repo to a) baseline on migrate and b) use the version of the baseline script just generated (see the sketch after these steps)

7 – Developer hits finish and the Flyway_Schema_History table is created and the baseline marked as applied (no clean is ever run)

8 – Changes can now be made, scripts generated and put into Version Control as expected

9 – Add your callbacks (as above) to your repo to replace the Shadow and filter this file out with your .gitignore*

*SQL Server Example given above

10 – Each user that pulls down the project will need their own beforeConnect callback to recreate their own Shadow DB, but once they've created it, the .gitignore will filter it out by default and they won't need to create it again
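For step 6, the two settings in question look something like this – a minimal sketch, with an assumed repo path and an example baseline version (use the version of YOUR generated baseline script):

# Append the baseline settings to the Flyway Desktop project's config file
$conf = "C:\Repos\DMDatabase\DMDatabase\flyway.conf"   # assumed location of the project config

Add-Content $conf "flyway.baselineOnMigrate=true"
Add-Content $conf "flyway.baselineVersion=001.20211210091210"   # example version - replace with your own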

Done

PASS Data Community Summit 2021

“Education is the kindling of a flame, not the filling of a vessel.”
Socrates

I will be speaking at PASS Data Community Summit 2021

I have spoken at previous PASS Summits; both through the virtue of working for Redgate, and off my own back through dedication and passion to the subject matter I speak about: Data Privacy and Protection.

In 2018 I stood on stage with Microsoft to speak about the nature of Static Data Masking, how it differs from Dynamic Masking and what challenges need to be considered for a successful static masking rollout.

In 2019 I stood on stage alone to talk about creating a strategy for masking non-Production environments, including a walkthrough of the dbatools.io masking functionality utilized alongside Azure SQL Database classifications. PASS Summit 2019 was also when Kendra Little encouraged me to set up this blog, for which I’m forever grateful.

In 2020… well. You know what happened.

In 2021, Summit sees a new lease of life. Data Community Summit will be entirely online (no surprises there), but one big surprise you might not know is that it is completely free to attend. Never before has there been SUCH a swathe of incredible speakers, with such a huge variety of topics and learning pathways, for free and available on demand afterwards.

The dates for your diary? November 8-12, 2021

As it happens, I will also be speaking about setting up an end-to-end deployment pipeline using the Flyway Community Edition, Azure SQL Database and Azure DevOps – it would be great to see you, but with so much on offer I could absolutely understand if you watched on catch-up!

You can see all the speakers here, but here's a short list of some of the sessions I will definitely be tuning in to!

  • Erin Stellato – Demystifying Statistics in SQL Server
  • Grant Fritchey – Identify Poorly Performing Queries – Three Tools You Already Own
  • Tracy Boggiano – Azure SQL Fundamentals
  • Angela Tidwell – Azure Devops Dashboards EZ as pie-charts!
  • Indira Bandari – Getting started with Python for Data professionals
  • Jess Pomfret – Azure SQL & Our Toolbox To Manage It
  • Taiob Ali – Think like the Cardinality Estimator
  • Neil Hambly – Azure Notebooks – Data Science fundamentals
  • and many more!

So please go check it out & register, support the community and do a bunch of learning in the process – it will be amazing to see you there and hopefully I’ll even get to see some of you in person in the not-so-far future!

3 RDBMS’, 3 models, 3 end-to-end deployment pipelines with Azure DevOps and Redgate Deploy

“Choice is the most powerful tool we have. Everything boils down to choice. Every choice we make shuts an infinite number of doors and opens an infinite number of doors.”
– Lori Deschene (https://tinybuddha.com/)

PLEASE NOTE (edit 18/12/2021): Two of the three (and soon the third) models below have been superseded by Flyway Desktop – please see my posts on Flyway Desktop setup (first here) and shadow database setup here to set up any workflows past today's date; the below should only be used for legacy purposes or with Flyway Teams for PostgreSQL.

Picking a Set-Up

One of the hardest parts of my job is that at any moment's notice we could be asked to walk through better database change management processes. That's not the challenge though – the problem is that it could be with any kind of tech stack. I might need a Git repo of some shape or form (Azure DevOps, plain ol' Git, Bitbucket etc.) and then a CI server of some kind (Azure DevOps, GitLab, TeamCity, Bamboo etc.) and finally something to handle releases (Azure DevOps, Octopus Deploy, Bamboo etc.) – this is fairly easy to reproduce in multiple combinations with automation, Terraform etc., but when you're actually helping someone set it up you've got to know where all the bits go.

The Redgate tools work with all of these options and combinations so making sure we’re setting everything up right usually means questions about the Repo/CI/CD tooling people choose.

The commonality above, and the one I run into the most for all 3 stages, is Azure DevOps. It's straightforward to understand, all in the same place and just plain fun to use (AND it supports emojis ^_^).

Finally, we have to pick a Relational Database Management System (RDBMS) to use – Redgate Deploy is one of the newest offerings from Redgate and it comprises capabilities for "Database DevOps" across MS SQL Server, Oracle Database and 18 (well, actually 19 now thanks to Flyway v7!) other RDBMSs! So instead of choosing, I'm going to pick the two key ones there, and one of the 18 others: MSSQL, Oracle DB and PostgreSQL.

One final question I had to ask of myself was what models I wanted to use. There are a couple of choices available within the Redgate solution, specifically for MSSQL and Oracle at the moment, so I decided that I would do State based deployments for Oracle and Hybrid deployments for MSSQL, given that PostgreSQL will have to be migrations anyway. Fear not though, the setup is not hugely dissimilar when it comes to the actual pipelines!

Setting up Azure DevOps Repos

This stage was relatively easy – I simply created 3 new projects in my DefaultCollection where I’m going to put the repos for each of the DBs.

and then I created 3 readme files, and cloned all 3 git repos down onto my machine as local repos:

and we’re ready to go!

A quick note: I’m using a mixture of Azure DevOps hosted (for PostgreSQL) and Azure DevOps Server locally installed on my Virtual Machine (for MSSQL/Oracle) with a local agent present to run everything below – you can adopt this methodology or you can use the hosted version, but for the Oracle solution below at least you will need a local agent available (unless you use the DockerHub Image for Schema/Data Compare).

Microsoft SQL Server

The first thing I need to do for all of these is to pick the databases I’ll be working on – for me I’m rather lucky as our demonstration environment has a rather nifty set of databases for me to choose from!

I’m going with SQL Source Control (the MSSQL State component in Redgate Deploy) and SQL Change Automation (the MSSQL Migrations component) both plugged into Management Studio (SSMS) with a set of databases called the ScaryDBA_Dev/Test/Prod environments (which I used SQL Clone to create the copies of), in homage to the wonderful Grant Fritchey.

So the first thing we need to do is get Dev under source control – we’ve refreshed back from Prod so there shouldn’t be any differences and we’re using the Hybrid model, so we’ll need to create the State first. I do this by going to SQL Source Control in SSMS, and linking my DB to Git, creating a State Folder in the top level of my local repo as I do so:

Then once linked I go ahead and source control the initial schema (not sure how? Watch the Redgate University videos here):

Next I setup my Migrations project using SQL Change Automation, creating the Migrations folder in the same top level of my local repo, but now instead of pointing to the database, I’m pointing to my SQL Source Control generated State folder:

Now at this point we get the options to choose filters and comparison options – I would recommend if you’re not sure speak to someone at Redgate or look up the documentation – I often see people wanting to filter out Security/Users/Roles at this stage so it might be worth a look! I just carried on as I only have a few objects anyway!

Connect to the target and create a baseline script (i.e. what does Prod look like now?) again, because I have a minimal setup I’ll go straight from my “Prod” database:

Commit and push and we’re on our way – everything is in version control:

Now I may have cheated by doing MSSQL first – because now actually building and deploying the project is pretty straightforward – much like I have done in previous posts here and here I just used the SQL Change Automation plugins from the Azure DevOps marketplace to first build:

and then deploy the project:

and it all succeeded… the 2nd time around when I remembered to specify which DB I was deploying to!

Oracle Database

The first thing I need to do for all of these is to pick the schemas I’ll be working on… wait, Deja Vu! – well once again I have a little set of schemas present on the demonstration machine that will serve me just fine!

Because we’re working in the State setup, out of Redgate Deploy I’m going to use Source Control for Oracle which allows me to specify the remote repo, the folder to create and even the fact I’m using Azure DevOps Git:

(Step 1 was simply providing the connection details to my Oracle Database, hence why I was on step 2!) – I select the Schema I’ll be putting in Source Control and even get a nifty run down of the structure:

Hit next and give a name to the Project (unsurprisingly I went with HR) and then check in all of your initial objects:

Now, one thing you may have noticed if you're following along, and which I should clarify (I forgot it when setting up this blog post):

  1. You don't need to specify the local repo you cloned down because Source Control for Oracle handles this itself in the back end; if you want it to be part of a local repo with other code in it, use the Working Folder option instead
  2. If you are using Git and NOT the working folder, committing will also Push your objects to the remote – you’ve been warned!

As above, I now head over to Pipelines and hit Create New Pipeline! I check out my repo with the schema objects in it, and add a job to my agent. But what am I going to pick? Well, unlike SQL Change Automation, there's no plugin available on the Azure DevOps Marketplace, so we'll need some good old-fashioned command line calls!

First, let's clean out the CI schema. I'm going to use the "remove all objects" script from the Redgate documentation site and make a call to run it using sqlplus (I'm storing the file locally, but you could even include it in your repo under a build folder maybe?)

echo on
Call exit | sqlplus hr/[passwordredacted]@//localhost:1521/CI @C:\DemoFiles\DropAllObjects.sql
echo off

Next we'll add a call to the command line of Schema Compare for Oracle to build the database from our repo, using the files that were checked out by the agent (an Azure DevOps pre-defined environment variable) – again we're using a similar script from the Redgate DevOps for Oracle site, but because we're deploying ALL objects from version control we don't really want a report per se; this is just to test the schema can be built from the ground up:

"C:\Program Files\Red Gate\Schema Compare for Oracle 5\sco.exe" /deploy /source $(Build.SourcesDirectory)\Schema{HR} /target SYSTEM/[passwordredacted]@localhost:1521/CI{HR} AS SYSDBA /indirect 

echo Build database from state:%ERRORLEVEL%
 
rem IF ERRORLEVEL is 0 then there are no changes.
IF %ERRORLEVEL% EQU 0 (
    echo ========================================================================================================
    echo == Warning - No schema changes detected. ==
    echo ========================================================================================================
)
 
rem IF ERRORLEVEL is 61 there are differences, which we expect.
IF %ERRORLEVEL% EQU 61 (
    echo ========================================================================================================
    echo == Objects were found and built. ==
    echo ========================================================================================================
    rem Reset the ERRORLEVEL to 0 so the build doesn't fail 
    SET ERRORLEVEL=0
)

and assuming this all works, we’ll package up the files into a zip and publish them as an artifact so we can consume them at the release stage!
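The zip itself can be as simple as a PowerShell task running Compress-Archive against the checked-out state folder (the "Schema" folder name here is assumed from the sco.exe call above), with the built-in Publish Build Artifacts task doing the actual publishing:

# Zip the state scripts the agent checked out, ready to be published as a build artifact
Compress-Archive -Path "$env:BUILD_SOURCESDIRECTORY\Schema" `
                 -DestinationPath "$env:BUILD_ARTIFACTSTAGINGDIRECTORY\OracleSchemaState.zip" -Force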

and guess what? It all just worked *cough* on build #23 when I got the syntax right finally…

Of course we can add additional stages to the build as well, such as a check for Invalid Objects and some Unit Testing, but I’ll keep this pretty lean for now!

Now, just like we did for MSSQL we’re going to set up a new deployment pipeline, grab the artifact we’re publishing from the build, enable a CD trigger and we’re going to deploy to, in this case, Acceptance.

Let’s first create a job on the agent to unpack the zip file and see how far we get – I’m just going to dump them in a DeploymentState folder in the working directory:

and… awww thanks Azure DevOps, I needed to hear that!

and now we add yet another command line task, but this one is just going to do a comparison – it's not actually going to deploy anything, because we're going to add a manual intervention step to approve the deployment first! I had a little help again from the Redgate docs for this one, because I keep having to catch cmdline error codes – if I was wise like Alex Yates I probably would have just handled this with PowerShell…

echo off
rem  We generate the deployment preview script artifact here
"C:\Program Files\Red Gate\Schema Compare for Oracle 5\sco.exe" /abortonwarnings:high /b:hdre /i:sdwgvac /source $(System.DefaultWorkingDirectory)\DeploymentState\Schema{HR} /target SYSTEM/Redgate1@localhost:1521/Acceptance{HR} AS SYSDBA /indirect /report:$(System.DefaultWorkingDirectory)\DeploymentState\changes_report.html /scriptfile:$(System.DefaultWorkingDirectory)\DeploymentState\deployment_script.sql > $(System.DefaultWorkingDirectory)\DeploymentState\Warnings.txt

echo Warnings exit code:%ERRORLEVEL%
rem In the unlikely event that the exit code is 63, this mean that a deployment warning has exceeded the allowable threshold (eg, data loss may have been detected)
rem If this occurs it is recommended to review the script, customize it, and perform a manual deployment
 
IF %ERRORLEVEL% EQU 0 (
    echo ========================================================================================================
    echo == No schema changes to deploy
    echo ========================================================================================================

    GOTO END
)
 
IF %ERRORLEVEL% EQU 63 (
    echo ========================================================================================================
    echo == High Severity Warnings Detected! Aborting the build. 
    echo == Review the deployment script and consider deploying manually.
    echo ========================================================================================================
    rem Aborting deployment because high severity warnings were detected
        SET ERRORLEVEL=1
    GOTO END
)
 
rem This is the happy path where we've identified changes and not detected any high warnings
IF %ERRORLEVEL% EQU 61 (
    echo ========================================================================================================
    echo == Schema changes found to deploy - generating deployment script for review
    echo ========================================================================================================
    rem Reset ERRORLEVEL to 0 so the build job doesn't fail
    SET ERRORLEVEL=0
    GOTO END
)
 
:END
EXIT /B %ERRORLEVEL%
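And for anyone who, like Alex, would rather not juggle ERRORLEVEL and GOTOs, here's a rough PowerShell sketch of the same comparison step – the sco.exe switches are unchanged from above and the connection details are placeholders, so treat it as an outline rather than a drop-in replacement:

$sco  = "C:\Program Files\Red Gate\Schema Compare for Oracle 5\sco.exe"
$work = $env:SYSTEM_DEFAULTWORKINGDIRECTORY

# Same comparison and deployment-script generation as the cmd version above, just invoked from PowerShell
& $sco /abortonwarnings:high /b:hdre /i:sdwgvac `
    /source "$work\DeploymentState\Schema{HR}" `
    /target "SYSTEM/[passwordredacted]@localhost:1521/Acceptance{HR}" AS SYSDBA /indirect `
    /report:"$work\DeploymentState\changes_report.html" `
    /scriptfile:"$work\DeploymentState\deployment_script.sql"

# Branch on the documented exit codes rather than wrangling ERRORLEVEL
switch ($LASTEXITCODE) {
    0       { Write-Host "No schema changes to deploy"; exit 0 }
    61      { Write-Host "Schema changes found - deployment script generated for review"; exit 0 }
    63      { Write-Host "High severity warnings detected - review and deploy manually"; exit 1 }
    default { Write-Host "Unexpected sco.exe exit code: $LASTEXITCODE"; exit $LASTEXITCODE }
}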

I then throw in an agentless job (Manual Intervention Step) and then finally (once I have reviewed the deployment report that is produced) one further cmdline call to actually run the deployment script against my Acceptance target:

echo on
Call exit | sqlplus hr/[passwordRedacted]@//localhost:1521/Acceptance @$(System.DefaultWorkingDirectory)\DeploymentState\deployment_script.sql
echo off

I have saved my pipeline, now it’s time to test. So I’m going to make a very quick change (so that something is produced) and see what happens…

Boom. Pipeline done.

One word on this though – I haven’t included an awful lot of frills (error handling, checks, NuGet instead of Zip etc.) so you’re free to bulk this out how you see fit, but by golly it works! Also make sure you tick this on the second Agent Job, else it’ll wipe out your working directory – something that obviously definitely did not happen to me…

PostgreSQL

This one might be cheating a little. As you know I've already set up a CI pipeline with Flyway before, using Azure SQL DBs and the Flyway Docker container as part of the build, and in some cases even tSQLt for Unit Testing too! But this is PostgreSQL, and this is a new blog post, darn it!

Still getting your head around Flyway? Check out the Redgate University videos!

I started out by creating myself a PostgreSQL 10 server in the Azure Portal, because:

  • I can
  • I didn’t want a local install of PostgreSQL
  • I’m not self sabotaging

and I set up a Dev and Test database on it – that is once I remembered to allow my client IP address *sigh* and then connected from Azure Data Studio:

I already have some basic scripts from my last demo that I can use – so I pulled down the latest version of Flyway (V7) and unzipped it into my files:

Then I created a SQL folder in my local repository for the PostgreSQLPipeline (and popped a couple of migrations in – I'm using the StackOverflow scripts, adapted for PostgreSQL from Kendra Little's GitHub, thank you Kendra!). In the previous posts we've had to source control the state or initial baseline of the database; however, as we're using Flyway for PostgreSQL, this requires us to create and name/order the migrations ourselves, so we have plenty of control over that – hence why we can jump straight into building some scripts this time around.

Finally, I pointed the config file for Flyway to that, also taking the opportunity to point it at my Dev DB using the PostgreSQL JDBC:

Now I didn't really NEED to do this step and try things out against Dev, because I already have the scripts, so I could have just started building the pipeline – but it's always worthwhile getting local validation first by running things against Dev and then migrating up!

A quick Flyway Info later and we were good to go – the scripts are recognized so we know we’ve set everything up correctly.
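If you're following along, that check is just a couple of command line calls from the unzipped Flyway folder (which picks the conf file up automatically), along the lines of:

flyway info
flyway migrate

with info listing the pending migrations against Dev and migrate applying them.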

One git add / commit / push and everything is in my repo:

Now as you may know from my other post we can do 1 of 2 things here – we can now either build what we eventually push to the repo using a cmdline call (like we did with the Oracle build) to a machine where we have Flyway installed, or we can use the Docker image.

I’m actually going to use Docker again but this time, instead of specifying the various credentials in a config file that was getting passed to the container, I’m actually going to use Azure DevOps environment variables and build the connection string that way – it’s really easy to keep the variables secret in Pipelines, so I can pass my JDBC connection, complete with Username and Password, as well as my Flyway license key, without worrying someone might get hold of them!

I’m actually going to build against a live PostgreSQL database before deploying, so I also created another DB for me to use: demodb_ci
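Purely as an illustration (these aren't my real values), the pipeline variables I ended up with look something like this, with the password and license key marked as secret in Azure DevOps:

FLYWAY_LOCATIONS = $(Build.Repository.LocalPath)/SQL
JDBC             = jdbc:postgresql://myserver.postgres.database.azure.com:5432/demodb_ci
userName         = devuser@myserver
password         = ********  (secret)
licenseKey       = ********  (secret)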

I actually stole the YAML from my previous pipeline (below) and updated the variables accordingly:

trigger:
- master
 
pool:
  vmImage: 'ubuntu-latest'
 
steps:
- task: DockerInstaller@0
  inputs:
    dockerVersion: '17.09.0-ce'
  displayName: 'Install Docker'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway clean -url=$(JDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise
  displayName: 'Clean build schema'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway migrate -url=$(JDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise
  displayName: 'Run flyway build'

and it ran just fine! Well actually it failed first, because I didn’t have permissions from the IP address that the container was running from, but fortunately Azure has a handy switch in the PostgreSQL Server settings to simply allow Azure Services traffic through the firewall:

Once that was sorted, the first stage (as always) is to download Docker and then we have 2 Flyway container steps:

1 – Clean the schema and make sure the database is empty
2 – Migrate the schema changes

Then we have two options – we could do as we did in the Oracle pipeline and zip up the files, spitting them out at the Release stage and consuming them by calling Flyway from the command line, or we can go ahead and promote our deployment using the same pipeline.

I’m lazy, so I’m going for the latter!

In a normal "production like" situation I would probably take the opportunity to test and check etc. like I did above, but let's keep this super lean – if the build works, I trust the deployment. Let's go ahead and deploy to Production – I'll add this as an additional task in my YAML:

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway migrate -url=$(ProdJDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise
  displayName: 'Promote to Production'

And the deployment was successful! Phew – I think I’ve earned a cup of tea!

Conclusion

In this blog post I have demonstrated 3 different (and initially very simple*) approaches to the source control and deployment of database changes – but there's actually a much wider combination we could have adopted – all 3 models with MSSQL, all 3 models with Oracle, and Migrations for up to 18 other systems like DB2, Snowflake and even SAP HANA! But what did I need to do ALL of this? A single solution: Redgate Deploy**.

Thank you for stopping by! Have an amazing week!

*There is a lot missing from the code I have provided, like additional error handling, tests etc. and all of the above CAN be improved – but did we manage to build and deploy across three different systems all using Azure DevOps? Yes we did. If you intend on using any of the above, please ensure you build in the necessary controls and process around it and always pick what is best for you and your team.

**Redgate Deploy is going from strength to strength, expect to see a wide range of improvements made over the coming months – I won’t be surprised if this blog post is already out of date by the time I finish writing it – that’s how awesome the teams working on all of this are!

3 methods for seeding test data during CI builds with Flyway

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”
Sir Arthur Conan Doyle, Sherlock Holmes

EDIT: Modified 23/12/2020 to include updates to Flyway in v7, method 4 below

Can you tell I’m loving Flyway at the moment? Well I am. It’s JUST SO GOOD! Honestly there are so many things you can do with it! Don’t know what I’m talking about? Check out my posts on xRDBMS DevOps with Flyway and tSQLt unit tests with Flyway and you’ll see what I mean!

As a result of the above posts though I was asked a question that I had to think about for a little bit before having the best possible answer: how can we seed some testing data INTO the build database so that we can run some meaningful tests against it?

This makes perfect sense to me, but there’s also a few different ways to do this – so let’s go fly(way)!


1 – Test Data Migration Scripts

In my previous posts on Flyway (above) I talked about having an entirely separate build folder present within the repository, and a folder of test migrations alongside our schema migrations – I called these the Build_Config folder (containing the build configuration file) and the Test_Migrations folder (unsurprisingly containing testing migrations) in the _Migrations location:

I was using the same build config for 2 purposes: 1) to build the schema migrations from the base version, by passing it the Schema_Migrations location dynamically, and 2) to build the tSQLt framework and testing objects by passing it the Test_Migrations location dynamically.

This actually worked surprisingly well, but even beyond this – the same method can be repurposed, or added to, by augmenting your testing scripts and adding a data insertion task (as an additional script or group of scripts). In my folder, I can simply add a migration like this:
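Something along these lines – the file name, table and values here are purely illustrative:

-- V900.2__Add_test_data.sql (illustrative version number and table)
INSERT INTO dbo.Dogs (DogName, Breed)
VALUES ('Bella', 'Labrador'),
       ('Max', 'Dachshund'),
       ('Luna', 'Border Collie');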

Because of course I like dogs.


and once pushed to the repository and the build has run, we should be able to verify our testing data is present:

A bonus win for this step of course, is that where Devs have their own Flyway config files locally for their development databases, they could also override this behavior and point the testing and/or data scripts at their own database so they have some seed data to work with too!
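For example, a developer's local flyway.conf could simply list both locations (paths illustrative) so the seed data lands in their Dev database too:

# point the local config at both the schema and the test/data migrations
flyway.locations=filesystem:C:\Git\Repo\_Migrations\Schema_Migrations,filesystem:C:\Git\Repo\_Migrations\Test_Migrations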

2 – Add a data generation step to the pipeline

There are SO MANY technologies out on the inter-webs for generating data. SO MANY. Many of them also have a command line or PowerShell module that we can use to easily invoke them against a target, especially if that target is going to be persistent like my Flyway Azure SQL Build DBs!

Because I have access to it and because I’m using essentially SQL Server DBs, I could easily use Redgate SQL Data Generator – but to get the data you need you could use anything from DBATools Data Generation (also SQL Server) to FillDB for MySQL (which looks awesome and you could easily use this for Step 1 above too!)

There are numerous ways to invoke tools and applications and fortunately good CI/CD tools like Azure DevOps offer multiple ways to, for instance, run PowerShell or CLI steps from within the pipeline – so we could easily invoke SQL Data Generator on a VM or physical machine we have an Azure DevOps agent on – but this thinking also opens up the possibility of using something like Chocolatey to dynamically install the software on the Azure DevOps hosted pool VM during build (for the Redgate tools at the moment I suppose you’d need a Windows VM).


I will be writing a future blog post about this step because it sounds _very_ interesting, but I’m not sure yet what can be done specifically using Chocolatey or if I’ll have to look elsewhere, although I have read this post in the past (thanks Paul!) detailing limitations and a great workaround using Azure DevOps, so it’s likely that’ll be my first port of call!

Just to give you an idea of the end result with SQL Data Generator specifically though:

3 – Use existing data, don’t generate

Ok this one is going to be controversial already, I can tell! Let’s all stay calm!


The best data to be tested is our data. What we have in Production is what will have these changes deployed to it… eventually! So shouldn’t we just test against that? Well. Maybe, maybe not depending on what is in there.

There are a few methods to achieve this – my personal favorite would be to use a SQL Clone, spin that up on a build VM rather than using an Azure SQL DB, and we can have all the data in an instant. Of course if we hold any sensitive PII/PHI then we should ensure that is protected first!

Of course there are lots of other options, like restoring a backup or spinning up a container etc. and these can all just be a stage in the YAML file before invoking Flyway but the point is, if we use an existing copy of our Prod database from some source or another, it will have 2 things we really care about:

  1. Data. Ready to go, ready to test, ready to give us the best possible insight into our changes.
  2. The flyway_schema_history table. Instead of running EVERY migration we’ve ever written, which could take a while for a large team, we run only the latest migrations to check that they would deploy happily to the Production target.

To get this stage to work though, you would need to do a couple of things differently:

  1. The build DB would have to be created from the clone/backup/other every time instead of simply cleaning the schema down.
  2. You would need to remove the Flyway Clean step from the pipeline in my previous post, because it would otherwise drop all the tables (and then we wouldn’t have any data!)
  3. By extension, this also makes the callback to remove the tSQLt objects void, so you can remove that too.

4 (Bonus Method) – Script Migrations

In Flyway v7 the team added the ability to also run script Migrations and Callbacks, which means it is possible to invoke .ps1, .bat, .cmd, .sh, .bash, and .py files as part of the version control > build and migration process.

This means that you can include a script to invoke any loading or processing of data you may prefer – you could invoke a data generation utility, data masking and of course anything else that can be invoked with these file formats. A good example of this might be calling Data Generator as above, or you could use DBATools, DTM Data Generator or even a more platform agnostic approach by using a Hazy generator to produce and then load an incredibly realistic data set.
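As a very rough sketch – the file name, environment variables and the sqlcmd call below are all placeholders for whatever tooling you actually use, and this assumes that tool is available on the machine running Flyway – a script migration is just a .ps1 (or .sh, .py etc.) dropped in alongside your versioned SQL migrations:

# V920__Seed_test_data.ps1 – versioned and run by Flyway just like a SQL migration
Write-Host "Seeding test data into the build database..."
# call whatever generation/masking/loading tool you prefer here, e.g. a simple sqlcmd data load
& sqlcmd -S $env:BUILD_SERVER -d $env:BUILD_DATABASE -i "C:\SeedScripts\LoadTestData.sql" -b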

Conclusion

There are a lot of different ways to generate data – you can generate completely synthetic data, you can mask data or use Prod data, it's up to you! Ultimately it will just form another part of your pipeline – just be careful of ordering! You don't want to try generating data into a table that hasn't been built yet.

Respect your YAML file and you’ll get schema, data and unit tests and this will lead to one thing. Greater insight, earlier.


Flyway and tSQLt – migrating to warmer test climates

“If you truly have faith in your convictions, then your convictions should be able to stand criticism and testing.”
DaShanne Stokes

Welcome fellow TestDriven-Development enthusiasts… is what I would say if I actually ever did TDD and didn't just, you know… write regular unit tests after the fact instead.

I’m going to be honest, I love the idea of TDD but have I ever actually been able to do it? No. Have competent developers been able to do it successfully? Yes, of course. Don’t know anything about TDD? You’re in luck! Click here for an introduction (don’t worry though, THIS post is not going to be about TDD anyway, so you can also keep reading).

But one thing we can all agree on is that testing is pretty important. Testing has evolved over the years though and there are a million-and-one ways to test your code, but one of the most difficult and frustrating things to test, from experience, is database code.


Some people argue that the days of testing, indeed, the days of stored procedures themselves are gone and that everything we do in databases should be tested using a combination of different logic and scripting languages like Python or PowerShell… but we’re not quite there yet, are we?

Fortunately though we're not alone in this endeavor – we have access to one of the best ways to test T-SQL code: tSQLt. You can read more about tSQLt at the site here, but in short – we have WAYS to test your SQL Server* code. The only problem is, when you're using a migrations approach… how?

*There are also many ways to unit test code from other RDBMS’ of course, like utPLSQL for Oracle Database or pgTAP for PostgreSQL – would this method work for those? Maybe! Try adapting the method below and let me know how you get on!

I've already talked about how implementing tests is easier for state-based database source control in a previous post, because we can easily filter tests out when deploying to later stage environments. With migrations, however, this can be a real pain because you effectively have to work on tests like you would any normal database changes, and maybe even check them in at the same time – so ultimately they should be managed in the same way as database schema migrations… but we can't filter them out of migrations, or easily pick and choose which migrations get run against Test and Prod, without a whole lot of manual intervention.

Basically. It’s a mess.


But during my last post about Flyway I was inspired. This simple and easy to use technology just seems to make things really easy and seemingly has an option for EVERYTHING, so the question I started asking myself was: “How hard would it be to adapt this pipeline to add unit tests?” and actually although there were complications, it was still easier than I thought it would be! Here’s how you can get up and running with the tSQLt framework and Flyway migrations.

1 – Download the scripts to create the tSQLt framework and tests from the site

Ok this was the easiest step of them all, largely because in the zip file you download from the tSQLt website all you have is a pair of scripts: the first needed to enable CLR and the second to install the tSQLt framework:

As part of my previous pipeline I’m actually using Azure SQL Database as my development environment, where RECONFIGURE is not a supported keyword and where we don’t need to run the CLR script anyway, so all I needed was the tSQLt.class.sql file.

The good thing about this is that we can copy it across into a migration and have this as our base test class migration, and then any tests we write on top of it will just extend it – so as long as we remember to update it _fairly_ frequently with any new tSQLt release, we should be fine! (Flyway won't throw an error because these are non-persistent build objects, so no awkward checksum violations to worry about!)

2 – Adapt the folder structure in the repository for tests

I added 2 new folders to my _Migrations top level folder, a Schema_Migrations folder and a Test_Migrations folder. When you pass Flyway a location for migrations, it will recursively scan folders in that location looking for migrations to run in order. I copied the migrations I had previously into the Schema Migrations folder and then my new tSQLt creating migration into the Test Migrations folder. This allows them to be easily coupled by developers, whether you’re writing unit tests or practicing TDD:

You’ll have noticed I called my base testing migration V900__ – this is because I do still want complete separation and if we have a V5 migration in schema migrations and a V5 testing migration, we’re going to have some problems.
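So the repository layout ends up looking roughly like this (the migration names here are just examples):

_Migrations
  Schema_Migrations
    V1__Base_schema.sql
    V2__Some_schema_change.sql
  Test_Migrations
    V900__tSQLt_class.sql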

3 – Add a callback to handle removal of the objects

As I was putting this together, I noticed that I could use flyway migrate to run the tSQLt framework against my Dev database, but every time I tried to then flyway clean that database I got a very nasty error stating that the tSQLt assembly could not be removed because of dependent objects.

Flyway does not handle complex dependencies very well unfortunately – that's where you'd use an industry-leading comparison tool like SQL Compare – so, with some advice from the wonderful Flyway team, I set to work on a callback. A callback is how you can hook into Flyway's own processes, telling it to do something before, during or after certain commands. In my case we were going to remove all of the tSQLt objects prior to running Flyway clean to remove the rest of the schema. To make it future-proof (in case objects are added to or removed from the tSQLt framework), I wrote a couple of cursors to go through the different objects that were dependent on the assembly and remove them, rather than generating a script I know to have all of the tSQLt objects in right now. You can find the code for the callback in my GitHub here, you are welcome to it!
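Just to illustrate the shape of it, here is a heavily simplified sketch of the idea – it is not the full callback from GitHub, it ignores drop ordering and less common object types, and the assembly name may differ in your install:

-- beforeClean.sql (simplified sketch – the real callback handles more object types and dependency order)
DECLARE @sql nvarchar(max);

DECLARE tsqlt_objects CURSOR FOR
    SELECT 'DROP ' + CASE o.type
                        WHEN 'P'  THEN 'PROCEDURE'
                        WHEN 'PC' THEN 'PROCEDURE'   -- CLR procedures
                        WHEN 'V'  THEN 'VIEW'
                        WHEN 'U'  THEN 'TABLE'
                        ELSE 'FUNCTION'              -- FN/IF/TF/FS/FT functions
                     END + ' tSQLt.' + QUOTENAME(o.name) + ';'
    FROM sys.objects o
    WHERE SCHEMA_NAME(o.schema_id) = 'tSQLt'
      AND o.type IN ('P','PC','V','U','FN','IF','TF','FS','FT');

OPEN tsqlt_objects;
FETCH NEXT FROM tsqlt_objects INTO @sql;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sp_executesql @sql;
    FETCH NEXT FROM tsqlt_objects INTO @sql;
END
CLOSE tsqlt_objects;
DEALLOCATE tsqlt_objects;

-- with the dependent objects gone, the schema and assembly can follow
IF EXISTS (SELECT 1 FROM sys.schemas WHERE name = 'tSQLt')
    EXEC ('DROP SCHEMA tSQLt;');
IF EXISTS (SELECT 1 FROM sys.assemblies WHERE name = 'tSQLtCLR')
    EXEC ('DROP ASSEMBLY tSQLtCLR;');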


All you have to do is name it beforeClean.sql and ensure it is in the directory with your other SQL migrations so that Flyway will pick it up and run it – I put it in my Test_Migrations folder, because I only want this callback to run when cleaning the build DB, as this is the only place we're utilizing automated unit tests… for now!

4 – Update the Azure DevOps pipeline

I've got my callback, I've got my tSQLt migration and the folder structure is all correct and pushed to Azure DevOps, but naturally it is breaking the build *sad*. Fortunately all we now have to do is update the YAML pipeline file:

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

steps:
- task: DockerInstaller@0
  inputs:
    dockerVersion: '17.09.0-ce'
  displayName: 'Install Docker'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run -v $(FLYWAY_LOCATIONS)/Test_Migrations:/flyway/sql -v $(FLYWAY_CONFIG_FILES):/flyway/conf flyway/flyway clean -enterprise
  displayName: 'Clean build schema'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run -v $(FLYWAY_LOCATIONS)/Schema_Migrations:/flyway/sql -v $(FLYWAY_CONFIG_FILES):/flyway/conf flyway/flyway migrate -enterprise
  displayName: 'Run flyway for schema'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run -v $(FLYWAY_LOCATIONS)/Test_Migrations:/flyway/sql -v $(FLYWAY_CONFIG_FILES):/flyway/conf flyway/flyway migrate -enterprise
  displayName: 'Run flyway for tSQLt'

You will notice a couple of important things about the YAML above:

  1. I'm cleaning the build schema using the Test_Migrations location – this is because that is where my callback is, and I need that to run before the clean, otherwise it will fail due to the tSQLt assembly issue (the 'Clean build schema' task).
  2. I am running the migrate for the tests and the schema separately in the file, instead of just calling Flyway to recursively run everything in the _Migrations folder. This is because I want them to be 2 separate steps, in case I need to modify or remove either one of them, or insert other steps in between, and so that I can see the testing output in a separate stage of the CI pipeline (the two migrate tasks).

Caveat: as a result of running the 2 processes separately (point 2 above), we run Flyway twice, mapping first the Schema_Migrations folder and then the Test_Migrations folder to Flyway's sql directory in the YAML. The problem this causes is that the second time Flyway runs, when it scans the Test_Migrations folder it will not find the schema migrations that are already recorded in the flyway_schema_history table, resulting in an error as Flyway is unable to find and resolve those migrations locally.

The way to fix this though is pretty simple – you find the setting in the Flyway config file called ignoreMissingMigrations and set it to true, which allows Flyway to carry on happily. We wouldn't have to worry about this setting at all if we just recursively migrated the Schema and Test migrations in the same step (but I'm a control freak tee-hee).
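In other words, it's just one extra line in the conf file:

flyway.ignoreMissingMigrations=true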

Now, once committed this all runs really successfully. Velvety smooth one might even say… but we’re not actually testing anything yet.

5 – Add some tests!

I've added a single tSQLt test to my repository (also available at the same GitHub link); it was originally created by George Mastros and is part of the SQLCop analysis tests – checking if I have any user procedures named "SP_", as we know that is bad practice – and I have wrapped it up in a new tSQLt test class ready to run.
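The original is worth grabbing from the GitHub link, but just to give a flavor of the shape of it – this is a stripped-down illustration rather than George's actual code:

-- V950__SQLCop_style_tests.sql (illustrative version number)
EXEC tSQLt.NewTestClass 'somenewclass';
GO

CREATE PROCEDURE [somenewclass].[testProceduresNamedSP_]
AS
BEGIN
    -- fail the test if any user stored procedures use the SP_ prefix
    IF EXISTS (SELECT 1 FROM sys.procedures WHERE name LIKE 'SP[_]%')
        EXEC tSQLt.Fail 'Found one or more stored procedures named with the SP_ prefix';
END;
GO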

You'll notice I also have a V999.9__ migration in the folder; the purpose of this was to 'top and tail' the migrations – first a script to set up tSQLt that can be easily maintained in isolation, and then a closing script that lets me do just one thing: execute all of the tests. You can do this by simply executing:

EXEC tSQLt.RunAll

and we should be able to capture this output in the relevant stage of the pipeline.

Some of you may be asking why I chose to run the unit tests as part of setting up the testing objects – this was because I had 2 options:

  1. I’m already executing scripts against the DB with Flyway, I may as well just carry on!
  2. The only other way I could think to do it was via a PowerShell script or run SQL job in Azure DevOps but the 2 plugins I tried fell over because I was using a Ubuntu machine for the build.

So naturally being the simple person I am, I opted for 1! But you could easily go for the second if you prefer!

6 – Test, Test, Test

Once you've handled the setup, got the callback in place (and also followed the steps from the last blog post to get this set up in the first place!) you should be able to commit all these changes and have a build that runs, installs tSQLt and then runs your tests:

I realize there are a lot of "Warnings" in there, but that is just Azure DevOps capturing the output; the real part we're interested in is the test execution summary, and if we clean up the warnings a little you'll get:

+----------------------+
|Test Execution Summary|
+----------------------+
|No|Test Case Name|Dur(ms)|Result |
+--+---------------------------------------+-------+-------+ 
|1 |[somenewclass].[testProceduresNamedSP_]|144|Success|
------------------------------------------------------------
Test Case Summary: 
1 test case(s) executed, 1 succeeded, 0 failed, 0 errored. 
------------------------------------------------------------------

But if I introduce a migration to Flyway with a new Repeatable Migration that creates a stored procedure named SP_SomeNewProc…

+----------------------+
|Test Execution Summary|
+----------------------+
|No|Test Case Name|Dur(ms)|Result |
+--+---------------------------------------+-------+-------+ 
|1 |[somenewclass].[testProceduresNamedSP_]|184|Failure|
------------------------------------------------------------
Test Case Summary: 
1 test case(s) executed, 0 succeeded, 1 failed, 0 errored. 
------------------------------------------------------------------

It even tells us the name of the offending sproc:

All I have to do now is make the corresponding change to remove SP_ in dev against a bug fix branch, push it, create a PR, approve and merge it in and then boom, the build is right as rain again:

Thus bringing us back into line with standard acceptable practice, preventing us from delivering poor coding standards later in the pipeline and ensuring that we test our code before deploying.

Conclusion

Just because you adopt a more agile, migrations based method of database development and deployment, doesn’t mean that you have to give up on automated testing during Continuous Integration, and you can easily apply these same principles to any pipeline. With just a couple of tweaks you can easily have a fully automated Flyway pipeline (even xRDBMS) and incorporate Unit Tests too!

xRDBMS Database Continuous Integration with Flyway, Azure DevOps and Docker… the simple way.

“Some people try to make everything complicated, be the person who tries to make everything simple.”
Dave Waters

Simplicity is in my blood. That’s not to say I am ‘simple’ in the sense I cannot grasp more than the most basic concepts, but more that I am likely to grasp more complex problems and solutions when they are phrased in simple ways.

This stems from my love of teaching others (on the rare occasion it falls to me to do so), where I find the moment that everything just ‘clicks’ and the realization comes over them to be possibly one of the most satisfying moments one can enjoy in life.


Now recently I’ve been enjoying getting my head around Flyway – an open source JDBC based migrations tool that brings the power of schema versioning and deployments together with the agility that developers need to focus on innovation in Development. There’s something about Flyway that just… ‘clicks’.

It doesn't really matter what relational database you're using: MySQL, IBM DB2, even SAP HANA! You can achieve at least the core tenets of database DevOps with this neat and simple little command line tool – there's not even an installer, you just have to unzip!

Now I've had a lot of fun working with Flyway so far and, thanks to a few people (Kendra, Julia – I'm looking at you both!), I have been able to wrap my head around it to, I would say, a fair standard. Caveat on that – being a pure SQL person please don't ask me about Java-based migrations, I'm not quite there yet!! But there is one thing that I kept asking myself:

“When I’m talking to colleagues and customers about Database DevOps, I’m always talking about the benefits of continuous integration; building the database from scratch to ensure that everything builds and validates…” etc. etc. so why haven’t I really come across this with Flyway yet?


Probably for a few reasons. You can include Flyway as a plugin in your Maven and Gradle configurations, so people writing Java projects already get that benefit. Flyway itself is, by its very nature, simply small incremental scripts, and developers can go backwards and forwards however and as many times as they like with the Flyway Migrate, Undo and Clean commands – so is there really a need for a build? And most importantly, Flyway's API just allows you to build it in, so naturally you're building WITH the application.

But naturally when you're putting your code together with other people's code, things have to be tested and verified, and I like to do this in isolation too – especially for databases that are decoupled from the application, or if you have a number of micro-service style databases you'd want to test in parallel; it's a great way to shift left. So I started asking myself: is there some way I could implement a CI build using Flyway in Azure DevOps, like I would with any of the other database tooling I use on a regular basis? Below you'll find the product of my tinkering, and a whole heap of help from Julia and Kendra, without whom I would still be figuring out what Baseline does!

Option 1) The simplest option – cmdline

Flyway can be called via the command line and it doesn't get simpler than that.

You can pass any number of arguments and switches to Flyway's command line, including specifying which config files it's going to use – which means that all you have to do is unzip the Flyway components onto a dedicated build server (VM or on-prem) and then, after refreshing the migrations available, invoke the command line using Azure DevOps pipelines (or another CI tool) to run Flyway with the commands against a database on the build server (or somewhere accessible to it) and Bingo!
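Roughly speaking – the path and config file name below are just placeholders for wherever you unzipped Flyway and keep your build config – the build step simply ends up invoking something like:

flyway -configFiles=C:\FlywayBuild\conf\build.conf clean
flyway -configFiles=C:\FlywayBuild\conf\build.conf migrate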


And that’s all there is to it! You get to verify that all of the migrations up to the very latest in your VCS will run, and even if you don’t have the VERY base version as a baseline migration, you can still start with a copy of the database – you could even use a Clone for that!

But yes, this does require somewhere for Flyway to exist prior to us running with our migrations… wouldn’t it be even easier if we could do it without even having to unzip Flyway first?

Option 2) Also simple, but very cool! Flyway with Docker

Did you know that Flyway has its own Docker image? No? Well it does!* Not only that, but we can map our own version-controlled migration scripts and config files to the container so that, if it can point at a database, you sure as heck know it's going to migrate to it!

*Not sure what the heck all of this Docker/Container stuff is? You’re not alone! Check out this great video on all things containers from The Simple Engineer!

This was the method I tried, and it all started with putting a migration into Version Control. Much like I did for my post on using SQL Change Automation with Azure SQL DB – I set up a repo in Azure DevOps, cloned it down to my local machine and I added a folder for the migrations:

Into this I proceeded to add my base script for creating the DMDatabase (the database I use for EVERYTHING, for which you can find the scripts here):

Once I had included my migration I did the standard

git add .
git commit -m "Here is some code"
git push

and I had a basis from which to work.

Next step then was making sure I had a database to work with. Now the beauty of Flyway means that it can easily support 20+ RDBMS’ so I was like a child at a candy store! I didn’t know what to pick!

For pure ease and again, simplicity, I went for good ol’ SQL Server – or to be precise, I created an Azure SQL Database (at the basic tier too so it’s only costing £3 per month!):

Now here’s where it gets customizable. You don’t NEED to actually even pass in a whole config file to this process. Because the Flyway container is going to spin up everything that would come with an install of Flyway, you can pass it switches to override the default behavior specified in the config file. You can adapt this either by hard-coding strings or by using Environment Variables alongside the native switches – this means you could pass in everything you might need securely through Azure Pipeline’s own methods.
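For example – the connection details below are placeholders – you could skip the conf file entirely and pass everything as switches on the container, or as the FLYWAY_* environment variables the image understands:

docker run --rm flyway/flyway -url="jdbc:sqlserver://myserver.database.windows.net:1433;databaseName=Flyway_Build" -user=builduser -password=NotMyRealPassword info
# the same settings can also be supplied as FLYWAY_URL / FLYWAY_USER / FLYWAY_PASSWORD environment variables on the container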

I, on the other hand, was incredibly lazy and decided to use the same config file I use for my Dev environment, but I swapped out the JDBC connection to instead be my Build database:

I then saved this new conf file in my local repo under a folder named Build Configuration – in case I want to add any logic later on to include in the build (like the tSQLt framework and tests! Hint hint!)

This means that I would only need to specify 2 things as variables, the location of my SQL migrations, and the config file. So the next challenge was getting the docker container up and running, which fortunately it’s very easy to do in Azure Pipelines, here was the entirety of the YAML to run Flyway in a container (and do nothing with it yet):

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

steps:
- task: DockerInstaller@0
  inputs:
    dockerVersion: '17.09.0-ce'
  displayName: 'Install Docker'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run flyway/flyway -v
  displayName: 'Run Flyway'

So, on any changes to the main branch we’ll be spinning up a Linux VM, grabbing Docker and firing up the Flyway container. That’s it. Simple.

So now I just have to pass in my config file, which is already in my 'build config' folder, and my migrations which are in my VCS root. To do this it was a case of mapping where Azure DevOps stores the files from Git during the build to the container's own mount location, in which it expects to find the relevant conf and sql files. Fortunately Flyway and Docker have some pretty snazzy and super clear documentation on this – so it was a case of using:

-v [my sql files in vcs]:/flyway/sql

as part of the run – though I had to ensure I also cleaned the build environment first, otherwise it would just be like deploying to a regular database, and we want to make sure we can build from the ground up every single time! This led to me having the following environment variables:
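Just to illustrate (the folder names will mirror however you've laid out your own repo), they point at the checked-out files like so:

FLYWAY_LOCATIONS    = $(Build.Repository.LocalPath)/Migrations
FLYWAY_CONFIG_FILES = $(Build.Repository.LocalPath)/BuildConfiguration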

Rather helpfully, all of our files from Git are copied to the working directory during the build, and we can use the variable $(Build.Repository.LocalPath) to grab them! This led to me updating my YAML to actually do some Flyway running when we spin up the container:

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

steps:
- task: DockerInstaller@0
  inputs:
    dockerVersion: '17.09.0-ce'
  displayName: 'Install Docker'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run -v $(FLYWAY_LOCATIONS):/flyway/sql -v $(FLYWAY_CONFIG_FILES):/flyway/conf flyway/flyway clean -enterprise
  displayName: 'Clean build schema'

- task: Bash@3
  inputs:
    targetType: 'inline'
    script: docker run -v $(FLYWAY_LOCATIONS):/flyway/sql -v $(FLYWAY_CONFIG_FILES):/flyway/conf flyway/flyway migrate -enterprise
  displayName: 'Run flyway for schema'

Effectively, this will spin up the VM in ADO, download and install Docker, fire up the Flyway container and then 1) clean the target schema (my Azure SQL DB in this case) and 2) migrate all of the migration scripts in the repo up to the latest version – and this all seemed to work great!*

*Note: I have an Enterprise Flyway license, which enables loads of great features and support; the different editions are compared here.

So now, whenever I add Flyway SQL migrations to my repo as part of a branch, I can create a PR, merge them back into Trunk and trigger an automatic build against my Flyway build DB in Azure SQL:

Conclusion

Getting up and running with Flyway is so very very easy, anyone can do it – it’s part of the beauty of the technology, but it turns out getting the build up and running too, when you’re not just embedding it directly within your application, is just as straightforward and it was a great learning curve for me!

The best part about this though – is that everything above can be achieved using pretty much any relational database management system you would like, either via the command line and a dedicated build server, or via the Docker container at build time. So get building!
