3 RDBMS’, 3 models, 3 end-to-end deployment pipelines with Azure DevOps and Redgate Deploy

“Choice is the most powerful tool we have. Everything boils down to choice. Every choice we make shuts an infinite number of doors and opens an infinite number of doors.”
– Lori Deschene (https://tinybuddha.com/)

Picking a Set-Up

One of the hardest parts of my job is that at any moments notice we could be asked to walk through better database change management processes. That’s not the challenge, the problem is that it could be with any kind of tech stack. I might need a Git Repo of some shape or form (Azure DevOps, plain ol’ Git, Bitbucket etc.) and then a CI server of some kind (Azure DevOps, GitLab, TeamCity, Bamboo etc.) and finally something to handle releases (Azure DevOps, Octopus Deploy, Bamboo etc.) – this is fairly easy to reproduce in multiple combinations with automation, terraform etc. but when you’re actually helping someone set it up – you’ve got to know where all the bits go.

The Redgate tools work with all of these options and combinations so making sure we’re setting everything up right usually means questions about the Repo/CI/CD tooling people choose.

The commonality above and the one I run into the most for all 3 stages, is Azure DevOps. Its straightforward to understand, all in the same place and just plain fun to use (AND it supports emojis ^_^).

Finally now, we have to pick a Relational Database Management System (RDBMS) to use – Redgate Deploy is one of the newest offerings from Redgate and it comprises capabilities for “Database DevOps” across MS SQL Server, Oracle Database and 18 (well actually 19 now thanks to Flyway v7!) other RDBMs‘! So instead of choosing, I’m going to pick the two key ones there, and one of the 18 others: MSSQL, Oracle DB and PostgreSQL.

One final question I had to ask of myself was what models I wanted to use. There are a couple of choices available within the Redgate solution, specifically for MSSQL and Oracle at the moment, so I decided that I would do State based deployments for Oracle and Hybrid deployments for MSSQL, given that PostgreSQL will have to be migrations anyway. Fear not though, the setup is not hugely dissimilar when it comes to the actual pipelines!

Setting up Azure DevOps Repos

This stage was relatively easy – I simply created 3 new projects in my DefaultCollection where I’m going to put the repos for each of the DBs.

and then I created 3 readme files, and cloned all 3 git repos down onto my machine as local repos:

and we’re ready to go!

A quick note: I’m using a mixture of Azure DevOps hosted (for PostgreSQL) and Azure DevOps Server locally installed on my Virtual Machine (for MSSQL/Oracle) with a local agent present to run everything below – you can adopt this methodology or you can use the hosted version, but for the Oracle solution below at least you will need a local agent available (unless you use the DockerHub Image for Schema/Data Compare).

Microsoft SQL Server

The first thing I need to do for all of these is to pick the databases I’ll be working on – for me I’m rather lucky as our demonstration environment has a rather nifty set of databases for me to choose from!

I’m going with SQL Source Control (the MSSQL State component in Redgate Deploy) and SQL Change Automation (the MSSQL Migrations component) both plugged into Management Studio (SSMS) with a set of databases called the ScaryDBA_Dev/Test/Prod environments (which I used SQL Clone to create the copies of), in homage to the wonderful Grant Fritchey.

So the first thing we need to do is get Dev under source control – we’ve refreshed back from Prod so there shouldn’t be any differences and we’re using the Hybrid model, so we’ll need to create the State first. I do this by going to SQL Source Control in SSMS, and linking my DB to Git, creating a State Folder in the top level of my local repo as I do so:

Then once linked I go ahead and source control the initial schema (not sure how? Watch the Redgate University videos here):

Next I setup my Migrations project using SQL Change Automation, creating the Migrations folder in the same top level of my local repo, but now instead of pointing to the database, I’m pointing to my SQL Source Control generated State folder:

Now at this point we get the options to choose filters and comparison options – I would recommend if you’re not sure speak to someone at Redgate or look up the documentation – I often see people wanting to filter out Security/Users/Roles at this stage so it might be worth a look! I just carried on as I only have a few objects anyway!

Connect to the target and create a baseline script (i.e. what does Prod look like now?) again, because I have a minimal setup I’ll go straight from my “Prod” database:

Commit and push and we’re on our way – everything is in version control:

Now i may have cheated by doing MSSQL first – because now actually building and deploying the project is pretty straight forward – much like I have done in previous posts here and here I just used the SQL Change Automation plugins from the Azure DevOps marketplace to first build:

and then deploy the project:

and it all succeeded… the 2nd time around when I remembered to specify which DB I was deploying to!

Oracle Database

The first thing I need to do for all of these is to pick the schemas I’ll be working on… wait, Deja Vu! – well once again I have a little set of schemas present on the demonstration machine that will serve me just fine!

Because we’re working in the State setup, out of Redgate Deploy I’m going to use Source Control for Oracle which allows me to specify the remote repo, the folder to create and even the fact I’m using Azure DevOps Git:

(Step 1 was simply providing the connection details to my Oracle Database, hence why I was on step 2!) – I select the Schema I’ll be putting in Source Control and even get a nifty run down of the structure:

Hit next and give a name to the Project (unsurprisingly I went with HR) and then check in all of your initial objects:

Now one thing that you may have noticed if you’re following along that I should clarify (and which I forgot when setting up this blog post):

  1. You don’t need to specify the local repo you cloned down because Source Control for Oracle handles this itself in the back end, if you want it to be part of a local repo with other code in it, use the Working Folder instead
  2. If you are using Git and NOT the working folder, committing will also Push your objects to the remote – you’ve been warned!

As above, I now head over to Pipelines and hit Create New Pipeline! I check out my repo with the schema objects in it, and add a job to my agent. But what am I going to pick? Well unlike SQL Change Automation there’s not a plugin available on the Azure DevOps Marketplace, we’ll need some good old fashioned command line calls!

First, let’s clean out the CI Schema, I’m going to use the script to remove all objects from the Redgate documentation site and make a call to run the script using sqlplus (I’m storing the file locally but you could even include it in your repo under a build folder maybe?)

echo on
Call exit | sqlplus hr/[passwordredacted]@//localhost:1521/CI @C:\DemoFiles\DropAllObjects.sql
echo off

Next we’ll add a call to the cmdline of Schema Compare for Oracle to build the database from our repo, using the files that were checked out by the agent (an Azure DevOps pre-defined environment variable) – again we’re using a similar script from the Redgate DevOps for Oracle site but because we’re deploying ALL objects from version control, we don’t really want a report per say, this is just to test the schema can be built from the ground up:

"C:\Program Files\Red Gate\Schema Compare for Oracle 5\sco.exe" /deploy /source $(Build.SourcesDirectory)\Schema{HR} /target SYSTEM/[passwordredacted]@localhost:1521/CI{HR} AS SYSDBA /indirect 

echo Build database from state:%ERRORLEVEL%
 
rem IF ERRORLEVEL is 0 then there are no changes.
IF %ERRORLEVEL% EQU 0 (
    echo ========================================================================================================
    echo == Warning - No schema changes detected. == echo ========================================================================================================
)
 
rem IF ERRORLEVEL is 61 there are differences, which we expect.
IF %ERRORLEVEL% EQU 61 (
    echo ========================================================================================================
    echo == Objects were found and built. ==
    echo ========================================================================================================
    rem Reset the ERRORLEVEL to 0 so the build doesn't fail 
    SET ERRORLEVEL=0
)

and assuming this all works, we’ll package up the files into a zip and publish them as an artifact so we can consume them at the release stage!

and guess what? It all just worked *cough* on build #23 when I got the syntax right finally…

Of course we can add additional stages to the build as well, such as a check for Invalid Objects and some Unit Testing, but I’ll keep this pretty lean for now!

Now, just like we did for MSSQL we’re going to set up a new deployment pipeline, grab the artifact we’re publishing from the build, enable a CD trigger and we’re going to deploy to, in this case, Acceptance.

Let’s first create a job on the agent to unpack the zip file and see how far we get – I’m just going to dump them in a DeploymentState folder in the working directory:

and… awww thanks Azure DevOps, I needed to hear that!

and now we add yet another command line task, but this one is just going to do a comparison, it’s not actually going to deploy anything – because we’re going to add a manual intervention step to approve the deployment first! I had a little help again from the Redgate docs for this one, because I keep having to catch cmdline error codes – if I was wise like Alex Yates I probably would have just handled this with PowerShell…

echo off
rem  We generate the deployment preview script artifact here
"C:\Program Files\Red Gate\Schema Compare for Oracle 5\sco.exe" /abortonwarnings:high /b:hdre /i:sdwgvac /source $(System.DefaultWorkingDirectory)\DeploymentState\Schema{HR} /target SYSTEM/Redgate1@localhost:1521/Acceptance{HR} AS SYSDBA /indirect /report:$(System.DefaultWorkingDirectory)\DeploymentState\changes_report.html /scriptfile:$(System.DefaultWorkingDirectory)\DeploymentState\deployment_script.sql > $(System.DefaultWorkingDirectory)\DeploymentState\Warnings.txt

echo Warnings exit code:%ERRORLEVEL%
rem In the unlikely event that the exit code is 63, this mean that a deployment warning has exceeded the allowable threshold (eg, data loss may have been detected)
rem If this occurs it is recommended to review the script, customize it, and perform a manual deployment
 
IF %ERRORLEVEL% EQU 0 (
    echo ========================================================================================================
    echo == No schema changes to deploy
    echo ========================================================================================================

    GOTO END
)
 
IF %ERRORLEVEL% EQU 63 (
    echo ========================================================================================================
    echo == High Severity Warnings Detected! Aborting the build. 
    echo == Review the deployment script and consider deploying manually.
    echo ========================================================================================================
    rem Aborting deployment because high severity warnings were detected
        SET ERRORLEVEL=1
    GOTO END
)
 
rem This is the happy path where we've identified changes and not detected any high warnings
IF %ERRORLEVEL% EQU 61 (
    echo ========================================================================================================
    echo == Schema changes found to deploy - generating deployment script for review
    echo ========================================================================================================
    rem Set ERROLEVEL to 0 so the build job doesn't fail
	SET ERRORLEVEL=0
    GOTO END
)
 
:END
EXIT /B %ERRORLEVEL%

I then throw in an agentless job (Manual Intervention Step) and then finally (once I have reviewed the deployment report that is produced) one further cmdline call to actually run the deployment script again my Acceptance target:

echo on
Call exit | sqlplus hr/[passwordRedacted]@//localhost:1521/Acceptance @$(System.DefaultWorkingDirectory)/DeploymentState\deployment_script.sql
echo off

I have saved my pipeline, now it’s time to test. So I’m going to make a very quick change (so that something is produced) and see what happens…

Boom. Pipeline done.

One word on this though – I haven’t included an awful lot of frills (error handling, checks, NuGet instead of Zip etc.) so you’re free to bulk this out how you see fit, but by golly it works! Also make sure you tick this on the second Agent Job, else it’ll wipe out your working directory – something that obviously definitely did not happen to me…

PostgreSQL

This one might be cheating a little. As you know I’ve already setup a CI pipeline with Flyway before, using Azure SQL DBs and the Flyway Docker container as part of the build, and in some cases even tSQLt for Unit Testing too! But this is PostgreSQL, and this is a new blog post, darn it!

Still getting your head around Flyway? Check out the Redgate University videos!

I started out by creating myself a PostgreSQL 10 server in the Azure Portal, because:

  • I can
  • I didn’t want a local install of PostgreSQL
  • I’m not self sabotaging

and I set up a Dev and Test database on it – that is once I remembered to allow my client IP address *sigh* and then connected from Azure Data Studio:

I already have some basic scripts from my last demo that I can use – so I pulled down the latest version of Flyway (V7) and unzipped it into my files:

Then I created a SQL folder in my local repository for the PostgreSQLPipeline (and popped a couple of migrations in – I’m using the StackOverflow scripts, adapted for PostgreSQL from Kendra Little’s GitHub, thank you Kendra!) – in the previous posts we’ve had to source control the state or initial baseline of the database, however as we’re using Flyway for PostgreSQL this requires us to create and name/order the migrations ourselves, so we have plenty of control over that – hence why we can jump straight into building some scripts this time around.

Finally, I pointed the config file for Flyway to that, also taking the opportunity to point it at my Dev DB using the PostgreSQL JDBC:

Now i didn’t really NEED to do this step and try things out against Dev, because I already have the scripts, so I could have just started building the pipeline – but it’s always worthwhile getting local validation first by running things against Dev and then migrating up!

A quick Flyway Info later and we were good to go – the scripts are recognized so we know we’ve set everything up correctly.

One git add / commit / push and everything is in my repo:

Now as you may know from my other post we can do 1 of 2 things here – we can now either build what we eventually push to the repo using a cmdline call (like we did with the Oracle build) to a machine where we have Flyway installed, or we can use the Docker image.

I’m actually going to use Docker again but this time, instead of specifying the various credentials in a config file that was getting passed to the container, I’m actually going to use Azure DevOps environment variables and build the connection string that way – it’s really easy to keep the variables secret in Pipelines, so I can pass my JDBC connection, complete with Username and Password, as well as my Flyway license key, without worrying someone might get hold of them!

I’m actually going to build against a live PostgreSQL database before deploying, so I also created another DB for me to use: demodb_ci

I actually stole the YAML from my previous pipeline (below) and updated the variables accordingly:

trigger:
- master
 
pool:
  vmImage: 'ubuntu-latest'
 
steps:
- task: DockerInstaller@0
  inputs:
    dockerVersion: '17.09.0-ce'
  displayName: 'Install Docker'

- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway clean -url=$(JDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise 
  displayName: 'Clean build schema'
 
- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway migrate -url=$(JDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise 
  displayName: 'Run flyway build'

and it ran just fine! Well actually it failed first, because I didn’t have permissions from the IP address that the container was running from, but fortunately Azure has a handy switch in the PostgreSQL Server settings to simply allow Azure Services traffic through the firewall:

Once that was sorted, the first stage (as always) is to download Docker and then we have 2 Flyway containers steps:

1 – Clean the schema and make sure the database is empty
2 – Migrate the schema changes

Then we have two options – we could do like we did in the Oracle pipeline and zip up the files, spitting them out at Release stage and consuming them, either calling Flyway from the command line, or we can go ahead and promote our deployment using the same pipeline.

I’m lazy, so I’m going for the latter!

In a normal “production like” situation I would probably take the opportunity to test and check etc. like I did above, but let’s keep this super lean – if the build works, I trust the deployment. Lets go ahead and deploy to Production – I’ll add this as an additional task in my YAML:

- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway migrate -url=$(ProdJDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise 
  displayName: 'Promote to Production'

And the deployment was successful! Phew – I think I’ve earned a cup of tea!

Conclusion

In this blog post I have demonstrated 3 different (and initially very simple*) approaches to the source control and deployment of database changes – but there’s actually a much wider combination we could have adopted – all 3 models with MSSQL, all 3 models with Oracle, and Migrations for up to 18 other systems like DB2, Snowflake and even SAP HANA! But what did I need to do ALL of this? A single solutionRedgate Deploy**.

Thank you for stopping by! Have an amazing week!

*There is a lot missing from the code I have provided, like additional error handling, tests etc. and all of the above CAN be improved – but did we manage to build and deploy across three different systems all using Azure DevOps? Yes we did. If you intend on using any of the above, please ensure you build in the necessary controls and process around it and always pick what is best for you and your team.

**Redgate Deploy is going from strength to strength, expect to see a wide range of improvements made over the coming months – I won’t be surprised if this blog post is already out of date by the time I finish writing it – that’s how awesome the teams working on all of this are!

3 methods for seeding test data during CI builds with Flyway

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.
Sir Arthur Conan Doyle, Sherlock Holmes

EDIT: Modified 23/12/2020 to include updates to Flyway in v7, method 4 below

Can you tell I’m loving Flyway at the moment? Well I am. It’s JUST SO GOOD! Honestly there are so many things you can do with it! Don’t know what I’m talking about? Check out my posts on xRDBMS DevOps with Flyway and tSQLt unit tests with Flyway and you’ll see what I mean!

As a result of the above posts though I was asked a question that I had to think about for a little bit before having the best possible answer, how can we seed some testing data INTO the build database so that we can run some meaningful tests against it?

This makes perfect sense to me, but there’s also a few different ways to do this – so let’s go fly(way)!

flying i believe i can fly GIF

1 – Test Data Migration Scripts

In my previous posts on Flyway (above) I talked about having an entirely separate build folder present within the repository, and a folder of test migrations alongside our schema migrations – I called these the Build_Config folder, (containing the build configuration file) and the Test_Migrations folder (unsurprisingly containing testing migrations) in the _Migrations location:

I was using the same build config for 2 purposes; 1) to build the schema migrations from the base version, by passing it the Schema_Migrations location dynamically and 2) then building the tSQLt framework and testing objects by passing it the Test_Migrations location dynamically.

This actually worked surprisingly well, but even beyond this – the same method can be repurposed, or added to, by augmenting your testing scripts and adding a data insertion task (as an additional script or group of scripts). In my folder, I can simply add a migration like this:

Because of course I like dogs.

lana del rey yes GIF

and once pushed to the repository and the build has run we should be able to verify our testing data is present:

A bonus win for this step of course, is that where Devs have their own Flyway config files locally for their development databases they could also overwrite this behavior and point the testing and/or data scripts at their own database so they have some seed data to work with too!

2 – Add a data generation step to the pipeline

There are SO MANY technologies out on the inter-webs for generating data. SO MANY. Many of them also have a command line or PowerShell module that we can use to easily invoke them against a target, especially if that target is going to be persistent like my Flyway Azure SQL Build DBs!

Because I have access to it and because I’m using essentially SQL Server DBs, I could easily use Redgate SQL Data Generator – but to get the data you need you could use anything from DBATools Data Generation (also SQL Server) to FillDB for MySQL (which looks awesome and you could easily use this for Step 1 above too!)

There are numerous ways to invoke tools and applications and fortunately good CI/CD tools like Azure DevOps offer multiple ways to, for instance, run PowerShell or CLI steps from within the pipeline – so we could easily invoke SQL Data Generator on a VM or physical machine we have an Azure DevOps agent on – but this thinking also opens up the possibility of using something like Chocolatey to dynamically install the software on the Azure DevOps hosted pool VM during build (for the Redgate tools at the moment I suppose you’d need a Windows VM).

sassy pants chocolate GIF

I will be writing a future blog post about this step because it sounds _very_ interesting, but I’m not sure yet what can be done specifically using Chocolatey or if I’ll have to look elsewhere, although I have read this post in the past (thanks Paul!) detailing limitations and a great workaround using Azure DevOps, so it’s likely that’ll be my first port of call!

Just to give you an idea of end result with SQL Data Generator specifically though:

3 – Use existing data, don’t generate

Ok this one is going to be controversial already, I can tell! Let’s all stay calm!

happy chill GIF

The best data to be tested is our data. What we have in Production is what will have these changes deployed to it… eventually! So shouldn’t we just test against that? Well. Maybe, maybe not depending on what is in there.

There’s a few methods to achieve this – my personal favorite would be to use a SQL Clone, spin that up on a build VM rather than using an Azure SQL DB, and we can have all the data in an instant. Of course if we hold any sensitive PII/PHI then we should ensure that is protected first!

Of course there are lots of other options, like restoring a backup or spinning up a container etc. and these can all just be a stage in the YAML file before invoking Flyway but the point is, if we use an existing copy of our Prod database from some source or another, it will have 2 things we really care about:

  1. Data. Ready to go, ready to test, ready to give us the best possible insight into our changes.
  2. The flyway_schema_history table. Instead of running EVERY migration we’ve ever written, which could take a while for a large team, we run only the latest migrations to check that they would deploy happily to the Production target.

To get this stage to work though, you would need to do a couple of things differently:

  1. The build DB would have to be created from the clone/backup/other every time instead of simply cleaning the schema down.
  2. You would need to remove the Flyway Clean step from the pipeline in my previous post, because it would otherwise drop all the tables (and then we wouldn’t have any data!)
  3. By extension, this also makes the callback to remove the tSQLt objects void, so you can remove that too.

4 (Bonus Method) – Script Migrations

In Flyway v7 the team added the ability to also run script Migrations and Callbacks which mean it is possible to invoke .ps1, .bat, .cmd, .sh, .bash, and .py files as part of the version control > build and migration process.

This means that you can include a script to invoke any loading or processing of data you may prefer – you could invoke a data generation utility, data masking and of course anything else that can be invoked with these file formats. A good example of this might be calling Data Generator as above, or you could use DBATools, DTM Data Generator or even a more platform agnostic approach by using a Hazy generator to produce and then load an incredibly realistic data set.

Conclusion

There are a lot of different ways to generate data, you can generate completely synthetic data, you can mask data or use Prod data, it’s up to you! Ultimately it will just for another part of your pipeline – just be careful of ordering! You don’t want to try generating data into a table that hasn’t been built yet.

Respect your YAML file and you’ll get schema, data and unit tests and this will lead to one thing. Greater insight, earlier.

thumbs up GIF