3 RDBMS’, 3 models, 3 end-to-end deployment pipelines with Azure DevOps and Redgate Deploy

“Choice is the most powerful tool we have. Everything boils down to choice. Every choice we make shuts an infinite number of doors and opens an infinite number of doors.”
– Lori Deschene (https://tinybuddha.com/)

Picking a Set-Up

One of the hardest parts of my job is that at any moments notice we could be asked to walk through better database change management processes. That’s not the challenge, the problem is that it could be with any kind of tech stack. I might need a Git Repo of some shape or form (Azure DevOps, plain ol’ Git, Bitbucket etc.) and then a CI server of some kind (Azure DevOps, GitLab, TeamCity, Bamboo etc.) and finally something to handle releases (Azure DevOps, Octopus Deploy, Bamboo etc.) – this is fairly easy to reproduce in multiple combinations with automation, terraform etc. but when you’re actually helping someone set it up – you’ve got to know where all the bits go.

The Redgate tools work with all of these options and combinations so making sure we’re setting everything up right usually means questions about the Repo/CI/CD tooling people choose.

The commonality above and the one I run into the most for all 3 stages, is Azure DevOps. Its straightforward to understand, all in the same place and just plain fun to use (AND it supports emojis ^_^).

Finally now, we have to pick a Relational Database Management System (RDBMS) to use – Redgate Deploy is one of the newest offerings from Redgate and it comprises capabilities for “Database DevOps” across MS SQL Server, Oracle Database and 18 (well actually 19 now thanks to Flyway v7!) other RDBMs‘! So instead of choosing, I’m going to pick the two key ones there, and one of the 18 others: MSSQL, Oracle DB and PostgreSQL.

One final question I had to ask of myself was what models I wanted to use. There are a couple of choices available within the Redgate solution, specifically for MSSQL and Oracle at the moment, so I decided that I would do State based deployments for Oracle and Hybrid deployments for MSSQL, given that PostgreSQL will have to be migrations anyway. Fear not though, the setup is not hugely dissimilar when it comes to the actual pipelines!

Setting up Azure DevOps Repos

This stage was relatively easy – I simply created 3 new projects in my DefaultCollection where I’m going to put the repos for each of the DBs.

and then I created 3 readme files, and cloned all 3 git repos down onto my machine as local repos:

and we’re ready to go!

A quick note: I’m using a mixture of Azure DevOps hosted (for PostgreSQL) and Azure DevOps Server locally installed on my Virtual Machine (for MSSQL/Oracle) with a local agent present to run everything below – you can adopt this methodology or you can use the hosted version, but for the Oracle solution below at least you will need a local agent available (unless you use the DockerHub Image for Schema/Data Compare).

Microsoft SQL Server

The first thing I need to do for all of these is to pick the databases I’ll be working on – for me I’m rather lucky as our demonstration environment has a rather nifty set of databases for me to choose from!

I’m going with SQL Source Control (the MSSQL State component in Redgate Deploy) and SQL Change Automation (the MSSQL Migrations component) both plugged into Management Studio (SSMS) with a set of databases called the ScaryDBA_Dev/Test/Prod environments (which I used SQL Clone to create the copies of), in homage to the wonderful Grant Fritchey.

So the first thing we need to do is get Dev under source control – we’ve refreshed back from Prod so there shouldn’t be any differences and we’re using the Hybrid model, so we’ll need to create the State first. I do this by going to SQL Source Control in SSMS, and linking my DB to Git, creating a State Folder in the top level of my local repo as I do so:

Then once linked I go ahead and source control the initial schema (not sure how? Watch the Redgate University videos here):

Next I setup my Migrations project using SQL Change Automation, creating the Migrations folder in the same top level of my local repo, but now instead of pointing to the database, I’m pointing to my SQL Source Control generated State folder:

Now at this point we get the options to choose filters and comparison options – I would recommend if you’re not sure speak to someone at Redgate or look up the documentation – I often see people wanting to filter out Security/Users/Roles at this stage so it might be worth a look! I just carried on as I only have a few objects anyway!

Connect to the target and create a baseline script (i.e. what does Prod look like now?) again, because I have a minimal setup I’ll go straight from my “Prod” database:

Commit and push and we’re on our way – everything is in version control:

Now i may have cheated by doing MSSQL first – because now actually building and deploying the project is pretty straight forward – much like I have done in previous posts here and here I just used the SQL Change Automation plugins from the Azure DevOps marketplace to first build:

and then deploy the project:

and it all succeeded… the 2nd time around when I remembered to specify which DB I was deploying to!

Oracle Database

The first thing I need to do for all of these is to pick the schemas I’ll be working on… wait, Deja Vu! – well once again I have a little set of schemas present on the demonstration machine that will serve me just fine!

Because we’re working in the State setup, out of Redgate Deploy I’m going to use Source Control for Oracle which allows me to specify the remote repo, the folder to create and even the fact I’m using Azure DevOps Git:

(Step 1 was simply providing the connection details to my Oracle Database, hence why I was on step 2!) – I select the Schema I’ll be putting in Source Control and even get a nifty run down of the structure:

Hit next and give a name to the Project (unsurprisingly I went with HR) and then check in all of your initial objects:

Now one thing that you may have noticed if you’re following along that I should clarify (and which I forgot when setting up this blog post):

  1. You don’t need to specify the local repo you cloned down because Source Control for Oracle handles this itself in the back end, if you want it to be part of a local repo with other code in it, use the Working Folder instead
  2. If you are using Git and NOT the working folder, committing will also Push your objects to the remote – you’ve been warned!

As above, I now head over to Pipelines and hit Create New Pipeline! I check out my repo with the schema objects in it, and add a job to my agent. But what am I going to pick? Well unlike SQL Change Automation there’s not a plugin available on the Azure DevOps Marketplace, we’ll need some good old fashioned command line calls!

First, let’s clean out the CI Schema, I’m going to use the script to remove all objects from the Redgate documentation site and make a call to run the script using sqlplus (I’m storing the file locally but you could even include it in your repo under a build folder maybe?)

echo on
Call exit | sqlplus hr/[passwordredacted]@//localhost:1521/CI @C:\DemoFiles\DropAllObjects.sql
echo off

Next we’ll add a call to the cmdline of Schema Compare for Oracle to build the database from our repo, using the files that were checked out by the agent (an Azure DevOps pre-defined environment variable) – again we’re using a similar script from the Redgate DevOps for Oracle site but because we’re deploying ALL objects from version control, we don’t really want a report per say, this is just to test the schema can be built from the ground up:

"C:\Program Files\Red Gate\Schema Compare for Oracle 5\sco.exe" /deploy /source $(Build.SourcesDirectory)\Schema{HR} /target SYSTEM/[passwordredacted]@localhost:1521/CI{HR} AS SYSDBA /indirect 

echo Build database from state:%ERRORLEVEL%
 
rem IF ERRORLEVEL is 0 then there are no changes.
IF %ERRORLEVEL% EQU 0 (
    echo ========================================================================================================
    echo == Warning - No schema changes detected. == echo ========================================================================================================
)
 
rem IF ERRORLEVEL is 61 there are differences, which we expect.
IF %ERRORLEVEL% EQU 61 (
    echo ========================================================================================================
    echo == Objects were found and built. ==
    echo ========================================================================================================
    rem Reset the ERRORLEVEL to 0 so the build doesn't fail 
    SET ERRORLEVEL=0
)

and assuming this all works, we’ll package up the files into a zip and publish them as an artifact so we can consume them at the release stage!

and guess what? It all just worked *cough* on build #23 when I got the syntax right finally…

Of course we can add additional stages to the build as well, such as a check for Invalid Objects and some Unit Testing, but I’ll keep this pretty lean for now!

Now, just like we did for MSSQL we’re going to set up a new deployment pipeline, grab the artifact we’re publishing from the build, enable a CD trigger and we’re going to deploy to, in this case, Acceptance.

Let’s first create a job on the agent to unpack the zip file and see how far we get – I’m just going to dump them in a DeploymentState folder in the working directory:

and… awww thanks Azure DevOps, I needed to hear that!

and now we add yet another command line task, but this one is just going to do a comparison, it’s not actually going to deploy anything – because we’re going to add a manual intervention step to approve the deployment first! I had a little help again from the Redgate docs for this one, because I keep having to catch cmdline error codes – if I was wise like Alex Yates I probably would have just handled this with PowerShell…

echo off
rem  We generate the deployment preview script artifact here
"C:\Program Files\Red Gate\Schema Compare for Oracle 5\sco.exe" /abortonwarnings:high /b:hdre /i:sdwgvac /source $(System.DefaultWorkingDirectory)\DeploymentState\Schema{HR} /target SYSTEM/Redgate1@localhost:1521/Acceptance{HR} AS SYSDBA /indirect /report:$(System.DefaultWorkingDirectory)\DeploymentState\changes_report.html /scriptfile:$(System.DefaultWorkingDirectory)\DeploymentState\deployment_script.sql > $(System.DefaultWorkingDirectory)\DeploymentState\Warnings.txt

echo Warnings exit code:%ERRORLEVEL%
rem In the unlikely event that the exit code is 63, this mean that a deployment warning has exceeded the allowable threshold (eg, data loss may have been detected)
rem If this occurs it is recommended to review the script, customize it, and perform a manual deployment
 
IF %ERRORLEVEL% EQU 0 (
    echo ========================================================================================================
    echo == No schema changes to deploy
    echo ========================================================================================================

    GOTO END
)
 
IF %ERRORLEVEL% EQU 63 (
    echo ========================================================================================================
    echo == High Severity Warnings Detected! Aborting the build. 
    echo == Review the deployment script and consider deploying manually.
    echo ========================================================================================================
    rem Aborting deployment because high severity warnings were detected
        SET ERRORLEVEL=1
    GOTO END
)
 
rem This is the happy path where we've identified changes and not detected any high warnings
IF %ERRORLEVEL% EQU 61 (
    echo ========================================================================================================
    echo == Schema changes found to deploy - generating deployment script for review
    echo ========================================================================================================
    rem Set ERROLEVEL to 0 so the build job doesn't fail
	SET ERRORLEVEL=0
    GOTO END
)
 
:END
EXIT /B %ERRORLEVEL%

I then throw in an agentless job (Manual Intervention Step) and then finally (once I have reviewed the deployment report that is produced) one further cmdline call to actually run the deployment script again my Acceptance target:

echo on
Call exit | sqlplus hr/[passwordRedacted]@//localhost:1521/Acceptance @$(System.DefaultWorkingDirectory)/DeploymentState\deployment_script.sql
echo off

I have saved my pipeline, now it’s time to test. So I’m going to make a very quick change (so that something is produced) and see what happens…

Boom. Pipeline done.

One word on this though – I haven’t included an awful lot of frills (error handling, checks, NuGet instead of Zip etc.) so you’re free to bulk this out how you see fit, but by golly it works! Also make sure you tick this on the second Agent Job, else it’ll wipe out your working directory – something that obviously definitely did not happen to me…

PostgreSQL

This one might be cheating a little. As you know I’ve already setup a CI pipeline with Flyway before, using Azure SQL DBs and the Flyway Docker container as part of the build, and in some cases even tSQLt for Unit Testing too! But this is PostgreSQL, and this is a new blog post, darn it!

Still getting your head around Flyway? Check out the Redgate University videos!

I started out by creating myself a PostgreSQL 10 server in the Azure Portal, because:

  • I can
  • I didn’t want a local install of PostgreSQL
  • I’m not self sabotaging

and I set up a Dev and Test database on it – that is once I remembered to allow my client IP address *sigh* and then connected from Azure Data Studio:

I already have some basic scripts from my last demo that I can use – so I pulled down the latest version of Flyway (V7) and unzipped it into my files:

Then I created a SQL folder in my local repository for the PostgreSQLPipeline (and popped a couple of migrations in – I’m using the StackOverflow scripts, adapted for PostgreSQL from Kendra Little’s GitHub, thank you Kendra!) – in the previous posts we’ve had to source control the state or initial baseline of the database, however as we’re using Flyway for PostgreSQL this requires us to create and name/order the migrations ourselves, so we have plenty of control over that – hence why we can jump straight into building some scripts this time around.

Finally, I pointed the config file for Flyway to that, also taking the opportunity to point it at my Dev DB using the PostgreSQL JDBC:

Now i didn’t really NEED to do this step and try things out against Dev, because I already have the scripts, so I could have just started building the pipeline – but it’s always worthwhile getting local validation first by running things against Dev and then migrating up!

A quick Flyway Info later and we were good to go – the scripts are recognized so we know we’ve set everything up correctly.

One git add / commit / push and everything is in my repo:

Now as you may know from my other post we can do 1 of 2 things here – we can now either build what we eventually push to the repo using a cmdline call (like we did with the Oracle build) to a machine where we have Flyway installed, or we can use the Docker image.

I’m actually going to use Docker again but this time, instead of specifying the various credentials in a config file that was getting passed to the container, I’m actually going to use Azure DevOps environment variables and build the connection string that way – it’s really easy to keep the variables secret in Pipelines, so I can pass my JDBC connection, complete with Username and Password, as well as my Flyway license key, without worrying someone might get hold of them!

I’m actually going to build against a live PostgreSQL database before deploying, so I also created another DB for me to use: demodb_ci

I actually stole the YAML from my previous pipeline (below) and updated the variables accordingly:

trigger:
- master
 
pool:
  vmImage: 'ubuntu-latest'
 
steps:
- task: DockerInstaller@0
  inputs:
    dockerVersion: '17.09.0-ce'
  displayName: 'Install Docker'

- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway clean -url=$(JDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise 
  displayName: 'Clean build schema'
 
- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway migrate -url=$(JDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise 
  displayName: 'Run flyway build'

and it ran just fine! Well actually it failed first, because I didn’t have permissions from the IP address that the container was running from, but fortunately Azure has a handy switch in the PostgreSQL Server settings to simply allow Azure Services traffic through the firewall:

Once that was sorted, the first stage (as always) is to download Docker and then we have 2 Flyway containers steps:

1 – Clean the schema and make sure the database is empty
2 – Migrate the schema changes

Then we have two options – we could do like we did in the Oracle pipeline and zip up the files, spitting them out at Release stage and consuming them, either calling Flyway from the command line, or we can go ahead and promote our deployment using the same pipeline.

I’m lazy, so I’m going for the latter!

In a normal “production like” situation I would probably take the opportunity to test and check etc. like I did above, but let’s keep this super lean – if the build works, I trust the deployment. Lets go ahead and deploy to Production – I’ll add this as an additional task in my YAML:

- task: Bash@3
  inputs:
    targettype: 'inline'
    script: docker run --rm -v $(FLYWAY_LOCATIONS):/flyway/sql flyway/flyway migrate -url=$(ProdJDBC) -licenseKey=$(licenseKey) -user=$(userName) -password=$(password) -enterprise 
  displayName: 'Promote to Production'

And the deployment was successful! Phew – I think I’ve earned a cup of tea!

Conclusion

In this blog post I have demonstrated 3 different (and initially very simple*) approaches to the source control and deployment of database changes – but there’s actually a much wider combination we could have adopted – all 3 models with MSSQL, all 3 models with Oracle, and Migrations for up to 18 other systems like DB2, Snowflake and even SAP HANA! But what did I need to do ALL of this? A single solutionRedgate Deploy**.

Thank you for stopping by! Have an amazing week!

*There is a lot missing from the code I have provided, like additional error handling, tests etc. and all of the above CAN be improved – but did we manage to build and deploy across three different systems all using Azure DevOps? Yes we did. If you intend on using any of the above, please ensure you build in the necessary controls and process around it and always pick what is best for you and your team.

**Redgate Deploy is going from strength to strength, expect to see a wide range of improvements made over the coming months – I won’t be surprised if this blog post is already out of date by the time I finish writing it – that’s how awesome the teams working on all of this are!

Automated Dev Database Branch-Switching with AzureSQL, PowerShell and GitHooks

“Keep it simple, stupid!”
– My year 12 & 13 English Lit. Teacher

Recently I’ve been fascinated with something really cool. A couple of my colleagues at Redgate wrote a GitHook which allows you to easily switch branches using Redgate SQL Clone. You can see the hook here with full instructions – and I thought it was pretty neat.

But it got me thinking – I’ve posted a lot about when people are using just Azure SQL DBs (PaaS), about Masking and DB Change Automation, but when you’re using AzureSQL for Dev and Test DBs as well as Prod, you still don’t have the same agility one would expect from a local copy, like a clone.

But, the above GitHook leverages PowerShell (among some other fancy wizardry) so, what if we could do this exact same thing, using the PowerShell Az module to dynamically create and switch Azure SQL DBs in our own private resource groups every time we checkout a branch?

my hero academia wtf GIF by Funimation

I don’t imagine it would be fast because I’m restricted to using very low tier SQL DBs by my tiny allowance of (as Kendra Little calls them) “Azure Bucks”, but it should absolutely be possible!

So I decided to write a PowerShell script to do just that.

The first question I had to come up with an answer to was, how do I replace the Clone “Image” in this process, because I need something that is effectively a copy of our Production (or as near as possible) environment so we have something to base EVERY copy from – so I created the idea of a Golden Copy within the script; effectively this golden copy could be created by copying masking and copying back down from Production using something like Redgate Data Masker and my scripts here in GitHub but as a stop-gap, if it doesn’t find one in the Resource Group and Server you select, it will simply create one from your Dev DB. Best answer I could think of, you’re welcome to improve it!

All that remained was effectively to go through and just mimic the functionality of the Clone script but using Az: so if you are switching to a new branch where you don’t already have an existing Dev DB, then you get a new copy of Golden. If you’re switching to a branch you’ve checked out before, it renames the DBs to swap you back to the correct branch.

Here is an empty AzureSQL DB called DMDatabase_Dev:

When I now issue the git checkout “feature/newfeature” command it asks me to sign in to my Azure account:

and then gets to work:

And… that was it really.

I now have an Azure SQL DB called DMDatabase_Dev_master as I switched from the master branch, and I have a branch new DMDatabase_Dev DB that I can use for my featurebranch. You’ll notice I didn’t include -b in my git command, let’s assume a colleague is already working on this branch. I can now just update my copy (of my golden copy) with their work:

And we’re good to go!

But now if I switch back to my main branch, the object is gone and I can carry on with work on this branch:

It was really straightforward I can’t believe I haven’t seen this in use in more places, but hey guess what? The PowerShell is yours right here if you want it:

https://github.com/ChrisUnwin/PowerShell/blob/master/Demos/Redgate%20Demos/GitHookAzureSQL.ps1

The pre-requisites for it are:

  • You should have a Dev DB and you should update the values at the top of the script with the Dev DB name, server and resource group it is in
  • The script make reference to and creates a Golden copy DB so that you have something you should always be creating from, for consistency – so when you get started, create your own “Golden copy” back from Test/UAT or something if you can – maybe using the script mentioned above – it should be the name of your Dev DB appended with “_Golden”
  • If you want to change how it is authenticated so you don’t have to enter your credentials each time, then go for it – this was just the simplest method for me (and it’s currently 11:05pm so I’m going to bed!)

Feel free to improve it, I’m sure there are plenty improvements that can be made, but it’s a starter for 10 for anyone out there just getting started with development in Azure SQL. Plus it’s kinda neat!

Using things weirdly – Part 3: Moving from State to Hybrid-ish Source Control with Microsoft SSDT and Redgate SQL Change Automation

“It works on MY machine”
Everyone

Some time back I blogged “which database source control model works well for you” and it has come in SO handy when explaining the different models to people – but the one question I get asked not infrequently, is:

Is there any way that I can use Redgate SQL Change Automation with Visual Studio based SSDT?

It’s always a really difficult question to answer because fundamentally SQL Source Control (Redgate’s state based tool) and SSDT (Microsoft’s state based tool) functionally seek to do the same thing, making them competitors. However there are, on the odd occasion, good reasons as to why I am asked the question and one of those same scenarios came up today:

  • Our developers work in Visual Studio and have already been using SSDT for a long time, it works for them, we just want to extend it with Migrations to handle complex changes.

So the option here is, leave it as it is, or try to work with both. Not always going to be my first choice but it got me thinking.

Starting from a memory

A few months ago, when life was “normal” and I was working in my office with *gulp* PEOPLE, I tried to make this scenario work by simply linking SQL Change Automation to the project folder created by SSDT but unfortunately it was riddled with problems. The SSDT importer and repo look like this:

And if you point SQL Change Automation at the local repo with this in it will correctly tell you:

Now of course this wasn’t unexpected. It’s not designed to work this way, is it? No. But way back then I did figure out, shrewdly, that if I used SQL Source Control to carry out an initial commit just to a working folder, it would generate a RedGate.ssc and RedGateDatabaseInfo.xml file and you can copy them into the SSDT repo to trick SQL Change Automation into thinking that it’s a SQL Source Control repo… unfortunately this trick no longer works. Sad.

sad a christmas story GIF

Add a hop and a step

But what got me thinking today was the context with which the question was asked. It was more about separation of duties. Once the developers have effectively done their job and delivered the change into the repo, their job was effectively done! “That’s how it should look moving forwards. What’s next?” – and then I had an idea.

Given that SSDT allows you to push and pull the code and apply it to your own database, what is stopping us from using SQL Change Automation to pick up on the changes against the database we sync our changes to from our SSDT project?

Genius. Evil genius.

So I created a new Database to simulate having another developer on my instance and gave it to Peter Parker:

You can then do a schema compare to another DB from your project, effectively PULL down changes from the remote to your local repository, and then sync them back up to your local development DB; this is how Devs stay up to date with each other but could, in this methodology, be how DBAs or senior developers pull down the changes to their local DB, where they test the new state, and then generate a new migration from it.

So I made a change on my dev database and captured it in the project right click on the project name > schema compare > dev db compare to project > update) and then committed and pushed:

and sure enough my repository was updated:

But then I simulated pulling down the change and applying it to Peter Parker’s DB (again using Schema Compare) and then I created a SQL Change Automation project in VS, in the same solution but pointing the project to a migrations folder in the repo:

Yes I accidentally called the project Database1 don’t remind me I’m embarrassed enough!

Then I added my baseline database:

It created the baseline and the project immediately with no issues and picked up on the changes I had made using SSDT:

and I was able to commit my project and changes into my repository in Azure DevOps:

It was just that easy! Now what this means for the development process is that developers _could_ feasibly work with SSDT, as they are comfortable with it, and then more senior members of the team can generate migration scripts from there, building the database from scratch and deploying in a reliable, repeatable fashion.

Just to prove to you my build even ran green from this:

So in summary what this gives us is the ability to adapt a regular SSDT workflow, one that developers are comfortable with and which has been in the team for months or years, add in the knowledge of DBAs or team leads, a greater separation of duties for high risk schema changes, and the control and flexibility (and peace of mind) that comes with a migrations based deployment process.

Nice.

The fine print

I’m sure by now you’ve realized something: this is not, nor will it ever (I believe) be a supported workflow. If you implement the above in a production sense for something other than just testing then it’s not something you’ll be able to get help with from one of the Redgate engineers if you need to troubleshoot.

Also, if you’re going to introduce a sequence of changes like this to achieve the hybrid model, it does make more sense that you implement SQL Source Control for the state side (given that it’s right there in the SQL Toolbelt with SQL Change Automation anyway).

But IS IT POSSIBLE to achieve a similar, Visual Studio based* hybrid workflow with SSDT and SQL Change Automation by using a database to ‘hop’ the changes across?

Yes, it certainly looks that way!

*If you’re planning on using SSDT in Azure Data Studio too then this workflow could also work for you, SQL Change Automation is present in SSMS and VS so it’s really up to you!

SQL Change Automation and GitLab CI/CD (a.k.a. Oh this is fun on Windows)

“You never know what you can do until you try, and very few try unless they have to.”
C.S. Lewis

Well I don’t have to, but many of the people I speak to on a daily basis are moving into GitLab, so it’s about time I tried it! You can find here testament to the mistakes I make as I try to set up a full end-to-end database change management process with SQL Change Automation and GitLab.

Will it all work perfectly? I don’t doubt that everything will fall over at some point, but let’s see how we get on all the same, and hopefully if you’re setting up this same pipeline, you’ll be able to avoid the errors and failings I inevitably cause! So here we go!

ready come on GIF

Let’s set up a GitLab Project (and rename the default branch)

Naturally, I didn’t have a GitLab account, so I had to set one up. I’m assuming that if you’re using it already or you’ve just started using it you’re taking advantage of the more business features but I’ve just stuck with the good ol’ free account for now! It was remarkably simple, sign up, email address, confirm and here we are:

Ok there is something very cool I like about setting up a new project, can you tell what it is?

You can completely set up a new blank project but they have templates, you can import projects OR, and I love this, you can setup a full CI/CD pipeline from another repo! Having done this before in Azure DevOps it was not easy, let me tell you. It really seems like Azure DevOps hates you for setting up CI/CD from an external repo, even though it has plenty of helpful ways of doing so!

So I initialized my repository with a README and updated it:

Don’t ever say I’m not descriptive enough!

The first thing I did was a renamed my default branch to ‘trunk’ by going to branches, creating the new branch and then in Settings > Repository changing it to the default and then swapping out the protected status with the outdated master:

Then finally delete the old default in Repository > Branches:

Excellent. Now it’s time to clone trunk onto my machine as we will need the local repository to put our change automation project in!

I created a folder called GitLab test and cloned the mostly empty repository into it:

Simple!

Create a new SQL Change Automation project and push it to trunk

In SSMS I opened up the most recent version of SQL Change Automation an created a new project called “DoggosAreCoolDB” using a copy of a Dev database I had lying around from a previous demonstration (BlogsDotRedgate):

Then I created my baseline as a migration script against the up-stream copy, BlogsDotRedgate_Integration, because who has access to Prod for this? Am I right? *cough & shifty eyes* not me!

I successfully generated my baseline and a change script (I added a column to a table, nice and simple) and then committed them to my local repo, and pushed! Forget branching, today isn’t about that, we’re just PUSHING TO TRUNK, WOO-HOO!

Setting up the CI/CD Pipeline

Now that we have our project and migrations in GitLab we can build out a pipeline! So first stop I went straight to CI/CD > Pipelines and was presented and I hit “Get Started”:

They immediately throw you into a Quick Start “Help” style guide which is immediately a little un-intuitive but surprisingly helpful if you read the whole thing. Effectively we need a YAML file called .gitlab-ci.yml that will store our pipeline as code telling it how and where to build, and we need a runner to actually fire up and execute these steps.

In my experience with some other CI/CD tools, it’s been advantageous to actually create the Runner / Agent first on the machines you’re going to be using, so as I just have my laptop to do this on, I will set one up on there! I found the full documentation for a Windows Runner here, and followed it just so I would have it available.

The GitLab Runner was up and running in my services but I’ll be darned if I can see them anywhere in GitLab…

confused britney spears GIF

Aha! So it turns out after a bit of digging that you need to register the runner specifically using the CI/CD section on the project settings, that was probably my bad for not reading the documentation thoroughly but my counter-argument… who actually does? So I issued the register command, applied tags and a description and chose my runner type, I chose shell because I need to be able to run PowerShell on the machine (I’ll need the SQL Change Automation PowerShell components available on the machine where the Build and Deployment are happening of course):

I’ve got the runner on the machine, I’m using an instance of SQL Server to build against, now I just need the YAML file (fortunately GitLab has full documentation for how to structure this as well!)

So I can build my project I’m going to need to know where the repo is cloned to during the process (i.e. to find the .sqlproj file) so by taking a look I managed to find a list of environment variables that can be used in the YAML file, just to be sure though, I created and committed the most basic YAML file that would just echo back the location of the cloned files:

stages:
  - DatabaseBuild

Build:
  stage: DatabaseBuild
  tags: 
   - sql
  script: 
   - echo $env:CI_PROJECT_DIR

After this let me know the environment variable worked correctly and the build pipeline was being fired up correctly on my private runner, I tried something a little more ambitious, building the .sqlproj file using the cmdlet reference from the SQL Change Automation documentation for help:

stages:
  - DatabaseBuild

Build:
  stage: DatabaseBuild
  tags: 
   - sql
  script: 
   - $projectPath = $env:CI_PROJECT_DIR
   - $project = "$projectPath\DoggosAreCoolDB\DoggosAreCoolDB.sqlproj"
   - $validatedProject = $project | Invoke-DatabaseBuild -TemporaryDatabaseServer "Data Source=PSE-LT-CHRISU\"

and we successfully built a database!

All that’s left to do now is 2 things:

  1. Create a NuGet package as part of the CI build
  2. Release the database changes to the target DB

I’m still using the same machine for the release portion too, so naturally I can use the same runner for this, if you have other servers you’re deploying to you will of course need additional runners.

We can very easily extend what we already have in our YAML file by just telling the process to create and export a new build artifact – I’m going to name it the same as everything else, and then append the BuildId to the end of the file so we always get something unique:

stages:
  - DatabaseBuild

Build:
  stage: DatabaseBuild
  tags: 
   - sql
  script: 
   - $projectPath = $env:CI_PROJECT_DIR
   - $project = "$projectPath\DoggosAreCoolDB\DoggosAreCoolDB.sqlproj"
   - $validatedProject = $project | Invoke-DatabaseBuild -TemporaryDatabaseServer "Data Source=PSE-LT-CHRISU\"
   - $buildArtifact = $validatedProject | New-DatabaseBuildArtifact -PackageId DoggosAreCool.Database -PackageVersion 1.$env:CI_JOB_ID
   - $buildArtifact | Export-DatabaseBuildArtifact -Path "$env:CI_PROJECT_DIR\Export"
  artifacts:
    paths:
     - $env:CI_PROJECT_DIR\Export\DoggosAreCool.Database.1.$env:CI_JOB_ID.nupkg
    expire_in: 1 week

You’ll notice how I’m exporting the NuGet package to the project directory and then uploading it, this is so that we’ll have access to it to release but also so that we can use the artifacts argument in our YAML to upload the file and make it a downloadable package through the GitLab interface (if you go to that SPECIFIC job):

Whilst we’re on a roll here (and things haven’t gone wrong for a while) I’m going to add 2 additional stages ALL AT ONCE to both “Create a Database Release Artifact” and “Deploy from a Database Release Artifact” using, once again, the SQL Change Automation PowerShell cmdlets.

Woo-Hoo! I’m invincible!

I broke it.

Can you see what I did wrong? The error is:

New-DatabaseReleaseArtifact : The specified value for the Source parameter is neither a valid
41database connection string nor a path to an existing NuGet package file or scripts folder:
42'CI_PROJECT_DIR\Export\DoggosAreCool.Database.1.725147351.nupkg'

So 2 fun things. 1 – I forgot to highlight there was an environment variable at one point, so it was just looking for the name of the variable in the path and 2) it keeps erroring out saying my NuGet file isn’t a NuGet file, weird right?

On further inspection it is yet another mistake I made. I’m using the job ID to name the NuGet package, which means when it tries to find the file it’s 2 steps ahead because each stage is counted as a different job! Duh!

sylvester stallone facepalm GIF

A few quick changes should hopefully sort this out! I’m going to put the instance of the pipeline ID in ($env:CI_PIPELINE_ID) and see if that makes a difference!

Wait. Did it just say the pipeline ran? SUCESSFULLY? That’s exactly what it said! We can verify that this actually happened as well by checking the DatabaseDeploymentResources folder for the Release Artifact to Integration:

And everything is there! Note you won’t have a changes.html report just yet because this is the first time we’ve successfully deployed to Integration, however if we run 1 more change through (I’ll add a stored procedure):

Boom.

Prince Harry Mic Drop GIF

Now of course we can add additional stages to this, for manual intervention or to promote to other environments, but I’m going to call it a win here and retire (until the next post) gracefully. I’m sure you’re all wondering what my final YAML file looked like too – well (counterintuitively) I’ve popped it all into GitHub for you and pasted it below. Enjoy!

stages:
  - DatabaseBuild
  - CreateRelease
  - DeployToIntegration

Build:
  stage: DatabaseBuild
  tags: 
   - sql
  script: 
   - $projectPath = $env:CI_PROJECT_DIR
   - $project = "$projectPath\DoggosAreCoolDB\DoggosAreCoolDB.sqlproj"
   - echo "Building project $project"
   - $validatedProject = $project | Invoke-DatabaseBuild -TemporaryDatabaseServer "Data Source=PSE-LT-CHRISU\"
   - $buildArtifact = $validatedProject | New-DatabaseBuildArtifact -PackageId DoggosAreCool.Database -PackageVersion 1.$env:CI_PIPELINE_ID
   - echo "Exporting artifact to $env:CI_PROJECT_DIR\Export"
   - $buildArtifact | Export-DatabaseBuildArtifact -Path "$env:CI_PROJECT_DIR\Export"
  artifacts:
    paths:
     - $env:CI_PROJECT_DIR\Export\DoggosAreCool.Database.1.$env:CI_PIPELINE_ID.nupkg
    expire_in: 1 week

CreateRelease:
  stage: CreateRelease
  tags: 
   - sql
  script: 
   - $integrationDB = New-DatabaseConnection -ServerInstance "PSE-LT-CHRISU\" -Database "BlogsDotRedgate_Integration"
   - $buildArtifact = "$env:CI_PROJECT_DIR\Export\DoggosAreCool.Database.1.$env:CI_PIPELINE_ID.nupkg"
   - echo "Creating Release Artifact for DoggosAreCuteDB - check C:\DatabaseDeploymentResources\DoggosAreCuteInc\ReleaseArtifacts\$env:CI_PIPELINE_ID\Integration for more information"
   - $releaseArtifact = New-DatabaseReleaseArtifact -Source $buildArtifact -Target $integrationDB
   - $releaseArtifact | Export-DatabaseReleaseArtifact -Path "C:\DatabaseDeploymentResources\DoggosAreCuteInc\ReleaseArtifacts\$env:CI_PIPELINE_ID\Integration" -Format Folder

Integration:
  stage: DeployToIntegration
  tags: 
   - sql
  script: 
   - $integrationDB = New-DatabaseConnection -ServerInstance "PSE-LT-CHRISU\" -Database "BlogsDotRedgate_Integration"
   - echo "Deploying changes to Integration"
   - Import-DatabaseReleaseArtifact -Path "C:\DatabaseDeploymentResources\DoggosAreCuteInc\ReleaseArtifacts\$env:CI_PIPELINE_ID\Integration" | Use-DatabaseReleaseArtifact -DeployTo $integrationDB

3 methods for seeding test data during CI builds with Flyway

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.
Sir Arthur Conan Doyle, Sherlock Holmes

Can you tell I’m loving Flyway at the moment? Well I am. It’s JUST SO GOOD! Honestly there are so many things you can do with it! Don’t know what I’m talking about? Check out my posts on xRDBMS DevOps with Flyway and tSQLt unit tests with Flyway and you’ll see what I mean!

As a result of the above posts though I was asked a question that I had to think about for a little bit before having the best possible answer, how can we seed some testing data INTO the build database so that we can run some meaningful tests against it?

This makes perfect sense to me, but there’s also a few different ways to do this – so let’s go fly(way)!

flying i believe i can fly GIF

1 – Test Data Migration Scripts

In my previous posts on Flyway (above) I talked about having an entirely separate build folder present within the repository, and a folder of test migrations alongside our schema migrations – I called these the Build_Config folder, (containing the build configuration file) and the Test_Migrations folder (unsurprisingly containing testing migrations) in the _Migrations location:

I was using the same build config for 2 purposes; 1) to build the schema migrations from the base version, by passing it the Schema_Migrations location dynamically and 2) then building the tSQLt framework and testing objects by passing it the Test_Migrations location dynamically.

This actually worked surprisingly well, but even beyond this – the same method can be repurposed, or added to, by augmenting your testing scripts and adding a data insertion task (as an additional script or group of scripts). In my folder, I can simply add a migration like this:

Because of course I like dogs.

lana del rey yes GIF

and once pushed to the repository and the build has run we should be able to verify our testing data is present:

A bonus win for this step of course, is that where Devs have their own Flyway config files locally for their development databases they could also overwrite this behavior and point the testing and/or data scripts at their own database so they have some seed data to work with too!

2 – Add a data generation step to the pipeline

There are SO MANY technologies out on the inter-webs for generating data. SO MANY. Many of them also have a command line or PowerShell module that we can use to easily invoke them against a target, especially if that target is going to be persistent like my Flyway Azure SQL Build DBs!

Because I have access to it and because I’m using essentially SQL Server DBs, I could easily use Redgate SQL Data Generator – but to get the data you need you could use anything from DBATools Data Generation (also SQL Server) to FillDB for MySQL (which looks awesome and you could easily use this for Step 1 above too!)

There are numerous ways to invoke tools and applications and fortunately good CI/CD tools like Azure DevOps offer multiple ways to, for instance, run PowerShell or CLI steps from within the pipeline – so we could easily invoke SQL Data Generator on a VM or physical machine we have an Azure DevOps agent on – but this thinking also opens up the possibility of using something like Chocolatey to dynamically install the software on the Azure DevOps hosted pool VM during build (for the Redgate tools at the moment I suppose you’d need a Windows VM).

sassy pants chocolate GIF

I will be writing a future blog post about this step because it sounds _very_ interesting, but I’m not sure yet what can be done specifically using Chocolatey or if I’ll have to look elsewhere, although I have read this post in the past (thanks Paul!) detailing limitations and a great workaround using Azure DevOps, so it’s likely that’ll be my first port of call!

Just to give you an idea of end result with SQL Data Generator specifically though:

3 – Use existing data, don’t generate

Ok this one is going to be controversial already, I can tell! Let’s all stay calm!

happy chill GIF

The best data to be tested is our data. What we have in Production is what will have these changes deployed to it… eventually! So shouldn’t we just test against that? Well. Maybe, maybe not depending on what is in there.

There’s a few methods to achieve this – my personal favorite would be to use a SQL Clone, spin that up on a build VM rather than using an Azure SQL DB, and we can have all the data in an instant. Of course if we hold any sensitive PII/PHI then we should ensure that is protected first!

Of course there are lots of other options, like restoring a backup or spinning up a container etc. and these can all just be a stage in the YAML file before invoking Flyway but the point is, if we use an existing copy of our Prod database from some source or another, it will have 2 things we really care about:

  1. Data. Ready to go, ready to test, ready to give us the best possible insight into our changes.
  2. The flyway_schema_history table. Instead of running EVERY migration we’ve ever written, which could take a while for a large team, we run only the latest migrations to check that they would deploy happily to the Production target.

To get this stage to work though, you would need to do a couple of things differently:

  1. The build DB would have to be created from the clone/backup/other every time instead of simply cleaning the schema down.
  2. You would need to remove the Flyway Clean step from the pipeline in my previous post, because it would otherwise drop all the tables (and then we wouldn’t have any data!)
  3. By extension, this also makes the callback to remove the tSQLt objects void, so you can remove that too.

Conclusion

There are a lot of different ways to generate data, you can generate completely synthetic data, you can mask data or use Prod data, it’s up to you! Ultimately it will just for another part of your pipeline – just be careful of ordering! You don’t want to try generating data into a table that hasn’t been built yet.

Respect your YAML file and you’ll get schema, data and unit tests and this will lead to one thing. Greater insight, earlier.

thumbs up GIF

Using things weirdly – Part 2: Static Data in the Hybrid Model (w/ Redgate SQL Source Control and Change Automation)

“Where’s your will to be weird?”
Jim Morrison

I had some really positive feedback on my last “Using things weirdly” post, and truth be told, I love to use things weirdly. The number of times I’ve heard: “Oh, well, sure yeah I guess it works that way too…” is just too many to count. So imagine how my eyes lit up when I realized that you can do something weird with one of my favorite things to talk about at the moment, The Hybrid Model.

Now if you don’t know about the Hybrid Model then I would suggest you check out my post here that’s all about the different source control models available for your databases!

The Problem

Across both the State based and Migrations based offerings within the SQL Toolbelt, you have access to something very cool: the ability to easily (alongside the schema) source control your static data. Now don’t ‘@’ me because you think I should be referring to it as “Semi-Static” because it might change occasionally and ‘that’s not truly static then is it?‘; I could easily also refer to it as ‘Lookup Data’ or ‘Reference Data’, basically whatever you class things like “Country Codes” and “Currency Codes” as.

Whatever you call it, it can go into your VCS like any part of the database schema:

SQL Source Control: State Based
SQL Change Automation: Migrations Based

But. One thing that – as of writing this RIGHT NOW (23/07/2020 10:09am BST) – is not officially available in the Hybrid Model combining these two methods… is Static Data. The Data tab in SQL Change Automation even disappears when you set it up as a Hybrid workflow:

And this gives me a sad.

sad monty python and the holy grail GIF

The Solution?

Got your Hybrid Model setup and ready to go? Let’s use it weirdly!

1 – Use SQL Source Control to commit some static data to your state repository. This is as easy as right clicking on your (highlighted green) source control linked database and selecting “Other SQL Source Control Tasks” > “Link/Unlink Static Data“, and picking your tables.

Nothing should be showing in SQL Change Automation:

2 – Unlink SQL Change Automation from the state repository for a moment and link it instead directly to the development database. This will cause it to go into Migrations-First mode. You can do this by clicking the blue source name in SSMS and picking Existing Database instead:

Because it’s technically the same database as you’re source controlling in the state repository, it should all just work™ and should tell you there are no dev changes to the source. Then you’ll see the “Data” tab has been enabled:

3 – Select the same tables to source control as you did with SQL Source Control by using the Add Tables wizard:

BUT WAIT!!

shocked oh my god GIF

Isn’t it now going to generate a migration for our static data?? This isn’t included in the baseline or anything at all so far, so is it going to try and insert all of my static data into tables later on that already have it?

No. Actually we’re fine!

4 – Generate the static data migration script (and look at it for peace of mind). Notice that the script actually has checks in there – because we’re newly adding these tables, the migration will check if the tables are empty before trying to run the script, and only AFTER this migration will SQL Change Automation start generating the differential, incremental static data migrations:

Commit this migration script to your migrations repository, and that’s all we need to do here!

5 – All that’s left now, is to re-hook-up the Hybrid pipeline, follow the same steps you did before in Step 2 but this time, instead of an existing database, just link it back to the state repository like it was before. If you’ve done this right, you should see no changes pending for migrations:

BUT you will notice that if you change any static data with SQL Source Control, it should now show up in SQL Change Automation!

Change to static data prior to commit in SQL Source Control
Change before migration script generation in SQL Change Automation
Generated migration script to be committed

Conclusion

Is it an intended use? No absolutely not, the reason it’s disabled is that with all things at Redgate they are considered heavily to ensure users are offered the best possible user experience, functionality and essentially something that meets requirements across the board.

But. We can use it weirdly to, as i say, just make it work™.

What have you used weirdly lately? Let me know!

(SQL) Change ALL the Azure SQL Database Automation!

“But I can hardly sit still. I keep fidgeting, crossing one leg and then the other. I feel like I could throw off sparks, or break a window–maybe rearrange all the furniture.”
Raymond Carver

I understand that starting off a blog about Azure SQL Database with the above quote is a little weird, but honestly I’m _really_ excited about what I’m about to tell you.

***Note before starting: This blog post assumes you’re familiar with the concepts of Database Source Control, CI and CD, Azure SQL Database and pipelines within Azure DevOps, otherwise here be dragons.***

I am a huge fan of SQL Change Automation – mostly because of the migrations functionality. In my mind it represents an ideal workflow for making complex SQL Server database changes. If you’re not sure about the different models (State, Migrations, Hybrid), take a look at my blog post from last week here! But until this time it has had one thing that I could not easily do with it… Platform as a Service, Azure SQL DB.

Now don’t get me wrong, SQL Change Automation could easily deploy to Azure SQL Database but I had a problem. The words:

Chris how do we benefit from the migrations approach and put the shadow database and build db in Azure SQL too? We don’t have any local instances or VMs we can use for this and Dev, Test and Prod are all in PaaS!”

elicited this response:

cry crying GIF

But. No. Longer.

Now for those of you who don’t know, the _SHADOW_ database that SQL Change Automation creates is effectively a schema and static data only copy of your database, and it is dropped and built each time you verify, to ensure that all of the migrations run successfully and you can effectively check your work and shift the build left (!!), before you even check into source control.

This shadow database and the build database shared one thing in common and that was that you couldn’t build them in Azure SQL DB, which left 2 choices:

  • Use an instance of SQL Server. Developer for the shadow locally maybe; a VM in Azure or on-prem hosted instance for building
  • [For build specifically] Use localDB. Not advisable if your database contains any objects not supported by localDB because (juuuust in case you didn’t know) it is SQL Server Express.

But on May 12th 2020 (and I only found out about this like 2 weeks ago) the SQL Change Automation team at Redgate released version 4.2.20133 of the plugin for SSMS which included a few super cool things like additional Azure SQL support and the Custom Provisioning Scripts feature.*

excited excitement GIF

Now this is great because not only can we now easily create SQL Clones to be used as the development source (and I’ll blog about THAT a little later) but of course you can use it to use an Azure SQL DB for the shadow AND to use a persistent Azure SQL DB for the CI build as well!

Now unfortunately Kendra kinda beat me to the punch here and she produced a fabulous 3 part video series you can watch on using SQL Change Automation solely with Azure SQL DB, and you can view those here if you don’t want to see me try it out:

Getting set up

The first thing I did was make sure that I had all of the necessary environments to try this out – I created 3 Azure SQL Databases to mimic Development, Build and Production environments on 2 separate Prod and Non Prod Azure SQL Servers. I ran the DMDatabase prep scripts (you can find these here) to setup Dev_Chris and Production, but left BuildDB empty.

Next It was time to create my project, so I hopped over into Azure DevOps and created a new project, initialized it with a README and then Cloned it down onto my local environment:

Everything was ready to go so it was time to create my project!

Setting up SQL Change Automation in SSMS

*cough* or if you’re me, update it first because you’re on a REALLY old version *cough*

Then I hit “Create a New Project” and it allowed me to just specify the connection string to the Dev Azure SQL DB and the project location was the checked-out local repo:

Didn’t change any of the options because I’m a rebel and I didn’t feel like filtering anything out! But of course now comes the fun bit… the baseline. I chose my production Azure SQL DB as it’s my only upstream DB at this point, and it’s time to hit “Create Project”.

…and Huzzah! It’s worked and we’re all good!

excited andrew garfield GIF by The Academy Awards

Now… that’s actually not the best bit! The reason why Andrew there is clapping so hard? Well that little piece of magic has happened in the background! A Shadow database has actually been created for me against my azure server automatically! This is done by using the connection string that is used for dev!

Now… one thing to check, and I didn’t think to do this, but you can specify the connection string in the SQL Change Automation user file but I just left mine for a bit not realizing it created an Azure SQL DB for the Shadow that was CONSIDERABLY higher tier than my dev environment (bye bye Azure credit!), but fortunately I was able to scale it down quickly to basic and that has stuck, but be warned!

So I did what all ‘good devs’ would do now… I committed and pushed my initial commit directly to my main branch! (Don’t tell my boss!)

and safely sat my Database in Azure DevOps:

Setting up the build and deployment stages

This bit was actually just as easy. I used to hate YAML but thanks to a certain (wonderful) Alex Yates I jumped in anyway and it turned out to be just fine!

I created a new basic YAML file within Azure DevOps (and used the assistant to just auto populate the Redgate defaults, if you don’t know YAML or what it can do already, there’s a really good MS article here) and committed it to the main branch again (whoopsie) and the only component was the SQL Change Automation plugin I pulled in from the Azure DevOps marketplace, and I configured the build to target my “nonprod” server and the Build DB I had created previously.

On saving and running the pipeline succeeded!

All that was left to do was to create a Release Pipeline. So naturally, I jumped straight in and created a new pipeline, and I started with an empty job and called it Production*note* make sure you also choose your Build artifact before configuring your release stage too by clicking the Add an Artifact option!:

I added the SQL Change Automation: Release step to the agent job (note because this is all hosted, I’m using an Azure DevOps hosted agent to do this step):

Now you’ll need to add 2 stages (both the SQL Change Automation: Release plugin) at this point, a “Create Release” and a “Deploy from Database Release Artifact” because one will look at the target and figure everything out for you, and you’ll be able to review exactly what will be deployed, and the other will actually _do_ the deployment:

From here you just have to specify the options available, like in this wonderful walk-through here from the fabulous Chris Kerswell of DBAle fame! For me, this was simply targeting my Production Azure SQL Database.

You’ll definitely want to use the project variables to pick up the right package, and also leave the export path blank in both steps for now:

You can Clone the step by right clicking instead if you want to which will preserve all the connections you’ve already provided! Then once it’s all pointed at the right place, save and queue the release!

And of course, we were successful:

and then finally with a couple of triggers set to automatically build and deploy I made a change to my Contacts table in my Azure SQL Dev DB and a few minutes later, thanks to Azure DevOps and Redgate SQL Change Automation the very same change appeared in Production, with no reliance on anything other than Azure SQL DB and SQL Change Automation:

Before the DevOps process on Dev, ready for a migration to be generated
After: Automatic post-build deployment of the new column to the Production Azure SQL Database

Conclusion

If you have all of your databases in Azure SQL Database**, fear not because SQL Change Automation to the rescue! You can very easily set up and configure a pipeline in Azure DevOps or indeed any pipeline of your choice, but it’s never been easier to persist development changes all the way through to Production in a low risk, incremental, “DevOps” way!

—NOTES—

*An important word from the release notes: Note that it is still generally recommended to locate the shadow database locally where possible as that will usually result in a faster database connection. The default CreateDatabase.sql and DropDatabase.sql scripts can be altered to improve performance or implement custom provisioning logic.

**If you have all of your Databases in Azure and you need them masked for Dev/Test too, check out this previous blog post in which I outlined how to do that using Azure DevOps too!

Which database source control model works best for me?

“Destiny is not a matter of chance; it is a matter of choice. It is not a thing to be waited for, it is a thing to be achieved.”
William Jennings Bryan

For many people, figuring out how to get their development database into source control is the first step to a robust, repeatable, automated (and exciting) database DevOps pipeline. This, coupled with exactly which technology (Azure DevOps, Github, GitLab, BitBucket… the list goes on) you’ll be using for Source Control (and later CI/CD) can make it quite overwhelming.

overwhelmed choices GIF

Now fortunately I’ve worked with a number of teams on setting up source control methodologies and some work better than others depending on how you want to work. Remember:

“DevOps is the union of people, process, and products to enable continuous delivery of value to our end users.”Donovan Brown

And the key there is equal parts, whilst technology has a part to play, it comes down to the teams; nurturing and feeding a positive mindset of collaboration and communication within the team and then defining which methodologies and processes work best for you.

Once you’ve got that down, pick the source control methodology that works best for you, and luckily there are 4 choices:

  • State-First
  • Migrations-First
  • Hybrid- or Optimized-Model
  • Other

Ok… maybe I lied about the 4 choices because other can encapsulate many many different options in itself. But, what I’m going to talk about below are the 3 primary options I see development teams adopt and how they fit into your teams culture.

If you’re already tired of reading then you’re in luck! I also talked about this same topic at Redgate Streamed on 28th May 2020 so if you follow that link you can “register” to watch on demand, I will tell you in your own ears! (As opposed to reading below) – I won’t tell you to just watch my session because you should DEFINITELY check out the sessions also given by Kendra, Grant, Ben and Frank which were… well:*

Lets Go Yes GIF by Music Choice
*(SOOOOOO Good!)

State-First Approach

The state-first approach is, as it would suggest: the state of each object within the database is captured by whatever tool you use, i.e. the script needed to CREATE that object, and it is written out into its own flat file (most often a .sql file) in version control. The actual structure of these files and folders can vary by technology but largely it will follow a logical structure and the bottom line will be a create script per object.

When an update is made to that object, a newer version of that same create script is generated and it is added as a newer version of that script in version control and that is the latest version of the database which we can then deploy. When using state-first we have no alters, only creates, so it will be necessary to do a comparison at a later stage to work out the difference, and by extension the update/alter script that will be needed to propagate changes to later stage environments.

The benefits of the State-First method include (but are not limited to):

  • A simple approach to get started with standardizing development practices: It’s aligned with the practices we already have in place on the application development side, where source control has been standard practice for years.
  • Easy on-boarding for teams in the ‘Shared development model’: When every developer is forced to share a single development environment it can be quite hard to ensure that developers are keeping work separate; most tools that enable you to work in this model allow you to ‘lock’ objects at the database level as you work on them, or who exactly made each change that might be committed.
  • Easier to roll-back to previous state: Rollbacks are a pain with databases, but there are times where they are necessary. Maintaining a full history of the state at any given time makes it easier for us to compare and rollback environments to a state that we know worked well.

The drawbacks though of the State-First method include (but are not limited to):

  • Not as easy to achieve small, incremental deployments: Because we’re reliant on the state of the database at each stage there is still a certain element of overhead that is attached to each deployment.
  • Upgrade script determined at a later stage: Lots of people like to know EXACTLY what changes will be deployed and HOW against target environments, but because of the above reason, we’re reliant on approving changes early on, but only truly seeing how it will be deployed later in the pipeline, which doesn’t give us the same reliability or peace of mind.
  • Not as easy to refactor complex table changes: The State-First method is “How did it look to begin with and how did it look at the end?” so it doesn’t take into account the nuanced steps that may have been involved, which can be problematic when you’re adding a NOT NULL column to a table that has existing data, so these sorts of complex changes might require additional pre- and post-deployment scripts.

Migrations-First Approach

The Migrations-First approach differs significantly from the State-First approach because, as it would suggest, it relies on migrations to identify the version of the schema across environments and they usually rely on guids, numbering conventions, checksums and others to keep track of the schema, normally within a log table of their own on the affected schema itself. The migrations often come in the form of .sql files that have been written or generated and there are lots of different types, but they can be boiled down often to the idea of Repeatable, Versioned and Undo Migrations (see here on the Flyway site for a more in depth summary of these types)

The migrations then, actually contain the changes as you would like them to go out; many believe that (after testing) the script they have written is as it should be deployed, and that is exactly what is then being run against each stage. Now naturally, you need to build ON something, if you have an existing database, so many technologies will offer some sort of baselining option, to understand what already exists and what the incremental migration scripts are deploying to.

The benefits of the Migrations-First method include (but are not limited to):

  • Enables small, frequent, incremental migrations and predictable deployments: Everything is just that tiny piece of work you did, specifically. That means that only what you need to go out will go out; only what was approved at Pull Request time. This gives us high confidence that we’re sending the right changes to Prod.
  • Ideal for environments with high up-time requirements: There’s no heavy state to check, we’re just migrating these tiny changes, which means there’s far less chance of causing huge overhead on Production at deployment time.
  • Ability to use your own custom standards and code for table changes in deployments: No script generation or the ability to edit generated scripts is one of the greatest capabilities of this model. For complex changes, the steps to achieve this we KNOW that work are included, and not only that, the scripts are commented and formatted and easy to understand with our company standard, making it easy to keep track of what has been deployed.

The drawbacks though of the Migrations-First method include (but are not limited to):

  • Not as easy to pick and choose changes to be deployed: If a developer has captured multiple changes within the same script, but we only want to deploy a subset of those changes, or we don’t want to deploy to a subset of those objects right now, then it’s really hard (almost impossible) to try and unpick these changes, this also makes testing certain changes in isolation tough!
  • Higher learning curve for teams: This method is neither as easy to adopt nor as intuitive as the State-First approach, which means developers need to get used to writing their migration scripts, ensuring they’re properly formatted, commented, tested, numbered and where necessary, the undo script for those changes. This results in a much higher ask for the team; the cost for gaining the predictability of deployments.
  • Harder to roll-back changes: _On those very same undo scripts then_ they have to be absolutely perfect. It’s still much harder to undo, especially if we’re trying to undo migration 5.0.1 when we’re already on 6.1.2, everything has to cascade neatly if you’re carrying out multiple undo’s and having a water tight undo strategy is hard to nail down.

The Hybrid / Optimized Model

This particular model is a rare one to find because it is not offered widely, but where it is achievable it can offer the benefits of both the State- and Migrations-First models.

As the name would suggest, it is a combination of the state and migrations approaches into a single Hybrid model; developers store the state of their database in source control, allowing them to easily rework their changes and commit multiple times to their working branch as they develop the “end goal”, and then from this same location once those changes are confirmed, pushed and ready to go, the relevant migrations are generated from the latest state.

Now this model can be adapted into lots of different workflows: developers can all generate their own migrations from their state and check them in together when they’re happy. This records a granular history of each change that was made and how it applies to each object, and is easy to work with, and then the migration contains just what needs to go out from all of that work. Another option would be having developers make the changes and check these into a DB State folder in source control, and then having more experienced developers or DBAs etc. generate the respective migrations from the state, knowing that they have a greater confidence in the SQL specific changes that are captured in the script. This is nice because it gives cause for another pair of eyes, which again gives greater confidence in what ultimately gets deployed.

The benefits of the Hybrid / Optimized method include (but are not limited to):

  • Full granular history around object changes on a state level, but with customization, flexibility and reliability of migration scripts: Know exactly what has changed, when and by whom, but don’t worry that you don’t know exactly what change will be deployed.
  • Separation of duties for Developers and Senior Team Leads / DBAs (who generates what / who has what specialty) and a lower learning curve for developers: Easy for developers to make changes quickly and easily without having to worry about the “nitty-gritty” and exactly what SQL will be needed. Gives DBAs and senior developers peace of mind that changes are ultimately adopted and improved by people who _know_ the database.
  • Easily extends existing state-first model where migrations are needed: State-First is a great choice 70% of the time but there ARE times where data migrations or complex changes are needed. This method includes these changes where needed, instead of relying on pre- and post-migration scripts, which run globally every single time.
  • Easier to pick and choose changes to go out: Because we can choose which changes to which objects are going out in the migration scripts, it’s easier for us to grab only the ones we want to push out each time, like an additional “cherry pick” layer within the development process.

The drawbacks though of the Hybrid / Optimized method include (but are not limited to):

  • Additional step added to the process can make it feel like red-tape / added work: In some cases teams may wish to make changes and get them out _fast_ as part of continuous deployment, and could be doing so hundreds of times per day. This model can get in the way of that because it adds an additional layer of dependency.
  • Could add some time to the overall development process for new changes: This is almost exactly the same as the above reason. More steps to include, more people to include, slightly less automation than we would like _perhaps_ so naturally time to deploy increases slightly (but arguably is offset by greater confidence in the change? I’ll let you decide!)
  • Duplicated schema model in Source Control repository: Some tools keep a copy of the schema in source control as reference for the migrations, others don’t. In either case, you’re maintaining two versions of a repository, which many say should be the single source of truth, if these two are even slightly out of sync, who are we going to believe? This model calls for discipline, as sloppiness can destroy all of the proposed benefits.

aaaaaaaaaaaaaaaaaand… breathe!

Slow Down Reaction GIF by True and the Rainbow Kingdom

Conclusion

There are lots of different models you can adopt for the source controlling of your database and changes, in this post I’ve outlined 3 (well… 2 and a half really) but whatever you’re looking to adopt, hopefully this will give you greater confidence in adopting the right one.

Have a wonderful week!