Creating rollback scripts automatically with SQL Compare and SQL Data Compare in Azure DevOps (migrations approach)

“Life’s under no obligation to give us what we expect.”
Margaret Mitchell

Hello everyone and happy 2021! Wait, what? It’s FEBRUARY!?! Oops.

You may well have noticed I have not been around for a couple of months. That is because I’ve been struggling. A lot. We’ve had some home worries, and with the state of the world playing on our minds constantly, it has never felt right to sit down and blog anything other than some of the recipes I’m working on (purely to get out of my own head!) but this week I was asked about rollbacks when working in the Migrations based approach with SQL Change Automation, and I couldn’t resist writing something.

Before I get into this though, I want to be very clear about my preferences. I don’t believe database changes SHOULD ever be rolled back. Always forward. If for any reason you need to undo a change, either for a disaster or just because you want to, and other code is being rolled back – then this should be a task. A rollback branch should be created, the respective changes should be made in dev, and then merged, built, checked, tested and deployed upstream to Prod using your fabulous, flexible and repeatable pipeline. This rolls the database forwards and does so in a way that is controlled, auditable and prevents data loss.

Jimmy Fallon Agree GIF

However there MAY be some scenarios where a rollback is unavoidable, for example if you are required to do so immediately, or you’re not sure which of the 3 changes that were just deployed actually caused something to go bang, so you want to blanket undo until you can investigate properly.

With this in mind, there are a couple of ways of rolling back in the migrations approach that can absolutely be a part of your pipeline, one or two of which I want to try and flesh out in this post. This post would of course be much shorter if we were working with the state-based approach, as we would simply need to compare a previous artifact with our target and generate a down script, but as we know, the state-based method does indeed come with its own challenges.

  1. Generate down scripts up front – this is an option that is described fairly well in the SQL Change Automation documentation, and may be preferable to some, however this is limited to VS and does definitely include some additional work for developers, and potentially generates quite a lot of scripts that may have to be rationalized
  2. Ad-hoc rollback with SQL Compare – again, this is described in the documentation, but any changes made this way then have to be recaptured in the development pipeline in some way; plus this method is ideal for programmable objects but not so much for table changes.

But I think we’re onto something there with a combination of these 2 approaches. Is there a way to use SQL Compare and SQL Data Compare alongside SQL Change Automation, to automatically generate down scripts as artifacts, before each Production deployment? This is what I want to investigate.

Set Up

As usual, I set up an Azure DevOps repo (this time called RollBackPipe) and created myself a set of local databases that I’ve been using to record videos lately so I now have the MaskMePlease database in Dev, Test, Staging and Production flavors. I’m doing this locally on my machine with an Azure DevOps agent I had hanging around anyway – it saved me firing up a bunch of Azure SQL DBs, and I just used SQL Clone.

Implementation

I got the pipeline running straight through. I used YAML for this pipeline as I didn’t want to “faff” around with the Release stages etc., but this could also have been achieved with Classic Pipelines of course.

Ignore the terrible stage names, essentially I was doing 3 things with SQL Change Automation:

  • Database Build: Producing a Nuget Package to represent the successfully building files stored in my Git repo, to consume later in the pipeline
  • Create Release Artifact: This stage looks at the target environments (Test, STG and Production) and creates a number of reports in my local DatabaseDeploymentResources folder for me to review
  • Deploy From Release Artifact: Take the patch script that has been produced and run this against the target DB

So it was time to try and build in the SQL Compare command lines. Note that if I were doing this on a hosted Azure Pipelines agent I might need to run this as a container or programmatically install the command lines instead, but fortunately they’re just installed on my machine, so I could simply add a command line stage.

There were actually a couple of ways to do this: I could pass in the details of the 2 databases completely via the command line (full documentation here), generate an XML config file to pass in and store that in version control, or pre-construct a SQL Compare / SQL Data Compare “Project” and just pass that in. The latter seemed easiest, so of course that’s what I did!
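For completeness, the fully command-line route (no saved project) would look roughly like the sketch below. This is a hedged example based on the SQL Compare command line documentation rather than what I actually ran, and the server and database names are placeholders – but note the direction: Production is the source and Staging is the target, so the generated script is a downgrade rather than an upgrade.

# A sketch only: compare Production (source) against Staging (target) and script the differences to a file
& "C:\Program Files (x86)\Red Gate\SQL Compare 14\sqlcompare.exe" `
    /Server1:".\TOOLS" /Database1:"MaskMePlease_Production" `
    /Server2:".\TOOLS" /Database2:"MaskMePlease_STG" `
    /ScriptFile:"C:\DatabaseDeploymentResources\MaskMePlease\SchemaDowngradeScript.sql" `
    /include:Identical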

I opened up SQL Compare and did a Database to Database comparison for Staging and Production, but in the reverse direction to ensure we create a down script and not an up script!

I then saved this as a project, and then, using the red SQL Compare logo in the top left, I was able to launch SQL Data Compare with the same project, so it loads the settings for you. The reason for doing this is that if we undo anything that was created as part of a migration and then try to redeploy it, SQL Change Automation won’t actually recognize those objects as missing, because the migration log table tracks all the migrations and will still say they have been deployed. So we also need to compare the __MigrationLog tables, to re-sync back to the state we were in before we ran the migrations.

However I DID make sure that, before comparing, I included only the __MigrationLog table in the comparison in Tables and Mappings:

I then had to change the Comparison Key being used for the migration log table to [Custom], because the default comparison key includes the date/time executed, so this is what I went for:

I saved this project down with the Compare script:

and then following the rather helpful documentation on the command lines, I was able to make the following calls as part of my pipeline (YAML included):

- task: CmdLine@2
  inputs: 
    script: '"C:\Program Files (x86)\Red Gate\SQL Compare 14\sqlcompare.exe" /project:"C:\DatabaseDeploymentResources\MaskMePlease\_ComparisonResources\ProdToStagingDownScript.scp" /scriptfile:"C:\DatabaseDeploymentResources\MaskMePlease\%BuildId%\SchemaDowngradeScript.sql" /include:Identical'
  env:
    BuildId: $(Build.BuildId)
- task: CmdLine@2
  inputs:
    script: '"C:\Program Files (x86)\Red Gate\SQL Data Compare 14\sqldatacompare.exe" /project:"C:\DatabaseDeploymentResources\MaskMePlease\_ComparisonResources\ProdToStagingDataDownScript.sdc" /scriptfile:"C:\DatabaseDeploymentResources\MaskMePlease\%BuildId%\DataDowngradeScript.sql" /include:Identical'
  env:
    BuildId: $(Build.BuildId)

and sure enough, having this run after the deployment to Staging but before the deployment to Production, I was able to get the reports produced into the artifacts folder with the relevant changes documented:

So it was time to try it out! I created a BadTable on Dev and generated the migration for it, and committed it to source control:

This then triggered the pipeline to run:

and produced the correct reports and .sql scripts:

Then I ran these scripts to roll back Production (of course I could have modified them to preserve any data I needed to retain; this approach actually works best with programmable objects like sprocs). Then I re-ran the deployment pipeline and the BadTable lives once again!

So does this work with multiple changes too? I created a stored procedure called “deletemeplease” and modified our BadTable to add a “randomthing” column, to see if we could roll those back:

Production after deployment:

Production after rollback:

All I have to do to roll forwards again now is run my pipeline:

and they’re back:


Conclusions

Is it perfectly possible to have a version specific script output as an artifact as part of your Azure DevOps pipelines? Yes it absolutely is.

Is it perfectly possible to rollback changes that you’ve made when taking a Migrations-First approach to database development? Yes it absolutely is.

Would I recommend this approach? No. Absolutely not.

Just because this is possible doesn’t mean it is good process. I’ve been asked about it enough times that I was more curious than anything – however, after actually IMPLEMENTING this process, it is clear to see there are many aspects to it that can fall down, and it makes it clearer now, more so than ever, that the best approach would be to either roll forwards from Dev, through the pipeline, in an automated, auditable and controllable way, or, if we just need to roll back temporarily because of other code being rolled back, to use SQL Compare ad-hoc. This pipeline was very simple, with very few changes, and when I made mistakes I could rectify them easily myself – but with a larger dev team, under pressure to fix things, and many changes to consider, this approach can very quickly fall apart.

Long story short, (I believe) we should follow and trust the documentation.

YAML from Azure DevOps:

trigger:
- main

pool: 'default'

steps:
- task: RedgateSqlChangeAutomationBuild@4
  inputs:
    operation: 'Build SQL Change Automation Project'
    sqlProj: 'RollBackPipe\RollBackPipe.sqlproj'
    packageName: 'MaskMePlease.Database'
    tempServerTypeBuild: 'localDB'
    nugetPackageVersionSelector: 'Specific'
    nugetPackageVersion: '1.0'
    nugetPackageVersionUseBuildId: true
    requiredVersionOfDlma: 'latest'

- task: RedgateSqlChangeAutomationRelease@4
  inputs:
    Operation: 'Create'
    NuGetFile: 'MaskMePlease.Database.1.0.$(Build.BuildID).nupkg'
    ExportPath: 'C:\DatabaseDeploymentResources\MaskMePlease\$(Build.BuildID)\Test'
    AppendEnvironment: false
    TargetDatabaseConnectionType: 'explicit'
    TargetDatabaseServer: '.\TOOLS'
    TargetDatabaseName: 'MaskMePlease_Test'
    TargetAuthMethod: 'windowsAuth'
    TransactionIsolationLevel: 'serializable'
    AbortOnWarning: 'none'
    requiredVersionOfDlma: 'latest'


- task: RedgateSqlChangeAutomationRelease@4
  inputs:
    Operation: 'DeployFromResources'
    ImportPath: 'C:\DatabaseDeploymentResources\MaskMePlease\$(Build.BuildID)\Test'
    AppendEnvironment: false
    TargetDatabaseConnectionType: 'explicit'
    TargetDatabaseServer: '.\TOOLS'
    TargetDatabaseName: 'MaskMePlease_Test'
    TargetAuthMethod: 'windowsAuth'
    requiredVersionOfDlma: 'latest'

- task: RedgateSqlChangeAutomationRelease@4
  inputs:
    Operation: 'Create'
    NuGetFile: 'MaskMePlease.Database.1.0.$(Build.BuildID).nupkg'
    ExportPath: 'C:\DatabaseDeploymentResources\MaskMePlease\$(Build.BuildID)\Staging'
    AppendEnvironment: false
    TargetDatabaseConnectionType: 'explicit'
    TargetDatabaseServer: '.\TOOLS'
    TargetDatabaseName: 'MaskMePlease_STG'
    TargetAuthMethod: 'windowsAuth'
    TransactionIsolationLevel: 'serializable'
    AbortOnWarning: 'none'
    requiredVersionOfDlma: 'latest'


- task: RedgateSqlChangeAutomationRelease@4
  inputs:
    Operation: 'DeployFromResources'
    ImportPath: 'C:\DatabaseDeploymentResources\MaskMePlease\$(Build.BuildID)\Staging'
    AppendEnvironment: false
    TargetDatabaseConnectionType: 'explicit'
    TargetDatabaseServer: '.\TOOLS'
    TargetDatabaseName: 'MaskMePlease_STG'
    TargetAuthMethod: 'windowsAuth'
    requiredVersionOfDlma: 'latest'

- task: CmdLine@2
  inputs: 
    script: '"C:\Program Files (x86)\Red Gate\SQL Compare 14\sqlcompare.exe" /project:"C:\DatabaseDeploymentResources\MaskMePlease\_ComparisonResources\ProdToStagingDownScript.scp" /scriptfile:"C:\DatabaseDeploymentResources\MaskMePlease\%BuildId%\SchemaDowngradeScript.sql" /include:Identical'
  env:
    BuildId: $(Build.BuildId)
- task: CmdLine@2
  inputs:
    script: '"C:\Program Files (x86)\Red Gate\SQL Data Compare 14\sqldatacompare.exe" /project:"C:\DatabaseDeploymentResources\MaskMePlease\_ComparisonResources\ProdToStagingDataDownScript.sdc" /scriptfile:"C:\DatabaseDeploymentResources\MaskMePlease\%BuildId%\DataDowngradeScript.sql" /include:Identical'
  env:
    BuildId: $(Build.BuildId)
- task: RedgateSqlChangeAutomationRelease@4
  inputs:
    Operation: 'Create'
    NuGetFile: 'MaskMePlease.Database.1.0.$(Build.BuildID).nupkg'
    ExportPath: 'C:\DatabaseDeploymentResources\MaskMePlease\$(Build.BuildID)\Production'
    AppendEnvironment: false
    TargetDatabaseConnectionType: 'explicit'
    TargetDatabaseServer: '.\TOOLS'
    TargetDatabaseName: 'MaskMePlease_Production'
    TargetAuthMethod: 'windowsAuth'
    TransactionIsolationLevel: 'serializable'
    AbortOnWarning: 'none'
    requiredVersionOfDlma: 'latest'


- task: RedgateSqlChangeAutomationRelease@4
  inputs:
    Operation: 'DeployFromResources'
    ImportPath: 'C:\DatabaseDeploymentResources\MaskMePlease\$(Build.BuildID)\Production'
    AppendEnvironment: false
    TargetDatabaseConnectionType: 'explicit'
    TargetDatabaseServer: '.\TOOLS'
    TargetDatabaseName: 'MaskMePlease_Production'
    TargetAuthMethod: 'windowsAuth'
    requiredVersionOfDlma: 'latest'

Refreshing SQL Server Development workflows with Redgate SQL Provision

“If you quit on the process, you are quitting on the result.”
Idowu Koyenikan

SQL Provision is really cool. But you knew that didn’t you? It’s obvious – we get teeny-tiny clones, based on an image with completely sanitized data we can use for just about anything in dev and test, and if we break them? Boom! There’s a new one.

I’m not just talking about refreshing Dev & Test environments though, oh no! I’m talking:

  • Clones as baseline with SQL Change Automation – baseline scripts for projects are a thing of the past, goodbye invalid object headaches!
  • Clones every single time you switch a branch – keeping everything separate and not cross-pollinating database work between branches
  • Clones to check Pull Requests instead of relying solely on the code itself in Version Control

Watch my session on all 3 of these from Redgate Streamed back in August: https://www.red-gate.com/hub/events/redgate-events/redgate-streamed/redgate-streamed-global-august-26

But one question always comes up about clones in any workflow and that is – how often should I refresh Images and Clones?

This question obviously depends a lot on the process but in reality I think the question should be less about clones and more about the images themselves. Clones are transient and can be flipped at a moment’s notice, but the image, or the “clone tax” as Steve Jones calls it, is the thing that takes time, resource and space.

I’m going to have my own go at answering this question as I would in any customer meeting or architecture session – but if you want some excellent detailed advice and examples, check out this awesome documentation page here: https://documentation.red-gate.com/clone/how-sql-clone-improves-database-devops/self-service-disposable-databases-for-development-and-testing

Q: So, how often should we refresh it?

A: It depends on your use of the Clone – how often do you need up to date data?

As a rule of thumb though, I tend to see the following behaviours:

  • Customer Support – overnight during the working week: Where you have data that needs people to troubleshoot customer issues, it always helps to have data as close to now as possible to help resolve issues. You want an image on standby ready so that at any second a member of support can pull down a copy to look through (if it NEEDS to have sensitive data for this purpose, you can restrict who can create clones from these images by using SQL Clone’s Teams functionality)
  • BI / MIS and Report testing – once a week (if not more often): Business Intelligence and reporting workflows can just mean that you’re reading a lot from your clones in which case they should stay small and you should be able to move seamlessly between clones. But. If your ETL process puts a very heavy load on your clones (like truncating and re-populating tables) you may cause bloat and need to rethink your refresh frequency to be more often where possible, perhaps overnight so that any transformations are captured in the new images, and clones by extension.
  • “BAU” Development (Schema and Static Data Changes) – Every 1 or 2 weeks: If you’re not affecting a large number of changes to your clone, or they are limited to schema and static data only then you should be absolutely fine with a wider refresh cadence – keeping the clones around for the whole sprint or only refreshing once during the sprint can mean everyone more easily stays up to date with the same environment consistently.
  • Ad-Hoc and Test workflows – once per month: There are going to be times where you occasionally need a copy of the live DB, but the fact it is 99% similar in terms of schema, and the data is a few weeks out of date isn’t a big deal. You can pull one down from this “cold copy” for any kind of test, destructive or even to validate certain behaviors / sense check if an update or query will work. It’s also handy to maintain a slightly older copy where possible if you need to start digging into failed updates made in development, so need to have a milestone to compare from.

Again – these workflows may vary and you may have needs to be more or less frequent based on differences being recorded, bloat, space available on the fileshare etc. but generally I find customers are pretty happy with this.

Q: Once we have our refresh rate in place – how do we move developers across?

This is a great question I get a lot of the time, and it stems from the fact a developer may have made a few dozen changes to a clone, and then the frequent refresh rate blows their clones away (and they forgot to commit to version control – D’oh!) – so it’s important to bear in mind that development work, and as a result the cloning of environments is not “cut and dried“. We should give developers a chance to move across as-and-when they’re ready, so I often end up recommending the below workflow, to ease this process.

For the sake of this proposed workflow I’m assuming a couple of things:

  1. The selected workflow is BAU Development and we want to refresh once per week
  2. We have enough space available on our fileshare to allow for 2 (or more) distinct copies of the primary image
  3. Clones are being delivered to jump boxes / VMs within the network that are always connected (and not developer machines), and we can control when they are deleted
  4. We operate on a standard western work schedule where the week begins on Sunday, Saturday and Sunday are considered non-working days and developers typically work anywhere between 8am and 6pm
  5. This can all be automated using SQL Clone’s PowerShell module
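To make that last point concrete, here is a minimal sketch of the kind of PowerShell the weekly cycle below boils down to, assuming the SQL Clone PowerShell module is installed. The server URL, machine, instance, share, backup and database names are all placeholders for your own environment, and you would bolt your masking/modification steps onto the image creation as needed.

# A minimal sketch, assuming the SQL Clone PowerShell module is available and the names below are placeholders
Connect-SqlClone -ServerUrl 'http://sql-clone-server:14145'

$instance    = Get-SqlCloneSqlServerInstance -MachineName 'CloneHost' -InstanceName 'SQL2019'
$destination = Get-SqlCloneImageLocation -Path '\\fileshare\CloneImages'
$imageName   = "Image_$(Get-Date -Format 'ddMMyyyy')"

# Sunday night: create the new weekly image from the most recent backup (add masking here as required)
New-SqlCloneImage -Name $imageName `
                  -SqlServerInstance $instance `
                  -BackupFileName '\\fileshare\Backups\PrimaryDatabase.bak' `
                  -Destination $destination | Wait-SqlCloneOperation

# Monday morning: a developer (or a scheduled job) spins up a clone from the new image
Get-SqlCloneImage -Name $imageName |
    New-SqlClone -Name 'PrimaryDatabase_DeveloperX' -Location $instance |
    Wait-SqlCloneOperation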

Week 1 – Sunday night

  • We create Image A of Primary Database from most recent backup file onto fileshare, applying data masking

Week 1 – Monday to Friday

  • Developers X, Y and Z create their own clones of Image A as they begin the working week
  • The clones are linked to a Git repo where, using SQL Change Automation, the developers commit all changes they make to their clones throughout the week
  • Developer X finishes with their changes, makes their final commit and push on Thursday and works on a different task on Friday

Week 2 – Sunday night

  • We create Image B of the primary database – with slightly more up to date (and sanitized) data, capturing any deployed changes the team committed and pushed to git previously
  • We retain Image A for now but do a check for which developers have clones remaining (Developers Y and Z) and either nudge them in the team stand up that they only have a few days left, or automate the sending of an email to those developers warning them their clones are now 1 week old

Week 2 – Monday morning

  • Developer X creates their new clone from Image B and links it to Git ready to start making changes

Week 2 – Tuesday to Friday

  • Gradually over the course of the week as Developers Y and Z finish with their tasks and commit their changes they remove their clones and create new ones from Image B
  • A final reminder, as an email or a notification in MS Teams / Slack goes out on Friday morning that any clones of Image A will be deleted over the weekend

Week 3 – Sunday night

  • Image A with no clones remaining is deleted (or any remaining clones are deleted first) and Image C is created to begin the cycle again

Conclusion

Although this workflow requires the duplication of the central image, it has a number of benefits:

  • It is easily automated using PowerShell
  • The source control process suffers minimal disruption and developers don’t need to rush to finish anything
  • We don’t accidentally destroy developer work – the onus is on the developer to ensure work is committed
  • If, for any reason the image creation process fails, you still have a persisting image, so you don’t prevent developers from doing any work / waiting for the image process to manually complete
  • Moving to newer clones is a more organic process
  • If you wanted to maintain an image throughout the week and refresh a second image overnight for more up to date data, you can simply re-purpose the above principles. This could then be used for a number of the different teams and workflows simultaneously

Bonus Point – Naming Conventions

Many people choose to append the images they create with a date stamp like Image_A_16102020 so we know when it was taken and which the latest is. This is good practice, but be warned: if you’re using clones as a baseline or for branch switching etc., you will need a persistent name, else that link will break. An alternative is always having the same name for the most current image and then simply renaming the older image with the date/time stamp, e.g. Image_A is current, but before creation of a new Image_A, it is renamed to Image_A_16102020 – this will not disrupt the clones that already exist on it, and it allows you to always know which one is most recent.

Using things weirdly – Part 3: Moving from State to Hybrid-ish Source Control with Microsoft SSDT and Redgate SQL Change Automation

“It works on MY machine”
Everyone

Some time back I blogged “which database source control model works well for you” and it has come in SO handy when explaining the different models to people – but the one question I get asked not infrequently, is:

Is there any way that I can use Redgate SQL Change Automation with Visual Studio based SSDT?

It’s always a really difficult question to answer because fundamentally SQL Source Control (Redgate’s state based tool) and SSDT (Microsoft’s state based tool) functionally seek to do the same thing, making them competitors. However there are, on the odd occasion, good reasons as to why I am asked the question and one of those same scenarios came up today:

  • Our developers work in Visual Studio and have already been using SSDT for a long time, it works for them, we just want to extend it with Migrations to handle complex changes.

So the options here are: leave it as it is, or try to work with both. Not always going to be my first choice, but it got me thinking.

Starting from a memory

A few months ago, when life was “normal” and I was working in my office with *gulp* PEOPLE, I tried to make this scenario work by simply linking SQL Change Automation to the project folder created by SSDT but unfortunately it was riddled with problems. The SSDT importer and repo look like this:

And if you point SQL Change Automation at the local repo with this in it, it will correctly tell you:

Now of course this wasn’t unexpected. It’s not designed to work this way, is it? No. But way back then I did figure out, shrewdly, that if I used SQL Source Control to carry out an initial commit just to a working folder, it would generate a RedGate.ssc and RedGateDatabaseInfo.xml file and you can copy them into the SSDT repo to trick SQL Change Automation into thinking that it’s a SQL Source Control repo… unfortunately this trick no longer works. Sad.

sad a christmas story GIF

Add a hop and a step

But what got me thinking today was the context with which the question was asked. It was more about separation of duties. Once the developers have delivered the change into the repo, their job is effectively done! “That’s how it should look moving forwards. What’s next?” – and then I had an idea.

Given that SSDT allows you to push and pull the code and apply it to your own database, what is stopping us from using SQL Change Automation to pick up on the changes against the database we sync our changes to from our SSDT project?

Genius. Evil genius.

So I created a new Database to simulate having another developer on my instance and gave it to Peter Parker:

You can then do a schema compare to another DB from your project, effectively PULL down changes from the remote to your local repository, and then sync them back up to your local development DB; this is how Devs stay up to date with each other but could, in this methodology, be how DBAs or senior developers pull down the changes to their local DB, where they test the new state, and then generate a new migration from it.

So I made a change on my dev database and captured it in the project (right click on the project name > schema compare > dev db compare to project > update) and then committed and pushed:

and sure enough my repository was updated:

But then I simulated pulling down the change and applying it to Peter Parker’s DB (again using Schema Compare) and then I created a SQL Change Automation project in VS, in the same solution but pointing the project to a migrations folder in the repo:

Yes I accidentally called the project Database1 don’t remind me I’m embarrassed enough!

Then I added my baseline database:

It created the baseline and the project immediately with no issues and picked up on the changes I had made using SSDT:

and I was able to commit my project and changes into my repository in Azure DevOps:

It was just that easy! Now what this means for the development process is that developers _could_ feasibly work with SSDT, as they are comfortable with it, and then more senior members of the team can generate migration scripts from there, building the database from scratch and deploying in a reliable, repeatable fashion.

Just to prove to you my build even ran green from this:

So in summary what this gives us is the ability to adapt a regular SSDT workflow, one that developers are comfortable with and which has been in the team for months or years, add in the knowledge of DBAs or team leads, a greater separation of duties for high risk schema changes, and the control and flexibility (and peace of mind) that comes with a migrations based deployment process.

Nice.

The fine print

I’m sure by now you’ve realized something: this is not, nor will it ever (I believe) be a supported workflow. If you implement the above in a production sense for something other than just testing then it’s not something you’ll be able to get help with from one of the Redgate engineers if you need to troubleshoot.

Also, if you’re going to introduce a sequence of changes like this to achieve the hybrid model, it does make more sense that you implement SQL Source Control for the state side (given that it’s right there in the SQL Toolbelt with SQL Change Automation anyway).

But IS IT POSSIBLE to achieve a similar, Visual Studio based* hybrid workflow with SSDT and SQL Change Automation by using a database to ‘hop’ the changes across?

Yes, it certainly looks that way!

*If you’re planning on using SSDT in Azure Data Studio too then this workflow could also work for you, SQL Change Automation is present in SSMS and VS so it’s really up to you!

SQL Change Automation and GitLab CI/CD (a.k.a. Oh this is fun on Windows)

“You never know what you can do until you try, and very few try unless they have to.”
C.S. Lewis

Well I don’t have to, but many of the people I speak to on a daily basis are moving into GitLab, so it’s about time I tried it! You can find here testament to the mistakes I make as I try to set up a full end-to-end database change management process with SQL Change Automation and GitLab.

Will it all work perfectly? I don’t doubt that everything will fall over at some point, but let’s see how we get on all the same, and hopefully if you’re setting up this same pipeline, you’ll be able to avoid the errors and failings I inevitably cause! So here we go!

ready come on GIF

Let’s set up a GitLab Project (and rename the default branch)

Naturally, I didn’t have a GitLab account, so I had to set one up. I’m assuming that if you’re using it already or you’ve just started using it you’re taking advantage of the more business features but I’ve just stuck with the good ol’ free account for now! It was remarkably simple, sign up, email address, confirm and here we are:

Ok there is something very cool I like about setting up a new project, can you tell what it is?

You can completely set up a new blank project, but they have templates, you can import projects OR, and I love this, you can set up a full CI/CD pipeline from another repo! Having done this before in Azure DevOps, it was not easy, let me tell you. It really seems like Azure DevOps hates you for setting up CI/CD from an external repo, even though it has plenty of helpful ways of doing so!

So I initialized my repository with a README and updated it:

Don’t ever say I’m not descriptive enough!

The first thing I did was rename my default branch to ‘trunk’ by going to Branches, creating the new branch and then, in Settings > Repository, changing it to the default and swapping the protected status over from the outdated master:

Then finally delete the old default in Repository > Branches:
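If you would rather do the rename from the command line than by clicking around the UI, the equivalent is roughly the sketch below, with placeholder branch names – you still need to switch the default and protected branch over in Settings > Repository before GitLab will let you delete the old one:

# Sketch only: rename the local branch, push it, then remove the old remote branch
git branch -m master trunk
git push -u origin trunk
# change the default/protected branch in GitLab first, then:
git push origin --delete master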

Excellent. Now it’s time to clone trunk onto my machine as we will need the local repository to put our change automation project in!

I created a folder called GitLab test and cloned the mostly empty repository into it:

Simple!

Create a new SQL Change Automation project and push it to trunk

In SSMS I opened up the most recent version of SQL Change Automation and created a new project called “DoggosAreCoolDB”, using a copy of a Dev database I had lying around from a previous demonstration (BlogsDotRedgate):

Then I created my baseline as a migration script against the up-stream copy, BlogsDotRedgate_Integration, because who has access to Prod for this? Am I right? *cough & shifty eyes* not me!

I successfully generated my baseline and a change script (I added a column to a table, nice and simple) and then committed them to my local repo, and pushed! Forget branching, today isn’t about that, we’re just PUSHING TO TRUNK, WOO-HOO!

Setting up the CI/CD Pipeline

Now that we have our project and migrations in GitLab we can build out a pipeline! So first stop, I went straight to CI/CD > Pipelines and hit “Get Started”:

They immediately throw you into a Quick Start “Help” style guide which is a little un-intuitive but surprisingly helpful if you read the whole thing. Effectively we need a YAML file called .gitlab-ci.yml that will store our pipeline as code, telling it how and where to build, and we need a runner to actually fire up and execute these steps.

In my experience with some other CI/CD tools, it’s been advantageous to actually create the Runner / Agent first on the machines you’re going to be using, so as I just have my laptop to do this on, I will set one up on there! I found the full documentation for a Windows Runner here, and followed it just so I would have it available.

The GitLab Runner was up and running in my services but I’ll be darned if I could see it anywhere in GitLab…

confused britney spears GIF

Aha! So it turns out, after a bit of digging, that you need to register the runner specifically using the CI/CD section in the project settings. That was probably my bad for not reading the documentation thoroughly, but in my counter-argument… who actually does? So I issued the register command, applied tags and a description and chose my runner type. I chose shell because I need to be able to run PowerShell on the machine (I’ll need the SQL Change Automation PowerShell components available on the machine where the Build and Deployment are happening, of course):
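For reference, the register step boils down to something like the command below, run from the folder where gitlab-runner.exe lives. This is a hedged sketch rather than exactly what I typed – the URL and registration token come from the project’s Settings > CI/CD > Runners page, and the description and tag values are just placeholders:

# Non-interactive registration of a Windows runner using the shell executor (PowerShell)
.\gitlab-runner.exe register `
    --non-interactive `
    --url "https://gitlab.com/" `
    --registration-token "YOUR_PROJECT_REGISTRATION_TOKEN" `
    --executor "shell" `
    --shell "powershell" `
    --description "Laptop SQL runner" `
    --tag-list "sql"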

I’ve got the runner on the machine and I’m using an instance of SQL Server to build against, so now I just need the YAML file (fortunately GitLab has full documentation for how to structure this as well!)

So that I can build my project, I’m going to need to know where the repo is cloned to during the process (i.e. to find the .sqlproj file), so I went looking and managed to find a list of environment variables that can be used in the YAML file. Just to be sure though, I created and committed the most basic YAML file that would just echo back the location of the cloned files:

stages:
  - DatabaseBuild

Build:
  stage: DatabaseBuild
  tags: 
   - sql
  script: 
   - echo $env:CI_PROJECT_DIR

After this confirmed the environment variable worked correctly and the build pipeline was being fired up correctly on my private runner, I tried something a little more ambitious: building the .sqlproj file, using the cmdlet reference from the SQL Change Automation documentation for help:

stages:
  - DatabaseBuild

Build:
  stage: DatabaseBuild
  tags: 
   - sql
  script: 
   - $projectPath = $env:CI_PROJECT_DIR
   - $project = "$projectPath\DoggosAreCoolDB\DoggosAreCoolDB.sqlproj"
   - $validatedProject = $project | Invoke-DatabaseBuild -TemporaryDatabaseServer "Data Source=PSE-LT-CHRISU\"

and we successfully built a database!

All that’s left to do now is 2 things:

  1. Create a NuGet package as part of the CI build
  2. Release the database changes to the target DB

I’m still using the same machine for the release portion too, so naturally I can use the same runner for this; if you have other servers you’re deploying to you will of course need additional runners.

We can very easily extend what we already have in our YAML file by just telling the process to create and export a new build artifact – I’m going to name it the same as everything else, and then append the BuildId to the end of the file so we always get something unique:

stages:
  - DatabaseBuild

Build:
  stage: DatabaseBuild
  tags: 
   - sql
  script: 
   - $projectPath = $env:CI_PROJECT_DIR
   - $project = "$projectPath\DoggosAreCoolDB\DoggosAreCoolDB.sqlproj"
   - $validatedProject = $project | Invoke-DatabaseBuild -TemporaryDatabaseServer "Data Source=PSE-LT-CHRISU\"
   - $buildArtifact = $validatedProject | New-DatabaseBuildArtifact -PackageId DoggosAreCool.Database -PackageVersion 1.$env:CI_JOB_ID
   - $buildArtifact | Export-DatabaseBuildArtifact -Path "$env:CI_PROJECT_DIR\Export"
  artifacts:
    paths:
     - $env:CI_PROJECT_DIR\Export\DoggosAreCool.Database.1.$env:CI_JOB_ID.nupkg
    expire_in: 1 week

You’ll notice how I’m exporting the NuGet package to the project directory and then uploading it; this is so that we’ll have access to it for the release, but also so that we can use the artifacts argument in our YAML to upload the file and make it a downloadable package through the GitLab interface (if you go to that SPECIFIC job):

Whilst we’re on a roll here (and things haven’t gone wrong for a while) I’m going to add 2 additional stages ALL AT ONCE to both “Create a Database Release Artifact” and “Deploy from a Database Release Artifact” using, once again, the SQL Change Automation PowerShell cmdlets.

Woo-Hoo! I’m invincible!

I broke it.

Can you see what I did wrong? The error is:

New-DatabaseReleaseArtifact : The specified value for the Source parameter is neither a valid
database connection string nor a path to an existing NuGet package file or scripts folder:
'CI_PROJECT_DIR\Export\DoggosAreCool.Database.1.725147351.nupkg'

So, 2 fun things. 1) I forgot to reference the environment variable properly at one point, so it was just looking for the literal name of the variable in the path, and 2) it kept erroring out saying my NuGet file isn’t a NuGet file, weird right?

On further inspection it is yet another mistake I made. I’m using the job ID to name the NuGet package, which means when it tries to find the file it’s 2 steps ahead because each stage is counted as a different job! Duh!

sylvester stallone facepalm GIF

A few quick changes should hopefully sort this out! I’m going to put the instance of the pipeline ID in ($env:CI_PIPELINE_ID) and see if that makes a difference!

Wait. Did it just say the pipeline ran? SUCCESSFULLY? That’s exactly what it said! We can verify that this actually happened as well by checking the DatabaseDeploymentResources folder for the Release Artifact to Integration:

And everything is there! Note you won’t have a changes.html report just yet because this is the first time we’ve successfully deployed to Integration, however if we run 1 more change through (I’ll add a stored procedure):

Boom.

Prince Harry Mic Drop GIF

Now of course we can add additional stages to this, for manual intervention or to promote to other environments, but I’m going to call it a win here and retire (until the next post) gracefully. I’m sure you’re all wondering what my final YAML file looked like too – well (counterintuitively) I’ve popped it all into GitHub for you and pasted it below. Enjoy!

stages:
  - DatabaseBuild
  - CreateRelease
  - DeployToIntegration

Build:
  stage: DatabaseBuild
  tags: 
   - sql
  script: 
   - $projectPath = $env:CI_PROJECT_DIR
   - $project = "$projectPath\DoggosAreCoolDB\DoggosAreCoolDB.sqlproj"
   - echo "Building project $project"
   - $validatedProject = $project | Invoke-DatabaseBuild -TemporaryDatabaseServer "Data Source=PSE-LT-CHRISU\"
   - $buildArtifact = $validatedProject | New-DatabaseBuildArtifact -PackageId DoggosAreCool.Database -PackageVersion 1.$env:CI_PIPELINE_ID
   - echo "Exporting artifact to $env:CI_PROJECT_DIR\Export"
   - $buildArtifact | Export-DatabaseBuildArtifact -Path "$env:CI_PROJECT_DIR\Export"
  artifacts:
    paths:
     - $env:CI_PROJECT_DIR\Export\DoggosAreCool.Database.1.$env:CI_PIPELINE_ID.nupkg
    expire_in: 1 week

CreateRelease:
  stage: CreateRelease
  tags: 
   - sql
  script: 
   - $integrationDB = New-DatabaseConnection -ServerInstance "PSE-LT-CHRISU\" -Database "BlogsDotRedgate_Integration"
   - $buildArtifact = "$env:CI_PROJECT_DIR\Export\DoggosAreCool.Database.1.$env:CI_PIPELINE_ID.nupkg"
   - echo "Creating Release Artifact for DoggosAreCuteDB - check C:\DatabaseDeploymentResources\DoggosAreCuteInc\ReleaseArtifacts\$env:CI_PIPELINE_ID\Integration for more information"
   - $releaseArtifact = New-DatabaseReleaseArtifact -Source $buildArtifact -Target $integrationDB
   - $releaseArtifact | Export-DatabaseReleaseArtifact -Path "C:\DatabaseDeploymentResources\DoggosAreCuteInc\ReleaseArtifacts\$env:CI_PIPELINE_ID\Integration" -Format Folder

Integration:
  stage: DeployToIntegration
  tags: 
   - sql
  script: 
   - $integrationDB = New-DatabaseConnection -ServerInstance "PSE-LT-CHRISU\" -Database "BlogsDotRedgate_Integration"
   - echo "Deploying changes to Integration"
   - Import-DatabaseReleaseArtifact -Path "C:\DatabaseDeploymentResources\DoggosAreCuteInc\ReleaseArtifacts\$env:CI_PIPELINE_ID\Integration" | Use-DatabaseReleaseArtifact -DeployTo $integrationDB

Using things weirdly – Part 2: Static Data in the Hybrid Model (w/ Redgate SQL Source Control and Change Automation)

“Where’s your will to be weird?”
Jim Morrison

I had some really positive feedback on my last “Using things weirdly” post, and truth be told, I love to use things weirdly. The number of times I’ve heard: “Oh, well, sure yeah I guess it works that way too…” is just too many to count. So imagine how my eyes lit up when I realized that you can do something weird with one of my favorite things to talk about at the moment, The Hybrid Model.

Now if you don’t know about the Hybrid Model then I would suggest you check out my post here that’s all about the different source control models available for your databases!

The Problem

Across both the State based and Migrations based offerings within the SQL Toolbelt, you have access to something very cool: the ability to easily (alongside the schema) source control your static data. Now don’t ‘@’ me because you think I should be referring to it as “Semi-Static” because it might change occasionally and ‘that’s not truly static then is it?‘; I could easily also refer to it as ‘Lookup Data’ or ‘Reference Data’, basically whatever you class things like “Country Codes” and “Currency Codes” as.

Whatever you call it, it can go into your VCS like any part of the database schema:

SQL Source Control: State Based
SQL Change Automation: Migrations Based

But. One thing that – as of writing this RIGHT NOW (23/07/2020 10:09am BST) – is not officially available in the Hybrid Model combining these two methods… is Static Data. The Data tab in SQL Change Automation even disappears when you set it up as a Hybrid workflow:

And this gives me a sad.

sad monty python and the holy grail GIF

The Solution?

Got your Hybrid Model setup and ready to go? Let’s use it weirdly!

1 – Use SQL Source Control to commit some static data to your state repository. This is as easy as right clicking on your (highlighted green) source control linked database and selecting “Other SQL Source Control Tasks” > “Link/Unlink Static Data“, and picking your tables.

Nothing should be showing in SQL Change Automation:

2 – Unlink SQL Change Automation from the state repository for a moment and link it instead directly to the development database. This will cause it to go into Migrations-First mode. You can do this by clicking the blue source name in SSMS and picking Existing Database instead:

Because it’s technically the same database as you’re source controlling in the state repository, it should all just work™ and should tell you there are no dev changes to the source. Then you’ll see the “Data” tab has been enabled:

3 – Select the same tables to source control as you did with SQL Source Control by using the Add Tables wizard:

BUT WAIT!!

shocked oh my god GIF

Isn’t it now going to generate a migration for our static data?? This isn’t included in the baseline or anything at all so far, so is it going to try and insert all of my static data into tables later on that already have it?

No. Actually we’re fine!

4 – Generate the static data migration script (and look at it for peace of mind). Notice that the script actually has checks in there – because we’re newly adding these tables, the migration will check if the tables are empty before trying to run the script, and only AFTER this migration will SQL Change Automation start generating the differential, incremental static data migrations:

Commit this migration script to your migrations repository, and that’s all we need to do here!

5 – All that’s left now, is to re-hook-up the Hybrid pipeline, follow the same steps you did before in Step 2 but this time, instead of an existing database, just link it back to the state repository like it was before. If you’ve done this right, you should see no changes pending for migrations:

BUT you will notice that if you change any static data with SQL Source Control, it should now show up in SQL Change Automation!

Change to static data prior to commit in SQL Source Control
Change before migration script generation in SQL Change Automation
Generated migration script to be committed

Conclusion

Is it an intended use? No, absolutely not. The reason it’s disabled is that, as with all things at Redgate, features are considered heavily to ensure users are offered the best possible user experience, functionality and essentially something that meets requirements across the board.

But. We can use it weirdly to, as I say, just make it work™.

What have you used weirdly lately? Let me know!

Classification by design: Including data cataloging in a DevOps process

“The most powerful tool we have as developers is automation.”
Scott Hanselman

It is no secret that I love to talk about data protection, specifically from the perspective of structured data. When we talk about database development practices, we find ourselves talking about 3 things most often:

  • Source Control
  • Continuous Integration and Continuous Delivery/Deployment (CI/CD)
  • Estate Monitoring

Some people refer to this as “DataOps“, others refer to it as “DevDataOps” but in reality, it’s all DevOps guys. This may be an unpopular opinion (and if it clashes with yours please forgive me, it’s just my opinion) but just because a certain niche area hasn’t been specifically called out within a subset of DevOps doesn’t mean you have to invent your own term for it!

Now this leads me on to DevSecOps, or as I like to call it… More secure DevOps.

rimshot GIF

No but seriously this is a slightly different case – DevSecOps is like DevOps but fortified with security from the ground up. There’s a fantastic article and diagram of this on Plutora from Mark Robinson of how this looks (below) and if you haven’t read his article I would definitely go and give it a read!

DevSecOps Diagram

Good DevOps practice is a combination of different things working together: the right mentality, the principles, the processes and the amazing tools at our disposal like automation – and all of this includes security from the ground up too. DevOps is about putting those principles and practices in place to strengthen the pipeline, so why don’t we treat security in the same way?

Take, for example, 3 pieces of legislation that have been very much in the spotlight:

The controller shall implement appropriate technical and organisational measures for ensuring that, by default, only personal data which are necessary for each specific purpose of the processing are processed.
GDPR (Europe) Art. 25 “Data protection by design and by default”

Processing agents shall adopt security, technical and administrative measures able to protect personal data from unauthorized accesses and accidental or unlawful situations of destruction, loss, alteration, communication or any type of improper or unlawful processing.
– LGPD (Brazil) Chapter VII, Art. 46 “Security and Secrecy of Data”

“A Controller or Processor is required to implement appropriate technical and organisational measures to demonstrate that Processing is performed in accordance with this Law…”
DIFC LAW NO. 5 OF 2020 (Dubai) Part 2D, Art. 14 (2) “Accountability and notification”

There’s a common running theme here and although lots of global legislation will either allude to, or directly tell you ways you can be compliant and what some of these “organizational” and “technical” measures are, it’s still pretty blurry.

How do we know what we can do? How do we know what “default” and “design” mean in this context? Well, we build it into the DevOps process.

Now I could sit here forever and talk about why transforming your database development, deployment and provisioning processes allows us to be more secure, but that’s a lot of material and it might have to come in chunks! So what we’re going to focus on today is as the title suggests: Data Classification and Cataloging.

Why is Cataloging important?

Cataloging structured data is incredibly important because it can be one of the first steps we take to securing sensitive Personally Identifiable Information (PII) or Protected Health Information (PHI) wherever it exists across our database environments. It allows us to make strengthened, contextual decisions about the data we hold including how we treat it in pre-Production, how long we retain it for and which systems and processes consume it.

But the most important part of this is simply: it tells us where the risk is.

Read through any of the most recent data protection laws and you will notice that a few things come up quite a lot including “Data Protection Impact Assessment“, or DPIA. Effectively if you can assess the risk of processing activities you can more readily answer the data protection questions and challenges you may face.

Knowing where your data resides can be the first step to helping you assess this risk, and to more readily answer your own data questions. If you want to read more about Cataloging specifically and why it is useful, you can read more about it on my previous blog here.

Where does Cataloging fit into DevOps?

This one is simple to answer. Once you have fully classified your entire estate, you’re not done. No, if you’re a development house or indeed even a single developer – if you are making any schema changes to the tables holding that sensitive data, you’re never done.

The reason for this is that Cataloging is an evergreen activity – if you update tables by removing columns, adding columns, splitting tables, adding tables… anything! Well then you need to be ready to make sure that you are:

a) Prepared and equipped with knowledge of the tables you’re working on and if this is a high risk activity.

b) Updating classification information to reflect the new “truth”, i.e. if you’re adding a column that will collect people’s Twitter handles, then that column should be classified as sensitive, and this should be reflected the moment it is deployed to Production.

So it is important to have the correct people working on this, with the right knowledge, preparation and processes and using the correct tools ensuring that those updates are persisted properly and securely through your deployment pipelines.

Huh… people, processes and tools… That sounds familiar!

The Process: SQL Data Catalog, SQL Change Automation and Azure DevOps

For this little experiment of mine I used Redgate’s SQL Change Automation (Migrations First approach in SQL Server Management Studio) and SQL Data Catalog to both develop & deploy and classify/categorize respectively, and for simple version control and orchestration of this pipeline I opted for Azure DevOps (with SQL Change Automation CI/CD plugins):

NOTE: Heads up, all of the code I used for this can be found right here in my GitHub, feel free to have at it!

Step 1) Set up a SQL Change Automation project with an Azure DevOps Git repo, then create the YAML file to build it, and create a deployment process to Acceptance and then Production.

Ok – that’s a bit of a mouthful and a massive ask. There are a lot of things there you have to be familiar with, but we don’t have time to go into them right here. BUT fortunately if you check out Redgate University right here, my blog post on using Change Automation with Azure DB, and Redgate’s Product Learning section, you’ll be a Database DevOps ‘Whizz-Kid’ in no time!

Fast forward a little and I had my example databases, VCS and pipeline all up and running:

Step 2) The “Theory”: This is where things get interesting. So we have an example pipeline set up and we are able to completely deploy all the way through to “Production”, so let’s talk theory.

In SQL Data Catalog I have covered both my Production and Acceptance Databases:

Now, in development we don’t make changes directly to Production, so why should Classification be any different? Now how you adapt the above code is up to you, feel free to split it, move it around, incorporate it into Pull Requests if you want to… But I’m going with a bit more of a simple situation.

Situation: A developer makes a change in Development, which gets committed, reviewed and merged to the main branch, resulting in a build and a deployment, in this case to Acceptance, and it is then later deployed to Production.

Now, by Acceptance we should only have the “good work”, i.e. all of our testing is shifted left within DevOps so Acceptance is basically the last stop before Production. Therefore we should classify the work we have done on Acceptance, crucially, before it gets to Production and starts gathering sensitive data, and then copy this classification up on deployment.

Ideal: We should have no columns on Production that have not been classified.

Step 3) – In Practice: Fortunately it’s very easy to automate a lot of these steps with SQL Data Catalog, utilizing its PowerShell cmdlets and REST API. The cmdlets are fully documented and very easy to use (docs here). This allows us to easily scan, classify and copy classifications up to other databases, but we’ll also need to do some checks as part of the deployment pipeline and report any discrepancies so they can be investigated.

  • Are there any columns on Acceptance that aren’t classified but have been deployed to Production? (failure to comply with process)
  • Are there any columns on Production that have not been classified? (classification drift)
  • Are there any unclassified columns on Acceptance that have not yet been deployed to Production (for pipeline hygiene purposes)?

The other part of this ‘fun’ is reporting what has been changed in the same process. Now fortunately SQL Change Automation spits out a Changes.json file with its Release Artifacts and we can steal that away and find out how many tables have been created or changed in this release and report that back so we can correlate what has been done and what is missing:
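If you want a feel for how that hangs together before opening the full script, the core of it is just standard PowerShell like the sketch below. The path is a placeholder and the exact shape of the Changes.json object depends on your export location and SQL Change Automation version, so inspect the real file rather than trusting my guesses:

# Sketch only: load the Changes.json produced alongside the release artifact so we can report on it.
# The path below is a placeholder and the object's property names vary by SCA version - inspect the file first.
$changesPath = 'C:\DatabaseDeploymentResources\Changes.json'
$changes = Get-Content -Raw -Path $changesPath | ConvertFrom-Json

# From here the full script walks the change entries, writes an "(Information) Table x was modified
# in this deployment" line for each table, and feeds that table list into the classification checks above.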

So actually getting this up and running is just going to require 3 things:

  1. The PowerShell script from GitHub (or your own personalized variant) as a step in your production deployment
  2. Data Catalog available and pointed at Acceptance and Production (or your versions of these environments)
  3. Variables set in Azure DevOps to fill the gaps (e.g. Where is Data Catalog? What’s my PowerShell Auth token? What are my Acceptance and PROD DBs called? etc.)

3 is the last step there so you’ll need something like this to run the script:

  • DatabaseDeploymentJSON – where the JSON file will be with the latest changes in the Prod release
  • DataCatalogAuthToken – Your PowerShell Auth token from Settings in Data Catalog
  • DataCatalogUrl – The full URL to your Data Catalog installation, missing the “\” at the end (ending :15156)
  • ExportPath (Optional) – I specified the path for my Database Deployment Resources to save typing it out in the Redgate plugins
  • ProdDB / StageDB – As you would expect, the Production and Acceptance/Staging DBs you’re deploying to/from
  • ProdInstance / StageInstance – As above, except the instance the Database are located on

In the variables above the Instance and DB names are purely used within Data Catalog, so there’s no need to worry about anything happening to the actual databases themselves!
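As a side note on how those variables reach the script: in an inline PowerShell step, Azure DevOps exposes non-secret pipeline variables as upper-cased environment variables automatically, while a secret variable (a sensible place to keep the auth token) has to be mapped into the task’s environment explicitly. A rough sketch of reading them, using the same names as above:

# Sketch: picking the pipeline variables up inside the inline PowerShell step
$dataCatalogUrl = $env:DATACATALOGURL          # e.g. http://your-catalog-server:15156
$authToken      = $env:DATACATALOGAUTHTOKEN    # map this explicitly in the task if it is stored as a secret
$prodDb         = $env:PRODDB
$stageDb        = $env:STAGEDB
# The script then authenticates against $dataCatalogUrl with $authToken and compares the
# classifications recorded for $stageDb and $prodDb, flagging anything unclassified.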

Once you’ve run through the deployment pipeline a couple of times and the changes.json file is being produced, you can go ahead and copy the script into an inline PowerShell script step in your release and you should find it will fire to life! I simulated an example by modifying my Contacts table and my Articles table, adding 1 column each and deploying both to Acceptance. I then classified just 1 of these in Acceptance in Data Catalog:

and then approved the deployment to Production and tada!

Ok you probably can’t make all that out, but it effectively says:

(Information) Table dbo.Articles was modified in this deployment.
(Information) Table dbo.Contacts was modified in this deployment.

That much we knew!

1 column(s) with classifications were discovered on VoiceOfTheDBA Acceptance that are not classified in VoiceOfTheDBA Production:
dbo.Articles.TestingPineapple

Excellent, we classified that one so it gets copied up and we can verify that in data catalog against Production:

and finally, we get a warning about Production now containing unclassified columns:

(Alert) The following columns have been discovered on VoiceOfTheDBA Production that require classification:

dbo.Contacts.TestingPineapple

You should classify these columns in VoiceOfTheDBA Acceptance prior to the next deployment.

Just as we expected. Success!

Happy Tom And Jerry GIF

Conclusion

Classification and categorization belong as part of DevOps. If you expect the context for your business decisions around data to remain evergreen and informed, then it cannot sit on the shoulders of one or two people to support it, and it cannot live in a manually updated Excel sheet or document.

By including it within the DevOps process, not only do you add an additional layer of security but you also make it an automated, team activity that can be audited, checked and easily kept up to date.

Is this DevSecOps? Well… not really no. Is this a more secure approach to Database DevOps? Absolutely! Happy DevOpsing!

(SQL) Change ALL the Azure SQL Database Automation!

“But I can hardly sit still. I keep fidgeting, crossing one leg and then the other. I feel like I could throw off sparks, or break a window–maybe rearrange all the furniture.”
Raymond Carver

I understand that starting off a blog about Azure SQL Database with the above quote is a little weird, but honestly I’m _really_ excited about what I’m about to tell you.

***Note before starting: This blog post assumes you’re familiar with the concepts of Database Source Control, CI and CD, Azure SQL Database and pipelines within Azure DevOps, otherwise here be dragons.***

I am a huge fan of SQL Change Automation – mostly because of the migrations functionality. In my mind it represents an ideal workflow for making complex SQL Server database changes. If you’re not sure about the different models (State, Migrations, Hybrid), take a look at my blog post from last week here! But until now it has had one thing that I could not easily do with it… Platform as a Service, Azure SQL DB.

Now don’t get me wrong, SQL Change Automation could easily deploy to Azure SQL Database but I had a problem. The words:

“Chris, how do we benefit from the migrations approach and put the shadow database and build DB in Azure SQL too? We don’t have any local instances or VMs we can use for this, and Dev, Test and Prod are all in PaaS!”

elicited this response:

cry crying GIF

But. No. Longer.

Now for those of you who don’t know, the _SHADOW_ database that SQL Change Automation creates is effectively a schema-and-static-data-only copy of your database. It is dropped and rebuilt each time you verify, to ensure that all of the migrations run successfully, so you can check your work and shift the build left (!!) before you even check into source control.

The shadow database and the build database had one thing in common: you couldn’t host either of them in Azure SQL DB, which left two choices:

  • Use an instance of SQL Server – maybe Developer edition locally for the shadow, and a VM in Azure or an on-prem hosted instance for the build
  • [For the build specifically] Use LocalDB. Not advisable if your database contains any objects not supported by LocalDB, because (juuuust in case you didn’t know) it is based on SQL Server Express.

But on May 12th 2020 (and I only found out about this like 2 weeks ago) the SQL Change Automation team at Redgate released version 4.2.20133 of the SSMS plugin, which included a few super cool things like additional Azure SQL support and the Custom Provisioning Scripts feature.*

excited excitement GIF

Now this is great because not only can we now easily create SQL Clones to be used as the development source (and I’ll blog about THAT a little later), but you can also use an Azure SQL DB for the shadow AND a persistent Azure SQL DB for the CI build as well!

Now unfortunately Kendra kinda beat me to the punch here: she produced a fabulous 3-part video series on using SQL Change Automation solely with Azure SQL DB, which you can view here if you don’t want to see me try it out:

Getting set up

The first thing I did was make sure that I had all of the necessary environments to try this out – I created 3 Azure SQL Databases to mimic Development, Build and Production environments on 2 separate Prod and Non-Prod Azure SQL Servers. I ran the DMDatabase prep scripts (you can find these here) to set up Dev_Chris and Production, but left BuildDB empty.
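
If you'd rather script that environment setup than click through the portal, something along these lines with the Az PowerShell module would do it. The resource group and server names here are placeholders, and Basic is just a cheap tier for a demo:

```powershell
# Rough sketch: create the three demo databases with the Az.Sql module.
# Assumes Install-Module Az.Sql has been run and Connect-AzAccount has been called.
# Resource group and server names below are placeholders - use your own.
$rg = "rg-dmdatabase-demo"

New-AzSqlDatabase -ResourceGroupName $rg -ServerName "dmdatabase-nonprod" -DatabaseName "Dev_Chris"  -Edition "Basic"
New-AzSqlDatabase -ResourceGroupName $rg -ServerName "dmdatabase-nonprod" -DatabaseName "BuildDB"    -Edition "Basic"
New-AzSqlDatabase -ResourceGroupName $rg -ServerName "dmdatabase-prod"    -DatabaseName "Production" -Edition "Basic"
```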

Next it was time to create my project, so I hopped over into Azure DevOps and created a new project, initialized it with a README and then cloned it down onto my local environment:

Everything was ready to go, so it was time to create the SQL Change Automation project itself!

Setting up SQL Change Automation in SSMS

*cough* or if you’re me, update it first because you’re on a REALLY old version *cough*

Then I hit “Create a New Project” and it let me simply specify the connection string to the Dev Azure SQL DB, with the project location set to the checked-out local repo:

I didn’t change any of the options because I’m a rebel and I didn’t feel like filtering anything out! But of course now comes the fun bit… the baseline. I chose my Production Azure SQL DB as it’s my only upstream DB at this point, and it was time to hit “Create Project”.

…and Huzzah! It’s worked and we’re all good!

excited andrew garfield GIF by The Academy Awards

Now… that’s actually not the best bit! The reason why Andrew there is clapping so hard? Well, a little piece of magic has happened in the background! A shadow database has been created for me automatically against my Azure server, using the same connection string as dev!

Now… one thing to check, and I didn’t think to do this: you can specify the shadow connection string in the SQL Change Automation user file. I left mine alone for a bit, not realizing it had created an Azure SQL DB for the shadow at a CONSIDERABLY higher tier than my dev environment (bye bye Azure credit!). Fortunately I was able to scale it down quickly to Basic and that has stuck, but be warned!
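
If you get caught out the same way, a quick fix is to knock the shadow database down to a cheaper tier with the Az PowerShell module (you could equally do it in the portal). The resource group, server and shadow database names below are placeholders, so check what SQL Change Automation actually called yours:

```powershell
# Sketch: scale the auto-created shadow database down to Basic.
# All names below are placeholders - confirm the real shadow DB name first.
Set-AzSqlDatabase -ResourceGroupName "rg-dmdatabase-demo" `
                  -ServerName "dmdatabase-nonprod" `
                  -DatabaseName "Dev_Chris_SHADOW" `
                  -Edition "Basic" `
                  -RequestedServiceObjectiveName "Basic"
```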

So I did what all ‘good devs’ would do now… I committed and pushed my initial commit directly to my main branch! (Don’t tell my boss!)

and my database sat safely in Azure DevOps:

Setting up the build and deployment stages

This bit was actually just as easy. I used to hate YAML but thanks to a certain (wonderful) Alex Yates I jumped in anyway and it turned out to be just fine!

I created a new basic YAML file within Azure DevOps (and used the assistant to auto-populate the Redgate defaults – if you don’t know YAML or what it can do already, there’s a really good MS article here) and committed it to the main branch again (whoopsie). The only component was the SQL Change Automation plugin, which I pulled in from the Azure DevOps marketplace, and I configured the build to target my “nonprod” server and the Build DB I had created previously.

On saving and running, the pipeline succeeded!

All that was left to do was to create a Release Pipeline. So naturally, I jumped straight in, created a new pipeline, started with an empty job and called it Production. (*Note*: make sure you also choose your build artifact before configuring your release stage, by clicking the “Add an artifact” option!):

I added the SQL Change Automation: Release step to the agent job (note: because this is all hosted, I’m using an Azure DevOps hosted agent to do this step):

Now you’ll need to add two steps (both using the SQL Change Automation: Release plugin) at this point: a “Create Release” and a “Deploy from Database Release Artifact”. The first looks at the target and figures everything out for you, so you can review exactly what will be deployed; the second actually _does_ the deployment:

From here you just have to specify the options available, like in this wonderful walk-through here from the fabulous Chris Kerswell of DBAle fame! For me, this was simply targeting my Production Azure SQL Database.

You’ll definitely want to use the project variables to pick up the right package, and also leave the export path blank in both steps for now:

You can clone the step by right-clicking instead if you want to, which will preserve all the connections you’ve already provided! Then, once it’s all pointed at the right place, save and queue the release!

And of course, we were successful:

and then finally, with a couple of triggers set to automatically build and deploy, I made a change to my Contacts table in my Azure SQL Dev DB. A few minutes later, thanks to Azure DevOps and Redgate SQL Change Automation, the very same change appeared in Production, with no reliance on anything other than Azure SQL DB and SQL Change Automation:

Before the DevOps process on Dev, ready for a migration to be generated
After: Automatic post-build deployment of the new column to the Production Azure SQL Database

Conclusion

If you have all of your databases in Azure SQL Database**, fear not, because SQL Change Automation is here to the rescue! You can very easily set up and configure a pipeline in Azure DevOps (or indeed any pipeline tool of your choice), and it’s never been easier to persist development changes all the way through to Production in a low-risk, incremental, “DevOps” way!

—NOTES—

*An important word from the release notes: Note that it is still generally recommended to locate the shadow database locally where possible as that will usually result in a faster database connection. The default CreateDatabase.sql and DropDatabase.sql scripts can be altered to improve performance or implement custom provisioning logic.

**If you have all of your Databases in Azure and you need them masked for Dev/Test too, check out this previous blog post in which I outlined how to do that using Azure DevOps too!