Sport isn’t necessarily a metaphor for everything, but I cannot help but find similarities between my adventures in sport and the technologies I work with every day. I recently battled through the process of extracting data from an OAuth 2.0 API with Azure Data Factory, something I’ve been struggling with for some time. When I finally got it to work, my involvement with the sport of rugby provided the perfect metaphor for the process.

Cadence is everything

Photo by Chino Rocha on Unsplash

The scrum in rugby is a set phase (restart) that usually happens after a team has made a mistake, like passing or dropping the ball forward. The sixteen biggest guys on the field (eight from each side) pack together in a very specific formation, and it’s pretty much an exercise in brute strength and technique to see who can win the ball back for their team.

To ensure the safety of all the players involved, referees have a very specific series of commands (a cadence) they relay to the players: Crouch, Bind, Set. There is a little pause between each command, and if either the referee or the players feel that something is out of place, everyone will stand up and start again. We call that a reset.

Each team also has its own cadence in preparation for the scrum, and as the referee I will have a conversation with each team before the game to ask what their cadence is. This is important because I need to give them time to get ready, and once they are, I can start my cadence to get everything underway.

What does this have to do with OAuth 2.0?

The thing that makes working with OAuth 2.0 APIs difficult is the cadence. Every vendor implements it in a slightly different way, like each rugby team’s preparation for the scrum, and if you don’t get everything right it just doesn’t work… often without a good explanation.
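
To make that concrete, here is a minimal sketch in Python using the requests library (the endpoint, client ID and secret are all hypothetical). The OAuth 2.0 spec itself allows a token request to carry the client credentials either in the request body or in an HTTP Basic header, and many vendors only accept one of the two:

```python
import requests

# Hypothetical endpoint and app details -- every vendor publishes their own.
TOKEN_URL = "https://api.example-vendor.com/oauth2/token"
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"

# Cadence A: some vendors expect the client credentials in the POST body.
response = requests.post(
    TOKEN_URL,
    data={
        "grant_type": "client_credentials",
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    },
)

# Cadence B: others insist on an HTTP Basic authorization header instead,
# and will reject a request that puts the credentials in the body (or vice versa).
response = requests.post(
    TOKEN_URL,
    data={"grant_type": "client_credentials"},
    auth=(CLIENT_ID, CLIENT_SECRET),
)

print(response.json())  # typically contains access_token, token_type and expires_in
```

Get the placement wrong for your particular vendor and you will usually see nothing more helpful than a 400 or 401 response.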

Adding an ETL tool to the mix makes things even more complicated, because the initial authorization flow requires somebody (a user) to interactively log in and authenticate. And that’s where we usually get stuck: we try to do that first step in the ETL tool as well… and it just doesn’t work.
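
To show why, here is a minimal sketch of that first step of the authorization code flow (Python, with hypothetical endpoint and client details). Notice that the output is a URL a human has to open in a browser, not a request a pipeline can make on its own:

```python
from urllib.parse import urlencode

# Hypothetical values -- the real ones come from the vendor's app registration.
AUTHORIZE_URL = "https://api.example-vendor.com/oauth2/authorize"
CLIENT_ID = "my-client-id"
REDIRECT_URI = "https://localhost/callback"

# Step one of the authorization code flow: build the URL that a *user* must
# open in a browser to log in and consent. No tool can do this part
# unattended, which is exactly where ETL pipelines get stuck.
params = {
    "response_type": "code",
    "client_id": CLIENT_ID,
    "redirect_uri": REDIRECT_URI,
    "scope": "read",    # vendor-specific
    "state": "xyz123",  # anti-forgery value, verified on the way back
}
print(f"{AUTHORIZE_URL}?{urlencode(params)}")

# After logging in, the browser is redirected to REDIRECT_URI with
# ?code=...&state=... appended; that one-time code is then exchanged for tokens.
```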

The series

As you can imagine, there are quite a few moving pieces to this puzzle, and we’ll deal with each one in a separate post to make the reading a bit more manageable. The blog series will cover the following topics:

  1. The authorization flow
  2. Using Postman to get tokens and test API requests
  3. The ADF linked service and dataset
  4. Refreshing tokens (previewed in the sketch after this list)
  5. Extracting data
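
As a small preview of where the series is headed, here is a sketch of that refresh-token exchange (Python again, with hypothetical endpoint and credentials). It is the non-interactive half of the flow, and the part a pipeline can safely automate once the one-time interactive authorization is done:

```python
import requests

# Hypothetical values -- substitute your vendor's token endpoint and app details.
TOKEN_URL = "https://api.example-vendor.com/oauth2/token"
CLIENT_ID = "my-client-id"
CLIENT_SECRET = "my-client-secret"
REFRESH_TOKEN = "refresh-token-saved-from-the-one-time-interactive-login"

# Exchange the refresh token for a fresh access token. Unlike the initial
# authorization, no user is involved, so a scheduled pipeline (such as an
# ADF pipeline) can repeat this on every run.
response = requests.post(
    TOKEN_URL,
    data={
        "grant_type": "refresh_token",
        "refresh_token": REFRESH_TOKEN,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    },
)
tokens = response.json()
access_token = tokens["access_token"]

# Some vendors rotate the refresh token on every exchange; if yours does,
# persist the new one or the next run will fail.
```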

4 thoughts on “Working with OAuth 2.0 APIs in Azure Data Factory: A series”

  1. Dirk Grabenhorst says:

    Hello Martin,
    Nice article, thank you for publishing it.
    Do you also know how to access, from ADF, external sources that only support OAuth 2.0 with the OAuth SAML type?
    Thank you,
    Dirk

    1. Hi Dirk, I have not worked with SAML before and would unfortunately not be much help with that :-/

  2. Dan says:

    Stunning article and stunning information. However, both here and on YouTube, when the steps are created for the OAuth 2.0 linked REST service, you skip the most important steps, so I am unable to finalize the deployment. It would be much appreciated if you could include all the steps in ADF as well, end to end.

    cheers

    1. Hi Dan,

      The goal of the series was not to provide a step-by-step guide, but rather to point out the various nuances that you’ll encounter with APIs. If you’re having specific issues, posting a question here or on your favorite forum with what you’ve tried and what isn’t working may be a good way to go.
