This blog post is part of a “Working with OAuth 2.0 APIs in Azure Data Factory” series, and you can find a list of the other posts here.

As part of the authorization code flow you’ll receive two very important tokens: an access token and a refresh token. The access token is what you will use for authentication when sending API requests, but access tokens are only valid for a certain amount of time. How long the access token is valid for usually depends on the vendor, and it could be anything from a few minutes to a few hours.

Once the access token has expired, you’ll typically use the refresh token along with some other identifiers (which also differ depending on the API and vendor) to get a new access and refresh token. The need to refresh tokens periodically means that you have to build that into your ETL process somehow, and there are two ways in which you can approach this:

  1. Assume that a request failure is due to an expired token, and build some error logic into your process to then refresh the tokens.
  2. Assume that your token has expired every time the process runs, and refresh the tokens as a first step.

The first option is not completely fail-safe, because API requests could fail for other reasons too and the error messages are not always very helpful. If you were to build a loop into your error logic, your process may end up in an endless loop. Option two is not perfect either and will cause problems if you try to execute it concurrently as part of a parallel process, or if you have multiple long-running API requests that extend beyond the validity of the tokens.
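To make the endless-loop risk in option 1 concrete, here is a minimal Python sketch (outside of ADF) of a refresh-on-failure wrapper that caps the number of refresh attempts. The names TokenExpiredError, send_request and refresh_tokens are hypothetical placeholders for illustration, not part of any real library, and the sketch assumes you can reliably tell an expired-token failure apart from other failures, which is not always the case.

```python
class TokenExpiredError(Exception):
    """Hypothetical error raised when the API rejects an expired access token."""


def call_with_refresh(send_request, refresh_tokens, max_refreshes=1):
    """Run send_request(); on an expired-token error, refresh and retry.

    The number of refreshes is capped so a request that keeps failing
    cannot trap the process in an endless loop.
    """
    refreshes = 0
    while True:
        try:
            return send_request()
        except TokenExpiredError:
            if refreshes >= max_refreshes:
                raise  # give up and surface the error instead of looping forever
            refresh_tokens()
            refreshes += 1
```

Any other exception passes straight through without triggering a refresh, so an unrelated failure is not mistaken for an expired token.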

I usually implement token refresh as a first step (option 2) and before each series of API requests (i.e. ETL process). It seems like overkill, especially if you’re sending a bunch of requests…but I have found it to be the most trustworthy option and API servers don’t mind refreshing tokens often.

If you have long-running processes, you may even have to refresh the tokens before each request, and enforce sequential execution to avoid any concurrency pitfalls.
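If you were doing this in code rather than in a pipeline, one way to keep refreshes from overlapping is to put them behind a lock. The Python sketch below is a variation on the approach above rather than the same thing: instead of refreshing before every request, it refreshes only when the access token is close to expiring. The TokenStore class and the refresh_tokens callable (assumed to return the new access token and its lifetime in seconds) are hypothetical names for illustration.

```python
import threading
import time


class TokenStore:
    """Hands out an access token and refreshes it when it is about to expire."""

    def __init__(self, refresh_tokens, clock=time.monotonic, safety_margin=60.0):
        # refresh_tokens is a hypothetical callable returning
        # (access_token, expires_in_seconds).
        self._refresh_tokens = refresh_tokens
        self._clock = clock
        self._margin = safety_margin
        self._lock = threading.Lock()
        self._access_token = None
        self._expires_at = 0.0

    def get_access_token(self):
        # The lock ensures only one caller refreshes at a time; the others
        # wait and then reuse the token that was just fetched.
        with self._lock:
            expired = self._clock() >= self._expires_at - self._margin
            if self._access_token is None or expired:
                self._access_token, expires_in = self._refresh_tokens()
                self._expires_at = self._clock() + expires_in
            return self._access_token
```

The safety margin is there so that a token is not handed out a few seconds before it expires and then rejected mid-request.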

Here’s an example of my token refresh pipeline in ADF:

Saving tokens

Azure KeyVault is a good place to securely store your tokens, and I use it to store the client ID and secret too. Once you have the necessary items in your KeyVault, give the Azure Data Factory Managed Identity the necessary access to it, and follow this reference to extract specific items from your KeyVault.

Important: You are extracting sensitive information and sending it to a subsequent step in your process, and these values are sent to the logs in clear text unless you secure them. Make sure that you secure both the input and output of the steps that send/receive the sensitive information (image below).

The most important step is sending the request to refresh the tokens, and this is where our experimentation in Postman will pay dividends. From that blog post, we needed to do the following in order to get a new token:

  • Send an authorization header which contains the client ID and secret in base64-encoded format.
  • Send the grant type and refresh token in the body.

The details in Postman will help us troubleshoot any issues in Data Factory, as those will most likely be due to an incorrectly formatted request. Every little detail is important here, and even though it is not apparent at first, you also have to set the content type in the header (especially if you encode the values):

The expression to formulate the authorization header is the following, and as you can see we use the resulting output from the previous steps to concatenate the client ID and secret (separated by a colon), and encode the entire string with the base64() function:

Basic @{base64(concat(activity('Get Xero Client ID').output.value, ':', activity('Get Xero Client Secret').output.value))}
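If you want to check the header value outside of Data Factory (for example when comparing against what Postman sends), the same construction can be sketched in Python. This is only an illustration: the function name is made up, and it assumes the client ID and secret are plain UTF-8 strings.

```python
import base64


def basic_auth_header(client_id: str, client_secret: str) -> str:
    """Build the value of the Authorization header: 'Basic ' + base64('id:secret')."""
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

Note that the client ID and secret are joined with a colon first and encoded as one string; encoding them separately gives a different (and invalid) header.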

The expression for the request body looks like this:

grant_type=refresh_token&refresh_token=@{activity('Get Xero Refresh Token').output.value}
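For comparison, here is a rough Python sketch of the same request headers and body. It assumes the token endpoint expects a form-encoded body (application/x-www-form-urlencoded), which is what the OAuth 2.0 specification describes for token requests; check your vendor’s documentation to confirm. The function name is hypothetical. Unlike the ADF expression above, urlencode also percent-encodes any special characters in the refresh token.

```python
from urllib.parse import urlencode


def refresh_request_parts(refresh_token: str, auth_header: str):
    """Return the headers and form-encoded body for a token refresh request."""
    headers = {
        "Authorization": auth_header,
        # Assumed content type for OAuth 2.0 token requests.
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    })
    return headers, body
```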

The last two steps in the process will replace the tokens in Azure KeyVault, and we are now ready to extract some data from the API.
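As a rough illustration of what those last two steps send, the Python sketch below picks the tokens out of the refresh response and builds the PUT request for Key Vault. It is based on the settings described in the comments on this post (a base URL for the Key Vault secrets, the secret name, api-version=7.0 and a JSON body with a “value” property); the function names and the example vault URL are hypothetical, and authentication (the managed identity in ADF) is left out.

```python
import json


def extract_tokens(response: dict) -> dict:
    """Pick the two token values out of a parsed token response."""
    missing = [key for key in ("access_token", "refresh_token") if key not in response]
    if missing:
        # Some APIs or grant types do not return a refresh token at all.
        raise KeyError(f"token response is missing: {', '.join(missing)}")
    return {
        "access_token": response["access_token"],
        "refresh_token": response["refresh_token"],
    }


def key_vault_put_request(secrets_base_url, secret_name, new_value, api_version="7.0"):
    """Build the method, URL and JSON body for saving a secret value."""
    url = f"{secrets_base_url}{secret_name}?api-version={api_version}"
    body = json.dumps({"value": new_value})
    return "PUT", url, body


# Example with a made-up vault URL and token values.
tokens = extract_tokens({"access_token": "a1", "refresh_token": "r1", "expires_in": 1800})
method, url, body = key_vault_put_request(
    "https://example-vault.vault.azure.net/secrets/",
    "xero-refresh-token",
    tokens["refresh_token"],
)
```

Failing loudly when a token is missing from the response makes it easier to spot a problem with the refresh request before an empty value is written over a working secret.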

23 thoughts on “Working with OAuth 2.0 APIs in Azure Data Factory: Refreshing tokens”

  1. Great blog Martin – I have been trying and struggling to do precisely this. We are looking to extract data from Xero into an Azure Blob container (where we can pick it up with our management reporting tool).

    One question – would you be able to share the configuration for the Web calls where you save the new Refresh and Access Tokens back to the Azure KeyVault? I tried that and wasn’t quite sure of how to pick out the new values from the “Refresh Token” call. I was also unsure on how to call the “Save” action on the DataVault and which parameters to send.

    1. Thanks! I’ll post the details of that when I’m back in the office next week.

    2. Hi Tristan,

      The web activity to save the refresh token will have the following properties:

      URL – (where “KeyVaultName” is the name of your KeyVault and “SecretName” is the name of the secret you’re trying to save to)
      Method – PUT

      Resource –

      Authentication – I’m using “System Assigned Managed Identity” because I’ve given the ADF Managed Identity the necessary permissions in my KeyVault.

      Body – {"value":"@{activity('Refresh Xero Access Token').output.refresh_token}"} (This is dynamic content to get the refresh_token value from the previous activity called “Refresh Xero Access Token”)

      Hope this helps.

    3. Martin, first time finding your stuff and this post nailed it. Absolutely provided the guidance I needed to work through calling a token through Azure Data Factory! Thanks a million! The Azure Key Vault examples were superb too.

  2. Sean says:

    Thanks Martin, much appreciated. I’m struggling to get this to work consistently, it still keeps throwing up a response of invalid_request. I’ve no idea why. Params work fine in Postman. Thanks in advance

    1. When you’re trying to refresh tokens? If so, my guess is that you’re missing something small like base64-encoding the parameters you’re passing. Postman will encode some things by default…check the sent headers/body in Postman to make sure.

      1. Same thing actually happened to me today, and from what I could see it was a case of the refresh & access tokens not being valid anymore…most likely because of something the vendor did on their end. I used Postman to manually get a new set of access/refresh tokens and saved that to Azure Key Vault, and everything is now working as expected again.

  3. Mathias says:

    Hey Martin, SUPER helpful series – Helped me do Oauth with API for the first time. Got two noob questions for ya:

    1. I have implemented the flow up and including “Save New XXXX Refresh Token” and secured output + input. It works fine and correctly updates the Key Vault secret value. My question is now; do I actually need to secure it further with the encoding base64 stuff? When I try, it fails and returns the following message:
    {"error":{"code":"BadParameter","message":"Property value has invalid value\r\nProperty value has invalid value\r\n"}}

    The code I have in the body is like so (works fine without the base64 ():

    2: Can you explain why we need the “Save XXXX Access Token” – I can see it’s an Output by the Refresh XXXX Access Token step, but aren’t we fine with just the refresh token? I assume if it’s needed you put it in a secret, I just don’t see how it’s being used as the first 3 get activities don’t use it either.

    1. Hey Mathias, glad you’ve found the series helpful.

      EDIT: In my experience, most APIs require you to encode private information you send when refreshing tokens. That being said, it’s usually the header data that needs to be encoded. Your API documentation should be able to tell you if encoding is required and what the request should look like. If it’s working without the encoding, then there’s probably nothing left to do.

      You need to save the access token because you’ll most likely need it in other API calls that extract information.

      Hope this helps.

    2. Stephen WOLF says:

      Hello Mathias, Have you found out how to solve this issue? I’m facing the exact same error message while trying to update a Key Vault secret from an Azure API Management instance.

      1. Anonymous says:

        Hi Stephen – ended up writing it in Python as the key vault messed up when there would be a new bearer token per API call.

  4. Revathy Wilson says:

    Hi Martin,

    Can you send me the settings screenshot of Get Xero Refresh Token – Web activity

    1. I’m not able to post images in the comments, but here are the Settings details of that task:

      URL: @{pipeline().parameters.KeyVaultSecretsUrl}xero-refresh-token?api-version=7.0
      Method: GET
      Authentication: System Assigned Managed Identity

      I have a parameter with the base URL of my Azure Key Vault secret, which is used in the first part of the URL value. I also use the ADF managed identity for authentication, as described in the series.

      Hope this helps.

  5. Revathy Wilson says:

    Hi Martin,

    I am getting the error below. Please help me with this.

    {"error":"unsupported_grant_type","error_description":"AADSTS70003: The app requested an unsupported grant type 'refresh_code'.\r\nTrace ID: 69e52df1-8e0d-4db7-ac2a-792b9bcfb800\r\nCorrelation ID: 31a4ea42-a95f-4220-b7aa-3cf7bc9111c0\r\nTimestamp: 2022-09-07 05:11:38Z","error_codes":[70003],"timestamp":"2022-09-07 05:11:38Z","trace_id":"69e52df1-8e0d-4db7-ac2a-792b9bcfb800","correlation_id":"31a4ea42-a95f-4220-b7aa-3cf7bc9111c0"}

    1. Hi Revathy,

      From the error above, it looks like your API doesn’t like the value you’ve used for the grant type, or it is in the wrong place (header instead of the body, etc.). Are you working with the Xero API? If not, you’ll have to check your documentation.

  6. Kafka Hava says:

    This is my body:
    client_id= @{base64(pipeline().parameters.client_id)}

    I’m getting the error “Grant type is not set”.
    What’s wrong?

    1. Most likely the double quotes around client_credentials.

  7. Anonymous says:

    Hi Martin, thanks for the series very helpful. Can you advise what the URL is for the Save New xx Refresh Token web call?

    1. The URL is the same, but you use the PUT method and the body contains the value…something like this: {"value":"@{activity('Refresh Xero Access Token').output.refresh_token}"}

  8. Anonymous says:

    Hi Martin,

    Great blog and video – very helpful and insightful. Any help with the below issues will be appreciated

    Currently I am having two issues.

    First, when I implemented this, my pipeline fails on the Save New XX Refresh Token step due to: “The expression 'activity('Refresh Infor Access Token').output.refresh_token' cannot be evaluated because property 'refresh_token' doesn't exist, available properties are 'access_token, token_type, expires_in, ADFWebActivityResponseHeaders, effectiveIntegrationRuntime, executionDuration, durationInQueue, billingReference'.”

    Secondly, would you be kind enough to include the logic for the Save New Xero Access Token step? I assume the properties are:

    URL – (where “KeyVaultName” is the name of your KeyVault and “SecretName” is the name of the secret, i.e. xero-access-token)
    Method – PUT

    Resource –

    Authentication – I’m using “System Assigned Managed Identity” because I’ve given the ADF Managed Identity the necessary permissions in my KeyVault.

    Body – {"value":"@{activity('Refresh Xero Access Token').output.access_token}"} (This is dynamic content to get the access_token value from the previous activity called “Refresh Xero Access Token”)

    1. From the message that you’re getting, it seems like the API call didn’t return a refresh token property. It could be a nuance of the API, the fact that you don’t need refresh tokens because of the way you’re authenticating, or it could be that something is wrong with the request.

      Yeah, you’re the second person asking for it…I’ll see if I can add some more detail to this post at some point.

  9. RebeccaD says:

    Thank you so much Martin for sharing this. Sincerely. Do you have a pipeline to refresh the client secret?

    1. Client secrets are typically static and wouldn’t need to be refreshed. If you mean the tokens then yes, I usually have a separate pipeline to refresh those and I will either schedule it separately to run once a day (dependent on the API of course) or before I need to interact with the API.
