For the first workdays of this new year, I had the pleasurable task of resetting a bunch of passwords. The passwords belong to Azure AD users, service principals, SQL Server users etc. and all of these are tied into the ETL process of a data warehouse. Some are used to connect to source data, others to connect to the Power BI REST API and some are used in Power BI to connect to the data warehouse. The reason I had to reset all these passwords was because of the LastPass breach (we use LastPass to store our passwords). Even though there were no actual passwords stolen, our IT Security Officer send an “encouraging” email to change all passwords, internal and external. GDPR and all that.
This however presented me with the opportunity to review what users were actually necessary in the ETL. Turns out, not that many. In many cases, one Azure resource (for example, an Logic App) can use a managed identity to access another Azure resource (such as an Azure SQL DB). For those of you not familiar with the concept, a managed identity is basically a service principal in Azure AD with the same name as your resource. If your Azure Data Factory instance is called myADF, you’ll have an entry in AAD with the name myADF (it’s very much alike the SQL Server service account used on-premises). You can then assign role permissions to this managed identity.
Let’s take Azure Data Factory for example. In the Managed Identities tab of the properties, you can assign a system-assigned managed identity (MI).
This means we have in AAD an enterprise application with the name “mssqltips-df”.
We can treat this as any other user/group in AAD. For example, in Azure SQL DB, we can add this user and assign permissions to it.
CREATE USER [myADF] FOR EXTERNAL PROVIDER; GO ALTER ROLE [db_datareader] ADD MEMBER [myADF]; GO
And now this specific instance of ADF can read data from the database. In the linked service, you’ll have to specify that you’re using the managed identity.
Aside from system-assigned MI, you can also use user-assigned MI. In this case, you create a managed identity yourself, choose a name and assign it to one or more resources.
The great benefit of managed identities is that they don’t have a password! As implied by the name, they are managed by Azure itself.
I was already using managed identities in a couple of places:
But during my review I discovered other use cases where I could apply managed identities:
In the end, by using managed identities, removing redundant users and by making some processes more efficient (for example by writing data directly to Azure SQL DB in a Logic App instead of dumping it first in Blob Storage), I could remove almost all passwords from the ETL flow. There are only two users left:
Maybe there are alternatives for these last two? Let me know in the comments!
I recently read the book Agile Data Warehouse Design - Collaborative Dimensional Modeling, from Whiteboard…
You can find the slides for the session Building the €100 data warehouse with the…
I was asked to do a review of the book Microsoft Power BI Performance Best…
This is a quick blog post, mainly so I have the code available if I…
Praise whatever deity you believe in, because it's finally here, a tenant switcher for Microsoft…
This book was making its rounds on social media, and the concept seems interesting enough…
View Comments
Thanks for the wonderful article.
How SQL Server MI or SQL database going to behave if we use managed identities in a geo-replicated SQL Managed instance?
I've asked around, and for system-assigned managed identities, each instance would most likely have its own identity (but to be tested).
I would assume that a user-assigned identity can be shared, as it is independent.
So currently there is no way to connect Power BI to Azure SQL using managed identities?
I don't think so.