Azure

How to Access Storage Account from Databricks Workspace using Unity Catalog

There are multiple ways to access an Azure Storage Account from an Azure Databricks notebook. When access is configured through Azure credentials, we can choose between a Service Principal, a SAS token, or a Storage Account Key. This post describes an alternative approach: Unity Catalog. Here we set up an Azure Databricks Access Connector, a Storage Credential, and an External Location, so no credential configuration code is needed in the notebook.

Steps:

  1. Create an Access Connector for Azure Databricks (Dbx)
  2. Grant the Storage Blob Data Contributor IAM role to the Dbx connector on the Storage Account (e.g. with az role assignment create)
  3. Assign the Dbx workspace to a Unity Catalog metastore. Configure this in the account console at https://accounts.azuredatabricks.net
  4. In the workspace, create a Storage Credential that uses the Dbx connector (e.g. in Catalog Explorer)
  5. In the workspace, create an External Location that uses the Storage Credential (this can also be done in SQL; see the sketch after this list)
  6. Test connectivity from a notebook with one of the following commands:
dbutils.fs.ls("abfss://container@storageAccount.dfs.core.windows.net/path/to/data")
display(spark.sql("LIST 'abfss://container@storageAccount.dfs.core.windows.net/path/to/data'"))
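
For step 5, the External Location can also be created with SQL from a notebook once the Storage Credential exists. A minimal sketch, assuming a credential named dbx_connector_credential from step 4; the location name, container, account, and path are all placeholders:

spark.sql("""
    CREATE EXTERNAL LOCATION IF NOT EXISTS my_external_location
    URL 'abfss://container@storageAccount.dfs.core.windows.net/path/to/data'
    WITH (STORAGE CREDENTIAL dbx_connector_credential)
""")

# Optionally allow other users to browse files under the location.
spark.sql("GRANT READ FILES ON EXTERNAL LOCATION my_external_location TO `account users`")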
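
The Unity Catalog objects can be inspected from SQL as well, which helps confirm the setup before testing data access (DESCRIBE shows the URL and credential bound to the hypothetical location name used above):

display(spark.sql("SHOW EXTERNAL LOCATIONS"))
display(spark.sql("DESCRIBE EXTERNAL LOCATION my_external_location"))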

Note: The cluster running the notebook must use either “Single user” or “Shared” access mode; only these two access modes support Unity Catalog.
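
Once the listing commands succeed, data under the External Location can be read by path without any credential setup in the notebook. A minimal sketch, assuming Parquet files under the placeholder path used above:

# Read from the External Location path; Unity Catalog supplies the credentials.
df = spark.read.format("parquet").load("abfss://container@storageAccount.dfs.core.windows.net/path/to/data")
display(df.limit(10))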
