Continuing our Azure Every Day mini-series on Azure Databricks, I will be covering some key topics within Databricks such as Azure Key Vault, storage accounts, PowerPoint and DevOps. If you’re just starting out with Databricks, you may want to check out our previous posts on Databricks 101 and Getting Started with Azure Databricks. Today’s post focuses on accessing Azure Storage accounts.
Azure Databricks connects easily with Azure Storage accounts using blob storage. To do this we’ll need a shared access signature (SAS) token, a storage account, and a container. We can peruse our files with the downloadable application called Azure Storage Explorer.
My video included below is a demo of this process. Here’s how to connect Azure Databricks to an Azure Storage Account (blob storage):
- With the Azure Databricks resource open in the Azure portal, we click Launch Workspace, which takes us into the Databricks workspace.
- In my demo, I already have a cluster up and running and a notebook. A notebook is how we do our coding within Databricks.
- The first thing we need to do is create a storage account; in my case, I created a blob storage account. I also created a container that I named demo.
- I can access that container through Azure Storage Explorer. I connected to my Azure Storage account, drilled into my blob storage and my demo container, and then uploaded two CSV files.
- Within Azure Databricks, you’ll use some Python code to connect to your storage account. (You can see this code and details about it in my video.)
- There are also some steps you’ll need to follow to generate a SAS connection string, which you can also see in my demo.
- Once we’ve done that, we go back to our Databricks notebook and press Ctrl+Enter. Since the cluster is already up and running, the notebook sends the command to the cluster, and the cluster performs the action.
- If it’s successful, we’ll have a connection and the demo container will be mounted. It may take a few minutes to run; once it succeeds, you should be able to see your files.
- With the mount in place, the contents of the demo container in our blob storage account appear as a mounted folder in Databricks. We can display the contents of this folder, and I see that my two CSV files are there.
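The exact code I run is in the video, but the steps above can be sketched roughly as follows. This is a minimal sketch, not a copy of the demo: it assumes the commonly documented `dbutils.fs.mount` pattern for SAS-based blob access, and the helper names (`build_mount_args`, `mount_container`), the storage account name, and the `/mnt/demo` mount point are all hypothetical. Only the container name demo comes from this post. `dbutils` and `display` are available only inside a Databricks notebook, so `dbutils` is passed in as a parameter.

```python
def build_mount_args(storage_account, container, sas_token, mount_point):
    """Build keyword arguments for dbutils.fs.mount using a SAS token.

    All four values are placeholders supplied by the caller.
    """
    endpoint = f"{container}.{storage_account}.blob.core.windows.net"
    return {
        # wasbs:// URI of the blob container to mount
        "source": f"wasbs://{container}@{storage_account}.blob.core.windows.net",
        # Path under which the container appears in the Databricks file system
        "mount_point": mount_point,
        # Hadoop configuration key that carries the SAS token for this container
        "extra_configs": {f"fs.azure.sas.{endpoint}": sas_token},
    }


def mount_container(dbutils, storage_account, container, sas_token, mount_point):
    """Mount the container if it is not mounted yet, then list its files."""
    already_mounted = any(
        m.mountPoint == mount_point for m in dbutils.fs.mounts()
    )
    if not already_mounted:
        dbutils.fs.mount(
            **build_mount_args(storage_account, container, sas_token, mount_point)
        )
    return dbutils.fs.ls(mount_point)


# In a notebook cell (hypothetical account name and mount point):
# files = mount_container(dbutils, "mystorageaccount", "demo", "<sas-token>", "/mnt/demo")
# display(files)
```

Checking `dbutils.fs.mounts()` first lets you re-run the cell without an error when the mount point already exists. In practice you would avoid pasting the SAS token into the notebook in plain text.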
That is how easy it is to connect your Azure blob storage to Azure Databricks using a shared access signature.
Keep an eye on this blog for more on our mini-series on Azure Databricks. If you’d like to learn more about how to leverage Databricks, Azure in general or any Azure product or service, we can help. Contact us to have a conversation about how our team of experts can help you at any stage in your Azure journey.