Uploading Notebooks to Azure Machine Learning Workspace with Bicep

How to upload Jupyter notebooks to your Azure ML Workspace with Bicep — allowing you to add sample notebooks for your users

Shan Singh
3 min read · Dec 5, 2023

The easiest way to upload notebooks with infrastructure-as-code is by using a Deployment Script.

Deploying a Machine Learning Workspace

In this article, I’ll assume you’ve already created a Bicep file to deploy an Azure Machine Learning Workspace. If you haven’t, here is an example of what it may look like, in main.bicep:

resource machineLearningService 'Microsoft.MachineLearningServices/workspaces@2023-06-01-preview' = {
  name: machineLearningServiceName
  location: location
  sku: {
    name: 'basic'
    tier: 'Basic'
  }
  identity: {
    type: 'UserAssigned'
    userAssignedIdentities: {
      '${userAssignedIdentity.id}': {}
    }
  }
  properties: {
    storageAccount: storageAccount.id
    keyVault: keyVault.id
    applicationInsights: applicationInsights.id
    hbiWorkspace: false
    v1LegacyMode: false
    publicNetworkAccess: 'Enabled'
    primaryUserAssignedIdentity: userAssignedIdentity.id
  }
}

The example above assumes that a Key Vault, an Application Insights instance, a Storage Account and (optionally) a User Assigned Identity have already been deployed.

Uploading Notebooks with a Deployment Script

After you have deployed an Azure Machine Learning Workspace, you can use a Deployment Script to upload notebooks. Essentially, what we’ll be doing is copying the notebook from an existing location (in the example this will be a storage account where the notebook is publicly accessible) and then uploading it to the Storage Account file-share that’s used by Azure ML.

One challenge you may face here when automating the infrastructure is determining the name of the file-share. This is because it is auto-created by Azure ML on deployment and has a name that looks like code-XXXX-XXXX-XXXX-XXXX. However, we can use an Azure CLI command to find this name programmatically. You can see this being done in the uploadNotebooks.script.sh file.
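Because every later step depends on that lookup, it can be worth failing fast if it returns nothing or something unexpected. Here is a minimal sketch of such a guard. The helper name is_code_share_name and the sample value are my own placeholders, and the check only assumes the code- prefix described above.

```shell
#!/bin/bash

# Hypothetical helper (not part of the original script): succeeds only when
# the value is non-empty and starts with "code-", the prefix Azure ML uses
# for the auto-created file share as described above.
is_code_share_name() {
  local name="$1"
  [[ -n "$name" && "$name" == code-* ]]
}

# Made-up value standing in for the result of the Azure CLI lookup.
FileShareName="code-0000-0000-0000-0000"

if is_code_share_name "$FileShareName"; then
  echo "Using file share: $FileShareName"
else
  echo "Unexpected file share name: '$FileShareName'" >&2
fi
```

In a real deployment script you would likely exit with a non-zero code in the else branch, so the deployment fails visibly instead of uploading to the wrong place.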

A Bicep example is given below in notebooks.bicep:

param machineLearningServiceName string 
param userAssignedIdentityName string
param storageAccountName string
param location string

// Replace the URL below with the URL of your own notebook
var urlNotebook = 'https://www.test.com/notebook.ipynp'

var dataStoreName = 'workspaceworkingdirectory' // Note: name auto-created by ML Workspace, DO NOT CHANGE
var azCliVersion = '2.47.0'

resource userAssignedIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' existing = {
  name: userAssignedIdentityName
}

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' existing = {
  name: storageAccountName
}

resource deploymentScript 'Microsoft.Resources/deploymentScripts@2020-10-01' = {
  name: 'notebooksUploadScript'
  location: location
  kind: 'AzureCLI'
  identity: {
    type: 'UserAssigned'
    userAssignedIdentities: {
      '${userAssignedIdentity.id}': {}
    }
  }
  properties: {
    azCliVersion: azCliVersion
    environmentVariables: [
      {
        name: 'URL_NOTEBOOK'
        value: urlNotebook
      }
      {
        name: 'MACHINE_LEARNING_SERVICE_NAME'
        value: machineLearningServiceName
      }
      {
        name: 'RESOURCE_GROUP_NAME'
        value: resourceGroup().name
      }
      {
        name: 'DATASTORE_NAME'
        value: dataStoreName
      }
      {
        name: 'AZURE_STORAGE_ACCOUNT'
        value: storageAccount.name
      }
      {
        name: 'AZURE_STORAGE_KEY'
        secureValue: storageAccount.listKeys().keys[0].value
      }
    ]
    scriptContent: loadTextContent('uploadNotebooks.script.sh')
    retentionInterval: 'P1D'
    cleanupPreference: 'Always'
    timeout: 'PT1H'
    forceUpdateTag: 'v1'
  }
}

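Note: the example above assumes the userAssignedIdentity has already been granted the access it needs to run uploadNotebooks.script.sh, for example permission to read the workspace's datastores. The Azure Storage commands in the script authenticate with the account key passed in through the AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY environment variables.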
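To deploy notebooks.bicep on its own, you can pass its four parameters to az deployment group create. The sketch below only prints the command so you can review it before running it. The helper name print_deploy_command and all argument values are hypothetical placeholders, and I'm assuming a resource-group-scoped deployment, which matches the use of resourceGroup() in the Bicep.

```shell
#!/bin/bash

# Hypothetical helper: prints (does not run) the Azure CLI command that
# would deploy notebooks.bicep with the parameters it declares.
print_deploy_command() {
  local resource_group="$1" workspace="$2" identity="$3" storage="$4" location="$5"
  echo az deployment group create \
    --resource-group "$resource_group" \
    --template-file notebooks.bicep \
    --parameters \
      "machineLearningServiceName=$workspace" \
      "userAssignedIdentityName=$identity" \
      "storageAccountName=$storage" \
      "location=$location"
}

# Placeholder values; replace them with the names from your own deployment.
print_deploy_command my-rg my-ml-workspace my-identity mystorageaccount westeurope
```

Keep uploadNotebooks.script.sh in the same folder as notebooks.bicep, because loadTextContent resolves the path relative to the Bicep file.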

A copy of the uploadNotebooks.script.sh file is below:

#!/bin/bash
set -e # Stop at the first failing command

## Install the Azure CLI ML extension
az extension add -n azure-cli-ml

## Determine the file share name in the Azure Storage Account
FileShareName=$(az ml datastore show --name "$DATASTORE_NAME" --workspace-name "$MACHINE_LEARNING_SERVICE_NAME" --resource-group "$RESOURCE_GROUP_NAME" --query "container_name" -o tsv)

## Create a new directory in the file share called "MySamples"
az storage directory create --share-name "$FileShareName" --name MySamples

## Download the notebook, saving it under the file name taken from the URL
NotebookFileName=$(basename "$URL_NOTEBOOK")
wget -O "$NotebookFileName" "$URL_NOTEBOOK"

## Upload the notebook to the "MySamples" folder in the file share
az storage file upload --share-name "$FileShareName" --source "$NotebookFileName" --path "MySamples/$NotebookFileName"

This script determines the name of the file share that Azure ML created automatically, downloads the notebook from the external source, and uploads it to that share, ready to be used by the Azure ML Workspace. Ideally, the external source should have a CI/CD process behind it, so that revisions to the sample notebook are picked up by future deployments.
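One detail to watch is the local file name: the upload step has to point at the file that wget actually wrote, and that name comes from the URL. If your URL carries a query string (a SAS token, for instance), the name needs cleaning up first. Here is a minimal sketch of deriving it in one place. The helper notebook_file_name is my own illustration, not part of the original script.

```shell
#!/bin/bash

# Hypothetical helper: derive a local file name from a notebook URL by
# dropping any query string or fragment and keeping the last path segment.
notebook_file_name() {
  local url="$1"
  url="${url%%\?*}"  # remove everything from the first "?" onwards
  url="${url%%#*}"   # remove everything from the first "#" onwards
  basename "$url"
}

# Example with the placeholder URL used in notebooks.bicep.
notebook_file_name "https://www.test.com/notebook.ipynp"
```

You could then pass the result to wget -O and reuse it in the --source and --path arguments of az storage file upload, so the three places can never drift apart.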

Conclusion

Voila! With the example above you should be able to upload sample notebooks to your Azure ML Workspace using Bicep, so your users can get started with any examples you provide them!

If you have any questions about the approach above feel free to comment below or reach out to me 😊
