Environment
An Environment captures the runtime dependencies of a component. Its definition is almost identical to the Environment class in the AzureML Python SDK.
Define new environment
An environment can be created in a number of ways, for example using Docker, conda, or a combination of the two. A user can define an environment in the component spec YAML file, in which case a new environment is created when the component is submitted to run.
There are several ways to set the running environment of a component. Here is a basic example that uses an inline conda definition and an existing Docker image.
environment:
  docker:
    image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04
  conda:
    conda_dependencies:
      name: project_environment
      channels:
      - defaults
      dependencies:
      - python=3.6.8
      - pip=20.0
      - pip:
        - azureml-defaults
        - azureml-dataprep>=1.6
  os: Linux
The following sample components demonstrate more scenarios.
Conda section
If the conda section is defined, the component is executed in the conda environment created in the container. If it is not set, the component is executed directly in the container.
Separate conda environment file
Conda section of environment in component spec yaml file.
environment:
  conda:
    conda_dependencies_file: conda.yaml
  os: Linux
conda.yaml
name: project_environment
channels:
- defaults
dependencies:
- python=3.6.8
- pip=20.0
- pip:
  - azureml-defaults
  - azureml-dataprep>=1.6
Separate pip requirements file
Conda section of environment in component spec yaml file.
environment:
  conda:
    pip_requirements_file: requirements.txt
  os: Linux
requirements.txt
azureml-defaults
azureml-dataprep>=1.6
Inline conda environment definition
Conda section of environment in component spec yaml file.
environment:
  conda:
    conda_dependencies:
      name: project_environment
      channels:
      - defaults
      dependencies:
      - python=3.6.8
      - pip=20.0
      - pip:
        - azureml-defaults
        - azureml-dataprep>=1.6
  os: Linux
Use private Python packages
If you’re actively developing Python packages for your machine learning application, you can host them in an Azure DevOps repository as artifacts and publish them as a feed. This approach allows you to integrate the DevOps workflow for building packages with your Azure Machine Learning Workspace.
Learn v1 Document: use-private-python-packages
SDK example to set up the connection:
# Run this once with your own PAT to connect the workspace to your private Python feed
# assume we have
from azureml.core import Workspace
ws = Workspace.from_config()
pat = "<Insert your PAT here>"
ws.set_connection(
    name="o365exchange",
    category="PythonFeed",
    target="https://o365exchange.pkgs.visualstudio.com",
    authType="PAT",
    value=pat,
)
Reference the private package in the conda YAML:
name: python_trainer_env
dependencies:
- python=3.7
- pip:
  - numpy==1.19.2
  - torch==1.4.0
  - azureml-defaults==1.13.0
  - --index-url https://o365exchange.pkgs.visualstudio.com/_packaging/PolymerPythonPackages/pypi/simple/
  #- --extra-index-url https://o365exchange.pkgs.visualstudio.com/_packaging/PolymerPythonPackages/pypi/simple/
Docker section
Currently, the docker section of an environment can be defined through an existing Docker image or a Dockerfile. Both Linux and Windows operating systems are supported.
When no Docker image is set, the environment is created from the default base image for the environment OS. The default base image for each OS is shown in the following table.
| OS | Base image |
|---|---|
| Linux | mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04 |
| Windows | viennaprivate.azurecr.io/base-windowsservercore-3.5dotnet-ltsc2019:latest |
NOTE: Executing a component in a Windows Docker container is still a private feature.
See reference for more details of the base image.
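The fallback rule in the table above can be sketched as a simple lookup. This is an illustration only: `DEFAULT_BASE_IMAGES` and `resolve_base_image` are hypothetical names, not part of the AzureML SDK, and treating a missing `os` as Linux is an assumption made for the example.

```python
# Hypothetical helper, not part of the AzureML SDK: it only illustrates
# the "explicit image, else default per OS" rule described above.
DEFAULT_BASE_IMAGES = {
    "linux": "mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04",
    "windows": "viennaprivate.azurecr.io/base-windowsservercore-3.5dotnet-ltsc2019:latest",
}


def resolve_base_image(environment):
    """Return the explicit docker image if one is set, else the OS default.

    `environment` is a dict shaped like the `environment` section of a
    component spec.
    """
    docker = environment.get("docker") or {}
    image = docker.get("image")
    if image:
        return image
    # Assumption for this sketch: a missing `os` is treated as Linux.
    os_name = str(environment.get("os", "Linux")).lower()
    if os_name not in DEFAULT_BASE_IMAGES:
        raise ValueError(f"Unsupported environment OS: {environment.get('os')!r}")
    return DEFAULT_BASE_IMAGES[os_name]
```

For example, `resolve_base_image({"os": "Windows"})` returns the Windows base image from the table, while an environment with `docker.image` set returns that image unchanged.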
Existing docker image
Docker section of environment in component spec yaml file.
environment:
  docker:
    image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04
  os: Linux
Users can configure an image that supports GPU so that the component can execute on GPU.
Example of Docker image with GPU support
Docker section of environment in component spec yaml file.
environment:
  docker:
    image: mcr.microsoft.com/azureml/openmpi3.1.2-cuda10.2-cudnn8-ubuntu18.04
  os: Linux
Docker image from a Docker registry
The approach below is not recommended, because it means keeping a password in the component YAML or source code. Instead, you can register an environment in your workspace and reference it by name and version in the component.
environment:
  docker:
    image: azureml/azureml_3f91b3a9e8271b3add0b56f43ac6d07c
    registry:
      address: your_registry_name.azurecr.io
      username:
      password:
  os: Linux
Reference dockerfile
You can also specify a custom Dockerfile. It’s simplest to start from one of the Azure Machine Learning base images using the Docker FROM instruction, and then add your own custom steps. Use this approach if you need to install non-Python packages as dependencies.
NOTE: Python is an implicit dependency in Azure Machine Learning, so a custom Dockerfile must have Python installed.
Example of separate docker file
Environment in component spec yaml file.
environment:
  docker:
    build:
      dockerfile: file:component.dockerfile
  conda:
    conda_dependencies_file: conda.yaml
  os: Linux
component.dockerfile
FROM mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04
RUN echo "Hello from custom container!"
Example of inline docker file definition
Environment in component spec yaml file.
environment:
  docker:
    build:
      dockerfile: |-
        FROM mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04
        RUN echo "Hello from custom container!"
  conda:
    conda_dependencies_file: conda.yaml
  os: Linux
Reference existing environment
Users may want to reference an existing curated environment instead of writing the definition in the component YAML spec. For example:
Reference a curated environment provided by Azure ML, such as AzureML-Designer.
Reference an environment maintained by a partner in the same workspace. These environments can be created and managed by following this doc.
Users may achieve similar functionality by referencing a prebuilt Docker image. However, this requires hardcoding an image URI and may require specifying the username and password of a private container registry in the YAML, which is not ideal.
Reference existing environment by name and version
For example, a user wants to reference version 19 of the AzureML-Designer curated environment.
environment:
  name: AzureML-Designer
  version: 19
Reference existing environment only by name
For example, a user references the AzureML-Designer curated environment by name.
The default version of the named environment is used in the component run.
environment:
  name: AzureML-Designer
Notes
Default version is resolved at component creation time
The default environment version is resolved at component creation (registration) time, i.e. the environment definition is pinned in that component version. Rerunning the component version always uses the same environment definition, even if the default version of the referenced curated environment changes.
In this case the component version runtime behavior is deterministic.
If a user wants to upgrade the environment version, the best practice is to create a new component version.
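The pinning behavior can be illustrated with a small simulation. This is a sketch under stated assumptions, not how the service is implemented: `pin_environment` and the `default_versions` mapping are hypothetical, and version 20 is an invented number used only to show the default moving on.

```python
import copy


def pin_environment(environment_ref, default_versions):
    """Return a copy of the reference with its version fixed.

    Hypothetical helper: if the reference has no explicit version, the
    current default version for that name is written into the copy.
    """
    pinned = copy.deepcopy(environment_ref)
    if "version" not in pinned:
        pinned["version"] = default_versions[pinned["name"]]
    return pinned


# Illustrative defaults only; 19 matches the example above, 20 is invented.
default_versions = {"AzureML-Designer": 19}

# Creating component version 1 pins the default that is current right now.
component_v1_env = pin_environment({"name": "AzureML-Designer"}, default_versions)

# Later the default of the curated environment changes...
default_versions["AzureML-Designer"] = 20

# ...but component version 1 still carries version 19. Only a newly
# created component version picks up the new default.
component_v2_env = pin_environment({"name": "AzureML-Designer"}, default_versions)
```

Here `component_v1_env` keeps version 19 after the default changes, which is why rerunning an existing component version stays deterministic.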
No named environment creation
If the environment definition contains a name or version, it should contain nothing else, such as conda or docker sections. A definition like the one below is rejected.
# this is not acceptable
environment:
  name: AzureML-Designer
  conda:
    conda_dependencies_file: conda.yaml
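A pre-submission check for this rule could look like the sketch below. `validate_environment` is a hypothetical helper, not an AzureML SDK function, and it only checks the `conda` and `docker` keys named in the text; whether other keys are also rejected is not covered here.

```python
def validate_environment(environment):
    """Raise ValueError if a named reference is mixed with an inline definition.

    Hypothetical helper: `environment` is a dict shaped like the
    `environment` section of a component spec.
    """
    is_reference = "name" in environment or "version" in environment
    # Only the sections named in the rule are checked in this sketch.
    inline_keys = sorted(key for key in ("conda", "docker") if key in environment)
    if is_reference and inline_keys:
        raise ValueError(
            "Environment references an existing environment by name/version "
            "and must not also define: " + ", ".join(inline_keys)
        )


# Accepted: a pure reference by name and version.
validate_environment({"name": "AzureML-Designer", "version": 19})

# Rejected: a reference combined with a conda section.
try:
    validate_environment(
        {"name": "AzureML-Designer", "conda": {"conda_dependencies_file": "conda.yaml"}}
    )
    rejected = False
except ValueError:
    rejected = True
```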