Frequently asked questions

Components vs Pipeline Steps

Azure Machine Learning pipelines help organizations automate machine learning workflows by using the PipelineStep class. Components are similar, but they break pipelines down further, so that individual sections of code can be tested as components instead of testing an entire pipeline.

| | Pipeline steps | Components |
| --- | --- | --- |
| CI/CD process support | Create components from local files only | Create components from GitHub public repo, Azure DevOps artifacts, or local files |
| Reuse across pipelines | Not supported | Supported |
| Version management | Not supported | Supported; manage with Azure Command Line Interface (CLI) |
| Supported workflows | SDK only | SDK and designer |

Components vs Designer Execute Python Script component

The Execute Python Script component in Azure Machine Learning Designer is a great way to customize Designer with your own code. It works only within Designer and has certain limitations on input and output formats. Components are more programmatically friendly and can be versioned.

| | Designer Execute Python Script | Components |
| --- | --- | --- |
| Domain scope | Designer only | All across AzureML, including Designer, SDK pipelines, CLI, etc. |
| Reuse across pipelines and workspaces | Not supported | Supported |
| Version management | Not supported | Supported; manage with Azure Command Line Interface (CLI) |
| Programmatic provisioning | No, created from Designer only | Yes (SDK and CLI) |
| Extensibility with libraries | Yes, with pip | Yes, with AzureML environments |

Common environment setup issues

SDK environment

ImportError: cannot import name xxx from azureml.xx

Take the following exception as an example:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\wanhan\Anaconda3\envs\temp\lib\site-packages\azure\ml\component\__init__.py", line 17, in <module>
    from .component import Component
  File "C:\Users\wanhan\Anaconda3\envs\temp\lib\site-packages\azure\ml\component\component.py", line 28, in <module>
    from azure.ml.component._api._api import _dto_2_definition
  File "C:\Users\wanhan\Anaconda3\envs\temp\lib\site-packages\azure\ml\component\_api\_api.py", line 12, in <module>
    from azure.ml.component._api._snapshots_client import SnapshotsClient
  File "C:\Users\wanhan\Anaconda3\envs\temp\lib\site-packages\azure\ml\component\_api\_snapshots_client.py", line 9, in <module>
    from azureml._restclient.constants import SNAPSHOT_MAX_FILES, SNAPSHOT_BATCH_SIZE
ImportError: cannot import name 'SNAPSHOT_MAX_FILES'

That’s because the Component SDK relies on a newer version of azureml-core. Run the following command to fix the error.

pip install "azureml-core>1.17"
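To confirm that the installed azureml-core really is newer than 1.17, a quick check along these lines may help. This is a minimal sketch, not part of the SDK: the helper names `parse_version`, `is_newer_than`, and `check_azureml_core` are hypothetical, and the comparison only looks at the leading numeric parts of the version string rather than implementing pip's full version rules.

```python
from importlib import metadata


def parse_version(text):
    """Turn a version such as '1.18.0.post1' into a tuple of leading numbers, (1, 18, 0)."""
    parts = []
    for piece in text.strip().split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # a non-numeric segment such as 'post1' ends the numeric part
        parts.append(int(digits))
        if digits != piece:
            break  # a suffix such as '0rc1' also ends the numeric part
    return tuple(parts)


def is_newer_than(installed, minimum="1.17"):
    """Return True when `installed` is strictly greater than `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.17' and '1.17.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a > b


def check_azureml_core(minimum="1.17"):
    """Return True/False for the installed azureml-core, or None if it is not installed."""
    try:
        installed = metadata.version("azureml-core")
    except metadata.PackageNotFoundError:
        return None
    return is_newer_than(installed, minimum)
```

Calling `check_azureml_core()` in the same Python environment that raised the ImportError returns `False` when the package still needs upgrading and `None` when it is missing entirely.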

Run.wait_for_completion visualization does not update run status

In the browser's developer tools (F12), the Console log shows a "Couldn't process kernel message" error. The WrappedError message looks like this:

Error: Class jupyter.widget not found in registry at http://xxx

This happens because ipywidgets is not correctly enabled as a Jupyter extension when it is installed with pip. Reinstalling it with conda solves the issue. See more details here.

conda install -c conda-forge ipywidgets

NotImplementedError: Unsupported Linux distribution ubuntu 20.04

Take the following exception as an example:

  File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/azureml/dataprep/api/_datastore_helper.py", line 143, in _set_auth_type
    get_engine_api().set_aml_auth(SetAmlAuthMessageArgument(AuthType.DERIVED, json.dumps(auth)))
  File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/azureml/dataprep/api/engineapi/api.py", line 19, in get_engine_api
    _engine_api = EngineAPI()
  File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/azureml/dataprep/api/engineapi/api.py", line 65, in __init__
    self._message_channel = launch_engine()
  File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/azureml/dataprep/api/engineapi/engine.py", line 337, in launch_engine
    dependencies_path = runtime.ensure_dependencies()
  File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/dotnetcore2/runtime.py", line 276, in ensure_dependencies
    if not attempt_get_deps():
  File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/dotnetcore2/runtime.py", line 270, in attempt_get_deps
    raise NotImplementedError(err_msg)
NotImplementedError: Unsupported Linux distribution ubuntu 20.04

If a NotImplementedError that mentions an unsupported Ubuntu 20.04 distribution is raised when you run azure-ml-component operations on Ubuntu 20.04, the required .NET Core dependencies cannot be found in the current environment. Follow the guide to install the .NET Core dependencies for Ubuntu 20.04.

CLI environment

CommandNotFoundError: ‘component/module’ is misspelled or not recognized by the system

That’s because you are using an older version of the Azure CLI. Run az upgrade to update it.

If “upgrade” is not found in the “az” command group, you may need to reinstall the Azure CLI.

An error occurred. Pip failed with status code 1. Use --debug for more information.

That’s probably because the pip version used by the CLI's Python is too old. Use the following steps to upgrade it manually.

  1. Run az -v and find “Python location” in the output. For example, it could be “Python location ‘C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\python.exe’” on Windows or “Python location ‘/opt/az/bin/python3’” on Linux.

  2. Switch the current working directory to the CLI Python’s location.

  3. Run ./python -m pip install --upgrade pip==20.2 --user to upgrade pip for CLI Python.

See here for more information.
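The manual steps above could also be scripted. The following is a rough sketch under a few assumptions: the helper names `find_cli_python`, `build_pip_upgrade_command`, and `upgrade_cli_pip` are hypothetical, and the parsing assumes that `az -v` prints the interpreter path in straight single quotes after the words "Python location", as in the examples in step 1.

```python
import re
import shutil
import subprocess


def find_cli_python(az_version_output):
    """Extract the interpreter path from the 'Python location' line of `az -v` output."""
    match = re.search(r"Python location\s+'(.+?)'", az_version_output)
    return match.group(1) if match else None


def build_pip_upgrade_command(python_path, pip_version="20.2"):
    """Build the same pip upgrade command as step 3, using the full interpreter path."""
    return [python_path, "-m", "pip", "install", "--upgrade",
            f"pip=={pip_version}", "--user"]


def upgrade_cli_pip():
    """Run `az -v`, locate the CLI's Python, and upgrade its pip."""
    az_path = shutil.which("az")  # resolves az.cmd on Windows as well
    if az_path is None:
        raise RuntimeError("The az command was not found on PATH.")
    result = subprocess.run([az_path, "-v"], capture_output=True, text=True, check=True)
    python_path = find_cli_python(result.stdout)
    if python_path is None:
        raise RuntimeError("No 'Python location' line found in the az -v output.")
    subprocess.run(build_pip_upgrade_command(python_path), check=True)
```

Because the command uses the full interpreter path, there is no need to change the working directory first. Call `upgrade_cli_pip()` explicitly when you want to apply the upgrade; nothing runs on import.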

Runsettings

How to configure ITP job parameters by environment variable.

HomePathMount is enabled by default. To disable it, change the environment variables through runsettings.

step1.runsettings.environment_variables = {"ENABLE_HOME_MOUNT": "false"}

Contact the ITP team for other environment variables that can be used to configure job parameters.

Access workspace in another tenant

See Authentication in Azure Machine Learning for more details.

from azureml.core.authentication import InteractiveLoginAuthentication
from azureml.core import Workspace

# For workspace in another tenant
ws = Workspace.get(
  subscription_id="a6d8cf0d-b71e-4ffe-9a03-3e6013fed98a",
  resource_group="MOP.HERON.PPE.10118487-eaf4-4622-b1fd-3eb9f9834d73",
  name="amlworkspace4h4txh7jx7sl6",  # name of the workspace
  auth=InteractiveLoginAuthentication(tenant_id="cdc5aeea-15c5-4db6-b079-fcadd2505dc2")
)

Set the logger level of Component SDK

The logger name of the Component SDK is azure.ml.component, and its default level is INFO. To get more detailed output from the Component SDK, run the following code before your own code to change the logger level and format.

import logging

component_logger = logging.getLogger("azure.ml.component")
component_logger.setLevel(logging.DEBUG)

# Update the format of logger shown in terminal.
for log_handler in component_logger.handlers:
    # Get the stream handler of azure.ml.component logger.
    if isinstance(log_handler, logging.StreamHandler):
        # Update the format of stream handler.
        formatter = logging.Formatter("[%(asctime)s][%(name)s][%(levelname)s] - %(message)s")
        log_handler.setFormatter(formatter)
        break

The basic log format of azure.ml.component:

[2021-12-15 22:55:59,417][azure.ml.component][INFO] - Created Anonymous Component: {'name': 'streamprocessor_tsv_tsv', 'type': 'ScopeComponent', 'workspace': 'xxxxx', 'subscriptionId': 'xxxxx', 'resourceGroup': 'xxxxx'}
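The snippet above only reformats stream handlers that are already attached to the logger. If none is attached yet (for example, because the SDK has not been imported, which is an assumption about when handlers get registered), the loop does nothing. A more defensive variant, sketched here with a hypothetical helper name `configure_component_logger`, adds a stream handler when none exists and otherwise reuses the existing ones.

```python
import logging

LOG_FORMAT = "[%(asctime)s][%(name)s][%(levelname)s] - %(message)s"


def configure_component_logger(level=logging.DEBUG, stream=None):
    """Set the level and format of the azure.ml.component logger.

    `stream` is only used when a new handler has to be created;
    None means the logging default (sys.stderr).
    """
    logger = logging.getLogger("azure.ml.component")
    logger.setLevel(level)

    # Collect the stream handlers that are already attached, if any.
    stream_handlers = [
        handler for handler in logger.handlers
        if isinstance(handler, logging.StreamHandler)
    ]
    if not stream_handlers:
        # Nothing to reformat yet, so attach a handler of our own.
        handler = logging.StreamHandler(stream)
        logger.addHandler(handler)
        stream_handlers = [handler]

    # Apply the same format to every stream handler.
    formatter = logging.Formatter(LOG_FORMAT)
    for handler in stream_handlers:
        handler.setFormatter(formatter)
    return logger
```

Calling `configure_component_logger()` once before your own code gives DEBUG output in the format shown above; pass `level=logging.INFO` to restore the default verbosity.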