Frequently asked questions
Components vs Pipeline Steps
Azure Machine Learning pipelines help organizations automate machine learning workflows by using the PipelineStep class. Components are similar, but break pipelines down further so that individual sections of code can be tested as components, rather than only as part of an entire pipeline.
| | Pipeline steps | Components |
|---|---|---|
| CI/CD process support | Create components from local files only | Create components from GitHub public repo, Azure DevOps artifacts, or local files |
| Reuse across pipelines | Not supported | Supported |
| Version management | Not supported | Supported - manage with Azure Command Line Interface (CLI) |
| Supported workflows | SDK only | SDK and designer |
Components vs Designer Execute Python Script component
The Execute Python Script component in Azure Machine Learning Designer is a great way to customize Designer with your own code. However, it is available only in Designer and has certain limitations on input and output formats. Components are friendlier to programmatic use and can be versioned.
| | Designer Execute Python | Components |
|---|---|---|
| Domain scope | Designer only | All of AzureML, including Designer, SDK pipelines, CLI, etc. |
| Reuse across pipelines and workspaces | Not supported | Supported |
| Version management | Not supported | Supported - manage with Azure Command Line Interface (CLI) |
| Programmatic provisioning | No, created from designer only | Yes (SDK and CLI) |
| Extensibility with libraries | Yes, with pip | Yes, with AzureML environments |
Setup environment common issues
SDK environment
ImportError: cannot import name xxx from azureml.xx
Take the following exception as an example:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\wanhan\Anaconda3\envs\temp\lib\site-packages\azure\ml\component\__init__.py", line 17, in <module>
from .component import Component
File "C:\Users\wanhan\Anaconda3\envs\temp\lib\site-packages\azure\ml\component\component.py", line 28, in <module>
from azure.ml.component._api._api import _dto_2_definition
File "C:\Users\wanhan\Anaconda3\envs\temp\lib\site-packages\azure\ml\component\_api\_api.py", line 12, in <module>
from azure.ml.component._api._snapshots_client import SnapshotsClient
File "C:\Users\wanhan\Anaconda3\envs\temp\lib\site-packages\azure\ml\component\_api\_snapshots_client.py", line 9, in <module>
from azureml._restclient.constants import SNAPSHOT_MAX_FILES, SNAPSHOT_BATCH_SIZE
ImportError: cannot import name 'SNAPSHOT_MAX_FILES'
That’s because we rely on a newer version of azureml-core. Run the following command to fix the error. The quotes keep the shell from treating > as output redirection.
pip install "azureml-core>1.17"
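To confirm which azureml-core version is actually installed in the active environment, a small check like the following may help. This is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`). The helper names `version_tuple` and `is_new_enough` are hypothetical, and the comparison is a simplified numeric one rather than a full implementation of pip's version rules.

```python
from importlib import metadata  # standard library in Python 3.8+

MIN_VERSION = (1, 17)  # threshold taken from the pip command above


def version_tuple(version: str) -> tuple:
    """Turn '1.18.0.post1' into (1, 18): numeric parts only, trailing zeros dropped."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'post1' or 'rc1'
        parts.append(int(piece))
    while parts and parts[-1] == 0:
        parts.pop()  # treat 1.17.0 the same as 1.17
    return tuple(parts)


def is_new_enough(version: str) -> bool:
    """Return True when the version is strictly newer than MIN_VERSION."""
    return version_tuple(version) > MIN_VERSION


try:
    installed = metadata.version("azureml-core")
    status = "OK" if is_new_enough(installed) else "too old, upgrade needed"
    print(f"azureml-core {installed}: {status}")
except metadata.PackageNotFoundError:
    print("azureml-core is not installed in this environment")
```

Run it with the same Python interpreter that raised the ImportError, since a different conda environment may hold a different azureml-core version.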
Run.wait_for_completion visualization does not update run status
In the browser's developer tools (F12), the Console log shows a Couldn't process kernel message error. The WrappedError message looks like this:
Error: Class jupyter.widget not found in registry at http://xxx
This happens because ipywidgets is not correctly enabled as a Jupyter extension when it is installed with pip. Reinstalling it with conda solves the issue. See more details here.
conda install -c conda-forge ipywidgets
NotImplementedError: Unsupported Linux distribution ubuntu 20.04
Take the following exception as an example:
File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/azureml/dataprep/api/_datastore_helper.py", line 143, in _set_auth_type
get_engine_api().set_aml_auth(SetAmlAuthMessageArgument(AuthType.DERIVED, json.dumps(auth)))
File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/azureml/dataprep/api/engineapi/api.py", line 19, in get_engine_api
_engine_api = EngineAPI()
File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/azureml/dataprep/api/engineapi/api.py", line 65, in __init__
self._message_channel = launch_engine()
File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/azureml/dataprep/api/engineapi/engine.py", line 337, in launch_engine
dependencies_path = runtime.ensure_dependencies()
File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/dotnetcore2/runtime.py", line 276, in ensure_dependencies
if not attempt_get_deps():
File "/home/miniconda3/envs/scpilot000rc7/lib/python3.7/site-packages/dotnetcore2/runtime.py", line 270, in attempt_get_deps
raise NotImplementedError(err_msg)
NotImplementedError: Unsupported Linux distribution ubuntu 20.04
If a NotImplementedError mentioning unsupported Ubuntu 20.04 is raised when you run azure-ml-component operations on Ubuntu 20.04, the required .NET Core dependencies cannot be found in the current environment. Follow the guide to install the .NET Core dependencies for Ubuntu 20.04.
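Before installing the dependencies, it can be useful to confirm that the host really is Ubuntu 20.04. The sketch below reads `/etc/os-release`, the standard file where Linux distributions record their `ID` and `VERSION_ID`. The helper names `parse_os_release` and `is_ubuntu_2004` are hypothetical and not part of azure-ml-component or dotnetcore2.

```python
from pathlib import Path


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines in os-release format into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def is_ubuntu_2004(info: dict) -> bool:
    """Return True when the parsed os-release data describes Ubuntu 20.04."""
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") == "20.04"


os_release = Path("/etc/os-release")
if os_release.exists():
    info = parse_os_release(os_release.read_text(encoding="utf-8"))
    print(info.get("ID"), info.get("VERSION_ID"), "->", is_ubuntu_2004(info))
else:
    print("/etc/os-release not found; this is probably not a Linux host")
```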
CLI environment
CommandNotFoundError: ‘component/module’ is misspelled or not recognized by the system
That’s because you are using an older version of the Azure CLI.
Run az upgrade to update it.
If “upgrade” is not found in “az” command group, you may need to reinstall it.
An error occurred. Pip failed with status code 1. Use --debug for more information.
That’s probably because the pip version bundled with the CLI's Python is too old. Use the following steps to upgrade it manually.
Run az -v and find “Python location” in the output. For example, it could be “Python location ‘C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\python.exe’” on Windows or “Python location ‘/opt/az/bin/python3’” on Linux.
Switch the current working directory to the CLI Python’s location.
Run ./python -m pip install --upgrade pip==20.2 --user to upgrade pip for the CLI Python.
See here for more information.
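The first two steps can be sketched in Python as follows. This is only an illustration: it assumes the `az -v` output contains a line of the form `Python location '<path>'` with straight single quotes, and the helper names `find_cli_python` and `cli_python_dir` are hypothetical. The sample string stands in for real command output.

```python
import re
from pathlib import PurePosixPath, PureWindowsPath


def find_cli_python(az_version_output: str):
    """Return the path quoted after 'Python location', or None if absent."""
    match = re.search(r"Python location\s+'([^']+)'", az_version_output)
    return match.group(1) if match else None


def cli_python_dir(python_path: str) -> str:
    """Return the directory that contains the CLI's Python executable."""
    # A backslash is taken as a sign of a Windows-style path.
    if "\\" in python_path:
        return str(PureWindowsPath(python_path).parent)
    return str(PurePosixPath(python_path).parent)


# Sample text standing in for the relevant line of `az -v` output.
sample_output = "Python location '/opt/az/bin/python3'"
python_path = find_cli_python(sample_output)
print(python_path)
print(cli_python_dir(python_path))
```

In practice the output text could be captured with `subprocess.run(["az", "-v"], capture_output=True, text=True)` and passed to `find_cli_python`.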
Runsettings
How to configure ITP job parameters with environment variables
By default, HomePathMount is enabled. To disable it, change the environment variables through runsettings.
step1.runsettings.environment_variables = {"ENABLE_HOME_MOUNT": "false"}
Contact the ITP team for other environment variables that can be used to configure job parameters.
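Environment variable values are strings, and the example above uses the lowercase string "false" rather than a Python boolean. The sketch below shows one way to build such a dictionary from Python values. The helper `to_env_vars` is hypothetical, and whether ITP accepts other spellings such as "False" is not confirmed here, so the lowercase form from the example is kept.

```python
def to_env_vars(settings: dict) -> dict:
    """Convert Python values to the string values used for environment variables."""
    env = {}
    for name, value in settings.items():
        if isinstance(value, bool):
            # Match the lowercase spelling used in the runsettings example.
            env[name] = "true" if value else "false"
        else:
            env[name] = str(value)
    return env


env_vars = to_env_vars({"ENABLE_HOME_MOUNT": False})
print(env_vars)
# The result could then be assigned in the same way as the example:
# step1.runsettings.environment_variables = env_vars
```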
Access workspace in another tenant
See Authentication in Azure Machine Learning for more details.
from azureml.core.authentication import InteractiveLoginAuthentication
from azureml.core import Workspace

# For workspace in another tenant
ws = Workspace.get(
    subscription_id="a6d8cf0d-b71e-4ffe-9a03-3e6013fed98a",
    resource_group="MOP.HERON.PPE.10118487-eaf4-4622-b1fd-3eb9f9834d73",
    name="amlworkspace4h4txh7jx7sl6",  # name of the workspace
    # tenant_id is the ID of the tenant that owns the workspace
    auth=InteractiveLoginAuthentication(tenant_id="cdc5aeea-15c5-4db6-b079-fcadd2505dc2"),
)
Set the logger level of Component SDK
The logger name of the Component SDK is azure.ml.component. Its default logger level is INFO.
If you want more detailed output from the Component SDK, run the following code before your own code to change the logger level and the log format.
import logging

component_logger = logging.getLogger("azure.ml.component")
component_logger.setLevel(logging.DEBUG)
# Update the format of logger shown in terminal.
for log_handler in component_logger.handlers:
    # Get the stream handler of azure.ml.component logger.
    if isinstance(log_handler, logging.StreamHandler):
        # Update the format of stream handler.
        formatter = logging.Formatter("[%(asctime)s][%(name)s][%(levelname)s] - %(message)s")
        log_handler.setFormatter(formatter)
        break
The basic log format of azure.ml.component:
[2021-12-15 22:55:59,417][azure.ml.component][INFO] - Created Anonymous Component: {'name': 'streamprocessor_tsv_tsv', 'type': 'ScopeComponent', 'workspace': 'xxxxx', 'subscriptionId': 'xxxxx', 'resourceGroup': 'xxxxx'}
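The loop above only changes handlers that are already attached to the logger. As a self-contained illustration of the same format string, the following sketch attaches its own `StreamHandler` that writes to an in-memory buffer and emits a made-up debug message. It uses only the standard `logging` module and does not require the Component SDK. If the SDK has already attached a handler, adding another one as done here would duplicate output, so the handler is removed at the end.

```python
import io
import logging

LOG_FORMAT = "[%(asctime)s][%(name)s][%(levelname)s] - %(message)s"

component_logger = logging.getLogger("azure.ml.component")
component_logger.setLevel(logging.DEBUG)

# Write to an in-memory buffer so the formatted line can be inspected.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter(LOG_FORMAT))
component_logger.addHandler(handler)

# A made-up message, not an actual Component SDK log line.
component_logger.debug("hypothetical debug message")
captured = buffer.getvalue()
print(captured, end="")

# Detach the demo handler so later logging is not duplicated.
component_logger.removeHandler(handler)
```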