Change Log
v0.9.18 (2023.04.06)
Features:
Task 2132306: Export registered subgraph as code for more portability.
Task 2244639: Input mode of node inside subgraph is not taking effect.
Bug Fixes:
Bug 2202524: Invalid run.ipynb file created by Designer’s “Export to Code” feature
Bug 2204741: Exporting pipeline to code fails on Unknown data node.
Bug 2220307: _get_target return value wrong for datatransfer component.
Bug 2211421: object of type ‘list_reverseiterator’ has no len()
Bug 2211930: Edge connecting between subgraph input and if/else operator generated incorrectly in subgraph.
Bug 2229358: Export to code failed for news_and_feeds pipeline.
v0.9.17 (2023.02.10)
Features:
AetherBridge component SDK Experience
This feature is in private preview; please try it out and share feedback with us.
See sample notebook.
Feature 2152382: Add hash_version & hash to SDK 1.5 component snapshot creation
A pipeline step with the user_identity runsetting passes runtime snapshot validation with this change.
Example:
node.runsettings.identity_type = 'UserIdentity'
Improvements:
Task 2166895: Support runtime sweep on distributed component: relax client side check
Task 2172895: SDK prints a warning and skips new run settings: trial timeout seconds
We recommend upgrading to this version if the timeout_seconds runsetting is not taking effect.
Task 2173007: Generate package: using the passed in auth first when generate package from workspace assets
Bug Fixes:
Bug 2126161: Sweep component: Error “LineageRoot not found for assetReference” when publishing to registry
v0.9.16 (2022.12.14)
Improvements:
Feature 2076640: relax validation for connect primitive type output for subgraph
Feature 2053827: Generate package: generated code should escape dataset name
Task 2116942: [Doc] Highlight v1.5 spark component yaml difference with others
Bug Fixes:
Bug 2115141: SDK validation logic need to support number type in Spark component
Bug 2087148: Component from yaml raise type error unsupported operand type(s) for /: ‘WindowsPath’ and ‘dict’
Bug 2115409: [Component Loader] Load the component by name:version in component config failed
v0.9.15 (2022.11.15)
Improvements:
Feature 2070375: [Additional Includes] Add integrity check when using downloaded artifacts in the disk cache.
Task 1958887: sub pipeline’s outputs binding with pipeline parameter.
Note: We don’t recommend using pipeline parameters and sub pipeline output binding together.
Example: Returning sub pipeline outputs that are assigned with pipeline parameters to the root pipeline is supported.
In this example, sub_node’s outputs are set to static values by sub_path, sub_datastore, sub_path_on_compute, and sub_mode.
@dsl.pipeline()
def sub_pipeline_func(sub_path='sub_pipeline/defined/default', sub_datastore='workspacefilestore', sub_path_on_compute='/tmp/component', sub_mode='mount') -> Pipeline:
    node = component_func()
    node.outputs.output_dir.configure(
        path_on_datastore=sub_path,
        datastore=sub_datastore,
        path_on_compute=sub_path_on_compute,
        mode=sub_mode
    )
    return node.outputs

@dsl.pipeline()
def root_pipeline() -> Pipeline:
    sub_node = sub_pipeline_func()
Note: Setting sub_node’s outputs with pipeline parameters in the root pipeline is not supported.
In this example, the job will throw an error.
@dsl.pipeline()
def sub_pipeline_func() -> Pipeline:
    node = component_func()
    return node.outputs

@dsl.pipeline()
def root_pipeline(path='custom/defined/path', datastore='workspacefilestore', path_on_compute='/tmp/component', mode='mount') -> Pipeline:
    sub_node = sub_pipeline_func()
    sub_node.outputs.output_dir.configure(
        path_on_datastore=path,
        datastore=datastore,
        path_on_compute=path_on_compute,
        mode=mode
    )
Task 2068921: Generated code should exclude pipeline runsetting attribute in pipeline.submit function params.
The following job uses runsettings; you can export code from this job.
pipeline = root_pipeline()
pipeline.runsettings.default_compute = 'aml-compute'
pipeline.runsettings.priority.scope = 900
pipeline.runsettings.priority.compute_cluster = 800
pipeline.runsettings.timeout_seconds = 500
pipeline.runsettings.continue_on_failed_optional_input = True
pipeline.runsettings.default_datastore = 'workspacefilestore'
pipeline.runsettings.force_rerun = False
pipeline.runsettings.continue_on_step_failure = False
pipeline_run = pipeline.submit(
    timeout_seconds=400,
    continue_on_failed_optional_input=False,
    default_compute='cpu-cluster',
    default_datastore='workspaceworkingdirectory',
    force_rerun=True,
    continue_on_step_failure=True)
Bug Fixes:
Bug 1920193: Fall back unused parameter to optional Input instead of optional string now.
Bug 2065379: Tag change should not trigger component registration when using --skip-if-no-change.
Bug 2058915: The command “az-ml run debug” raise “NameError: name ‘args’ is not defined”.
Bug 2034695: Registering a workspace independent component with artifacts additional includes raises “[Errno 2] No such file or directory”. This is caused by an artifacts issue; the SDK avoids it by retrying.
v0.9.14 (2022.10.18)
Features:
Feature 1932709: [Smart mode] Do not create a new component version if the underlying code/contents have not changed compared with the default/latest version
Example:
az ml component create --file component.yaml --skip-if-no-change
Feature 1754412: Support Spark component in SDK 1.5
Example: sample notebook
Doc: Spark Component
NOTE: This is a private preview feature. We strongly recommend using the v2 Spark component directly, which is already in public preview. See v2 sample: link
Feature 1984480: Ignore pycache by default to avoid reuse issues
Feature 1959001: Support force_rerun, continue_on_step_failure set by pipeline.runsetting
Example:
pipeline.runsettings.force_rerun = False
pipeline.runsettings.continue_on_step_failure = False
Improvements:
Task 1948966: Fix export code doesn’t contain pipeline runsetting
Task 1932759: Register component cache get by workspace_name and resource_name change to by workspace_id and resource_id
Known Bugs to address in this release:
Bug 1953955: Set pipeline.runsetting.target will set the compute to all nodes and pipeline.submit(default_compute_name=xxxx) will not take effect.
v0.9.13 (2022.09.06)
This release contains features like setting a timeout in the pipeline runsettings, submitting a pipeline run with a parent, and Registry component Archive/Restore.
Features:
Feature 1875597: Support to set timeout at pipeline level
Usage:
pipeline.runsettings.timeout_seconds = 500 or pipeline.submit(timeout_seconds=500).
Example of supported runsettings:
pipeline.runsettings.priority.scope = 900
pipeline.runsettings.priority.compute_cluster = 800
pipeline.runsettings.timeout_seconds = 500
pipeline.runsettings.continue_on_failed_optional_input=True
pipeline.submit(timeout_seconds=400, continue_on_failed_optional_input=False)
NOTE: After timeout_seconds is set, if the run times out, the current backend behavior is only to print a warning.

Feature 1910909: Support submit pipeline run with parent
SDK Example:
# Attach pipeline to parent pipeline by parent run id.
pipeline.submit(parent=parent_run_id)
# Attach pipeline to parent pipeline by parent pipeline run object.
pipeline.submit(parent=parent_run)
Feature 1795212: Local Debugging Experience for WebXT–distributed component
Feature 1931232: Support Registry component Archive/Restore
Doc: CLI cheat sheet
Improvements:
Task 1839015: Improve error handling of pipeline runsettings for subgraph
Getting or setting a subgraph runsetting raises an error.
Only root pipeline runsettings take effect, such as timeout_seconds.
This simplifies the default override model, which improves consistency across the system:
PipelineComponent will not have default runsettings in DPv2
Default compute is not supported when publishing a pipeline component to a registry
Bug Fixes:
Bug 1943175: enum: [‘true’] loaded as enum: [True] when put type of parameter at the end of parameter definition
Bug 1930750: Pipeline.validate not raise the top-level component default_compute/default_datastore error
Bug 1949180: [Component] Deadlock when download artifacts failed.
Bug 1896581: Additional_includes of components was not shown in code section as needed.
Bug 1951097: When export the graph with registry component to code, raise “get() got an unexpected keyword argument ‘registry_name’”
Bug 1882723: path_on_datastore link pipeline parameter error
v0.9.12 (2022.08.07)
This release contains features like pipeline step level timeout, convert dsl.command_component to yaml and other improvements.
Features:
Feature 1814084: Support pipeline step level timeout.
SDK Example:
component.runsettings.timeout_seconds = 600
Note: This setting currently applies only to command components and distributed components.
Feature 1832902: convert dsl.command_component to yaml for registration
CLI Example:
az-ml compile --source ./src/smile/components/**/*.py
SDK Example:
from azure.ml.component.dsl import compile
from test.test_az_ml_compile import command_component_func1
compile(source=command_component_func1)
compile(source='./test/test_az_ml_compile.py')
compile(source='./test/test_az_ml_compile.py', name='command_component_func1')
compile(source='./test/*.py')
See reference doc
Feature 1851843: Support resolve azure artifact in additional includes
This feature is in private preview, please use with caution. The additional_includes format may be changed.
Example:
additional_includes:
  - your/local/path
  - type: artifact
    organization: <your_devops_organization>
    project: <your_devops_project>
    feed: <your_artifacts_feed_name>
    name: <your_universal_package_name>
    version: <your_package_version>
    scope: project
See reference doc
Improvements:
Task 1844670: Postpone query runsettings from MT for all component types
Bug Fixes:
Bug 1896581: Additional_includes of components was not shown in code section as needed.
v0.9.11 (2022.07.07)
This release contains features like parameter group support inheritance, override environment by (name, version) and other improvements.
Features:
Feature 1806333: Parameter group support inheritance
Parameter groups no longer support initialization with positional arguments.
Example:
# define the parent parameter group
@dsl.parameter_group
class ParentClass:
    str_param: str
    int_param: int = 1

@dsl.parameter_group
class GroupClass(ParentClass):
    float_param: float
    str_param: str = 'test'

# see the help of auto-gen __init__ function
help(GroupClass.__init__)
Feature 1796978: Support override environment runsetting via name + version, so it can support registry username/password
Example:
from azure.ml.component.environment import Environment

@dsl.pipeline()
def test_pipeline(input_data):
    component = component_func(input_folder=input_data)
    # Override environment via name and version.
    component.runsettings.environment = Environment(name="AzureML-Minimal")
See reference doc
Improvements:
Feature 1825231: Support force_rerun in pipeline._publish() or published_pipeline.submit()
Example:
pipeline._publish(force_rerun=True)
Feature 1812015: [Generate Package] Ban components whose names do not conform to Python function naming rules
Feature 1811793: Support ‘ContinueRunOnFailedOptionalInput’ in component SDK
Example:
pipeline.submit(continue_on_failed_optional_input=True)
Bug Fixes:
Bug 1834772: dsl.command_component create unnecessary parent folder for boolean
Bug 1815865: Pipeline parameter lost in graph when only used as condition of dsl.condition
v0.9.10 (2022.05.31)
This release contains features like batch changing compute and datastore of component in a pipeline and DataStoreName, PathOnCompute, DataStoreMode of output settings be linked with pipeline parameter.
Features:
Feature 1763227: The default datastore and default compute target of all nodes use the default_datastore and default_compute_target specified in submit/validate function.
Example:
pipeline.validate(default_datastore="xxx", default_compute_target="xxx")
pipeline.submit(default_datastore="xxx", default_compute_target="xxx")
Feature 1781684: Support DataStoreName, PathOnCompute, DataStoreMode of Output Settings to be linked with pipeline parameters
Example:
component.output.port_name.configure(datastore=pipeline_parameter_datastore, path_on_compute=f'{pipeline_parameter_path_on_compute}/run/compute')
Doc: reference doc
Feature 1788016: Support ComponentDefinition.list: Pip-style component version constraints
Example:
from azure.ml.component._core._component_definition import ComponentDefinition
# List specified component in workspace
ComponentDefinition.list(workspace=workspace, name="xxx")
# List specified component in registry
ComponentDefinition.list(registry_name="xxx", name="xxx")
Please contact us before using this, as it is a private feature.
Improvements:
Feature 1778282: Support load component from registry without workspace by SDK code
Feature 1759712: Pipeline submission should include more detail pointing out which component spec is having error
Feature 1806390: Support “HDFS” mode for component output
Feature 1802856: local component debug – refine the AML job status in local debug mode – mark as canceled only when user stop the local job container
Bug Fixes:
Bug 1784520: SDK should not generate dangling output
Bug 1786428: Print request id when received service error
Bug 1785061: Postpone getting PipelineComponent runsettings definition
v0.9.9 (2022.04.27)
This release contains features like local debug using common runtime, path_on_datastore link with pipeline parameter, force_rerun pipeline setting, and az-ml export improvements.
Features:
Feature 1712434: Allow user to link path_on_datastore with pipeline parameter
Example:
output.configure(path_on_datastore=f'azureml/decode_output/{pipeline_parameter}')
Feature 1742360: add force_rerun pipeline setting
Example:
pipeline.submit(force_rerun=True)
force_rerun=True forces a rerun of all child runs under this root pipeline run; the child runs will not latch onto or reuse any other run and cannot be latched onto or reused by other runs.
Doc: reference doc
Feature 1718007: Support local debugging using common runtime
Example:
az-ml run debug --run-id <failed-run-id>
Doc: reference doc
This is an early preview feature; execution service deployment is still ongoing.
For unsupported regions, the error message below is shown:
RuntimeError: azureml-setup/common_runtime_bootstrapper_info.json are not in the execution service response.
Improvements:
Feature 1740584: The component CLI support publish to a registry without region “hints”
Feature 1561806: az-ml export cli improvements
Feature 1699706: make az-ml export use az cli auth as first option then fallback to interactive auth.
Feature 1685566: support pipeline with conditional workflow in export to code
Feature 1582401: support registry component in export to code
Feature 1684996: Support node.sweep() in export to code
Feature 1586743: Reduce pipeline submission time: Backend improvement
E2E time optimized (60s -> 25s for PW_OFE)
Submission time optimized (40s -> 6s for PW_OFE)
Feature 1730569: remove pycache for dsl.command_component snapshot
Bug Fixes:
Bug 1680153: Name ‘exit’ is not defined when dsl_generate_package
Bug 1737004: When a component spec has incorrect indentation, the error raised by dsl.generate_package is hard to read
Bug 1738590: ‘Non-default argument follows default argument’ raised when using parameter group and dynamic parameter at the same time
Bug 1753924: Pipeline Expression raise ‘pop from empty list’
v0.9.8 (2022.04.02)
This release contains features like pipeline level compute priority runsettings, PipelineParameter as primitive types, intellisense for dsl.pipeline output and reduce the duplicate snapshot folders in component creation.
Features:
Feature 1467481: Pipeline runsetting to support different compute priority
e.g.
pipeline.runsettings.priority.scope = 901
Feature 1469019: Use PipelineParameter as primitive types in pipeline/subgraph function
See reference doc
See sample notebook which implements a dummy federated-learning pipeline with this feature.
Improvements:
Feature 1562566: Support intellisense for dsl.pipeline output
See reference doc
Feature 1682414: Reduce the duplicate snapshot folders in temp folder when creating to multiple workspace
Delete temp snapshot folder when successfully create component
Task 1719676: Refer to default curated environment in stable version for dsl.command_component
Bug Fixes:
Bug 1711918: ComponentDefinition load YAML returned None
Bug 1599087: Component local run didn’t work in docker mode when use remote dataset
Bug 1710484: Exclude the pycache folder for expression component
v0.9.7 (2022.03.21)
This release contains features like runtime if-else conditional, dynamic sweep on command component, and component create error handling improvement.
Features:
Feature 1511038: Conditional flow control.
See reference doc
See sample notebook
Notebook visualize and pipeline export to code are not supported yet for this feature.
This feature is still in preview state.
The pipeline backend has only whitelisted certain subscriptions; please contact us if you would like to use this feature.
Feature 1573963: Sweeping Command Component Search Space Setting in RunSettings
See reference doc
See sample notebook
This feature is still in preview state.
The pipeline backend has only whitelisted certain subscriptions; please contact us if you would like to use this feature.
Improvements:
Feature 1665098: Improve error handling of component creation: e.g. invalid component yaml
Bug 1552709: Improve error message: CLI is failing when component display_name starts with square brackets
Feature 1664789: Set default values for runsettings for a component in component spec
added support in ScopeComponent yaml: adla_account_name, scope_param, custom_job_name_suffix
added support in ScopeComponent runsettings: scope.runsettings.priority
Task 1701993: Add extras require: az that specifies the compatible azure-cli version
Bug Fixes:
Bug 1689646: Component cache sometimes raises “cannot find file specified”
Bug 1682406: Notebook run wait for completion show graph raise AttributeError: ‘PipelineResponse’ object has no attribute ‘response’
Bug 1682635: Component created log is empty
v0.9.6 (2022.02.28)
This release contains features like registry support, component create cli overrides, and dsl.pipeline build & submission perf improvement.
Features:
Feature 1295350: Support Create & Consume component in registry
CLI reference: CLI cheat sheet
Supported Component Types: Command, Distributed, HDInsight, Scope, Parallel, Sweep.
Not Supported Component Types: Pipeline.
Feature 1586711: Expose is_deterministic on pipeline for subgraph reuse: dsl.pipeline(is_deterministic=True).
See reference doc
This feature is still in preview state, please use with caution.
Feature 1613065: Component CLI support override tags during component creation
See reference doc
Improvements:
Feature 1573898: Improve pipeline submission time by 2x (60s for PW OFE)
2.5x E2E time optimized relative to Base line (148s -> 60s)
Feature 1616998: Improve mechanism to get dsl.pipeline local variables name
Build time 2x improvement to previous version (13.01s -> 5.49s)
Task 1582487: Cache registered sub pipeline in local
Submission stage from 96.87s to 46.7s, tested with pw.pw_ofe on a 4-core VM in West US.
Users can disable the pipeline disk cache by setting the environment variable AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE='False'.
AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE_EXPIRE_SECONDS configures the expiration time of cached pipelines in seconds.
Pipeline and component cache storage path:
cmd: %TEMP%/azure-ml-component/<SDK_VERSION>
powershell: $env:TMP/azure-ml-component/<SDK_VERSION>
Feature 1590658: Improve error message for generate_package
Feature 1575813: Change generate package function doc-string style to google style
Feature 1564982: Rest client change to adapt to open API 3.0 swagger
Bug Fixes:
Bug 1617225: GPT3 components fail to upload via azure-ml-component 0.9.4 : “The dict contain an unexpected keyword ‘meta’”
Bug 1610031: Incorrect connecting edges when component output has camel case, e.g. tsvFile
Bug 1649889: Port name of edge for data node should be empty string
Bug 1617306: az-ml export: Run object has no attribute display_name when export code
Bug 1621489: Improve error handling when the user incorrectly sets regenerate_outputs as regenerate_output
Bug 1615310: Pipeline submit raises validation error: Required parameter ‘runsettings.sweep.limits.max_total_trials’ not provided
v0.9.5 (2022.02.07)
This release contains features like run.resubmit/get_lineage, export/compare code cli, ae365exepool component and other improvements.
Features:
Feature 1556439: run.get_lineage(): support get lineage run from cloned run
See reference: Run.get_lineage
Feature 1538617: run.resubmit(): resubmit experiment from Python SDK (resubmit without published pipeline)
See reference: Run.resubmit
Feature 1443367: Export AML pipeline graph to source code
See document: export-existing-pipeline-to-code
NOTE: some advanced features are not supported yet, e.g. parameter group.
Feature 1436690: Pipeline graph Snapshot Code Comparison including node/component comparison
See document: compare-existing-pipeline-as-code
Feature 1565879: Release AE365ExePool Component and documentation
See reference doc.
See github sample.
Improvements:
Feature 1529495: Improve generate package performance from 10s to 2s
dsl.generate_package will detect the modification of component spec file in the asset when force_regenerate=False. If the component spec files in the asset are not modified, it will reuse the generated component module.
Feature 1586739: Reduce pipeline submission time by 20% (pw_ofe):
Feature 1586739: Disk cache the runsetting parameters and anonymous components to reduce submission time
Users can disable the component disk cache by setting the environment variable AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE='False'.
AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE_EXPIRE_SECONDS configures the expiration time of cached components in seconds.
In generate_package, set force_regenerate=False to detect the modification of component spec files to shorten the generation time.
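The cache environment variables named above can be set from Python as well as from the shell. The sketch below is only an illustration: configure_component_cache is a hypothetical helper (not part of the SDK), and it assumes the variables are read from the process environment, so they should be set before the SDK is used. Only the 'False' value comes from this change log; to re-enable the cache the helper simply removes the variable rather than guessing another value.

```python
import os
from typing import Optional

# Variable names as given in this change log.
CACHE_SWITCH = "AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE"
CACHE_EXPIRE = "AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE_EXPIRE_SECONDS"


def configure_component_cache(enabled: bool, expire_seconds: Optional[int] = None) -> None:
    """Hypothetical helper: set the cache variables for the current process."""
    if enabled:
        # Assumption: the cache is on by default, so unsetting restores it.
        os.environ.pop(CACHE_SWITCH, None)
    else:
        # 'False' is the documented value for disabling the disk cache.
        os.environ[CACHE_SWITCH] = "False"
    if expire_seconds is not None:
        # Environment values must be strings.
        os.environ[CACHE_EXPIRE] = str(expire_seconds)


# Disable the disk cache for this process.
configure_component_cache(enabled=False)
```

Calling configure_component_cache(enabled=True, expire_seconds=3600) would instead leave the cache on and request a one-hour expiry; the 3600 value is illustrative only.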
Bug Fixes:
Bug 1572473: Running a registry component locally raises “The filename, directory name, or volume label syntax is incorrect”
Bug 1583720: Parameter Group Parameter String Interpolation
Bug 1551908: SDK UX some color fallback to default color under dark theme
v0.9.4 (2022.01.04)
This release contains features like dsl.command_component, multi-level parameter group, and other improvements.
Features:
Feature 1541400: Support dsl.command_component
See the document at define-components-with-python-function.
See the sample at how-to-use-dsl-component.
Feature 1529967: Support multi-level parameter group, see the updated sample how-to-use-parameter-group.
Improvements:
dsl.generate_package:
Feature 1486648: Config override support for functions generated in component package
Performance improvement
The numbers below were measured using 7 assets with a total of 150 components.
dsl.generate_package time is reduced 10x: from 119s to 10s
Feature 1425453: Better logging to help diagnostic issues like pipeline build/submission perf issue
Change the log format of Component SDK to [<timestamp>][azure.ml.component][<level>] - <message>.
See reference doc for setting the logger level of Component SDK
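As a rough illustration of the log format described for Feature 1425453, the sketch below reproduces its shape with Python's standard logging module. It assumes, without confirmation from this change log, that the SDK logs through a standard logger named azure.ml.component; the helper names are hypothetical, and the reference doc remains the authority on how to set the level.

```python
import logging

# Assumption: the SDK uses a standard-library logger with this name,
# taken from the middle field of the documented log format.
LOGGER_NAME = "azure.ml.component"

# Mirrors "[<timestamp>][azure.ml.component][<level>] - <message>".
LOG_FORMAT = "[%(asctime)s][%(name)s][%(levelname)s] - %(message)s"


def set_component_log_level(level: int) -> logging.Logger:
    """Hypothetical helper: set the level on the assumed SDK logger."""
    logger = logging.getLogger(LOGGER_NAME)
    logger.setLevel(level)
    return logger


def format_example(message: str, level: int = logging.INFO) -> str:
    """Render one record in the documented layout without emitting it."""
    record = logging.LogRecord(LOGGER_NAME, level, "example.py", 0, message, None, None)
    return logging.Formatter(LOG_FORMAT).format(record)


logger = set_component_log_level(logging.DEBUG)
line = format_example("pipeline build started")
```

Here line holds a string of the form [<timestamp>][azure.ml.component][INFO] - pipeline build started, which is handy when writing filters or parsers for the SDK's log output.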
Feature 1553591: Expose get_portal_url and id of PipelineRun as public
Feature 1538101: Support configuring description and tags in registering AML datasets via register_as()
Task 1530999: Support python 3.9 in setup.py
Feature 1536317: Notebook UX Widget have same outline panel in workspace portal
Support MSAL(Microsoft Authentication Library)
azureml-core supports MSAL starting from v1.37.0. For more details see the azureml-core change log.
Component CLI version >=0.9.4 picks up the change and works with Azure CLI >=2.30.0. Note that previous versions of the Component CLI only work with az cli version < 2.30.0.
Note: If you run into any issue with the new auth mechanism, you can switch back as follows:
SDK: pip install "azureml-core<1.37.0"
CLI: use a previous Component CLI version or az cli version (< 2.30.0)
Bug Fixes:
Bug 1521648: Whole notebooks are uploaded as a snapshot when executing a component local run.
Bug 1505810: Didn’t throw error during validation stage when required parameter missing valid value
Bug 1514600: Throw exception when user specify duplicate node name
Bug 1534319: Workspace independent sweep component output ports error if renamed
Bug 1532755: Invalid runsetting warning for workspace independent pipeline
Bug 1536100: az ml component show does not show args for scope component
Bug 1537884: Output not json serializable when calling pipeline.validate()
Bug 1557445: Duplicate outputs added in parameters when calling pipeline_endpoint.submit()
Bug 1550138: SDK Notebook widget API call is much slower than portal
Bug 1563762: All files with .additional_includes inside component snapshot will stop uploading & downloading
v0.9.3 (2021.11.30)
This release contains feature Hemera/Starlite component and other improvements.
Features:
Hemera component SDK experience
See component doc.
See github sample.
Starlite component SDK Experience
This feature is in private preview; please try it out and share feedback with us.
See sample notebook.
Improvements:
Feature 1424136: Improve component.from_yaml experience through adding an uploading bar for uploading snapshot
Feature 1424045: Improve component list performance in Component CLI
Average and p90 latency is high for list components
Feature 1526350: Component SDK support for future new types of workspace independent Component: Hemera/Starlite
Feature 1466311: dsl.component support workspace independent experience
The new dsl component sample notebook.
Bug Fixes:
Bug 1514651: AML_COMPONENT_REGISTRATION_MAX_WORKER could only be set as string, which will cause thread pool creation error
Bug 1504152: [CLI] Improve error handling when load from yaml: “error_message”: “‘args’”
Bug 1521590: ParameterAssignment validation error: “Only parameter can be referenced”
Bug 1519963: DSL Validation should not throw error for optional target run settings
Bug 1517477: Pipeline component lost port type id list
Bug 1514481: Error category when load workspace independent from yaml with file not found
Bug 1518650: Error category when group parameter missing some attribute
v0.9.2 (2021.11.15)
This release contains feature dsl.generate_package and bug fixes.
Features:
dsl.generate_package: generate python stub code to support static intellisense
See reference doc for import experience
Feature 1475886: Support passing primary_metric (not hard-coded in the sweep spec yaml file) in the runsetting
Feature 1486877: Support “link” output mode in Component SDK
Improvements:
Performance improvement for large graphs:
The numbers below were measured using a graph with 20K nodes / 4-level subgraph
dsl.pipeline instance build time: 1.8x improvement
Feature 1488878: Cache component environment to avoid duplication getting environment from workspace
Feature 1471963: Apply user defined name to for-loop created components and user can assign name by using node.node_name = ‘a’
Bug Fixes:
Bug 1469528: The pipeline parameter is deleted, if it’s not used.
NOTE: reverted in 0.9.2.post2
Bug 1482927: Client side validation on no duplicate inputs/output names & print component/pipeline name when parallel creation failure.
NOTE: this change will break some existing scope component with same name for inputs/outputs
Bug 1471425: Input type in generate package annotation should contain Output type.
Bug 1476544: pipeline._run can’t match pipeline parameters when multi-layer sub-pipeline exists
Bug 1483036: Update runsettings failed with ‘xx not found in pipeline parameter’ after workspace independent component registration
Bug 1483042: When load workspace independent component from yaml with unexpected key, the exception classification is wrong.
Bug 1483220: The default value of the component parameter is lost after registering the workspace independent component.
Bug 1506028: Azureml-core deleted ruamel.yaml dependency, add this in Component SDK
Bug 1512867: Use the environment variable AML_COMPONENT_REGISTRATION_MAX_WORKER to limit the max workers of the component registration thread pool, avoiding lots of retry requests when validating/submitting a pipeline with workspace independent components.
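Environment variables are always strings, which is likely why Bug 1514651 (v0.9.3) reports a thread pool creation error for AML_COMPONENT_REGISTRATION_MAX_WORKER. The sketch below shows one way a string setting can be converted before it sizes a thread pool. It is not the SDK's implementation: read_max_workers is a hypothetical helper, and the squaring job only stands in for registration work.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Optional

ENV_NAME = "AML_COMPONENT_REGISTRATION_MAX_WORKER"


def read_max_workers(default: Optional[int] = None) -> Optional[int]:
    """Hypothetical helper: parse the variable into a positive int."""
    raw = os.environ.get(ENV_NAME)
    if raw is None or not raw.strip():
        return default
    value = int(raw)  # raises ValueError for non-numeric text
    if value < 1:
        raise ValueError(f"{ENV_NAME} must be a positive integer, got {raw!r}")
    return value


# The value is set as a string, as any environment variable must be.
os.environ[ENV_NAME] = "4"

# ThreadPoolExecutor expects an int (or None) for max_workers.
with ThreadPoolExecutor(max_workers=read_max_workers()) as pool:
    results = list(pool.map(lambda n: n * n, range(5)))
```

Passing the raw string straight to ThreadPoolExecutor would fail when the pool validates max_workers, so the conversion has to happen first.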
v0.9.1 (2021.10.09)
This release contains features like workspace independent component, pipeline run display name, and runsetting features such as pipeline component target runsetting override and runsetting environment override.
And starting from this version we use v1 SDK style version names, starting from 0.9.1. For Component SDK, the install command does not change. For Component CLI, you need to update the extra index URL inside the install command, see here for more information. For example, the original install command to install CLI was:
az extension add --source https://azuremlsdktestpypi.blob.core.windows.net/wheels/modulesdkpreview/azure_cli_ml-0.1.0.44094775-py3-none-any.whl --pip-extra-index-urls https://azuremlsdktestpypi.azureedge.net/CLI-SDK-Runners-Validation/44094775 --yes --verbose
You need to change it to:
az extension add --source https://azuremlsdktestpypi.blob.core.windows.net/wheels/componentsdk/azure_cli_ml-0.9.1-py3-none-any.whl --pip-extra-index-urls https://azuremlsdktestpypi.azureedge.net/componentsdk/0.9.1 --yes --verbose
Or, if you want to use preview version(not recommended), change it to:
az extension add --source https://azuremlsdktestpypi.blob.core.windows.net/wheels/modulesdkpreview/azure_cli_ml-0.1.0.48138292-py3-none-any.whl --pip-extra-index-urls https://azuremlsdktestpypi.azureedge.net/modulesdkpreview/0.1.0.48138292 --yes --verbose
Features:
Pipeline submit/validate with workspace independent component
Support not specifying a workspace in Component.from_yaml(yaml_file=xxx) to create a workspace independent component
Specify the workspace for a pipeline with workspace independent components in pipeline.submit(workspace=ws) or pipeline.validate(workspace=ws)
Pipeline run display name
Support setting the pipeline run display_name in pipeline.submit(display_name='pipeline_name')
dsl.pipeline display_name will by default be the run display name
Runsetting:
Environment of component can be overridden at runtime
See reference doc for override experience
Support runsettings.target="cluster-name" for pipeline component
PipelineParameter: Support dynamic pipeline parameter (**kwargs) in dsl.pipeline
See reference doc for Dynamic PipelineParameter
Improvements:
Performance improvement for large graphs:
The numbers below were measured using a graph with 20K nodes / 4-level subgraph
60s dsl.pipeline instance build time: 5x improvement (300s in previous version)
30s pipeline.submit time: 10x+ improvement (hangs in previous version)
Refined following API interface(old interface is still supported but without intellisense):
azure.ml.component.Component.regenerate_output -> regenerate_outputs
azure.ml.component.component.Output.configure(output_mode=None) -> (mode=None)
Refined Reference doc:
Add document for notebook visualization support
Update recommended schema url to be reachable links, Example: $schema: https://componentsdk.azureedge.net/jsonschema/CommandComponent.json
Refine samples to add -> Pipeline output annotation for dsl.pipelines: for better intellisense
Feature 1421055: Environment create fails due to “variables” key
Bug Fixes:
Bug 1327294: Change anonymous PipelineComponent name and re-create should have new component id
Bug 1327295: Export to code does not export datastore name for scope component
Bug 1327299: Export to code gives incorrect port name for scope component: output2 -> Output2
Bug 1327390: Pipeline component’s name is an empty string. Eg: _ = func()
Bug 1329298: Pipeline submission hangs when nodes 20K+
Bug 1334680: InternalSDKError for SweepComponent when sweep_spec_relative_path does not startwith base path
Bug 1407405: Failed to register pipeline component with exception: “item with same key has already been added” (due to duplicate dataset node in graph)
Bug 1417022: Additional includes for sweep component is not properly supported for “az component build”
Bug 1355478: Raise KeyError when get the package name from frame
Bug 1329456: Scope Component node inputs ports takes wrong output data port from previous node
Bug 1433486: Renamed pipeline output will set outputSettings useGraphDefaultDatastore as true
Bug 1431365: PipelineComponent Create should not add parameter name to output setting for pipeline output
v0.1.0.44094775 (2021.08.19)
This release contains features like PipelineComponent (Subgraph), Parameter Group, Non-PipelineParameter.
Features:
Pipeline Component (Subgraph) SDK Experience
This feature is in private preview, please try it out and share feedback with us.
See pipeline component reference doc.
New Component.create API to create a dsl.pipeline function as PipelineComponent.
See sample notebook.
Yaml support is not in scope of this release.
Parameter Group
See sample notebook
Non-PipelineParameter
This feature supports building pipelines dynamically in SDK client side
See sample notebook
Bugs Fixes:
Bug 1276298: Pip upgrade cannot upgrade the package to latest version when using pip version 21.1.x
The next pip version will fix this; meanwhile, please use a lower pip version to make the command in the getting started page work.
Bug 1296880: Export to code should keep node name info
Bug 1302968: Component function docstring should be python native type
v0.1.0.42428082 (2021.07.30)
This release contains feature improvements on pipeline output pipeline parameter, interactive debug and bug fixes.
Features:
Support override pipeline outputs’ datastore and path_on_datastore during submit using pipeline_parameters.
Interactive debug component run in remote compute:
This is an early preview feature, please try it out and feel free to share feedback with us.
Please see reference doc on how to use it.
Bugs Fixes:
Bug 1224648: Sweep Output not show correct path of best child run in UI
Bug 1259740: Graph to sdk code exported target selector’s setting type not right
Bug 1276686: Graph to sdk code sub pipeline’s output port name incorrect
Bug 1247178: No error appears when configured invalid datastore inside dsl
Bug 1248830: FileNotFoundError: [WinError 3] The system cannot find the path specified when creating component
Bug 1220886: ImportError: cannot import name ‘SNAPSHOT_MAX_FILES’ from ‘azureml._restclient.constants’
Bug 1256975: Error Handling Improve: The filename, directory name, or volume label syntax is incorrect
Bug 1243626: Pipeline._endpoint submit() parameters: ‘str’ object has no attribute ‘items’
v0.1.0.40555082 (2021.07.02)
This release contains feature improvements on Runsetting, Sweep Component and validate logic.
Features:
Runsetting:
Feature 1185978: The component SDK should allow users to configure Docker configurations like: shared memory size
Sweep Component:
Feature 1186591: Sweep Component to run on a registered component rather than yaml files
Feature 1198701: User-defined outputs of the best child run is accessible at parent Sweep component run
Improvements:
Improve validate logic:
Task 1188667: Pipeline.validate does not reveal compute target not set error
Improve reference doc site:
Task 1169017: Improve reference doc to make it friendly for customers who previously work on Aether/ITP
See the Benefits of using Component SDK and Introduction to AzureML sections in the Overview page
Task 1197897: Add document explaining the component run reuse logic
Bugs Fixes:
Bug 1178205: The component SDK should set spark.precache_package to false in default runconfiguration
Bug 1185427: Component.from_yaml should return same component id given same code snapshot for ParallelComponent
Bug 1189018: dsl.pipeline not correctly handle limited depth of recursion
Bug 1196832: pass **kwargs as parameter to dsl pipeline should raise exception
Bug 1199481: Compute validate should block non ADF compute target for DataTransferComponent
Bug 1188833: Invalid experiment name raises weird error message which user could not understand
Bug 1188832: Pipeline is submitted but an exception is raised when some node doesn’t have a compute target
Bug 1204845: ruamel.yaml should not use deprecated api “ruamel.yaml.safe_load”
Bug 1225252: CLI component create hangs “Failed to flush task queue within 600 seconds”
Bug 1166003: .amlignore does not work for folders in additional_includes
Bug 1217723: run.wait_for_completion() complains about missing data-prep
Bug 1208261: sweep component forces quniform to float, not usable with Bayesian+
v0.1.0.38576839 (2021.05.24)
This release contains feature improvements on Runsetting & Pipeline Parameter, and provides how-to-guides to setup job instance as interactive dev environment in ITP.
Features:
Runsetting:
Task 1128034: Support GJD submission in SDK published pipeline
Task 1048193: Support configure priority in runsetting
Pipeline parameter:
Feature 882845: Support runsetting config as pipeline parameter
Only a subset of the runsettings is supported when resubmitting from a pipeline run, see work-with-pipeline-parameters in runsettings doc
Task 1130389: Support output of dsl.pipeline as pipeline parameter
Improvements:
Improve component snapshot building speed by 8x (8min -> 1min) for deeprank scenario when files are located in ADLS:
Feature 1166175: [Component SDK/CLI]Make snapshot creation fast on remote file systems(ADLS)
Improve reference doc site:
Feature 1147069: Describe how to interactive development in ITP job instance with component SDK.
Bug 1183657: ParallelComponent 1.5 is poorly documented on predefined arguments and interfaces
See the revamped ParallelComponent doc
Improve runsettings:
Task 1143225: Support set runsettings using dictionary type
See doc on set runsettings using dictionary type
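To give a feel for what setting runsettings from a dictionary can look like, the sketch below copies a nested dict onto an attribute-style settings object. It is only an illustration of the idea under stated assumptions: apply_settings is a hypothetical helper, the settings tree is built from SimpleNamespace rather than the SDK's runsettings object, and the field names are placeholders borrowed from this change log.

```python
from types import SimpleNamespace


def apply_settings(settings, values):
    """Recursively copy a nested dict onto an attribute-style settings object."""
    for key, value in values.items():
        if not hasattr(settings, key):
            raise AttributeError(f"unknown setting: {key}")
        current = getattr(settings, key)
        if isinstance(value, dict) and isinstance(current, SimpleNamespace):
            # Nested dicts map onto nested setting groups.
            apply_settings(current, value)
        else:
            setattr(settings, key, value)
    return settings


# Hypothetical settings tree; the field names are placeholders for illustration.
runsettings = SimpleNamespace(
    target=None,
    priority=None,
    resource_layout=SimpleNamespace(instance_type=None, instance_count=1),
)

apply_settings(
    runsettings,
    {"target": "cluster-name", "resource_layout": {"instance_count": 4}},
)
print(runsettings.target, runsettings.resource_layout.instance_count)
```

Rejecting unknown keys early keeps a typo in the dictionary from being silently ignored.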
Refine validation experience:
Task 1126983: Postpone output.configure datastore error thrown to validation stage
Bugs Fixes:
Bug 1143740: [Component CLI] Sweep yaml cannot reference training yaml in a different folder
Bug 1165904: [Component CLI] User managed deps not passed correctly for Sweep Component
Bug 1152832: [Component CLI] component create hangs “Failed to flush task queue within 300 seconds”
Bug 1190062: [Component CLI] Component CLI Failed with obscure WindowsPath error
Bug 1166003: [Component CLI] Top level .amlignore does not work for folders in additional_includes
Bug 1178004: [Component SDK] Local runs of PRS components fail with obscure error
Bug 1167599: [Component SDK] AttributeError: module ‘os’ has no attribute ‘R_ok’
Bug 1164711: [Component SDK] k8srunsetting configuration not work when target_selector is used
Bug 1183773: [Component SDK] resource_layout.instance_type should work when specify gpu count and cpu count for GJD jobs
Bug 1196513: [Component SDK] from_yaml stuck with RuntimeError when no main module is defined in Windows
v0.1.0.36279725 (2021.04.23)
This release contains improvements of component types like SweepComponent, ScopeComponent.
Features:
Environment:
Task 1059940: Support reference docker file of environment in component spec file
Runsetting:
Task 1048196: Support resource_layout.instance_type/instance_count to better express resource requirement for ITP
Task 1048200: Support target_selector runsetting to integrate with GJD
Distributed component
Task 1123753: Support pytorch launcher type Distributed Component
Scope component
Task 1123754: Support dynamic resource in scope component
Sweep component
Task 1123751: Support using azureml.train.hyperdrive package contract to set hyperparameter expression
IO setting:
Task 1059783: Improve output.configure() performance: support datastore name as parameter
Feature 781165: Support component node comment
Papercut 1068384: Module display names not unique
Improvements:
Refine validation experience:
Task 1106972: Support components.inputs.input0 = some_dataset
Task 1106979: Print clear error messages when pipeline.validate()
Task 1120591: Add validation that dataset & component are from the same workspace
Improve reference doc site:
Task 1062884: Easy contributing to SDK 1.5 reference doc: add “Edit in DevOps” button to the reference doc site
Increase component snapshot file size limit to 2GB when creating components:
Task 1128383: Increase component snapshot file size limit to 2GB
Bugs Fixes:
Bug 1093483: Exe pool style command does not handle {, } when they come as part of input.
Bug 1092886: Dataset created in Heron workspace use DataFrameDirectory, but did not register that data_type causing pipeline not start
Bug 1119030: run.wait_for_completion cannot raise ActivityFailedException because the pipeline run details has no ‘error’
Bug 1073425: Pipeline_validate raises “TypeError: Object of type ‘FileDataset’ is not JSON serializable”
Bug 1121405: Sweep Component Bandit policy default runsetting not handled correctly
Bug 1126869: Sweep Component throw 400 after submission: Failed to get required parameter from platform_config: ‘Definition.Overrides.Script
Bug 1126829: Sweep Component: SDK did not print clear error message when user setting a hyperparameter with invalid value
Bug 1136705: local submit AML pipeline: cannot submit experiment with similar names included in package
v0.1.0.34049888 (2021.03.23)
This release contains support of new component types like SweepComponent, DataTransferComponent.
Features:
-
First preview version
-
First preview version
Scope component
Support OBO token and update document
Input:
Support configure mode and path_on_compute of component’s input.
Command component
Support positive return code by defining successful_return_code
Support exepool style command.
Support using CommandComponent@1-preview to execute the component as command not python runpy.
Support above features in component.run
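The idea behind treating a positive return code as success can be sketched with the standard library alone. In the example below, run_command and its successful_return_codes parameter are hypothetical names chosen for illustration; the accepted values and exact semantics of the real successful_return_code field are defined by the component spec and are not reproduced here.

```python
import subprocess
import sys


def run_command(command, successful_return_codes=(0,)):
    """Run a command and treat any listed return code as success."""
    completed = subprocess.run(command)
    if completed.returncode not in successful_return_codes:
        raise RuntimeError(
            f"command failed with return code {completed.returncode}"
        )
    return completed.returncode


# A child process that exits with a positive, but expected, return code.
exit_with_three = [sys.executable, "-c", "import sys; sys.exit(3)"]

print(run_command(exit_with_three, successful_return_codes=(0, 3)))
```

With the default of (0,), the same command would raise RuntimeError, which mirrors the usual behavior when no extra return codes are declared.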
Improvements:
Improve export-to-code feature:
Support input/output configurations
Add reference doc
Improve document
Refined environment doc
Refined Distributed component doc: add example yaml
Include CLI cheat sheet in reference doc
Improve error categorization logic for:
exceptions in dsl.pipeline user code
Keyboard Interrupt in certain cases
SnapshotException
Fundamental
improved telemetry to track api performance
improved telemetry to track count of visualize in notebook
Bugs Fixes:
Bug 1047025: Jobs submitted using the component SDK do not properly interpolate inputs when there is no whitespace around the moustaches
Bug 1030885: When setting compute target in runsetting, cannot find newly created compute
Bug 1052993: Optional params don’t work with ScopeScript
Bug 1061444: In multi level pipeline, default datastore/compute in inner pipeline is overwritten by the outer pipeline.
Bug 999088: Can not support Enum type as pipeline parameter
Bug 1073427: Exception raised when component input is not reachable in scope of current pipeline
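Bug 1047025 above concerns placeholders written without whitespace inside the moustaches. A whitespace-tolerant substitution can be sketched with a regular expression, as below. The placeholder syntax {inputs.name} and the helper interpolate are assumptions made for this example, not the SDK's actual command grammar or code.

```python
import re

# Matches {inputs.name}, { inputs.name }, and any mix of inner whitespace.
PLACEHOLDER = re.compile(r"\{\s*inputs\.(\w+)\s*\}")


def interpolate(command, inputs):
    """Replace each placeholder with the matching input value."""

    def substitute(match):
        name = match.group(1)
        if name not in inputs:
            raise KeyError(f"no value provided for input '{name}'")
        return str(inputs[name])

    return PLACEHOLDER.sub(substitute, command)


command = "python train.py --data {inputs.data} --epochs { inputs.epochs }"
print(interpolate(command, {"data": "/mnt/data", "epochs": 5}))
```

Making the inner whitespace optional in the pattern means both spellings resolve to the same value.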
Known Bugs: This version was yanked from PyPI because of the bug below.
Bug 1245181: docker_configuration.arguments, expected type: ‘list’ or ‘tuple’, actually ‘str’
v0.1.0.31132438 (2021.02.09)
This release contains support of new component types like DistributedComponent, ScopeComponent.
Features:
Support MPI launcher type Distributed Component
Support Scope Component
Preview quality version with some planned features not supported, like: dynamic resource.
Only Dataset on ADLS is supported as component’s input.
Runsetting: Support runsettings.environment_variables, see runsettings doc
Environment: Support reference an existing environment by name in component yaml
Output: Support configure path_on_datastore of component’s output.
Add doc page on input/output configure behavior.
PipelineParameter: Support use pipeline parameter as substring in a component parameter
Component.Run: Support run arbitrary command component locally. For more details, please refer to component.run doc
Improvements:
Fundamental
Improve snapshot creation logic
Improve performance by 2x in AML Notebooks (which have low file system performance when based on Azure File Share)
Add doc to help users troubleshoot code snapshot issues
Added debug info for az ml component create/build; users can check it by adding the --verbose parameter in CLI.
Supports snapshot cache for all components (previously, components with additional includes or amlignore files were not supported).
Supports recursive ignore files in component code snapshot.
Improve validation logic of local/remote run
Improve error handling, correct category of exceptions
Remove fields(source, contact, helpDocument, shared_scope) in CLI outputs to align with UX
Improve pipeline export yaml with support for output configurations
Bugs Fixes:
Bug 1005863: Output.configure(mode='download') should take effect
Bug 1008265: Optional boolean argument is set as False rather than not specified in AML
v0.1.0.29596699 (2021.01.15)
This release contains support of new component types like CommandComponent, ParallelComponent, HDInsightComponent.
Features:
Provide component SDK json schema to help authoring yaml spec, doc: use-component-spec-schema-in-vscode.md
Support Any CommandComponent
Sample1: how-to-use-command-component.ipynb
Sample2: how-to-use-r-command-component.ipynb
This feature is still in preview, has known backend limitations and is not recommended for production use yet.
Commands starting with python work the same as the previous basic module.
Support ParallelComponent SDK 1.5 component yaml
Support HDInsightComponent SDK 1.5 component yaml
Support az ml component build: users can get a local component snapshot and do some processing like code-sign before create.
Doc: component-build.md
Sample: how-to-use-component-cli.ipynb
Components created using the new component yaml can be shown correctly in the Pipeline Draft Module Tree and Modules Management page of the workspace portal UX. Add &flight=cm in the browser url to enable this flight feature.
Improvements:
Improve CLI
Refine CLI help docs, set more meaningful description to each command
Align CLI Error handling with other cli subgroups
Fundamental
Improve setup-environment doc by adding vscode devcontainer option
Test on multiple platforms (windows, Linux, Mac) * Python (3.6, 3.7, 3.8)
Regular CI targets multiple platforms for stable & preview sdk/cli version for notebooks on github doc repo, and adds a status badge
Improve Docs
Improved reference doc site
Add runsettings doc
Bugs Fixes:
HDInsight component runsettings.configure(target='xxx') not taking effect
Pipeline parameter not taking effect in some scenarios
v0.1.0.27532912 (2020.11.30)
First Release of azure-ml-component package
Introduce Component concept & retain only essential interface
Previous azureml-pipeline-wrapper package deprecated but still working:
API stays the same in azureml-pipeline-wrapper, Implementation is alias of azure-ml-component
Features:
Support component output register as dataset
Example: component.outputs.some_output.register_as(name="dataset_name", create_new_version=True)
Support environment.conda.pip_requirements_file in component yaml spec
Improvements:
Improve component create performance
Cache last time snapshot locally
Detect snapshot change: delta update snapshot if changed, reuse last snapshot if unchanged
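Change detection of this kind is often done by hashing the snapshot folder and comparing against a cached digest. The sketch below shows only that detection step, not the delta upload, and it is an assumption about the general approach rather than the SDK's implementation; snapshot_digest and needs_upload are hypothetical helpers.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot_digest(folder):
    """Hash the relative path and contents of every file under folder."""
    digest = hashlib.sha256()
    root = Path(folder)
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_upload(folder, cached_digest):
    """Return (changed, digest); an unchanged digest means the cache can be reused."""
    current = snapshot_digest(folder)
    return current != cached_digest, current


with tempfile.TemporaryDirectory() as folder:
    script = Path(folder) / "train.py"
    script.write_text("print('v1')\n")
    _, cached = needs_upload(folder, cached_digest=None)

    changed_before_edit, _ = needs_upload(folder, cached)
    script.write_text("print('v2')\n")
    changed_after_edit, _ = needs_upload(folder, cached)

print(changed_before_edit, changed_after_edit)
```

Sorting the paths before hashing keeps the digest stable regardless of the order in which the file system lists files.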
Improve dsl.pipeline
Improve dsl.pipeline performance for complex graph
Refactor how dsl.pipeline build pipeline component definition
Improve CLI
Dynamic loading az ml component subgroup using entrypoint technique
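If the entrypoint technique mentioned above refers to Python packaging entry points, discovery can be sketched with importlib.metadata as shown below. This is an assumption about the mechanism, not a description of how the CLI actually loads its subgroup; discover and load_plugins are hypothetical helpers, and the group name passed in the usage line is just a commonly available one.

```python
from importlib.metadata import entry_points


def discover(group):
    """Return the entry points registered under a group, across Python versions."""
    all_entry_points = entry_points()
    if hasattr(all_entry_points, "select"):
        # Python 3.10+ exposes a selectable collection.
        selected = all_entry_points.select(group=group)
    else:
        # Python 3.8 and 3.9 return a plain dict keyed by group name.
        selected = all_entry_points.get(group, [])
    return list(selected)


def load_plugins(group):
    """Import every entry point in a group and map its name to the loaded object."""
    return {entry.name: entry.load() for entry in discover(group)}


# List what is registered without importing anything yet.
print(sorted(entry.name for entry in discover("console_scripts")))
```

Because nothing is imported until load_plugins is called, a host program can list available plugins cheaply and load only the ones it needs.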
Improve Inputs/outputs/runsetting
Allow tabular dataset for HDI module
Validate json schema for PRS module
Allow specify non string enum in SDK
Fundamental
Improve error handling logic
Improve reference doc
Refactor: package structure, each function area code logic
Bugs Fixes:
Component.load() fails for builtin components in new workspaces
Auto provision built-in components and types in new workspace
az ml component create fails in new workspace because the port type does not exist
Auto provision known port types like: path
Fix status aggregation logic for subgraph in notebook visualization