Change Log

v0.9.18 (2023.04.06)

Features:

  • Task 2132306: Export registered subgraph as code for more portability.

  • Task 2244639: Input mode of node inside subgraph is not taking effect.

Bug Fixes:

  • Bug 2202524: Invalid run.ipynb file created by Designer’s “Export to Code” feature

  • Bug 2204741: Exporting pipeline to code fails on Unknown data node.

  • Bug 2220307: _get_target returns the wrong value for the datatransfer component.

  • Bug 2211421: object of type ‘list_reverseiterator’ has no len()

  • Bug 2211930: Edge connecting between subgraph input and if/else operator generated incorrectly in subgraph.

  • Bug 2229358: Export to code failed for news_and_feeds pipeline.

v0.9.17 (2023.02.10)

Features:

  • AetherBridge component SDK Experience

  • Feature 2152382: Add hash_version & hash to SDK 1.5 component snapshot creation

    • pipeline step with user_identity runsetting will pass runtime snapshot validation with this change

    • example: node.runsettings.identity_type = 'UserIdentity'

Improvements:

  • Task 2166895: Support runtime sweep on distributed component: relax client side check

  • Task 2172895: SDK prints a warning and skips new run settings: trial timeout seconds

    • We recommend upgrading to this version if the timeout_seconds runsetting is not working for you.

  • Task 2173007: Generate package: use the passed-in auth first when generating a package from workspace assets

Bug Fixes:

  • Bug 2126161: Sweep component: Error “LineageRoot not found for assetReference” when publishing to registry

v0.9.16 (2022.12.14)

Improvements:

  • Feature 2076640: relax validation when connecting primitive type outputs for subgraph

  • Feature 2053827: Generate package: generated code should escape dataset name

  • Task 2116942: [Doc] Highlight v1.5 spark component yaml difference with others

Bug Fixes:

  • Bug 2115141: SDK validation logic needs to support number type in Spark component

  • Bug 2087148: Component from YAML raises a type error: unsupported operand type(s) for /: ‘WindowsPath’ and ‘dict’

  • Bug 2115409: [Component Loader] Loading the component by name:version in component config failed

v0.9.15 (2022.11.15)

Improvements:

  • Feature 2070375: [Additional Includes] Add integrity check when using downloaded artifacts in the disk cache.

  • Task 1958887: Support binding a sub pipeline’s outputs with pipeline parameters.

    • Note: We don’t recommend using pipeline parameters together with binding a sub pipeline’s outputs.

    • Example: Returning sub pipeline outputs that are assigned with pipeline parameters to the root pipeline is supported.

    • In this example, sub_node’s outputs are set to static values by sub_path, sub_datastore, sub_path_on_compute, and sub_mode.

    @dsl.pipeline()
    def sub_pipeline_func(
        sub_path='sub_pipeline/defined/default',
        sub_datastore='workspacefilestore',
        sub_path_on_compute='/tmp/component',
        sub_mode='mount',
    ) -> Pipeline:
        node = component_func()
        node.outputs.output_dir.configure(
            path_on_datastore=sub_path,
            datastore=sub_datastore,
            path_on_compute=sub_path_on_compute,
            mode=sub_mode,
        )
        return node.outputs

    @dsl.pipeline()
    def root_pipeline() -> Pipeline:
        sub_node = sub_pipeline_func()
    
    • Note: Setting sub_node’s outputs with pipeline parameters in the root pipeline is not supported.

    • In this example, the job will throw an error.

    @dsl.pipeline()
    def sub_pipeline_func() -> Pipeline:
        node = component_func()
        return node.outputs

    @dsl.pipeline()
    def root_pipeline(
        path='custom/defined/path',
        datastore='workspacefilestore',
        path_on_compute='/tmp/component',
        mode='mount',
    ) -> Pipeline:
        sub_node = sub_pipeline_func()
        sub_node.outputs.output_dir.configure(
            path_on_datastore=path,
            datastore=datastore,
            path_on_compute=path_on_compute,
            mode=mode,
        )
    
  • Task 2068921: Generated code should exclude pipeline runsetting attributes from the pipeline.submit function params.

    • The following job uses runsettings; you can export code from this job.

    pipeline = root_pipeline()
    pipeline.runsettings.default_compute = 'aml-compute'
    pipeline.runsettings.priority.scope = 900
    pipeline.runsettings.priority.compute_cluster = 800
    pipeline.runsettings.timeout_seconds = 500
    pipeline.runsettings.continue_on_failed_optional_input = True
    pipeline.runsettings.default_datastore = 'workspacefilestore'
    pipeline.runsettings.force_rerun = False
    pipeline.runsettings.continue_on_step_failure = False
    
    pipeline_run = pipeline.submit(
        timeout_seconds=400,
        continue_on_failed_optional_input=False,
        default_compute='cpu-cluster',
        default_datastore='workspaceworkingdirectory',
        force_rerun=True,
        continue_on_step_failure=True)
    

Bug Fixes:

  • Bug 1920193: Unused parameters now fall back to optional Input instead of optional string.

  • Bug 2065379: Tag change should not trigger component registration when using --skip-if-no-change.

  • Bug 2058915: The command “az-ml run debug” raises “NameError: name ‘args’ is not defined”.

  • Bug 2034695: Registering a workspace independent component with artifacts additional includes raises “[Errno 2] No such file or directory”. This is caused by an artifacts issue; the SDK avoids it by retrying.
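
The retry workaround mentioned for Bug 2034695 can be pictured with a minimal sketch. The helper name `retry_on_missing_file`, the attempt count, and the delay are illustrative assumptions; this is not the SDK's actual retry logic.

```python
import time


def retry_on_missing_file(action, attempts=3, delay_seconds=0.0):
    """Call `action` until it succeeds or the attempts are used up.

    Hypothetical helper: only FileNotFoundError ([Errno 2]) is retried.
    """
    last_error = None
    for _ in range(max(1, attempts)):
        try:
            return action()
        except FileNotFoundError as error:
            last_error = error
            time.sleep(delay_seconds)
    raise last_error


# Usage: an action that fails twice with [Errno 2], then succeeds.
calls = {"count": 0}


def flaky_copy():
    calls["count"] += 1
    if calls["count"] < 3:
        raise FileNotFoundError(2, "No such file or directory")
    return "copied"


result = retry_on_missing_file(flaky_copy)
print(result, calls["count"])
```

Retrying only on `FileNotFoundError` keeps unrelated failures visible instead of masking them behind repeated attempts.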

v0.9.14 (2022.10.18)

Features:

  • Feature 1932709: [Smart mode] Do not create a new component version if the underlying code/contents have not changed compared with the default/latest version

    • Example: az ml component create --file component.yaml --skip-if-no-change

  • Feature 1754412: Support Spark component in SDK 1.5

    • Example: sample notebook

    • Doc: Spark Component

    • NOTE: This is a private preview feature; we strongly recommend using the v2 Spark component directly, which is already in public preview. See v2 sample: link

  • Feature 1984480: Ignore __pycache__ by default to avoid reuse issues

  • Feature 1959001: Support setting force_rerun and continue_on_step_failure via pipeline.runsettings

    • Example:

      pipeline.runsettings.force_rerun = False
      pipeline.runsettings.continue_on_step_failure = False
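
The idea behind skipping registration when contents are unchanged (Feature 1932709) and ignoring cache folders (Feature 1984480) can be sketched with a content fingerprint. This is an illustration under stated assumptions: the function name `folder_fingerprint` and the choice of SHA-256 are hypothetical, and the SDK may detect changes differently.

```python
import hashlib
import os
import tempfile


def folder_fingerprint(root, ignored_dirs=("__pycache__",)):
    """Return a SHA-256 digest over relative file paths and file contents."""
    digest = hashlib.sha256()
    for current, dirs, files in os.walk(root):
        # Prune ignored folders in place and sort for a deterministic order.
        dirs[:] = sorted(d for d in dirs if d not in ignored_dirs)
        for name in sorted(files):
            path = os.path.join(current, name)
            relative = os.path.relpath(path, root).replace(os.sep, "/")
            digest.update(relative.encode("utf-8") + b"\0")
            with open(path, "rb") as handle:
                digest.update(handle.read() + b"\0")
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as folder:
    spec_path = os.path.join(folder, "component.yaml")
    with open(spec_path, "w") as handle:
        handle.write("name: demo\n")
    before = folder_fingerprint(folder)

    # A compiled cache file appears; it should not count as a change.
    os.mkdir(os.path.join(folder, "__pycache__"))
    with open(os.path.join(folder, "__pycache__", "run.pyc"), "wb") as handle:
        handle.write(b"\x00")
    after_cache = folder_fingerprint(folder)

    # Editing the spec should count as a change.
    with open(spec_path, "w") as handle:
        handle.write("name: demo-v2\n")
    after_edit = folder_fingerprint(folder)
```

Comparing the stored fingerprint of the latest version with a fresh one is enough to decide whether a new version is needed in this simplified model.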
      

Improvements:

  • Task 1948966: Fix exported code not containing pipeline runsettings

  • Task 1932759: The register component cache is now looked up by workspace_id and resource_id instead of workspace_name and resource_name

Known Bugs to address in this release:

  • Bug 1953955: Setting pipeline.runsettings.target sets the compute for all nodes, and pipeline.submit(default_compute_name=xxxx) does not take effect.

v0.9.13 (2022.09.06)

This release contains features like setting a timeout in pipeline runsettings, submitting a pipeline run with a parent, and Registry component Archive/Restore support.

Features:

  • Feature 1875597: Support to set timeout at pipeline level

    • Usage: pipeline.runsettings.timeout_seconds=500 or pipeline.submit(timeout_seconds=500).

    • Example of supported runsettings:

    pipeline.runsettings.priority.scope = 900
    pipeline.runsettings.priority.compute_cluster = 800
    pipeline.runsettings.timeout_seconds = 500
    pipeline.runsettings.continue_on_failed_optional_input=True
    
    pipeline.submit(timeout_seconds=400, continue_on_failed_optional_input=False)
    
    • NOTE: After timeout_seconds is set, if the run times out, the current backend behavior is only to print a warning.

  • Feature 1910909: Support submit pipeline run with parent

    • SDK Example:

    # Attach pipeline to parent pipeline by parent run id.
    pipeline.submit(parent=parent_run_id)
    # Attach pipeline to parent pipeline by parent pipeline run object.
    pipeline.submit(parent=parent_run)
    
  • Feature 1795212: Local Debugging Experience for WebXT–distributed component

  • Feature 1931232: Support Registry component Archive/Restore

Improvements:

  • Task 1839015: Improve error handling of pipeline runsettings for subgraph

    • Getting or setting a subgraph runsetting raises an error.

    • Only root pipeline runsettings, such as timeout_seconds, take effect.

    • This simplifies the default override model, which improves consistency across the system:

      • PipelineComponent will not have default runsettings in DPv2

      • Default compute is not supported when publishing a pipeline component to a registry

Bug Fixes:

  • Bug 1943175: enum: [‘true’] is loaded as enum: [True] when the parameter type is placed at the end of the parameter definition

  • Bug 1930750: Pipeline.validate does not raise the top-level component default_compute/default_datastore error

  • Bug 1949180: [Component] Deadlock when downloading artifacts fails.

  • Bug 1896581: Additional_includes of components was not shown in code section as needed.

  • Bug 1951097: Exporting a graph with a registry component to code raises “get() got an unexpected keyword argument ‘registry_name’”

  • Bug 1882723: path_on_datastore link pipeline parameter error

v0.9.12 (2022.08.07)

This release contains features like pipeline step level timeout, converting dsl.command_component to YAML, and other improvements.

Features:

  • Feature 1814084: Support pipeline step level timeout.

    • SDK Example:

      component.runsettings.timeout_seconds = 600
      
    • Note: This setting currently applies only to command components and distributed components.

  • Feature 1832902: convert dsl.command_component to yaml for registration

    • CLI Example: az-ml compile --source ./src/smile/components/**/*.py

    • SDK Example:

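      from azure.ml.component.dsl import compile
      from test.test_az_ml_compile import command_component_func1

      # Compile from a component function object.
      compile(source=command_component_func1)
      # Compile from a source file path.
      compile(source='./test/test_az_ml_compile.py')
      # Compile from a source file, selecting a component by name.
      compile(source='./test/test_az_ml_compile.py', name='command_component_func1')
      # Compile from a glob pattern of source files.
      compile(source='./test/*.py')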
    
  • Feature 1851843: Support resolve azure artifact in additional includes

    • This feature is in private preview, please use with caution. The additional_includes format may be changed.

    • Example:

      additional_includes:
       - your/local/path
       - type: artifact
         organization: <your_devops_organization>
         project: <your_devops_project>
         feed: <your_artifacts_feed_name>
         name: <your_universal_package_name>
         version: <your_package_version>
         scope: project
      
    • See reference doc
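
The additional_includes example for Feature 1851843 mixes plain local paths with artifact entries. A minimal sketch of how such a parsed list could be separated is shown below; the helper name `split_additional_includes` and the set of keys treated as required are assumptions for illustration, and the feature's format may change while in private preview.

```python
# Keys taken from the example above; treating them as required is an assumption.
REQUIRED_ARTIFACT_KEYS = ("organization", "project", "feed", "name", "version")


def split_additional_includes(entries):
    """Separate local paths from artifact entries (hypothetical helper)."""
    local_paths, artifacts = [], []
    for entry in entries:
        if isinstance(entry, str):
            local_paths.append(entry)
        elif isinstance(entry, dict) and entry.get("type") == "artifact":
            missing = [key for key in REQUIRED_ARTIFACT_KEYS if key not in entry]
            if missing:
                raise ValueError(f"artifact entry is missing keys: {missing}")
            artifacts.append(entry)
        else:
            raise ValueError(f"unsupported additional_includes entry: {entry!r}")
    return local_paths, artifacts


# The same structure as the YAML example, already parsed into Python objects.
parsed = [
    "your/local/path",
    {
        "type": "artifact",
        "organization": "<your_devops_organization>",
        "project": "<your_devops_project>",
        "feed": "<your_artifacts_feed_name>",
        "name": "<your_universal_package_name>",
        "version": "<your_package_version>",
        "scope": "project",
    },
]
local_paths, artifacts = split_additional_includes(parsed)
```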

Improvements:

  • Task 1844670: Postpone query runsettings from MT for all component types

Bug Fixes:

  • Bug 1896581: Additional_includes of components was not shown in code section as needed.

v0.9.11 (2022.07.07)

This release contains features like parameter group inheritance, overriding environment by (name, version), and other improvements.

Features:

  • Feature 1806333: Parameter group supports inheritance

    • Parameter groups can no longer be initialized with positional arguments.

    • Example:

      # define the parent parameter group
      @dsl.parameter_group
      class ParentClass:
          str_param: str
          int_param: int = 1
      
      @dsl.parameter_group
      class GroupClass(ParentClass):
          float_param: float
          str_param: str = 'test'
      
      # see the help of auto-gen __init__ function
      help(GroupClass.__init__)
      
  • Feature 1796978: Support override environment runsetting via name + version, so it can support registry username/password

    • Example:

    from azure.ml.component.environment import Environment
    
    @dsl.pipeline()
    def test_pipeline(input_data):
        component = component_func(input_folder=input_data)
        # Override environment via name and version.
        component.runsettings.environment = Environment(name="AzureML-Minimal")
    

Improvements:

  • Feature 1825231: Support force_rerun in pipeline._publish() or published_pipeline.submit()

    • Example: pipeline._publish(force_rerun=True)

  • Feature 1812015: [Generate Package] Reject components whose names do not conform to Python function naming rules

  • Feature 1811793: Support ‘ContinueRunOnFailedOptionalInput’ in component SDK

    • Example: pipeline.submit(continue_on_failed_optional_input=True)

Bug Fixes:

  • Bug 1834772: dsl.command_component creates an unnecessary parent folder for boolean

  • Bug 1815865: Pipeline parameter lost in graph when only used as condition of dsl.condition

v0.9.10 (2022.05.31)

This release contains features like batch changing the compute and datastore of components in a pipeline, and linking DataStoreName, PathOnCompute, and DataStoreMode of output settings with pipeline parameters.

Features:

  • Feature 1763227: The default datastore and default compute target of all nodes use the default_datastore and default_compute_target specified in submit/validate function.

    • Example: pipeline.validate(default_datastore="xxx", default_compute_target="xxx") and pipeline.submit(default_datastore="xxx", default_compute_target="xxx")

  • Feature 1781684: Support DataStoreName, PathOnCompute, DataStoreMode of Output Settings to be linked with pipeline parameters

    • Example: component.outputs.port_name.configure(datastore=pipeline_parameter_datastore, path_on_compute=f'{pipeline_parameter_path_on_compute}/run/compute')

    • Doc: reference doc

  • Feature 1788016: Support ComponentDefinition.list: Pip-style component version constraints

    • Example:

    from azure.ml.component._core._component_definition import ComponentDefinition
    # List specified component in workspace
    ComponentDefinition.list(workspace=workspace, name="xxx")
    # List specified component in registry
    ComponentDefinition.list(registry_name="xxx", name="xxx")
    
    • Please contact us before using this, as this is a private feature.

Improvements:

  • Feature 1778282: Support loading a component from a registry without a workspace via SDK code

  • Feature 1759712: Pipeline submission should include more detail pointing out which component spec has an error

  • Feature 1806390: Support “HDFS” mode for component output

  • Feature 1802856: local component debug – refine the AML job status in local debug mode – mark as canceled only when the user stops the local job container

Bug Fixes:

  • Bug 1784520: SDK should not generate dangling output

  • Bug 1786428: Print request id when received service error

  • Bug 1785061: Postpone getting PipelineComponent runsettings definition

v0.9.9 (2022.04.27)

This release contains features like local debug using common runtime, path_on_datastore link with pipeline parameter, force_rerun pipeline setting, and az-ml export improvements.

Features:

  • Feature 1712434: Allow user to link path_on_datastore with pipeline parameter

  • Feature 1742360: add force_rerun pipeline setting

    • Example: pipeline.submit(force_rerun=True). Setting it to True forces a rerun of all child runs under this root pipeline run; the child runs will not latch onto or reuse any other run and cannot be latched onto or reused by other runs.

    • Doc: reference doc

  • Feature 1718007: Support local debugging using common runtime

    • Example: az-ml run debug --run-id <failed-run-id>

    • Doc: reference doc

    • This is an early preview feature; execution service deployment is still ongoing.

    • For unsupported regions, the following error message will be shown: RuntimeError: azureml-setup/common_runtime_bootstrapper_info.json are not in the execution service response.

Improvements:

  • Feature 1740584: The component CLI supports publishing to a registry without region “hints”

  • Feature 1561806: az-ml export cli improvements

  • Feature 1586743: Reduce pipeline submission time: Backend improvement

    • E2E time optimized (60s -> 25s for PW_OFE)

    • Submission time optimized (40s -> 6s for PW_OFE)

  • Feature 1730569: remove __pycache__ for dsl.command_component snapshot

Bug Fixes:

  • Bug 1680153: Name ‘exit’ is not defined when dsl_generate_package

  • Bug 1737004: When a component spec has incorrect indentation, the error raised by dsl.generate_package is hard to read

  • Bug 1738590: ‘Non-default argument follows default argument’ raised when using parameter group and dynamic parameter at the same time

  • Bug 1753924: Pipeline Expression raises ‘pop from empty list’

v0.9.8 (2022.04.02)

This release contains features like pipeline level compute priority runsettings, PipelineParameter as primitive types, intellisense for dsl.pipeline output, and reduced duplicate snapshot folders in component creation.

Features:

  • Feature 1467481: Pipeline runsetting to support different compute priority

    • e.g. pipeline.runsettings.priority.scope = 901

  • Feature 1469019: Use PipelineParameter as primitive types in pipeline/subgraph function

Improvements:

  • Feature 1562566: Support intellisense for dsl.pipeline output

  • Feature 1682414: Reduce the duplicate snapshot folders in the temp folder when creating components in multiple workspaces

    • Delete the temp snapshot folder when the component is created successfully

  • Task 1719676: Refer to default curated environment in stable version for dsl.command_component

Bug Fixes:

  • Bug 1711918: ComponentDefinition load YAML returned None

  • Bug 1599087: Component local run didn’t work in docker mode when using a remote dataset

  • Bug 1710484: Exclude the __pycache__ folder for expression component

v0.9.7 (2022.03.21)

This release contains features like runtime if-else conditional, dynamic sweep on command component, and component create error handling improvement.

Features:

  • Feature 1511038: Conditional flow control.

    • See reference doc

    • See sample notebook

    • Notebook visualization and pipeline export to code are not yet supported for this feature.

    • This feature is still in preview state.

    • The pipeline backend has only whitelisted certain subscriptions; please contact us if you would like to use it.

  • Feature 1573963: Sweeping Command Component Search Space Setting in RunSettings

    • See reference doc

    • See sample notebook

    • This feature is still in preview state.

    • The pipeline backend has only whitelisted certain subscriptions; please contact us if you would like to use it.

Improvements:

  • Feature 1665098: Improve error handling of component creation: e.g. invalid component yaml

    • Bug 1552709: Improve error message: CLI is failing when component display_name starts with square brackets

  • Feature 1664789: Set default values for runsettings for a component in component spec

    • added support in ScopeComponent yaml: adla_account_name, scope_param, custom_job_name_suffix

    • added support in ScopeComponent runsettings: scope.runsettings.priority

  • Task 1701993: Add extras require: az that specifies the compatible azure-cli version

Bug Fixes:

  • Bug 1689646: Component cache sometimes raises “cannot find file specified”

  • Bug 1682406: Notebook run wait for completion show graph raises AttributeError: ‘PipelineResponse’ object has no attribute ‘response’

  • Bug 1682635: Component created log is empty

v0.9.6 (2022.02.28)

This release contains features like registry support, component create cli overrides, and dsl.pipeline build & submission perf improvement.

Features:

Improvements:

  • Feature 1573898: Improve pipeline submission time by 2x (60s for PW OFE)

    • 2.5x E2E time optimized relative to Base line (148s -> 60s)

    • Feature 1616998: Improve mechanism to get dsl.pipeline local variables name

      • Build time 2x improvement to previous version (13.01s -> 5.49s)

    • Task 1582487: Cache registered sub pipeline in local

      • Submission stage from 96.87s to 46.7s, tested with pw.pw_ofe on a 4-core VM in West US.

      • Users can disable the pipeline disk cache by setting the environment variable AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE='False'.

      • AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE_EXPIRE_SECONDS configures the expiration time of the cached pipeline, in seconds.

      • Pipeline and component cache storage path is:

        • cmd: %TEMP%/azure-ml-component/<SDK_VERSION>

        • powershell: $env:TMP/azure-ml-component/<SDK_VERSION>

  • Component Package

  • Feature 1564982: Rest client change to adapt to open API 3.0 swagger
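
The cache-related environment variables listed under Task 1582487 can also be set from Python rather than from the shell. The sketch below assumes the variables are read when the SDK is used, so they are set first; the helper `cache_dir`, the one-hour expiry, and the use of `tempfile.gettempdir()` to approximate `%TEMP%` are illustrative assumptions.

```python
import os
import tempfile

# Option 1: turn the pipeline/component disk cache off for this process.
os.environ["AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE"] = "False"

# Option 2 (when the cache stays enabled): expire cached entries after
# one hour. Environment variable values must be strings.
os.environ["AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE_EXPIRE_SECONDS"] = "3600"


def cache_dir(sdk_version):
    """Build the documented cache location under the system temp folder."""
    return os.path.join(tempfile.gettempdir(), "azure-ml-component", sdk_version)


print(cache_dir("0.9.6"))
```

Inspecting or deleting the printed folder is a quick way to check whether a stale cache entry is behind an unexpected result.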

Bug Fixes:

  • Bug 1617225: GPT3 components fail to upload via azure-ml-component 0.9.4 : “The dict contain an unexpected keyword ‘meta’”

  • Bug 1610031: Incorrect connecting edges when component output has camel case, e.g. tsvFile

  • Bug 1649889: Port name of edge for data node should be empty string

  • Bug 1617306: az-ml export: Run object has no attribute display_name when export code

  • Bug 1621489: Improve error handling when the user incorrectly sets regenerate_outputs as regenerate_output

  • Bug 1615310: Pipeline submit raises validation error: Required parameter ‘runsettings.sweep.limits.max_total_trials’ not provided

v0.9.5 (2022.02.07)

This release contains features like run.resubmit/get_lineage, export/compare code cli, ae365exepool component and other improvements.

Features:

Improvements:

  • Feature 1529495: Improve generate package performance from 10s to 2s

    • dsl.generate_package will detect the modification of component spec file in the asset when force_regenerate=False. If the component spec files in the asset are not modified, it will reuse the generated component module.

  • Feature 1586739: Reduce pipeline submission time by 20% (pw_ofe):

    • Feature 1586739: Disk cache the runsetting parameters and anonymous components to reduce submission time

      • Users can disable the component disk cache by setting the environment variable AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE='False'.

      • AZUREML_COMPONENT_ANONYMOUS_COMPONENT_CACHE_EXPIRE_SECONDS configures the expiration time of the cached component, in seconds.

    • In generate_package, set force_regenerate=False to detect the modification of component spec files to shorten the generation time.

Bug Fixes:

  • Bug 1572473: Running a registry component locally raises “The filename, directory name, or volume label syntax is incorrect”

  • Bug 1583720: Parameter Group Parameter String Interpolation

  • Bug 1551908: Some SDK UX colors fall back to the default color under dark theme

v0.9.4 (2022.01.04)

This release contains features like dsl.command_component, multi-level parameter group, and other improvements.

Features:

Improvements:

  • dsl.generate_package:

    • Feature 1486648: Config override support for functions generated in component package

    • Performance improvement

      • The numbers below were measured using 7 assets with a total of 150 components.

      • dsl.generate_package time is reduced 10x: from 119s to 10s

  • Feature 1425453: Better logging to help diagnose issues such as pipeline build/submission performance problems

  • Feature 1553591: Expose get_portal_url and id of PipelineRun as public

  • Feature 1538101: Support configuring description and tags in registering AML datasets via register_as()

  • Task 1530999: Support python 3.9 in setup.py

  • Feature 1536317: Notebook UX widget has the same outline panel as the workspace portal

  • Support MSAL(Microsoft Authentication Library)

    • azureml-core supports MSAL starting from v1.37.0. For more details see the azureml-core change log.

    • Component CLI versions >=0.9.4 pick up the change and work with Azure CLI >=2.30.0. Note that previous versions of the Component CLI only work with az cli versions < 2.30.0.

    • Note: if you meet any issue with the new auth mechanism, you can switch back by:

      • SDK: pip install "azureml-core<1.37.0"

      • CLI: use previous Component CLI version or az cli version (< 2.30.0)

Bug Fixes:

  • Bug 1521648: Entire notebooks are uploaded as the snapshot when executing a component local run.

  • Bug 1505810: Didn’t throw error during validation stage when required parameter missing valid value

  • Bug 1514600: Throw exception when user specify duplicate node name

  • Bug 1534319: Workspace independent sweep component output ports error if renamed

  • Bug 1532755: Invalid runsetting warning for workspace independent pipeline

  • Bug 1536100: az ml component show does not show args for scope component

  • Bug 1537884: Output not json serializable when calling pipeline.validate()

  • Bug 1557445: Duplicate outputs added in parameters when calling pipeline_endpoint.submit()

  • Bug 1550138: SDK Notebook widget API call is much slower than portal

  • Bug 1563762: All files with .additional_includes inside component snapshot will stop uploading & downloading

v0.9.3 (2021.11.30)

This release contains the Hemera/Starlite component feature and other improvements.

Features:

Improvements:

  • Feature 1424136: Improve component.from_yaml experience through adding an uploading bar for uploading snapshot

  • Feature 1424045: Improve component list performance in Component CLI

    • Average and p90 latency is high for list components

  • Feature 1526350: Component SDK support for future new types of workspace independent Component: Hemera/Starlite

  • Feature 1466311: dsl.component support workspace independent experience

Bug Fixes:

  • Bug 1514651: AML_COMPONENT_REGISTRATION_MAX_WORKER could only be set as string, which will cause thread pool creation error

  • Bug 1504152: [CLI] Improve error handling when load from yaml: “error_message”: “‘args’”

  • Bug 1521590: ParameterAssignment validation error: “Only parameter can be referenced”

  • Bug 1519963: DSL Validation should not throw error for optional target run settings

  • Bug 1517477: Pipeline component lost port type id list

  • Bug 1514481: Error category when load workspace independent from yaml with file not found

  • Bug 1518650: Error category when group parameter missing some attribute

v0.9.2 (2021.11.15)

This release contains the dsl.generate_package feature and bug fixes.

Features:

  • dsl.generate_package: generate python stub code to support static intellisense

  • Feature 1475886: Support passing primary_metric (not hard-coded in the sweep spec yaml file) in the runsetting

  • Feature 1486877: Support “link” output mode in Component SDK

Improvements:

  • Performance improvement for large graphs:

    • The numbers below were measured using a graph with 20K nodes / 4-level subgraph

    • dsl.pipeline instance build time: 1.8x improvement

  • Feature 1488878: Cache the component environment to avoid repeatedly getting the environment from the workspace

  • Feature 1471963: Apply user defined names to for-loop created components; users can assign a name with node.node_name = 'a'

Bug Fixes:

  • Bug 1469528: The pipeline parameter is deleted, if it’s not used.

    • NOTE: reverted in 0.9.2.post2

  • Bug 1482927: Client side validation on no duplicate inputs/output names & print component/pipeline name when parallel creation failure.

    • NOTE: this change will break some existing scope component with same name for inputs/outputs

  • Bug 1471425: Input type in generate package annotation should contain Output type.

  • Bug 1476544: pipeline._run can’t match pipeline parameters when multi-layer sub-pipeline exists

  • Bug 1483036: Update runsettings failed with ‘xx not found in pipeline parameter’ after workspace independent component registration

  • Bug 1483042: When load workspace independent component from yaml with unexpected key, the exception classification is wrong.

  • Bug 1483220: The default value of the component parameter is lost after registering the workspace independent component.

  • Bug 1506028: Azureml-core deleted ruamel.yaml dependency, add this in Component SDK

  • Bug 1512867: Using environment variable AML_COMPONENT_REGISTRATION_MAX_WORKER to limit the max worker of the component registration thread pool to avoid lots of retry requests when validate/submit the pipeline with workspace independent components.
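
A minimal sketch of how a worker limit such as AML_COMPONENT_REGISTRATION_MAX_WORKER can be applied to a thread pool is shown below. The function name `make_registration_pool` and the default of 4 workers are assumptions; the SDK's own pool setup is not shown in this change log. Converting the value to `int` matters because environment variables are always strings, and `ThreadPoolExecutor` rejects a string `max_workers`, which is consistent with the thread pool creation error described in Bug 1514651.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def make_registration_pool(default_workers=4):
    """Create a thread pool sized by the environment variable, if set."""
    raw = os.environ.get("AML_COMPONENT_REGISTRATION_MAX_WORKER")
    # Environment variables are strings, so convert before use.
    max_workers = int(raw) if raw else default_workers
    return ThreadPoolExecutor(max_workers=max_workers)


# Usage: limit the pool to two workers and run a few dummy tasks.
os.environ["AML_COMPONENT_REGISTRATION_MAX_WORKER"] = "2"
with make_registration_pool() as pool:
    results = list(pool.map(lambda n: n * n, range(5)))
print(results)
```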

v0.9.1 (2021.10.09)

This release contains features like workspace independent component, pipeline run display name, and runsetting features such as pipeline component target runsetting override and runsetting environment override.

And starting from this version we use v1 SDK style version names, starting from 0.9.1. For Component SDK, the install command does not change. For Component CLI, you need to update the extra index URL inside the install command, see here for more information. For example, the original install command to install CLI was:

az extension add --source https://azuremlsdktestpypi.blob.core.windows.net/wheels/modulesdkpreview/azure_cli_ml-0.1.0.44094775-py3-none-any.whl --pip-extra-index-urls https://azuremlsdktestpypi.azureedge.net/CLI-SDK-Runners-Validation/44094775 --yes --verbose

You need to change it to:

az extension add --source https://azuremlsdktestpypi.blob.core.windows.net/wheels/componentsdk/azure_cli_ml-0.9.1-py3-none-any.whl --pip-extra-index-urls https://azuremlsdktestpypi.azureedge.net/componentsdk/0.9.1 --yes --verbose

Or, if you want to use the preview version (not recommended), change it to:

az extension add --source https://azuremlsdktestpypi.blob.core.windows.net/wheels/modulesdkpreview/azure_cli_ml-0.1.0.48138292-py3-none-any.whl --pip-extra-index-urls https://azuremlsdktestpypi.azureedge.net/modulesdkpreview/0.1.0.48138292 --yes --verbose

Features:

  • Pipeline submit/validate with workspace independent component

    • Support omitting the workspace in Component.from_yaml(yaml_file=xxx) to create a workspace independent component

    • Specify workspace for the pipeline with workspace independent components in pipeline.submit(workspace=ws) or pipeline.validate(workspace=ws)

    • Learn more

  • Pipeline run display name

    • Support setting the pipeline run display_name via pipeline.submit(display_name='pipeline_name')

    • dsl.pipeline display_name will by default be the run display name

  • Runsetting:

    • The environment of a component can be overridden at runtime

    • Support runsettings.target="cluster-name" for pipeline component

  • PipelineParameter: Support dynamic pipeline parameter (**kwargs) in dsl.pipeline

Improvements:

  • Performance improvement for large graphs:

    • The numbers below were measured using a graph with 20K nodes / 4-level subgraph

    • 60s dsl.pipeline instance build time: 5x improvement (300s in previous version)

    • 30s pipeline.submit time: 10x+ improvement (hangs in previous version)

  • Refined the following API interfaces (the old interfaces are still supported but without intellisense):

    • azure.ml.component.Component.regenerate_output -> regenerate_outputs

    • azure.ml.component.component.Output.configure(output_mode=None) -> (mode=None)

  • Refined Reference doc:

    • Add document for notebook visualization support

    • Update recommended schema url to be reachable links, Example: $schema: https://componentsdk.azureedge.net/jsonschema/CommandComponent.json

    • Refine samples to add -> Pipeline output annotation for dsl.pipelines: for better intellisense

  • Feature 1421055: Environment create fails due to “variables” key

Bug Fixes:

  • Bug 1327294: Change anonymous PipelineComponent name and re-create should have new component id

  • Bug 1327295: Export to code does not export datastore name for scope component

  • Bug 1327299: Export to code gives incorrect port name for scope component: output2 -> Output2

  • Bug 1327390: Pipeline component’s name is an empty string. Eg: _ = func()

  • Bug 1329298: Pipeline submission hangs when nodes 20K+

  • Bug 1334680: InternalSDKError for SweepComponent when sweep_spec_relative_path does not start with base path

  • Bug 1407405: Failed to register pipeline component with exception: “item with same key has already been added” (due to duplicate dataset node in graph)

  • Bug 1417022: Additional includes for sweep component is not properly supported for “az component build”

  • Bug 1355478: KeyError raised when getting the package name from frame

  • Bug 1329456: Scope Component node inputs ports takes wrong output data port from previous node

  • Bug 1433486: Renamed pipeline output will set outputSettings useGraphDefaultDatastore as true

  • Bug 1431365: PipelineComponent Create should not add parameter name to output setting for pipeline output

v0.1.0.44094775 (2021.08.19)

This release contains features like PipelineComponent (Subgraph), Parameter Group, Non-PipelineParameter.

Features:

Bugs Fixes:

  • Bug 1276298: Pip upgrade cannot upgrade the package to latest version when using pip version 21.1.x

    • The next pip version will fix this. Meanwhile, please use a lower pip version so that the command on the getting started page works.

  • Bug 1296880: Export to code should keep node name info

  • Bug 1302968: Component function docstring should be python native type

v0.1.0.42428082 (2021.07.30)

This release contains feature improvements on pipeline output, pipeline parameter, interactive debug, and bug fixes.

Features:

Bugs Fixes:

  • Bug 1224648: Sweep Output not show correct path of best child run in UI

  • Bug 1259740: Graph to sdk code exported target selector’s setting type not right

  • Bug 1276686: Graph to sdk code sub pipeline’s output port name incorrect

  • Bug 1247178: No error appears when an invalid datastore is configured inside dsl

  • Bug 1248830: FileNotFoundError: [WinError 3] The system cannot find the path specified when creating component

  • Bug 1220886: ImportError: cannot import name ‘SNAPSHOT_MAX_FILES’ from ‘azureml._restclient.constants’

  • Bug 1256975: Error Handling Improve: The filename, directory name, or volume label syntax is incorrect

  • Bug 1243626: Pipeline._endpoint submit() parameters: ‘str’ object has no attribute ‘items’

v0.1.0.40555082 (2021.07.02)

This release contains feature improvements on Runsetting, Sweep Component and validate logic.

Features:

Improvements:

  • Improve validate logic:

    • Task 1188667: Pipeline.validate does not reveal compute target not set error

  • Improve reference doc site:

Bugs Fixes:

  • Bug 1178205: The component SDK should set spark.precache_package to false in default runconfiguration

  • Bug 1185427: Component.from_yaml should return same component id given same code snapshot for ParallelComponent

  • Bug 1189018: dsl.pipeline does not correctly handle limited depth of recursion

  • Bug 1196832: pass **kwargs as parameter to dsl pipeline should raise exception

  • Bug 1199481: Compute validate should block non ADF compute target for DataTransferComponent

  • Bug 1188833: Invalid experiment name raises weird error message which user could not understand

  • Bug 1188832: Pipeline is submitted but an exception is raised when some node doesn’t have a compute target

  • Bug 1204845: ruamel.yaml should not use deprecated api “ruamel.yaml.safe_load”

  • Bug 1225252: CLI component create hangs “Failed to flush task queue within 600 seconds”

  • Bug 1166003: .amlignore does not work for folders in additional_includes

  • Bug 1217723: run.wait_for_completion() complains about missing data-prep

  • Bug 1208261: sweep component forces quniform to float, not usable with Bayesian+

v0.1.0.38576839 (2021.05.24)

This release contains feature improvements on Runsetting & Pipeline Parameter, and provides how-to guides for setting up a job instance as an interactive dev environment in ITP.

Features:

Improvements:

  • Improve component snapshot building speed by 8x (8min -> 1min) for deeprank scenario when files are located in ADLS:

    • Feature 1166175: [Component SDK/CLI]Make snapshot creation fast on remote file systems(ADLS)

  • Improve reference doc site:

  • Improve runsettings:

  • Refine validation experience:

    • Task 1126983: Postpone output.configure datastore error thrown to validation stage

Bugs Fixes:

  • Bug 1143740: [Component CLI] Sweep yaml cannot reference training yaml in a different folder

  • Bug 1165904: [Component CLI] User managed deps not passed correctly for Sweep Component

  • Bug 1152832: [Component CLI] component create hangs “Failed to flush task queue within 300 seconds”

  • Bug 1190062: [Component CLI] Component CLI Failed with obscure WindowsPath error

  • Bug 1166003: [Component CLI] Top level .amlignore does not work for folders in additional_includes

  • Bug 1178004: [Component SDK] Local runs of PRS components fail with obscure error

  • Bug 1167599: [Component SDK] AttributeError: module ‘os’ has no attribute ‘R_ok’

  • Bug 1164711: [Component SDK] k8srunsetting configuration does not work when target_selector is used

  • Bug 1183773: [Component SDK] resource_layout.instance_type should work when specifying gpu count and cpu count for GJD jobs

  • Bug 1196513: [Component SDK] from_yaml stuck with RuntimeError when no main module is defined in Windows

v0.1.0.36279725 (2021.04.23)

This release contains improvements of component types like SweepComponent, ScopeComponent.

Features:

  • Environment:

    • Task 1059940: Support reference docker file of environment in component spec file

  • Runsetting:

    • Task 1048196: Support resource_layout.instance_type/instance_count to better express resource requirement for ITP

    • Task 1048200: Support target_selector runsetting to integrate with GJD

  • Distributed component

    • Task 1123753: Support pytorch launcher type Distributed Component

  • Scope component

  • Sweep component

    • Task 1123751: Support using azureml.train.hyperdrive package contract to set hyperparameter expression

  • IO setting:

    • Task 1059783: Improve output.configure() performance: support datastore name as parameter

  • Feature 781165: Support component node comment

Improvements:

  • Refine validation experience:

    • Task 1106972: Support components.inputs.input0 = some_dataset

    • Task 1106979: Print clear error messages when pipeline.validate()

    • Task 1120591: Add validation that dataset & component are from the same workspace

  • Improve reference doc site:

    • Task 1062884: Easy contributing to SDK 1.5 reference doc: add “Edit in DevOps” button to the reference doc site

  • Increase component snapshot file size limit to 2GB when creating components:

    • Task 1128383: Increase component snapshot file size limit to 2GB

Bugs Fixes:

  • Bug 1093483: Exe pool style command does not handle {, } when they come as part of input.

  • Bug 1092886: Dataset created in Heron workspace uses DataFrameDirectory, but did not register that data_type, causing the pipeline not to start

  • Bug 1119030: run.wait_for_completion cannot raise ActivityFailedException because the pipeline run details has no ‘error’

  • Bug 1073425: Pipeline_validate raises “TypeError: Object of type ‘FileDataset’ is not JSON serializable”

  • Bug 1121405: Sweep Component Bandit policy default runsetting not handled correctly

  • Bug 1126869: Sweep Component throw 400 after submission: Failed to get required parameter from platform_config: ‘Definition.Overrides.Script

  • Bug 1126829: Sweep Component: SDK did not print clear error message when user sets a hyperparameter to an invalid value

  • Bug 1136705: local submit AML pipeline: cannot submit experiment with similar names included in package

v0.1.0.34049888 (2021.03.23)

This release contains support of new component types like SweepComponent, DataTransferComponent.

Features:

Improvements:

  • Improve export-to-code feature:

  • Improve document

  • Improve error categorization logic for:

    • exceptions in dsl.pipeline user code

    • Keyboard Interrupt in certain cases

    • SnapshotException

  • Fundamental

    • improved telemetry to track api performance

    • improved telemetry to track count of visualize in notebook

Bugs Fixes:

  • Bug 1047025: Jobs submitted using the component SDK do not properly interpolate inputs when there is no whitespace around the moustaches

  • Bug 1030885: When setting compute target in runsetting, cannot find newly created compute

  • Bug 1052993: Optional params don’t work with ScopeScript

  • Bug 1061444: In multi level pipeline, default datastore/compute in inner pipeline is overwritten by the outer pipeline.

  • Bug 999088: Can not support Enum type as pipeline parameter

  • Bug 1073427: Exception raised when component input is not reachable in scope of current pipeline
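Bug 1047025 above concerns placeholders written without whitespace inside the moustaches. As an illustration only, a whitespace-tolerant substitution could look like the sketch below. The interpolate function, the PLACEHOLDER pattern, and the inputs.* key names are assumptions made for this example, not the SDK's implementation.

```python
import re

# Optional whitespace is allowed on both sides of the placeholder name,
# so "{{inputs.data}}" and "{{ inputs.data }}" are treated the same way.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def interpolate(template, values):
    """Replace each {{name}} placeholder with str(values[name])."""

    def replace(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder {key!r}")
        return str(values[key])

    return PLACEHOLDER.sub(replace, template)
```

With values {"inputs.data": "/mnt/d", "inputs.lr": 0.1}, the template "--data {{inputs.data}} --lr {{ inputs.lr }}" resolves both placeholders regardless of spacing.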

Known Bugs: This version was yanked from PyPI because of the bug below.

  • Bug 1245181: docker_configuration.arguments, expected type: ‘list’ or ‘tuple’, actually ‘str’

v0.1.0.31132438 (2021.02.09)

This release contains support of new component types like DistributedComponent, ScopeComponent.

Features:

Improvements:

  • Fundamental

    • Improve snapshot creation logic

      • Improve performance by 2x in AML Notebooks (which have low file system performance when based on Azure File Share)

      • Add doc to help users troubleshoot code snapshot issues

      • Add debug info for az ml component create/build; users can view it by adding the --verbose parameter in the CLI.

      • Support snapshot cache for all components (previously, components with additional includes or amlignore files were not supported).

      • Supports recursive ignore files in component code snapshot.

    • Improve validation logic of local/remote run

    • Improve error handling, correct category of exceptions

  • Remove fields(source, contact, helpDocument, shared_scope) in CLI outputs to align with UX

  • Improve pipeline export yaml with support for output configurations
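The "recursive ignore files" item above means an ignore file in a subfolder also affects the snapshot. The sketch below illustrates the idea under simplifying assumptions: patterns are matched against file and folder names only, using fnmatch, and each ignore file applies to its own folder and everything beneath it. The real .amlignore matching rules are richer, and list_snapshot_files is a hypothetical helper, not an SDK function.

```python
import fnmatch
import os


def list_snapshot_files(root, ignore_name=".amlignore"):
    """Return relative paths under root not excluded by any ignore file on their path."""
    kept = []

    def walk(directory, inherited):
        # Patterns from parent folders stay active; this folder may add more.
        patterns = list(inherited)
        ignore_path = os.path.join(directory, ignore_name)
        if os.path.isfile(ignore_path):
            with open(ignore_path, encoding="utf-8") as handle:
                for line in handle:
                    line = line.strip()
                    if line and not line.startswith("#"):
                        patterns.append(line.rstrip("/"))
        for name in sorted(os.listdir(directory)):
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                continue
            full = os.path.join(directory, name)
            if os.path.isdir(full):
                walk(full, patterns)
            else:
                kept.append(os.path.relpath(full, root).replace(os.sep, "/"))

    walk(root, [])
    return kept
```

In this sketch a top-level pattern such as *.log also excludes src/debug.log, while a pattern in src/.amlignore only affects files under src.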

Bugs Fixes:

  • Bug 1005863: Output.configure(mode='download') should take effect

  • Bug 1008265: Optional boolean argument is set as False rather than not specified in AML
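Bug 1008265 above is about the difference between an optional argument that is unset and one that is explicitly False. A minimal sketch of that distinction, assuming a hypothetical build_arguments helper that turns parameters into command-line arguments (the SDK's real argument handling is not shown in this change log):

```python
def build_arguments(params):
    """Turn a name -> value mapping into CLI arguments, skipping unset values."""
    args = []
    for name, value in params.items():
        # None means "not specified": omit the argument entirely instead of
        # coercing it to False.
        if value is None:
            continue
        args.extend([f"--{name}", str(value)])
    return args
```

Here an unset optional boolean produces no argument at all, while an explicit False is still passed through as "--name False".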

v0.1.0.29596699 (2021.01.15)

This release contains support of new component types like CommandComponent, ParallelComponent, HDInsightComponent.

Features:

Improvements:

  • Improve CLI

    • Refine CLI help docs, set more meaningful description to each command

    • Align CLI error handling with other CLI subgroups

  • Fundamental

    • Improve setup-environment doc by adding vscode devcontainer option

    • Test on multiple platforms (Windows, Linux, Mac) and Python versions (3.6, 3.7, 3.8)

    • Regular CI targets multiple platforms for stable & preview SDK/CLI versions for notebooks in the GitHub doc repo, with a status badge added

  • Improve Docs

Bugs Fixes:

  • HDInsight component runsettings.configure(target='xxx') not taking effect

  • Pipeline parameter not taking effect in some scenarios

v0.1.0.27532912 (2020.11.30)

Features:

  • Support component output register as dataset

    • Example: component.outputs.some_output.register_as(name="dataset_name", create_new_version=True)

  • Support environment.conda.pip_requirements_file in component yaml spec

Improvements:

  • Improve component create performance

    • Cache the last snapshot locally

    • Detect snapshot changes: delta-update the snapshot if it changed, reuse the last snapshot if it did not

  • Improve dsl.pipeline

    • Improve dsl.pipeline performance for complex graph

    • Refactor how dsl.pipeline builds the pipeline component definition

  • Improve CLI

    • Dynamically load the az ml component subgroup using the entry point technique

  • Improve Inputs/outputs/runsetting

    • Allow tabular dataset for HDI module

    • Validate json schema for PRS module

    • Allow specifying non-string enum values in SDK

  • Fundamental

    • Improve error handling logic

    • Improve reference doc

    • Refactor: package structure, each function area code logic
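The snapshot caching item above reuses the last snapshot when nothing has changed. One way to detect a change is to fingerprint the code folder, as in the sketch below. This is an illustration only: snapshot_fingerprint and needs_new_snapshot are hypothetical names, the SDK's actual change detection is not described here, and the delta update itself is not shown.

```python
import hashlib
from pathlib import Path


def snapshot_fingerprint(root):
    """Hash the relative path and contents of every file under root, in sorted order."""
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def needs_new_snapshot(root, cached_fingerprint):
    """Return True when the folder no longer matches the cached fingerprint."""
    return snapshot_fingerprint(root) != cached_fingerprint
```

A caller would store the fingerprint next to the cached snapshot and upload again only when needs_new_snapshot returns True.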

Bugs Fixes:

  • Component.load() fails for builtin components in new workspaces

    • Auto provision built-in components and types in new workspace

  • az ml component create fails in new workspace because port type does not exist

    • Auto provision known port types like: path

  • Fix status aggregation logic for subgraph in notebook visualization