Pipeline Component
Overview
A PipelineComponent contains several components connected as a pipeline. It can be treated as a black-box component, and can be defined with the same interface as a Component.
How to define Pipeline Component using dsl.pipeline
In the Component SDK, a PipelineComponent is defined as a function decorated with @dsl.pipeline, where dsl is imported from azure.ml.component.
Inside the function, you can create component (or pipeline component) objects and connect their inputs and outputs to form a workflow. See the reference documentation for more information.
Like other types of components, a PipelineComponent must have an interface for its inputs and parameters. Because a PipelineComponent is defined as a function in the Component SDK, its interface is defined by the function’s parameters. The interface is determined in one of two ways: by inference or by user annotation.
Inferred interface
The easiest way to define a PipelineComponent’s interface is to not define it. If a parameter of a dsl.pipeline function has neither an annotation nor a default value, the SDK tries to infer its metadata (type, range, enum, etc.) from the component parameters it is assigned to.
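from azure.ml.component import dsl, Pipeline
# hello_world_component_func is a component function loaded elsewhere.
@dsl.pipeline(name='sample-pipeline',
              description='a sample pipeline',
              default_compute_target='aml-compute')
def sample_pipeline(pipeline_input, pipeline_enum, pipeline_int) -> Pipeline:
    # Each pipeline parameter is passed straight to one component parameter.
    hello_world = hello_world_component_func(
        input=pipeline_input,
        enum_param=pipeline_enum,
        int_param=pipeline_int
    )
    return hello_world.outputs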
The snippet above shows a pipeline that contains one component, hello_world.
Each of the component’s parameters is linked to a parameter of the function (a pipeline parameter).
Because the interface of hello_world’s parameters is already fixed when the component is created, we can use it to infer the interface of the pipeline component sample_pipeline.
Suppose the component hello_world has the following interface:
inputs:
  input:
    type: path
  enum_param:
    type: Enum
    enum:
    - Option1
    - Option2
  int_param:
    type: Integer
    min: 0
    max: 10
    optional: true
The pipeline component sample_pipeline will then have a similar interface:
inputs:
  pipeline_input:
    type: path
  pipeline_enum:
    type: Enum
    enum:
    - Option1
    - Option2
  pipeline_int:
    type: Integer
    min: 0
    max: 10
    optional: true
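The inference step described above can be sketched in plain Python, without the SDK. This is only an illustration: the helper infer_pipeline_interface and the links mapping are hypothetical names, and the dictionaries simply mirror the two YAML interfaces shown above. The real SDK may infer the interface differently.

```python
from copy import deepcopy

# Interface of the hello_world component, mirroring the YAML above.
component_inputs = {
    "input": {"type": "path"},
    "enum_param": {"type": "Enum", "enum": ["Option1", "Option2"]},
    "int_param": {"type": "Integer", "min": 0, "max": 10, "optional": True},
}

# Hypothetical record of which pipeline parameter feeds which component parameter.
links = {
    "pipeline_input": "input",
    "pipeline_enum": "enum_param",
    "pipeline_int": "int_param",
}


def infer_pipeline_interface(links, component_inputs):
    """Copy each linked component parameter's metadata to its pipeline parameter."""
    return {
        pipeline_param: deepcopy(component_inputs[component_param])
        for pipeline_param, component_param in links.items()
    }


pipeline_inputs = infer_pipeline_interface(links, component_inputs)
print(pipeline_inputs)
```

The copy is deep so that later changes to the inferred pipeline interface do not alter the component's own metadata.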
User annotated interface
A pipeline component’s interface can also be defined via annotation. The annotation can be:
Python built-in type annotations. Supported types are int, float, bool, str and Enum, e.g.:
@dsl.pipeline()
def sample_pipeline(int_param: int, float_param: float, bool_param: bool, str_param: str, enum_param: Enum) -> Pipeline:
    ...
Default values of basic types. Supported types are int, float, bool, str and Enum, e.g.:
from enum import Enum
class EnumOps(Enum):
    Option1 = 'option1'
    Option2 = 'option2'
@dsl.pipeline()
def sample_pipeline(int_param=1, float_param=1.0, bool_param=False, str_param="str", enum_param=EnumOps.Option1) -> Pipeline:
    ...
dsl.types annotations, which let you add extra fields (min, max, enum, optional, description, default) to a parameter. Supported types are Input, Integer, Float, Boolean, String and Enum, e.g.:
from azure.ml.component import dsl, Pipeline
from azure.ml.component.dsl.types import Input, Enum, Integer
@dsl.pipeline()
def sample_pipeline(
    pipeline_input: Input(type="path", description="input path"),
    # Python requires parameters without a default to come before those with one.
    pipeline_int: Integer(min=0, max=10, optional=True),
    pipeline_enum: Enum(enum=["Option1", "Option2"]) = "Option1"
) -> Pipeline:
    ...
Validate logic
Pipeline component validation is based on two rules:
If a pipeline parameter has an annotation, the annotation is validated against the interface of each component parameter it is linked to.
If a pipeline parameter does not have an annotation, all of its linked parameters are checked to see whether they have different types, ranges or enums.
Note: if a pipeline parameter is annotated as optional but is linked to a required component parameter, validation fails with an error.
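The two rules and the note above can be sketched as a small standalone function. This is a simplified illustration under stated assumptions: parameter metadata is represented as plain dictionaries, validate_pipeline_parameter is a hypothetical name, and rule 1 only compares the type and optional fields. The SDK's actual checks may cover more.

```python
def validate_pipeline_parameter(name, annotation, linked_params):
    """Return a list of error messages for one pipeline parameter.

    annotation: metadata dict, or None when the parameter is not annotated.
    linked_params: metadata dicts of the component parameters it is linked to.
    """
    errors = []
    if annotation is not None:
        # Rule 1: validate the annotation against every linked component parameter.
        for linked in linked_params:
            if annotation.get("type") != linked.get("type"):
                errors.append(f"{name}: annotated type does not match linked parameter")
            # Note: optional pipeline parameter linked to a required component parameter.
            if annotation.get("optional", False) and not linked.get("optional", False):
                errors.append(f"{name}: optional parameter linked to a required parameter")
    else:
        # Rule 2: all linked parameters must agree on type, range and enum.
        keys = ("type", "min", "max", "enum")
        distinct = {tuple(repr(p.get(k)) for k in keys) for p in linked_params}
        if len(distinct) > 1:
            errors.append(f"{name}: linked parameters disagree on type, range or enum")
    return errors


ok = validate_pipeline_parameter("pipeline_int", None, [
    {"type": "Integer", "min": 0, "max": 10},
    {"type": "Integer", "min": 0, "max": 10},
])
conflict = validate_pipeline_parameter("pipeline_int", None, [
    {"type": "Integer", "min": 0, "max": 10},
    {"type": "Integer", "min": 0, "max": 5},
])
optional_mismatch = validate_pipeline_parameter(
    "pipeline_int",
    {"type": "Integer", "optional": True},
    [{"type": "Integer", "min": 0, "max": 10}],
)
print(ok, conflict, optional_mismatch, sep="\n")
```

In this sketch the first call yields no errors, the second reports the conflicting ranges, and the third reports the optional/required mismatch from the note.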
Create pipeline component
A pipeline component can be created via Component.create.
See the reference documentation for more information. For example:
from azure.ml.component import Component, Pipeline, dsl
@dsl.pipeline(name='sample-pipeline',
              description='a sample pipeline',
              default_compute_target='aml-compute')
def sample_pipeline() -> Pipeline:
    ...
# Create the pipeline component from the decorated function.
component_func = Component.create(sample_pipeline, version="0.0.1")
Naming rule
The name of a pipeline component must be between 1 and 255 characters long and must start with a letter or a number. Valid characters are letters, numbers, “.”, “-” and “_”.
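The naming rule can be checked locally before creating a component. The sketch below is an assumption-laden illustration: it treats "letters" as ASCII letters only, and is_valid_component_name is a hypothetical helper rather than part of the SDK.

```python
import re

# First character: a letter or number. Remaining 0-254 characters: letters,
# numbers, ".", "-" or "_". Total length is therefore 1 to 255.
NAME_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{0,254}")


def is_valid_component_name(name: str) -> bool:
    """Return True if the whole name satisfies the naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["sample-pipeline", "-starts-with-dash", "has space", ""]:
    print(repr(candidate), is_valid_component_name(candidate))
```

Using fullmatch rather than match ensures that trailing invalid characters are rejected as well.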
Create anonymous pipeline component
By default, when a dsl.pipeline is submitted, all of its sub-pipelines are created as anonymous pipeline components, e.g.:
from azure.ml.component import Component, Pipeline, dsl
@dsl.pipeline(name='sub-pipeline',
              default_compute_target='aml-compute')
def sub_pipeline() -> Pipeline:
    ...
@dsl.pipeline(name='sample-pipeline',
              default_compute_target='aml-compute')
def sample_pipeline() -> Pipeline:
    # Calling sub_pipeline here makes it a sub-pipeline of sample_pipeline.
    node1 = sub_pipeline()
    ...
pipeline = sample_pipeline()
pipeline.submit()
In the code snippet above, when the pipeline is submitted, sub_pipeline is created as an anonymous pipeline component.