Component build
Overview
When developing custom components, data scientists may need a “fully resolved” snapshot of a component so they can process it before creating the component (for example, sending the snapshot to code signing first).
In this case, use az ml component build to build a local snapshot of the component from the component YAML.
The built snapshot is the same as the one uploaded when the component is created in a workspace: the source directory is resolved, .amlignore rules are applied, and additional includes are copied in.
Goal:
If “code” is specified in the spec file, the built snapshot follows it.
Additional include files are included.
Files matched by the .amlignore file are excluded.
Non-Goal:
Building from a remote (GitHub/DevOps) YAML file is not supported.
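The goals above can be pictured with a minimal sketch. It is not the CLI's implementation: build_snapshot, its parameters, and the name-based pattern matching are all hypothetical and only illustrate the three rules (follow the code folder, exclude ignored files, add additional includes).

```python
import fnmatch
import shutil
from pathlib import Path


def build_snapshot(code_dir, target_dir, ignore_patterns=(), additional_includes=()):
    """Hypothetical helper: copy code_dir to target_dir, apply ignores, add extras."""
    code_dir, target_dir = Path(code_dir), Path(target_dir)

    def ignore(_directory, names):
        # Skip every entry whose name matches one of the ignore patterns.
        return [n for n in names if any(fnmatch.fnmatch(n, p) for p in ignore_patterns)]

    # copytree creates target_dir and raises if it already exists.
    shutil.copytree(code_dir, target_dir, ignore=ignore)

    # Assumption: each additional include is copied to the snapshot root.
    for extra in map(Path, additional_includes):
        destination = target_dir / extra.name
        if extra.is_dir():
            shutil.copytree(extra, destination)
        else:
            shutil.copy2(extra, destination)
    return target_dir
```

The real command may resolve patterns and include destinations differently; the sketch only shows the order of operations.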
Use scenarios
Local test
A user would like to verify the snapshot content before creating it as a component in a workspace, with the following steps.
Build the snapshot to a local folder.
Check that the local folder contains the expected files.
Use the built snapshot to create a component in the workspace.
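The "check" step above can be scripted. The following is a small sketch; missing_files is a hypothetical helper, and the list of expected paths is whatever your project requires.

```python
from pathlib import Path


def missing_files(snapshot_dir, expected_relative_paths):
    """Return the expected paths that are not present as files in the snapshot."""
    snapshot_dir = Path(snapshot_dir)
    return [p for p in expected_relative_paths if not (snapshot_dir / p).is_file()]
```

For example, missing_files(".build", ["module_entry/module_spec.yaml", "module_entry/run.py"]) returns an empty list when both files made it into the snapshot.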
CI/CD
In a CI/CD pipeline, a user would like to build a snapshot, modify it (for example, by code signing), and then create it as a component in a workspace, with the following steps.
In the build pipeline
Build a component snapshot.
Update the snapshot (e.g. code signing).
Publish the modified component snapshot as an artifact.
In the release pipeline
Create the component in a workspace from the signed component snapshot.
Examples
Example component project structure.
src/
  python/
    library1/
      hello.py
    library2/
      en_US/
        messages.json
      zh_CN/
        messages.json
      greetings.py
assets/
  LICENSE
module_entry/
  module_spec.yaml
  file
  run.py
In module_entry/module_spec.yaml, the “code” folder is specified as .., which sets the base folder of the snapshot to the parent folder of module_entry.
...
code: ..
...
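To make the effect of code: .. concrete, here is a sketch of how the base folder could be derived. resolve_code_folder is a hypothetical name; the only behaviour taken from this document is that the value is resolved relative to the folder containing the spec file.

```python
from pathlib import Path


def resolve_code_folder(spec_file, code):
    """Resolve the 'code' value relative to the folder that holds the spec file."""
    return (Path(spec_file).parent / code).resolve()
```

With spec_file set to module_entry/module_spec.yaml and code set to "..", the result is the folder that contains src/, assets/ and module_entry/.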
Example: build snapshot in default folder
By default, the snapshot is built into a .build/ folder inside the component code folder.
In this example the component project folder is module_entry. Running
az ml component build --file module_entry/module_spec.yaml
produces the following project structure:
src/
  ...
assets/
  ...
module_entry/
  ...
.build/
  src/
    ...
  assets/
    ...
  module_entry/
    ...
Note:
When building a snapshot, the .build/ directory is ignored. Otherwise, when the same component is built twice, the .build/ folder from the first build would be included in the snapshot of the second build.
When creating a component, .build/ is not ignored.
When creating a component, .build/ is not created, because a component project folder cannot be distinguished from a component snapshot folder. When a component is created from a built snapshot, no additional .build/ folder should appear in the snapshot folder.
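The first point in the note can be demonstrated with a short sketch. rebuild is a hypothetical helper, not the CLI's code; it copies the code folder into its own .build/ subfolder and skips any existing .build/ so that repeated builds do not nest. It relies on dirs_exist_ok, which needs Python 3.8 or later.

```python
import shutil
from pathlib import Path

BUILD_DIR_NAME = ".build"


def rebuild(code_dir):
    """Copy code_dir into code_dir/.build, ignoring a .build left by earlier runs."""
    code_dir = Path(code_dir)
    target = code_dir / BUILD_DIR_NAME
    shutil.copytree(
        code_dir,
        target,
        ignore=shutil.ignore_patterns(BUILD_DIR_NAME),  # prevents .build/.build
        dirs_exist_ok=True,  # assumption: a second build overwrites in place
    )
    return target
```

Without the ignore pattern, the second call would copy the first build's output into .build/.build/.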
Example: build snapshot to specific directory with parameter --target
In some cases, a user might need to build the snapshot into a specific directory (for example, in a build pipeline, building every component’s snapshot into one artifact folder and then publishing that folder).
Run
az ml component build --file module_entry/module_spec.yaml --target snapshot
to generate the snapshot in the snapshot folder:
src/
  ...
assets/
  ...
module_entry/
  ...
snapshot/
  src/
    ...
  assets/
    ...
  module_entry/
    ...
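For the pipeline case mentioned above (many components built into one artifact folder), the commands can be generated in a loop. This sketch only constructs the argument lists, using the --file and --target parameters shown in this document; build_commands and the one-subfolder-per-component naming are assumptions.

```python
from pathlib import Path


def build_commands(spec_files, artifact_dir):
    """Return one 'az ml component build' argument list per spec file."""
    commands = []
    for spec in map(Path, spec_files):
        # Assumption: each snapshot goes to <artifact_dir>/<spec folder name>.
        target = Path(artifact_dir) / spec.parent.name
        commands.append([
            "az", "ml", "component", "build",
            "--file", spec.as_posix(),
            "--target", target.as_posix(),
        ])
    return commands
```

Each list can be passed to subprocess.run(command, check=True) on a machine where the CLI is installed.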
Note:
If --target is inside the project folder, it is ignored during the build, just like .build/.
However, setting --target to the project folder itself is not allowed, because a build is always expected to create a new folder.
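The two rules in this note can be expressed as a simple path check. check_target is a hypothetical helper that sketches the described behaviour, not the CLI's actual validation.

```python
from pathlib import Path


def check_target(project_dir, target_dir):
    """Reject a target equal to the project folder; report whether it is nested."""
    project = Path(project_dir).resolve()
    target = Path(target_dir).resolve()
    if target == project:
        raise ValueError("--target must not be the project folder itself")
    # True means the target is inside the project and must be skipped when copying.
    return project in target.parents
```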
Example: build snapshot with additional aml ignore files
Besides putting an .amlignore file in the base folder of each component, some components may share the same .amlignore file (for example, all Python components need to ignore the __pycache__/ folder).
A shared .amlignore file is supported: create it once, then specify its path when building the component.
Suppose we have an additional ignore file /shared/.amlignore with the following content:
file
Running az ml component build --file module_entry/module_spec.yaml --amlignore-file /shared/.amlignore produces the following project structure:
src/
  ...
assets/
  ...
module_entry/
  ...
.build/
  src/
    ...
  assets/
    ...
  module_entry/
    ...
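Conceptually, the shared ignore file adds its patterns to those of the component's own .amlignore. The sketch below illustrates that idea with deliberately simplified, name-based matching; read_ignore_patterns and is_ignored are hypothetical helpers, and the CLI's real pattern syntax may be richer.

```python
import fnmatch
from pathlib import Path


def read_ignore_patterns(*ignore_files):
    """Collect non-empty, non-comment lines from every ignore file that exists."""
    patterns = []
    for ignore_file in map(Path, ignore_files):
        if not ignore_file.is_file():
            continue
        for line in ignore_file.read_text().splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                patterns.append(line.rstrip("/"))
    return patterns


def is_ignored(relative_path, patterns):
    """Simplified rule: ignore a path if any of its parts matches any pattern."""
    parts = Path(relative_path).parts
    return any(fnmatch.fnmatch(part, pattern) for part in parts for pattern in patterns)
```

With the example content above (a single line, file), module_entry/file is excluded from the snapshot while module_entry/run.py is kept.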
Discussion:
The scenario above can also be achieved with the following commands.
az configure --defaults component_amlignore_file=/shared/.amlignore
az ml component build --file module_entry/module_spec.yaml
See the Ignore files doc for reference.
Example: CI/CD code signing workflow
This example shows how component build works with code signing.
In the build pipeline:
Run az ml component build --file module_entry/module_spec.yaml to get the snapshot.
src/
  ...
assets/
  ...
module_entry/
  ...
.build/
  src/
    ...
  assets/
    ...
  module_entry/
    ...
Modify the snapshot: generate a manifest file catalog.json for code signing.
src/
  ...
assets/
  ...
module_entry/
  ...
.build/
  src/
    ...
  assets/
    ...
  module_entry/
    ...
  catalog.json # This is a manifest file with all files in snapshot and SHA hash.
Sign catalog.json.
src/
  ...
assets/
  ...
module_entry/
  ...
.build/
  src/
    ...
  assets/
    ...
  module_entry/
    ...
  catalog.json # This is a manifest file with all files in snapshot and SHA hash.
  catalog.json.sig # Signed manifest file
Publish the signed component snapshot as an artifact.
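The manifest step can be sketched as follows. The actual catalog.json schema and hash algorithm required by your signing service are not specified here, so this hypothetical write_manifest assumes a flat mapping of relative path to SHA-256 digest.

```python
import hashlib
import json
from pathlib import Path


def write_manifest(snapshot_dir, manifest_name="catalog.json"):
    """Write a manifest mapping each file's relative path to its SHA-256 digest."""
    snapshot_dir = Path(snapshot_dir)
    manifest_path = snapshot_dir / manifest_name
    entries = {}
    for path in sorted(snapshot_dir.rglob("*")):
        # Skip directories and the manifest itself so reruns stay stable.
        if path.is_file() and path != manifest_path:
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            entries[path.relative_to(snapshot_dir).as_posix()] = digest
    manifest_path.write_text(json.dumps(entries, indent=2))
    return entries
```

The signing step would then operate on the generated manifest to produce catalog.json.sig.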
In the release pipeline:
Run az ml component create --file .build/module_entry/module_spec.yaml to create the component in the workspace from the signed snapshot.
See here for how to troubleshoot the component build process.