ComponentSpec Reference Guide

The ComponentSpec format is the standard specification for defining reusable machine learning pipeline components. This guide covers all aspects of the ComponentSpec schema, from basic structure to advanced features.

Overview

A ComponentSpec defines a reusable component that can be executed as part of a machine learning pipeline. A component, whether implemented as a container or as a graph of tasks, follows this structure:

    <ComponentSpec>:
      name: string
      description: string
      metadata:
        annotations: Map[string, any]
      inputs: #InputSpec[]
        <InputSpec[]>:
          name: string
          type: TypeSpecType
          description: string
          default: string
          optional: boolean
          annotations: Map[string, any]
      outputs: #OutputSpec[]
        <OutputSpec[]>:
          name: string
          type: TypeSpecType
          description: string
          annotations: Map[string, any]
      implementation: #ImplementationType
        <oneOf>:
          <ContainerImplementation>:
            container: #ContainerSpec
              image: StringOrPlaceholder
              command: #StringOrPlaceholder[]
                <oneOf>:
                  string:
                  <InputValuePlaceholder>:
                    inputValue: string
                  <InputPathPlaceholder>:
                    inputPath: string
                  <OutputPathPlaceholder>:
                    outputPath: string
                  <ConcatPlaceholder>:
                    concat: StringOrPlaceholder[]
                  <IfPlaceholder>:
                    if:
                      cond: IfConditionArgumentType
                      then: StringOrPlaceholder[]
                      else: StringOrPlaceholder[]
              args: StringOrPlaceholder[]
              env: Map[str, StringOrPlaceholder]
          <GraphImplementation>:
            graph: #GraphSpec
              outputValues: Map string -> TaskOutputArgument
              tasks: #Map string -> TaskSpec
                <TaskSpec>:
                  componentRef: #ComponentReference
                    name: string
                    digest: string
                    tag: string
                    url: string
                    spec: ComponentSpec
                  arguments: #Map string -> ArgumentType
                    <ArgumentType>:
                      <oneOf>:
                        string:
                        <GraphInputArgument>:
                          graphInput:
                            inputName: string
                            type: TypeSpecType
                        <TaskOutputArgument>:
                          taskOutput:
                            taskId: string
                            outputName: string
                            type: TypeSpecType
                  isEnabled: PredicateType
                  executionOptions: #ExecutionOptionsSpec
                    retryStrategy: #RetryStrategySpec
                      maxRetries: integer
                    cachingStrategy: #CachingStrategySpec
                      maxCacheStaleness: string
                  annotations: Map[string, any]

Component Specification Schema Reference

This document describes all definitions in the Component Specification JSON Schema.

TypeSpecType

Defines the type system for component inputs and outputs.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| - | string or object | - | Either a string type name or an object with nested type specifications |

InputSpec

Describes the component input specification.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name of the input |
| type | TypeSpecType | No | Type specification for the input |
| description | string | No | Description of the input |
| default | string | No | Default value for the input |
| optional | boolean | No | Whether the input is optional (default: false) |
| annotations | object | No | Additional metadata annotations |
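
To make the roles of `default` and `optional` concrete, the sketch below writes three inputs as Python dictionaries that mirror the InputSpec table and checks which of them still need an argument. The input names, the type names and the helper `missing_arguments` are hypothetical. The rule that a default value satisfies a non-optional input is an assumption, not something the schema states.

```python
# Hypothetical inputs, written as Python dicts that mirror InputSpec.
inputs = [
    {"name": "dataset", "type": "CSV", "description": "Training data"},
    {"name": "epochs", "type": "Integer", "default": "10"},
    {"name": "notes", "type": "String", "optional": True},
]


def missing_arguments(input_specs, arguments):
    """Return names of inputs that have no argument, no default, and are not optional."""
    missing = []
    for spec in input_specs:
        if spec["name"] in arguments:
            continue
        # Assumption: a default or optional=True means no argument is needed.
        if "default" in spec or spec.get("optional", False):
            continue
        missing.append(spec["name"])
    return missing


print(missing_arguments(inputs, {"epochs": "5"}))  # ['dataset']
```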

OutputSpec

Describes the component output specification.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | Name of the output |
| type | TypeSpecType | No | Type specification for the output |
| description | string | No | Description of the output |
| annotations | object | No | Additional metadata annotations |

InputValuePlaceholder

Represents the command-line argument placeholder that will be replaced at run-time by the input argument value.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| inputValue | string | Yes | Name of the input |

InputPathPlaceholder

Represents the command-line argument placeholder that will be replaced at run-time by a local file path pointing to a file containing the input argument value.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| inputPath | string | Yes | Name of the input |

OutputPathPlaceholder

Represents the command-line argument placeholder that will be replaced at run-time by a local file path pointing to a file where the program should write its output data.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| outputPath | string | Yes | Name of the output |
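
The three placeholders above can be illustrated with a small resolver. This is only a sketch of the substitution idea, not the way any particular backend implements it: the function `resolve`, the script name `train.py` and the file paths are hypothetical, and a real system chooses the paths itself.

```python
def resolve(item, values, input_paths, output_paths):
    """Resolve one command-line item: a plain string or a single-key placeholder dict."""
    if isinstance(item, str):
        return item
    if "inputValue" in item:
        return values[item["inputValue"]]
    if "inputPath" in item:
        return input_paths[item["inputPath"]]
    if "outputPath" in item:
        return output_paths[item["outputPath"]]
    raise ValueError(f"unsupported placeholder: {item!r}")


# Hypothetical command mixing plain strings and placeholders.
command = [
    "python", "train.py",
    "--epochs", {"inputValue": "epochs"},
    "--data", {"inputPath": "dataset"},
    "--model", {"outputPath": "model"},
]

argv = [
    resolve(
        item,
        values={"epochs": "10"},
        input_paths={"dataset": "/tmp/inputs/dataset/data"},  # hypothetical path
        output_paths={"model": "/tmp/outputs/model/data"},    # hypothetical path
    )
    for item in command
]
print(argv)
```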

StringOrPlaceholder

Union type that can be one of several placeholder types or a string.

Type Options
string
InputValuePlaceholder
InputPathPlaceholder
OutputPathPlaceholder
ConcatPlaceholder
IfPlaceholder

ConcatPlaceholder

Represents the command-line argument placeholder that will be replaced at run-time by the concatenated values of its items.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| concat | array of StringOrPlaceholder | Yes | Items to concatenate |
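
A typical use of `concat` is building a single `--flag=value` argument. The sketch below assumes the items are joined with no separator and handles only plain strings, `inputValue` and nested `concat`. The function `resolve` and the input name `learning_rate` are hypothetical.

```python
def resolve(item, values):
    """Resolve a string, an inputValue placeholder, or a (possibly nested) concat placeholder."""
    if isinstance(item, str):
        return item
    if "inputValue" in item:
        return values[item["inputValue"]]
    if "concat" in item:
        # Assumption: items are joined without any separator.
        return "".join(resolve(part, values) for part in item["concat"])
    raise ValueError(f"unsupported placeholder: {item!r}")


arg = {"concat": ["--learning-rate=", {"inputValue": "learning_rate"}]}
print(resolve(arg, {"learning_rate": "0.01"}))  # --learning-rate=0.01
```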

IsPresentPlaceholder

Represents the command-line argument placeholder that will be replaced at run-time by a boolean value specifying whether the caller has passed an argument for the specified optional input.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| isPresent | string | No | Name of the input |

IfConditionArgumentType

Union type for conditional arguments.

Type Options
IsPresentPlaceholder
boolean
string
InputValuePlaceholder

ListOfStringsOrPlaceholders

Array type definition.

| Type | Description |
| --- | --- |
| array of StringOrPlaceholder | Array containing strings or placeholder objects |

IfPlaceholder

Represents the command-line argument placeholder that will be replaced at run-time by the expanded values of either its "then" list or its "else" list, depending on the value of the "cond" condition.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| if | object | Yes | Conditional structure |
| if.cond | IfConditionArgumentType | Yes | Condition to evaluate |
| if.then | ListOfStringsOrPlaceholders | Yes | Values to use when condition is true |
| if.else | ListOfStringsOrPlaceholders | No | Values to use when condition is false |
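
A common pattern is to pass a flag only when the caller supplied an optional input. The sketch below expands an `if` placeholder whose condition is an `isPresent` placeholder or a literal boolean. It deliberately rejects string and `inputValue` conditions, because how those are interpreted is not described here. The functions `expand` and `expand_all` and the input names are hypothetical.

```python
def expand(item, values):
    """Expand one item into a list of command-line strings."""
    if isinstance(item, str):
        return [item]
    if "inputValue" in item:
        return [values[item["inputValue"]]]
    if "if" in item:
        branch = item["if"]
        cond = branch["cond"]
        if isinstance(cond, dict) and "isPresent" in cond:
            # True when the caller passed an argument for the named input.
            cond = cond["isPresent"] in values
        elif not isinstance(cond, bool):
            raise ValueError("this sketch handles only isPresent and boolean conditions")
        chosen = branch["then"] if cond else branch.get("else", [])
        return [text for part in chosen for text in expand(part, values)]
    raise ValueError(f"unsupported placeholder: {item!r}")


def expand_all(items, values):
    return [text for item in items for text in expand(item, values)]


args = [
    "--data", {"inputValue": "dataset"},
    {"if": {
        "cond": {"isPresent": "notes"},
        "then": ["--notes", {"inputValue": "notes"}],
    }},
]

print(expand_all(args, {"dataset": "d.csv", "notes": "hello"}))
print(expand_all(args, {"dataset": "d.csv"}))
```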

ContainerSpec

Defines container execution specification.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| image | StringOrPlaceholder | Yes | Docker image name |
| command | array of StringOrPlaceholder | No | Entrypoint array. Not executed within a shell. The Docker image's ENTRYPOINT is used if this is not provided |
| args | array of StringOrPlaceholder | No | Arguments to the entrypoint. The Docker image's CMD is used if this is not provided |
| env | object with StringOrPlaceholder values | No | Environment variables to set in the container, as a map from variable name to value |
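
The fallback rules for `command` and `args` can be summarized in a few lines. The sketch below writes a container section as a Python dictionary mirroring ContainerSpec and reports where the entrypoint and arguments would come from. The helper `launch_sources`, the image tag and the environment variable are illustrative assumptions.

```python
# Hypothetical container section of a component, mirroring ContainerSpec.
container = {
    "image": "python:3.12",
    "command": ["python", "-c", "import sys; print(sys.argv[1])"],
    "args": [{"inputValue": "message"}],
    "env": {"LOG_LEVEL": "info"},
}


def launch_sources(spec):
    """Report where the entrypoint and arguments come from, per the fallback rules in the table."""
    if "image" not in spec:
        raise ValueError("image is required")
    return {
        "entrypoint": "spec command" if "command" in spec else "image ENTRYPOINT",
        "arguments": "spec args" if "args" in spec else "image CMD",
    }


print(launch_sources(container))
print(launch_sources({"image": "python:3.12"}))
```

Because `command` is not run inside a shell, features such as redirection or variable expansion only work if the command itself starts a shell.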

ContainerImplementation

Represents the container component implementation.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| container | ContainerSpec | Yes | Container specification |

ImplementationType

Union type for component implementations.

Type Options
ContainerImplementation
GraphImplementation

MetadataSpec

Metadata specification for components.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| annotations | object | No | Additional metadata annotations |

ComponentSpec

Component specification. Describes the metadata (name, description, source), the interface (inputs and outputs) and the implementation of the component.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | Component name |
| description | string | No | Component description |
| inputs | array of InputSpec | No | Component input specifications |
| outputs | array of OutputSpec | No | Component output specifications |
| implementation | ImplementationType | Yes | Component implementation |
| metadata | MetadataSpec | No | Component metadata |
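
Putting the pieces together, the sketch below shows one possible complete container component as a Python dictionary, plus a small check against the Required columns of the tables above. The component itself (names, types, image, script) and the helper `check_component` are hypothetical. The check covers only a few required fields and is no substitute for validating against the actual JSON Schema.

```python
# Hypothetical component, written as a Python dict that mirrors ComponentSpec.
component = {
    "name": "Train model",
    "description": "Trains a model on a CSV dataset.",
    "inputs": [
        {"name": "dataset", "type": "CSV"},
        {"name": "epochs", "type": "Integer", "default": "10"},
    ],
    "outputs": [{"name": "model", "type": "Model"}],
    "implementation": {
        "container": {
            "image": "python:3.12",
            "command": [
                "python", "train.py",
                "--data", {"inputPath": "dataset"},
                "--epochs", {"inputValue": "epochs"},
                "--model", {"outputPath": "model"},
            ],
        }
    },
}


def check_component(spec):
    """Return a list of problems found against a few required fields."""
    problems = []
    if "implementation" not in spec:
        problems.append("implementation is required")
    else:
        impl = spec["implementation"]
        if ("container" in impl) == ("graph" in impl):
            problems.append("implementation must be exactly one of container or graph")
        elif "container" in impl and "image" not in impl["container"]:
            problems.append("container.image is required")
    for kind in ("inputs", "outputs"):
        for index, item in enumerate(spec.get(kind, [])):
            if "name" not in item:
                problems.append(f"{kind}[{index}].name is required")
    return problems


print(check_component(component))  # []
```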

ComponentReference

Component reference. Contains information that can be used to locate and load a component by name, digest or URL.

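| Property | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | No | Component name |
| digest | string | No | Component digest |
| tag | string | No | Component tag |
| url | string | No | Component URL |
| spec | ComponentSpec | No | Inline component specification |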

GraphInputArgument

Represents the component argument value that comes from the graph component input.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| graphInput | object | Yes | References the input of the graph/pipeline |
| graphInput.inputName | string | Yes | Name of the graph input |
| graphInput.type | TypeSpecType | No | Type specification |

TaskOutputArgument

Represents the component argument value that comes from the output of a sibling task.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| taskOutput | object | Yes | References the output of a sibling task |
| taskOutput.taskId | string | Yes | ID of the task |
| taskOutput.outputName | string | Yes | Name of the output |
| taskOutput.type | TypeSpecType | No | Type specification |

ArgumentType

Union type for task arguments.

Type Options
string
GraphInputArgument
TaskOutputArgument
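
The three argument forms can appear side by side in one task's `arguments` map. The sketch below shows a constant string, a graph input reference and a sibling task output reference, and extracts which tasks and graph inputs the arguments depend on. All names and both helper functions are hypothetical.

```python
# Hypothetical arguments map for one task, keyed by the component's input names.
arguments = {
    "learning_rate": "0.01",                                   # constant string
    "dataset": {"graphInput": {"inputName": "training_data"}},  # GraphInputArgument
    "features": {"taskOutput": {"taskId": "preprocess", "outputName": "features"}},
}


def upstream_task_ids(args):
    """Return the sorted IDs of sibling tasks whose outputs are consumed."""
    return sorted({
        arg["taskOutput"]["taskId"]
        for arg in args.values()
        if isinstance(arg, dict) and "taskOutput" in arg
    })


def graph_input_names(args):
    """Return the sorted names of graph inputs that are referenced."""
    return sorted({
        arg["graphInput"]["inputName"]
        for arg in args.values()
        if isinstance(arg, dict) and "graphInput" in arg
    })


print(upstream_task_ids(arguments))   # ['preprocess']
print(graph_input_names(arguments))   # ['training_data']
```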

TwoArgumentOperands

Pair of operands for a binary operation.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| op1 | ArgumentType | Yes | First operand |
| op2 | ArgumentType | Yes | Second operand |

TwoLogicalOperands

Pair of operands for a binary logical operation.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| op1 | PredicateType | Yes | First logical operand |
| op2 | PredicateType | Yes | Second logical operand |

PredicateType

Union type for predicate expressions.

| Operator | Operand Type | Description |
| --- | --- | --- |
| == | TwoArgumentOperands | Equality comparison |
| != | TwoArgumentOperands | Inequality comparison |
| > | TwoArgumentOperands | Greater than comparison |
| >= | TwoArgumentOperands | Greater than or equal comparison |
| < | TwoArgumentOperands | Less than comparison |
| <= | TwoArgumentOperands | Less than or equal comparison |
| and | TwoLogicalOperands | Logical AND operation |
| or | TwoLogicalOperands | Logical OR operation |
| not | PredicateType | Logical NOT operation |
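
Predicates nest, so a small recursive evaluator may help in reading them. The sketch below assumes each predicate is an object with a single key, the operator, whose value is the operand structure from the table. It resolves only constant strings and graph inputs, and compares operands as Python strings, so ordering comparisons are lexicographic. All of these are assumptions made for illustration, as are the function names and the graph input names.

```python
import operator

COMPARISONS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def resolve_argument(arg, graph_inputs):
    """Resolve a constant string or a graphInput argument to a string value."""
    if isinstance(arg, str):
        return arg
    if "graphInput" in arg:
        return graph_inputs[arg["graphInput"]["inputName"]]
    raise ValueError("taskOutput arguments are not handled in this sketch")


def evaluate(pred, graph_inputs):
    """Evaluate a predicate of the assumed form {operator: operands}."""
    [(op, operands)] = pred.items()
    if op in COMPARISONS:
        left = resolve_argument(operands["op1"], graph_inputs)
        right = resolve_argument(operands["op2"], graph_inputs)
        return COMPARISONS[op](left, right)
    if op == "and":
        return evaluate(operands["op1"], graph_inputs) and evaluate(operands["op2"], graph_inputs)
    if op == "or":
        return evaluate(operands["op1"], graph_inputs) or evaluate(operands["op2"], graph_inputs)
    if op == "not":
        return not evaluate(operands, graph_inputs)
    raise ValueError(f"unknown operator: {op!r}")


# Enabled when mode == "train" and dry_run is not "true".
is_enabled = {"and": {
    "op1": {"==": {"op1": {"graphInput": {"inputName": "mode"}}, "op2": "train"}},
    "op2": {"not": {"==": {"op1": {"graphInput": {"inputName": "dry_run"}}, "op2": "true"}}},
}}

print(evaluate(is_enabled, {"mode": "train", "dry_run": "false"}))  # True
print(evaluate(is_enabled, {"mode": "train", "dry_run": "true"}))   # False
```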

RetryStrategySpec

Optional configuration that specifies how the task should be retried if it fails.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| maxRetries | integer | No | Maximum number of retry attempts |

CachingStrategySpec

Optional configuration that specifies when task execution may be skipped because the output data already exists in the cache.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| maxCacheStaleness | string (duration format) | No | Maximum age of cached data to consider valid |

ExecutionOptionsSpec

Optional configuration that specifies how the task should be executed. Can be used to set some platform-specific options.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| retryStrategy | RetryStrategySpec | No | Retry configuration |
| cachingStrategy | CachingStrategySpec | No | Caching configuration |
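
As an illustration of how `maxRetries` might be interpreted, the sketch below wraps an action in a retry loop. It assumes that `maxRetries` counts attempts after the first one, so `maxRetries: 2` allows up to three attempts in total, and it writes the staleness value as an ISO 8601 duration. Both are assumptions, since the schema only says "duration format" and does not define the counting. The helper `run_with_retries` and the `flaky` action are hypothetical.

```python
# Hypothetical execution options, mirroring ExecutionOptionsSpec.
execution_options = {
    "retryStrategy": {"maxRetries": 2},
    "cachingStrategy": {"maxCacheStaleness": "P7D"},  # assumed ISO 8601 duration
}


def run_with_retries(action, options):
    """Call action until it succeeds or the retry budget is used up."""
    max_retries = options.get("retryStrategy", {}).get("maxRetries", 0)
    attempts = 0
    while True:
        attempts += 1
        try:
            return action(), attempts
        except Exception:
            # Assumption: maxRetries counts retries after the first attempt.
            if attempts > max_retries:
                raise


calls = {"count": 0}


def flaky():
    """Fail twice, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


result, attempts = run_with_retries(flaky, execution_options)
print(result, attempts)  # ok 3
```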

TaskSpec

Task specification. A task is a configured component: a component supplied with arguments and with any other configuration changes applied.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| componentRef | ComponentReference | Yes | Reference to the component |
| arguments | object with ArgumentType values | No | Arguments to pass to the component |
| isEnabled | PredicateType | No | Condition for enabling the task |
| executionOptions | ExecutionOptionsSpec | No | Execution configuration |
| annotations | object | No | Additional metadata annotations |

GraphSpec

Describes the graph component implementation. It represents a graph of component tasks connected to the upstream sources of data using the argument specifications. It also describes the sources of graph output values.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| tasks | object with TaskSpec values | Yes | Tasks in the graph |
| outputValues | object with TaskOutputArgument values | No | Output value mappings |
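
Because tasks are connected only through their arguments, an execution order can be derived from the `taskOutput` references. The sketch below builds a hypothetical two-task graph and orders it with Python's standard `graphlib.TopologicalSorter`. The task IDs, component names and the helper `execution_order` are made up for illustration, and a real backend may schedule tasks differently, for example in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical graph, written as a Python dict that mirrors GraphSpec.
graph = {
    "tasks": {
        "train": {
            "componentRef": {"name": "Train model"},
            "arguments": {
                "features": {"taskOutput": {"taskId": "preprocess", "outputName": "features"}},
            },
        },
        "preprocess": {
            "componentRef": {"name": "Preprocess data"},
            "arguments": {"raw": {"graphInput": {"inputName": "raw_data"}}},
        },
    },
    "outputValues": {
        "model": {"taskOutput": {"taskId": "train", "outputName": "model"}},
    },
}


def execution_order(graph_spec):
    """Order task IDs so that every task comes after the tasks it reads from."""
    dependencies = {}
    for task_id, task in graph_spec["tasks"].items():
        dependencies[task_id] = {
            arg["taskOutput"]["taskId"]
            for arg in task.get("arguments", {}).values()
            if isinstance(arg, dict) and "taskOutput" in arg
        }
    return list(TopologicalSorter(dependencies).static_order())


print(execution_order(graph))  # ['preprocess', 'train']
```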

GraphImplementation

Represents the graph component implementation.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| graph | GraphSpec | Yes | Graph specification |

PipelineRunSpec

The object that can be sent to the backend to start a new Run.

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| rootTask | TaskSpec | Yes | Main task to execute |
| onExitTask | TaskSpec | No | Task to execute on exit |
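
A run request is therefore just one or two TaskSpec objects. The sketch below assembles a hypothetical PipelineRunSpec and serializes it to JSON. The component URL, the component name and the argument are placeholders, and how the payload is actually submitted depends on the backend, which this reference does not describe.

```python
import json

# Hypothetical run request, written as a Python dict that mirrors PipelineRunSpec.
run_spec = {
    "rootTask": {
        "componentRef": {"url": "https://example.com/components/train/component.yaml"},
        "arguments": {"epochs": "10"},
    },
    "onExitTask": {
        "componentRef": {"name": "Send notification"},
    },
}

payload = json.dumps(run_spec, indent=2)
print(payload)
```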