
Pipeline

mllabs._pipeline.Pipeline

Node graph that describes an ML workflow.

Holds groups (PipelineGroup) and nodes (PipelineNode). The implicit DataSource node is stored as nodes[None].

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| grps | dict[str, PipelineGroup] | All registered groups. |
| nodes | dict[str \| None, PipelineNode] | All nodes, keyed by name. None is the DataSource. |

set_grp(name, role=None, processor=None, edges=None, method=None, parent=None, adapter=None, params=None, desc=None, exist='diff')

Create or update a group.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Group name. Cannot contain __ or characters that are invalid in file paths. | required |
| role | str | 'stage' or 'head'. Inherited from parent if omitted. | None |
| processor | | Processor class. | None |
| edges | dict | Edge definitions {key: [(node_name, var_spec), ...]}. | None |
| method | str | Processor method name (e.g. 'fit_transform'). | None |
| parent | str | Parent group name, or None. | None |
| adapter | | ModelAdapter instance. | None |
| params | dict | Constructor parameters for the processor. | None |
| exist | str | Conflict resolution: 'diff' (default, skip if unchanged), 'skip', 'error', or 'replace'. | 'diff' |

Returns:

| Type | Description |
| --- | --- |
| dict | {result, grp, affected_nodes, [old_grp]}, where result is 'new', 'skip', or 'update'. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If name is invalid, role conflicts, or edges form a cycle. |
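
The exist modes and the result values can be pictured with a small, self-contained sketch. It does not import mllabs: upsert, registry, and spec are hypothetical names, and the behaviour shown for 'skip' (always keep the existing entry) and 'replace' (always overwrite) is an assumption inferred from the parameter description, not confirmed library behaviour.

```python
def upsert(registry, name, spec, exist="diff"):
    """Hypothetical create-or-update helper; not the mllabs implementation."""
    if exist not in ("diff", "skip", "error", "replace"):
        raise ValueError(f"unknown exist mode: {exist!r}")
    if name not in registry:
        registry[name] = spec
        return {"result": "new"}
    if exist == "error":
        raise ValueError(f"{name!r} already exists")
    if exist == "skip":
        return {"result": "skip"}  # assumption: existing entry is kept as-is
    old = registry[name]
    if exist == "diff" and old == spec:
        return {"result": "skip"}  # unchanged definition, nothing to do
    registry[name] = spec  # 'replace', or 'diff' with a changed definition
    return {"result": "update", "old": old}


registry = {}
first = upsert(registry, "scale", {"params": {"with_mean": True}})
same = upsert(registry, "scale", {"params": {"with_mean": True}})
changed = upsert(registry, "scale", {"params": {"with_mean": False}})
print(first["result"], same["result"], changed["result"])  # new skip update
```

With the default 'diff' mode, re-running the same definition is a no-op, which makes it convenient to re-execute a pipeline definition from top to bottom.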

set_node(name, grp, processor=None, edges=None, method=None, adapter=None, params=None, desc=None, exist='diff')

Create or update a node.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Node name. | required |
| grp | str | Group the node belongs to. | required |
| processor | | Processor class override. | None |
| edges | dict | Additional edge definitions merged on top of the group. | None |
| method | str | Method name override. | None |
| adapter | | ModelAdapter instance override. | None |
| params | dict | Constructor parameter overrides. | None |
| exist | str | Conflict resolution: 'diff' (default), 'skip', 'error', or 'replace'. | 'diff' |

Returns:

| Type | Description |
| --- | --- |
| dict | {result, obj, old_obj, affected_nodes}. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If the resolved processor or method is missing, edges are invalid, or a cycle would be created. |
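
The cycle check mentioned above can be sketched as a reachability test over the edge format {key: [(node_name, var_spec), ...]}. This is a minimal illustration, not the mllabs code: upstream, would_create_cycle, and graph are hypothetical names, and the var_spec values are placeholders because their real shape is not described here.

```python
def upstream(edges):
    """Node names referenced by an edge dict; the DataSource (None) has no parents."""
    return {src for pairs in edges.values() for src, _ in pairs if src is not None}


def would_create_cycle(graph, name, edges):
    """Return True if giving `name` these edges would make it its own ancestor."""
    stack = list(upstream(edges))
    seen = set()
    while stack:
        current = stack.pop()
        if current == name:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(upstream(graph.get(current, {})))
    return False


# graph maps each node name to its (already resolved) edge dict.
graph = {
    "scale": {"X": [(None, "num_cols")]},
    "pca": {"X": [("scale", None)]},
}
ok = would_create_cycle(graph, "model", {"X": [("pca", None)]})
bad = would_create_cycle(graph, "scale", {"X": [("pca", None)]})
print(ok, bad)  # False True
```

In the second call, scale would read from pca, which already reads from scale, so a ValueError would be the appropriate outcome.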

get_node_names(query)

Resolve a node query to a list of node names.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | | None (all nodes), list (exact names), or str (regex pattern matched against node names). | required |

Returns:

| Type | Description |
| --- | --- |
| list[str] | Matching node names (the DataSource key None is excluded for str and list queries). |
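
The three query forms can be illustrated with a standalone sketch. resolve_query is a hypothetical re-implementation, not the library function. Two details are assumptions: that the regex is applied with re.search rather than re.fullmatch, and that list entries naming unknown nodes are silently dropped.

```python
import re


def resolve_query(nodes, query):
    """Hypothetical sketch of the None / list / str query forms."""
    if query is None:
        return list(nodes)  # every key, DataSource (None) included
    names = [n for n in nodes if n is not None]
    if isinstance(query, str):
        pattern = re.compile(query)
        # Assumption: a match anywhere in the name counts.
        return [n for n in names if pattern.search(n)]
    wanted = set(query)
    return [n for n in names if n in wanted]


nodes = {None: "DataSource", "scale": "node", "lgb_1": "node", "lgb_2": "node"}
print(resolve_query(nodes, None))                # [None, 'scale', 'lgb_1', 'lgb_2']
print(resolve_query(nodes, "^lgb_"))             # ['lgb_1', 'lgb_2']
print(resolve_query(nodes, ["scale", "lgb_2"]))  # ['scale', 'lgb_2']
```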

get_node_attrs(name)

Return fully resolved attributes for a node (group hierarchy merged).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | str | Node name. | required |

Returns:

| Type | Description |
| --- | --- |
| dict | Keys: name, grp, processor, method, adapter, edges, params. |

get_node(name)

get_grp(name)

rename_grp(name_from, name_to)

remove_grp(name)

remove_node(name)

copy()

Return a deep copy of the entire pipeline.

Returns:

| Type | Description |
| --- | --- |
| Pipeline | New pipeline with all groups and nodes copied. |

copy_stage()

Return a copy containing only Stage groups and nodes.

Returns:

| Type | Description |
| --- | --- |
| Pipeline | Pipeline with only role='stage' groups and nodes. |

copy_nodes(node_names)

Return a copy containing the specified nodes and all their ancestors.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| node_names | list[str] | Target node names. Their upstream Stage dependencies are included automatically. | required |

Returns:

| Type | Description |
| --- | --- |
| Pipeline | Minimal pipeline needed to run node_names. |
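
Selecting "the specified nodes and all their ancestors" amounts to an upstream closure over the edge definitions. The sketch below shows that idea in isolation. with_ancestors and the example graph are hypothetical, the var_spec slots hold placeholder values, and the real method also copies the owning groups, which is not modelled here.

```python
def with_ancestors(edges_by_node, targets):
    """Return the target names plus every node reachable by following edges upstream."""
    keep = set()
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name is None or name in keep:  # None is the implicit DataSource
            continue
        keep.add(name)
        for pairs in edges_by_node.get(name, {}).values():
            stack.extend(src for src, _ in pairs)
    return keep


graph = {
    "impute": {"X": [(None, None)]},
    "scale": {"X": [("impute", None)]},
    "pca": {"X": [("scale", None)]},
    "lgb": {"X": [("scale", None)]},
    "lr": {"X": [("pca", None)]},
}
print(sorted(with_ancestors(graph, ["lgb"])))  # ['impute', 'lgb', 'scale']
```

Nodes that lgb does not depend on (pca and lr) are left out, which is what keeps the copied pipeline minimal.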

compare_nodes(nodes)

Compare params and X-edges across nodes that share the same processor.

Nodes are grouped by processor class. Within each group, only columns that differ between nodes are included.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| nodes | list[str] | Node names to compare. | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, pd.DataFrame] | {processor_name: DataFrame}, where the DataFrame index is node names and the columns are a MultiIndex of ('params', param_key) and ('X', stage_label). |
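
The "only columns that differ" rule can be shown without pandas, using plain dicts keyed by the same (section, key) tuples that the MultiIndex uses. differing_columns and the sample rows are hypothetical. The sketch covers one processor group and says nothing about how the library builds the actual DataFrame.

```python
def differing_columns(rows):
    """rows: {node_name: {(section, key): value}}. Keep columns whose values differ."""
    columns = []
    for row in rows.values():
        for col in row:
            if col not in columns:
                columns.append(col)
    missing = object()  # sentinel so an absent value differs from any real one
    varying = [
        col
        for col in columns
        # repr() lets unhashable values such as lists be compared
        if len({repr(row.get(col, missing)) for row in rows.values()}) > 1
    ]
    return {name: {col: row.get(col) for col in varying} for name, row in rows.items()}


rows = {
    "lgb_1": {("params", "n_estimators"): 100, ("params", "learning_rate"): 0.1, ("X", "scale"): "all"},
    "lgb_2": {("params", "n_estimators"): 500, ("params", "learning_rate"): 0.1, ("X", "scale"): "all"},
}
result = differing_columns(rows)
print(result["lgb_1"], result["lgb_2"])
```

Here learning_rate and the X edge are identical across both nodes, so only ('params', 'n_estimators') survives.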

desc_pipeline(max_depth=None, direction='TD')

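Return the pipeline structure as Mermaid Markdown.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| max_depth | | Maximum depth to display (unlimited if None). | None |
| direction | | Graph direction ('TD': top-down, 'LR': left-right). | 'TD' |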
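
For readers unfamiliar with Mermaid, the sketch below shows how a direction value such as 'TD' or 'LR' ends up in a flowchart definition. to_mermaid is a hypothetical helper. The real output of desc_pipeline (group subgraphs, labels, depth limiting) is not documented here and is likely richer than this.

```python
def to_mermaid(edges, direction="TD"):
    """Render (source, target) name pairs as a Mermaid flowchart body."""
    if direction not in ("TD", "LR"):
        raise ValueError("direction must be 'TD' or 'LR'")
    lines = [f"graph {direction}"]
    for src, dst in edges:
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)


text = to_mermaid([("DataSource", "scale"), ("scale", "lgb")], direction="LR")
print(text)
# graph LR
#     DataSource --> scale
#     scale --> lgb
```

Pasting the printed text into a Mermaid-enabled Markdown renderer draws the three nodes from left to right.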

desc_node(node_name, direction='TD', show_params=False)

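Return the connection structure leading up to a specific node as Mermaid Markdown.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| node_name | | Target node name. | required |
| direction | | Graph direction ('TD': top-down, 'LR': left-right). | 'TD' |
| show_params | | If True, show the node's parameter information. | False |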

mllabs._pipeline.PipelineGroup

A named group that shares configuration across its member nodes.

Groups form a hierarchy via parent. Child groups and their nodes inherit processor, method, adapter, edges, and params from ancestors, with child values taking precedence.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| name | str | Group name. |
| role | str | 'stage' or 'head'. |
| processor | | Processor class (optional, may be inherited). |
| edges | dict | Edge definitions (optional, merged with parent). |
| method | str | Processor method name (optional, may be inherited). |
| parent | str | Parent group name, or None. |
| adapter | | ModelAdapter instance (optional, may be inherited). |
| params | dict | Constructor parameters (optional, merged with parent). |
| children | list[str] | Child group names. |
| nodes | list[str] | Node names belonging to this group. |
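
The inheritance rule (ancestors first, child values taking precedence, edges and params merged) can be sketched with plain dicts. resolve and the sample groups are hypothetical, processors are shown as strings for brevity, and the key-by-key merge of edges and params is an assumption about how "merged with parent" works.

```python
def resolve(grps, name):
    """Merge attributes from the root ancestor down to `name`; children win."""
    chain = []
    while name is not None:
        chain.append(grps[name])
        name = grps[name].get("parent")
    resolved = {"processor": None, "method": None, "adapter": None, "edges": {}, "params": {}}
    for grp in reversed(chain):  # root first, so later (child) values override
        for key in ("processor", "method", "adapter"):
            if grp.get(key) is not None:
                resolved[key] = grp[key]
        resolved["edges"] = {**resolved["edges"], **grp.get("edges", {})}
        resolved["params"] = {**resolved["params"], **grp.get("params", {})}
    return resolved


grps = {
    "models": {"parent": None, "method": "fit", "params": {"random_state": 0}},
    "trees": {
        "parent": "models",
        "processor": "GradientBoosting",
        "params": {"n_estimators": 100, "random_state": 42},
    },
}
attrs = resolve(grps, "trees")
print(attrs["method"], attrs["params"])
```

The child group trees inherits method from models, while its own random_state replaces the parent's value.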

get_attrs(grps)

diff(processor=None, edges=None, method=None, parent=None, adapter=None, params=None)

mllabs._pipeline.PipelineNode

An individual executable unit in the pipeline.

Node-level attributes override group attributes. Final resolved values are obtained via get_attrs.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| name | str | Node name. |
| grp | str | Parent group name. |
| processor | | Processor class override (None means inherit from the group). |
| edges | dict | Additional or overriding edge definitions. |
| method | str | Processor method name override. |
| adapter | | ModelAdapter instance override. |
| params | dict | Constructor parameter overrides. |
| output_edges | list[str] | Names of nodes that consume this node's output. |

get_attrs(grps)

diff(grp, processor=None, edges=None, method=None, adapter=None, params=None)