feat(workflows): add continue_on_error step field for non-halting failures by doquanghuy · Pull Request #2663 · github/spec-kit

doquanghuy · 2026-05-21T14:34:47Z

Description

Closes #2591.

Adds an optional continue_on_error: bool field on every step.
When set to true and the step fails, the engine records the
result (exit_code, stderr, status) into steps.<id>.output and
continues to the next sibling step instead of halting the run.
Downstream if, switch, or gate steps can then branch on
{{ steps.<id>.output.exit_code }} to route the recovery path.

This is the shape @mnriem proposed in the issue discussion —
it composes with primitives that already exist (the exit code
is already captured, the expression engine already resolves it,
and if/switch/gate are already available). The only gap
was that a non-zero exit hard-stopped the pipeline before any
downstream step could evaluate it.

Canonical usage

- id: heavy-thing
  type: command
  integration: claude
  command: speckit.heavy-thing
  continue_on_error: true

- id: check-result
  type: if
  condition: "{{ steps.heavy-thing.output.exit_code != 0 }}"
  then:
    - id: review
      type: gate
      message: "Step failed (exit {{ steps.heavy-thing.output.exit_code }}). Retry or skip?"
      on_reject: skip
  else:
    - id: next-thing
      command: speckit.next-thing

Engine

WorkflowEngine._execute_steps now consults the step config when
a step returns StepStatus.FAILED:

Gate aborts (output.aborted) always halt the run — operator
decisions take precedence over the flag.
Otherwise, if continue_on_error: true, log a
step_continue_on_error event and proceed to the next sibling.
Otherwise, behave as before: log step_failed, set
RunStatus.FAILED, and return.

Exactly one event per failure-resolution path is logged so the
log timeline is unambiguous: either the run continued past the
failure or it halted.

Validation

_validate_steps rejects non-bool values for continue_on_error.
Coerced strings like "true" are not accepted so authoring
mistakes surface at validation time rather than silently
changing run semantics.

Default behaviour preserved

When continue_on_error is omitted, every code path is
byte-equivalent to before this change. Existing workflows see no
difference.

Verdict coverage (from the issue discussion)

Scenario	How
Skip	`continue_on_error: true` + `if` branches around the failure
Abort	Omit the flag — today's default halts the run
Retry	`continue_on_error: true` + `gate` → operator approves → `resume` re-runs from gate

Fully unattended retry-on-transient (e.g. retry a 429 at 3 AM
without operator attendance) is intentionally out of scope here.
The skip and abort verdicts work without a human; the
retry verdict still pauses for one at the gate. A future
loop/retry-count primitive or an auto-approving gate type could
close that gap on top of this mechanism without further engine
changes — happy to follow up on that in a separate issue if
useful.

Testing

Tested locally with uv run specify --help
Ran existing tests with uv sync && uv run pytest
→ 2967 passed, 35 skipped (was 2960 before; +7 new
tests added in this PR).
Tested with a sample project: ran a 3-step workflow where
the middle step exits non-zero. Without
continue_on_error, run halts at the failing step (as
before). With continue_on_error: true, the failing step
records exit_code and the third step executes. A
downstream if branching on
{{ steps.flaky.output.exit_code != 0 }} routes into a
recovery gate cleanly.

New test coverage

TestContinueOnError in tests/test_workflows.py:

Test	What it locks
`test_undeclared_failure_halts_run`	Default behaviour byte-equivalent — no flag → run halts on first non-zero exit.
`test_declared_and_fired_continues_run`	Flag set + step fails → run continues, exit_code recorded.
`test_declared_but_step_succeeded_is_noop`	Flag set + step succeeds → no behaviour change.
`test_if_branch_routes_around_failure`	End-to-end recovery pattern from the issue discussion.
`test_gate_abort_still_halts_with_continue_on_error`	Operator-driven gate abort always halts, even with the flag set.
`test_validation_rejects_non_bool_continue_on_error`	`"true"` (string) fails validation.
`test_validation_accepts_bool_continue_on_error`	`true` and `false` pass validation cleanly.

AI Disclosure

I did not use AI assistance for this contribution
I did use AI assistance (described below)

Used Claude Opus to draft the engine change, the test suite, the
docs section, and this PR body. The shape (continue_on_error

exit-code-as-API + branch on it via existing primitives) was
proposed by @mnriem on the issue thread; this PR implements that
proposal. Code, tests, and design decisions were human-reviewed
before submission.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 3

+By default, a non-zero exit code from any step halts the entire run.
+Set `continue_on_error: true` on a step to record its result and
+continue to the next sibling step instead. The exit code remains
+available on `steps.<id>.output.exit_code` so downstream `if`,
+`switch`, or `gate` steps can branch on it:


+            assert not any(
+                "continue_on_error" in e for e in errors
+            ), errors


+        monkeypatch.setattr(gate_module.sys.stdin, "isatty", lambda: True)
+        monkeypatch.setattr(


doquanghuy · 2026-05-27T12:47:08Z

@mnriem — addressed all three Copilot findings in e9a4871, and rebased the branch onto current main (clean rebase, no conflicts).

1. README wording. Rewrote the "Error Handling" intro in terms of StepStatus.FAILED halting by default, with non-zero shell / command exit as one common cause. Avoids implying only exit codes can halt a run (gate aborts and validation failures also do, just via different mechanisms).

2. Validation test contract tightened. test_validation_accepts_bool_continue_on_error now asserts errors == [] instead of "no error mentions continue_on_error". Unrelated validation regressions on the same minimal YAML can no longer slip past this test.

3. Gate stdin patching made runner-robust. In test_gate_abort_still_halts_with_continue_on_error, swapped sys.stdin itself for a stub _TTYStdin object instead of patching sys.stdin.isatty. Method-on-instance assignment is unreliable on real io.TextIOWrapper objects (e.g. under pytest with capture disabled), so replacing the whole stdin object is more robust across runners.

Full suite still passes (continue_on_error: 7/7), no regressions. Branch is MERGEABLE and ready for another look whenever you have a moment.

AI disclosure: drafted with Claude Opus, human-reviewed.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 2

+run — most commonly a `shell` or `command` step exiting non-zero, but
+also things like a step type that fails validation. Set


+# is declared, the engine records the result (exit_code, stderr, status)
+# and continues to the next sibling step instead of halting the run.


- README "Error Handling": drop the validation-failure example. Structural validation runs before the workflow run starts, so it never produces a `StepStatus.FAILED` mid-run and was misleading as an example for `continue_on_error`. Reword to "any runtime error raised during step execution" with a parenthetical clarifying the validation-time path. - Tests block comment: fix the misleading "(exit_code, stderr, status)" wording. `status` is stored as `steps.<id>.status`, not nested under `steps.<id>.output`. Clarify that `output` carries the exit-code / stderr from the failure and that `status` is a sibling key so future readers writing expressions against run state aren't led astray. Addresses Copilot review feedback on github#2663.

doquanghuy · 2026-05-27T16:27:22Z

@mnriem — addressed both new Copilot findings in c827f4f.

1. README runtime-vs-validation framing. Dropped the "step type that fails validation" example from the "Error Handling" intro — structural validation runs at specify workflow run startup and exits before any run is created, so it never produces a StepStatus.FAILED mid-run and was misleading as an example for continue_on_error. Reworded to "any other runtime error raised during step execution" with a parenthetical clarifying that invalid workflows are rejected up-front.

2. Test-block comment about status location. Fixed the misleading "(exit_code, stderr, status)" wording in the section header above TestContinueOnError. status is stored as steps.<id>.status (sibling of output), not nested under steps.<id>.output. Clarified that output carries the exit_code / stderr from the failure and that status is a separate sibling key, so future readers writing expressions against the run state aren't led astray.

Continue_on_error suite still 7/7, no regressions.

AI disclosure: drafted with Claude Opus, human-reviewed.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 1

+                # steps can branch on it. Log a single, unambiguous
+                # event per failure resolution — either the run
+                # continued past it, or it halted.
+                if step_config.get("continue_on_error"):


Engine was checking `continue_on_error` via truthiness (`if step_config.get("continue_on_error"):`), which let truthy non-bool values like the string `"true"` activate the flag at runtime if validation was skipped. `execute()` does not auto-validate (documented on the `load_definition` docstring), so a caller bypassing `validate_workflow()` could see the contract silently violated. Tighten the runtime check to identity comparison (`is True`) so only a literal boolean enables the behaviour. Validation continues to catch the authoring mistake at parse time; this is defense in depth for the bypass-validation path. Add `test_engine_ignores_truthy_non_bool_continue_on_error` locking the contract — feeds `continue_on_error: "true"` (string) through `execute()` directly and asserts the run halts with `RunStatus.FAILED`. Addresses Copilot review feedback on github#2663.

doquanghuy · 2026-05-27T17:47:56Z

@mnriem — addressed the new Copilot finding in 838e7e4.

The gap. Copilot caught a real defense-in-depth hole: _validate_steps rejects non-bool continue_on_error values at validation time, but the engine itself was checking the flag via truthiness (if step_config.get("continue_on_error"):) at runtime. Since WorkflowEngine.execute() does NOT auto-validate the definition (the load_definition docstring explicitly says "not yet validated; call validate_workflow() or engine.validate() separately"), a caller that bypasses validation and feeds the engine a definition with continue_on_error: "true" (string) could silently activate the flag — the YAML string "true" is truthy in Python — which contradicts the documented "literal boolean only" contract.

The fix. Tightened the runtime check at engine.py:661 to identity comparison:

- if step_config.get("continue_on_error"):
+ if step_config.get("continue_on_error") is True:

Only a literal True now enables the behaviour. Validation continues to catch the authoring mistake at parse time; this is the belt-and-braces guard for the bypass-validation path. Inline comment expanded to explain why identity beats truthiness here, with a pointer to the load_definition docstring so future readers don't "simplify" it back to a truthy check.

New test. test_engine_ignores_truthy_non_bool_continue_on_error locks the contract. It bypasses validate_workflow() entirely, feeds continue_on_error: "true" (string) straight through engine.execute(), and asserts the run halts with RunStatus.FAILED and the sibling step does NOT run. Would have failed under the old truthy check.

Continue_on_error suite: 8/8 pass (was 7/7). Full suite: 3002 passed, 40 skipped, no regressions.

AI disclosure: drafted with Claude Opus, human-reviewed.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 2

+                # the behaviour, even if validation was skipped.
+                # Validation rejects non-bool values at parse time,
+                # but `execute()` does not auto-validate (see the
+                # `load_definition` docstring), so a caller passing


+        behaviour. `execute()` does not auto-validate (the
+        `load_definition` docstring documents this), so the engine


mnriem

Please address Copilot feedback and resolve conflicts

Adds an optional `continue_on_error: bool` field on every step. When set to `true` and the step fails, the engine records the result (`exit_code`, `stderr` on `steps.<id>.output` plus `status` as a sibling key on `steps.<id>`) and continues to the next sibling step instead of halting the run. Downstream `if`, `switch`, or `gate` steps can then branch on `{{ steps.<id>.output.exit_code }}` to route the recovery path. Engine details -------------- `WorkflowEngine._execute_steps` now consults the step config when a step returns `StepStatus.FAILED`: - Gate aborts (`output.aborted`) always halt the run — operator decisions take precedence over the flag. - Otherwise, if `continue_on_error` is the literal `True`, log a `step_continue_on_error` event and proceed to the next sibling. The runtime check uses identity comparison (`is True`) rather than truthiness, so truthy non-bool values like the string `"true"` cannot silently change run semantics even if a caller bypasses `validate_workflow()`. - Otherwise, behave as before: log `step_failed`, set `RunStatus.FAILED`, and return. Validation ---------- `_validate_steps` rejects non-bool values for `continue_on_error`. Coerced strings like `"true"` are not accepted so authoring mistakes surface at validation time rather than silently changing run semantics. Tests ----- `TestContinueOnError` in `tests/test_workflows.py` (8 tests): - `test_undeclared_failure_halts_run` — default halt behaviour. - `test_declared_and_fired_continues_run` — flag + fail → continue. - `test_declared_but_step_succeeded_is_noop` — flag + success → no-op. - `test_if_branch_routes_around_failure` — end-to-end recovery. - `test_gate_abort_still_halts_with_continue_on_error` — abort always halts. - `test_validation_rejects_non_bool_continue_on_error` — `"true"` rejected at validation. - `test_validation_accepts_bool_continue_on_error` — `true`/`false` pass cleanly. - `test_engine_ignores_truthy_non_bool_continue_on_error` — defense-in-depth: engine ignores string `"true"` even when validation is bypassed. Rebased onto current upstream/main (post github#2664 merge); the new `TestContinueOnError` class sits immediately after upstream's `TestContextRunId` so the two feature suites coexist cleanly. Closes github#2591. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

doquanghuy · 2026-05-27T18:11:35Z

@mnriem — addressed both Copilot findings and resolved the conflicts in d0b9e00.

1. Wrong method name reference. Both the engine inline comment (engine.py:667) and the test docstring referenced load_definition, but the actual engine API is WorkflowEngine.load_workflow() / WorkflowEngine.execute() — there is no load_definition. Fixed both to reference WorkflowEngine.load_workflow and quote the exact docstring line ("not yet validated; call validate_workflow() or engine.validate() separately") so the pointer is self-checking.

2. Conflict resolution after PR #2664 landed. Upstream main advanced past my branch — notably #2664 merged in c6afe4c, which inserted a new TestContextRunId class at the same anchor point (just before # ===== State Persistence Tests =====) where my TestContinueOnError was sitting. Both classes are independent feature suites and need to coexist.

I rebased the branch onto current main (c6afe4c) and squashed my four iteration commits into a single feat(workflows): add continue_on_error step field commit on top — cleaner history, and it sidesteps the messy three-way conflict that came from re-applying each iteration commit through the same conflict region. New class ordering in tests/test_workflows.py:

TestContextRunId        (upstream, from #2664)
TestContinueOnError     (this PR, 8 tests)
TestRunState            (upstream, unchanged)

Full suite: 3013 passed (was 3009 pre-rebase; +4 from upstream's TestContextRunId + same 8 from TestContinueOnError), 40 skipped, no regressions. Branch is MERGEABLE again.

AI disclosure: drafted with Claude Opus, human-reviewed.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 2

 Supported filters: `default`, `join`, `contains`, `map`.

-### Runtime Context
-
-`{{ context.* }}` exposes engine-managed runtime metadata for the
-current run:
-
-| Variable | Description |
-|----------|-------------|
-| `context.run_id` | The current workflow run id (the same value Spec Kit prints as `Run ID:` at the end of `workflow run`). Auto-generated runs are 8-character hex from `uuid4`; operator-supplied ids may be any alphanumeric string with hyphens or underscores. Empty string outside a run context. |
-
-```yaml
-# Stamp telemetry events with the run id for cross-system join.
- id: emit-event
-  type: shell
-  run: 'echo "{\"run_id\":\"{{ context.run_id }}\",\"event\":\"started\"}" >> events.jsonl'
-
-# Per-run scratch directory.
- id: prep-scratch
-  type: shell
-  run: 'mkdir -p /tmp/run-{{ context.run_id }}'
-
-# Pass run id into a command for artifact metadata.
- id: tag-artifact
-  command: speckit.specify
-  input:
-    args: "{{ context.run_id }}"
-```
-
 ## Input Types

 Workflow inputs are type-checked and coerced from CLI string values:


+  then:
+    - id: review
+      type: gate
+      message: "Step failed (exit {{ steps.heavy-thing.output.exit_code }}). Retry or skip?"


mnriem · 2026-05-28T13:54:57Z

Please address Copilot feedback

Two Copilot findings on d0b9e00: 1. The `### Runtime Context` documentation for `{{ context.* }}` was lost during the rebase onto current main (the squash dropped the anchor where github#2664 had added it). Restored under `## Expressions` so users can find `context.run_id` semantics and examples. 2. The continue_on_error example gate had message "Retry or skip?" but used the default `options: [approve, reject]` with `on_reject: skip`, which implied an automatic retry path that gates do not provide. Reworded the message to match the actual approve/reject semantics and added an explicit note that retry requires either custom gate options + downstream branching or a wrapper loop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

doquanghuy · 2026-05-28T14:34:19Z

@mnriem — addressed both Copilot findings in 22c5184 (docs-only, no engine/test changes).

1. Runtime Context section restored. The ### Runtime Context block for {{ context.* }} (added upstream in #2664) was lost during my rebase onto current main — the squash dropped the anchor where it sat. Restored under ## Expressions so context.run_id semantics and the three usage examples are discoverable again.

2. Gate prompt clarified. Reworded the example's message: from "Retry or skip?" to "Approve to continue, or reject to skip the rest of this branch." — the default options: [approve, reject] never offered a retry path, and gates do not auto re-run the failed step. Added an explicit note below the example clarifying that retry requires either custom gate options with downstream branching, or a wrapper loop.

CI was green on the prior commit (11/11 passing). This change is README-only — no Python tests touched.

AI disclosure: drafted with Claude Opus, human-reviewed.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 1

+By default, any step that ends in `StepStatus.FAILED` at runtime halts
+the entire run — most commonly a `shell` or `command` step exiting
+non-zero, but also any other runtime error raised during step
+execution. (Invalid workflow definitions are rejected up-front by
+`specify workflow run` before the run even starts, so structural
+validation failures never reach this code path.) Set
+`continue_on_error: true` on a step to record its result and continue
+to the next sibling step instead. When the failure was a non-zero
+exit, the exit code remains available on


Copilot finding on d0b9e00: The README's "Error Handling" intro implied `continue_on_error` covers "any other runtime error raised during step execution", but the engine only consults the flag when a step returns `StepResult(status=FAILED, ...)`. Exceptions raised out of `step_impl.execute()` propagate to `WorkflowEngine.execute()`, where the catch-all logs `workflow_failed` and re-raises — the step result is never recorded, and the flag is never consulted. Audited the whole PR diff for the same overclaim: 1. workflows/README.md — main fix. Reworded the Error Handling intro to "any step that returns StepResult(status=FAILED, ...)" and promoted the parenthetical structural-validation note into the Notes block. Added a new "Scope: returned failures only" note that names the exception path explicitly and tells step authors how to bring the flag into scope for exceptional code (catch internally and return FAILED with the failure encoded in `output`). 2. tests/test_workflows.py — section comment used "when an executable step fails", same ambiguity. Tightened to "when a step returns StepResult(status=FAILED, ...)" and added a sentence calling out that unhandled exceptions are out of scope. 3. src/specify_cli/workflows/engine.py — already correct ("any step that returns FAILED" in the validator comment; "lets the pipeline route around the failure" in the execute path). No change. Engine semantics and test bodies are unchanged. Docs-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

doquanghuy · 2026-05-28T15:51:05Z

@mnriem — addressed the new Copilot finding in b8982a7 (docs-only, no engine/test code changes).

The finding. Copilot caught a real overclaim in the README "Error Handling" intro: the text implied continue_on_error covers "any other runtime error raised during step execution", but the engine only consults the flag when a step returns StepResult(status=FAILED, ...). Exceptions raised out of step_impl.execute() propagate to WorkflowEngine.execute(), where the catch-all logs workflow_failed and re-raises — the step result is never recorded, and the flag is never consulted.

Audit (per "fix all similar issues"). Grepped the entire PR diff for the same pattern:

workflows/README.md — main fix.
- Reworded the Error Handling intro to "any step that returns StepResult(status=FAILED, ...)" (removed the "any other runtime error raised during step execution" clause).
- Added a new "Scope: returned failures only" bullet in the Notes block that names the exception path explicitly: unhandled exceptions are caught by WorkflowEngine.execute(), logged as workflow_failed, and abort the run regardless of continue_on_error. Tells step authors how to bring the flag into scope for exceptional paths — catch the exception internally and return StepResult(status=FAILED, ...) with the failure encoded in output.
- Promoted the parenthetical structural-validation note ("Invalid workflow definitions are rejected up-front...") out of the main paragraph into the Notes block for parallelism with the other "what the flag does NOT cover" notes.
tests/test_workflows.py — the ===== continue_on_error Tests ===== section header read "when an executable step fails and continue_on_error: true is declared", same ambiguity (could be misread to include exception paths). Tightened to "when a step returns StepResult(status=FAILED, ...)" and added a sentence calling out that unhandled exceptions are out of scope.
src/specify_cli/workflows/engine.py — audited the inline comments added in d0b9e00: the validator block already says "any step that returns FAILED" (line 235) and the execute path says "lets the pipeline route around the failure" (line 654). Both correctly use returned, not raised. No code-side change needed.

Engine semantics, validation, and test bodies are unchanged. CI: 161/161 tests/test_workflows.py pass locally.

Branch is clean against upstream/main. Ready when you are.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 1

+The gate's default `options: [approve, reject]` map directly to "continue
+the run" vs. "skip the rest of this branch" — gates do not automatically
+re-run the failed step. To express a retry path, either define custom
+gate options and branch on the choice downstream, or wrap the failing
+step in your own loop.


…MPLETED, not auto-skip Copilot finding on b8982a7: The README example's gate message said "reject to skip the rest of this branch", and the explanatory paragraph claimed [approve, reject] map to "continue" vs "skip the rest of this branch". The engine does not implement automatic branch-skipping. `on_reject: skip` returns `StepStatus.COMPLETED` (gate/__init__.py:65-66); the next sibling step runs unconditionally unless the author wires a downstream `if` reading `{{ steps.<gate-id>.output.choice }}`. Two fixes: 1. Restructured the YAML example so it actually demonstrates the manual-branching pattern: added a `recover` if-step after the gate that conditions on `steps.review.output.choice == 'approve'`. Now the example shows the real workflow author's responsibility instead of implying the engine does it. 2. Replaced the trailing paragraph with three precise notes: - both gate options return COMPLETED; `on_reject: skip` controls abort behaviour only, not sibling-skipping - all three `on_reject` values enumerated with their actual engine semantics (FAILED+aborted / COMPLETED / PAUSED) - the original retry-loop guidance retained as the third bullet Updated the gate message in the example to match — "reject to leave the failure recorded and move on" instead of "reject to skip the rest of this branch". Audited the whole PR diff for the same overclaim: no other instance. Engine semantics, validation, and test bodies are unchanged. Docs-only. 161/161 tests/test_workflows.py pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

doquanghuy · 2026-05-29T14:47:26Z

@mnriem — addressed the new Copilot finding in $(git -C /tmp/spec-kit-pr-2663 rev-parse --short HEAD) (docs-only, no engine/test code changes).

The finding. Copilot caught a real overclaim in the example: the gate message said "reject to skip the rest of this branch" and the explanatory paragraph claimed [approve, reject] map to "continue" vs "skip the rest of this branch". The engine does not implement automatic branch-skipping. Per gate/__init__.py:65-66, on_reject: skip returns StepStatus.COMPLETED; the next sibling step in the then: list runs unconditionally unless the workflow author wires a downstream if reading {{ steps.<gate-id>.output.choice }}.

Two fixes:

Restructured the YAML example to actually demonstrate the manual-branching pattern. Added a recover if-step after the gate that conditions on steps.review.output.choice == 'approve'. Now the example shows what the workflow author has to do, instead of implying the engine does it for them.
Replaced the trailing paragraph with three precise notes:
- both gate options return COMPLETED; on_reject: skip controls abort behaviour only, not sibling-skipping
- all three on_reject values enumerated with their actual engine semantics (abort → FAILED+output.aborted, skip → COMPLETED, retry → PAUSED)
- the original retry-loop guidance retained as the third bullet

Updated the example gate message to match — "reject to leave the failure recorded and move on" instead of the old overclaim.

Audit (per "fix all similar issues"). Grepped the entire PR diff for the same pattern. No other instance — tests don't carry similar phrasing, and engine.py comments accurately describe the gate semantics already.

Engine semantics, validation, and test bodies are unchanged. CI: 161/161 tests/test_workflows.py pass locally.

…ally branch Audit follow-up to 393ac6b — three sites repeated the same minor overclaim about gates being one of the "branch on it" step types alongside `if` and `switch`: 1. workflows/README.md (the "downstream `if`, `switch`, or `gate` steps can branch on it" sentence introducing the example) 2. engine.py:236 (validator inline comment) 3. engine.py:657 (execute-path inline comment) A `gate` step does not have a `condition` or `expression` field — it only evaluates expressions for `message` and `show_file` (gate/__init__.py:29,36). Programmatic branching happens in `if`/`switch`; a gate surfaces the value to a human operator via message interpolation, and the operator's choice is recorded in `output.choice` for a *subsequent* `if`/`switch` to route on. Reworded all three sites consistently: "a downstream `if` or `switch` can branch on it (or a `gate` can surface it to the operator via message interpolation)". The README example already demonstrates this distinction — the gate carries `{{ }}` template variables in its message and the `recover` if-step downstream is what actually branches on the choice. Engine semantics, validation, and test bodies are unchanged. Docs-only on the README; comment-only on engine.py. 161/161 tests/test_workflows.py pass locally. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

doquanghuy · 2026-05-29T15:18:08Z

@mnriem — proactive audit follow-up in 03454dd (docs + inline-comment only).

After fixing the Copilot finding in 393ac6b I audited the rest of the PR diff for the same class of issue (docs claiming engine behaviour that the code doesn't actually implement). Found three sites with the same minor overclaim:

workflows/README.md — intro sentence to the example said "downstream if, switch, or gate steps can branch on it".
src/specify_cli/workflows/engine.py:236 — validator inline comment said the same.
src/specify_cli/workflows/engine.py:657 — execute-path inline comment said the same.

A gate step doesn't have a condition or expression field. Looking at gate/__init__.py:29,36, the only expressions a gate evaluates are template interpolations in message and show_file for human display. Programmatic branching is if/switch only; a gate's role is to surface a value to the operator, and the operator's output.choice is what a subsequent if/switch then routes on.

Reworded all three sites consistently to: "a downstream if or switch can branch on it (or a gate can surface it to the operator via message interpolation)". The README example already demonstrates this distinction — the gate carries {{ }} template variables in its message, and the recover if-step downstream is what actually branches on the choice.

Engine semantics, validation, and test bodies are unchanged. CI: 161/161 tests/test_workflows.py pass locally.

Branch is clean against upstream/main.

doquanghuy requested a review from mnriem as a code owner May 21, 2026 14:34

doquanghuy force-pushed the feat/continue-on-error branch 2 times, most recently from f34ab4c to da8ed4d Compare May 21, 2026 14:48

mnriem requested a review from Copilot May 26, 2026 12:12

Copilot started reviewing on behalf of mnriem May 26, 2026 12:20 View session

Copilot AI reviewed May 26, 2026

View reviewed changes

mnriem requested a review from Copilot May 27, 2026 12:05

Copilot started reviewing on behalf of mnriem May 27, 2026 12:06 View session

Copilot AI reviewed May 27, 2026

View reviewed changes

doquanghuy force-pushed the feat/continue-on-error branch from e9a4871 to a0e78ee Compare May 27, 2026 12:56

mnriem requested a review from Copilot May 27, 2026 16:15

Copilot started reviewing on behalf of mnriem May 27, 2026 16:15 View session

Copilot AI reviewed May 27, 2026

View reviewed changes

doquanghuy mentioned this pull request May 27, 2026

feat(integrations): support SPECIFY_<KEY>_EXTRA_ARGS env var for agent subprocess flags #2596

Merged

5 tasks

mnriem requested a review from Copilot May 27, 2026 17:39

Copilot started reviewing on behalf of mnriem May 27, 2026 17:40 View session

Copilot AI reviewed May 27, 2026

View reviewed changes

Comment thread src/specify_cli/workflows/engine.py Outdated

# steps can branch on it. Log a single, unambiguous

# event per failure resolution — either the run

# continued past it, or it halted.

if step_config.get("continue_on_error"):

mnriem requested a review from Copilot May 27, 2026 17:57

Copilot started reviewing on behalf of mnriem May 27, 2026 17:57 View session

Copilot AI reviewed May 27, 2026

View reviewed changes

mnriem requested changes May 27, 2026

View reviewed changes

doquanghuy force-pushed the feat/continue-on-error branch from 838e7e4 to d0b9e00 Compare May 27, 2026 18:11

mnriem requested a review from Copilot May 28, 2026 12:38

Copilot started reviewing on behalf of mnriem May 28, 2026 12:39 View session

Copilot AI reviewed May 28, 2026

View reviewed changes

mnriem requested a review from Copilot May 28, 2026 15:24

Copilot started reviewing on behalf of mnriem May 28, 2026 15:24 View session

Copilot AI reviewed May 28, 2026

View reviewed changes

mnriem requested a review from Copilot May 29, 2026 13:53

Copilot started reviewing on behalf of mnriem May 29, 2026 13:53 View session

Copilot AI reviewed May 29, 2026

View reviewed changes

		monkeypatch.setattr(gate_module.sys.stdin, "isatty", lambda: True)
		monkeypatch.setattr(

		run — most commonly a `shell` or `command` step exiting non-zero, but
		also things like a step type that fails validation. Set

		# is declared, the engine records the result (exit_code, stderr, status)
		# and continues to the next sibling step instead of halting the run.

		behaviour. `execute()` does not auto-validate (the
		`load_definition` docstring documents this), so the engine

Conversation

doquanghuy commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Canonical usage

Engine

Validation

Default behaviour preserved

Verdict coverage (from the issue discussion)

Testing

New test coverage

AI Disclosure

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

doquanghuy commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

doquanghuy commented May 27, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

doquanghuy commented May 27, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

mnriem left a comment

Choose a reason for hiding this comment

Uh oh!

doquanghuy commented May 27, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

mnriem commented May 28, 2026

Uh oh!

doquanghuy commented May 28, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

doquanghuy commented May 28, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Copilot's findings

Uh oh!

doquanghuy commented May 29, 2026

Uh oh!

doquanghuy commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

doquanghuy commented May 21, 2026 •

edited

Loading

doquanghuy commented May 27, 2026 •

edited

Loading