Skip to content

fix(controller): honor scaleUpPreviewCheckPoint reset on blueGreen pod template change. Fixes #4728#4729

Open
abizer wants to merge 1 commit into
argoproj:masterfrom
abizer:fix-bluegreen-preview-status-check
Open

fix(controller): honor scaleUpPreviewCheckPoint reset on blueGreen pod template change. Fixes #4728#4729
abizer wants to merge 1 commit into
argoproj:masterfrom
abizer:fix-bluegreen-preview-status-check

Conversation

@abizer
Copy link
Copy Markdown

@abizer abizer commented May 16, 2026

What

Read prevValue from newStatus instead of c.rollout.Status in calculateScaleUpPreviewCheckPoint. When syncRolloutStatusBlueGreen calls resetRolloutStatus on pod template change, the reset mutates newStatus.BlueGreen.ScaleUpPreviewCheckPoint = false; reading from c.rollout.Status immediately afterwards picks up the pre-reset value, clobbering the reset.

- prevValue := c.rollout.Status.BlueGreen.ScaleUpPreviewCheckPoint
+ prevValue := newStatus.BlueGreen.ScaleUpPreviewCheckPoint

Why

Matches the function's documented behavior:

scaleUpPreviewCheckPoint... gets reset to false when the pod template changes, or the rollout fully promotes

Without this fix, a new revision started while the previous one was in the scale-up / autoPromotionSeconds window inherits scaleUpPreviewCheckPoint: true and skips prePromotionAnalysis entirely — the new preview RS scales straight to replicas (bypassing previewReplicaCount), no AnalysisRun is created, and cutover proceeds untested.

Full repro + root cause + live-cluster evidence in #4728.

Tests

  • Added unit regression test TestCalculateScaleUpPreviewCheckPointHonorsReset in rollout/bluegreen_test.go. Verified it fails on the unfixed code (Should be false) and passes with the fix.
  • Existing TestPreviewReplicaCountHandleScaleUpPreviewCheckPoint sub-tests (TrueAfterMeetingMinAvailable, FalseAfterActiveServiceSwitch, TrueWhenScalingUpPreview) continue to pass.
  • Full ./rollout/... package suite passes locally (make test equivalent).

Checklist

  • This is a bug fix.
  • PR title is conventional (fix(controller): ...), states what changed, and suffixes the related issue number (Fixes #4728).
  • Commits are DCO signed.
  • Unit tests written and passing.
  • All tests run locally and pass.
  • I have used LLM/AI/Agent tools for this PR but I am responsible for all code of this PR.
  • I understand what the code does and WHY/HOW it works in several scenarios.
  • This changes behavior for existing users of blueGreen + previewReplicaCount + prePromotionAnalysis: previously a back-to-back revision would skip the gate; now it correctly runs prePromotionAnalysis for the new revision. No API or config change.
  • My organization is added to USERS.md. (Out of scope for this bug fix — separate PR if needed.)

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 16, 2026

Published E2E Test Results

  4 files    4 suites   3h 54m 8s ⏱️
120 tests 105 ✅  7 💤  8 ❌
494 runs  453 ✅ 28 💤 13 ❌

For more details on these failures, see this check.

Results for commit 28c1d69.

♻️ This comment has been updated with latest results.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.04%. Comparing base (2ccdae8) to head (b7f948b).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4729   +/-   ##
=======================================
  Coverage   85.03%   85.04%           
=======================================
  Files         164      164           
  Lines       18989    18989           
=======================================
+ Hits        16148    16149    +1     
+ Misses       1993     1992    -1     
  Partials      848      848           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 16, 2026

Published Unit Test Results

2 470 tests   2 470 ✅  3m 21s ⏱️
  129 suites      0 💤
    1 files        0 ❌

Results for commit 28c1d69.

♻️ This comment has been updated with latest results.

…d template change

When a new pod template is rolled out while the previous revision had
already completed prePromotionAnalysis (and thus set
status.blueGreen.scaleUpPreviewCheckPoint=true), the new revision
inherits that value and Argo Rollouts skips creating an AnalysisRun for
it entirely -- including the prePromotionAnalysis gate.

syncRolloutStatusBlueGreen calls resetRolloutStatus(&newStatus) on pod
template change, which clears newStatus.BlueGreen.ScaleUpPreviewCheckPoint.
The subsequent call to calculateScaleUpPreviewCheckPoint read prevValue
from c.rollout.Status (the pre-reset value) rather than the newStatus
parameter that was just reset, immediately clobbering the reset.

Read prevValue from newStatus so the reset applied earlier in the same
reconcile is honored, matching the documented behavior:

  // ... gets reset to false when the pod template changes, or the
  // rollout fully promotes (stableRS == newRS)

Add a regression test that fails on the old behavior and passes with
the fix.

Signed-off-by: Abizer Lokhandwala <abizer@abizer.me>
@abizer abizer force-pushed the fix-bluegreen-preview-status-check branch from b7f948b to 28c1d69 Compare May 23, 2026 14:12
@abizer abizer requested a review from a team as a code owner May 23, 2026 14:12
@sonarqubecloud
Copy link
Copy Markdown

@abizer
Copy link
Copy Markdown
Author

abizer commented May 28, 2026

cc @argoproj/argo-rollouts-approvers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant