
Close guest process stdin to avoid TTY hang on macOS#17562

Open
spboyer wants to merge 5 commits into
mainfrom
spboyer/fix-guest-launcher-stdin-tty-hang

Conversation

@spboyer
Member

@spboyer spboyer commented May 27, 2026

Description

aspire new for the TypeScript starter (and aspire init/add/restore) hangs silently on macOS, producing no further output at ~0% CPU. Issue #16791 documents the symptom and a working < /dev/null workaround, which is strong evidence that some spawned child inherits the parent CLI's TTY and blocks on a stdin read.

Root cause

Several subprocess launch paths in Aspire.Cli set RedirectStandardOutput/Error = true and UseShellExecute = false, but leave stdin inheriting the parent TTY. On macOS/Linux, any read from stdin in those children then blocks until the terminal supplies input, which in a non-interactive flow means forever:

| Launch path | Used for |
| --- | --- |
| ProcessGuestLauncher | npm/pnpm/yarn/bun install (and the guest AppHost itself) |
| NpmRunner | npm view / pack / audit signatures / install -g |
| DotNetBasedAppHostServerProject.Run | Dev/source-based AppHost server process |
| PrebuiltAppHostServer.CreateStartInfo | Shipped AppHost server process |

The TypeScript starter flow exercises all of these via BuildAndGenerateSdkAsync: prepare → start AppHost server → RPC codegen → npm install. By contrast, dotnet new install already goes through ProcessExecutionFactory, which sets RedirectStandardInput = true — that's why C#-only template scaffolding doesn't hit this.

Fix

In each of the four launch paths above:

  1. Set RedirectStandardInput = true on the ProcessStartInfo.
  2. Immediately after process.Start(), call process.StandardInput.Close() (wrapped in try/catch (IOException)) so any child read surfaces as EOF instead of blocking on the inherited TTY.

The CLI controls these processes via Ctrl+C / backchannel cancellation and Unix-socket IPC (REMOTE_APP_HOST_SOCKET_PATH), never via stdin, so closing the pipe is safe.
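
The redirect-then-close pattern is not specific to .NET. As a minimal sketch in Python (an analogue for illustration only, not the Aspire.Cli code; the helper name run_with_closed_stdin is hypothetical), the two steps look roughly like this:

```python
import subprocess
import sys


def run_with_closed_stdin(argv, timeout=10.0):
    """Run argv with stdin on a pipe that is closed right after start."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.PIPE,    # analogue of RedirectStandardInput = true
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        proc.stdin.close()        # analogue of process.StandardInput.Close()
    except OSError:
        pass                      # the child may already have exited
    # Drop the closed handle so communicate() does not touch it again.
    proc.stdin = None
    out, err = proc.communicate(timeout=timeout)
    return proc.returncode, out, err


if __name__ == "__main__":
    # The child reads all of stdin; with the pipe closed it gets EOF at once.
    child = "import sys; print('EOF' if sys.stdin.read() == '' else 'DATA')"
    print(run_with_closed_stdin([sys.executable, "-c", child]))
```

Because the write end of the pipe is closed before the child reads, the read returns an empty result rather than waiting on a terminal.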

Tests

Added ProcessGuestLauncher_ClosesChildStdinSoReadsObserveEof in GuestRuntimeTests. It launches a short shell snippet that reads stdin and asserts that the launcher returns within 10s with the child reporting EOF. Without the fix the test would block on the inherited TTY (or be killed when the 10s CancellationTokenSource fires).
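
The shape of that regression test can be sketched in Python as follows. This is an illustration under assumptions, not the actual GuestRuntimeTests code: the shell snippet, the function name child_observes_eof, and the use of subprocess.run in place of the launcher are all stand-ins.

```python
import subprocess
import time

# Hypothetical snippet: `read` fails on EOF, so the else branch reports it.
SNIPPET = 'if read -r line; then echo "got input"; else echo "eof"; fi'


def child_observes_eof(timeout_s=10.0):
    """Run the snippet with an empty, closed stdin and time how long it takes."""
    start = time.monotonic()
    result = subprocess.run(
        ["sh", "-c", SNIPPET],
        input="",              # opens a stdin pipe, writes nothing, closes it
        capture_output=True,
        text=True,
        timeout=timeout_s,     # raises TimeoutExpired if the child hangs
    )
    elapsed = time.monotonic() - start
    return result.stdout.strip(), elapsed


if __name__ == "__main__":
    output, elapsed = child_observes_eof()
    assert output == "eof", output
    assert elapsed < 10.0
```

The timeout plays the role of the 10s bound described above: a child that blocks on stdin fails the test instead of hanging the run.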

All GuestRuntimeTests, AppHostServerProjectTests, AppHostServerSessionTests, and NpmRunnerTests pass locally (49 succeeded, 4 Windows-only skipped).

Fixes #16791

Copilot AI review requested due to automatic review settings May 27, 2026 21:46
@github-actions
Contributor

github-actions Bot commented May 27, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17562

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17562"

Contributor

Copilot AI left a comment

Pull request overview

Fixes a macOS/Linux hang in Aspire CLI guest process execution (notably TypeScript scaffolding flows) by ensuring spawned child processes do not inherit the parent TTY for stdin, preventing lifecycle scripts or prompts from blocking indefinitely.

Changes:

  • Redirect and immediately close stdin for guest processes launched via ProcessGuestLauncher so child reads observe EOF instead of blocking on the terminal.
  • Apply the same stdin redirect+close behavior to NpmRunner subprocess invocations.
  • Add a regression test that launches a command which attempts to read stdin and asserts it exits promptly with EOF observed.
Show a summary per file

| File | Description |
| --- | --- |
| tests/Aspire.Cli.Tests/Projects/GuestRuntimeTests.cs | Adds regression coverage ensuring child stdin is closed so reads return EOF (prevents macOS TTY hangs). |
| src/Aspire.Cli/Projects/ProcessGuestLauncher.cs | Redirects stdin and closes it immediately after start to avoid inheriting the CLI’s TTY. |
| src/Aspire.Cli/Npm/NpmRunner.cs | Redirects stdin for npm processes and closes it right after start to prevent interactive hangs. |

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 0

@spboyer
Member Author

spboyer commented May 27, 2026

The Cli.EndToEnd-EmptyAppHostTemplateTests failure (C# empty AppHost) looks like a flaky E2E test rather than a regression from this PR:

  • The changes in this PR only affect TypeScript/npm guest process launching (ProcessGuestLauncher, NpmRunner) and the code-gen AppHost server (DotNetBasedAppHostServerProject, PrebuiltAppHostServer). They do not touch the C# AppHost runtime used by aspire start/aspire stop.
  • The TypeScript variant of the same test (TypeScriptEmptyAppHostTemplateTests) passed ✅, confirming the TypeScript path is healthy.
  • The failure is aspire stop timing out after 8:20 while waiting for the apphost to stop, a known infrastructure flakiness pattern for this test suite.

Will request a re-run once the current CI run completes.

@github-actions
Contributor

Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as a retry-safe transient failure in the CI run attempt.
GitHub was asked to rerun all failed jobs for that attempt, and the rerun is being tracked in the rerun attempt.
The job links below point to the failed attempt jobs that matched the retry-safe transient failure rules.

spboyer and others added 4 commits May 28, 2026 21:36
ProcessGuestLauncher and NpmRunner spawn child processes (npm/pnpm/yarn/bun
install, plus the guest AppHost itself) with stdout/stderr redirected but
left stdin inheriting the parent CLI's TTY. On macOS/Linux, if any child
(e.g. an npm postinstall script, husky, or a package-manager permission
prompt) reads from stdin, it blocks indefinitely waiting on the terminal,
making 'aspire new' for the TypeScript starter (and 'aspire init/add/
restore') appear to stall with no output and ~0% CPU.

Redirect stdin and close it immediately after Process.Start() so any child
read surfaces as EOF instead of blocking. We never write to the guest
process or npm stdin, so closing is safe. dotnet-based invocations already
redirect stdin via ProcessExecutionFactory.

Add a regression test in GuestRuntimeTests that launches a shell script
which reads stdin and asserts it observes EOF and exits within 10s.

Fixes #16791

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Extend the TTY-hang fix to the two AppHost server launch paths used by
BuildAndGenerateSdkAsync during 'aspire new'/'init'/'add'/'restore':

- DotNetBasedAppHostServerProject.Run (dev/source-based AppHost server)
- PrebuiltAppHostServer (shipped AppHost server)

Both previously redirected stdout/stderr but left stdin inheriting the
parent CLI's TTY. The CLI communicates with the server over a Unix socket
(REMOTE_APP_HOST_SOCKET_PATH), not stdin, so closing the redirected stdin
pipe immediately after Process.Start() is safe and ensures any stdin read
in the server (or a library it loads) surfaces as EOF instead of blocking.

Combined with the earlier ProcessGuestLauncher / NpmRunner changes, this
covers every child process spawned during the TypeScript starter
scaffolding flow that previously inherited the parent TTY.

Refs #16791

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The E2E helper AspireStopAsync was calling WaitForSuccessPromptAsync, which
waits up to 500s (default) for [N OK] $. When aspire stop returns non-zero
(for example the documented FailedToDotnetRunAppHost flake in #16643), the
prompt arrives as [N ERR:2] $ and the test then sits idle for ~8:20 before
failing with a useless 'didn't see OK' timeout. The recent failure on this
PR's CI was exactly this shape: aspire stop exited within seconds with
ERR:2, but the test wasted 8m20s waiting for an OK that would never come.

Switch the helper to WaitForSuccessPromptFailFastAsync so any ERR prompt
fails the test immediately with the captured error context. All 20 callers
are happy-path tests that expect aspire stop to succeed, so this is a pure
test-diagnostic improvement — no product behavior change.
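
The fail-fast idea can be sketched in Python as follows. Everything here is an assumption for illustration: the function and exception names are hypothetical, the read_screen callback stands in for whatever the E2E harness uses to read terminal output, and the prompt format is inferred from the [N OK] $ and [N ERR:2] $ examples above.

```python
import re
import time

# Assumed prompt shape: "[<sequence> OK] $" or "[<sequence> ERR:<code>] $".
PROMPT = re.compile(r"\[(\d+) (OK|ERR:(\d+))\] \$")


class PromptError(Exception):
    """Raised as soon as an ERR prompt for the awaited command appears."""


def wait_for_success_prompt(read_screen, expected_seq, timeout_s=500.0, poll_s=0.05):
    """Return when the OK prompt shows up; raise immediately on an ERR prompt."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        screen = read_screen()
        for match in PROMPT.finditer(screen):
            if int(match.group(1)) != expected_seq:
                continue  # prompt belongs to an earlier command
            if match.group(2) == "OK":
                return
            # Fail fast with context instead of waiting out the timeout.
            raise PromptError(
                f"command {expected_seq} failed with exit code "
                f"{match.group(3)}:\n{screen}"
            )
        time.sleep(poll_s)
    raise TimeoutError(f"no prompt for command {expected_seq} within {timeout_s}s")


if __name__ == "__main__":
    wait_for_success_prompt(lambda: "$ aspire stop\n[3 OK] $ ", 3, timeout_s=1.0)
```

With this shape, an ERR prompt ends the wait on the first poll that sees it, so the full timeout is only spent when no prompt arrives at all.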

Refs #16643

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

ProcessCaptureRunner bounded post-exit stdout/stderr capture at 250ms. On loaded Windows CI, short-lived cmd.exe wrappers can exit before the async pipe readers get enough CPU to observe EOF, causing callers to receive an empty capture even though the process wrote output. The PeerInstallProbe failure on this PR had that shape: the fake peer.cmd printed the expected --version output, but the probe reported no usable output.

Increase the bounded post-exit capture window to 2s. This remains far below the full process timeout, but gives enough scheduling slack for Windows pipe readers under CI load.
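
A bounded post-exit capture window can be sketched in Python like this. It is an illustration of the general technique only, not ProcessCaptureRunner itself; the function name, parameter names, and default process timeout are assumptions.

```python
import subprocess
import sys
import threading
import time


def run_and_capture(argv, timeout_s=30.0, drain_s=2.0):
    """Wait for exit, then give the pipe readers at most drain_s to finish."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    captured = {"stdout": [], "stderr": []}

    def pump(stream, sink):
        for line in stream:       # ends when the pipe reports EOF
            sink.append(line)

    readers = [
        threading.Thread(target=pump, args=(proc.stdout, captured["stdout"]), daemon=True),
        threading.Thread(target=pump, args=(proc.stderr, captured["stderr"]), daemon=True),
    ]
    for reader in readers:
        reader.start()

    # Full process timeout; raises TimeoutExpired (this sketch does not kill).
    exit_code = proc.wait(timeout=timeout_s)

    # Bounded post-exit window shared by both readers.
    deadline = time.monotonic() + drain_s
    for reader in readers:
        reader.join(timeout=max(0.0, deadline - time.monotonic()))

    return exit_code, "".join(captured["stdout"]), "".join(captured["stderr"])


if __name__ == "__main__":
    print(run_and_capture([sys.executable, "-c", "print('1.2.3')"]))
```

If the window is too short, the readers may not have appended anything yet when the function returns, which is the empty-capture symptom described above.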

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@spboyer spboyer force-pushed the spboyer/fix-guest-launcher-stdin-tty-hang branch from fc4bf93 to 9e54954 Compare May 29, 2026 01:37

WaitForSuccessPromptFailFastAsync was renamed to WaitForSuccessPromptAsync
in b2df78e. Update the call site in AspireStopAsync to match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as a retry-safe transient failure in the CI run attempt.
GitHub was asked to rerun all failed jobs for that attempt, and the rerun is being tracked in the rerun attempt.
The job links below point to the failed attempt jobs that matched the retry-safe transient failure rules.

  • Tests / Acquisition / Acquisition (macos-latest) - Failed step 'Build test project' will be retried because the job log shows a likely transient infrastructure network failure. Matched pattern: /Unable to load the service index for source https:\/\/(?:pkgs\.dev\.azure\.com\/dnceng|dnceng\.pkgs\.visualstudio\.com)\/public\/_packaging\//i.
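
For readers unfamiliar with the matched pattern, here is a small Python sketch of how such a rule could be applied to a log line. The regex is transcribed from the pattern quoted above; the function name and the sample log lines are made up for illustration and are not taken from the CI logs.

```python
import re

# Transcription of the quoted pattern; the trailing /i becomes re.IGNORECASE.
TRANSIENT_FEED_FAILURE = re.compile(
    r"Unable to load the service index for source "
    r"https://(?:pkgs\.dev\.azure\.com/dnceng|dnceng\.pkgs\.visualstudio\.com)"
    r"/public/_packaging/",
    re.IGNORECASE,
)


def is_retry_safe(log_line):
    """True when the line matches the transient feed-failure pattern."""
    return TRANSIENT_FEED_FAILURE.search(log_line) is not None


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    transient = (
        "error NU1301: Unable to load the service index for source "
        "https://pkgs.dev.azure.com/dnceng/public/_packaging/example-feed/nuget/v3/index.json."
    )
    unrelated = "error CS1002: ; expected"
    print(is_retry_safe(transient), is_retry_safe(unrelated))
```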

@github-actions
Contributor

CLI E2E Tests unknown — 107 passed, 0 failed, 2 unknown (commit ee36829)

View all recordings
Status Test Recording Job Artifacts
AddPackageInteractiveWhileAppHostRunningDetached Recording #78428599143 Logs
AddPackageWhileAppHostRunningDetached Recording #78428599143 Logs
AgentCommands_AllHelpOutputs_AreCorrect Recording #78428599387 Logs
AgentInitCommand_DefaultSelection_InstallsDefaultSkills Recording #78428599387 Logs
AgentInitCommand_MigratesDeprecatedConfig Recording #78428599387 Logs
AgentMcpListStructuredLogsReturnsLogsFromStarterApp Recording #78428599281 Logs
AgentMcpListStructuredLogsReturnsLogsFromStarterApp_DevLocalhost Recording #78428599281 Logs
AgentMcpListStructuredLogsReturnsLogsFromStarterApp_Isolated Recording #78428599281 Logs
AllPublishMethodsBuildDockerImages Recording #78428599296 Logs
AspireAddAndStartWorkAgainstLegacyAppHostTs Recording #78428599017 Logs
AspireAddPackageVersionToDirectoryPackagesProps Recording #78428599229 Logs
AspireInitSingleFileAppHostRunsViaDotnetRunAppHost Recording #78428598748 Logs
AspireInitWithExistingAppHostDirRecreatesMissingNuGetConfigAndPreservesFiles Recording #78428599279 Logs
AspireInitWithSolutionFileGeneratesAppHostThatBuildsAgainstChannelHive Recording #78428599279 Logs
AspireStartUpdatesStaleTypeScriptAppHostPath Recording #78428598909 Logs
AspireUpdateRemovesAppHostPackageVersionFromDirectoryPackagesProps Recording #78428599229 Logs
AspireUpdateRemovesOrphanAppHostPackageVersionWhenSdkAlreadyCurrent Recording #78428599229 Logs
Banner_DisplayedOnFirstRun Recording #78428598933 Logs
Banner_DisplayedWithExplicitFlag Recording #78428598933 Logs
Banner_NotDisplayedWithNoLogoFlag Recording #78428598933 Logs
CertificatesClean_RemovesCertificates Recording #78428599047 Logs
CertificatesTrust_WithNoCert_CreatesAndTrustsCertificate Recording #78428599047 Logs
CertificatesTrust_WithUntrustedCert_TrustsCertificate Recording #78428599047 Logs
ConfigSetGet_CreatesNestedJsonFormat Recording #78428599439 Logs
CreateAndRunAspireStarterProject Recording #78428599045 Logs
CreateAndRunAspireStarterProjectWithBundle Recording #78428599083 Logs
CreateAndRunEmptyAppHostProject Recording #78428599107 Logs
CreateAndRunJavaEmptyAppHostProject Recording #78428599289 Logs
CreateAndRunJsReactProject Recording #78428599180 Logs
CreateAndRunPythonReactProject Recording #78428599315 Logs
CreateAndRunTypeScriptEmptyAppHostProject Recording #78428599305 Logs
CreateAndRunTypeScriptStarterProject Recording #78428599156 Logs
CreateJavaAppHostWithViteApp Recording #78428598972 Logs
CreateTypeScriptAppHostWithViteApp_AllowsGuestAppPackageManagerToDiffer Recording #78428598946 Logs
CreateTypeScriptAppHostWithViteApp_UsesConfiguredToolchain Recording #78428598946 Logs
DashboardRunWithAgentMcpListTracesReturnsNoTraces Recording #78428599327 Logs
DashboardRunWithAgentMcpListTracesReturnsNoTraces_DevLocalhost Recording #78428599327 Logs
DashboardRunWithOtelTracesReturnsNoTraces Recording #78428599327 Logs
DashboardRunWithOtelTracesReturnsNoTraces_DevLocalhost Recording #78428599327 Logs
DeployK8sBasicApiService Recording #78428599054 Logs
DeployK8sWithExternalHelmChart Recording #78428598913 Logs
DeployK8sWithGarnet Recording #78428599046 Logs
DeployK8sWithMongoDB Recording #78428599201 Logs
DeployK8sWithMySql Recording #78428598928 Logs
DeployK8sWithPostgres Recording #78428599287 Logs
DeployK8sWithRabbitMQ Recording #78428599316 Logs
DeployK8sWithRedis Recording #78428599277 Logs
DeployK8sWithSqlServer Recording #78428599297 Logs
DeployK8sWithValkey Recording #78428599309 Logs
DeployTypeScriptAppToKubernetes Recording #78428599104 Logs
DescribeCommandResolvesReplicaNames Recording #78428599301 Logs
DescribeCommandShowsRunningResources Recording #78428599301 Logs
DetachFormatJsonProducesValidJson Recording #78428599070 Logs
DetachFormatJsonProducesValidJsonWhenRestartingExistingInstance Recording #78428599070 Logs
DoPublishAndDeployListStepsWork Recording #78428598796 Logs
DocsCommand_RendersInteractiveMarkdownFromLocalSource Recording #78428598969 Logs
DoctorCommand_DetectsDeprecatedAgentConfig Recording #78428599387 Logs
DoctorCommand_TypeScriptAppHostReportsMissingConfiguredToolchain Recording #78428599072 Logs
DoctorCommand_WithSslCertDir_ShowsTrusted Recording #78428599072 Logs
DoctorCommand_WithoutSslCertDir_ShowsPartiallyTrusted Recording #78428599072 Logs
GatewayWithoutExternalEndpoint_FailsPublishWithGuidance Recording #78428599019 Logs
GeneratedAspireDevScript_StartsWatchMode_WithConfiguredToolchain Recording #78428598946 Logs
GlobalMigration_HandlesCommentsAndTrailingCommas Recording #78428599439 Logs
GlobalMigration_HandlesMalformedLegacyJson Recording #78428599439 Logs
GlobalMigration_PreservesAllValueTypes Recording #78428599439 Logs
GlobalMigration_SkipsWhenNewConfigExists Recording #78428599439 Logs
GlobalSettings_MigratedFromLegacyFormat Recording #78428599439 Logs
IngressWithoutExternalEndpoint_FailsPublishWithGuidance Recording #78428599019 Logs
InitTypeScriptAppHost_AugmentsExistingViteRepoInWorkspaceSubdirectory Recording #78428598946 Logs
InteractiveCSharpInitCreatesExpectedFiles Recording #78428598993 Logs
InvalidAppHostPathWithComments_IsHealedOnRun Recording #78428598984 Logs
JavaScriptHostingApisRunFromTypeScriptAppHost Recording #78428599296 Logs
LatestCliCanStartStableChannelAppHost Recording #78428599045 Logs
LatestCliCanStartStableChannelTypeScriptAppHost Recording #78428599045 Logs
LegacySettingsMigration_AdjustsRelativeAppHostPath Recording #78428598909 Logs
LogsCommandShowsResourceLogs Recording #78428598866 Logs
OtelLogsReturnsStructuredLogsFromStarterApp Recording #78428598723 Logs
OtelLogsReturnsStructuredLogsFromStarterAppIsolated Recording #78428598723 Logs
PsCommandListsRunningAppHost Recording #78428599306 Logs
PsFormatJsonOutputsOnlyJsonToStdout Recording #78428599306 Logs
PublishJavaScriptPatternsGeneratesExpectedDockerComposeArtifacts Recording #78428599383 Logs
PublishWithConfigureEnvFileUpdatesEnvOutput Recording #78428599383 Logs
PublishWithDockerComposeServiceCallbackSucceeds Recording #78428599383 Logs
PublishWithoutOutputPathUsesAppHostDirectoryDefault Recording #78428599383 Logs
ResourceCommand_FailedExecution_DisplaysAppHostLogPathAndLogContainsEntries Recording #78428599200 Logs
ResourceCommand_SetAndDeleteParameterUpdatesDescribeOutput Recording #78428599200 Logs
RestoreGeneratesSdkFiles Recording #78428599167 Logs
RestoreGeneratesSdkFiles_WithConfiguredToolchain Recording #78428599284 Logs
RestoreRefreshesGeneratedSdkAfterAddingIntegration Recording #78428599284 Logs
RestoreSupportsConfigOnlyHelperPackageAndCrossPackageTypes Recording #78428599291 Logs
RunFromParentDirectory_UsesExistingConfigNearAppHost Recording #78428598982 Logs
RunReportsSyntaxErrorsForDotNetAppHost Recording #78428599178 Logs
RunReportsSyntaxErrorsForTypeScriptAppHost Recording #78428599178 Logs
SecretCrudOnDotNetAppHost Recording #78428599280 Logs
SecretCrudOnTypeScriptAppHost Recording #78428598903 Logs
StagingChannel_ConfigureAndVerifySettings_ThenSwitchChannels Recording #78428599170 Logs
StartAndWaitForTypeScriptSqlServerAppHostWithNativeAssets Recording #78428599351 Logs
StartReportsSyntaxErrorsForDotNetAppHost Recording #78428599178 Logs
StartReportsSyntaxErrorsForTypeScriptAppHost Recording #78428599178 Logs
StopAllAppHostsFromAppHostDirectory Recording #78428598852 Logs
StopJavaPolyglotAppHostUsingApphostDirectory Recording #78428599454 Logs
StopNonInteractiveSingleAppHost Recording #78428598852 Logs
StopTypeScriptPolyglotAppHostUsingApphostDirectory Recording #78428599029 Logs
StopWithNoRunningAppHostExitsSuccessfully Recording #78428599143 Logs
UnAwaitedChainsCompileWithAutoResolvePromises Recording #78428599284 Logs
UpdateProjectChannelToStable_CSharpEmptyAppHost_PreservesAspireConfigChannel Recording #78428599073 Logs
UpdateProjectChannelToStable_CSharpSingleFileInit_PreservesAspireConfigChannel Recording #78428599073 Logs
UpdateProjectChannelToStable_TypeScriptSingleFileInit_PreservesAspireConfigChannel Recording #78428599073 Logs
UpdateProjectChannelToStable_TypeScript_PreviewsStablePackagesAndPreservesChannel Recording #78428599073 Logs

📹 Recordings uploaded automatically from CI run #26613241630


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

aspire init/add/restore can hang silently when stdin is a TTY (macOS)

2 participants