
[Bug]: Parallel file.managed states cause CPU spin-loop on Linux (fork-inherited ZeroMQ sockets) #68940

@co-cy

What happened?

Description

When parallel: True is used on file.managed states, the forked ParallelState child processes enter a CPU spin-loop (~93–98% CPU each) and never complete. The root cause is that on Linux (fork-based platforms), call_parallel() passes the parent's State instance directly to the child process, including live ZeroMQ sockets connected to the Salt master. Multiple forked children then race on the same inherited ZeroMQ connections, causing the asyncio event loop to spin indefinitely waiting for responses that were already consumed by a sibling process.

Setup

  • Salt version: 3007.13 (Chlorine)
  • OS: Ubuntu 22.04 (Linux, fork-based process creation)
  • Master topology: 3-master failover (master_type: failover)
  • Transport: ZeroMQ

Steps to Reproduce

  1. Create a state with multiple file.managed declarations using parallel: True:

```yaml
redis_exporter_binary:
  file.managed:
    - name: /usr/bin/redis_exporter
    - source: https://nexus.example.com/.../redis_exporter
    - skip_verify: True
    - parallel: True

redis_exporter_env:
  file.managed:
    - name: /etc/default/redis_exporter
    - source: salt://redis_exporter/files/redis_exporter.env.jinja
    - template: jinja
    - parallel: True

sentinel_exporter_env:
  file.managed:
    - name: /etc/default/sentinel_exporter
    - source: salt://redis_exporter/files/redis_exporter.env.jinja
    - template: jinja
    - parallel: True
```

  2. Run state.apply (even with test=True):

```
salt <minion-id> state.apply redis_exporter test=true
```

  3. Observe that the run never completes.

Expected Behavior

All file.managed states should execute in parallel and return results within seconds. With test=True, only hash comparison against the master file server should occur — no file writes.

Actual Behavior

  • The main state process logs "Started in a separate process" for each parallel state and then hangs indefinitely.
  • Two or more ParallelState(...) child processes appear, each consuming ~93–98% CPU:

```
root  2094575 98.1  1.9  610016 56528 ?  Sl  22:14  4:47  ...ParallelState(/etc/default/redis_exporter)
root  2094576 98.2  1.9  610016 56724 ?  Sl  22:14  4:47  ...ParallelState(/etc/default/sentinel_exporter)
```
  • The kernel stack for these processes shows futex_wait, but top reports near-100% CPU — characteristic of a userspace busy-loop (asyncio event loop spinning).
  • Network connections to the master (ports 4505/4506) are ESTABLISHED with Send-Q: 0 — sockets are open but idle.
  • salt-call cp.hash_file <same-file> works perfectly when run standalone (no parallelism).
  • salt-call cp.list_master works perfectly.
  • Killing the hung job with saltutil.kill_all_jobs and re-running produces the same result — the issue is 100% reproducible.

Root Cause Analysis

The issue is in salt/state.py, in the call_parallel() method (line ~2276):

```python
def call_parallel(self, cdata, low, inject_globals):
    ...
    if salt.utils.platform.spawning_platform():
        instance = None  # Windows/macOS: child will recreate State from scratch
    else:
        instance = self  # Linux: reuse parent's State object (with live sockets!)
        inject_globals = None

    proc = salt.utils.process.Process(
        target=self._call_parallel_target,
        args=(instance, self._init_kwargs, name, cdata, low, inject_globals),
        ...
    )
    proc.start()
```

On Linux, Process uses fork(). The child process inherits the parent's memory space, including:

  • ZeroMQ REQ sockets connected to the master's ret port (4506)
  • The asyncio event loop state
  • File client channel objects

When multiple children simultaneously call cp.hash_file (triggered by file.managed to compare file hashes), they all attempt to use the same inherited ZeroMQ socket to communicate with the master. ZeroMQ REQ sockets have strict request-reply ordering — if one child reads a response intended for another, the other child's event loop never receives its expected reply and spins indefinitely.
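This socket-stealing race can be reproduced in miniature with the standard library alone. The sketch below uses a plain socketpair as a stand-in for the minion's single connection to the master; it illustrates the fd-inheritance mechanism, not Salt's actual ZeroMQ transport:

```python
import os
import socket

# A UNIX socketpair stands in for the minion's one connection to the
# master: a single file descriptor, shared by parent and forked child.
minion_sock, master_sock = socket.socketpair()

# The "master" puts exactly one reply on the wire.
master_sock.sendall(b"reply-1")

pid = os.fork()
if pid == 0:
    # The forked child inherits minion_sock and consumes the reply
    # the parent was waiting for.
    minion_sock.recv(16)
    os._exit(0)

os.waitpid(pid, 0)  # the child has definitely read by now

# Without a timeout the parent would now block (or spin, under a polling
# event loop) forever: its reply was stolen by the sibling process.
minion_sock.settimeout(0.5)
try:
    minion_sock.recv(16)
    parent_got_reply = True
except socket.timeout:
    parent_got_reply = False

print("parent got reply:", parent_got_reply)  # False: the child consumed it
```

With a real ZeroMQ REQ socket the outcome is worse still: beyond the lost reply, the socket's strict send/recv state machine is corrupted for every process sharing it.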

On Windows/macOS (spawning platforms), instance is set to None, and _call_parallel_target recreates the State object from scratch with fresh connections — which is why this bug is Linux-specific.

Suggested Fix

Force fresh State instance creation for parallel children on all platforms, not just spawning ones. The simplest approach:

```python
def call_parallel(self, cdata, low, inject_globals):
    ...
    # Always create a fresh instance in the child to avoid
    # sharing ZeroMQ sockets across forked processes
    instance = None

    proc = salt.utils.process.Process(
        target=self._call_parallel_target,
        args=(instance, self._init_kwargs, name, cdata, low, inject_globals),
        ...
    )
    proc.start()
```

This trades a small startup cost (recreating the State object per parallel child) for correctness. The current "optimization" of reusing the parent instance on Linux is unsafe whenever the state function communicates with the master.
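The semantics of the fix can be sketched with a toy stand-in (FakeState and the simplified call_parallel_target below are hypothetical illustrations, not Salt's real classes or signatures): passing instance=None makes the child construct its own object, and therefore its own sockets.

```python
import socket

class FakeState:
    """Hypothetical stand-in for salt.state.State: owns one 'master' socket."""
    def __init__(self):
        self.sock, self._master_side = socket.socketpair()

def call_parallel_target(instance):
    # Mirrors the proposed behavior: with instance=None the child always
    # builds a fresh State (and fresh connections) instead of reusing
    # the parent's fork-inherited one.
    if instance is None:
        instance = FakeState()
    return instance

parent_state = FakeState()                # parent's State, created before forking
child_state = call_parallel_target(None)  # what the fixed child would do

print(child_state is parent_state)            # False: no shared instance
print(child_state.sock is parent_state.sock)  # False: no shared socket
```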

An alternative approach would be to use multiprocessing.set_start_method('forkserver') or 'spawn' for parallel state processes specifically, but this would be a larger change.

Type of salt install

Official deb

Major version

3007.x

What supported OS are you seeing the problem on? Can select multiple. (If bug appears on an unsupported OS, please open a GitHub Discussion instead)

debian-11, debian-12

salt --versions-report output

```
salt --versions-report
Salt Version:
          Salt: 3007.13

Python Version:
        Python: 3.10.19 (main, Feb  5 2026, 07:05:38) [GCC 11.2.0]

Dependency Versions:
          cffi: 2.0.0
      cherrypy: unknown
  cryptography: 42.0.5
      dateutil: 2.8.2
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.6
       libgit2: 1.9.1
  looseversion: 1.3.0
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.7
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 24.0
     pycparser: 2.21
      pycrypto: Not Installed
  pycryptodome: 3.19.1
        pygit2: 1.18.2
  python-gnupg: 0.5.2
        PyYAML: 6.0.1
         PyZMQ: 25.1.2
        relenv: 0.22.3
         smmap: Not Installed
       timelib: 0.3.0
       Tornado: 6.5.4
           ZMQ: 4.3.4

Salt Extensions:
 saltext.vault: 1.5.0

Salt Package Information:
  Package Type: onedir

System Versions:
          dist: debian 12.13 bookworm
        locale: utf-8
       machine: x86_64
       release: 6.12.73+deb12-amd64
        system: Linux
       version: Debian GNU/Linux 12.13 bookworm
```

Labels

bug (broken, incorrect, or confusing behavior), needs-triage