mitosis: preempt on tick and check borrow fairness #3469

Open
likewhatevs wants to merge 1 commit into sched-ext:main from likewhatevs:mitosis-tick-borrow-fairness

@likewhatevs

Summary

  • Add ops.tick handler that preempts the running task when per-CPU DSQ
    work is waiting, or when a cross-cell borrower occupies a CPU whose
    native cell has queued work.
  • Before borrowing a CPU, check that its native cell DSQ and per-CPU DSQ
    are empty; skip the borrow if the target cell needs the CPU.
  • Add raw u64 DSQ ID helpers for use in contexts where BPF cannot
    return aggregates.

Together with #3468, this gets the vtime contamination test in #3467
to pass.

Add ops.tick to zero the running task's slice when this CPU's
per-CPU DSQ has waiting work, or when the task is cross-cell
and the native cell DSQ has waiting work. This triggers dispatch
within the current tick rather than waiting for the full slice
to expire, reducing latency for per-CPU DSQ tasks and reclaiming
CPUs when native cell work arrives.
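
The tick rule described above can be sketched as a small userspace model. This is not the BPF code from the patch: `struct tick_view`, `tick_should_preempt()` and `tick_apply()` are hypothetical names, and the queue depths are plain integers standing in for whatever the handler reads from the DSQs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what the tick handler looks at for one CPU. */
struct tick_view {
	int32_t cpu_dsq_queued;     /* tasks waiting on this CPU's per-CPU DSQ */
	int32_t native_cell_queued; /* tasks waiting on the CPU's native cell DSQ */
	bool task_is_borrower;      /* running task belongs to a different cell */
};

/* True when the running task should give up the rest of its slice. */
static bool tick_should_preempt(const struct tick_view *v)
{
	/* Per-CPU DSQ work always wins, whoever is running. */
	if (v->cpu_dsq_queued > 0)
		return true;
	/* A cross-cell borrower yields once the native cell has queued work. */
	return v->task_is_borrower && v->native_cell_queued > 0;
}

/* Models the effect of `p->scx.slice = 0`: returns the slice left after the tick. */
static uint64_t tick_apply(const struct tick_view *v, uint64_t slice)
{
	return tick_should_preempt(v) ? 0 : slice;
}
```

In this model a native task keeps its slice even when its own cell DSQ has queued work; only per-CPU DSQ work or borrower status leads to the slice being zeroed.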

Add borrow eligibility check in try_pick_idle_cpu: before
borrowing a CPU, verify the target cell's DSQ and per-CPU DSQ
have no waiting work at borrow time. Work arriving after the
borrow starts is handled by tick.
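
A minimal userspace sketch of the eligibility check, assuming the two queue depths have already been read. `struct borrow_target`, `can_borrow()` and `pick_borrowable()` are hypothetical names for illustration; in the patch the depths come from `scx_bpf_dsq_nr_queued()` and the check sits inside `try_pick_idle_cpu`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-candidate snapshot taken at borrow time. */
struct borrow_target {
	int32_t cell_dsq_queued; /* depth of the target cell's DSQ */
	int32_t cpu_dsq_queued;  /* depth of the candidate CPU's per-CPU DSQ */
};

/* A CPU may be borrowed only if neither queue has waiting work. */
static bool can_borrow(const struct borrow_target *t)
{
	/* Same `> 0` comparison as the diff; signed so it compiles unchanged. */
	if (t->cell_dsq_queued > 0 || t->cpu_dsq_queued > 0)
		return false;
	return true;
}

/* Returns the index of the first borrowable candidate, or -1 if none. */
static int pick_borrowable(const struct borrow_target *cands, int n)
{
	for (int i = 0; i < n; i++) {
		if (can_borrow(&cands[i]))
			return i;
	}
	return -1;
}
```

The check is a point-in-time snapshot: work that arrives right after `can_borrow()` returns true is not seen here, which is why the description leaves that case to the tick handler.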

Add cpu_dsq_raw() and cell_llc_dsq_raw() helpers to dsq.bpf.h
for contexts where BPF does not support aggregate return values
from get_cpu_dsq_id() / get_cell_llc_dsq_id().
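
The relationship between an aggregate-returning helper and a raw `u64` variant might look roughly like the sketch below. Everything here is an assumption made for illustration: the `dsq_id_t` wrapper, the `DSQ_KIND_CPU` bit and the `_model` function names are hypothetical, and the real encoding in dsq.bpf.h may differ.

```c
#include <stdint.h>

/* Hypothetical wrapper type; the real layout in dsq.bpf.h may differ. */
typedef struct {
	uint64_t raw;
} dsq_id_t;

/* Hypothetical tag bit marking an ID as a per-CPU DSQ. */
#define DSQ_KIND_CPU (1ULL << 32)

/* Raw variant: returns a scalar, so it works where aggregates cannot be returned. */
static uint64_t cpu_dsq_raw_model(uint32_t cpu)
{
	return DSQ_KIND_CPU | (uint64_t)cpu;
}

/* Aggregate variant: wraps the same encoding in the typed struct. */
static dsq_id_t get_cpu_dsq_id_model(uint32_t cpu)
{
	dsq_id_t id = { .raw = cpu_dsq_raw_model(cpu) };
	return id;
}
```

Building the typed helper on top of the raw one keeps a single encoding, so the two variants cannot drift apart.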

Signed-off-by: Pat Somaru <patso@likewhatevs.io>
target_cctx->cell, tllc)) > 0 ||
scx_bpf_dsq_nr_queued(cpu_dsq_raw(cpu)) > 0)
goto no_borrow;
}

This doesn't make a ton of sense to me - it seems like papering over some other bug if we're finding idle CPUs in a cell with work in its DSQ

p->scx.slice = 0;
}
}
}

I think I'd rather make this choice at select_cpu or enqueue time than on a periodic tick. That being said - this just seems like a fairness change - maybe let's wait on this until we've resolved more obvious correctness issues?
