DHCP: concurrent DISCOVER on multiple nodes can overwrite the same lease and downgrade it to negotiate TTL #2148

@romanlum

Summary

When multiple Gravity instances run the DHCP role, the same client lease can be created concurrently during DHCPDISCOVER.

If one node misses the lease in its local watcher cache, it may create the lease again and overwrite the existing key with the shorter leaseNegotiateTimeout TTL. This can cause the lease to disappear after the negotiate timeout even though the client completed DHCP successfully.

Impact

  • the same lease key can be written by multiple nodes
  • an existing lease can be overwritten by a later DISCOVER
  • the overwritten lease may end up with the short negotiate TTL, for example 30s
  • after that timeout, the lease disappears unexpectedly

Reproduction

  1. Run multiple Gravity instances with the DHCP role enabled.
  2. Use a DHCP client against the shared cluster, for example:
    dhclient -r
    dhclient -v
  3. Observe lease creation while more than one node processes the same client traffic.
  4. In some runs, the lease key is overwritten and ends up with the negotiate timeout instead of the normal scope TTL.

Expected behavior

  • only the first node should create the lease for a client
  • other nodes should reuse the existing lease instead of overwriting it
  • DISCOVER must not downgrade an existing lease to leaseNegotiateTimeout

Actual behavior

A node can miss the existing lease in its local watcher state and then:

  • FindLease() == nil
  • create a new lease object
  • write it with leaseNegotiateTimeout

If another node already created the same lease key, this overwrites the key and can shorten the lease lifetime unexpectedly.
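One possible interleaving of the steps above, as a minimal in-memory sketch. The `store` and `node` types, the handler names, and the 24h `scopeTTL` are hypothetical stand-ins for illustration, not Gravity's actual code; only `leaseNegotiateTimeout` and the 30s example value come from this issue.

```go
package main

import (
	"fmt"
	"time"
)

const (
	leaseNegotiateTimeout = 30 * time.Second // example value from this issue
	scopeTTL              = 24 * time.Hour   // assumed normal scope TTL
)

// store models the shared key-value backend: lease key -> TTL.
type store map[string]time.Duration

// node models one DHCP instance whose local watcher cache may lag the store.
type node struct {
	cache map[string]bool
	kv    store
}

// findLease consults only the local cache, never the shared store.
func (n *node) findLease(key string) bool { return n.cache[key] }

// handleDiscover is the racy check-then-write: a cache miss leads to an
// unconditional put with the short negotiate TTL.
func (n *node) handleDiscover(key string) {
	if !n.findLease(key) {
		n.kv[key] = leaseNegotiateTimeout // blind overwrite
		n.cache[key] = true
	}
}

// handleRequest promotes the lease to the normal scope TTL.
func (n *node) handleRequest(key string) {
	n.kv[key] = scopeTTL
	n.cache[key] = true
}

// simulateRace returns the TTL left on the key after node B handles a
// DISCOVER with a stale cache.
func simulateRace() time.Duration {
	kv := store{}
	a := &node{cache: map[string]bool{}, kv: kv}
	b := &node{cache: map[string]bool{}, kv: kv}
	key := "lease/aa:bb:cc:dd:ee:ff"

	a.handleDiscover(key) // node A creates the lease
	a.handleRequest(key)  // client completes DHCP via node A
	b.handleDiscover(key) // node B's watcher has not seen the key yet
	return kv[key]
}

func main() {
	fmt.Println("TTL after race:", simulateRace())
}
```

In this sketch the key ends up with the negotiate TTL even though node A had already promoted it, which matches the symptom described above.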

Root cause

Lease existence was checked via local watcher state only. In a clustered setup, watcher lag means two nodes can both believe the lease does not yet exist.

Proposed fix

Use an atomic create-if-absent operation for new lease creation during DISCOVER and REQUEST:

  • try to create the lease key only if it does not already exist
  • if creation fails because the key already exists, fetch and reuse the existing lease
  • never overwrite an existing lease with the short discover or negotiate TTL
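The three rules above could be sketched as follows. This is an in-memory illustration only: the `store`, `lease`, `createIfAbsent`, and `handleDiscover` names are hypothetical, and a mutex stands in for whatever atomic primitive the real backend offers. With an etcd-style backend, the equivalent would typically be a transaction that compares the key's create revision against zero before the put; whether Gravity's storage layer exposes this is an assumption here.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

const (
	leaseNegotiateTimeout = 30 * time.Second // example value from this issue
	scopeTTL              = 24 * time.Hour   // assumed normal scope TTL
)

// lease is a hypothetical, simplified lease record.
type lease struct {
	Owner string
	TTL   time.Duration
}

// store models the shared backend; the mutex stands in for the backend's
// atomic compare-and-put.
type store struct {
	mu   sync.Mutex
	data map[string]lease
}

func newStore() *store { return &store{data: map[string]lease{}} }

// createIfAbsent writes l only when key does not exist yet. It returns the
// lease now stored under key and whether this call created it.
func (s *store) createIfAbsent(key string, l lease) (lease, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if existing, ok := s.data[key]; ok {
		return existing, false // reuse; never overwrite
	}
	s.data[key] = l
	return l, true
}

// handleDiscover creates a negotiating lease or reuses the existing one.
func handleDiscover(s *store, nodeName, key string) (lease, bool) {
	return s.createIfAbsent(key, lease{Owner: nodeName, TTL: leaseNegotiateTimeout})
}

func main() {
	s := newStore()
	key := "lease/aa:bb:cc:dd:ee:ff"

	// Node A already created the lease and promoted it to the scope TTL.
	s.data[key] = lease{Owner: "node-a", TTL: scopeTTL}

	// Two other nodes handle a DISCOVER for the same client concurrently.
	var wg sync.WaitGroup
	for _, name := range []string{"node-b", "node-c"} {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			handleDiscover(s, n, key)
		}(name)
	}
	wg.Wait()

	fmt.Println("owner:", s.data[key].Owner, "TTL:", s.data[key].TTL)
}
```

Because the existence check and the write happen as one step, a node with a stale watcher cache gets the existing lease back instead of replacing it, so the TTL set by the first node survives.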

Labels: bug
