Feat/enable autoresume for byoc #2470

Open
matthewlouisbrockman wants to merge 25 commits into main from feat/byoc-autoresume

Contributor

matthewlouisbrockman commented Apr 21, 2026

Allows edge clients to ask the API whether a sandbox can be auto-resumed, using gRPC over TLS and authenticating with the existing api_secret.

When a request comes into the edge proxy and the sandbox isn't present in the edge catalog, the proxy calls home to the main API to check whether the sandbox can be auto-resumed. If it can, the main API resumes the sandbox and returns the routing info to the edge proxy.

                           E2B CONTROL PLANE
                     +--------------------------+
                     | API                      |
                     |                          |
                     | ResumeSandbox gRPC       |
                     | api-grpc.<domain>        |
                     +------------+-------------+
                                  ^      |
                                  |      |
                                  |      | 3. API asks BYOC orchestrator
                                  |      |    to resume/start sandbox
                                  |      v
BYOC / EDGE DATA PLANE            |  +---------------------------+
+----------------------------------|--| edge orchestrator node   |-----------+
|                                  |  | starts/resumes sandbox   |           |
|                                  |  | updates sandbox catalog  |           |
|                                  |  +------------+--------------+           |
|                                  |               |                          |
|                                  |               | resumed sandbox           |
|                                  |               v                          |
|                                  |  +---------------------------+           |
|                                  |  | sandbox app / envd        |           |
|                                  |  +---------------------------+           |
|                                  |                                          |
|  user traffic                    |                                          |
|      |                           |                                          |
|      v                           |                                          |
|  +-------------------+           |                                          |
|  | BYOC LB / DNS     |           |                                          |
|  +---------+---------+           |                                          |
|            |                     |                                          |
|            v                     |                                          |
|  +-------------------+           |                                          |
|  | edge proxy        |           |                                          |
|  | client-proxy bin  |           |                                          |
|  +---------+---------+           |                                          |
|            |                     |                                          |
|            | 1. lookup sandbox   |                                          |
|            v                     |                                          |
|  +-------------------+           |                                          |
|  | edge catalog      |           |                                          |
|  | sandbox -> node   |           |                                          |
|  +----+---------+----+           |                                          |
|       |         |                |                                          |
|       | hit     | miss           |                                          |
|       |         |                |                                          |
|       |         v                |                                          |
|       |   +-------------------------------+                                 |
|       |   | 2. auto-resume path           |                                 |
|       |   | edge proxy calls API          |                                 |
|       |   | ResumeSandbox(sandbox, port)  |---------------------------------+
|       |   | + edge/proxy auth             |
|       |   | + forwarded request auth      |
|       |   +---------------+---------------+
|       |                   |
|       |                   | 4. API returns node route
|       |                   |    catalog is now populated
|       |                   v
|       |          +-----------------------------+
|       +--------->| edge orchestrator node      |
|                  | orchestrator proxy          |
|                  +-------------+---------------+
|                                |
|                                | 5. original request continues
|                                |    to requested sandbox port
|                                |    e.g. 80 / 3000 / envd
|                                v
|                  +-----------------------------+
|                  | sandbox app / envd          |
|                  +-----------------------------+
|                                                                       |
+-----------------------------------------------------------------------+
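
The hit/miss flow in the diagram can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `Catalog`, `Resumer`, `StaticResumer`, `Route`, and `ErrNoRoute` are hypothetical names, the real ResumeSandbox call goes over gRPC rather than through a local interface, and the sketch writes the returned route into the catalog itself, whereas in the diagram the orchestrator node updates the catalog.

```go
package main

import (
	"context"
	"errors"
	"sync"
)

// ErrNoRoute is a hypothetical error for a resume that returned no usable node address.
var ErrNoRoute = errors.New("resume succeeded but no node route was returned")

// Resumer is a hypothetical stand-in for the ResumeSandbox gRPC client.
type Resumer interface {
	ResumeSandbox(ctx context.Context, sandboxID string, port uint32) (nodeAddr string, err error)
}

// Catalog is a hypothetical in-memory sandbox -> node map (the "edge catalog").
type Catalog struct {
	mu    sync.RWMutex
	nodes map[string]string
}

func NewCatalog() *Catalog {
	return &Catalog{nodes: make(map[string]string)}
}

func (c *Catalog) Get(sandboxID string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	addr, ok := c.nodes[sandboxID]
	return addr, ok
}

func (c *Catalog) Put(sandboxID, nodeAddr string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.nodes[sandboxID] = nodeAddr
}

// StaticResumer is a fake Resumer that returns a fixed answer and counts calls.
type StaticResumer struct {
	Addr  string
	Err   error
	Calls int
}

func (s *StaticResumer) ResumeSandbox(_ context.Context, _ string, _ uint32) (string, error) {
	s.Calls++
	return s.Addr, s.Err
}

// Route resolves a sandbox to a node address.
// Step 1: look up the catalog. On a miss, step 2: ask the API to resume.
// Step 4: record the returned route so the next request is a catalog hit.
func Route(ctx context.Context, c *Catalog, r Resumer, sandboxID string, port uint32) (string, error) {
	if addr, ok := c.Get(sandboxID); ok {
		return addr, nil
	}
	addr, err := r.ResumeSandbox(ctx, sandboxID, port)
	if err != nil {
		return "", err
	}
	if addr == "" {
		// An empty route is treated as unavailable, not as a successful resume.
		return "", ErrNoRoute
	}
	c.Put(sandboxID, addr)
	return addr, nil
}
```

With a `StaticResumer` that returns a fixed address, the first `Route` call for a sandbox goes through the resume path and the second is served from the catalog without another API call.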


cursor Bot commented Apr 21, 2026

PR Summary

High Risk
Adds a new externally reachable API gRPC entrypoint and uses shared secrets to authorize ResumeSandbox calls, plus changes sandbox routing behavior; mistakes could expose resume functionality or break sandbox traffic routing.

Overview
Enables BYOC/edge client-proxy to auto-resume paused sandboxes by calling the control-plane API over gRPC (optionally TLS) with a new x-e2b-client-proxy-auth header backed by API_SECRET (cluster- or global-scoped). This also wires up infrastructure to expose and rate-limit api-grpc.<domain> (Traefik/GCP LB, firewall, new api_grpc_port plumbing), propagates node IPs from remote service discovery for data-plane routing, tightens error handling when routing info is missing, and adds per-destination proxy retry overrides for resumed sandboxes.
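
As a rough illustration of the shared-secret check described above, the sketch below validates the x-e2b-client-proxy-auth value against a set of accepted secrets. It is an assumption about how such a check could look, not the implementation in this PR: `Authorized` and `ClientProxyAuthHeader` are hypothetical names, the metadata is modeled as a plain `map[string][]string` (the same shape as gRPC metadata), and hashing before a constant-time compare is a swapped-in technique the PR does not confirm.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
)

// ClientProxyAuthHeader is the metadata key named in the PR summary.
const ClientProxyAuthHeader = "x-e2b-client-proxy-auth"

// Authorized reports whether the metadata carries exactly one auth value
// that matches any of the accepted secrets (for example a cluster-scoped
// and a global-scoped one). Hypothetical helper, for illustration only.
func Authorized(md map[string][]string, secrets ...string) bool {
	values := md[ClientProxyAuthHeader]
	if len(values) != 1 || values[0] == "" {
		return false
	}
	// Hash both sides so the comparison runs over equal-length inputs.
	got := sha256.Sum256([]byte(values[0]))
	matched := 0
	for _, s := range secrets {
		if s == "" {
			// An unset secret must never authorize anything.
			continue
		}
		want := sha256.Sum256([]byte(s))
		matched |= subtle.ConstantTimeCompare(got[:], want[:])
	}
	return matched == 1
}
```

Skipping empty secrets matters here: without that guard, a deployment with no secret configured would accept a request that sends an empty header value.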

Reviewed by Cursor Bugbot for commit 7a93b62. Bugbot is set up for automated code reviews on this repo.

Comment thread iac/provider-gcp/nomad-cluster/network/main.tf
Comment thread iac/provider-aws/nomad/main.tf
Comment thread packages/client-proxy/internal/proxy/proxy.go Outdated
dobrac requested a review from sitole April 21, 2026 22:04
Add a shared route IP resolver with the local-cluster fallback needed by CI, and make API/client-proxy callers treat empty resolved routes as unavailable instead of successful resume responses.

This keeps BYOC/remote empty node IPs from being treated as routable while preserving the local 127.0.0.1 path.
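
A minimal sketch of the resolver behavior described in this commit message, under stated assumptions: `ResolveRouteIP`, `ErrRouteUnavailable`, and the `isLocalCluster` flag are hypothetical names chosen for illustration, and the actual resolver in the PR may take different inputs.

```go
package main

import (
	"errors"
	"strings"
)

// ErrRouteUnavailable is a hypothetical error for a node with no routable IP.
var ErrRouteUnavailable = errors.New("sandbox route unavailable")

// localRouteIP is the local-cluster fallback mentioned in the commit message.
const localRouteIP = "127.0.0.1"

// ResolveRouteIP picks the IP to route sandbox traffic to.
// A non-empty node IP is used as-is. An empty IP falls back to 127.0.0.1
// only for the local cluster; for BYOC/remote clusters it is reported as
// unavailable instead of being treated as a successful route.
func ResolveRouteIP(nodeIP string, isLocalCluster bool) (string, error) {
	ip := strings.TrimSpace(nodeIP)
	if ip != "" {
		return ip, nil
	}
	if isLocalCluster {
		return localRouteIP, nil
	}
	return "", ErrRouteUnavailable
}
```

Callers on both the API and client-proxy side would then check the error and respond with "unavailable" rather than returning an empty route in a successful resume response.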
Member

sitole commented Apr 22, 2026

@matthewlouisbrockman, can you please share a diagram explaining how things are wired together?

cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 0cdff28.

Comment thread packages/orchestrator/pkg/proxy/proxy.go

4 participants