Move connection logic in accept and connect calls #128

Draft
tvegas1 wants to merge 1 commit into Mellanox:master from tvegas1:dev/async_connect

tvegas1 commented Oct 20, 2023

Why

When the worker vs. endpoint (ep) allocation scheme is changed, the connection step can hang indefinitely, because the blocking loop on ucp_worker_progress() no longer lets unrelated requests make progress.

What

Make connection establishment non-blocking instead of blocking.

How

Move the connection steps from isend()/irecv() into connect()/accept(), and drive them with a state machine similar to the one used by other plugins.
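A minimal sketch of the non-blocking pattern described above, not the code in this PR. All names here (`conn_state_t`, `conn_handle_t`, `poll_step_done`, `conn_connect`) are hypothetical, and the UCX progress call is replaced by a counter so the example stands alone. It assumes the caller retries connect() while the returned comm is NULL, which is how I understand the non-blocking plugin convention to work.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connection states; illustrative names only. */
typedef enum {
    CONN_STATE_INIT,       /* nothing started yet */
    CONN_STATE_ADDR_SENT,  /* address exchange in flight */
    CONN_STATE_EP_CREATED, /* endpoint created, waiting for the peer */
    CONN_STATE_READY       /* connection usable */
} conn_state_t;

typedef struct {
    conn_state_t state;
    int pending_polls; /* stand-in for an in-flight operation */
} conn_handle_t;

/* Stand-in for ONE non-blocking progress call (assumption: in the real
 * plugin this would be a single ucp_worker_progress() plus a request
 * check). Returns true once the in-flight step has completed. */
static bool poll_step_done(conn_handle_t *h)
{
    if (h->pending_polls > 0) {
        h->pending_polls--;
    }
    return h->pending_polls == 0;
}

/* One connect() call: advance the state machine by at most one step and
 * return immediately. *comm stays NULL until the connection is ready,
 * so the caller can progress unrelated requests between calls. */
int conn_connect(conn_handle_t *h, void **comm)
{
    *comm = NULL;

    switch (h->state) {
    case CONN_STATE_INIT:
        h->pending_polls = 2; /* pretend the exchange needs two polls */
        h->state = CONN_STATE_ADDR_SENT;
        return 0;
    case CONN_STATE_ADDR_SENT:
        if (poll_step_done(h)) {
            h->pending_polls = 1; /* next step needs one more poll */
            h->state = CONN_STATE_EP_CREATED;
        }
        return 0;
    case CONN_STATE_EP_CREATED:
        if (poll_step_done(h)) {
            h->state = CONN_STATE_READY;
            *comm = h;
        }
        return 0;
    case CONN_STATE_READY:
        *comm = h;
        return 0;
    }
    return -1; /* unknown state */
}
```

The point of the design is that no call ever spins on progress: each invocation does a bounded amount of work, so a stalled peer cannot starve other requests sharing the same worker.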

tvegas1 commented Oct 20, 2023

Please take a look, @brminich.

tvegas1 commented Oct 24, 2023

On hold: this conflicts with NET_SHARED_COMMS=1.
