
rabbitmq takes forever to start, fails, and still eats 100% CPU after started, if ulimit -n set to a high value #491

@t-lo


Do you want to request a feature or report a bug?

This is a bug report that includes a workaround (see below). The motivation for filing this issue is to share the workaround, which cost me quite a bit of debugging time, with other affected users.

What is the current behavior?

When ulimit -n inside the document server container is set to a high value (whether it is depends on the Docker configuration), the container takes multiple minutes to start and hangs at Starting RabbitMQ Messaging Server rabbitmq-server, which eventually fails, though the container continues to run. After that, a process (or thread?) erl_child_setup consumes 100% of a single CPU and keeps running forever. The document server container is not usable at this point (the health endpoint returns 502) because rabbitmq never started successfully.

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.

  1. Check that ulimit -n is set to a high value inside the onlyoffice container:
    host $ docker run -ti --entrypoint /bin/bash onlyoffice/documentserver
    container $ ulimit -n
    1073741816
  2. Start the container:
    host $ docker run onlyoffice/documentserver
  3. Run htop on the host (which also shows container-namespaced processes). It shows start-stop-daemon [...] redis-server stuck for multiple minutes, consuming 100% CPU; then erl_child_setup does the same.
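The check in step 1 can be scripted. The following is only a minimal sketch: the helper name `classify_nofile` is hypothetical, and the `THRESHOLD` of 1048576 is an arbitrary cut-off chosen for illustration, not a value established in this report.

```shell
#!/bin/sh
# Classify an open-file limit as "high", "ok", or "unknown".
# THRESHOLD is an assumed cut-off for illustration only.
THRESHOLD=${THRESHOLD:-1048576}

classify_nofile() {
  limit=$1
  case $limit in
    unlimited)
      # "unlimited" is treated as high.
      echo high ;;
    ''|*[!0-9]*)
      # Empty or non-numeric input cannot be compared.
      echo unknown ;;
    *)
      if [ "$limit" -gt "$THRESHOLD" ]; then
        echo high
      else
        echo ok
      fi ;;
  esac
}
```

Inside the container, it could be called as `classify_nofile "$(ulimit -n)"`. With the value from step 1 (1073741816) it reports `high`.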

What is the expected behavior?

  1. The container starts normally regardless of the ulimit -n setting inside the container.
  2. If the start-up of any of the required components (rabbitmq, redis, documentserver, nginx, etc.) fails, the container exits with an error.

Did this work in previous versions of DocumentServer?

Yes, but I'm unsure when it stopped working.

DocumentServer Docker tag:

  • dockerhub digest 8a1edcc13f9d
  • image ID 5a50e3a2d2ed

Host Operating System:

Fedora 36 with Docker version 20.10.17, build 100c701

Workaround

Set ulimit for NOFILE to a lower value, either individually for the documentserver container or globally for all containers.

Individually: add (e.g.) --ulimit nofile=65536:65536 to the docker command line, or

   ...
   ulimits:
     nofile:
       soft: "65536"
       hard: "65536"
   ...

to your service configuration YAML for docker-compose.

Globally: Add --default-ulimit nofile=65536:65536 to the dockerd command line.
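To confirm that an override takes effect, the check from the reproduction steps can be repeated with the limit applied. A minimal sketch, assuming a hypothetical helper `check_container_nofile` and a hypothetical `DOCKER` variable (defaulting to `docker`) so that the command can be swapped out; the `run`, `--rm`, `--ulimit`, and `--entrypoint` options are standard docker CLI options.

```shell
#!/bin/sh
# Print the open-file limit seen inside a container started with an
# explicit nofile override.
#   $1: image name
#   $2: limit to apply as both soft and hard value
# DOCKER is a hypothetical override for the docker binary.
check_container_nofile() {
  "${DOCKER:-docker}" run --rm \
    --ulimit "nofile=$2:$2" \
    --entrypoint /bin/bash \
    "$1" -c 'ulimit -n'
}
```

Running `check_container_nofile onlyoffice/documentserver 65536` on the host should print the overridden limit rather than the high default.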


Labels: feature request (issues that request new features to be added to OnlyOffice)
