tcpffee

/ˈtɒf.i/, a sticky TCP load balancer.

Setup and Configuration

Configuration is done through a single TOML file:

# IP address to bind to.
bind_addr = "0.0.0.0"
# Socket to bind stats API to.
stats_bind_socket = "127.0.0.1:2340"
# Socket to bind TCP listeners for inter-proxy communications.
# Defaults to localhost only for security, as is ideal for single proxy deployments.
# Every proxy must be able to access all of these ports on every other proxy.
# Make sure these ports are not externally accessible!
#
# Note: These may be merged into fewer ports in future.
raft_healthchecks_logs_bind_socket="127.0.0.1:2341"
raft_stick_table_logs_bind_socket="127.0.0.1:2342"
raft_healthchecks_proposals_bind_socket="127.0.0.1:2343"
raft_stick_table_proposals_bind_socket="127.0.0.1:2344"
# Ports to listen on.
ports = [ 5280, 5281, 5282 ]
# How many seconds a stick table entry is valid for without activity
expiration = 10


# Every group must have a unique ID and it must stay the same throughout the
# life of the group.
[groups.0]
# `portmap` defines which hosts a port is forwarded to.
# This can be expressed as one IP address or as a port-ip mapping for finer
# control:
portmap = "127.0.0.1"

# Optional attributes used to drain or disable a server group
# default: false
drain = false
# default: false
disabled = false

[groups.1]
portmap = "172.31.100.20"

# A group can have any number of healthchecks, including zero.
[[groups.1.healthchecks]]
# Healthchecks are TCP by default.
# How many checks can fail before server group is drained or disabled, default: 3
fall = 3
# What to do to the *whole group* when a server fails its healthchecks.
# default: `Draining`, values: `Draining`, `Disabled`, `Up` (ignore healthcheck)
fall_action = "Draining"
# How many healthchecks must pass before server group is resumed, default: 5
rise = 5
# How often to run the healthcheck (seconds between execution), default: 5.
interval = 5
# Maximum seconds to wait for the check to complete before considering it a fail,
# default: half of interval.
timeout = 1

[[groups.1.healthchecks]]
interval = 5
# Example of an HTTP healthcheck.
# HTTP mode can be set by specifying at least one of the options starting with `http_`.
fall_action = "Disabled"
# The endpoint on the server to check health with, default: "/"
http_target = "/health"
# Any HTTP headers to attach to the request, default: {}
#http_headers = { Content-Type = "text/plain" }
# HTTP method to use, default: GET, values: GET, POST
http_method = "POST"
# HTTP request body, default: None
http_body = "Hello, World!"
# HTTP response code which indicates a healthy server, default: 200
http_status = 201


[groups.2]
# `portmap` defines which hosts a port is forwarded to.
# This can be expressed as one IP address or as a port-ip mapping for finer
# control:
[groups.2.portmap]
5280 = "172.31.100.198:5280"
5281 = "172.31.100.20:3487"
5282 = "172.31.100.3:5282"

# List of all peers, including the peer itself.
# If left empty, the proxy will run on its own.
[[ peers ]]
# Hostname that can be used to communicate with the peer from other peers.
# Can also be IP address.
host = "hostname.domain"
# MAC address of a network interface on the peer.
# Used to assign peers their node ID, and for peers to identify which peer they are.
mac_address = "00:00:00:00:00:01"

[[ peers ]]
host = "hostname.domain"
mac_address = "00:00:00:00:00:02"

There are also some settings that are set through environment variables:

| Environment Variable | Description | Default |
| --- | --- | --- |
| `TCPFFEE_CONFIG_PATH` | Path to config file | `/etc/tcpffee/tcpffee.toml` |
| `TCPFFEE_PID_PATH` | Path to file with the PID of tcpffee | `/run/tcpffee.pid` |
| `TCPFFEE_HEALTHCHECK_SAVE_PATH` | Path to file to read from/write to the state of healthchecks | `/var/lib/tcpffee/healthcheck_state` |

Proxy Key Types

| Environment Variable | Values |
| --- | --- |
| `TCPFFEE_KEY_TYPE` | `IP`, `IP_WITH_FIRST_MESSAGE_SLICE` |
| `TCPFFEE_KEY_FIRST_MESSAGE_SLICE_START` | Positive integer |
| `TCPFFEE_KEY_FIRST_MESSAGE_SLICE_END` | Positive integer greater than `TCPFFEE_KEY_FIRST_MESSAGE_SLICE_START` |

The proxy key is used to identify client machines and evenly distribute their load. By default, the key used is the client's IP address.

Be aware that with many IPv4 clients this may not distribute load evenly, because clients behind the same NAT share one IP address. Additionally, if the drain feature needs to be used with zero user interruption, a server may never be fully drained. This can happen when client machines behind the same NAT hold several long-running, partially overlapping sessions with the servers being proxied. An example of this could be a heartbeat to a license server.

To prevent this, the key type can be set to `IP_WITH_FIRST_MESSAGE_SLICE`, in which case `TCPFFEE_KEY_FIRST_MESSAGE_SLICE_START` and `TCPFFEE_KEY_FIRST_MESSAGE_SLICE_END` must also be set. This key type builds the key by taking the IP address and appending a slice of the first message the client program sends. This helps with NAT-related load balancing issues when the client program's first message always has the same format and identifies the client with something unique (such as a username or hostname).
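The key construction described above could be sketched roughly as follows. This is an illustration only, not tcpffee's actual code: the `KeyType` enum and `proxy_key` function are hypothetical names, and it assumes the slice is a byte range with an exclusive end that is clamped when the first message is shorter than expected.

```rust
use std::net::IpAddr;

/// Hypothetical mirror of the `TCPFFEE_KEY_TYPE` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeyType {
    /// Key is the client's IP address only (the default).
    Ip,
    /// Key is the IP address followed by `first_message[start..end]`.
    IpWithFirstMessageSlice { start: usize, end: usize },
}

/// Builds a stick table key for a client connection.
/// Assumption: byte offsets, exclusive end, clamped to the message length.
pub fn proxy_key(key_type: KeyType, client_ip: IpAddr, first_message: &[u8]) -> Vec<u8> {
    let mut key = client_ip.to_string().into_bytes();
    if let KeyType::IpWithFirstMessageSlice { start, end } = key_type {
        // Clamp both bounds so a short first message cannot cause a panic.
        let end = end.min(first_message.len());
        let start = start.min(end);
        key.extend_from_slice(&first_message[start..end]);
    }
    key
}
```

With a first message of `HELLO alice\n` and a slice of 6 to 11, two clients behind the same NAT that log in as different users would get different keys, so they can be balanced (and drained) independently.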

Logging

The log level can be set using the `RUST_LOG` environment variable. More information can be found in the `env_logger` documentation.

Draining and Disabling Server Groups

A server group can be drained by setting `drain` to `true` in the config, or automatically when it fails a healthcheck. A group can be disabled in the same two ways (`disabled = true` or a failed healthcheck), but for a failed healthcheck to disable the group rather than drain it, `groups.<id>.healthchecks[].fall_action` must be set to `Disabled`, as groups are drained by default.
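To make the interaction of `fall`, `rise` and `fall_action` concrete, here is a minimal sketch of a healthcheck tracker. It is not taken from the tcpffee source: `HealthTracker` and `GroupState` are hypothetical names, and it assumes that `fall` and `rise` count consecutive results and that the group changes state on the check that reaches the threshold.

```rust
/// Possible states of a server group; also the values of `fall_action`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GroupState {
    Up,
    Draining,
    Disabled,
}

/// Hypothetical tracker for one healthcheck on one group.
pub struct HealthTracker {
    fall: u32,
    rise: u32,
    fall_action: GroupState,
    consecutive_failures: u32,
    consecutive_passes: u32,
    state: GroupState,
}

impl HealthTracker {
    pub fn new(fall: u32, rise: u32, fall_action: GroupState) -> Self {
        Self {
            fall,
            rise,
            fall_action,
            consecutive_failures: 0,
            consecutive_passes: 0,
            state: GroupState::Up,
        }
    }

    /// Records one healthcheck result and returns the resulting group state.
    pub fn record(&mut self, passed: bool) -> GroupState {
        if passed {
            self.consecutive_failures = 0;
            self.consecutive_passes = self.consecutive_passes.saturating_add(1);
            // Assumption: the group resumes once `rise` checks pass in a row.
            if self.state != GroupState::Up && self.consecutive_passes >= self.rise {
                self.state = GroupState::Up;
            }
        } else {
            self.consecutive_passes = 0;
            self.consecutive_failures = self.consecutive_failures.saturating_add(1);
            // Assumption: the group falls once `fall` checks fail in a row.
            if self.consecutive_failures >= self.fall {
                self.state = self.fall_action;
            }
        }
        self.state
    }
}
```

With the defaults (`fall = 3`, `rise = 5`, `fall_action = "Draining"`), this sketch drains the group on the third consecutive failure and brings it back up on the fifth consecutive pass. Setting `fall_action` to `Up` leaves the state unchanged, which matches the "ignore healthcheck" description in the config.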

Reloading Configuration

Configuration can be reloaded by sending the `SIGUSR1` signal to the process. The PID of the process is saved to `/run/tcpffee.pid` by default; this path can be overridden with the environment variable `TCPFFEE_PID_PATH`. Reloading a correctly written configuration will not cause any connections to be dropped. If the syntax or schema is wrong, the program will log this and continue with the old configuration.

Live reloading can be used to dynamically change the configuration, such as adding/removing servers.
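A reload could be triggered from a small helper such as the sketch below. It assumes the PID file holds a single decimal PID (optionally followed by a newline) and that a POSIX `kill` utility is on the `PATH`; the function names are hypothetical and not part of tcpffee.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Default PID file location, as documented above.
const DEFAULT_PID_PATH: &str = "/run/tcpffee.pid";

/// Resolves the PID file path from the value of `TCPFFEE_PID_PATH`, if set.
pub fn pid_path(env_value: Option<&str>) -> PathBuf {
    PathBuf::from(env_value.unwrap_or(DEFAULT_PID_PATH))
}

/// Parses the PID file contents. Assumption: one decimal number.
pub fn parse_pid(contents: &str) -> Option<u32> {
    contents.trim().parse().ok()
}

/// Sends SIGUSR1 to the PID found in `path` by shelling out to `kill`.
/// Returns whether `kill` reported success.
pub fn request_reload(path: &Path) -> io::Result<bool> {
    let contents = fs::read_to_string(path)?;
    let pid = parse_pid(&contents).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, "PID file does not contain a number")
    })?;
    let status = Command::new("kill")
        .arg("-s")
        .arg("USR1")
        .arg(pid.to_string())
        .status()?;
    Ok(status.success())
}
```

A caller would typically pass `std::env::var("TCPFFEE_PID_PATH").ok().as_deref()` to `pid_path` and then call `request_reload` on the result. From a shell, the same effect is usually achieved by passing the PID from that file to `kill -s USR1`.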

Statistics API

tcpffee comes with an HTTP API for accessing statistics and information about the proxy. This can currently be accessed at:

STATS_BIND_SOCKET/api/v1/
STATS_BIND_SOCKET/ui/v1/ (for an interactive way to explore the API)
STATS_BIND_SOCKET/spec/v1/ (for the OpenAPI specification)

where `STATS_BIND_SOCKET` is the value of `stats_bind_socket` in the proxy configuration. The default socket listens on localhost only. Be careful when exposing it more widely, as the entire contents of the stick table are available through the API.
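As a rough illustration, the API root can be queried with nothing more than the standard library. The sketch below assumes plain HTTP on the example socket `127.0.0.1:2340`; it requests only the documented `/api/v1/` prefix, since the individual endpoints are described by the OpenAPI specification rather than here. The helper names are hypothetical, and the response is returned raw (status line, headers and body) without decoding.

```rust
use std::io::{self, Read, Write};
use std::net::TcpStream;

/// Builds a minimal HTTP/1.1 GET request for `path` on `host`.
pub fn build_get_request(host: &str, path: &str) -> String {
    format!(
        "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n",
        path, host
    )
}

/// Connects to the stats socket and returns the raw HTTP response.
/// Assumption: the API is served over plain HTTP and replies with UTF-8 text.
pub fn fetch(socket: &str, path: &str) -> io::Result<String> {
    let mut stream = TcpStream::connect(socket)?;
    stream.write_all(build_get_request(socket, path).as_bytes())?;
    let mut response = String::new();
    // `Connection: close` asks the server to close the stream when done,
    // so reading to end-of-stream yields the whole response.
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

For example, `fetch("127.0.0.1:2340", "/api/v1/")` would return whatever the proxy serves at the API root, provided it is running with the example configuration.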

TODO

- Performance
- Thorough testing of healthchecks
- Save stick table to disk at regular intervals
- Test proxy smooth shutdown
- Transparent proxying
- Use raft to synchronise state between proxy instances
  - ~~Implement generic raft storage for use in proxy~~
  - ~~Implement generic raft node for use in proxy~~
  - ~~Implement communication for raft node~~
  - ~~Synchronise stick tables using raft (replace postgresql)~~
  - Leader should reject stick table additions for entries that have not expired
  - ~~Leader server holds elections on health of server groups to decide if they are healthy~~
  - ~~Servers which falsely report server groups as healthy get shut down~~ (they now have an endpoint to report being unhealthy instead)
- Testing
- ~~Connection pooling~~ Done!
- ~~Live configuration editing API/config hot reloading~~ Config hot reloading done!
- ~~Logging~~ Done!
- ~~Cache database queries~~ Done!
- ~~Drain option as a config line~~ Done!
- ~~Healthchecks~~ Done!
  - ~~Automatic draining for failed healthchecks~~ Done!
  - ~~Fix cache not being updated when a client is redirected because it was previously directed to a disabled server~~ Fixed!
  - ~~Healthcheck status is restored on server restart~~ Done!
  - ~~Ensure healthchecks are up to date on config reload~~ Done!
- ~~Add timeouts to healthchecks~~
