Support TCP for protocol messages by softins · Pull Request #3636 · jamulussoftware/jamulus

softins · 2026-03-11T14:36:42Z

Short description of changes

Support fallback to TCP for protocol messages, in order to overcome potential loss of large messages due to UDP fragmentation. ~~Currently an incomplete draft, for comment as development continues.~~

CHANGELOG: Client/Server: Support TCP fallback for protocol messages.

Context: Fixes an issue?

Discussed in issue #3242.

Does this change need documentation? What needs to be documented and how?

It will need documentation once design and development are complete. In particular, the documentation needs to explain the firewall requirements for a server or directory.

Status of this Pull Request

~~Incomplete, still under development. Main server side complete and working. Client side development in progress.~~ Complete and ready for review and testing. ~~Still marked draft as~~ It needs some of the debug messages to be commented out before merging.

What is missing until this pull request can be merged?

A lot of testing of both server and client. Intended for Jamulus 4.0.0.

Checklist

I've verified that this Pull Request follows the general code principles
I tested my code and it does what I want
My code follows the style guide
I waited some time after this Pull Request was opened and all GitHub checks completed without errors.
I've filled all the content above

softins · 2026-03-11T14:40:45Z

So far, this implements the server side of the design described here and here

softins · 2026-03-28T12:07:21Z

So the next stage of implementation has been achieved: client-side support in the Connect dialog.

- If the server list has not been received via UDP when the associated message indicating TCP support has arrived, the client will retry fetching the server list over TCP.
- If the client list for a server has not been received via UDP when the associated message indicating TCP support has arrived, the client will retry fetching the client list over TCP, and will continue to use TCP for that server while the Connect dialog is open.
- A directory or server that does not have TCP support will not send the TCP supported message, and will continue to be handled as in current versions.
- If the server list or client list is successfully received over UDP, there is no need for the client to try TCP.
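
For illustration, the client-side rule in the points above could be sketched as follows. This is a minimal sketch and not the code in this PR: `ListRequest` and `Action` are hypothetical names, and it assumes the "TCP supported" message normally arrives after the list it follows.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Action(Enum):
    NONE = auto()            # nothing to do
    RETRY_OVER_TCP = auto()  # list still missing and the peer offers TCP


@dataclass
class ListRequest:
    """Tracks one outstanding server-list or client-list request (hypothetical type)."""

    list_received: bool = False
    tcp_requested: bool = False

    def on_list_received(self) -> None:
        # The list arrived, so no fallback is needed for this request.
        self.list_received = True

    def on_tcp_supported(self) -> Action:
        # The peer sends "TCP supported" right after the list. If the list has
        # not been seen by now, it was probably lost (e.g. to fragmentation).
        if self.list_received or self.tcp_requested:
            return Action.NONE
        self.tcp_requested = True
        return Action.RETRY_OVER_TCP
```

A directory or server without TCP support never sends the message, so `on_tcp_supported` is never called for it and the request behaves as in current versions.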

It has been tested by using nft to drop outbound Jamulus UDP messages with a specific message ID, to simulate loss due to fragmentation.

Examples for a directory-enabled server running on port 22120:

- Drop UDP server list: `nft add rule inet filter output udp sport 22120 @ih,16,16 0xee03 drop`
- Drop UDP client list: `nft add rule inet filter output udp sport 22120 @ih,16,16 0xf503 drop`
- Drop UDP "TCP supported" msg: `nft add rule inet filter output udp sport 22120 @ih,16,16 0xfb03 drop`

Note that nft rules require network byte order (big-endian), but Jamulus IDs are little-endian:

CLM_SERVER_LIST = 1006 = 0x03ee => 0xee03 (LE byte order)
CLM_RED_SERVER_LIST = 1018 = 0x03fa => 0xfa03 (LE byte order)
CLM_CONN_CLIENTS_LIST = 1013 = 0x03f5 => 0xf503 (LE byte order)
CLM_TCP_SUPPORTED = 1019 = 0x03fb => 0xfb03 (LE byte order)
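
The byte swap in the table above can be reproduced with a few lines of Python. This is only a sketch for generating test rules: the helper names are made up here, and it assumes (as the `@ih,16,16` match in the rules above implies) that the message ID is the 16-bit field starting at bit offset 16 of the UDP payload.

```python
import struct

# Message IDs as listed above.
MESSAGE_IDS = {
    "CLM_SERVER_LIST": 1006,
    "CLM_CONN_CLIENTS_LIST": 1013,
    "CLM_RED_SERVER_LIST": 1018,
    "CLM_TCP_SUPPORTED": 1019,
}


def nft_match_value(msg_id: int) -> str:
    """Return the value nft sees when it reads the little-endian ID bytes as big-endian."""
    wire_bytes = struct.pack("<H", msg_id)  # byte order as sent on the wire
    (as_big_endian,) = struct.unpack(">H", wire_bytes)
    return f"0x{as_big_endian:04x}"


def nft_drop_rule(msg_id: int, port: int = 22120) -> str:
    """Build a drop rule in the same form as the examples above (hypothetical helper)."""
    return (
        f"nft add rule inet filter output udp sport {port} "
        f"@ih,16,16 {nft_match_value(msg_id)} drop"
    )


if __name__ == "__main__":
    for name, msg_id in MESSAGE_IDS.items():
        print(f"{name}: {nft_drop_rule(msg_id)}")
```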

softins · 2026-03-28T17:56:02Z

The next step is to try implementing the connected-mode TCP described here

ann0see · 2026-04-07T14:55:58Z

    bool         bUseTranslation             = true;
    bool         bCustomPortNumberGiven      = false;
    bool         bEnableIPv6                 = false;
+    bool         bEnableTcp                  = false;


Since the 4.0 release is still a long way off, I'd enable it by default soon (once we've tested that the basics work, of course).

No, I disagree. It's a server-only option, and most server operators will not need to enable TCP support. Only those running large directories or large servers will need to, and they also need to understand and configure their firewall requirements.

TCP support in the client will indeed be enabled by default, but will only take effect when talking to a directory or server that has enabled it.

If a server operator enables TCP without having configured their firewall correctly, client users could have problems as the server would advertise TCP support to the client, but the client could be unable to connect.

Can we not give an error message or use a fallback procedure if the TCP connection times out?

Yes, I'm sure we can. I haven't yet tested that scenario.

But it doesn't negate my view that server-side TCP support needs to be an explicit option.

softins · 2026-04-09T22:48:19Z

Well I've finished implementing everything I intended to, for directory, server and client, so it's ready for reviewing and trying out, as and when time permits (post 3.12.0).

I have a private directory and server built and running with TCP support, at newjam.softins.co.uk on the standard port 22124.

In order to demonstrate the use of TCP in a new client's connect dialog, it will be necessary to use custom firewall filters on the client end to temporarily drop incoming UDP Jamulus protocol messages containing a server list or connected clients list.

There is full forward and backward compatibility between clients and servers built with TCP support and older versions.

softins · 2026-04-10T06:34:38Z

Keeping as draft, because it will need quite a few debug messages removed before merging.


softins · 2026-06-09T15:38:04Z

Now rebased to the latest dual-stack main and tested. I haven't yet studied @pljones's review comments from a week or two ago. Will do soon.


pljones · 2026-06-11T16:51:55Z

+
+The basic summary is that TCP need only be used as a fallback when it is determined that a UDP message from a directory or server failed to reach the client, probably due to fragmentation, _and_ that the directory or server explicitly supports TCP.
+
+### Current operation when client opens Connect dialog


I'd also note that the JSONRPC Client API provides means to request the server list and client list. Both those scenarios are susceptible to the same issue.

Unless the solution materially differs, I'd avoid mentioning the UI as far as possible, except for clarification. It's just an example consumer of the response.

The solution needs to work regardless of that consumer.

(Most of the document avoids mentioning the GUI at all - it's worth trying to keep it that way throughout.)

pljones · 2026-06-11T16:54:54Z

+
+### Enhancement for TCP support
+
+1. A server (which may also be a directory) can be configured with the command-line option `--enabletcp` to enable TCP operation.


Add a note as to why this should never be made the default, if there is a reason. Otherwise, clarify why it defaults to off. (My view is that it's either innocuous, in which case it can be enabled for everyone and will either work or do nothing, or it will cause some kind of trouble that should be highlighted to people.)

pljones · 2026-06-11T16:59:56Z

+   b. There is no need for the directory to send `CLM_RED_SERVER_LIST` to the client, since the TCP connection is reliable, so the directory server just sends the `CLM_SERVER_LIST` over the TCP connection.
+
+4. When the client has received the `CLM_SERVER_LIST` over TCP, it closes the TCP connection, populates its list of servers in the connect dialog in the normal way and stops the 2.5 sec re-request timer.
+


Just to be clear here -- there is no mention of persisting the information that a Directory has TCP support for CLM_SERVER_LIST. Presumably this is not needed as, each time a CLM_REQ_SERVER_LIST UDP request is sent, the same steps - 2 to 4 - are repeated.

That's right, the information does not need to be persisted at the client end. In the connect dialog implementation, I do persist the TCP support status separately for each listed server, so that once it knows a particular server needs TCP it doesn't drop back to UDP (except I need to make it drop back to UDP if the TCP connection fails). But that's just an optimisation and only for the life of that server list and gets forgotten when changing directories or on closing the dialog.

Yeah, probably worth explicitly saying this in the doc -- and commenting the code clearly as to the design behind it. The doc is good but the code needs to have the commentary "right there", as in years to come... no one's going to read the fine manual...
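
To make the lifetime of that per-server status concrete, here is a minimal sketch of the optimisation as described in this thread. It is not the Connect dialog code: `TcpPreferenceCache` and its method names are hypothetical, and servers are keyed by a plain address string for simplicity.

```python
from typing import Set


class TcpPreferenceCache:
    """Remembers which listed servers needed TCP, for the life of one server list."""

    def __init__(self) -> None:
        self._needs_tcp: Set[str] = set()

    def mark_tcp_worked(self, server: str) -> None:
        # Only remember TCP once a list has actually been received over it.
        self._needs_tcp.add(server)

    def mark_tcp_failed(self, server: str) -> None:
        # A failed TCP attempt drops the server back to UDP.
        self._needs_tcp.discard(server)

    def transport_for(self, server: str) -> str:
        return "tcp" if server in self._needs_tcp else "udp"

    def reset(self) -> None:
        # Forget everything on a directory change or when the dialog closes.
        self._needs_tcp.clear()
```

Nothing here needs to survive beyond the current server list, which is why the design does not persist it anywhere else.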

pljones · 2026-06-11T17:05:20Z

+
+8. If the server accepts a TCP connection and receives a `CLM_REQ_CONN_CLIENTS_LIST` over it, it will process the request in the same way as for a UDP request, but will send the reply over the TCP connection.
+
+9. When the client has received the `CLM_CONN_CLIENTS_LIST` over TCP, it closes the TCP connection and updates the list of clients for that server in the GUI. However, it will note for that server that TCP is needed, and if/when the number of connected clients next changes while the connect dialog is still open, it will immediately request the updated list via TCP instead of UDP.


> However, it will note for that server that TCP is needed, and if/when the number of connected clients next changes while the connect dialog is still open, it will immediately request the updated list via TCP instead of UDP.

Explain why this makes sense here and not for the Directory and why there's no point saving this information if the UDP request works.

The server list is only fetched from the directory once, not repeatedly, after opening the connect dialog or changing directory. So there's nothing to remember there. And persisting the TCP status for each listed server is also just an optimisation and not essential, as I mentioned above.

So a Client never gets an updated server list from the Directory, whilst the dialog is open, even if another Server registers?

Hm... We should call that a bug, really. If I open the Connect dialog and the nearest server is 35ms, I'd want to know if one 15ms appeared.

pljones · 2026-06-11T17:12:45Z

+  - `RECORDER_STATE` - current state of the server-based recording. Sent when the state changes?
+  - `JITT_BUF_SIZE` - the size of the receiving jitter buffer for this connection on the server. Sent in Auto mode when the value changes.
+  - `CLM_PING_MS` - sent in response to a `CLM_PING_MS` received from the client. For client-side ping time calculation.
+  - `CONN_CLIENTS_LIST` - list of connected clients. Sent when the list changes due to a client connecting or leaving. This message could be large on a server with many clients.


Could be fixed by a new protocol message:
`CLIENT_LIST_CHANGE <action> [<channel> [<channel details>]]`
where

- `<action>` is "clear", "add", "update" or "remove"
- `<channel>` is the server channel number, only present on add/update/remove
- `<channel details>` is the player details, only present on add/update

which should never fragment...

Except when a client joins a server that already has a large number of connected clients, it will still get the complete list all at once, so I don't think CLIENT_LIST_CHANGE gains us much.

> Except when a client joins a server that already has a large number of connected clients

Maybe a V4 Client could first send REQ_CLIENT_LIST_CHANGES and, if it got the CLIENT_LIST_CHANGE "clear" back, it wouldn't request the client list and the Server wouldn't send the full list or the Server would at least stop sending them. (That would cut traffic generally, too.)

Pre-V4 support carries on the same way - and hasn't got TCP anyway, so TCP can't fix it.
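
As a rough illustration of the `CLIENT_LIST_CHANGE` idea discussed above, a receiver could apply the four actions to its local copy of the client list like this. The message is only a proposal in this thread and does not exist in the protocol; the function name and the use of a plain string for the channel details are assumptions made for the sketch.

```python
from typing import Dict, Optional


def apply_client_list_change(
    clients: Dict[int, str],
    action: str,
    channel: Optional[int] = None,
    details: Optional[str] = None,
) -> None:
    """Apply one proposed CLIENT_LIST_CHANGE to a channel -> details mapping, in place."""
    if action == "clear":
        clients.clear()
    elif action in ("add", "update"):
        # Both the channel number and the player details are required here.
        if channel is None or details is None:
            raise ValueError(f"{action} needs a channel and channel details")
        clients[channel] = details
    elif action == "remove":
        if channel is None:
            raise ValueError("remove needs a channel")
        clients.pop(channel, None)  # removing an unknown channel is a no-op
    else:
        raise ValueError(f"unknown action: {action}")
```

Each change carries at most one channel's details, which is why such a message would stay small. A client joining a busy server would still need one change per existing channel (or the full list) to build its initial state, as noted above.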

pljones · 2026-06-11T17:13:25Z

+
+By sending the `CLM_TCP_SUPPORTED` message immediately *after* sending a potentially large list of servers or connected clients, it allows a client easily to determine whether or not it needs to fall back to TCP without the necessity of timeouts or other delays. It will only need to use TCP if it has not already succeeded in receiving the message over UDP.
+
+## CONNECTED MODE


I'm tempted to get this into a separate PR.

Why? That just feels like more work for no gain. Temptation can always be resisted!

Mostly because I can see alternative approaches, so I'm more cautious. With the pre-connection, it's not arguable - it needs doing, we've seen it break and it affects a lot of people. The audience for this part is different.

pljones · 2026-06-11T17:14:40Z

+
+## OTHER CONSIDERATIONS
+
+The server should only be configured to offer TCP by specifying `--enabletcp` if the server operator has also configured any firewall to allow the inbound TCP connections.


I think that's upside down: if the Server operator specifies --enabletcp, then they will need to make sure their system allows the traffic. They can specify --enabletcp as much as they like, it just won't get traffic from anywhere if it's blocked... Jamulus would not fail if it can't connect to the Client over TCP: the Client might not have incoming TCP access, so the connection would fail at that point.

pljones · 2026-06-11T17:17:05Z

+
+The server should only be configured to offer TCP by specifying `--enabletcp` if the server operator has also configured any firewall to allow the inbound TCP connections.
+
+If a server were to offer TCP to the client, but the server's firewall didn't allow the incoming TCP connection, the client request for TCP would wait until its request times out.


It should handle it like any other dropped request. It'll go around and try UDP.

This is why I'm nervous about "remember I was told to try TCP" until it's actually worked.

Yes, I'm thinking of splitting CFM_TCP into CFM_TCP_REQUEST and CFM_TCP_RESULT, like it is for UDP.
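
The "go around and try UDP" behaviour could look roughly like the sketch below. It is not the client code in this PR: `fetch_list_with_fallback` and the two fetcher callables are hypothetical, and the only assumption about the TCP fetcher is that it raises `OSError` on failure, which is what Python's `socket.create_connection` does on a timeout or a refused connection.

```python
from typing import Callable, Tuple

# A fetcher takes (host, port) and returns the raw list message.
Fetcher = Callable[[str, int], bytes]


def fetch_list_with_fallback(
    host: str, port: int, via_tcp: Fetcher, via_udp: Fetcher
) -> Tuple[str, bytes]:
    """Try TCP first; if the connection fails or times out, fall back to UDP."""
    try:
        return "tcp", via_tcp(host, port)
    except OSError:
        # Timeouts and refused connections are OSError subclasses, so a
        # blocked TCP port is treated like any other dropped request.
        return "udp", via_udp(host, port)
```

The returned transport name tells the caller whether TCP actually worked, so it can avoid remembering "use TCP" for a server until that has been confirmed.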

pljones · 2026-06-11T17:19:18Z

+
+If a server were to offer TCP to the client, but the server's firewall didn't allow the incoming TCP connection, the client request for TCP would wait until its request times out.
+
+This has to be the responsibility of the server/directory operator, and is why TCP operation must be controlled by a command-line option, rather than always enabled. The operator should only enable TCP in the Jamulus server if they know their environment has been configured to support it.


I tend to disagree with it being a reason for not always enabling it, as stated. Things should carry on working exactly as they would without it, except that an extra exchange happens and fails. Enabling IPv6 by default does the same.

pljones · 2026-06-11T17:21:15Z

+
+This has to be the responsibility of the server/directory operator, and is why TCP operation must be controlled by a command-line option, rather than always enabled. The operator should only enable TCP in the Jamulus server if they know their environment has been configured to support it.
+
+Most operators of small servers or directories will not need to be concerned with TCP at all. _The only server operators who will need to enable TCP support are those running large directories (e.g. Volker, Peter) or those running a large server designed to support many simultaneous client connections._


This bit is the real reason. I think it should be explained primarily like this, rather than in terms of the technicalities.

It's pointless doing it for everyone. It doesn't matter, it's just pointless.

pljones · 2026-06-11T17:27:34Z

+
+The reason for using a TCP connection in an active session is just to provide a reliable path for delivering a list of connected clients that could be large and subject to fragmentation (if it is sent over UDP). So the established TCP connection is only used to deliver client lists, and not other protocol messages.
+
+Therefore, if the server has an active TCP connection from the client, it will use the connectionless `CLM_CONN_CLIENTS_LIST` message to deliver updates for the connected client list. If there is no active TCP connection, updates will be delivered using the connected-mode `CONN_CLIENTS_LIST` over UDP as at present.


Might be worth noting at this point (as it's in the flow and notable) why the two messages are different - i.e. why it doesn't bother to use the CONN_CLIENTS_LIST.
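
The selection rule in the quoted paragraph can be written down as a small sketch. The function name is hypothetical and this is not the server code; it only restates which message and transport the document says are used in each case.

```python
from typing import Tuple


def client_list_update_route(has_active_tcp: bool) -> Tuple[str, str]:
    """Return (message name, transport) for a connected-client-list update."""
    if has_active_tcp:
        # Reliable path: the connectionless message is sent over the TCP connection.
        return "CLM_CONN_CLIENTS_LIST", "tcp"
    # No TCP connection: the connected-mode message over UDP, as at present.
    return "CONN_CLIENTS_LIST", "udp"
```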

softins added this to the Release 4.0.0 milestone Mar 11, 2026

softins self-assigned this Mar 11, 2026

softins mentioned this pull request Mar 11, 2026

Support TCP for protocol messages #3242

Open

softins force-pushed the tcp-protocol branch 4 times, most recently from 5e1a658 to 0ae51e2 Compare March 16, 2026 13:05

softins linked an issue Mar 16, 2026 that may be closed by this pull request

Support TCP for protocol messages #3242

Open

softins added the feature request Feature request label Mar 16, 2026

softins force-pushed the tcp-protocol branch 3 times, most recently from 7ad1d1f to d939e5b Compare March 26, 2026 17:38

ann0see self-requested a review April 7, 2026 14:51

ann0see reviewed Apr 7, 2026: comment thread on src/tcpserver.h

ann0see reviewed Apr 7, 2026: comment thread on src/connectdlg.cpp (outdated)

ann0see reviewed Apr 7, 2026: comment thread on src/connectdlg.cpp (outdated)

ann0see assigned ann0see and pljones Apr 9, 2026

ann0see added the bug Something isn't working label Apr 9, 2026

github-project-automation Bot added this to Tracking Apr 9, 2026

github-project-automation Bot moved this to Triage in Tracking Apr 9, 2026

ann0see moved this from Triage to In Progress in Tracking Apr 9, 2026

softins marked this pull request as ready for review April 9, 2026 22:48

softins marked this pull request as draft April 10, 2026 06:30

softins added 12 commits June 8, 2026 16:34

b41bc21 Be specific about bDisconAfterRecv for TCP
e38ceb3 Rework TCP session-mode connection
30a4f4b Minor comment updates
ba3c4b3 Add support for sending Empty Message over TCP
e29c55c Implement keepalive over session long TCP connection
20ee030 Clarify comment
11d314b Make CTcpConnection work in serveronly mode. Constructor for CTcpConnection made polymorphic for client and server.
bcb4b78 Add timeout for TCP connection
86149e3 Add an idle timeout on the server side
f1dcc0a Add document describing TCP operation
bbc0905 Update copyright headers for new source files
eda100e Use new way to discover IPv6 availability

softins force-pushed the tcp-protocol branch from 2d5ec89 to eda100e Compare June 9, 2026 15:34

8b14d66 Only quote port number in TCP server start message. The displayed address of 0.0.0.0 was misleading for a dual-stack socket

ann0see reviewed Jun 9, 2026: comment thread on docs/TCP.md

dded0ca Small changes to address review comments

softins marked this pull request as draft June 10, 2026 19:25

pljones reviewed Jun 11, 2026: comment thread on docs/TCP.md

pljones reviewed Jun 11, 2026


