Hot questions for Using ZeroMQ in chumak

Question:

I am using Chumak in erlang, opening a ROUTER socket.

I have a handful (4 or so) clients that use the Python zmq library to send REQ requests to this server.

Things work fine most of the time, but sometimes a client has disconnect issues (the client code reconnects automatically, and that works). I've found that when an error occurs on one client connection, it seems to spread to the others as well, and I get a lot of ** {{noproc,{gen_server,call,[<0.31596.16>,incomming_queue_out]}}, on the server.

On the server side, I'm just opening one chumak socket and looping:

{ok, Sock} = chumak:socket( router ),
{ok, _}    = chumak:bind( Sock, tcp, "0.0.0.0", ?PORT ),
spawn_link( fun() -> loop( Sock ) end ),
...

loop( Sock ) ->
    {ok, [Identity, <<>>, Data]} = chumak:recv_multipart( Sock ),
    ...

The ZeroMQ docs seem to imply that one listening socket is enough unless I have many clients. Do I misunderstand them?


Answer:

No, there is no need to increase the number of Socket instances.

Abstractions are great at sparing a typical user the need to understand all the details under the hood. That ease of life stops as soon as the user has to tune performance or debug incidents.

Let's step through it this way:
- Unless truly enormous data payloads have to be moved, a single ROUTER-AccessPoint on one Socket-instance is quite enough for, say, tens, hundreds, or even thousands of REQ-AccessPoints on the client side(s).
- Yet such numbers raise the performance requirements on the ROUTER-side Context-instance, which must remain capable of handling everything the (prescribed) Scalable Formal Communication Archetype demands, in due time and in a fair fashion.

This means one can soon benefit from spawning Context-instances with more than the initial default single IO-thread. In all my high-performance setups I also advocate using zmq.AFFINITY mappings, so as to squeeze maximum performance out of the highest-priority Socket-instances, while leaving non-critical ones to share a common subset of the Context-instance's IO-thread pool.
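As a rough illustration of the IO-thread and affinity advice, here is a minimal pyzmq sketch. It assumes a libzmq-based endpoint: these are libzmq options as exposed by pyzmq, and chumak, being a separate Erlang implementation, may not offer equivalents. The thread count of 2, the socket roles, and the function name make_context_with_affinity are assumptions made up for the example.

```python
import zmq


def make_context_with_affinity():
    """Create a context with two I/O threads and map sockets onto them."""
    # Assumption: two I/O threads; the libzmq default is one.
    ctx = zmq.Context(io_threads=2)

    # AFFINITY is a bitmask over I/O threads: bit 0 is thread 0, bit 1 is
    # thread 1. It only affects connections created after it is set, so it
    # should be set before bind() or connect().
    critical = ctx.socket(zmq.ROUTER)
    critical.setsockopt(zmq.AFFINITY, 0b01)    # handled by I/O thread 0 only

    background = ctx.socket(zmq.ROUTER)
    background.setsockopt(zmq.AFFINITY, 0b10)  # handled by I/O thread 1 only

    return ctx, critical, background
```

Any further non-critical sockets would get the same 0b10 mask, so that only the high-priority socket uses I/O thread 0.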


Next comes RAM

Yes, the toys occupy memory. Check all the .{RCV|SND}BUF, .MAXMSGSIZE, .{SND|RCV}HWM, .BACKLOG, .CONFLATE
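A hedged sketch of what bounding memory use might look like with pyzmq follows. All numeric values are placeholders rather than recommendations, and make_bounded_router is a hypothetical helper name. Again, these are libzmq options and are not known to map onto chumak.

```python
import zmq


def make_bounded_router(ctx):
    """ROUTER socket with explicit queue and message-size limits."""
    sock = ctx.socket(zmq.ROUTER)
    # All values below are illustrative assumptions.
    sock.setsockopt(zmq.RCVHWM, 1000)         # inbound queue limit per connected peer
    sock.setsockopt(zmq.SNDHWM, 1000)         # outbound queue limit per connected peer
    sock.setsockopt(zmq.MAXMSGSIZE, 1 << 20)  # peers sending more than 1 MiB get disconnected
    sock.setsockopt(zmq.BACKLOG, 128)         # queue length for pending TCP connections
    return sock
```

The options are applied before bind() so that they are in force for every connection the socket accepts.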


Next comes LINK-MANAGEMENT

Do not hesitate to optimise .IMMEDIATE, .{RCV|SND}BUF, .RECONNECT_IVL, .RECONNECT_IVL_MAX, .TCP_KEEPALIVE, .TCP_KEEPALIVE_CNT, .TCP_KEEPALIVE_INTVL, .TCP_KEEPALIVE_IDLE

Always set .LINGER right upon instantiation, so that drop-outs cease to be lethal.
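On the Python client side, the link-management options above could be applied roughly as follows. This is a sketch under assumptions: the helper name make_req_socket and every numeric value (linger of 0, reconnect intervals, keepalive timings) are illustrative choices, not values taken from the question.

```python
import zmq


def make_req_socket(ctx, endpoint=None):
    """REQ socket with explicit linger, reconnect and keepalive settings."""
    sock = ctx.socket(zmq.REQ)

    # Set LINGER immediately after creation: 0 discards unsent messages on
    # close, so a dead peer cannot block close() or context termination.
    sock.setsockopt(zmq.LINGER, 0)

    # Reconnect back-off: start at 100 ms, grow up to a 5 s ceiling.
    sock.setsockopt(zmq.RECONNECT_IVL, 100)
    sock.setsockopt(zmq.RECONNECT_IVL_MAX, 5000)

    # TCP keepalive, so that silently dropped links are detected by the OS.
    sock.setsockopt(zmq.TCP_KEEPALIVE, 1)
    sock.setsockopt(zmq.TCP_KEEPALIVE_IDLE, 30)   # seconds idle before first probe
    sock.setsockopt(zmq.TCP_KEEPALIVE_INTVL, 10)  # seconds between probes
    sock.setsockopt(zmq.TCP_KEEPALIVE_CNT, 3)     # failed probes before giving up

    if endpoint is not None:
        sock.connect(endpoint)
    return sock
```

A client would then call something like make_req_socket(ctx, "tcp://server-host:5555"), where the host and port are placeholders for the real server address.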


Next may come a few defensive and performance helper tools:

.PROBE_ROUTER, .TCP_ACCEPT_FILTER, .TOS, .HANDSHAKE_IVL


Next step?

If no memory-related troubles remain, and given that reconnections were mentioned, my suggestion would be to set up .IMMEDIATE and possibly let the ROUTER benefit from explicit .PROBE_ROUTER signalling.
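A minimal sketch of that last suggestion, on the pyzmq client side, might look like this. The helper names make_probing_req and is_probe are hypothetical, and the assumption that a probe reaches the ROUTER as an identity frame followed by a single empty frame has not been verified against chumak.

```python
import zmq


def make_probing_req(ctx):
    """REQ socket that queues only on live connections and announces itself."""
    sock = ctx.socket(zmq.REQ)
    sock.setsockopt(zmq.LINGER, 0)
    # IMMEDIATE=1: queue messages only on completed connections, so nothing
    # piles up for a peer that is still (re)connecting.
    sock.setsockopt(zmq.IMMEDIATE, 1)
    # PROBE_ROUTER=1: send an empty message on every new connection, which
    # lets the ROUTER learn this peer's identity right away.
    sock.setsockopt(zmq.PROBE_ROUTER, 1)
    return sock


def is_probe(frames):
    """True for [identity, b""]: an identity frame plus one empty frame."""
    return len(frames) == 2 and frames[1] == b""
```

The receiving application has to filter these probes out. Note that the strict match {ok, [Identity, <<>>, Data]} in the question's loop would not match a two-frame probe, so the server would need an extra clause to skip it.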