Hot questions for Using ZeroMQ in design patterns

Question:

I'm working on a program that will have multiple threads requiring information from a web-service that can handle requests such as: "Give me [Var1, Var2, Var3] for [Object1, Object2, ... Object20]".

The resulting reply will be, in this case, a 20-node XML document (one node per object), each node with 3 sub-nodes (one per var).

My challenge is that each request made of this web-service costs the organization money and, whether it be for 1 var for 1 object or 20 vars for 20 objects, the cost is the same.

So, that being the case, I'm looking for an architecture that will:

  1. Create a request on each thread as data is required
  2. Have a middle-tier "aggregator" that gets all the requests
  3. Once X number of requests have been aggregated (or a time limit has been reached), the middle-tier performs a single request to the web-service
  4. Middle-tier receives reply from web-service
  5. Middle-tier routes information back to waiting objects

Currently, my thought is to use a library such as NetMQ, with my middle-tier as a server and each thread as a poller, but I'm getting stuck on the actual implementation and, before going too far down the rabbit hole, am hoping there's already a design pattern / library out there that does this substantially more efficiently than what I'm conceiving of.

Please understand that I'm a noob, and, so, ANY help / guidance would be really greatly appreciated!!

Thanks!!!


Answer:

Overview

From the architectural point of view, you just sketched out a good approach to the problem (a minimal code sketch of such a proxy follows this list):

  1. Insert a proxy between the requesting applications and the remote web service
  2. In the proxy, put the requests in the request queue, until at least one of the following events occurs
    1. The request queue reaches a given length
    2. The oldest request in the request queue reaches a certain age
  3. Group all requests in the request queue in one single request, removing duplicate objects or attributes
  4. Send this request to the remote web service
  5. Move the requests into the (waiting for) response queue
  6. Wait for the response until one of the following occurs
    1. the oldest request in the response queue reaches a certain age (time out)
    2. a response arrives
  7. Get the response (if applicable) and map it to the corresponding requests in the response queue
  8. Answer all requests in the response queue that have an answer
  9. Send a timeout error for all requests older than the timeout limit
  10. Remove all answered requests from the response queue
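
A minimal Python sketch of that queueing logic, using plain threads and queues rather than any particular messaging library; the names BATCH_SIZE, MAX_WAIT_S and call_web_service, as well as the timing values, are hypothetical placeholders, not anything from your stack:

import queue
import threading
import time
from concurrent.futures import Future

BATCH_SIZE = 20        # hypothetical: flush once this many requests are queued
MAX_WAIT_S = 0.5       # hypothetical: flush once the oldest request is this old

request_queue = queue.Queue()   # items: (Future, objects, variables)

def call_web_service(objects, variables):
    # Hypothetical placeholder for the single, billed web-service call.
    return {o: {v: f"{o}.{v}" for v in variables} for o in objects}

def aggregator():
    batch, oldest = [], None
    while True:
        wait = MAX_WAIT_S if oldest is None else max(0.0, MAX_WAIT_S - (time.monotonic() - oldest))
        try:
            batch.append(request_queue.get(timeout=wait))
            if oldest is None:
                oldest = time.monotonic()
        except queue.Empty:
            pass
        if batch and (len(batch) >= BATCH_SIZE or time.monotonic() - oldest >= MAX_WAIT_S):
            objects = sorted({o for _, objs, _ in batch for o in objs})          # de-duplicate
            variables = sorted({v for _, _, var_list in batch for v in var_list})
            reply = call_web_service(objects, variables)                         # one paid call
            for fut, objs, var_list in batch:                                    # route answers back
                fut.set_result({o: {v: reply[o][v] for v in var_list} for o in objs})
            batch, oldest = [], None

def request(objects, variables):
    # Called from any worker thread; blocks until the aggregated answer arrives.
    fut = Future()
    request_queue.put((fut, objects, variables))
    return fut.result(timeout=10)

threading.Thread(target=aggregator, daemon=True).start()
print(request(["Object1", "Object2"], ["Var1", "Var2", "Var3"]))

Each calling thread hands the proxy a Future and blocks on it, so routing the answer back to the waiting object is just a matter of resolving the right Future (or letting it time out).
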
Technology

You probably won't find an off-the-shelf product or framework that exactly matches your requirements, but there are several frameworks / architectural patterns that you can use to build a solution.

C#: RX and LINQ

When you want to use C#, you could use reactive extensions for getting the timing and the grouping right.

You could then use LINQ to select the attributes from the requests to build the response and to select the requests in the response queue that either match to a certain part of a response or that timed out.

Scala/Java: Akka

You could model the solution as an actor system, using several actors:

  1. An actor as the gateway for the requests
  2. An actor holding the request queue
  3. An actor sending the request to the remote web service and getting the response back
  4. An actor holding the response queue
  5. An actor sending out the responses or the timeouts

An actor system makes it easy to deal with concurrency and to separate the concerns in a testable way.

When using Scala, you could use its "monadic" collection API (filter, map, flatMap) to do basically the same as with LINQ in the C# approach.

The actor approach really shines when you want to test the individual elements. It is very easy to test each actor individually, without having to mock the whole workflow.
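
Not Akka, but a rough Python analogue of that split, with one thread and one mailbox per actor, may make the separation of concerns and the per-actor testability concrete; the class names and the batch size are purely illustrative:

import queue
import threading
import time

class Actor(threading.Thread):
    # A tiny thread-per-actor analogue: one mailbox, one message handler.
    def __init__(self):
        super().__init__(daemon=True)
        self.inbox = queue.Queue()
    def tell(self, message):
        self.inbox.put(message)
    def run(self):
        while True:
            self.receive(self.inbox.get())
    def receive(self, message):
        raise NotImplementedError

class RequestQueueActor(Actor):
    # Actor 2 above: holds the request queue, forwards a batch once it is full.
    def __init__(self, downstream, batch_size=3):
        super().__init__()
        self.downstream, self.batch_size, self.batch = downstream, batch_size, []
    def receive(self, message):
        self.batch.append(message)
        if len(self.batch) >= self.batch_size:
            self.downstream.tell(list(self.batch))
            self.batch.clear()

class WebServiceActor(Actor):
    # Actor 3 above: would perform the single remote web-service call.
    def receive(self, batch):
        print("one web-service call for:", batch)

web_service = WebServiceActor()
web_service.start()
request_queue = RequestQueueActor(downstream=web_service)
request_queue.start()
for i in range(3):
    request_queue.tell(f"request-{i}")      # testing an actor = feeding its mailbox
time.sleep(0.2)                             # let the daemon threads print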

Erlang/Elixir: Actor System

This is similar to the Akka approach, just with a different (functional!) language. Erlang / Elixir has a lot of support for distributed actor systems, so when you need an ultra stable or scalable solution, you should look into this one.

NetMQ / ZeroMQ

On its own, this is probably too low-level and provides too little infrastructure. When you use an actor system, you could still bring in NetMQ / ZeroMQ as the transport layer.
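
If you do go that route, a minimal pyzmq sketch of ZeroMQ as the in-process transport between two such components could look like this; the inproc address and the message contents are illustrative:

import threading
import time
import zmq

context = zmq.Context.instance()

# Transport "server" side: with inproc, bind before any connect is attempted.
push = context.socket(zmq.PUSH)
push.bind("inproc://requests")

def worker():
    # Transport "client" side: receives work items over the in-process pipe.
    pull = context.socket(zmq.PULL)
    pull.connect("inproc://requests")
    while True:
        print("worker got:", pull.recv_json())

threading.Thread(target=worker, daemon=True).start()
push.send_json({"objects": ["Object1"], "vars": ["Var1", "Var2", "Var3"]})
time.sleep(0.2)                   # give the daemon thread a moment to print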

Question:

I have a project that needs to be written in Perl so I've chosen ZeroMQ.

There is a single client program, generating work for a variable number of workers. The workers are real human operators who will complete a task and then request a new one. The job of the client program is to keep all available workers busy all day. It's a call center.

So each worker can only process one task at a time, and there may be some delay before they request a new task. And the number of workers may vary during the day.

The client needs to keep a queue of tasks ready to give to workers as and when they request them. Whenever the client queue gets low the client can generate more tasks to top-up the queue.

What design pattern (i.e. what ZeroMQ Socket combination) should I use for this? I've skimmed through all the patterns in the 0MQ Guide and can't find anything that matches this.

Thanks


Answer:

Sure. ... there is not a single, solo Archetype to match the whole Requirement List; rather, use several ZeroMQ Scalable Formal Communication Patterns together.

A typical software Project uses many ZeroMQ sockets ( with various Archetypes ) as a certain form of node-to-node signalling and message-passing platform.

It is fair to note that automated Load-Balancers may work fine for automated processes, but not always so for processes executed by Humans or interacting with Humans.

Humans ( both the Call centre Agents and their Line-Supervisors ) introduce another layer of requirements - sometimes a need for workload-distribution logic that is not just Round-Robin, sometimes a need to switch a call from Agent A to another Agent B, which a trivial archetype will simply not be capable of and might get into trouble with, if its hardwired logic runs into a collision ( a mutually blocked REQ-REP stale-mate being one such example ).

So do not wait for one super-powered archetype; rather create a smart network of behaviours that will cover the event-handling your distributed-computing problem requires.
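
As one concrete building block (deliberately not the whole solution), here is a pyzmq sketch of the "free Agent asks, dispenser answers" behaviour: every Agent terminal is a REQ socket that requests a task only when its human is free, and the client program is a ROUTER answering from its own task queue. The endpoint, identities and task contents are illustrative:

import threading
import time
import zmq

context = zmq.Context.instance()

def agent_terminal(agent_id):
    # One (human) Agent's terminal: ask for work only when the Agent is free.
    req = context.socket(zmq.REQ)
    req.identity = agent_id.encode()
    req.connect("tcp://localhost:5555")
    for _ in range(3):
        req.send_string("READY")              # "give me my next task"
        task = req.recv_string()
        print(agent_id, "handling:", task)
        time.sleep(0.1)                       # the human handles the call here

# Client program: a ROUTER answering each READY with the next queued task.
router = context.socket(zmq.ROUTER)
router.bind("tcp://*:5555")

tasks = ["call-%d" % n for n in range(9)]     # topped up whenever it runs low

for name in ("agent-A", "agent-B", "agent-C"):
    threading.Thread(target=agent_terminal, args=(name,), daemon=True).start()

while tasks:
    identity, empty, ready = router.recv_multipart()          # a free Agent spoke up
    router.send_multipart([identity, b"", tasks.pop(0).encode()])
time.sleep(0.2)                               # let the last replies get printed

Because the ROUTER only ever answers a READY it has actually received, distribution automatically follows the Agents' real pace instead of a blind Round-Robin; the supervisor-driven re-routing and failure handling mentioned above still have to be layered on top.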

There are many other aspects one ought to learn before taking the first ZeroMQ socket into service:

  • failure resilience

  • performance scaling

  • latency-profiling ( high-priority voice-traffic, vs. low-priority logging )

  • watchdog acknowledgements and timeout situations handling

  • cross-compatibility issues ( version 2.1x vs 3.x vs 4.+ API )

  • processing robustness against a malfunctioning agent / malicious attack / deadly spurious traffic storms ... to name just a few of the problems

all of which have some built-in support in the ZeroMQ toolbox, while some may need advanced thinking so as to handle the known constraints; a small example of the watchdog / timeout case follows below.
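
For the watchdog / timeout item, the usual pyzmq building block is a Poller with a deadline instead of a blocking recv(); the endpoint and the 2-second limit below are illustrative:

import zmq

context = zmq.Context.instance()
req = context.socket(zmq.REQ)
req.connect("tcp://localhost:5555")          # illustrative endpoint

poller = zmq.Poller()
poller.register(req, zmq.POLLIN)

req.send_string("PING")                      # e.g. a heartbeat / acknowledgement probe
if dict(poller.poll(timeout=2000)).get(req) == zmq.POLLIN:   # wait at most 2 seconds
    print("peer alive:", req.recv_string())
else:
    # Timeout: never reuse a REQ socket stuck mid-conversation; close and rebuild it.
    req.setsockopt(zmq.LINGER, 0)
    req.close()
    print("peer silent for 2 s -> escalate, reconnect or fail over")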


The Best Next Step?

I would advocate the fabulous Pieter HINTJENS' book "Code Connected, Volume 1" -- for everyone who is serious about distributed processing, this is a must-read -- do not hesitate to check my other posts to find a direct URL to a PDF version of this ZeroMQ Bible.

Worth time and one's tears and sweat.

Question:

I have something like a remote machine that performs heavy computations and a client machine that sends tasks to it. Output results are very big, from megabytes to gigabytes, and come in chunks over a long time period. So it looks like this: the client sends a task and then needs to receive these chunks as they arrive, since each chunk is already useful on its own (one request - multiple responses). How do I realize this pattern in ZeroMQ?


Answer:

Maybe I have read the above problem definition wrong, but as it stands, it seems to me that the main concern is how to accommodate a message flow between a pair of hosts ( not a Broker fan-out to 1+ Workers using the classical DEALER/ROUTER Scalable Formal Communication Pattern ): a client-machine ( sending one big-computing-task request and "waiting" for a flow of partial results ) talking to an HPC-machine ( receiving a TaskJOB, processing it and delivering a flow of non-synchronised messages, unconstrained in time and size, back to the client-machine ).

For such a 1:1 case, with 1-Job : many-partialJobResponses, the setup may benefit from a joint messaging and signalling infrastructure with several actual sockets under the hood, as sketched below:

Signalling:
clientPUSH   |-> hpcPULL    // new TaskJOB|-> |
clientPULL <-|   hpcPUSH    //              <-|ACK_BEGIN
clientPULL <-|   hpcPUSH    //              <-|KEEPALIVE_WATCHDOG + PROGRESS_%
clientPULL <-|   hpcPUSH    //              <-|KEEPALIVE_WATCHDOG + PROGRESS_%
...                         //                |...
clientPULL <-|   hpcPUSH    //              <-|KEEPALIVE_WATCHDOG + PROGRESS_%
clientPULL <-|   hpcPUSH    //              <-|KEEPALIVE_WATCHDOG + PROGRESS_%
clientPULL <-|   hpcPUSH    //              <-|ACK_FINISH         + LAST_PAYLOAD#
clientPUSH   |-> hpcPULL    // new TaskJOB|-> |
clientPULL <-|   hpcPUSH    //              <-|ACK_BEGIN
...                         //                |...
clientPULL <-|   hpcPUSH    //              <-|ACK_FINISH         + LAST_PAYLOAD#

Messaging:
clientRECV <-|   hpcXMIT    //              <-|FRACTION_OF_A_FAT_RESULT_PAYLOAD#i
clientACK    |-> hpcACK     // #i POSACK'd|-> |
clientRECV <-|   hpcXMIT    //              <-|FRACTION_OF_A_FAT_RESULT_PAYLOAD#j
clientRECV <-|   hpcXMIT    //              <-|FRACTION_OF_A_FAT_RESULT_PAYLOAD#k
clientACK    |-> hpcACK     // #k POSACK'd|-> |
clientACK    |-> hpcACK     // #j   NACK'd|-> |
clientRECV <-|   hpcXMIT    //              <-|FRACTION_OF_A_FAT_RESULT_PAYLOAD#j
clientACK    |-> hpcACK     // #j POSACK'd|-> |
clientACK    |-> hpcACK     // #u   NACK'd|-> |               // after ACK_FINISH
clientACK    |-> hpcACK     // #v   NACK'd|-> |               // after ACK_FINISH
clientACK    |-> hpcACK     // #w   NACK'd|-> |               // after ACK_FINISH
clientACK    |-> hpcACK     // #x   NACK'd|-> |               // after ACK_FINISH
clientRECV <-|   hpcXMIT    //              <-|FRACTION_OF_A_FAT_RESULT_PAYLOAD#x
clientACK    |-> hpcACK     // #x POSACK'd|-> |
clientRECV <-|   hpcXMIT    //              <-|FRACTION_OF_A_FAT_RESULT_PAYLOAD#u
clientACK    |-> hpcACK     // #u POSACK'd|-> |
...                         //                | ...      
clientRECV <-|   hpcXMIT    //              <-|FRACTION_OF_A_FAT_RESULT_PAYLOAD#w
clientACK    |-> hpcACK     // #w POSACK'd|-> |

Again, this uses pairs of PUSH/PULL sockets as (internally) state-less messaging automata, while allowing one to create one's own, higher-level Finite-State-Automaton for a self-healing message flow: handling the controlled fragmentation of the FAT_RESULT into easier-to-swallow payloads ( remember one of the ZeroMQ maxims: rather live with a Zero-Guarantee than build an un-scalable mastodon, which the evolutionary nature of the wild ecosystem will kill anyway ) and also providing some level of reactive re-transmits on demand.
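
A stripped-down pyzmq sketch of just the happy-path half of that flow: the HPC side PUSH-es the ACK_BEGIN, the payload fragments and the ACK_FINISH, while the client PULL-s them as they arrive. The ports, frame layout and chunk size are illustrative, and the POSACK/NACK re-transmit channel from the diagram above is omitted:

import threading
import zmq

context = zmq.Context.instance()
CHUNK = 4                         # illustrative fragment size (think MBs in reality)

def hpc_side():
    # HPC machine: receive one TaskJOB, stream back many partial results.
    job_pull = context.socket(zmq.PULL)
    job_pull.bind("tcp://*:6000")
    out_push = context.socket(zmq.PUSH)
    out_push.bind("tcp://*:6001")
    job = job_pull.recv_string()
    out_push.send_multipart([b"ACK_BEGIN", job.encode()])
    fat_result = ("%s-result-" % job) * 3       # stands in for GBs of output
    for i in range(0, len(fat_result), CHUNK):
        out_push.send_multipart([b"PAYLOAD", str(i).encode(), fat_result[i:i + CHUNK].encode()])
    out_push.send_multipart([b"ACK_FINISH", str(len(fat_result)).encode()])

threading.Thread(target=hpc_side, daemon=True).start()

# Client machine: send one job, then consume the fragments as they become useful.
job_push = context.socket(zmq.PUSH)
job_push.connect("tcp://localhost:6000")
result_pull = context.socket(zmq.PULL)
result_pull.connect("tcp://localhost:6001")
job_push.send_string("TaskJOB-1")
while True:
    frames = result_pull.recv_multipart()
    print(frames[0].decode(), frames[1:])
    if frames[0] == b"ACK_FINISH":
        break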

Even smarter multi-agent setups are not far from what is sketched here and can further increase the processing throughput ( e.g. a FAT_RESULT DataFlow Curator agent, separate from the HPC_MAIN, offloading the HPC platform's resources so that the next TaskJOB can start immediately, etc. ).