Design Distributed Message Queue
Last updated
Last updated
Ref: original video
Pros:
easier and faster to implement
Cons:
harder to deal with consumer service failures
What happens if LB goes down?
LB uses primary and secondry nodes.
The primary node accepts connections and serves requests while the secondary node monitors the primary. If primary node doesn't respond, the secondary node takes over.
What if traffic increases and reaches LB's limit?
As for scalablity concerns, a concept fo multiple VIPs can be utilized.
By spreading load balancers across several data centers, we improve both avalibility and performance.
API is a good option to control over queue configuration parameters. Delete queue is a little bit controversial as it may cause a lot of harm and must be executed with caution.
One option is not to delete a message right after it was consumed. Messages can be deleted several days later by a job. This is used by Kafka.
We need to maintain some kind of an order for messages in the queue and keep track of offset, which is the position of a message within a queue.
Another option is used by Amazon SQS. Messages are also not deleted immediately, but marked as invisible. So other consumers may not get already retrieved message.
Consumer that retrieved the message, needs to then call delete message API to delete the messages from a backend host.
And if the message was not explicitly deleted by a consumer, message becomes visible and may be delivered and processed twice.
Synchronously replication: when backend host receives new message, it waits until data is replicated to other hosts. And only if replication is fully completed, successful response is returned to a producer.
higher durability but with a cost of higher latency for send message operation.
Asynchronous replication: response is returned back to a producer as soon as a message is stored on a single backend host. Message is later replicated to other hosts.
more performant, but not guarantee that message will survive backend host failure.
At most once: when messages may be lost but are never redelivered.
At least once: when messages are never lost but maybe redelivered.
Exactly once: when each message is delivered once and only once.
Pull model: consumer constantly sends retrieve message requests and when new message is available in the queue, it is sent back to a consumer.
Push model: consumer is not constantly bombarding FrontEnd service with receive calls. Instead, consumer is notified as soon as new message arrives to the queue.
From producer side, pull is easy to implement compared to push. But from a consumer perspective, we need to do more work if we pull.
It's hard to maintain the order. Some system either not guarantee strict order or have limitations around throughput.
Encryption using SSL over HTTPS helps to protect messages in transit.
We may also encrypt messages while storing them on backend hosts.
We need to monitor health of our distributed queue system and give customers ability to track state of their queues.
Each service we built has to emit metrics and write log data.
Yes. Every component is scalable. When load increases, we just add more load balancers, more FrontEnd hosts, more Metadata service cache shards, more backend clusters and hosts.
Yes. No single point of failure. Each component is deployed across several data centers.
Yes. Each individual microservice needs to be fast.
Yes. We replicate data while storing and ensure messages are not lost during the transfer from a producer to a consumer.