Message Queues
Messaging problems
- What if a topic gets too big for one computer?
- What if one computer is not reliable enough?
- How strongly can we guarantee delivery?
Benefits
- Enabling Asynchronous Processing
  - Interacting with remote servers - when you don’t want to depend on the remote server’s availability, push a message to the queue and perform the job once the remote server is available.
  - Improving the performance and availability of critical requests - instead of performing a long-running job, we can just publish a message to the queue and immediately respond to the user.
  - Resource-intensive work
- Easier Scalability - we could scale based on the number of messages in the queue.
- Evening Out Traffic Spikes - when there is a spike of traffic, just postpone the extra messages and handle them later.
- Isolating Failures and Self-Healing
- Decoupling - the less two parts of the system know about each other the better
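
The asynchronous-processing benefit above can be sketched with Python's standard library. This is only an in-process illustration: `queue.Queue` stands in for a real broker, and the names `handle_request`, `worker`, and the `send_welcome_email` job type are made up for the example.

```python
import queue
import threading

jobs = queue.Queue()   # stands in for a real message broker
results = []           # records work completed by the consumer

def handle_request(user_id):
    """Producer: enqueue the slow job and respond right away."""
    jobs.put({"type": "send_welcome_email", "user_id": user_id})
    return "202 Accepted"

def worker():
    """Consumer: process jobs until a None sentinel arrives."""
    while True:
        message = jobs.get()
        if message is None:
            jobs.task_done()
            break
        # The slow work would happen here, outside the request path.
        results.append(f"emailed user {message['user_id']}")
        jobs.task_done()

consumer = threading.Thread(target=worker)
consumer.start()
responses = [handle_request(uid) for uid in (1, 2, 3)]
jobs.put(None)   # tell the worker to stop
jobs.join()      # wait until every queued message is handled
consumer.join()
```

The producer never waits for the email to be sent; it only waits for the message to be accepted by the queue.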
Challenges
- No Message Ordering - in general there is no ordering guarantee, but one can be achieved with some trade-offs:
  - Limit the number of consumers to a single thread per queue. This gives FIFO processing at the cost of throughput.
  - Build the system to assume that messages can arrive in random order.
  - Use a message broker that supports a partial message ordering guarantee (by using message groups and labels).
- Duplicates
  - Implement idempotent handlers
- Race Conditions Become More Likely - we completely lose the call stack, and everything can happen in a different order. Developers should pay close attention to this.
- Risk of Increased Complexity - because we add one more component.
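
One possible way to make a handler idempotent is to remember the ids of messages that were already applied and skip repeats. This is a minimal sketch under assumptions: every message carries a unique `id`, and the in-memory set `processed_ids` would be durable storage in a real system. All names here are hypothetical.

```python
processed_ids = set()      # assumption: durable storage in a real system
balances = {"alice": 0}

def handle_deposit(message):
    """Apply a deposit at most once per message id."""
    if message["id"] in processed_ids:
        return False       # duplicate delivery: ignore it
    balances[message["account"]] += message["amount"]
    processed_ids.add(message["id"])
    return True

deliveries = [
    {"id": "m1", "account": "alice", "amount": 50},
    {"id": "m1", "account": "alice", "amount": 50},  # redelivered duplicate
    {"id": "m2", "account": "alice", "amount": 20},
]
applied = [handle_deposit(m) for m in deliveries]
```

With this guard, the redelivered `m1` does not change the balance a second time.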
Anti-Patterns
- Treating the Message Queue as a TCP Socket - it’s bad to use a message queue in a request => response cycle. In other words, a consumer should not publish a reply message back to the publisher.
- Treating the Message Queue as a Database - by deleting or updating messages in the queue
- Coupling Message Producers with Consumers - e.g. by sharing common classes for data serialization/deserialization
- Lack of Poison Message Handling
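
Poison message handling could look roughly like this: retry a failing message a limited number of times, then move it aside so it stops blocking the queue. The sketch uses a plain `deque` instead of a real broker. `MAX_ATTEMPTS`, `process`, and `drain` are illustrative names, not part of any broker's API.

```python
from collections import deque

MAX_ATTEMPTS = 3   # assumption: give up after three failed attempts

def process(body):
    """Hypothetical handler that raises ValueError on malformed input."""
    return int(body) * 2

def drain(messages):
    """Process all messages, moving repeatedly failing ones to a dead-letter list."""
    main_queue = deque({"body": b, "attempts": 0} for b in messages)
    dead_letters, results = [], []
    while main_queue:
        msg = main_queue.popleft()
        try:
            results.append(process(msg["body"]))
        except ValueError:
            msg["attempts"] += 1
            if msg["attempts"] >= MAX_ATTEMPTS:
                dead_letters.append(msg)   # stop retrying the poison message
            else:
                main_queue.append(msg)     # requeue for another attempt
    return results, dead_letters

results, dead_letters = drain(["1", "oops", "3"])
```

Without the attempt counter, the malformed message would be requeued forever and the loop would never finish.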
Existing Solutions
- Amazon SQS - the easiest and cheapest way to get started. Unless you run a very high-load app with a lot of customer demands, it’s the best option
- RabbitMQ
  + supports routing
  + configuring lifetime via REST API
  - doesn’t have scheduled message delivery
  - doesn’t have good horizontal scaling
- ActiveMQ
  + some queue code could be embedded into your app (if it’s written in Java); by doing so you decrease coupling even more
  + has message groups to perform in-order delivery
  - performs poorly under high load
Scalability
- We could easily distribute messages between different brokers just by picking one at random from the broker pool. Random distribution works well for most cases, but if you use ActiveMQ with message groups you should use the message group id as a partition key, so that all messages of a group go to the same broker.
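
A rough sketch of that broker selection, assuming a fixed pool of brokers. The broker names and the `pick_broker` helper are hypothetical, and CRC32 is just one stable hash that could be used for the partition key (Python's built-in `hash` for strings changes between processes, so it is not suitable here).

```python
import random
import zlib

BROKERS = ["broker-a", "broker-b", "broker-c"]  # hypothetical broker pool

def pick_broker(group_id=None):
    """Pick a random broker, or a stable one when a message group id is given."""
    if group_id is None:
        return random.choice(BROKERS)
    # Same group id always maps to the same broker, preserving group ordering.
    index = zlib.crc32(group_id.encode("utf-8")) % len(BROKERS)
    return BROKERS[index]

first = pick_broker("order-42")
second = pick_broker("order-42")
ungrouped = pick_broker()
```

Note that this simple modulo scheme remaps most groups when the pool size changes, so it only fits a pool that stays fixed while groups are active.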