- All messages are Java Strings
- A client can send three commands to the server:
  - "REC" (followed by a regular expression) - express interest in receiving all messages that contain a match for the given regular expression
  - "STOP" - stop receiving any messages
  - "M" (followed by a string) - send a message for delivery
- Only one "subscription" per connected client is possible: any "REC" command cancels the effect of the previous one.
- There is no acknowledgment and no flow control, i.e. senders get no feedback from the server.
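To make the protocol rules concrete, here is a minimal sketch of how one client's commands could be interpreted. The class name `Subscription`, its methods, and the wire format (keyword, one space, then the argument) are my assumptions for illustration; the rules above do not fix any of them. Silently ignoring an invalid regular expression is also an assumption, chosen because the protocol has no acknowledgments.

```java
import java.util.Optional;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical per-client state: the single subscription a client may hold. */
final class Subscription {
    // null means "not subscribed"; volatile so other threads see updates.
    private volatile Pattern pattern;

    /** Applies one line from the client. Returns the body of an "M" command, otherwise empty. */
    Optional<String> handle(String line) {
        if (line.equals("STOP")) {
            pattern = null;                        // stop receiving any messages
        } else if (line.startsWith("REC ")) {
            try {
                // A new "REC" replaces the previous subscription.
                pattern = Pattern.compile(line.substring(4));
            } catch (PatternSyntaxException e) {
                // Assumption: a bad regex is dropped and the old subscription stays.
            }
        } else if (line.startsWith("M ")) {
            return Optional.of(line.substring(2)); // message to be delivered
        }
        return Optional.empty();                   // no acknowledgment of any kind
    }

    /** True if the client is subscribed and the message contains a match. */
    boolean wants(String message) {
        Pattern p = pattern;                       // read once; may change concurrently
        return p != null && p.matcher(message).find();
    }
}
```

`find()` rather than `matches()` is used because the subscription is for messages that contain the expression, not messages that match it as a whole.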
Here is the architecture. Clients can be senders, receivers, or both at the same time. Every client has an open socket to the server (the two arrows between client and server). Within the server, every client connection is serviced by two threads (shown as stars). When a client sends messages, the corresponding thread reads them off the socket's input stream and places them into the distribution queue. The distribution queue is accessible to a number of distributor threads (the 4 stars in the middle of the server). Using their routing rules, which are updated by "REC" and "STOP" commands, the distributor threads place every message into zero, one, or many output queues (there is one output queue per connected client that is currently interested in messages). Each output queue is drained by the thread attached to the output stream of the corresponding socket.
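The queue plumbing described above can be sketched without any socket code. This is an illustration under my own assumptions, not the actual server code: the names `Broker`, `Route`, `rec`, `stop`, and `send` are hypothetical, clients are identified by an integer id, and the per-connection reader and writer threads are left out (a reader would call `send`, a writer would drain the queue returned by `rec`).

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.regex.Pattern;

/** Hypothetical core of the server: one distribution queue, N distributors, per-client output queues. */
final class Broker {
    /** Routing rule of one interested client together with its output queue. */
    private static final class Route {
        final Pattern pattern;
        final BlockingQueue<String> out;
        Route(Pattern pattern, BlockingQueue<String> out) {
            this.pattern = pattern;
            this.out = out;
        }
    }

    private final BlockingQueue<String> distribution = new LinkedBlockingQueue<>();
    private final Map<Integer, Route> routes = new ConcurrentHashMap<>();

    /** Starts the distributor threads (4 in the picture). */
    void start(int distributors) {
        for (int i = 0; i < distributors; i++) {
            Thread t = new Thread(this::distribute, "distributor-" + i);
            t.setDaemon(true);
            t.start();
        }
    }

    /** "REC": replaces the client's rule, keeping its output queue if it already has one. */
    BlockingQueue<String> rec(int clientId, String regex) {
        Pattern p = Pattern.compile(regex);
        Route r = routes.compute(clientId, (id, old) ->
                new Route(p, old == null ? new LinkedBlockingQueue<String>() : old.out));
        return r.out;
    }

    /** "STOP": the client is no longer interested, so its route is dropped. */
    void stop(int clientId) {
        routes.remove(clientId);
    }

    /** "M": called by a reader thread; unbounded queue, so the sender never gets feedback. */
    void send(String message) {
        distribution.add(message);
    }

    /** Body of each distributor thread: copy a message into every matching output queue. */
    private void distribute() {
        try {
            while (true) {
                String m = distribution.take();
                for (Route r : routes.values()) {
                    if (r.pattern.matcher(m).find()) {
                        r.out.add(m);
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit quietly on shutdown
        }
    }
}
```

One consequence of this sketch worth noting: with more than one distributor thread, two messages from the same sender may reach an output queue in a different order than they were sent. Both queues are unbounded here, which matches the "no flow control" rule but also means memory use is not limited.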
My next steps:
- Upload the current code to SourceForge and get Subversion access to it, so I can specify a revision number for every blog entry
- Simple monitoring of the distribution queue and output queues. Monitoring will include the size of the queues and the contribution of every sender
- Flow control post + simple flow control implementation. Flow control decisions will be based on monitoring values and some parameters known as "watermark values"
- Post about unfairness of flow control
- Rewriting the code to use non-blocking I/O, primarily to compare performance and scalability (with flow control on)
- Post about TTL (Time-To-Live) as an alternative to flow control, which is fairer but could lead to data loss. Concerns about memory usage from retaining messages until their TTL expires.
- Flush messages to disk to avoid excessive memory consumption when using a long TTL. Rotate the files to avoid the need for indexing, embedded databases, etc.
- Some other cool stuff to come...