Reliable messaging through MSMQ does not seem possible in the current implementation

Aug 7, 2010 at 1:45 PM
Edited Aug 7, 2010 at 2:11 PM
One of the nice things about this framework is that one can change the transport without altering the code.
However, this does pose one problem with reliable/persistent transports like MSMQ.


With the current implementation there are 2 problems:
At the client side: there is no feedback/ack/event to tell whether the message has in fact been placed in the queue correctly.
At the server side: there is no method to signal the queue that the event/work item has been processed and can be removed from the queue permanently.


Both these issues make reliable messaging impossible. If I have a package and I need to send it from component1 to component2, and it MUST reach the destination, then the moment I post the message in a port on the client side, it might very well never be persisted in the queue (due to some reason or the other), and I never know. I could do a ping every once in a while, but anything in between is still unreliable.

On the server side, suppose I get a work item which must not be lost. I process it, but halfway through the server crashes, or something else happens. Then this packet is lost forever.


On a lot of systems this is perfectly fine, but for some packets (for which the MSMQ transport might be a good option) you also want to use its transactional behaviour.


Is this a problem which is being tackled in future releases?


R


P.S. There was an article by someone who also combined CCR and MSMQ. They wrapped the packet which came out of a port on the server side in a class with a single method, 'itemprocessed', which the server needs to call when the packet can be disposed of. But the client side of that library has the same problem: http://www.codeguru.com/columns/experts/article.php/c16357
Coordinator
Aug 9, 2010 at 9:54 AM

Hi Renier!

As for the client side problem, the Send() extension method may help you here. If you use Send() instead of Post() to transfer your message, the successful execution of this method ensures that a message has been transferred correctly. With MSMQ this means that the message has successfully been placed in the queue. For an automatic retry mechanism it would e.g. be possible to implement another extension method variant of Send(), which automatically retries sending the message if an error is thrown.
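A minimal sketch of such a retry variant, assuming only that the blocking send throws on failure. The names `SendRetry` and `SendWithRetry` are hypothetical and not part of the framework; the actual send is passed in as a delegate, so the sketch does not depend on any AppSpace type.

```csharp
using System;
using System.Threading;

// Hypothetical helper, not part of the AppSpace: retries a blocking send.
public static class SendRetry
{
    // Runs `send` up to `maxAttempts` times, sleeping `delay` between attempts.
    // If the last attempt also fails, its exception propagates to the caller.
    public static void SendWithRetry(Action send, int maxAttempts, TimeSpan delay)
    {
        if (send == null) throw new ArgumentNullException(nameof(send));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                send();
                return; // success: the message was handed over to the transport
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Swallow the failure and try again after a pause.
                Thread.Sleep(delay);
            }
        }
    }
}
```

It could be called as `SendRetry.SendWithRetry(() => remotePort.Send(message), 3, TimeSpan.FromSeconds(1));`, where `remotePort` and `message` stand for whatever you are sending. Catching every exception type is a simplification; a real variant would probably only retry on communication errors.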

On the server side the problem is more difficult. As you correctly say, a work item could get lost if the server crashes anywhere between taking the item out of the queue and finishing the processing of the item in a certain worker. Leaving the work item in the queue until it is processed is also not a really good solution here, because then you would need to wait until one item is processed before taking out the next queue item, which makes it impossible to process items in parallel.
A way to deal with this problem would be to let the worker respond to the client when the task is finished, and let the client resend the message if the worker doesn't answer in a specified time span (or to implement a ping mechanism to recognize when the server goes offline before finishing the task and then let the client resend the message).
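The client-side bookkeeping for this could look roughly like the sketch below. It is only an illustration: `PendingMessages<T>` and its members are hypothetical names, not framework types, and it assumes each message carries an id that the worker echoes back in its "finished" reply. Note that resending means the worker may see the same message twice, so it has to tolerate duplicates.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical client-side tracker for messages that still await a worker reply.
public sealed class PendingMessages<T>
{
    private sealed class Entry
    {
        public T Message;
        public DateTime LastSent;
    }

    private readonly object gate = new object();
    private readonly Dictionary<Guid, Entry> pending = new Dictionary<Guid, Entry>();
    private readonly TimeSpan ackTimeout;

    public PendingMessages(TimeSpan ackTimeout)
    {
        this.ackTimeout = ackTimeout;
    }

    // Call right before sending; the returned id travels with the message.
    public Guid Track(T message, DateTime now)
    {
        var id = Guid.NewGuid();
        lock (gate)
        {
            pending[id] = new Entry { Message = message, LastSent = now };
        }
        return id;
    }

    // Call when the worker's reply arrives. Returns false for unknown or repeated ids.
    public bool Acknowledge(Guid id)
    {
        lock (gate)
        {
            return pending.Remove(id);
        }
    }

    // Call periodically. Returns the messages whose reply is overdue
    // and restarts their timers, so the caller can resend them.
    public List<KeyValuePair<Guid, T>> TakeOverdue(DateTime now)
    {
        var overdue = new List<KeyValuePair<Guid, T>>();
        lock (gate)
        {
            foreach (var pair in pending)
            {
                if (now - pair.Value.LastSent >= ackTimeout)
                {
                    overdue.Add(new KeyValuePair<Guid, T>(pair.Key, pair.Value.Message));
                    pair.Value.LastSent = now;
                }
            }
        }
        return overdue;
    }
}
```

The current time is passed in explicitly so that the timeout logic can be tested without waiting.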

You can expect to see more functionality concerning these problems in future releases - there will be addons to deal with typical problems like reliable messaging or load balancing (although I cannot yet tell you when).

Thomas

Coordinator
Aug 9, 2010 at 12:31 PM

@Renier: I second what Thomas wrote.

But on the other hand: The AppSpace is not an ESB. So I guess you'll never see some of the features that ESBs provide, regardless of what we might add in the future. We don't want to duplicate ESB functionality. So it might well be the case that you're better off with an ESB. I encourage you to try them out.

Also, some features might be beyond what should be done in infrastructure. Instead you need to model your own solution. Think about protocols, for example. If you need a particular style of communication, check whether it's right to see it as a responsibility of an infrastructure like the AppSpace or an ESB.

We'll be very careful not to burden the AppSpace with feature creep.

Aug 10, 2010 at 7:23 AM
Edited Aug 10, 2010 at 7:44 AM
Hi,

I understand the concern that the framework should not become bloated. However I do see a problem with the reliability.
In a normal scenario (non-async, oldskool messaging), one would listen on a port perhaps (TCP/HTTP/etc.) on the server side, and the client side would connect to this port and send the data. The client would know if the packet got sent or not, and could decide what to do next.

With the CCR, we now have ports. If I put something in a port, I know that (unless the process crashes completely), someone eventually will pick up my packet and do something with it. So it's also semi reliable.

Now with this framework... If I use Post, I don't know if my packet will arrive at its destination. And worse, if it doesn't, I'm not informed.
I could use the Ping solution (ping every 10 seconds). But with a conservative message throughput of 100 packets/sec, I lose 500 packets on average (up to 1,000 in the worst case)! And I don't even know which ones, so I cannot recover.

Thomas's suggestion to use 'Send' instead has the disadvantage that it is synchronous. So my thread stalls, and I lose all performance on my CCR component (since the thread count is limited and synchronous calls should be avoided).


I see 2 solutions:
- Either a Port keeps on retrying to send messages, and notifies the sender that the Port is stalled. (So if the problem is fixed, everything continues on its merry way.)
- Or every packet one sends needs to have a callback attached to notify that the send actually succeeded (or didn't). So the client can decide on a per-packet basis what to do.

Without 'some' reliability, I don't see any scenario where this framework could prove to be useful in a business application setting.


About your comment about not wanting to let it become an ESB: I don't think it is much like an ESB, since an ESB is something which sits in between various programs (in various languages) and routes, transforms and stores messages. This framework is clearly something different, since it is compiled into all programs which use its functionality.

R
Coordinator
Aug 10, 2010 at 8:21 AM
Edited Aug 10, 2010 at 8:22 AM

The synchronicity of Send() is only a problem if you don't have a proper architecture in place.

You seem to envision some code C in process P1 to be directly connected to some worker W in process P2. For our samples that's all well and good. They are supposed to show off features of the AppSpace.

But that's a recipe for disaster in production code. It's not evolvable. It's hard to test.

So my premise always is: you properly wrap resources in your designs according to the SRP and SoC principles. That means C does not post to a remote port, but to a local port of some proxy P. P in turn uses Send() to communicate with W.

Of course, P then needs to decide what to do in the face of an exception. But that's only a local issue and thus easier to deal with, e.g. give P a retry interval and retry count, or equip it with an exception port etc.

Aug 10, 2010 at 8:59 AM
I think we are saying the same thing. I don't want to be connected to the other side with a direct connection. I like it that the framework takes care of the communications.
I just store my packet in a local Port and 'magically' it arrives at its destination, either local or on a different computer.

I'm just pointing out that when I put something in a Port, then I need to be sure that the packet arrives at its destination, or that I'll be informed if it can't. Thomas said: you can use 'Send' for this, but this makes the call synchronous, while I prefer to keep everything asynchronous. Even if I somehow abstract this Send away behind a Port, the issue remains that this send will be synchronous in an asynchronous system.
Coordinator
Aug 10, 2010 at 9:56 AM

You're missing the proxy. Send() does not impede any client code if you wrap it in a proxy. The client passes messages to the proxy with Post(). The proxy uses Send() to pass them on to the server/worker. This is (!) async with regard to the client code. (It is not for the proxy. But that's on purpose, to be able to easily detect problems.)
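To make the shape of such a proxy concrete, here is a rough sketch in plain .NET rather than with CCR ports. Everything in it is an assumption for illustration: `SendProxy<T>` is a hypothetical name, the blocking send is passed in as a delegate (it would wrap the Send() call), and failures are reported through a callback that stands in for an exception port.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical proxy: clients post without blocking, one pump sends synchronously.
public sealed class SendProxy<T> : IDisposable
{
    private readonly BlockingCollection<T> inbox = new BlockingCollection<T>();
    private readonly Task pump;

    public SendProxy(Action<T> blockingSend, Action<T, Exception> onFailure)
    {
        if (blockingSend == null) throw new ArgumentNullException(nameof(blockingSend));
        if (onFailure == null) throw new ArgumentNullException(nameof(onFailure));

        // A single long-running task drains the inbox in order.
        pump = Task.Factory.StartNew(() =>
        {
            foreach (var message in inbox.GetConsumingEnumerable())
            {
                try
                {
                    blockingSend(message); // synchronous for the proxy only
                }
                catch (Exception ex)
                {
                    onFailure(message, ex); // the proxy decides: retry, log, report...
                }
            }
        }, TaskCreationOptions.LongRunning);
    }

    // Returns immediately; the caller is never blocked by the transport.
    public void Post(T message)
    {
        inbox.Add(message);
    }

    // Stops accepting messages and waits until the queued ones are handled.
    public void Dispose()
    {
        inbox.CompleteAdding();
        pump.Wait();
        inbox.Dispose();
    }
}
```

The cost is one blocked thread per proxy instead of one per caller, and the client learns about a failed send only through the callback.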

Aug 10, 2010 at 6:01 PM
Edited Aug 10, 2010 at 6:05 PM
That's just moving the problem ;)

If my local proxy is sending synchronously, then it is still using resources (threads) very suboptimally. In that case I'd rather have it send asynchronously (beginSend/endSend async pattern) directly. Which at that point makes the whole AppSpace framework proposition weak.

The AppSpace framework, as I initially understood it (but perhaps I'm just seeing it wrong), is the async programming model of the CCR, but additionally crossing process or even machine boundaries (much like what DSS aimed to do).

Given the above definition, I'd like it to handle the details of sockets, queues, pipes and what not. I agree it's up to the developer to create proxy classes if he decides it makes sense in his architecture. However, whether it's the component itself or a local proxy which does the communication, it will still use the AppSpace framework, so it should still be able to send asynchronously. And if it does this sending asynchronously, it should also give asynchronous feedback on failed messages.

If the proxy should really be using synchronous sends in order to achieve some basic reliability, we might as well roll our own tcp communications in the proxy. :/


I apologize for going on about this, and if I'm just not seeing the whole picture yet. It's an impressive framework with a lot of features which (like the CCR) will probably take some time to fully understand and appreciate.
Coordinator
Aug 10, 2010 at 6:58 PM

I sure empathize with you and understand your wish. But my feeling is we're entering an area where adding functionality could either bloat the AppSpace - or lead to cases where LOLA kicks in.

Let me see what I can do. I'll talk with the dev team.

Coordinator
Aug 14, 2010 at 4:07 PM

Hi Renier,

I just added a simple feature that allows asynchronously waiting for an ACK or a communication error, which I think fits nicely into the framework. You can now get an ACK by adding a causality with a special coordination port (one that is of type Port<Ack>). It goes like this:

var exceptionPort = new Port<Exception>();
var ackPort = new Port<Ack>();
var c = new Causality("c", exceptionPort, ackPort);
Dispatcher.AddCausality(c);
				
remotePort.Post(...);

Or you could also use the PostWithCausality extension method:

var exceptionPort = new Port<Exception>();
var ackPort = new Port<Ack>();

remotePort.PostWithCausality(..., exceptionPort, ackPort);

To wait for either an ack or an exception you can use the new ReceiveChoice extension method (which is a bit easier to read than using the CCR Arbiter class directly):

space.ReceiveChoice(ackPort, exceptionPort, ack => ... , ex => ...);

To try it, just download and compile the latest source code.

Aug 17, 2010 at 2:14 PM
That sounds like a great solution. Thanks!