Unhandled exception while opening 2nd connection

May 26, 2015 at 3:46 PM
We open two connections: one for Event Hub and the other to subscribe to topics. Sometimes the call that opens the second connection never returns, and the WinCE device terminates the application with an unhandled exception.
Any ideas?
Coordinator
May 26, 2015 at 5:14 PM
You mentioned wince device so I assume you are using the Amqp.NetCF39 assembly. Do you have the exception message and callstack? Can you follow the instructions here and post the frame traces (remove the SASL frames that contain your client credentials)?
https://amqpnetlite.codeplex.com/wikipage?title=Tracing&referringTitle=Documentation

Thanks,
Xin
May 26, 2015 at 6:54 PM
Thanks for your prompt response Xin!

Yes, we use Amqp.NetCF39. Neither the application code nor the unhandled-exception event handler is able to catch the exception; the WinCE device just pops up a message and kills the application. I will dig into it further and try to capture frame traces.

Jag
May 27, 2015 at 4:37 PM
Hi Xin

Here are frame traces:

[06:55.000] SEND (ch=0) begin(next-outgoing-id:4294967293,incoming-window:2048,outgoing-window:2048,handle-max:7)
[06:55.000] SEND (ch=0) attach(name:ServiceBus.Cbs:receiver-link,handle:0,role:True,source:source(address:device/commands0/......),target:target())
[06:55.000] SEND (ch=0) flow(next-in-id:0,in-window:2048,next-out-id:4294967293,out-window:2048,handle:0,delivery-count:0,link-credit:5)
[06:55.000] RECV (ch=0) begin(remote-channel:0,next-outgoing-id:1,incoming-window:2048,outgoing-window:2048,handle-max:7)
[06:56.000] RECV (ch=0) attach(name:ServiceBus.Cbs:receiver-link,handle:0,role:False,rcv-settle-mode:1,source:source(address:device/commands0/...),target:target(),initial-delivery-count:0,max-message-size:266240)
[06:51.000] SEND (ch=0) transfer(handle:0,delivery-id:0,delivery-tag:00000000,message-format:0,settled:False,batchable:True) payload 2014
[06:00.000] RECV (ch=0) disposition(role:True,first:0,settled:True,state:accepted())
[06:10.000] SEND (ch=0) transfer(handle:0,delivery-id:1,delivery-tag:00000001,message-format:0,settled:False,batchable:True) payload 185
[06:13.000] RECV (ch=0) disposition(role:True,first:1,settled:True,state:accepted())
[06:21.000] SEND (ch=0) empty
[06:49.000] SEND (ch=0) empty
[06:42.000] RECV (ch=0)
May 28, 2015 at 9:09 PM
Hi Xin,

I added some more debug messages. It looks like the fatal exception is thrown when it tries to connect the socket in tcptransport.cs:
socket.Connect(new IPEndPoint(ipAddress, address.Port));

Thanks
Jag
Coordinator
May 28, 2015 at 9:26 PM
Does this happen all the time, or is it just random?

Do you know the values of ipAddress and address.Port before the exception? If you put "new Connection(address);" in a try block, can you catch that exception? It should because socket.Connect is run on the same thread where you create the connection object.
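
A minimal sketch of that check, assuming the AMQP.Net Lite `Address` and `Connection` types. The `ConnectProbe` and `TryOpen` names and the URL passed in are placeholders for illustration, not taken from the application:

```csharp
using System;
using Amqp;

static class ConnectProbe
{
    // Hypothetical helper: returns the open connection, or null if the
    // Connection constructor threw a managed exception.
    public static Connection TryOpen(string url)
    {
        try
        {
            // As noted above, socket.Connect runs on the thread that
            // creates the Connection object, so a managed exception
            // raised there should land in the catch block below.
            return new Connection(new Address(url));
        }
        catch (Exception ex)
        {
            Console.WriteLine("Connect failed: " + ex.GetType().FullName + ": " + ex.Message);
            return null;
        }
    }
}
```

If the process is still killed without ever reaching the catch block, that would suggest the fault is raised outside managed code.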

btw, from the traces it looks like there were activities from two connections and I am not able to tell what was wrong.

Thanks,
Xin
Jun 1, 2015 at 3:57 PM
Just random, sometimes once in 2-3 days.

Yes ipAddress has a valid value and port is 5671 all the time.

I am not able to catch the exception in a try/catch block. It sounds like some conflict between internal threads, and WinCE surfaces that exception by killing the host process.

A pop up comes on wince device with this exception message:
Application ch.exe encountered a serious error and must shut down.


Thanks
Jag
Jun 1, 2015 at 4:17 PM
Xin,

We use WinCE 6.0 with .NET Compact Framework 3.5, so we converted Amqp.NetCF39 to target .NET Compact Framework 3.5. It works pretty well, but intermittently we see this exception.

Thanks
Jag
Jun 12, 2015 at 3:36 PM
Hi Xin,

We have narrowed it down a little. If we keep only one connection, it works fine and can reconnect any number of times. With two connections, it sometimes throws an unhandled exception while trying to open the TCP socket, and the device crashes.

Please let me know if you have any fix or suggestions.

Thanks
Jag
Coordinator
Jun 13, 2015 at 12:44 AM
I guess this might have something to do with the SSL option set on the socket. Without more information I cannot tell what the problem is. If possible, we could try the following options to narrow down the issue further.
  1. Run your application without SSL. You cannot do this with the Azure SB service, but it should be easy to set up a local broker. You can build the TestAmqpBroker project from the solution and start it on your machine by running "TestAmqpBroker.exe amqp://localhost:5672 /creds:guest:guest /queues:q1 /trace:frame".
    Set the address in your WinCE application to "amqp://guest:guest@[name-of-the-machine]:5672" and create sender/receiver links against "q1". This only helps us narrow down the issue; it does not give any more information about the root cause.
  2. Debug the application. If you could connect the debugger to the device and run the application under debugger, it should break when the thread faults.
  3. Capture a dump file of the device using the Error Reporting functionality. From the dump file we may be able to find something.
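
For option 1, the client side could look roughly like the sketch below. It assumes the AMQP.Net Lite `Address`, `Connection`, `Session`, `SenderLink`, `ReceiverLink` and `Message` types; the host name `my-dev-pc`, the link names and the class name are placeholders for illustration.

```csharp
using System;
using Amqp;

class LocalBrokerCheck
{
    static void Main()
    {
        // "my-dev-pc" is a hypothetical stand-in for [name-of-the-machine].
        // The "amqp" scheme (not "amqps") means no SSL is involved.
        Address address = new Address("amqp://guest:guest@my-dev-pc:5672");
        Connection connection = new Connection(address);
        Session session = new Session(connection);

        // Send one message to the queue created with /queues:q1.
        SenderLink sender = new SenderLink(session, "send-link", "q1");
        sender.Send(new Message("hello"));

        // Read it back and settle it. Receive() returns null on timeout.
        ReceiverLink receiver = new ReceiverLink(session, "recv-link", "q1");
        Message message = receiver.Receive();
        if (message != null)
        {
            receiver.Accept(message);
            Console.WriteLine(message.Body);
        }

        sender.Close();
        receiver.Close();
        session.Close();
        connection.Close();
    }
}
```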
Regards,
Xin
Jun 22, 2015 at 4:22 PM
I was able to set up the local test broker. The test code opened and closed send and receive connections and sent 200 messages on each try. It worked fine until around 25,000 tries; after that, Fx.StartThread threw an out-of-memory exception. The debugger was attached, and I was able to reproduce this a couple of times.
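
The test code itself was not posted. A loop of the shape described might look like the sketch below, which assumes the AMQP.Net Lite client types; the host name, link names, iteration limit and class name are all hypothetical.

```csharp
using System;
using Amqp;

class ReconnectLoop
{
    const int MessagesPerTry = 200;

    static void Main()
    {
        // Hypothetical host name; "amqp" scheme, so no SSL.
        Address address = new Address("amqp://guest:guest@my-dev-pc:5672");

        for (int attempt = 1; attempt <= 30000; attempt++)
        {
            // A fresh connection on every try, closed again in finally.
            Connection connection = new Connection(address);
            try
            {
                Session session = new Session(connection);

                SenderLink sender = new SenderLink(session, "send-link", "q1");
                for (int i = 0; i < MessagesPerTry; i++)
                {
                    sender.Send(new Message("msg " + i));
                }

                // Drain the queue so the broker does not accumulate messages.
                ReceiverLink receiver = new ReceiverLink(session, "recv-link", "q1");
                for (int i = 0; i < MessagesPerTry; i++)
                {
                    Message message = receiver.Receive();
                    if (message == null) break; // timed out
                    receiver.Accept(message);
                }

                sender.Close();
                receiver.Close();
                session.Close();
            }
            finally
            {
                connection.Close();
            }

            if (attempt % 100 == 0)
            {
                Console.WriteLine("attempt " + attempt + " ok");
            }
        }
    }
}
```

Logging the attempt number makes it easy to see how many tries completed before a failure.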

Here are the last trace logs:

[02:32.000] SEND (ch=0) begin(next-outgoing-id:4294967293,incoming-window:2048,outgoing-window:2048,handle-max:7)
[02:32.000] SEND (ch=0) attach(name:send-link,handle:0,role:False,source:source(),target:target(address:q1),initial-delivery-count:0)
[02:32.000] RECV (ch=0) open(container-id:TestAmqpBroker,host-name:localhost,max-frame-size:16384,channel-max:3)
[02:33.000] RECV (ch=0) begin(remote-channel:0,next-outgoing-id:4294967293,incoming-window:2048,outgoing-window:2048,handle-max:7)
[02:33.000] RECV (ch=0) begin(remote-channel:0,next-outgoing-id:4294967293,incoming-window:2048,outgoing-window:2048,handle-max:7)
[02:33.000] SEND (ch=0) detach(handle:0,closed:True)
[02:33.000] RECV (ch=0) attach(name:send-link,handle:0,role:True,source:source(),target:target(address:q1))
[02:33.000] RECV (ch=0) flow(next-in-id:0,in-window:2048,next-out-id:4294967293,out-window:2048,handle:0,delivery-count:0,link-credit:200)
[02:33.000] RECV (ch=0) detach(handle:0,closed:True)
[02:33.000] SEND (ch=0) end()
[02:33.000] SEND (ch=0) end()
[02:33.000] RECV (ch=0) end()
[02:33.000] SEND (ch=0) close()
[02:33.000] SEND (ch=0) close()
[02:33.000] RECV (ch=0) close()
[02:34.000] SEND AMQP 3 1 0 0
[02:34.000] SEND sasl-init(mechanism:PLAIN,initial-response:006775657374006775657374,hostname:jags)
[02:34.000] RECV AMQP 3 1 0 0
[02:34.000] RECV sasl-mechanisms(sasl-server-mechanisms:PLAIN)
[02:34.000] RECV sasl-outcome(code:0)
[02:34.000] SEND AMQP 0 1.0.0
[02:34.000] SEND (ch=0) open(container-id:a9e41d07-415e-44f0-89ba-10c311f12e14,host-name:jags,max-frame-size:16384,channel-max:3)
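
A side note on the `sasl-init` frame in the trace above: with the PLAIN mechanism, `initial-response` is just the hex encoding of the authorization id, user name and password separated by NUL bytes. That is why SASL frames should be removed from traces taken against a real service. A minimal sketch of decoding it, using only the base class library (the `SaslPlain` class name is made up for illustration):

```csharp
using System;
using System.Text;

static class SaslPlain
{
    // Decodes a hex-encoded SASL PLAIN initial response into its
    // three fields: authorization id, user name, password.
    public static string[] Decode(string hex)
    {
        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(hex.Substring(2 * i, 2), 16);
        }

        // The fields are separated by NUL (0x00) bytes.
        return Encoding.UTF8.GetString(bytes, 0, bytes.Length).Split('\0');
    }
}
```

Calling `SaslPlain.Decode("006775657374006775657374")` yields an empty authorization id followed by "guest" and "guest", which matches the `/creds:guest:guest` option the test broker was started with.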

Thanks
Jag
Coordinator
Jun 25, 2015 at 9:21 PM
I am running a similar test which creates/closes SSL connections in a loop in Device Emulator against the test broker. So far it has run 150,000 iterations without issues. I also tried the .NET Framework and it works too. I think the managed code is fine, so the issue may be in the device OS or driver, though I am not sure.
Aug 20, 2015 at 2:41 PM
Edited Aug 20, 2015 at 8:17 PM
Hi Xin,

We are still trying to find the cause of this fatal error. We have tested almost every scenario we could think of. We have updated the 6.0 OS with the latest updates and the latest .NET CF framework. In our application we also tried disabling each service one by one, but we still see the exception. We also created a test app with very limited functionality, but that fails as well.

We also noticed that it can fail even when opening the connection for the first time. It always fails in the same place, while opening the socket. These failures are rare, around 3% to 5% of attempts.

We have updated the Amqp client to 1.1.0, which had some fixes for connection deadlock issues, but no luck.

All development has been completed, but we are not able to release the project. We would really appreciate any further help on this. We use WinCE 6.0 with .NET CF 3.5.

Thanks
Jag
Aug 24, 2015 at 9:54 PM
It sounds like a problem with SSL. We tested with the local broker without SSL and it worked fine.
Sep 1, 2015 at 7:23 PM
We were finally able to fix this bug. We replaced the AMQP client's SSL encryption with Bouncy Castle.
Marked as answer by jag3435 on 9/1/2015 at 12:23 PM