[WICED] System crash after calling wiced_tcp_delete_socket


Anonymous

Hi,

We have a critical issue: the system crashes after calling wiced_tcp_delete_socket.

In our application, we have a secured TCP socket that connects to an outside server. The connection stays up unless there is a network problem, the server closes it, or the user stops it.

Whenever the connection is closed (other than by the user), we need to retry connecting to the server.

We observe that after the device has been running for a few hours, and if the network is in bad condition, we sometimes get a 7014 error from wiced_tcp_receive. We noticed that the 7014 comes from netx_duo; neither the server nor the user sent a TCP disconnect. (This 7014 error is already being looked at by a Broadcom network engineer.)

In our application we retry the connection after 10 seconds. Before making a new connection to the server, we try to clean up the previously created socket. In our tests this works well even after creating and deleting the socket a few hundred times, but sometimes we see a crash after the application has been running for a few hours and has received an unusual close from netx_duo.

Below is the code used to clean up the socket. We see the crash after wiced_tcp_delete_socket:

        APPLN_LOG_INFO("Socket state is [%d] result [%d]", socket_state, result);
        if(socket->socket_magic_number == WICED_SOCKET_MAGIC_NUMBER && socket->socket.nx_tcp_socket_id !=0) /* delete only if it is not Already deleted*/
          result = wiced_tcp_delete_socket(socket);

From our observation, the crash happens after nx_tcp_socket_disconnect is called. We verified that the socket pointer is not NULL and that the socket has already been created.

Please help us debug this crash; our testing team has rejected our application because of this critical issue. Below are the socket delete code and the crash log.

wiced_result_t DisConnectSokcet(wiced_tcp_socket_t *sockets)
{
  wiced_socket_state_t socket_state;
  wiced_result_t result = WICED_SUCCESS;
  wiced_tcp_socket_t *socket = NULL;
  int tryCount = 0;

  TRY_AGAIN:
  socket = msgpstmCISocket->tcp_client_socket;
  if(socket != NULL)
  {
    tryCount ++;
    result = wiced_tcp_get_socket_state( socket, &socket_state );
    if(result == WICED_SUCCESS)
    {
      if (socket_state == WICED_SOCKET_DATA_PENDING && tryCount <10)
      {
        SN_SleepTask(1000); /** 10 sec Wait for socket Transaction*/
        goto TRY_AGAIN;
      }
      else if(socket_state ==WICED_SOCKET_CONNECTED)
      {
        SHPNANO_LOG_INFO("Socket state [%d] ", socket_state);
        if(socket->socket_magic_number == WICED_SOCKET_MAGIC_NUMBER && socket->socket.nx_tcp_socket_id !=0)
          result =  wiced_tcp_disconnect(socket);
        if(result != WICED_SUCCESS)
        {
          SHPNANO_LOG_WARNING("Disconnect result [%d] ", result);
        }
      }
    }
    else
    {
      SHPNANO_LOG_WARNING("socket Might Have already Closed [%d] ", socket_state);
    }
  }
  return result;
}

/**
 *
 * @param socket
 */
void CleanSokcet(wiced_tcp_socket_t *sockets)
{
  SHPNANO_LOG_FUNC_ENTRY();
  wiced_tcp_socket_t * socket = sockets;
  if(socket != NULL)
  {
    wiced_socket_state_t socket_state;
    wiced_result_t result;
    int tryCount = 0;

    TRY_AGAIN:
    result = wiced_tcp_get_socket_state( socket, &socket_state );
    tryCount ++;
    if(result == WICED_SUCCESS)
    {
      if(socket_state != WICED_SOCKET_CLOSED)
      {
        SHPNANO_LOG_INFO("Sokcet state is [%d], Need to wait", socket_state);
        if(tryCount <10)
        {
          SN_SleepTask(1000); /** Standard Wait for Sokcet Transcation*/
          if(socket_state !=WICED_SOCKET_CLOSING)
            goto TRY_AGAIN;
        }
        DisConnectSokcet(socket);
        //SN_SleepTask(WICED_TCP_DISCONNECT_TIMEOUT);/** Standard Wait for Sokcet Disconnection */
        goto TRY_AGAIN;
      }
      if(socket != NULL) /* Added Double NULL check*/
      {
        result =  wiced_tcp_get_socket_state( socket, &socket_state );
        SHPNANO_LOG_INFO("Socket state is [%d] result [%d]", socket_state, result);
        if(socket->socket_magic_number == WICED_SOCKET_MAGIC_NUMBER && socket->socket.nx_tcp_socket_id !=0) /* delete only if it is not Already deleted*/
        {
          SHPNANO_LOG_INFO("Calling Delete Socket");
          result = wiced_tcp_delete_socket(socket);
        }
      }
      if(result != WICED_SUCCESS)
      {
        SHPNANO_LOG_WARNING("Delete Socket Warning [%d]", result);
      }
      //memset(socket, 0 , sizeof(wiced_tcp_socket_t));
    }
    else
    {
      SHPNANO_LOG_WARNING("Failed to get the Socket state [%d] ", result);
    }
  }
  SHPNANO_LOG_FUNC_EXIT();
}
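
Two details in the posted code are worth noting: DisConnectSokcet ignores its parameter and reads msgpstmCISocket->tcp_client_socket instead, and in CleanSokcet the goto TRY_AGAIN that follows DisConnectSokcet is not guarded by tryCount, so the loop has no upper bound if the socket never reports WICED_SOCKET_CLOSED. Whether either is related to the crash is not established here. A minimal sketch of a bounded wait without goto is shown below. It does not use the WICED SDK: sk_state_t, the callback types, and fake_socket_t are hypothetical stand-ins so that the control flow can be compiled and exercised on a PC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the WICED socket states; not the SDK definitions. */
typedef enum { SK_CLOSED, SK_CLOSING, SK_CONNECTED, SK_DATA_PENDING } sk_state_t;

/* Hypothetical callbacks. get_state returns false if the state query fails. */
typedef bool (*sk_get_state_fn)(void *ctx, sk_state_t *out);
typedef void (*sk_action_fn)(void *ctx);

/* Poll at most max_tries times for SK_CLOSED.
 * A disconnect is requested once, the first time the socket is seen CONNECTED.
 * Returns true if SK_CLOSED was observed, false on timeout or query failure. */
bool wait_until_closed(void *ctx, sk_get_state_fn get_state,
                       sk_action_fn disconnect, sk_action_fn sleep_once,
                       int max_tries)
{
    assert(get_state != NULL);
    bool disconnect_requested = false;

    for (int attempt = 0; attempt < max_tries; ++attempt) {
        sk_state_t state;
        if (!get_state(ctx, &state)) {
            return false;               /* state query failed: give up */
        }
        if (state == SK_CLOSED) {
            return true;                /* safe point to delete the socket */
        }
        if (state == SK_CONNECTED && !disconnect_requested && disconnect != NULL) {
            disconnect(ctx);
            disconnect_requested = true;
        }
        if (sleep_once != NULL) {
            sleep_once(ctx);            /* on target this would be a task sleep */
        }
    }
    return false;                       /* bounded: never loops forever */
}

/* Simulated socket used only to exercise the loop above. */
typedef struct {
    int polls_until_closed;             /* polls that still report CONNECTED */
    int disconnect_calls;               /* how often disconnect was requested */
} fake_socket_t;

bool fake_get_state(void *ctx, sk_state_t *out)
{
    fake_socket_t *s = (fake_socket_t *)ctx;
    if (s->polls_until_closed <= 0) {
        *out = SK_CLOSED;
    } else {
        s->polls_until_closed--;
        *out = SK_CONNECTED;
    }
    return true;
}

void fake_disconnect(void *ctx)
{
    ((fake_socket_t *)ctx)->disconnect_calls++;
}
```

With this shape the caller decides what to do when the wait times out (for example, log and skip the delete), instead of retrying indefinitely inside the cleanup routine.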

|06:49:27.012|[tid : 5550672] E palCIServer.c:553|readDataFromNetwork(): Socket return [7014]
|06:49:27.021|[tid : 5550672] W palCIServer.c:569|readDataFromNetwork(): Packet reception failed on CI Channel, CI Socket  read return [7014]
|06:49:27.035|[tid : 5550672] W palCIServer.c:684|readDataFromNetwork(): Server [52.69.94.74] Returned empty Response or Connection got Broken
|06:49:27.049|[tid : 5550672] I CoapServerAdaptor.c:480|CoapResponseHandler(): Respone Code:[731]
|06:49:27.059|[tid : 5550672] E CoapServerAdaptor.c:543|CoapResponseHandler(): Connection terminated by ci server :(, Will Try ReConnecting
|06:49:27.073|[tid : 5550672] D DawitControl.c:216|DeviceConnStatusCallBack(): DeviceConnStatusCallBack [0]
|06:49:27.084|[tid : 5550672] I CoapUtil.c:310|TryNextReconnectWithRIV(): Timer for [2000] ms
|06:49:27.392|[tid : 5548832] E CoapClientAdaptor.c:161|CoapClientPost(): Remote Connection for Coap is not yet Ready
|06:49:27.403|[tid : 5548832] E SubscriptionManagerUtil.c:149|SendNotification(): Failed to send Notification..
|06:49:29.084|[tid : 5550856] D CoapUtil.c:184|CoapEventHandler(): Event : 2
|06:49:29.191|[tid : 5550856] D CoapServerAdaptor.c:782|RegisterUtil(): CI Registration start
|06:49:30.200|[tid : 5550856] T CoapServerAdaptor.c:944|__RegisterCI(): Entry
|06:49:30.207|[tid : 5550856] I palCIServer.c:433|_palSendCIRequest(): Server Connection is not Alive, Make Sure this is Registration Request
|06:49:30.221|[tid : 5550856] D palCIServer.c:263|_palDelSocket(): Deleting the CI Server Socket Called
|06:49:30.232|[tid : 5550856] T palCIServer.c:1095|CleanSocket (): Entry
|06:49:30.239|[tid : 5550856] I palCIServer.c:1125|CleanSocket (): Socket  state is [0] result [0]
|06:49:30.239|[tid : 5550856] I palCIServer.c:1127|CleanSocket (): Calling Delete Socket

=== EXCEPTION ===
data_abort_handler
LR   : 0x004F08E9
DFSR : 0x0000080D
DFAR : 0x00000005
IFSR : 0x00000000
IFAR : 0x00000000
CPSR : 0x600001D7
=================
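
The register names in the dump suggest an ARM core, although the thread does not say which one. Assuming the ARMv7 DFSR layout (fault status in bits 3:0 plus bit 10, write-not-read flag in bit 11) and the standard CPSR mode field in bits 4:0, the values can be unpacked with a small helper like the sketch below. The names abort_info_t, decode_abort, and check_posted_dump are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of interest from a data abort dump (assumed ARMv7 bit layout). */
typedef struct {
    unsigned fault_status;  /* FS[4:0]: DFSR bit 10 is FS[4], bits 3:0 are FS[3:0] */
    unsigned is_write;      /* WnR, DFSR bit 11: 1 means the abort was on a write */
    unsigned cpu_mode;      /* CPSR M[4:0]: processor mode when the dump was taken */
} abort_info_t;

abort_info_t decode_abort(uint32_t dfsr, uint32_t cpsr)
{
    abort_info_t info;
    info.fault_status = (unsigned)((((dfsr >> 10) & 0x1u) << 4) | (dfsr & 0xFu));
    info.is_write     = (unsigned)((dfsr >> 11) & 0x1u);
    info.cpu_mode     = (unsigned)(cpsr & 0x1Fu);
    return info;
}

/* Decodes the DFSR and CPSR values posted in the dump above. */
void check_posted_dump(void)
{
    abort_info_t info = decode_abort(0x0000080Du, 0x600001D7u);
    assert(info.fault_status == 0x0Du); /* 0b01101 */
    assert(info.is_write == 1u);        /* faulting access was a write */
    assert(info.cpu_mode == 0x17u);     /* 0b10111, Abort mode */
}
```

Under that assumption, fault status 0b01101 is a permission fault and the faulting access was a write. Together with DFAR = 0x00000005, this would be consistent with a store through a NULL or otherwise invalid pointer plus a small offset, though the dump alone does not prove that. LR (0x004F08E9) can be looked up in the linker map file to find the function that was executing.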

enel_2129601

Did you find a solution to this problem? I have encountered the same problem.

Anonymous

Hi,

I am still not able to find the root cause; it is very difficult to reproduce.

Do you have any easy method to reproduce this issue?

Regards

Chethan


The way I hit this situation is by restarting the card for each connection and then waiting for a new connection. The problem occurs very randomly.

RoDe_1773541

I'm not sure if this is related, but there's a bug in the create socket function: it does not completely initialize the socket, so the tls_context pointer can point to a random location in memory and cause problems when you try to close the socket. Just make sure you initialize the socket object to zero before using it.

memset(&socket, 0, sizeof(wiced_tcp_socket_t));

or

wiced_tcp_socket_t socket = {0};
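
To illustrate why the zero-initialization matters, here is a small PC-side sketch. It does not use the WICED SDK: demo_socket_t, demo_create, and demo_delete_touches_tls are hypothetical stand-ins that only mimic the behavior described above (a create routine that never writes tls_context, and a delete routine that follows tls_context when it is non-NULL).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for wiced_tcp_socket_t; the real struct has more fields. */
typedef struct {
    unsigned id;
    void    *tls_context;
} demo_socket_t;

/* Mimics a create routine that, as described above, never sets tls_context. */
void demo_create(demo_socket_t *s)
{
    assert(s != NULL);
    s->id = 1u;
}

/* Mimics a delete routine that tears down TLS only if a context is attached.
 * Returns 1 if it would have followed the tls_context pointer, 0 otherwise. */
int demo_delete_touches_tls(const demo_socket_t *s)
{
    assert(s != NULL);
    return s->tls_context != NULL;
}

/* Zero the object before create so tls_context starts out NULL.
 * Returns 1 if delete would leave tls_context alone. */
int zero_init_is_safe(void)
{
    demo_socket_t s;
    memset(&s, 0, sizeof s);   /* same idea as `demo_socket_t s = {0};` */
    demo_create(&s);
    return !demo_delete_touches_tls(&s);
}
```

Without the memset, a stack-allocated socket would carry whatever bytes were already in memory, and the delete path could follow a garbage pointer. Note that in the code posted at the top of the thread, socket is already a pointer, so the equivalent call there would be memset(socket, 0, sizeof(wiced_tcp_socket_t)) before the socket is created, not memset(&socket, ...).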

-Rob

The software I used already does what you described, so the problem persists.

Anonymous

I was able to reproduce 7014 errors fairly reliably by adding a delay of 0.5 seconds between calls to wiced_tcp_receive() and downloading a file of about 600 KB. Every few thousand packets I would get a 7014 error. This was caused by our Google Cloud web server sending a Reset packet to close the connection. I couldn't see any good reason for the Reset, so I tried downloading the file from an Apache server that I had installed on an old laptop a few years ago. I no longer seem to get any 7014 errors. My suggestion would be to try a different server and see if that fixes the problem.
