What do the errno values from LwIP mean in a WICED context?


Félix_T

We are getting spontaneously disconnected from the AP.  I saw another thread that suggested enabling the keep alive option to resolve this.  Regardless of any fix, I would like to know what some of the errnos we are seeing mean.  These are all seen when trying to send using lwip_send.

1. #define  EINPROGRESS    115  /* Operation now in progress */

     What does operation in progress mean?  What operation?

2. #define  EAGAIN       11  /* Try again */

     This one is pretty self evident, and I understand that the message should be resent, but does anyone know why I would see this?

3. #define  EIO           5  /* I/O error */

     IO error I get, but at which layer?  Is it caused by the SDIO connection between the host processor and the BCM433262 or is it an IO error at the wifi level?

0 Likes
1 Solution
Félix_T

Asynchronously sending and receiving using LWIP calls on a single socket caused these issues.  We have changed to a better windowed scheme and haven't seen these failures under the same conditions.


16 Replies
Anonymous
Not applicable

Your numbers seem to be off from the error definitions in LwIP. I don't see a '115'. '11' is "not connected"; '5' is "operation in progress".

Typically ERR_INPROGRESS means the operation hasn't finished when the call that invokes it returns; in your case, that is perhaps the lwip_send() call. It usually applies to non-blocking calls.

#define ERR_OK 0 /* No error, everything OK. */

#define ERR_MEM -1 /* Out of memory error. */

#define ERR_BUF -2  /* Buffer error. */

#define ERR_TIMEOUT -3 /* Timeout. */

#define ERR_RTE -4 /* Routing problem. */

#define ERR_INPROGRESS -5 /* Operation in progress */

#define ERR_VAL -6 /* Illegal value. */

#define ERR_WOULDBLOCK -7 /* Operation would block. */

#define ERR_IS_FATAL(e) ((e) < ERR_VAL) 

#define ERR_ABRT -8  /* Connection aborted. */

#define ERR_RST -9 /* Connection reset. */

#define ERR_CLSD -10 /* Connection closed. */

#define ERR_CONN -11 /* Not connected. */

#define ERR_ARG -12 /* Illegal argument. */

#define ERR_USE -13 /* Address in use. */

#define ERR_IF -14  /* Low-level netif error */

#define ERR_ISCONN -15 /* Already connected. */

0 Likes

The list is in lwip\arch.h; at least that seems to be the one used, since I do see a 115 being reported. Below is the full list as seen in the file.

EDIT: The list you posted seems to be the LwIP error codes, not the errno codes. The return values are quite different from the errno values, which have to be checked separately depending on what LwIP returns as an error. In this specific case, LwIP returns a timeout code, i.e. -3, but errno is checked when this occurs, and it can be any of the first three values I mentioned.

#define  EPERM         1  /* Operation not permitted */

#define  ENOENT        2  /* No such file or directory */

#define  ESRCH         3  /* No such process */

#define  EINTR         4  /* Interrupted system call */

#define  EIO           5  /* I/O error */

#define  ENXIO         6  /* No such device or address */

#define  E2BIG         7  /* Arg list too long */

#define  ENOEXEC       8  /* Exec format error */

#define  EBADF         9  /* Bad file number */

#define  ECHILD       10  /* No child processes */

#define  EAGAIN       11  /* Try again */

#define  ENOMEM       12  /* Out of memory */

#define  EACCES       13  /* Permission denied */

#define  EFAULT       14  /* Bad address */

#define  ENOTBLK      15  /* Block device required */

#define  EBUSY        16  /* Device or resource busy */

#define  EEXIST       17  /* File exists */

#define  EXDEV        18  /* Cross-device link */

#define  ENODEV       19  /* No such device */

#define  ENOTDIR      20  /* Not a directory */

#define  EISDIR       21  /* Is a directory */

#define  EINVAL       22  /* Invalid argument */

#define  ENFILE       23  /* File table overflow */

#define  EMFILE       24  /* Too many open files */

#define  ENOTTY       25  /* Not a typewriter */

#define  ETXTBSY      26  /* Text file busy */

#define  EFBIG        27  /* File too large */

#define  ENOSPC       28  /* No space left on device */

#define  ESPIPE       29  /* Illegal seek */

#define  EROFS        30  /* Read-only file system */

#define  EMLINK       31  /* Too many links */

#define  EPIPE        32  /* Broken pipe */

#define  EDOM         33  /* Math argument out of domain of func */

#define  ERANGE       34  /* Math result not representable */

#define  EDEADLK      35  /* Resource deadlock would occur */

#define  ENAMETOOLONG 36  /* File name too long */

#define  ENOLCK       37  /* No record locks available */

#define  ENOSYS       38  /* Function not implemented */

#define  ENOTEMPTY    39  /* Directory not empty */

#define  ELOOP        40  /* Too many symbolic links encountered */

#define  EWOULDBLOCK  EAGAIN  /* Operation would block */

#define  ENOMSG       42  /* No message of desired type */

#define  EIDRM        43  /* Identifier removed */

#define  ECHRNG       44  /* Channel number out of range */

#define  EL2NSYNC     45  /* Level 2 not synchronized */

#define  EL3HLT       46  /* Level 3 halted */

#define  EL3RST       47  /* Level 3 reset */

#define  ELNRNG       48  /* Link number out of range */

#define  EUNATCH      49  /* Protocol driver not attached */

#define  ENOCSI       50  /* No CSI structure available */

#define  EL2HLT       51  /* Level 2 halted */

#define  EBADE        52  /* Invalid exchange */

#define  EBADR        53  /* Invalid request descriptor */

#define  EXFULL       54  /* Exchange full */

#define  ENOANO       55  /* No anode */

#define  EBADRQC      56  /* Invalid request code */

#define  EBADSLT      57  /* Invalid slot */

#define  EDEADLOCK    EDEADLK

#define  EBFONT       59  /* Bad font file format */

#define  ENOSTR       60  /* Device not a stream */

#define  ENODATA      61  /* No data available */

#define  ETIME        62  /* Timer expired */

#define  ENOSR        63  /* Out of streams resources */

#define  ENONET       64  /* Machine is not on the network */

#define  ENOPKG       65  /* Package not installed */

#define  EREMOTE      66  /* Object is remote */

#define  ENOLINK      67  /* Link has been severed */

#define  EADV         68  /* Advertise error */

#define  ESRMNT       69  /* Srmount error */

#define  ECOMM        70  /* Communication error on send */

#define  EPROTO       71  /* Protocol error */

#define  EMULTIHOP    72  /* Multihop attempted */

#define  EDOTDOT      73  /* RFS specific error */

#define  EBADMSG      74  /* Not a data message */

#define  EOVERFLOW    75  /* Value too large for defined data type */

#define  ENOTUNIQ     76  /* Name not unique on network */

#define  EBADFD       77  /* File descriptor in bad state */

#define  EREMCHG      78  /* Remote address changed */

#define  ELIBACC      79  /* Can not access a needed shared library */

#define  ELIBBAD      80  /* Accessing a corrupted shared library */

#define  ELIBSCN      81  /* .lib section in a.out corrupted */

#define  ELIBMAX      82  /* Attempting to link in too many shared libraries */

#define  ELIBEXEC     83  /* Cannot exec a shared library directly */

#define  EILSEQ       84  /* Illegal byte sequence */

#define  ERESTART     85  /* Interrupted system call should be restarted */

#define  ESTRPIPE     86  /* Streams pipe error */

#define  EUSERS       87  /* Too many users */

#define  ENOTSOCK     88  /* Socket operation on non-socket */

#define  EDESTADDRREQ 89  /* Destination address required */

#define  EMSGSIZE     90  /* Message too long */

#define  EPROTOTYPE   91  /* Protocol wrong type for socket */

#define  ENOPROTOOPT  92  /* Protocol not available */

#define  EPROTONOSUPPORT 93  /* Protocol not supported */

#define  ESOCKTNOSUPPORT 94  /* Socket type not supported */

#define  EOPNOTSUPP      95  /* Operation not supported on transport endpoint */

#define  EPFNOSUPPORT    96  /* Protocol family not supported */

#define  EAFNOSUPPORT    97  /* Address family not supported by protocol */

#define  EADDRINUSE      98  /* Address already in use */

#define  EADDRNOTAVAIL   99  /* Cannot assign requested address */

#define  ENETDOWN       100  /* Network is down */

#define  ENETUNREACH    101  /* Network is unreachable */

#define  ENETRESET      102  /* Network dropped connection because of reset */

#define  ECONNABORTED   103  /* Software caused connection abort */

#define  ECONNRESET     104  /* Connection reset by peer */

#define  ENOBUFS        105  /* No buffer space available */

#define  EISCONN        106  /* Transport endpoint is already connected */

#define  ENOTCONN       107  /* Transport endpoint is not connected */

#define  ESHUTDOWN      108  /* Cannot send after transport endpoint shutdown */

#define  ETOOMANYREFS   109  /* Too many references: cannot splice */

#define  ETIMEDOUT      110  /* Connection timed out */

#define  ECONNREFUSED   111  /* Connection refused */

#define  EHOSTDOWN      112  /* Host is down */

#define  EHOSTUNREACH   113  /* No route to host */

#define  EALREADY       114  /* Operation already in progress */

#define  EINPROGRESS    115  /* Operation now in progress */

#define  ESTALE         116  /* Stale NFS file handle */

#define  EUCLEAN        117  /* Structure needs cleaning */

#define  ENOTNAM        118  /* Not a XENIX named type file */

#define  ENAVAIL        119  /* No XENIX semaphores available */

#define  EISNAM         120  /* Is a named type file */

#define  EREMOTEIO      121  /* Remote I/O error */

#define  EDQUOT         122  /* Quota exceeded */

#define  ENOMEDIUM      123  /* No medium found */

#define  EMEDIUMTYPE    124  /* Wrong medium type */

0 Likes
Anonymous
Not applicable

You got this from WICED-SDK-xxxx tree?

0 Likes

Yes, wiced 2.4.0.

Path: WICED-SDK\Wiced\Network\LwIP\ver1.4.0.rc1\src\include\lwip\arch.h

0 Likes
Anonymous
Not applicable

WICED uses this one-

  #include "lwip/err.h"

0 Likes

Yes, WICED does use that one also. The file you are referring to is used for the return values in LwIP calls, sometimes; most error return values are hard-set to -1, completely disregarding the actual error value, which is why errno has to be checked. errno != return value. errno is set by anything in the system to indicate the status of the last error. The lwip_send call returns -1, which means nothing at all, because any error that occurs in lwip_send causes a return of -1. Therefore I check the system errno, which is also set by LwIP calls deeper in the stack, and I find 115 or 5 or 11.

0 Likes
Anonymous
Not applicable

If you got a positive return code from lwip_send(), it is actually the number of bytes sent.

0 Likes

LwIP is not returning a positive value; it is returning -1. But this does not provide any information about the error. errno is a global value, set by LwIP, but not returned by LwIP.

Does this make sense? I'll repeat: errno is NOT the same thing as the return value from lwip_send.
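To make the distinction concrete, here is a minimal sketch of handling the two values separately. The names classify_send and send_status_t are made up for illustration; the retry/fail split follows what is reported later in this thread (retries help for EAGAIN and EINPROGRESS) and is an assumption, not documented LwIP or WICED behavior.

```c
#include <errno.h>

typedef enum { SEND_OK, SEND_RETRY, SEND_FAILED } send_status_t;

/*
 * ret:         value returned by the send call (bytes sent, or -1 on error)
 * saved_errno: errno captured immediately after the call, before any other
 *              call has a chance to overwrite it
 */
static send_status_t classify_send(int ret, int saved_errno)
{
    if (ret >= 0) {
        return SEND_OK;       /* ret is the number of bytes accepted */
    }

    /* ret is -1 for every failure, so only errno says what went wrong. */
    switch (saved_errno) {
    case EAGAIN:              /* "Try again" */
    case EINPROGRESS:         /* "Operation now in progress" */
        return SEND_RETRY;
    default:
        return SEND_FAILED;   /* e.g. EIO: do not blindly retry */
    }
}
```

On the target this would be called as something like `int n = lwip_send(s, buf, len, 0); classify_send(n, errno);`, reading errno right after the call.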

0 Likes
Anonymous
Not applicable

You are right that WICED uses both, what a mess!

I searched for EINPROGRESS, but it doesn't turn up as being referenced. It is really bizarre that you saw 115.

For EIO, it is mapped to ERR_ARG and is also used in err_to_errno() when the error code passed in exceeds the size of err_to_errno_table[].
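Since the source can't be pasted here, below is a rough sketch of the kind of table lookup being described, with a fallback to EIO for codes outside the table. Everything prefixed demo_ is hypothetical, the table is deliberately truncated to six entries, and the entries themselves are assumptions for illustration; check sockets.c in your own tree for the real mapping.

```c
#include <errno.h>

/*
 * Hypothetical, truncated table indexed by the negated LwIP error code.
 * The entries are assumptions for illustration, not copied from the SDK.
 */
static const int demo_err_to_errno_table[] = {
    0,            /* ERR_OK          0 */
    ENOMEM,       /* ERR_MEM        -1 */
    ENOBUFS,      /* ERR_BUF        -2 */
    EWOULDBLOCK,  /* ERR_TIMEOUT    -3 */
    EHOSTUNREACH, /* ERR_RTE        -4 */
    EINPROGRESS   /* ERR_INPROGRESS -5 */
};

#define DEMO_TABLE_SIZE \
    (sizeof(demo_err_to_errno_table) / sizeof(demo_err_to_errno_table[0]))

/* Translate a (negative) LwIP code into an errno value. */
static int demo_err_to_errno(int err)
{
    /* Negate in unsigned arithmetic so positive inputs become huge indices. */
    unsigned int idx = 0u - (unsigned int)err;

    /* Anything outside the table falls back to EIO. */
    return (idx < DEMO_TABLE_SIZE) ? demo_err_to_errno_table[idx] : EIO;
}
```

With a mapping like this, a -3 timeout would surface as EWOULDBLOCK (the same value as EAGAIN in the errno list posted above), and any code without a table entry would surface as EIO.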

I can't copy/paste the source code; I have been having problems pasting things into this forum. I don't know if it is my browser or the forum editor.

0 Likes
Anonymous
Not applicable

WICED does not use the LwIP errno and you should disregard it when attempting to debug connectivity problems.

Why are you using lwip_send() and not the WICED API?

It sounds like you may be mixing native LwIP objects with WICED objects which will cause all sorts of bad behaviour and possible memory corruption.

0 Likes

Our platform does not allow us to use the full wiced implementation.  We are using the wwd API with lwip to implement our solution.

0 Likes
Anonymous
Not applicable

All outgoing packets will end up in low_level_output() found in wwd_network.c under the Wiced/Network/LwIP/wwd directory.

I would put a breakpoint on the failed return statement and see if you end up there (it returns ERR_INPROGRESS). If so, the send failed because you have lost Wi-Fi connectivity for some reason.

Can you take a sniffer trace of the events that lead up to the disconnect from the AP?

0 Likes

We mitigated the issue by retrying to send the same packet again up to 5 times.  This seems to have "fixed" the problem, since we manage to successfully transmit after the second or third try.  Before the retry logic, we did capture the network traffic between the device and the system it was communicating with.  All we saw was that we would be happily sending packets when all of a sudden, nothing would come out of the device anymore. Now, with the retries, we manage to stay connected. 
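For anyone wanting to try the same mitigation, a retry wrapper along these lines is roughly what is meant. It is a sketch only: send_with_retry, send_fn_t and the demo_ stub are invented names, and the wrapper takes the send function as a pointer (assumed to have the same shape as lwip_send) so it can be exercised without a network stack.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_SEND_ATTEMPTS 5

/* Assumed to match the shape of lwip_send(int, const void *, size_t, int). */
typedef int (*send_fn_t)(int sock, const void *buf, size_t len, int flags);

/*
 * Calls send_fn up to MAX_SEND_ATTEMPTS times. Retries only when errno is
 * EAGAIN or EINPROGRESS; any other errno ends the loop immediately.
 * Returns the last return value of send_fn; *attempts_out (if not NULL)
 * receives the number of calls made.
 */
static int send_with_retry(send_fn_t send_fn, int sock, const void *buf,
                           size_t len, int *attempts_out)
{
    int attempts = 0;
    int ret = -1;

    while (attempts < MAX_SEND_ATTEMPTS) {
        ++attempts;
        errno = 0;
        ret = send_fn(sock, buf, len, 0);
        if (ret >= 0 || (errno != EAGAIN && errno != EINPROGRESS)) {
            break;
        }
    }
    if (attempts_out != NULL) {
        *attempts_out = attempts;
    }
    return ret;
}

/* Demo stub only: fails twice with EAGAIN, then reports all bytes sent. */
static int demo_calls;
static int demo_flaky_send(int sock, const void *buf, size_t len, int flags)
{
    (void)sock; (void)buf; (void)flags;
    if (++demo_calls < 3) {
        errno = EAGAIN;
        return -1;
    }
    return (int)len;
}
```

With the stub, `send_with_retry(demo_flaky_send, 0, "x", 1, &n)` succeeds on the third call, which mirrors the "second or third try" behavior described above.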

More about the history of the problem: the LwIP return value that comes with the errno of 115 or 11 is usually -3, which is a simple timeout. This timeout was added into LwIP by the WICED group, which is evident from the comment "/* WICED_CHANGES - added timeout check */". We modified this timeout check because we noticed that the variable used as the timeout value, apimsg->msg.msg.bc.timeout, was complete garbage: "bc" is part of a union, and the apiflags field in the "w" struct, which shares that union with bc and other members, was being set somewhere in the call stack for lwip_send. Most of the time, because of the memory on the stack and the alignment of the different structures in the union, the timeout value stayed 0, which corresponds to wait forever. Sometimes, however, the timeout became a garbage value that led to a non-wait-forever timeout.

Long story short: after realizing that the wait-forever timeout caused the system to hang forever in the case of an EINPROGRESS, EAGAIN, or any other errno, we changed the timeout value from apimsg->msg.msg.bc.timeout to a hardcoded value of 10,000 ms. This is what allowed us to retry the transmission that times out due to EINPROGRESS or EAGAIN, and it mostly resolves the issue. The only remaining problem is: why is the system hanging to the point that it takes 10 seconds to send out a single TCP packet on a not-too-congested network when 99% of packets make it out without a hitch?
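The union problem can be reproduced in isolation. The sketch below uses a made-up, much simplified layout (the demo_ types are hypothetical and do not match the real api_msg structures) just to show how writing a flag through one union member changes what is read back as a timeout through another.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the "w" and "bc" members of the real union. */
struct demo_write_args { uint8_t  apiflags; };
struct demo_bc_args    { uint32_t timeout;  };

union demo_msg {
    struct demo_write_args w;
    struct demo_bc_args    bc;
};

/*
 * Zero the union, set the write flags, then read the bytes back as a
 * timeout. Both members start at the same address, so the flag byte
 * becomes part of the timeout value.
 */
static uint32_t demo_timeout_after_write(uint8_t apiflags)
{
    union demo_msg msg;

    memset(&msg, 0, sizeof msg);
    msg.w.apiflags = apiflags;   /* what the send path does */
    return msg.bc.timeout;       /* what the timeout check reads */
}
```

Any non-zero flag produces a non-zero "timeout" here, while zero flags typically leave it at 0 (wait forever), which is consistent with the mostly-fine, occasionally-garbage behavior described above.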

0 Likes
Anonymous
Not applicable

Thank you for the detailed analysis.

The TCP packet may not be flagged as sent until an ACK has been received from the other side. The error may not be at an 802.11 layer but rather on the IP layer as it routes the packet to the server.

Can you try running the same code but connecting to the HTTP server on the AP? I expect you will need to change the processing logic in your code, but it might be worthwhile to see if you have similar problems when connecting to a TCP service that won't have any routing issues. If you have continued lockups while talking to the AP, that would seem to indicate an issue at the 802.11 layer.

0 Likes

Unfortunately, our network is composed of a bunch of APs with one controller on the backbone. The AP image does not have an HTTP server, so the only HTTP server we could connect to would still be one hop away on the network, making the test invalid.

We simply added retries, and when the connection stalls, the retries seem to be able to recover, but only in the case of an EINPROGRESS or EAGAIN.

We are seeing some corrupted packets (not corrupted wifi, but at a higher layer) on the server side, and the cause of these packets may be the cause of our connection issues.

0 Likes