On a CYBLE-222014-01, we've implemented an application based on the 'Mesh Flood' example in PSoC Creator 4.4. We've recently received reports from a few customers about devices becoming 'invisible' after some time. We've examined the issue and concluded that the BLE stack (v3.65, as the affected application is ~2 years old as of today) sometimes fails to restart advertising after disconnecting from our mobile app. We've managed to attach a debugger (CY8CKIT-42-BLE) to one such running target and found that GenericEventHandler() kept being called with its event parameter constantly set to 0x30303001u (henceforth, the 'magic number'), effectively deadlocking the BLE stack.
Side note: It seems that this behavior is observable only after accessing a series of certain (custom) characteristics, though; otherwise the application works fine.
TL;DR: The magic number seems to originate from the closed-source part of the stack, AFAICT.
The long story: I've tried to backtrace the event parameter and found that GenericEventHandler() is passed as a callback function pointer to CyBle_Start(), where it is then assigned to the function pointer CyBle_ApplCallback (of type CYBLE_CALLBACK_T). The latter is only ever called with an enumerator of either CYBLE_EVENT_T or CYBLE_EVT_T as its eventCode parameter, so the largest theoretically possible value is CYBLE_DEBUG_EVT_BLESS_INT, i.e. 0xE000u -- with one exception. That exception lies at the end of CyBle_EventHandler() (BLE_eventHandler.c @ l.2240), where CyBle_ApplCallback() can be called with the function's own eventCode parameter, depending on cyBle_eventHandlerFlag. If eventCode holds a value not handled by the switch statement (as would be the case with our magic number), the function does nothing except unconditionally call CyBle_ApplCallback() with its parameters simply passed through: the default label only contains a break statement, which leaves the CYBLE_CALLBACK bit of cyBle_eventHandlerFlag, set at the beginning of the function, unaffected. This is as far as I've been able to investigate, as I cannot trace where CyBle_EventHandler() is called from. According to its description, it "handles the events from the BLE stack," and because our application calls neither CyBle_EventHandler() nor GenericEventHandler() directly, I can only conclude that it must be called from some closed-source layer of the BLE stack and that the magic number has to be generated and assigned there as well.
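To illustrate the kind of check this pass-through makes possible on the application side, here is a minimal sketch, not our actual code. It assumes that 0xE000u (CYBLE_DEBUG_EVT_BLESS_INT) really is the upper bound of valid event codes, as derived above. The names app_event_is_plausible(), app_note_event() and both macros are hypothetical and not part of the BLE component. The idea would be to call app_note_event() at the top of GenericEventHandler() and start a recovery once it returns true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption taken from the analysis above: no valid event code exceeds
 * CYBLE_DEBUG_EVT_BLESS_INT (0xE000u). Hypothetical name, not a stack macro. */
#define APP_MAX_KNOWN_EVENT   0xE000u

/* Hypothetical threshold: consecutive implausible events tolerated. */
#define APP_BOGUS_EVENT_LIMIT 8u

static uint32_t bogus_event_count = 0u;

/* True if the event code lies within the documented enumerator range. */
bool app_event_is_plausible(uint32_t event)
{
    return event <= APP_MAX_KNOWN_EVENT;
}

/* Records one event; returns true once APP_BOGUS_EVENT_LIMIT implausible
 * events have arrived in a row, i.e. when the caller should recover.
 * Any plausible event resets the counter. */
bool app_note_event(uint32_t event)
{
    if (app_event_is_plausible(event)) {
        bogus_event_count = 0u;
        return false;
    }
    if (bogus_event_count < APP_BOGUS_EVENT_LIMIT) {
        bogus_event_count++;
    }
    return bogus_event_count >= APP_BOGUS_EVENT_LIMIT;
}
```

This only detects the condition; it does not explain where the magic number comes from.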
I hereby kindly ask you to investigate whether there might be a bug in the underlying implementation. Or is this a known issue, and if so, will it be addressed in an upcoming version of the stack? In any case, please let me know if you need additional information.
In the meantime, we've set the watchdog timer to trigger a reset whenever advertising has not been restarted after a certain period of time, which appears to be an effective but 'dirty' work-around. Also, we've upgraded the BLE stack to the latest v3.66 and compiled using the latest ARM toolchain comprising GCC v10.3 instead of v5.4, which is the one provided by Cypress. ATM, we're testing whether applying either or both of these measures resolves our issue -- I'm going to post updates here.
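For reference, the watchdog work-around boils down to roughly the following logic. This is a simplified sketch under assumptions, not our production code: adv_monitor_t, adv_monitor_init(), adv_monitor_tick() and ADV_TIMEOUT_TICKS are hypothetical names, the timeout value is arbitrary, and the two boolean inputs stand in for whatever the application derives from the stack state. The main loop would feed the hardware watchdog only while adv_monitor_tick() returns true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timeout: periodic ticks the device may spend neither
 * advertising nor connected before the watchdog is left to expire. */
#define ADV_TIMEOUT_TICKS 30u

typedef struct {
    uint32_t idle_ticks; /* consecutive ticks without advertising or connection */
} adv_monitor_t;

void adv_monitor_init(adv_monitor_t *m)
{
    m->idle_ticks = 0u;
}

/* Call once per periodic tick. Returns true while it is still acceptable
 * to feed the watchdog, false once the device has been 'invisible' for
 * ADV_TIMEOUT_TICKS ticks in a row. */
bool adv_monitor_tick(adv_monitor_t *m, bool advertising, bool connected)
{
    if (advertising || connected) {
        m->idle_ticks = 0u;
        return true;
    }
    if (m->idle_ticks < ADV_TIMEOUT_TICKS) {
        m->idle_ticks++;
    }
    return m->idle_ticks < ADV_TIMEOUT_TICKS;
}
```

Keeping the decision in a small pure function makes the timeout logic testable off-target; the actual watchdog feed stays in the main loop.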
Any help would be very much appreciated. Thanks in advance and best regards,