CYW43907 boot lockup / WDT reset

NiMc_1688136

Occasionally on boot, in wiced_init, the code hangs while initializing the WLAN core through the WWD interface. I do not know exactly where it hangs, but I assume it is in the WWD thread context, which blocks the monitor thread from kicking the watchdog; both are set to the highest priority. I submitted this issue in an internal ticket, and the only solution offered was to add or increase a delay before configuring the country code. The occurrence rate has dropped, but the hang still occurs and triggers a WDT reset, potentially forcing the failsafe app to load the factory reset image. I occasionally see this happen on boot after an OTA update, and it is pretty frustrating because the firmware gets rolled back to a really old version that was placed in the factory reset image at the time of manufacturing. This is a really big issue in my application.

If this issue occurs, I would rather perform a SW reset than let the watchdog fire. The only way I see to do this is to lower the WWD priority below the system monitor and use the monitor thread hooks around the wiced_init call. Currently the monitor thread and WWD both use priority 9 (FreeRTOS). From what I see, nothing else in my system is running at priority 8. Thoughts?

SDK: 6.2

Processor: CYW43907

FreeRTOS build

With the default 6.2 SDK I was able to reproduce this on the CYW43907 eval board with the reset button. After 20 or 30 presses the system would watchdog on boot.

By the way, I diffed the 6.2.1 and 6.2 WWD folders and nothing has changed in the newer SDK.

For Admins: this is in reference to "Startup hangs occasionally - CYW43907", which has been locked.

Here is a console printout comparing a bad boot and a good boot:

Bad

Starting WICED vWiced_006.002.000.0072

Platform InVue_RTD initialised

Started FreeRTOS v9.0.0

Initialising LwIP v2.0.3

DHCP CLIENT hostname WICED IP

WWD SoC.43909 interface initializing with US/0

Reset wlan core..

load_wlan_fw: write reset_inst : 0xb83ef1b0

Release WLA

½É•¹.

Good

Starting WICED vWiced_006.002.000.0072

Platform InVue_RTD initialised

Started FreeRTOS v9.0.0

Initialising LwIP v2.0.3

DHCP CLIENT hostname WICED IP

WWD SoC.43909 interface initializing with US/0

Reset wlan core..

load_wlan_fw: write reset_inst : 0xb83ef1b0

Release WLA

½É•¹.

DMA: TX reclaim

read pkt , p0: 0x505120

read pkt , p0: 0x504a00

intstatus: 0x0, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x5042e0

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x503bc0

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x5034a0

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x502d80

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x502660

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x501f40

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x501820

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x501100

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x5009e0

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x5002c0

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x4ffba0

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x4ff480

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x4fed60

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x4fe640

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

intstatus: 0x1000000, NO PACKET

read pkt , p0: 0x4fdf20

read pkt , p0: 0x4fd800

293: Event (interface, type, status, reason): WWD_STA_INTERFACE Unknown WLC_E_STATUS_SUCCESS WLC_E_REASON_INITIAL_ASSOC

read pkt , p0: 0x4fd0e0

302: Event (interface, type, status, reason): WWD_STA_INTERFACE Unknown WLC_E_STATUS_SUCCESS WLC_E_REASON_INITIAL_ASSOC

intstatus: 0x0, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x4fc9c0

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x4fc2a0

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x505120

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x504a00

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x5042e0

intstatus: 0x10000, NO PACKET

WLAN MAC Address : CC:C0:79:DF:57:04

DMA: TX reclaim

read pkt , p0: 0x503bc0

intstatus: 0x10000, NO PACKET

WLAN Firmware : wl0: May 15 2018 19:39:17 version 7.15.168.114 (r689934) FWID 01-d6f88905

DMA: TX reclaim

read pkt , p0: 0x5034a0

intstatus: 0x10000, NO PACKET

WLAN CLM : API: 12.2 Data: 9.10.74 Compiler: 1.31.3 ClmImport: 1.36.3 Creation: 2018-05-15 19:33:15

DMA: TX reclaim

read pkt , p0: 0x502d80

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x502660

intstatus: 0x10000, NO PACKET

DMA: TX reclaim

read pkt , p0: 0x501f40

intstatus: 0x10000, NO PACKET

Init took ~403 ms

MaLa_3768146

Hello,

Did you fix the issue?

I'm experiencing something similar. Occasionally, after a reset (either the reset button being pressed or a power down->up cycle of the board), the system hangs and the watchdog is triggered. In my case, though, the console log is the following:

Starting WICED vWiced_006.002.001.0002

Platform CYW943907AEVAL1F initialised

Started ThreadX v5.8

Initialising NetX_Duo v5.10_sp3

Creating Packet pools

After the "Creating Packet pools" line the system occasionally hangs.

This is really annoying because I cannot even reproduce it at will; it just happens. I think that if the issue doesn't get fixed in time, I'll have to modify the ota2 bootloader to prevent the failsafe app from performing a factory reset after a watchdog event. I cannot risk the hardware factory-resetting itself in production. Still, very annoying.


I can reproduce it using SDK 6.2.0 with FreeRTOS+lwIP and a reset button. As I moved to SDK 6.2.1, I am unable to reproduce it. I feel like it is still probably there as the diff between the two SDK releases is pretty minor with no apparent changes to the WWD interface.

From what I can tell, there is a problem resetting or communicating with the WLAN core over the WWD bus, internal to the chipset.

I was told in a support ticket to add a delay prior to the country code being configured. I was advised to use a large delay, on the order of 3 seconds, but as this is incredibly long, I scaled it down to 50 ms.

From my ticket

----

Kindly keep the delay after wwd_management_wifi_platform_init().

wiced_init() --> wiced_wlan_connectivity_init() --> wwd_management_wifi_on() --> wwd_management_wifi_platform_init()

In 6.2.0, this helped the issue but it would still reset. My biggest issue with this is that units are encountering this reset after an OTA update and the firmware rolls back to the factory reset image. I too am considering changing the bootloader behavior before our next production run. I understand the design choice in the bootloader to launch the failsafe app (to recover from a bad OTA update on a remote device), but I too cannot afford old code being executed years down the road due to a random SW bug or a boot issue.

Something I have been thinking about but haven't tried is to lower the WWD thread priority below the system monitor thread and then set a monitor instance around the WLAN init/startup code, in the hope that it would trigger a SW reset before a watchdog reset. This is based on the FreeRTOS implementation; I am not sure if it is the same in ThreadX. In the FreeRTOS build, WWD and the system monitor are at the highest priority, so if something blocks in the WWD thread, the watchdog is not kicked.


I'm using ThreadX and SDK 6.2.1 in my implementation and last time it happened after 121 resets (with reset button), so the issue must still be there somewhere.

I think disabling the bootloader's "auto factory reset" feature after the watchdog kicked in is the easier way... One can always let the user manually perform a factory reset using a button.

Obviously something so annoying should be fixed, though...


That is one of my dilemmas. I do not have a "Factory reset" button and my actual factory reset image is the main app at the time of manufacturing / JTAG programming. My device is designed to run without any user interaction and does not have any interfaces for a user (buttons). 

I worry that if I disable the watchdog rollback in the bootloader, it could bite me down the road, as I would have compromised the failsafe capability of the device.

I wonder whether opening a support ticket on the same issue would help increase the bug's visibility to the support team. Basically, the response to my support ticket was that the issue is too random to reproduce reliably, so they cannot do a root cause investigation to fix it right now.


Uhm, wasn't the "tech" support section deprecated in favour of this community-based support center? Which section did you open the ticket in exactly? Failure analysis, Customer Service or what?

Anyway, not having a factory reset button is indeed a problem. Have you considered adding a counter field to the DCT and incrementing it inside the ota2 bootloader code each time the watchdog kicks in? Then, still in the ota2 bootloader code, you only launch the failsafe app (and reset the counter) if the counter reaches a given level. This way:

1 - If the watchdog kicked in occasionally, the failsafe app would not be launched because the counter hasn't reached the level. You can just reboot the board.

2 - If the problem is really occasional, the reboot should lead to a normal boot sequence and therefore the app could start normally. You can then reset the counter to 0.

3 - If the watchdog kicked in due to a real issue with the app, after a few reboots the counter will reach the programmed level and the failsafe app will factory reset the chip as intended.

What do you think about this approach? I've never tried it, I'm just thinking.

Yes, you are correct: they want to push people towards the community forum, but I did find a way to submit technical tickets through the design support link. I may have had to have the feature unlocked at some point; I can't remember.

Anyway, the DCT idea is actually pretty clever. It seems like it would work as long as the DCT field is not corrupted, and if it is, I think the failsafe launch would become active anyway.


Nice, I hope the dct idea can actually help you.

Honestly, I'm considering switching to other solutions for my projects. I used PSoC Creator with the PSoC BLE modules and I was really happy with them. However, since I started using WICED Studio and the CYW943907 eval board, I have found issues with every single thing I tested. Yesterday I encountered another issue: the deep sleep functionality doesn't work at all. I've seen that you reported the problem almost a year ago and it is still there. Very disappointed.

m.lagrassa_3768146

I finally got around to prototyping your idea and it works well.

Thanks for the idea!
