demo.bt_smartbridge easily hits "GKI_exception 65524 getbuf: out of buffers" when scanning with more than 3 connectable BLE peripherals nearby

Anonymous
Not applicable

Hi All,

I tested demo.bt_smartbridge on a 43438-based platform (host: STM32F11/2).

It works when there are 2 BLE peripheral devices nearby.

When there are more than 3 peripheral devices nearby, even just scanning causes this exception:

GKI_exception 65524 getbuf: out of buffers

Any idea about this?

Log as attachment.

JeGu_2199941

I guess your problem may be related to the settings of this structure:

const wiced_bt_cfg_buf_pool_t wiced_bt_cfg_buf_pools[WICED_BT_CFG_NUM_BUF_POOLS] =
{
/*  { buf_size, buf_count } */
    { 64,  4  }, /* Small Buffer Pool */
    { 360, 4  }, /* Medium Buffer Pool (used for HCI & RFCOMM control messages, min recommended size is 360) */
    { 360, 12 }, /* Large Buffer Pool  (used for HCI ACL messages) */
    { 600, 1  }, /* Extra Large Buffer Pool - Used for avdt media packets and miscellaneous (if not needed, set buf_count to 0) */
};

As far as I understand, when the BT stack needs to allocate memory, it uses the smallest available buffer first.

Unfortunately, there isn't much formal documentation for this.

You can look at other usages in the SDK and estimate suitable values for your case.
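
To make the "smallest available buffer first" idea concrete, here is a minimal sketch of that allocation policy in plain C. It is only a model of the behaviour as described above, not the stack's actual (closed-source) allocator: demo_pool_t and demo_getbuf are hypothetical names, and the fall-through to the next larger pool when a smaller one is exhausted is an assumption. The pool sizes and counts are copied from the table quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer pool; this is not the SDK's type. */
typedef struct {
    size_t buf_size;   /* size of each buffer in bytes     */
    int    buf_count;  /* total number of buffers in pool  */
    int    in_use;     /* buffers currently handed out     */
} demo_pool_t;

/* Pools in ascending buf_size order, mirroring the quoted configuration. */
static demo_pool_t demo_pools[] = {
    {  64,  4, 0 },   /* small       */
    { 360,  4, 0 },   /* medium      */
    { 360, 12, 0 },   /* large       */
    { 600,  1, 0 },   /* extra large */
};
#define DEMO_POOL_COUNT (sizeof(demo_pools) / sizeof(demo_pools[0]))

/* Serve a request from the first (smallest) pool that fits and still has
 * a free buffer. Returns the pool index, or -1 for "out of buffers". */
static int demo_getbuf(size_t size)
{
    assert(size > 0);
    for (size_t i = 0; i < DEMO_POOL_COUNT; i++) {
        if (demo_pools[i].buf_size >= size &&
            demo_pools[i].in_use < demo_pools[i].buf_count) {
            demo_pools[i].in_use++;
            return (int)i;
        }
    }
    return -1;
}
```

With these numbers, the first four 40-byte requests are served from the 64-byte pool and the fifth falls through to a 360-byte pool, which is one way a burst of small allocations could end up draining the larger pools as well.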

0 Likes
Anonymous
Not applicable

Changing the buffer pool doesn't help.

I increased the buffer pool count even to 100, but it doesn't help at all.

I did find a change that helps somewhat but does not solve the whole problem:

changing the scanning interval/window from the default 96/48 to 160/48.

But I need to know the real solution, how to detect the exception, and how to recover.

Has anyone else suffered from this? (on SDK 3.7.0)
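
For reference, the interval/window change above can be expressed as a scan duty cycle. This small sketch assumes the values are in the standard Bluetooth LE unit of 0.625 ms per slot (the post itself does not state the unit); slots_to_us and scan_duty_percent are illustrative helper names, not SDK functions.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a scan interval or window from 0.625 ms slots to microseconds. */
static uint32_t slots_to_us(uint16_t slots)
{
    return (uint32_t)slots * 625u;
}

/* Share of each scan interval spent listening, as a whole percentage. */
static uint32_t scan_duty_percent(uint16_t interval, uint16_t window)
{
    assert(interval != 0 && window <= interval);
    return (uint32_t)window * 100u / interval;
}
```

Under that assumption, 96/48 is a 60 ms interval with a 30 ms window (50% duty), while 160/48 is a 100 ms interval with the same 30 ms window (30% duty). The controller then spends less time receiving advertisements, which may be why the change reduces the exception without eliminating it.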

0 Likes

mwf_mmfae

In https://community.cypress.com/message/27247#27247

I asked for the SDK to provide an API that shows run-time statistics for buffer usage.

Can you confirm whether the next SDK will expose such an API?

It would help developers diagnose issues.

Otherwise, people just blindly copy and paste and guess the settings.

0 Likes
Anonymous
Not applicable

Yes, but there is no promising solution yet.

And this is critical, because just a few BLE peripherals nearby can kill the device.

0 Likes
Anonymous
Not applicable

I can also easily trigger the getbuf: out of buffers exception by using:

result = wiced_bt_ble_observe(WICED_TRUE, 30, simple_ble_reader_scan_results_cb);

I suspect there is something wrong in the 3.7.0 Bluetooth buffer memory management.

It doesn't scale at all and runs out of buffers for no apparent reason.

0 Likes
Anonymous
Not applicable

I tried it and it doesn't help at all.

I even increased the buffer pool to 100, and nothing helps.

Please review the implementation: how can the large buffer have any impact on wiced_bt_ble_observe()?

(This API shouldn't require a large buffer, should it?)

I don't think this is a buffer allocation problem. I assume it's a memory leak.

0 Likes
Anonymous
Not applicable

The API call

result = wiced_bt_ble_observe(WICED_TRUE, 30, simple_ble_reader_scan_results_cb);

worked on the 3.5.2 SDK (though it was not exposed), but it does not work on 3.7.0.

Our partner also reports that demo.bt_smartbridge doesn't trigger the getbuf: out of buffers exception without

changing wiced_bt_cfg_buf_pools.

Please take this regression seriously.

0 Likes
Anonymous
Not applicable

A hint:

It seems the getbuf: out of buffers exception is easily triggered when this is defined:

#define WPRINT_ENABLE_LIB_DEBUG

Any idea?

0 Likes

Just guessing:

The bootloader-related code warns that "printf" needs 4 kB of memory.

Maybe there is some "printf" in (deep) callbacks that are executed in the BT stack and cause an overflow?

AFAIK the BT task's stack is ~6 kB.

0 Likes

mwf_mmfae

I think I have figured out the GKI_exception 65524 getbuf: out of buffers issue.

In a heavily loaded system, the code that handles the BT reader report

may take longer to process the data.

So when bt_reader_report_cb is slightly blocked,

it's possible to run into "GKI_exception 65524 getbuf: out of buffers".

This can be simulated by adding a small delay (e.g. 30ms) in bt_reader_report_cb().
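
The effect described above can be illustrated with a small host-side model. It is a sketch under stated assumptions, not the BT stack's real behaviour: one buffer is taken per advertising report, reports arrive at a fixed period, and the callback frees the buffer only after it finishes. Only the 30 ms delay comes from this post; the function name, the 20 ms arrival period and the pool of 4 buffers used in the note below are made up for illustration.

```c
#include <assert.h>

/* Simulate a fixed-size buffer pool fed by periodic reports and drained by
 * a single callback that takes service_ms per report (1 ms time steps).
 * Returns the time in ms of the first failed allocation, or -1 if the pool
 * never runs out within duration_ms. */
static int first_exhaustion_ms(int pool_count, int arrival_ms,
                               int service_ms, int duration_ms)
{
    assert(pool_count > 0 && arrival_ms > 0 && service_ms > 0);

    int held = 0;       /* buffers waiting for or inside the callback    */
    int busy_left = 0;  /* ms until the callback finishes current report */

    for (int t = 0; t < duration_ms; t++) {
        /* The callback returns: its buffer goes back to the pool. */
        if (busy_left > 0 && --busy_left == 0) {
            held--;
        }
        /* A new report arrives and needs a buffer. */
        if (t % arrival_ms == 0) {
            if (held == pool_count) {
                return t;           /* "getbuf: out of buffers" */
            }
            held++;
        }
        /* If the callback is idle and work is pending, start the next one. */
        if (busy_left == 0 && held > 0) {
            busy_left = service_ms;
        }
    }
    return -1;
}
```

In this model, first_exhaustion_ms(4, 20, 30, 1000) returns 200, while first_exhaustion_ms(4, 20, 10, 1000) returns -1: as soon as the callback is slower than the report rate, any finite pool eventually runs dry, and a larger pool only postpones the failure.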

However, the problem is that once the device runs into the "GKI_exception 65524 getbuf: out of buffers"

case, it no longer works at all. THIS IS A BUG!

Even after hitting out of buffers, the BT stack should keep working once a

buffer becomes available.

Please consider fixing it in an upcoming release.

I think it's not helpful.

Developers do not have an API to show run-time statistics for buffer usage.

So when developers hit errors like this one, there is no way to know what's going on.

I'm sure you have an internal API to show the run-time statistics.

Please consider exporting it. At the very least, it would help people report issues with a better description of the symptom.

This is the third time I have asked this. Either yes or no, please don't ignore my question.

0 Likes
Anonymous
Not applicable

axel.lin mwf_mmfae

I've created an internal JIRA to track axel.lin's request for providing an API for GKI buffer usage statistics.

dkumar wrote:

axel.lin mwf_mmfae

I've created an internal JIRA to track axel.lin's request for providing an API for GKI buffer usage statistics.

dkumar​,

Thank you.

Please also make sure the device can still work after hitting the "out of space" case, as I mentioned in this thread.

If it's possible to hit "out of space" at run time (depending on load/environment), it should work again once a buffer is available.

Axel

0 Likes

Hi dkumar​,

Just FYI:

I occasionally run into the issue below.

This is the first time I have seen "Send - Buffer corrupted".

Tested on sdk-3.7.0.

00:00:35.009392 GKI_create_task func=0x803d139  id=1  name=BTU  stack=0x0  stackSize=6144

00:00:35.018392 GKI_create_task func=0x803e659  id=0  name=HCISU  stack=0x0  stackSize=4096

00:00:35.060856 GKI_exception(): Task State Table

00:00:35.064856 GKI_exception 65535 Send - Buffer corrupted

00:00:39.060856 GKI_exception(): Task State Table

00:00:39.064856 GKI_exception 65527 getbuf: Size is too big

00:00:39.005320 GKI_exception(): Task State Table

00:00:39.009320 GKI_exception 65535 Send - Buffer corrupted

0 Likes

dkumar wrote:

axel.lin mwf_mmfae

I've created an internal JIRA to track axel.lin's request for providing an API for GKI buffer usage statistics.

Is this API (to check GKI buffer usage statistics) available on sdk-3.7.0-7?

Can you point out the function name?

0 Likes

Hi Axel,

Just to clarify, currently there is no API to check GKI buffer usage statistics. Dharam is working on creating an API for this and it will be included in the next patch of the SDK. Thank you for your suggestions.

Thanks,

Jaeyoung

jaeyoung wrote:

Hi Axel,

Just to clarify, currently there is no API to check GKI buffer usage statistics. Dharam is working on creating an API for this and it will be included in the next patch of the SDK. Thank you for your suggestions.

Thanks,

Jaeyoung

In a previous discussion (see https://community.cypress.com/thread/6755),

dkumar replied on Jul 13, 2016 7:57 PM:

We do have such API to show the run-time statistics for buffer usage and I agree that it would make developer's life bit easier while debugging Bluetooth issues - but unfortunately as of now it is not exposed.

So I suppose it's easy to export the API to users.

I have no idea why it takes so long.

Ah I see this has been brought up before. We apologize for the delay, perhaps it was not as prioritized before. I just took a look at our issue tracking system and it is currently in active progress. I promise it will be included in the next patch. Thank you for your patience.

Thanks,

Jaeyoung

Please make sure it's in the next 3.7.x series release.

I use FreeRTOS, so I cannot use the 4.x SDK.

0 Likes

Of course, we will make sure the patch is included in the 3.7.x series so you can continue using it. We will update you when we have the patch.

jayi wrote:

Ah I see this has been brought up before. We apologize for the delay, perhaps it was not as prioritized before. I just took a look at our issue tracking system and it is currently in active progress. I promise it will be included in the next patch. Thank you for your patience.

Thanks,

Jaeyoung

jayi

So what is the function that shows the run-time statistics for buffer usage?

I can't seem to find such a function, even in Studio v4.1.0.

0 Likes

Hi Axel,

It is included in Studio 4.1, the function name is wiced_bt_print_cfg_buf_pool_stats(void);

You can find the definition in <SDK>/libraries/drivers/bluetooth/include/wiced_bt_cfg.h

Thanks,

Jaeyoung

jayi wrote:

Hi Axel,

It is included in Studio 4.1, the function name is wiced_bt_print_cfg_buf_pool_stats(void);

You can find the definition in <SDK>/libraries/drivers/bluetooth/include/wiced_bt_cfg.h

Thanks,

Jaeyoung

Thanks,

It's still not available for sdk-3.x. (I have checked the BTE library recently updated by rash, and it does not have this function.)

So I have to wait for another sdk-3.x update.

Will the next sdk-3.x update be available soon?

mifo

jayi

The fact is that I cannot use wiced_bt_print_cfg_buf_pool_stats() because it is

not available in sdk-3.x, and I haven't received any update for sdk-3.x.

Now I don't even know how to debug the BT issue.

P.S. I was the first one to ask for this API, yet I never get a chance to use it

because I'm using FreeRTOS?

0 Likes

At this point, I don't think we are still considering another release of 3.7.X

We are planning on supporting FreeRTOS again in WICED Studio 5 (end of April/early May?).

0 Likes

mifo wrote:

At this point, I don't think we are still considering another release of 3.7.X

We are planning on supporting FreeRTOS again in WICED Studio 5 (end of April/early May?).

Two things:

1. I still have problems with your BT library, as you can see in other threads.

However, I don't know how to debug them, and I haven't received a useful response from your team.

I don't think it's fine to leave it as is until May.

2. I need to make sure my existing product (already released using sdk-3.7.0-7) can move to WICED Studio 5.

Can you guarantee that at least OTA will work?

0 Likes

mifo wrote:

At this point, I don't think we are still considering another release of 3.7.X

We are planning on supporting FreeRTOS again in WICED Studio 5 (end of April/early May?).

On second thought, I'm very worried about there being no more releases for 3.x.

It means you don't maintain bug fixes for older SDKs.

This is a problem, especially if the bug is in the binary libraries.

0 Likes

jayi wrote:

Hi Axel,

It is included in Studio 4.1, the function name is wiced_bt_print_cfg_buf_pool_stats(void);

You can find the definition in <SDK>/libraries/drivers/bluetooth/include/wiced_bt_cfg.h

Thanks,

Jaeyoung

Below are the stats if I set buf_count to 0 for the "large" and "extra large" buffers:

(I only use BLE scan, so it should not use the large and extra large buffers.)

00:01:44.060248 --- Bluetooth(Pool type: A-App, I-Internal) Buffer summary ---
00:01:44.001712 Pool(size,type)   Available   In-use    Total   Max-used
00:01:44.011712 --------------------------------------------------------
00:01:44.019712 00(  64, A):          16,       0,      16,       5
00:01:44.025712 01( 360, A):          16,       0,      16,       3
00:01:44.031712 00(41903, A):           0,       0,       0,       0
00:01:44.038712 --(54304, I):           0,       0,       0,       0

See the Pool(size,type) field; it is totally wrong.

In addition, it would be helpful if you provided an API that returns the Max-used number

rather than "printing" it. My program could detect a problem if the Max-used number looks wrong

(i.e. I could monitor the Max-used counters programmatically),

but if you only "print" it, there is nothing I can do at run time to detect that something is wrong.
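
As a stopgap, a summary row in the format shown above can be turned back into numbers by a tool that reads the captured log. This is only a host-side sketch: pool_stats_row_t and parse_pool_stats_row are hypothetical names, the field layout is inferred from the log lines in this thread rather than from any documented format, and it does not help on a device where STDIO is disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* One row of the printed buffer summary. Field names are illustrative. */
typedef struct {
    int  index;      /* pool number, e.g. 00, 01                    */
    int  size;       /* buf_size as printed                         */
    char type;       /* 'A' = App, 'I' = Internal (per the header)  */
    int  available;
    int  in_use;
    int  total;
    int  max_used;
} pool_stats_row_t;

/* Parse a line such as
 *   "00:01:44.019712 00(  64, A):          16,       0,      16,       5"
 * The leading timestamp token is skipped. Returns true only if all seven
 * fields were read; header lines and rows whose index is printed as "--"
 * are rejected. */
static bool parse_pool_stats_row(const char *line, pool_stats_row_t *row)
{
    assert(line != NULL && row != NULL);
    int n = sscanf(line, "%*s %d ( %d , %c ) : %d , %d , %d , %d",
                   &row->index, &row->size, &row->type,
                   &row->available, &row->in_use,
                   &row->total, &row->max_used);
    return n == 7;
}
```

A monitoring script could then compare max_used against total for each pool and raise an alert when a pool gets close to its limit.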

Sigh, why don't Cypress developers respond? At least let me know you understand my point.

0 Likes

This problem is still in sdk-5.0:

00:01:03.024464 --- Bluetooth(Pool type: A-App, I-Internal) Buffer summary ---
00:01:03.031464 Pool(size,type) Available In-use Total Max-used
00:01:03.038464 --------------------------------------------------------
00:01:03.045464 00( 64, A): 10, 6, 16, 6
00:01:03.051464 01( 360, A): 16, 0, 16, 0
00:01:03.057464 02( 360, A): 6, 0, 6, 0
00:01:03.063464 --(42405, I): 0, 0, 0, 0

^^^^^^^^^^^^^^^^^^^^^^^^

0 Likes
Anonymous
Not applicable

Hi axel.lin_1746341

The logs you've posted here do not necessarily indicate a problem with the buffers.

00:01:03.063464 --(42405, I): 0, 0, 0, 0

This line just shows that you tried to print the size/type of a buffer pool that has not been allocated.
If, during application initialisation, you initialise a buffer pool (in the buffer pool array) with size = 0 and nr_of_buffers = 0, it will end up showing like this.

To fix this, change WICED_BT_CFG_NUM_BUF_POOLS to match your application.

Then initialise the buffer pool array (wiced_bt_cfg_buf_pool_t) in your application to match the WICED_BT_CFG_NUM_BUF_POOLS you set earlier. By default, WICED_BT_CFG_NUM_BUF_POOLS is set to 4.
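
A minimal sketch of keeping the pool count and the pool array in step, as described above. The type and macro names here (demo_buf_pool_t, DEMO_NUM_BUF_POOLS) are stand-ins so that the fragment compiles on its own; in a real application the corresponding names are wiced_bt_cfg_buf_pool_t and WICED_BT_CFG_NUM_BUF_POOLS from the SDK headers. The sizes and counts are taken from the sdk-5.0 summary earlier in the thread, and whether three pools are sufficient for a given application is an assumption that needs to be verified.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the SDK's pool descriptor: { buf_size, buf_count }. */
typedef struct {
    uint16_t buf_size;
    uint16_t buf_count;
} demo_buf_pool_t;

/* Stand-in for WICED_BT_CFG_NUM_BUF_POOLS, reduced from the default of 4. */
#define DEMO_NUM_BUF_POOLS 3

/* The array is left unsized on purpose, so the check below compares the
 * number of initialisers actually written with the configured count. */
static const demo_buf_pool_t demo_buf_pools[] = {
    /* { buf_size, buf_count } */
    {  64, 16 },   /* small  */
    { 360, 16 },   /* medium */
    { 360,  6 },   /* large  */
};

/* Fail the build if the pool count and the array ever disagree. */
_Static_assert(sizeof(demo_buf_pools) / sizeof(demo_buf_pools[0]) == DEMO_NUM_BUF_POOLS,
               "pool array length must match the configured pool count");
```

Declaring the array with an explicit size instead would silently zero-fill any missing entries, which is the kind of unallocated pool described above.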

Hope it helps.

Hi dhak

Your reply helps, but it's not a trivial change. (You should improve the documentation.)

The comment only says it's fine to set the extra-large buf_count to 0.

Nothing mentions that WICED_BT_CFG_NUM_BUF_POOLS needs to change.

Can you provide an API that returns the max-used count instead of printing it?

Returning a value would let my program monitor the buffer usage.

(Printing does not help much.)

I have one more question about BLE scan / BLE observe:

My observation is that it usually does not use the "large size buffer", but sometimes

the "large size buffer" max-used count becomes 1 and then the scan silently stops.

I just want to know whether it is *abnormal* for BLE scan to use the "large size buffer".

The behavior looks like a bug.

If I could get the max-used count, at least I could detect the above issue.
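
Until such an API exists, one application-side way to notice that a scan has silently stopped is to track when the last scan result arrived. This is a generic watchdog sketch, not something provided by the SDK or confirmed in this thread: scan_watchdog_t and its functions are hypothetical, the timeout is arbitrary, and what to do on a stall (for example, restarting the scan) is left to the application.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Tracks the time of the most recent scan result. */
typedef struct {
    uint32_t last_result_ms;  /* tick value when a result was last seen */
    uint32_t timeout_ms;      /* silence longer than this means stalled */
} scan_watchdog_t;

static void scan_watchdog_init(scan_watchdog_t *w, uint32_t now_ms, uint32_t timeout_ms)
{
    assert(w != NULL);
    w->last_result_ms = now_ms;
    w->timeout_ms = timeout_ms;
}

/* Call this from the scan result callback. It only stores a timestamp,
 * so it adds almost no time to the callback. */
static void scan_watchdog_feed(scan_watchdog_t *w, uint32_t now_ms)
{
    assert(w != NULL);
    w->last_result_ms = now_ms;
}

/* Call this periodically from application context. Unsigned subtraction
 * keeps the comparison correct when the millisecond tick wraps around. */
static bool scan_watchdog_stalled(const scan_watchdog_t *w, uint32_t now_ms)
{
    assert(w != NULL);
    return (uint32_t)(now_ms - w->last_result_ms) >= w->timeout_ms;
}
```

This only works in an environment where advertisements are expected regularly; in a quiet environment a long silence is normal and the timeout would need to be chosen accordingly.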

0 Likes

dhak

I removed the "extra_large" and "large" buf_size buffers.

I only use the "small" and "medium" buf_size pools for BLE scan.

After running for a while, I got the error below.

01:03:50.057496 --- Bluetooth(Pool type: A-App, I-Internal) Buffer summary ---
01:03:50.000960 Pool(size,type)   Available   In-use    Total   Max-used
01:03:50.014960 --------------------------------------------------------
01:03:51.005000 00(  64, A):          16,       0,      16,       5
01:03:51.014000 01( 360, A):          16,       0,      16,       3
01:04:41.004248 GKI_exception(): Task State Table
01:04:41.013248 GKI_exception 65523 getpoolbuf bad pool

Why did I get a "bad pool" exception?

Does that mean BLE scan/observe needs the "large buffer"?

Can you confirm? It's your internal implementation, so there is no way I can confirm this behavior myself.

0 Likes

dhak

Now I'm pretty sure BLE scan/observe also uses the "large buf_size" pool.

The BLE scan/observe stops every time the Max-used count for "large buf_size" becomes 1.

This looks like a bug to me. Can you explain what is happening?

0 Likes

dhak

If I provide the "large buffer", it does not hit the GKI_exception, but the scan

no longer receives any results, without any error, once Max-used for the large buffer

becomes 1, i.e. the device silently stops working.

I'm fine with you keeping the internals of your closed-source library secret.

But I need your suggestion on how to address this issue.

It looks like a bug in the closed-source library, and I reported it more than 2 months ago.

0 Likes

dhak mifo

I have been waiting for your reply for 3 weeks.

(And I reported this issue 3 months ago.)

The question is pretty simple.

In what kind of situation will a wiced_bt_ble_observe() call use a large buffer?

I want to point out that it looks like a bug, because the scan stops working if a large buffer is used by

a wiced_bt_ble_observe() call.

Sigh, it seems hard to get a response from Cypress. 😞

0 Likes

mady

As I told you, using "print" to get run-time Bluetooth buffer usage statistics

does not help if STDIO is disabled.

In addition, you also found that calling wiced_bt_print_cfg_buf_pool_stats() has

some side effects.

Is there any chance of providing an API that just returns the buffer usage statistics

rather than printing them?

P.S. I reported this quite a long time ago but somehow got no response.

0 Likes