sflash driver paging is broken, particularly on Macronix sflash (WICED 6.0.1)

Anonymous
Not applicable

When writing to an arbitrary location in sflash, the process is very slow on most sflash chips, and on some chips the data written will be corrupted. This can be fixed by modifying the WICED 6.0.1 sflash driver, where the paging used by sflash_write() is broken.

sflash_write() uses the 0x02 (Page Program) command to write a series of bytes to sflash. On some legacy chips, 0x02 is actually only a Byte Program command that can write a single byte. By default the sflash driver assumes that only one byte may be written per 0x02 command, which effectively disables paging by using a page size of 1 byte. The driver sends a command to write one byte, waits for that write to complete, and then sends another 0x02 command for the next byte. This makes sflash writes very slow, although the data does eventually get written to the sflash correctly. For example, saving a 600,000 byte software image with the download_apps build option takes much too long.
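To put a rough number on the slowdown, here is a minimal sketch that counts how many 0x02 commands a write needs for a given maximum write size. The function name is hypothetical and not part of the WICED driver, and the count assumes the write starts on a page boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of 0x02 program commands needed to write
 * `size` bytes when each command carries at most `max_write_size` bytes.
 * Assumes the start address is page aligned. */
size_t example_program_command_count(size_t size, size_t max_write_size)
{
    assert(max_write_size > 0);
    /* Round up so a partial final chunk still counts as one command. */
    return (size + max_write_size - 1) / max_write_size;
}
```

For the 600,000 byte image, a 1-byte write size means 600,000 separate commands, each followed by a wait for completion, while a 256-byte page needs 2,344.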

For Macronix chips the driver uses a page size of 128. A note in the code says that this should be 256, but that didn't work, so 128 is used instead:

#ifdef SFLASH_SUPPORT_MACRONIX_PARTS
    if ( SFLASH_MANUFACTURER( handle->device_id ) == SFLASH_MANUFACTURER_MACRONIX )
    {
        max_write_size = (unsigned int) 128;  /* TODO: this should be 256, but that causes write errors */
        enable_before_every_write = 1;
    }
#endif /* ifdef SFLASH_SUPPORT_MACRONIX_PARTS */
...
    while ( size > 0 )
    {
        ...
        write_size = ( size > max_write_size )? max_write_size : size;
        ...
        status = generic_sflash_command( handle, SFLASH_WRITE, (unsigned long) 3, curr_device_address, (unsigned long) write_size, data_addr_ptr, NULL );
        ...
        data_addr_ptr += write_size;
        device_address += write_size;
        size -= write_size;
    }

The problem with paging in this driver is that it doesn't align the data to be written on 256-byte page boundaries. The code above always writes the first 128 bytes with a single SFLASH_WRITE (0x02) command, even if those 128 bytes span two 256-byte pages. On these chips a Page Program that runs past the end of a page typically wraps around to the start of the same page instead of continuing into the next one. This means that if you pick a random location in sflash to write to, the command will only work about half the time; the other half of the time you end up with corrupted data. The simple-minded fix is to disable paging for Macronix as well as all other chips by setting max_write_size = 1 for everything. The better approach is to align the data and fix the paging for all chips.
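The alignment idea can be sketched as follows. This is not the attached patch, only a minimal illustration under stated assumptions: every `example_` name is hypothetical, the callback stands in for one SFLASH_WRITE (0x02) command, and the page size is passed in by the caller rather than read from the chip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical log of the commands a write was split into. */
struct example_log {
    size_t count;
    size_t lens[8];
};

/* Hypothetical stand-in for one SFLASH_WRITE (0x02) command: it only
 * records the chunk length so the split can be inspected. */
int example_log_program(uint32_t address, const uint8_t *data, size_t len, void *ctx)
{
    struct example_log *log = (struct example_log *) ctx;
    (void) address;
    (void) data;
    if (log->count >= sizeof(log->lens) / sizeof(log->lens[0])) {
        return -1; /* log full */
    }
    log->lens[log->count++] = len;
    return 0;
}

/* Largest chunk starting at `address` that stays inside a single page. */
size_t example_chunk_size(uint32_t address, size_t remaining, size_t page_size)
{
    assert(page_size > 0);
    size_t room = page_size - (address % page_size);
    return (remaining < room) ? remaining : room;
}

/* Split a write so that no single program command crosses a page boundary. */
int example_paged_write(uint32_t address, const uint8_t *data, size_t size,
                        size_t page_size,
                        int (*program)(uint32_t, const uint8_t *, size_t, void *),
                        void *ctx)
{
    while (size > 0) {
        size_t n = example_chunk_size(address, size, page_size);
        int status = program(address, data, n, ctx);
        if (status != 0) {
            return status;
        }
        data    += n;
        address += (uint32_t) n;
        size    -= n;
    }
    return 0;
}
```

With a 256-byte page, a 300-byte write starting at 0x1F0 is split into chunks of 16, 256, and 28 bytes, so only the first and last commands are short and none spans two pages.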

I went through the list of chips in the sflash_device_id_size[] array. These are the only chips for which sflash_get_size() will return a non-zero size. For the SFLASH_WRITE (0x02) command they all have a page size of 256 bytes, with two exceptions. SST25VF080B is a legacy chip that only supports writing a single byte with this command. CY15B104Q is a new chip that does not have a page buffer at all, so a single SFLASH_WRITE (0x02) command can be used to write the entire contents of the chip. I decided that the best approach would be to use a 256-byte page size as the default and include special logic that disables paging on the one chip that doesn't support it. This means that if someone tries to write to a legacy chip that has not been added to the sflash_device_id_size[] array and the chip only supports writing a single byte, sflash_write() will write corrupted data. For all other chips, including the CY15B104Q, sflash_write() should work.
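The per-chip selection described above might look roughly like this. It is a sketch, not the code from the attached patch: the function and table names are hypothetical, and the SST25VF080B ID value (0xBF258E) is an assumption that should be checked against the datasheet or the SDK header.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_DEFAULT_PAGE_SIZE 256u

/* Assumed JEDEC ID for the SST25VF080B; verify before relying on it. */
#define EXAMPLE_ID_SST25VF080B 0xBF258Eu

/* Hypothetical list of chips where 0x02 programs only a single byte.
 * Any other legacy byte-program-only chip would need to be added here. */
static const uint32_t example_byte_program_only_ids[] = {
    EXAMPLE_ID_SST25VF080B,
};

/* Returns the largest number of bytes to send with one 0x02 command. */
size_t example_max_write_size(uint32_t device_id)
{
    size_t n = sizeof(example_byte_program_only_ids) / sizeof(example_byte_program_only_ids[0]);
    for (size_t i = 0; i < n; i++) {
        if (device_id == example_byte_program_only_ids[i]) {
            return 1; /* paging disabled for this chip */
        }
    }
    return EXAMPLE_DEFAULT_PAGE_SIZE;
}
```

Defaulting to 256 keeps the common case fast; the cost is that an unlisted byte-program-only chip gets corrupted writes until it is added to the list.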

I attached a patch that replaces WICED-Studio-6.0.1/43xxx_Wi-Fi/libraries/drivers/spi_flash/spi_flash.c.

This should fix the problem as long as you add any legacy chips that can only write a single byte to the list of chips where paging is disabled.

1 Solution
VinayakS_26
Moderator

Hi,

The fix works. An internal ticket is already created on this issue.

-Thanks

Vinayak


3 Replies

@VinayakS_26 wrote:

The fix works. An internal ticket is already created on this issue.

Vinayak


But WICED-6.6.1 does not include the fix.
People who download the latest SDK still need to search the forum and fix things manually.

CrDe_3090586
Level 1

I wanted to share my experience with a more recent Macronix chip, the MX25U1633FM2I. At least I assume it's more recent, as the datasheet copyright is 2019. I'm no expert in sflash, so any suggestions are appreciated.

Note that the MX25U1633FM2I is a 2 MB (16 Mbit) flash chip. When queried by WICED Studio to determine the sflash chip, it resolves to an MX25L1606E, 0xc22015 (same 2 MB size).

With the code as provided in WICED Studio 6.4 (shown below), the write failed at about offset 0x400 into sflash memory (physical address 0x5c400). That is, with this code:

max_write_size = (unsigned int) 128; /* TODO: this should be 256, but that causes write errors */

Further testing with a 256-byte page also failed at about offset 0x400:

max_write_size = (unsigned int) 256; /* TODO: this should be 256, but that causes write errors */

With a 1-byte page, the corruption was gone completely and the write succeeded:

max_write_size = (unsigned int) 1; /* TODO: this should be 256, but that causes write errors */

I've now tested this about 20 times. It works every time with max_write_size = 1, and it failed every time with either 128 or 256.

I didn't have time to debug this issue completely, but I debugged enough to be certain that the OTA data being written to flash (from the buffer read from the server) is correct.

Note that the screenshot shows the corruption at 0x400. The left side of the diff is the *correct* one; the right side is the corrupted dump file.
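For anyone who wants to locate this kind of corruption without diffing dump files by hand, a read-back comparison can be sketched like this. The function name is hypothetical, and it assumes both the expected data and the data read back from sflash are already in RAM buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the offset of the first byte that differs
 * between the expected data and the data read back from sflash, or `len`
 * if the two buffers match. */
size_t example_first_mismatch(const uint8_t *expected, const uint8_t *actual, size_t len)
{
    assert(expected != NULL && actual != NULL);
    for (size_t i = 0; i < len; i++) {
        if (expected[i] != actual[i]) {
            return i;
        }
    }
    return len;
}
```

Checking whether the returned offset falls on a multiple of the write size or of the 256-byte page size may help narrow down whether the failure is related to page boundaries.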
