Unidentified Alarms in group 7 reported

User19868
Level 1
Hello, I am using a TC37xx and am having trouble finding the root cause of two alarms reported in group 7.
The first is ALM7[26], which is related to a multi-bit error in PFlash. I read the MBAB register and got the reported address, but there are two things I can't understand. First, the TC37 has two PFlash banks, PFI0 and PFI1, and the address is reported in the MBAB for PFI1, yet the reported address is 01FC080. Shouldn't the address space for PFI1 start from 2FFFFF? How does the MBAB for PFI1 hold an address from PFI0's space? Second, I put a breakpoint at that specific address hoping to find the root cause, but the breakpoint was never reached. What should I do to find the root cause and fix this?

The second is ALM7[17], which is related to an SRI bus DOM0 error. I traced the error back to the default slave: reading DOM0_PESTAT, I found that PESCI[15] reports an error, which traces to the default slave. I then read DOM0_ERR15 and DOM0_ERRADDR. ERRADDR shows address 0x00000000, so what could that be? A null pointer access? And which module exactly is the default slave? I can't seem to find that in the user manual or the appendix.

This is how far I got tracing both alarms. I would really appreciate any push toward the root cause of both. Thank you!
NeMa_4793301
Level 6
The MBABRECORD0.ADDR value is a local offset:

6.7.3.2.4 PFI Uncorrected Multi Bits Address Buffer (MBAB)
When data is read from Flash NVM and the ECC decoder detects an uncorrected multi bit error (including all-0 and all-1 errors) then the local address is stored in the MBAB. Each local address is only entered once, and covers 256 bits of data. The bottom five reserved bits of the MBAB read as 0, and can be concatenated with the local address to give the local address as seen by the system (without the base offset).

So, you need to add the base address of PF1, which is 0xA0300000: A0300000 + 1FC080 = A04FC080. Is there something valid at that address?

An uncorrectable PFLASH error will also cause an SRI bus error. Clear up your PFLASH error and I'll bet ALM7[17] will go away. If not, let us know.
Darren_Galpin
Employee
Another question regarding the PFlash access - are you using Software Over The Air (SOTA)? This can change which PFlash reports an error based on the address seen as well.

The default slave is located in the XBAR. It responds to all addresses not mapped to any other slave, and also handles the register accesses to the XBAR, i.e. when you read DOM0_PESTAT. Your error capture in the default slave means that it observed an access to address 0x0000_0000 by something - the contents of ERR15 should be able to tell you what was accessing it (from the tag) and whether it was a read or a write.
User19868
Level 1
UC_wrangler wrote:
The MBABRECORD0.ADDR value is a local offset:

So, you need to add the base address of PF1, which is 0xA0300000: A0300000 + 1FC080 = A04FC080. Is there something valid at that address?

An uncorrectable PFLASH error will also cause an SRI bus error. Clear up your PFLASH error and I'll bet ALM7[17] will go away. If not, let us know.


It seems I am using an older version of the user manual; the part about the base offset and local address is not in the version I am using. That explains it, though: concatenating the reserved bits with the address field gives the local offset without the base. MBABRECORD0.ADDR holds FE04; after appending the 5 reserved bits it becomes 1FC080, and adding the base as you did above gives A04FC080. I think that is definitely worth a try.
I will check that and get back to you. Thank you!
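The arithmetic in this post can be checked with a few lines of Python. This is only a sketch: the helper name `mbab_to_system_address` is made up for illustration, the ADDR value FE04 and the PF1 base 0xA0300000 are taken from this thread, and the shift by five follows the manual excerpt quoted above (five reserved low bits, one entry per 256 bits of data).

```python
# Sketch: rebuild a system address from an MBAB record.
# The helper name is hypothetical; the numbers come from this thread.

PF1_BASE = 0xA0300000   # PF1 base address as stated earlier in the thread
RESERVED_LOW_BITS = 5   # bottom five reserved bits read as 0 (256 bits = 32 bytes)


def mbab_to_system_address(addr_field: int, bank_base: int) -> int:
    """Append the five zero bits to the ADDR field, then add the bank base."""
    local_offset = addr_field << RESERVED_LOW_BITS
    return bank_base + local_offset


local = 0xFE04 << RESERVED_LOW_BITS              # local offset without the base
system = mbab_to_system_address(0xFE04, PF1_BASE)
print(hex(local), hex(system))
```

For a different bank, pass that bank's base address instead of `PF1_BASE`.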
User19868
Level 1
Darren Galpin wrote:
Another question regarding the PFlash access - are you using Software Over The Air (SOTA)? This can change which PFlash reports an error based on the address seen as well.

The default slave is located in the XBAR. It responds to all addresses not mapped to any other slave, and also handles the register accesses to the XBAR, i.e. when you read DOM0_PESTAT. Your error capture in the default slave means that it observed an access to address 0x0000_0000 by something - the contents of ERR15 should be able to tell you what was accessing it (from the tag) and whether it was a read or a write.


No, I am not using SOTA.
These results are from different runs:
DOM0_SCICTRL_15_ERR = 0x00072092, 0x0001E092, 0x0093A092, 0x00956092. In each case, however, DOM0_ERRADDR always holds 0x00000000.

So it seems the different runs all map to CPU0, from the tag ID 0b10 0000, and that it was a read transaction, from the 0b10 at the beginning of the PESTAT. That should mean a null pointer dereference took place, right? Is there a way to find out where in the code such a dereference happens?
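To compare the captured values side by side, a small decoder like the one below can help. It is a sketch only: the field positions (bits 1:0 for the read/write indication, bits 7:4 for the opcode, bits 13:8 for the tag) are an assumption inferred from the four values in this post, not copied from the user manual, so check them against the register description for your device before relying on the result.

```python
# Sketch: split captured ERR values into fields.
# ASSUMED layout, inferred only from the values in this post:
#   bits 1:0   read/write indication
#   bits 7:4   opcode
#   bits 13:8  transaction tag
# Verify against the register description in your user manual.


def bits(value: int, high: int, low: int) -> int:
    """Return value[high:low] as an unsigned integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def decode_err(value: int) -> dict:
    """Extract the assumed fields from one captured ERR value."""
    return {
        "rd_wr": bits(value, 1, 0),
        "opc": bits(value, 7, 4),
        "tag": bits(value, 13, 8),
    }


captured = [0x00072092, 0x0001E092, 0x0093A092, 0x00956092]
decoded = [decode_err(v) for v in captured]
for value, fields in zip(captured, decoded):
    print(hex(value), {name: bin(field) for name, field in fields.items()})
```

Under that assumed layout, all four runs decode to the same low fields, and only the bits above the tag differ between runs.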
Darren_Galpin
Employee
Maybe there is other information in the ERR register which may give a hint - the opcode field for example. That would tell you the size of the transaction (i.e. word, half-word, byte, burst, ...), which may then point to the code doing the access. Other than that you will need a debugger, and that is going outside my range of help!
User19909
Level 3
UC_wrangler wrote:
The MBABRECORD0.ADDR value is a local offset:

So, you need to add the base address of PF1, which is 0xA0300000: A0300000 + 1FC080 = A04FC080. Is there something valid at that address?

An uncorrectable PFLASH error will also cause an SRI bus error. Clear up your PFLASH error and I'll bet ALM7[17] will go away. If not, let us know.


Would clearing the PFlash BAB clear the alarm?
User19868
Level 1
UC_wrangler wrote:
The MBABRECORD0.ADDR value is a local offset:

So, you need to add the base address of PF1, which is 0xA0300000: A0300000 + 1FC080 = A04FC080. Is there something valid at that address?

An uncorrectable PFLASH error will also cause an SRI bus error. Clear up your PFLASH error and I'll bet ALM7[17] will go away. If not, let us know.


There is actually no code at that address; it contains all zeroes. What's worth mentioning is that the same address reported in the MBAB is also reported in the ZBAB, and the address just after it is reported in the ZBAB as well. This consistently happens at the same address every time I run. If I reflash, it also happens consistently; the address may change, but the behaviour remains the same.
Darren_Galpin
Employee
Is the PFlash initialised or not? If you are reading a location to which no data has been written, then the ECC value will not match the address and it will error. Your code, I would guess, is in PFlash0, hence its ECC is correct, whereas nothing has been written to PFlash1?
User19868
Level 1
Darren Galpin wrote:
Is the PFlash initialised or not? If you are reading a location with no data being written, then the ECC value will not match the address and it will error. Your code I would guess is in PFlash0, hence is ECC correct, whereas nothing has been written to PFlash1?


- What do you mean by initialized, and how do I initialize it?
- Yes, I just checked, and it seems the code is in PFlash0. I agree that since nothing is on PFlash1, the ECC would not match the unknown contents. So should initializing solve this? If so, how can I initialize PFlash1?
NeMa_4793301
Level 6
If your application is reading from memory beyond what you've programmed, that sounds more like a software bug - I'd track that down rather than initialize memory that should otherwise remain unused.

When does this error occur? Could it be that you've got a DMA transaction that is accessing PF1? If it's a CPU access into PF1, I would expect that to cause a synchronous read trap, which should be visible in the debugger's call stack, just before the SMU alarm.