PSoC5LP - Tracking down the cause of entry into the IntDefaultHandler, errno = ENOMEM


KyTr_1955226
Level 6
250 sign-ins 10 likes given 50 solutions authored

Hey all,

I'm trying to track down the cause of some PSoC5LP firmware running into the default interrupt with errno==ENOMEM.

The only way I've been able to reproduce it is to power cycle the PSoC quickly enough after powerup to catch it, and even then I've only been able to get it to occur maybe once every 50 or so cycles.  This makes using the debugger to track it down basically impossible from what I can tell.  It almost seems like it might be occurring when the PSoC is undervolted due to very quick loss and re-application of main power, so some "undefined behavior" might be occurring.  I know PIC MCUs I've worked with have a "Brownout Reset" function to automatically put the MCU into reset when VDD falls beneath a certain configurable voltage.  Does PSoC5LP have anything comparable?

I am running some BIT tests on bootup in this firmware and *suspect* it occurs when I manage to power cycle/brown out the system while these tests are in progress.  Specifically, I am running an SRAM March Test, a Stack March Test, and a Flash ECC Test on bootup.  The code for all these tests was taken from the examples in AN78175 (I made a thread about it months ago, actually: https://community.infineon.com/t5/PSoC-5-3-1/PSoC5LP-Using-SRAM-test-functions-from-AN78175/m-p/2836...).

Right now when I manage to trigger the problem and enter the trap ISR for ENOMEM, I simply have it loop forever and blink an LED.  I wonder if it would be worth it to try and hunt the exact error cause down.  My other thought is that this occurs so infrequently, and might be unavoidable anyway, that I could just perform a software reset in the handler and call it a day?

I'd love a second or third opinion on maybe how I should track this down/handle it.

Thanks!

 

20 Replies
Len_CONSULTRON
Level 9
Beta tester 500 solutions authored 1000 replies posted

Kyle,

Can you get the condition while in the Debugger?

If Yes: You can trap on IntDefaultHandler and then look at the stack.   The stack should give you clues as to what was the last call in your application before this happens.

If No:  Get it to happen without the debugger, then perform a "Debug/Attach to Running Target..." with "Halt Target on Attach" selected.

If successful, it should be looping in IntDefaultHandler; check the stack for the last App code branch.

Len
"Engineering is an Art. The Art of Compromise."

Hi Len,

I didn't think I'd be able to get at it with the debugger attached, since it seems to be so dependent on VIN power being cycled, but connecting the debugger actually proved to make it much easier to replicate.

The main power to the system is 12V, but the actual VDD to the PSoC is controlled by an external signal.  I found that when I connect the debugger (which provides VDD (5V) on its own), then remove/reapply main power (either via the external signal, or just by cutting off the 12V VIN), the debugger continues running the PSoC from its own VDD, but it ends up in the ENOMEM trap every time main power is reapplied.

As far as what the debugger tells me when I break, the Call stack doesn't look like it gives me much to go on.  It simply jumps from wherever it is in the program into the IntDefaultHandler() when VDD returns from the external supply, where it falls into CyBoot_IntDefaultHandler_Enomem_Exception_Callback(), which contains a simple while loop with a couple CyDelay() calls to blink an LED.  The Call Stack window is below:

[Screenshot: Call Stack window, KyTr_1955226_0-1642106847468.png]

In this case I caught it while processing any potential logging that needs to occur, but I've caught it in other places as well.  In this case the line in question is just checking a global flag to see if logging is due to occur.  It doesn't seem to have any actual bearing on the reason for dropping into the IntDefaultHandler.

There are a number of things on the PCB that will lose power when the external 12V is shut off (that the debugger will not power when attached).  So I'm thinking one of these parts losing then regaining power while the PSoC has never stopped running is the root of the issue.  As to why in the world it would trigger an Out Of Memory error I'm still racking my brain trying to riddle out.  Maybe one of the RS232/RS422 transceivers...

It's progress I suppose, but I'll need to do some additional probing around and messing with it.  Thanks for the tip.

 


Kyle.

I believe I received this error when I was trying to print floating values with a sprintf() function.

For me to solve it, I needed to change the Linker to use the newlib-nano Float Formatting AND change the DWR/System/Heap Size from 0x80 to at least 0x200 bytes.

Is it possible that your logging function is exceeding the heap size and since there is no Int handling for heap overrun it is using the default?

When "<signal handler called>" appears before CyBoot_IntDefaultHandler_Enomem_Exception_Callback() in the call stack, it is usually due to a library function (such as the math lib) where some basic checks are performed.

For example, checks for heap or stack memory over/underruns and math "divide by zero" checks are performed.  Since it is an ENOMEM, I would suspect a heap or stack overrun.
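
As an illustration of the heap-overrun path described above, here is a minimal host-side sketch of an sbrk-style allocator that reports ENOMEM once a fixed heap is used up. It is not the PSoC Creator startup code: demo_sbrk, demo_heap and DEMO_HEAP_SIZE are hypothetical names, and the 0x80-byte pool is only chosen to mirror the default heap size mentioned above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HEAP_SIZE 0x80u  /* assumed: mirrors the 0x80-byte default heap */

static uint8_t demo_heap[DEMO_HEAP_SIZE];
static size_t  demo_heap_used = 0;

/* Hypothetical sbrk-style allocator: hands out bytes from a fixed pool and
 * sets errno to ENOMEM when the request does not fit. */
void *demo_sbrk(size_t nbytes)
{
    if (nbytes > (DEMO_HEAP_SIZE - demo_heap_used)) {
        errno = ENOMEM;
        return (void *)-1;   /* conventional sbrk failure value */
    }
    void *block = &demo_heap[demo_heap_used];
    demo_heap_used += nbytes;
    return block;
}
```

With a pool this small, a single formatted-float print that needs a few hundred bytes of scratch space would fail on its first allocation, which is why raising the heap size makes that particular symptom go away.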

Len
"Engineering is an Art. The Art of Compromise."

Len,

I've also come across this when using the float formatting without increasing the heap size.  I'm relatively certain it isn't the case here.  There is no logging being performed when the problem is reproduced (plus my heap size is set to 0x400 anyway).  Process_Logging() is entered, but if the logging global is not set, it never performs any logging.  This is the case here (no logging is being performed).  I've also seen call stacks when I reproduce this where it's in other functions (processing PWM, analogs, etc), so I don't think it's related to sprintf().

I did find something this morning though.  I suspected something with the RS232/422 transceivers so I decided to just completely disable the PSoC UARTs when a fault in power is detected (I have the fault pin of the VIN regulator tied to a PSoC interrupt).  This looks to completely solve the issue.  So I suspect that one (or more) of the UART Rx lines is changing state when the transceivers lose power.  This fires off the byte-received interrupts I have as well as some timer interrupts that fire when too much time has passed between received characters.  As to why this would trigger ENOMEM I still don't really know, but I've at least tracked down the module that's causing the problem.

Here's one of the UART Rx interrupts, I think I may have the problem spotted?

 

/** CH1 Receive ISR
 *  - Fires when character received on CH1 UART
 *  - Reads any received characters into next buffer location(s)
 *  - Message terminates at 7 received characters (1 packet)
 *  - Message copied to buffer for processing once 7 characters are received
 */
void CH1_RX_ISR_Interrupt_InterruptCallback (void){
    uint8_t c_in;
    uint8_t status = CH1_422_UART_ReadRxStatus();
    static char rs232_msgbuffer[BUFFER_DATA_SIZE];
    
    CH1_RX_ISR_ClearPending();
    
    if (((status & CH1_422_UART_RX_STS_STOP_ERROR) > 0) ||
        ((status & CH1_422_UART_RX_STS_OVERRUN) > 0) || 
        ((status & CH1_422_UART_RX_STS_PAR_ERROR) > 0)){
        CH1_UART_Err = true;
    }
        
    TMR_CH1RX_Start();  //Enable the timeout timer
        
    while (status == CH1_422_UART_RX_STS_FIFO_NOTEMPTY){
        
        c_in = CH1_422_UART_ReadRxData();

        if (CH1_UART_ByteCount < CTRL_MSG_SIZE){

            rs232_msgbuffer[CH1_UART_ByteCount++] = c_in;
            
            if (CH1_UART_ByteCount == CTRL_MSG_SIZE){
                /*7 bytes is complete control message*/
                memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,CTRL_MSG_SIZE);   
                memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
                CH1_UART_ByteCount = 0;
                TMR_CH1RX_Stop();   //Message done, disable timeout until next byte reception
                CH1_UART_MsgDone = true;
            }
            
            if (CH1_UART_ByteCount > CTRL_MSG_SIZE){
                memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,CTRL_MSG_SIZE);   
                memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
                CH1_UART_ByteCount = 0;
                CH1_UART_MsgDone = true;
                CH1_UART_Err = true;
            }

        } else {
            /*Extra Unexpected Characters?*/
            CH1_UART_MsgDone = true;
            CH1_UART_Err = true;
        }

        status = CH1_422_UART_ReadRxStatus();
    }
    
}

 

 

Note the while loop checking if the (status ==  CH1_422_UART_RX_STS_FIFO_NOTEMPTY).

When the problem occurs, status also has the STS_STOP_ERROR bit set, which will obviously skip this while loop.  In this case the internal receive buffer is not being cleared out.  Although, I am setting the error flag, which gets caught in main and UART_ClearRxBuffer(); is called for that respective UART.  I would think that would clear it up, but maybe I should try clearing the RxBuffer in the ISR rather than in the main loop.  I could also adjust the ISR code to specifically make sure UART_ReadRxData() is called until FIFO_NOTEMPTY is cleared.  Will keep messing with it, but I'm getting close. At the end of the day I could even just shut the UARTs completely down when VIN loss is detected and bring them back up once it has come back stable.
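
The difference between the two loop conditions can be shown with a small host-side sketch. The bit values below are assumptions for illustration (0x08 for the stop error matches the register value reported later in this thread; 0x20 for FIFO-not-empty is a guess), so check the generated UART header for the real masks. The DEMO_ names and both helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions for illustration; check the generated UART header. */
#define DEMO_RX_STS_STOP_ERROR    0x08u
#define DEMO_RX_STS_FIFO_NOTEMPTY 0x20u

/* Original test: true only when FIFO_NOTEMPTY is the sole bit set. */
bool fifo_has_data_eq(uint8_t status)
{
    return status == DEMO_RX_STS_FIFO_NOTEMPTY;
}

/* Masked test: true whenever FIFO_NOTEMPTY is set, whatever else is set. */
bool fifo_has_data_masked(uint8_t status)
{
    return (status & DEMO_RX_STS_FIFO_NOTEMPTY) != 0u;
}
```

With a status of 0x28 (data waiting plus a stop error) the equality test reports no data, so the FIFO is never drained, while the masked test still sees the byte.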

 


Kyle,

I'm still reviewing your code.  

I've made some observations about your code that I can't resolve due to the limited code you've provided.

Observation #1

You declare the var 

static char rs232_msgbuffer[BUFFER_DATA_SIZE];

You increment the byte count index.

rs232_msgbuffer[CH1_UART_ByteCount++];

 You test for the CH1_UART_ByteCount in two places.

if (CH1_UART_ByteCount == CTRL_MSG_SIZE){

and

if (CH1_UART_ByteCount > CTRL_MSG_SIZE){

In both cases you move 

memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,CTRL_MSG_SIZE);   

to CH1_UART_MsgIn.

Issue:  I have no info as to the #defines CTRL_MSG_SIZE (I assume it is 7), BUFFER_DATA_SIZE and the allocated size of CH1_UART_MsgIn[]

 

Observation #2

You have a potential infinite loop under many conditions.

    uint8_t status = CH1_422_UART_ReadRxStatus();
...
  
    while (status == CH1_422_UART_RX_STS_FIFO_NOTEMPTY)
    {
        status = CH1_422_UART_ReadRxStatus();  // This only updates status if NOT_EMPTY
    }

 You only update the var 'status' at the beginning of the function and IF the (status == CH1_422_UART_RX_STS_FIFO_NOTEMPTY).  If status is not this condition, the while() loop will not  be updated with the Rx Status.  (I confirmed this observation by counting your brackets '{' and '}'.)

 

Observation #3

You start a Rx timer before the while().

TMR_CH1RX_Start();  //Enable the timeout timer

 I see no indication how this timer can exit the while() loop or what it does in the code.

 

Observation #4

I see how you stop the Rx timer if the message size matches.

However, if the Rx message size is more than CTRL_MSG_SIZE, the timer is still running.(?)

 

Observation #5

Since the only control of incrementing CH1_UART_ByteCount is in the while() loop (at least for the code snip you provided), I see no need for 

if (CH1_UART_ByteCount > CTRL_MSG_SIZE){ ... }

since CH1_UART_ByteCount can not increment past

if (CH1_UART_ByteCount == CTRL_MSG_SIZE){...}

 

Len
"Engineering is an Art. The Art of Compromise."

#1) Yes, CTRL_MSG_SIZE is 7 (should always be the packet size from the connected device).  BUFFER_DATA_SIZE is 64 (The total size of my serial receive buffer).  CH1_UART_MsgIn and rs232_msgbuffer are both 64 bytes (BUFFER_DATA_SIZE).  I have a habit of double buffering serial inputs like this.  Wait for complete message, then copy over for processing by the main loop.

#2) Yes this is the big thing I noticed earlier when I double checked the code in this ISR.  I should probably adjust this to something like:

 

while ((status & CH1_422_UART_RX_STS_FIFO_NOTEMPTY) != 0)

 

In order to make sure I keep reading the characters in, even if other bits in status are set.  I want to always leave the ISR with the FIFO empty.

#3) I could see how this would be a little weird to look at.  TMR_CH1RX is a timer that I start in software upon receiving a character over UART.

[Image: KyTr_1955226_0-1642177050791.png]

The idea being that for every character received the timer will restart and if it elapses before I have 7 characters in the buffer, I know the message is incomplete and should be thrown out.  The device is sending data at a specified rate, so I need to have it cleared out for a potential next transmission.

#4) Yes, I should probably be stopping the timer in both of those cases.  Good catch, but as you say...

#5) Also correct.  This is a habit I have of always having something accounting for overrun, even if it doesn't really make sense in the flow of the code here.  That code should never be hit and is almost definitely unnecessary.
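
The inter-character timeout idea from #3 can be sketched on a host without any hardware. This version swaps the hardware timer for a millisecond timestamp passed in with each byte, so it is a model of the behaviour rather than the component setup used in the project. demo_rx_t, demo_rx_feed and the 5 ms DEMO_TIMEOUT_MS are hypothetical; only the 7-byte packet size comes from the thread.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MSG_SIZE    7u   /* packet size from the thread */
#define DEMO_TIMEOUT_MS  5u   /* hypothetical inter-character timeout */

typedef struct {
    uint8_t  buf[DEMO_MSG_SIZE];
    uint8_t  count;
    uint32_t last_ms;   /* arrival time of the most recent byte */
} demo_rx_t;

/* Feed one received byte with its arrival time. Returns true when a full
 * packet has been assembled and copied into out[]. */
bool demo_rx_feed(demo_rx_t *rx, uint8_t byte, uint32_t now_ms, uint8_t *out)
{
    /* A gap longer than the timeout means the previous packet was cut short,
     * so the partial data is thrown away. */
    if (rx->count > 0u && (now_ms - rx->last_ms) > DEMO_TIMEOUT_MS) {
        rx->count = 0u;
    }

    rx->buf[rx->count++] = byte;
    rx->last_ms = now_ms;

    if (rx->count == DEMO_MSG_SIZE) {
        memcpy(out, rx->buf, DEMO_MSG_SIZE);
        rx->count = 0u;
        return true;
    }
    return false;
}
```

Because the count is reset as soon as it reaches the packet size, it can never exceed DEMO_MSG_SIZE, which is the same reasoning behind #5.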


Kyle,

A little more research indicates that the address (0xFFFFFFF9) listed on the <signal handler called>() line is in the "Vendor Specific" section of the Cortex-M3 memory map in the PSoC5 TRM.

If true, then Infineon tech support needs to weigh in on this.  The "Vendor Specific" region is not usually documented.

Without this tech help, the closest you can come is to find the App code line that is being executed in the stack monitor before the <signal handler called>() line which is in main.c line 561.
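
One other possible reading of that value, offered as a pointer rather than a diagnosis: on ARMv7-M cores such as the Cortex-M3, 0xFFFFFFF9 is also one of the EXC_RETURN codes the core places in the link register on exception entry, and GDB-based debuggers label such a frame <signal handler called>. The sketch below decodes the three values defined for cores without an FPU; the enum and function names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    DEMO_EXC_NONE,         /* not an EXC_RETURN code */
    DEMO_EXC_HANDLER_MSP,  /* return to Handler mode, main stack */
    DEMO_EXC_THREAD_MSP,   /* return to Thread mode, main stack */
    DEMO_EXC_THREAD_PSP    /* return to Thread mode, process stack */
} demo_exc_return_t;

/* Decode an ARMv7-M (no FPU) EXC_RETURN value found in LR. */
demo_exc_return_t demo_decode_exc_return(uint32_t lr)
{
    switch (lr) {
    case 0xFFFFFFF1u: return DEMO_EXC_HANDLER_MSP;
    case 0xFFFFFFF9u: return DEMO_EXC_THREAD_MSP;
    case 0xFFFFFFFDu: return DEMO_EXC_THREAD_PSP;
    default:          return DEMO_EXC_NONE;
    }
}
```

Read that way, 0xFFFFFFF9 would only say that the exception was taken while ordinary (non-interrupt) code was running on the main stack, which fits the trap showing up at arbitrary places in main().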

Len
"Engineering is an Art. The Art of Compromise."

Interesting development, thanks for the tip (I don't think I ever would have looked into that).  Maybe I'll see if I can ping Infineon for some info.

The app code for that line is nothing out of the ordinary, it is just checking a flag to see if logging is enabled (it isn't, g_Proc_Logging == false unless enabled by external command).

/** Process_Logging
    - Processes log string to passed in communication interface
*/
static void Process_Logging (comminterface_t interface){
    
    if (g_Proc_Logging){

        if (interface == INTERFACE_UART){
            Process_UART_Log();
        } else {
            Process_HID_Log();
        }
        
    g_Proc_Logging = false;
    }
    
}

 

  The <signal handler called>() can occur seemingly anywhere in the main application code flow, it does not always occur in Process_Logging().

FWIW, Disabling all the UARTs via the UART Component reset line and disabling all associated ISRs when the power fault condition is detected avoids the trap, so I at least have a method of avoidance now.


Kyle,

On Observation #2, I'm not seeing how you resolve that 'status' is not being updated in a while() loop.

Since the "<signal handler called>()" is occurring basically anywhere in the main(), then it's probably occurring in an ISR.  

I'm not sure, but I think that ISRs are not traced well in the stack monitor.

Len
"Engineering is an Art. The Art of Compromise."


Maybe miscounted the curly braces (or did it mispaste)?  status = CH1_422_UART_ReadRxStatus(); Is the last line before it returns to the beginning of the while loop.  The bit I noticed was that if there's another status flag raised in the status byte, I'm not clearing out the FIFO anymore since it will break out of the while().  I should be masking the FIFO_NOT_EMPTY bit rather than a direct == comparison.

Yes, I agree about it probably occurring in an ISR, or as a result of something an ISR is doing.  Just need to track down which one in particular


Kyle,

There are three ways I know of to "count" the braces.

  1. Actually count the braces.  This is time consuming and prone to errors.
  2. The PSoC Creator IDE allows you to select one of the brackets and the other matching bracket will be also highlighted.
  3. The PSoC Creator allows you to collapse the brackets.  This allows you to see what parts of the code are inside the bracket pair.

The last two are very accurate and quick.

Len
"Engineering is an Art. The Art of Compromise."

Oh I think I see what you mean now.  I thought you were suggesting that status was not being updated at all inside the while(). 

Yes I need to adjust that while loop to account for other status,  although in most cases I mostly only care if there's data that needs to come out of the FIFO, which is why I want to mask it rather than do an ==. 

As it is right now if there is another status bit raised it will not pull anything out of the FIFO until the error is cleared in the main loop.  I think I wrote it under the assumption that since I'm reading the status register as soon as I enter the ISR, I wouldn't need to worry about other status since they should have been cleared on that first read. 

But what that does *not* account for is that when I re-read the status register after processing a character, some of those bits can have been set again(?), and I'm not looking for that condition. This could, as you say, lead to problems if for example the FIFO is empty but there is still some kind of error condition.

Now I have this:

 

 

void CH1_RX_ISR_Interrupt_InterruptCallback (void){
    uint8_t c_in;
    uint8_t status = CH1_422_UART_ReadRxStatus();
    static char rs232_msgbuffer[BUFFER_DATA_SIZE];
    
    CH1_RX_ISR_ClearPending();
    
    if (((status & CH1_422_UART_RX_STS_STOP_ERROR) != 0) ||
        ((status & CH1_422_UART_RX_STS_OVERRUN) != 0) || 
        ((status & CH1_422_UART_RX_STS_PAR_ERROR) != 0)){
        CH1_UART_Err = true;
    }
        
    TMR_CH1RX_Start();  //Enable the timeout timer
        
    while (((status & CH1_422_UART_RX_STS_FIFO_NOTEMPTY) != 0) && (!CH1_UART_Err)){
        
        c_in = CH1_422_UART_ReadRxData();

        if ((CH1_UART_ByteCount < CTRL_MSG_SIZE) && (!CH1_UART_Err)){

            rs232_msgbuffer[CH1_UART_ByteCount++] = c_in;
            
            if (CH1_UART_ByteCount == CTRL_MSG_SIZE){
                /*7 bytes is complete control message*/
                memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,CTRL_MSG_SIZE);   
                memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
                CH1_UART_ByteCount = 0;
                TMR_CH1RX_Stop();   //Message done, disable timeout until next byte reception
                CH1_UART_MsgDone = true;
            }
            
            if (CH1_UART_ByteCount > CTRL_MSG_SIZE){
                memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,CTRL_MSG_SIZE);   
                memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
                CH1_UART_ByteCount = 0;
                CH1_UART_MsgDone = true;
                CH1_UART_Err = true;
            }

        } else {
            /*Extra Unexpected Characters or other error*/
            CH1_422_UART_ClearRxBuffer();
            CH1_UART_MsgDone = true;
            CH1_UART_Err = true;
        }

        status = CH1_422_UART_ReadRxStatus();
        
        if (((status & CH1_422_UART_RX_STS_STOP_ERROR) != 0) ||
        ((status & CH1_422_UART_RX_STS_OVERRUN) != 0) || 
        ((status & CH1_422_UART_RX_STS_PAR_ERROR) != 0)){
            CH1_UART_Err = true;
        }
    }
    
    /*If we encountered an error condition during reception, clear our local buffer out and clear any remaining bytes from FIFO*/
    if (CH1_UART_Err){
        CH1_UART_ByteCount = 0;
        CH1_422_UART_ClearRxBuffer();
        memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
    }
    
}

 

 

This way, if an error condition is detected anywhere during reception it breaks out of the while loop, the local buffer rs232_msgbuffer gets cleared out, the byte index gets reset, and the hardware Rx buffers get cleared.  No help preventing falling into the ENOMEM trap though unfortunately.


Kyle,

Your new code is better but can theoretically get into an infinite loop.

May I suggest the following:

 

void CH1_RX_ISR_Interrupt_InterruptCallback (void)
{
    uint8_t c_in;
    uint8_t status;
    static char rs232_msgbuffer[BUFFER_DATA_SIZE];
	_Bool CH1_UART_Err = false;
	_Bool CH1_UART_MsgDone = false;
    
    CH1_RX_ISR_ClearPending();
         
    TMR_CH1RX_Start();  //Enable the timeout timer
        
	do 
	{
		status = CH1_422_UART_ReadRxStatus();	// Read the current Rx Status
		
		if (((status & CH1_422_UART_RX_STS_STOP_ERROR) != 0) ||
	        ((status & CH1_422_UART_RX_STS_OVERRUN) != 0) || 
	        ((status & CH1_422_UART_RX_STS_PAR_ERROR) != 0))
		{	// Test for serial errors
	        CH1_UART_Err = true;
    	}
			
		else if ((status & CH1_422_UART_RX_STS_FIFO_NOTEMPTY) != 0)
		{	// Test for data in the FIFO
        	c_in = CH1_422_UART_ReadRxData();	// Get 1 byte of FIFO data

	        if (CH1_UART_ByteCount < CTRL_MSG_SIZE)
			{

	            rs232_msgbuffer[CH1_UART_ByteCount++] = c_in;
	            
	            if (CH1_UART_ByteCount == CTRL_MSG_SIZE)
				{	// Equals message size
	                /*7 bytes is complete control message*/
	                memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,CTRL_MSG_SIZE);   
	                memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
	                CH1_UART_ByteCount = 0;
	                TMR_CH1RX_Stop();   //Message done, disable timeout until next byte reception
	                CH1_UART_MsgDone = true;
	            }
	            
	            if (CH1_UART_ByteCount > CTRL_MSG_SIZE)
				{	// Exceeds message size
	                memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,CTRL_MSG_SIZE);   
	                memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
	                CH1_UART_ByteCount = 0;
	                CH1_UART_MsgDone = true;
	                CH1_UART_Err = true;
	            }
	        } 
			else 
			{
	            /*Extra Unexpected Characters or other error*/
	            CH1_422_UART_ClearRxBuffer();
	            CH1_UART_MsgDone = true;
	            CH1_UART_Err = true;
	        }
    	}
    }  while (!CH1_UART_Err && !CH1_UART_MsgDone);	// continue if no error and the message is not done
    
    /*If we encountered an error condition during reception, clear our local buffer out and clear any remaining bytes from FIFO*/
    if (CH1_UART_Err)
	{
        CH1_UART_ByteCount = 0;
        CH1_422_UART_ClearRxBuffer();
        memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
    }
}

 

This code reads the RxStatus every time the do() loop is executed to get the latest status.

It also first determines whether an error occurred and, if there is no error, processes the byte if the FIFO is not empty.

If you don't mind I'd like to offer up another suggestion.

I've created this function as a callback in an ISR.

It is generally not good practice to place potentially blocking calls in an ISR.   You are in effect waiting for all the message bytes to be processed once in the ISR callback.  This is considered a blocking function.  At best, you are at interrupt level for the time it takes to process seven bytes of the message.  At worst, if the Rx data has additional delays upstream, it will be longer.

I believe you indicated there are quite a few of these UARTs with similar callbacks.

This means that while you wait for this channel's message to be processed at interrupt level, one of two things happens to the other channels:

  • If interrupts are not allowed to be nested, other channels' messages will be blocked from processing.  All other interrupts will be blocked as well.
  • If interrupts are allowed to be nested, other channels' messages will be processed.  This leads to two types of problem:
    • The callback at the end of the nest will be processed first and then the next in the nest, i.e. LIFO (Last-In-First-Out).
    • The stack can easily get overrun.  This is because nested interrupts consume stack in the order of execution.   If nested interrupts are allowed you need to allocate more stack bytes.

The general principles concerning interrupt coding I learned from my mentors are these points:

  • Do not use calls to other functions that can be blocking (as stated earlier).
  • When you create ISR code, "get in and get out quickly".   This is to prevent ISR blocking if nesting is not allowed AND to prevent the probability of ISR stacking if nesting is allowed. 
    The best strategy of "get in and get out quickly" in your situation is to virtually remove all data processing from the ISR.  Only get the data from the buffer and store it elsewhere.  Then signal the main() task level application that data is available in the buffer.  In your case you can keep a count of how many data bytes are acquired in the ISR.  
    In your main() task, you can poll the data count for each channel.   Once you get 1 message size worth, you can further process the data at main() level.
    This strategy makes your ISR very quick and greatly reduces ISR blocking.
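
A minimal host-side sketch of the "get in and get out quickly" strategy is a single-producer, single-consumer ring buffer: the ISR only stores the byte, and main() polls the count and does the processing. All names here (demo_ring_t, demo_ring_put, demo_ring_get, demo_ring_count) and the 64-entry size are hypothetical, and the sketch assumes one ISR writes and only main() reads.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_SIZE 64u  /* power of two so indices wrap with a mask */

typedef struct {
    volatile uint8_t data[DEMO_RING_SIZE];
    volatile uint8_t head;  /* written only by the ISR */
    volatile uint8_t tail;  /* written only by main() */
} demo_ring_t;

/* ISR side: store one byte, or report overflow without blocking. */
bool demo_ring_put(demo_ring_t *r, uint8_t byte)
{
    uint8_t next = (uint8_t)((r->head + 1u) & (DEMO_RING_SIZE - 1u));
    if (next == r->tail) {
        return false;            /* full: caller drops the byte */
    }
    r->data[r->head] = byte;
    r->head = next;
    return true;
}

/* main() side: number of bytes waiting to be processed. */
uint8_t demo_ring_count(const demo_ring_t *r)
{
    return (uint8_t)((r->head - r->tail) & (DEMO_RING_SIZE - 1u));
}

/* main() side: fetch one byte if available. */
bool demo_ring_get(demo_ring_t *r, uint8_t *out)
{
    if (r->head == r->tail) {
        return false;            /* empty */
    }
    *out = r->data[r->tail];
    r->tail = (uint8_t)((r->tail + 1u) & (DEMO_RING_SIZE - 1u));
    return true;
}
```

In main(), polling demo_ring_count() until it reaches the message size and then pulling the bytes out with demo_ring_get() keeps all parsing and copying out of interrupt context. One slot is always left empty to tell "full" from "empty", so this buffer holds at most 63 bytes.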

If you need some help on this strategy of ISR coding, let me know.

Len
"Engineering is an Art. The Art of Compromise."

Len,

Appreciate the detail.

There are 2 interrupts with callbacks styled as I showed.  Both of them receive the same 7-byte packets at a rate of 20Hz per packet in.  I definitely am aware of "spend no more time in an ISR than you need", but figured there would be no serious issues with hanging too long inside this ISR (and actually scoped out the ISR timing as well by toggling some I/Os) mainly because these UARTs are only running at 9600 baud.  When the interrupt fires on byte received it should only really ever have 1 character in the FIFO (*maybe* 2 if really unlucky with timing I guess?), and there should be plenty of time before the next character is finished (about 1ms to transmit a byte character at 9600 iirc), so it really shouldn't be looping more than once, twice in worst case.  Once the message is done (either by number of bytes in this case, or detecting a CR/LF in other systems) it copies the message over to CH1_UART_MsgIn and raises the CH1_UART_MsgDone flag.  When the finished message is copied inside the ISR, the byte index can be immediately reset and the local buffer wiped and we can be ready to handle the next incoming message.

CH1_UART_MsgDone then gets caught in main and the message in CH1_UART_MsgIn actually gets processed.  I've actually used this pattern quite frequently (tbf, most of the time for only 1 UART).  Of course, this all works out if the transmitter on the other end is all working fine, which as we're seeing is not the case when the transceivers shut down and the PSoC keeps running.
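
The byte-time estimate above (about 1 ms per character at 9600 baud) can be checked with a few lines of integer arithmetic. This sketch assumes 10 bits on the wire per character (1 start, 8 data, 1 stop); if parity is enabled the real figure is one bit longer. The function names are hypothetical.

```c
#include <stdint.h>

/* Time to send one character in microseconds, assuming 10 bits per
 * character (1 start + 8 data + 1 stop). Integer division truncates. */
uint32_t demo_char_time_us(uint32_t baud)
{
    return (10u * 1000000u) / baud;
}

/* Time on the wire for a whole packet of nbytes, in microseconds. */
uint32_t demo_packet_time_us(uint32_t baud, uint32_t nbytes)
{
    return (nbytes * 10u * 1000000u) / baud;
}
```

At 9600 baud this gives 1041 us per character and 7291 us for a 7-byte packet, a small fraction of the 50 ms period of a 20 Hz packet rate. At 115200 baud a character takes only 86 us, so an ISR on that port has far less slack between bytes.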

do()/while() might make more sense?  One potential issue I see though is that the code you posted looks like it won't leave the ISR until the message is complete, whereas my original setup will only read out what is currently in the FIFO, exit the ISR, then interrupt again when the next character is done.  Like you said, I don't really want to spend a ton of time in there, especially to wait for an entire message to be complete before leaving.  I think I can tweak it though, just putting a break if the FIFO is empty to leave the ISR:

 

 

    do{
        status = CH1_422_UART_ReadRxStatus();

        if (((status & CH1_422_UART_RX_STS_STOP_ERROR) != 0) ||
        ((status & CH1_422_UART_RX_STS_OVERRUN) != 0) || 
        ((status & CH1_422_UART_RX_STS_PAR_ERROR) != 0))
        {
            CH1_UART_Err = true;
        } else if ((status & CH1_422_UART_RX_STS_FIFO_NOTEMPTY) != 0){
            c_in = CH1_422_UART_ReadRxData();

            if (CH1_UART_ByteCount < PFD_CTRL_MSG_SIZE)
			{
                rs232_msgbuffer[CH1_UART_ByteCount++] = c_in;
            
                if (CH1_UART_ByteCount == PFD_CTRL_MSG_SIZE){
                    /*7 bytes is complete control message*/
                    memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,PFD_CTRL_MSG_SIZE);   
                    memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
                    CH1_UART_ByteCount = 0;
                    TMR_CH1RX_Stop();   //Message done, disable timeout until next byte reception
                    CH1_UART_MsgDone = true;
                }
                
                if (CH1_UART_ByteCount > PFD_CTRL_MSG_SIZE){
                    /*Message size exceeded, should be impossible to get here*/
                    memcpy((char *)CH1_UART_MsgIn,rs232_msgbuffer,PFD_CTRL_MSG_SIZE);   
                    memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
                    CH1_UART_ByteCount = 0;
                    CH1_UART_MsgDone = true;
                    CH1_UART_Err = true;
                }
            
            } else {
                CH1_422_UART_ClearRxBuffer();
	            CH1_UART_MsgDone = true;
	            CH1_UART_Err = true;
            }
            
        } else {
            /*No characters in FIFO, break and leave ISR until next byte*/
            break;   
        }
        
    } while (!CH1_UART_Err && !CH1_UART_MsgDone);

 

 

 


So after the rewriting of those serial ISRs, I may have useful information?  Running through the debugger I found that when I induce the fault, only one of the serial ISRs actually fires (found by breakpoint), and it's not either of the 2 "main" serial channels, but the receive interrupt for the UART I use for debugging commands and the bootloader (basically a factory port).  There is no regular traffic coming or going from this UART during normal operation.  I actually also rewrote this ISR to match the others, the only real difference being there's no specified packet size on this UART, but uses a CR/LF delimiter to determine end of message (since I'm usually sending commands to it through a terminal).  It also runs at 115200 baud.  Here's the code for reference:

 

/*Debug (DBG) UART buffers and flags*/
static volatile bool    DBG_UART_MsgDone = false;
static volatile bool    DBG_UART_Err     = false;
static volatile char    DBG_UART_MsgIn[BUFFER_DATA_SIZE];
static volatile char    DBG_UART_MsgOut[BUFFER_DATA_SIZE];

void DBG_RX_ISR_Interrupt_InterruptCallback (void){
    uint8_t c_in;
    static uint8_t num_bytes = 0;
    uint8_t status;
    static char rs232_msgbuffer[BUFFER_DATA_SIZE];
    
    DBG_RX_ISR_ClearPending();

    do{
        status = DBG_UART_ReadRxStatus();
        
        if (((status & DBG_UART_RX_STS_STOP_ERROR) != 0) ||
        ((status & DBG_UART_RX_STS_OVERRUN) != 0) || 
        ((status & DBG_UART_RX_STS_PAR_ERROR) != 0) ||
        ((status & DBG_UART_RX_STS_SOFT_BUFF_OVER) != 0)){
            DBG_UART_Err = true;
        } else if ((status & DBG_UART_RX_STS_FIFO_NOTEMPTY) != 0){
            c_in = DBG_UART_ReadRxData();

            if (c_in != CR){
                
                if (c_in != LF){
                    rs232_msgbuffer[num_bytes++] = c_in;
                    
                    if (num_bytes >= BUFFER_DATA_SIZE){
                        num_bytes = 0;
                        DBG_UART_Err = true;
                    }
                    
                }
                
            } else {
                memcpy((char *)DBG_UART_MsgIn,rs232_msgbuffer,strlen(rs232_msgbuffer));   
                memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
                num_bytes = 0;
                DBG_UART_MsgDone = true;
            }
        } else {
            /*No errors detected and no data in FIFO, we can break out*/
            break;   
        }

    } while (!DBG_UART_Err && !DBG_UART_MsgDone);
    
    if (DBG_UART_Err){
        memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
        DBG_UART_ClearRxBuffer();
        num_bytes = 0;   
    }
    
}

 

 

Note that this ISR is a lower priority (the priority value is numerically higher in the cydwr) than either of the CH1/CH2 Rx interrupts from my above posts, so if I hit this breakpoint without having hit the breakpoints in the CH1/CH2 RX ISRs, I feel I can infer that they are not firing (or at least haven't fired yet).

I set a breakpoint at the beginning of this ISR callback (Breakpoint is on the DBG_RX_ISR_ClearPending() line), and induce the fault by re-enabling power to the PCB.  This ISR fires and I hit the breakpoint.  Here's where it gets weird:

  • If I single step through the ISR with the debugger (F11 - Step Into), it appears to make it through the entire ISR and returns back to the main application.  Upon returning to main application code, any function call will result in a jump out to IntDefaultHandler() and into the ENOMEM callback.
  • If I use F10 to Step Over, or F5 to Resume it will jump immediately from DBG_RX_ISR_ClearPending() to IntDefaultHandler() and into the ENOMEM callback.
  • Another thing I noted is the RX Status Register for this UART.  Using the component debug window, I was able to determine RX_DBG_UART_RX_STATUS = 0x08 when the breakpoint is hit.  Looking at the UART header, this would be a STOP error?  What's weird is that this register clears to 0x00 when DBG_RX_ISR_ClearPending() is called.  Calling ReadRxStatus() before ClearPending() just returns 0x00 for the status.  Maybe getting bad or misleading info from the debugger?

Just more mysteries to add to the pile...

 

KyTr_1955226

A thought:

If I put a breakpoint in a UART Receive ISR callback, and have the debug window for the component up to tell me the values of the UART registers, will this read operation from the debugger clear the RX Status Register on a hardware level?  Could that be why my RX Status Register is clearing before I can read it out? 

The more I think about it, the more likely it seems that this is why the code is not catching the error status bits when I breakpoint in the UART callback.

If I place the breakpoint later, it looks like it catches the error condition, and drops into this section of the ISR Callback:

 

 

    if (DBG_UART_Err){
        memset(rs232_msgbuffer,'\0',BUFFER_DATA_SIZE);
        DBG_UART_ClearRxBuffer();
        num_bytes = 0;   
    }

 

 

When it gets here, the first function called is where it kicks out to the IntDefaultHandler().  Status reads 0x28, which comes out to STOP_ERROR and FIFO_NOT_EMPTY.  Neither of these conditions is unexpected, and the code is catching them now that I'm not breaking at the beginning of the ISR and presumably unintentionally clearing out the status register.  I also tried putting DBG_UART_ClearRxBuffer() *before* memset(), but it still just jumps to IntDefaultHandler() on whichever call is first.
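
For anyone decoding these status values by hand, here is a minimal sketch. It only uses the two bit values that follow from the readings in this thread (0x08 for the STOP error, 0x20 for FIFO not empty, giving 0x28 together). The macro and type names are hypothetical, and the real masks should be taken from the generated UART header rather than from this snippet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values inferred from the readings quoted in this thread.
 * They are assumptions; confirm them against the generated UART header. */
#define RX_STS_STOP_ERROR     0x08u
#define RX_STS_FIFO_NOTEMPTY  0x20u

typedef struct {
    bool    stop_error;
    bool    fifo_not_empty;
    uint8_t other_bits;      /* any bits this sketch does not decode */
} rx_status_t;

/* Split a raw RX status byte into the two flags discussed above. */
rx_status_t decode_rx_status(uint8_t raw)
{
    rx_status_t s;
    s.stop_error     = (raw & RX_STS_STOP_ERROR) != 0u;
    s.fifo_not_empty = (raw & RX_STS_FIFO_NOTEMPTY) != 0u;
    s.other_bits     = (uint8_t)(raw & (uint8_t)~(RX_STS_STOP_ERROR | RX_STS_FIFO_NOTEMPTY));
    return s;
}

/* Usage: 0x28 decodes to STOP error plus FIFO not empty, nothing else. */
void decode_rx_status_example(void)
{
    rx_status_t s = decode_rx_status(0x28u);
    assert(s.stop_error && s.fifo_not_empty && s.other_bits == 0u);
}
```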

If I don't breakpoint at all in the ISR and just halt after the trap is hit, the call stack shows that it manages to escape the ISR callback and only kicks to the IntDefaultHandler() once it is back in the main application loop.


Kyle,

When you're monitoring the UART registers in the debugger, it will read the status registers when the CPU is in HALT mode.  When this happens, any status bits that are cleared on read will be cleared if set.

Additionally, if the UART FIFO_FULL status bit is set, reading the data register will clear the FIFO_FULL status bit.
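
To illustrate the clear-on-read effect, below is a small host-side model (plain C, not PSoC code). `read_status_clear_on_read()` and `dbg_last_rx_status` are made-up names, and the model treats every bit as clear-on-read, which is a simplification. The idea is that the ISR reads the status register exactly once and keeps a RAM copy, so the debugger can watch the copy instead of the register.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated hardware status bits (host-side model, not a real register). */
static uint8_t hw_status;

/* Hardware side of the model: latch some status bits. */
void hw_set_status(uint8_t bits)
{
    hw_status |= bits;
}

/* Reading the register returns the bits and clears them as a side effect. */
uint8_t read_status_clear_on_read(void)
{
    uint8_t value = hw_status;
    hw_status = 0u;
    return value;
}

/* Hypothetical debug variable: a RAM copy that is safe to watch. */
volatile uint8_t dbg_last_rx_status;

/* The ISR reads the register once and works from the saved copy. */
uint8_t isr_read_status_once(void)
{
    uint8_t status = read_status_clear_on_read();
    dbg_last_rx_status = status;
    return status;
}

void clear_on_read_example(void)
{
    hw_set_status(0x08u);
    (void)read_status_clear_on_read();         /* stands in for the debugger's register view */
    assert(isr_read_status_once() == 0x00u);   /* firmware no longer sees the error */

    hw_set_status(0x08u);
    assert(isr_read_status_once() == 0x08u);   /* firmware reads first this time */
    assert(dbg_last_rx_status == 0x08u);       /* the RAM copy keeps the value */
}
```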

 

You've been great at explaining your predicament with dropping into the IntDefaultHandler().  However, I have not been able to reproduce it in my experiments, and I probably will not be able to without your code.

I'm about one day from uploading example code of how to process multiple Rx UART port ISRs efficiently.

With BUS_CLK at 79.5 MHz and 11 Rx UART ports, I'm measuring a maximum baud rate of 231 kBaud on each port WITHOUT errors!

You can use this example as a starting point.

Len
"Engineering is an Art. The Art of Compromise."

Hi Len,

I had a feeling that's what was going on with clear-on-read, and have adjusted my breakpoints accordingly so I will no longer run into that issue in the UART ISR.

Unfortunately this firmware is basically a large completed project and proprietary so I'm limited in what I can share.  This is an issue we discovered (unfortunately) at the end of the firmware development process.

What I've narrowed down is this:

I have 4 UARTs being processed.  Two of them are RS-422 and send/receive data packets at specified rates (every 5ms a data packet is sent from the PSoC, every 10ms a packet is received).  The other two UARTs are RS-232 and are used much less regularly: one for debug output and bootloading, and the other to send occasional commands to an external controller.  When I disable the RS-422 UARTs (via the RESET pins on the UART Components), the problem goes away.  When I enable *only* the RS-422 UARTs, the jump to IntDefaultHandler() occurs under the conditions provided (transceivers/external hardware shut down, PSoC VDD remains).

Thanks to the tweaks to the ISR callback code you provided, entering an infinite loop and overrunning a buffer inside the ISR Callback should be almost impossible.  Something has to be either loading up the heap, or at least making the PSoC *think* the heap has been overrun. 
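
As a reminder of where ENOMEM normally comes from, here is a simplified host-side model of an sbrk-style heap grow routine. This is an assumption about the general mechanism, not the PSoC Creator startup code; `model_sbrk()` and the 0x400 size are illustrative only. It also shows that errno stays at ENOMEM after a later successful call, since nothing resets it automatically.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_HEAP_SIZE 0x400u   /* illustrative heap size */

static uint8_t model_heap[MODEL_HEAP_SIZE];
static size_t  model_heap_used;

/* sbrk-style grow: returns the old break on success,
 * or (void *)-1 with errno set to ENOMEM when the heap is exhausted. */
void *model_sbrk(size_t nbytes)
{
    if (nbytes > (MODEL_HEAP_SIZE - model_heap_used)) {
        errno = ENOMEM;
        return (void *)-1;
    }
    void *old_break = &model_heap[model_heap_used];
    model_heap_used += nbytes;
    return old_break;
}

void model_sbrk_example(void)
{
    errno = 0;
    assert(model_sbrk(0x300u) != (void *)-1);   /* fits in the heap */
    assert(errno == 0);
    assert(model_sbrk(0x200u) == (void *)-1);   /* 0x300 + 0x200 exceeds 0x400 */
    assert(errno == ENOMEM);
    assert(model_sbrk(0x100u) != (void *)-1);   /* exactly fills the heap */
    assert(errno == ENOMEM);                    /* stale value: success does not clear errno */
}
```

If something like this applies, a handler that checks errno only tells you that an allocation failed at some point, not that it failed just before the trap.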

One way I've found that triggers ENOMEM came earlier in this project, where data was being sent to the PSoC on the RS-422 interfaces while the system was still performing bootup tasks.  The UART_Start() call had been made and interrupts had been enabled, but the main loop had not yet been entered, so the data processing functions for the UART buffers were not yet being called.  Once I tweaked the startup code to not enable the UARTs until bootup is complete and the data processing functions are being called regularly, the ENOMEM on bootup stopped occurring.
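
A rough sketch of the gating idea described above, written as a host-side model rather than real UART code. All names here (`rx_isr_store()`, `rx_enable()`, `rx_main_loop_drain()`) are hypothetical, and it does not claim to show what actually consumed memory in the original firmware. It only shows bytes being discarded while the consumer is not yet running, and a bounded store once it is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_QUEUE_SIZE 8u         /* illustrative queue depth */

static uint8_t rx_queue[RX_QUEUE_SIZE];
static size_t  rx_count;
static size_t  rx_dropped;
static bool    rx_enabled;       /* set only once the main loop is ready */

/* Stand-in for an RX ISR storing one received byte. */
void rx_isr_store(uint8_t byte)
{
    if (!rx_enabled || rx_count >= RX_QUEUE_SIZE) {
        rx_dropped++;            /* discard instead of writing past the end */
        return;
    }
    rx_queue[rx_count++] = byte;
}

/* Called at the end of bootup, just before the main loop starts. */
void rx_enable(void)
{
    rx_enabled = true;
}

/* Stand-in for the main-loop consumer: empties the queue, returns the byte count. */
size_t rx_main_loop_drain(void)
{
    size_t n = rx_count;
    rx_count = 0u;
    return n;
}

void rx_gate_example(void)
{
    for (int i = 0; i < 20; i++) {
        rx_isr_store((uint8_t)i);            /* traffic arriving during bootup */
    }
    assert(rx_count == 0u && rx_dropped == 20u);

    rx_enable();
    for (int i = 0; i < 20; i++) {
        rx_isr_store((uint8_t)i);            /* consumer stalled: queue fills, rest dropped */
    }
    assert(rx_count == RX_QUEUE_SIZE);
    assert(rx_dropped == 32u);               /* 20 during bootup + 12 overflow */
    assert(rx_main_loop_drain() == RX_QUEUE_SIZE);
}
```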

There's got to be a smoking gun somewhere, I just need to find it.  The help you have provided is greatly appreciated.  If I spend much more time on this without finding anything, it might be time for me to put together a new project with the purpose of reproducing the problem, in which case I could share something more useful.

 


More weirdness:

I decided to throw some scope probes on a few test points to see if I could get a better sense of the timing of what's going on.  Idea being to measure the time between VIN returning to the system, VIN_OK going back high to denote input voltage is OK, and the ENOMEM callback being entered.  I have a test point that toggles every 100ms inside the callback, so it's nice and easy to see the timing:

KyTr_1955226_0-1643057359098.png

 

I then wanted to increase the amount of heap to see if the amount of time from VIN returning to the callback being entered changes (is the heap being eaten all at once, or does it take a certain number of "trips" through the main loop to overrun?).  What I found was something very weird.  When I increase the heap space (in this case from 0x400 to 0x600) the callback appears to never be entered.  The program seems to totally lock up inside the IntDefaultHandler ISR, on the line of the callback function.  My test point never starts toggling.

KyTr_1955226_3-1643043403046.png

 

 

 


Hi @KyTr_1955226 

This thread was locked due to a long period of inactivity. You can continue the discussion by opening a new thread that references the locked one. Continued discussion in an inactive thread will mostly go unattended by community users.

Thanks and Regards,
Alen Austin
