
AURIX™ Forum Discussions

User21707
Level 2
How do I calculate the stack depth of an API on the TriCore 1.6.2P architecture? The board used is a TC375TP, and I am currently using the Trace32 debugger.
NeMa_4793301
Level 6
I asked the same question to Tasking a while ago and got this response:
- Details about user stack usage: the call tree that can be included in the map file contains details about the user stack usage of each function and its callees. For more details, have a look at chapter:

15.2. Linker Map File Format

of the TriCore tools v6.3r1 user guide.

Since v6.3r1, a new feature allows you to specify root functions for call stack calculations. For more details, have a look at chapter

17.4.3. Defining Address Spaces

section -> Stacks and heaps

The application note:

STACKS AND STACK SIZE ESTIMATION IN THE TASKING VX-TOOLSET FOR TRICORE
https://resources.tasking.com/tasking-whitepapers/stacks-and-stack-size-estimation-in-the-tasking-vx...

also includes details about the stack usage and calculation.
User21707
Level 2
Wrangler, I do not have the option of making changes to the LSL file. Moreover, I use the Lauterbach Trace32 debugger and the TriCore toolset v6.2r2, so I feel the process mentioned in the above link is not possible. Can we not use the A10 general-purpose register, or some other methodology?

The aim is to find the stack usage of a particular API within the full code stack.
NeMa_4793301
Level 6
If you can't modify the LSL, and you're stuck with Tasking 6.2r2, then you're going to have to do it the old-fashioned way: set a breakpoint at the top level of the call tree, record A10, set a breakpoint in your deepest API, and record A10 again. The difference is the stack depth.
User21707
Level 2
So, for example, in the following call tree:


+-- E2E_P01Check [ustack_tc0:8,16]
| | | |
| | | +-- E2E_P01.src:E2E_P01CalculateCRC *
| | | |
| | | +-- E2E_P01.src:E2E_P01CheckStatus [ustack_tc0:0,0]

1. If I have to calculate the stack usage of P01Check, then as per your explanation, should the initial breakpoint be placed at P01Check and the final breakpoint at P01CheckStatus (considering that is the last API call inside P01Check), or should the final breakpoint be placed after coming out of P01Check?

2. Also, does changing the board from TC297 to TC375 make a difference in stack usage? [Assuming the code and compiler flags are the same; register settings might be different.]

3. I want to automate the process of calculating stack usage. Is there any way to do that? I have done it for measuring timing, but couldn't find any solution for stack usage.
NeMa_4793301
Level 6
#1: Step 3 instructions into the function so that it reserves its local stack space.

#2: The instructions between TC2xx and TC3xx will not be significantly different.

#3: It depends on how adept you are with debugger scripts. Lauterbach, PLS, and iSYSTEM are quite flexible. My general recommendation is to fill the task stack with a known pattern, and then it's easy to spot the high water mark after letting your application run for a few seconds.
User21707
Level 2
UC_wrangler wrote:
#1: Step 3 instructions into the function so that it reserves its local stack space.

#2: The instructions between TC2xx and TC3xx will not be significantly different.

#3: It depends on how adept you are with debugger scripts. Lauterbach, PLS, and iSYSTEM are quite flexible. My general recommendation is to fill the task stack with a known pattern, and then it's easy to spot the high water mark after letting your application run for a few seconds.



Wrangler, I am still not clear with points #1 and #3.
NeMa_4793301
Level 6
#1: If I have a function like this:

void something( void )
{
    int x[4096];
    int i;

    for( i = 0; i < 4096; i++ )
        x[i] = 0;
    something2( x );
}


Then the first line of code in the function allocates 16384 bytes on the stack with this instruction, decrementing A10 by 16384:
	lea	a10,[a10]-16384

If you simply set a breakpoint at the start of the function, you won't see that change in A10. So, step a couple instructions to be sure.

Note that the 16384 bytes (4096 * 4 bytes) only cover the variable x; i is not included, because the compiler has optimized it into a register instead of memory.

#3: If you fill the stack area (e.g., ustack_tc0) with a known value, let the application run for a few seconds, and then view the stack area and see how much of the original pattern is intact, that may give a good indication of maximum stack depth. It may not be accurate if your application hasn't executed all of its paths.
User18259
Level 1
Hi SRS_Sabat,

There is an alternative solution for measuring maximum stack usage; the method is:
1. At the startup phase, fill your entire stack with a special pattern such as 0xA5A5A5A5. This may take longer than your nominal startup time.
2. After a long time (at least after executing the fullest and most complex paths of your software), check the first data word that does not match the special pattern.
3. Calculate your maximum stack usage: address.first_non_special_pattern - address.stack_start

I'm not sure if this helps you, but it can be used as a rough method for your purpose.