TRAVEO™ T2G MCU: LBO and IBO performance on internal and external memory – KBA235809

IFX_Publisher2
Community Manager

Version: **

This KBA presents performance measurements for different combinations of source and store buffer locations, using image-based rendering mode (IBO) and line-based rendering mode (LBO).

Software environment used:

  • Green Hills MULTI 2017.14
  • TRAVEO™ T2G graphics driver v1e.1.0

Reference documents:

  • 002-27763: CYT3DL datasheet
  • 002-25454: Graphics Driver for TRAVEO™ T2G Cluster Series User Guide

Software setup:

Figure 1 shows the sample image used as the source surface for blitting. The image resolution is 400x480 pixels. Depending on the source and store surface configuration, this image is stored in internal flash, Video RAM (VRAM), HYPERRAM™, or HYPERFLASH™.

Both the source and the store surface use the R8G8B8A8 color format.
Every result is recorded over 300 blit iterations.
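
As a rough illustration of how a throughput figure in megapixels per second can be derived from this setup, the sketch below divides the total number of pixels blitted by the elapsed time. The KBA does not describe how the time was captured, so the sketch assumes that the total elapsed time over all 300 blits is available; the function name and the 0.25 s timing are hypothetical.

```python
# Sketch: convert a measured blit time into megapixels per second.
# Image size and iteration count come from this KBA; the timing value
# used at the bottom is hypothetical.
WIDTH, HEIGHT = 400, 480   # source image size in pixels
ITERATIONS = 300           # blits per recorded result


def megapixels_per_second(elapsed_s, width=WIDTH, height=HEIGHT,
                          iterations=ITERATIONS):
    """Throughput in MP/s for `iterations` full-image blits taking `elapsed_s` seconds."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    total_pixels = width * height * iterations
    return total_pixels / elapsed_s / 1e6


# 400 * 480 * 300 = 57.6 million pixels in total.
hypothetical_elapsed_s = 0.25
print(f"{megapixels_per_second(hypothetical_elapsed_s):.1f} MP/s")
```

Averaging over many iterations in this way reduces the influence of one-time setup costs on the reported figure.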

BinduPriya_G_1-1659078957193.png


Figure 1   Source image taken for blit (400x480)

BinduPriya_G_0-1659078895903.png

 

 

Figure 2   Software flow

Table 1 shows the surface configurations used.

Table 1  Source and store surface combinations for which timing results were captured

| Source surface location | Store surface location   |
|-------------------------|--------------------------|
| Internal flash          | VRAM                     |
| VRAM                    | VRAM                     |
| VRAM                    | HYPERRAM™ / External RAM |
| HYPERRAM™               | VRAM                     |
| HYPERRAM™               | HYPERRAM™                |
| HYPERFLASH™             | VRAM                     |


Table 2  External memory configurations

| S26H (HYPERFLASH™)                          | S27K (HYPERRAM™)                            |
|---------------------------------------------|---------------------------------------------|
| PLL400#2 (HF8): 200 MHz                     | PLL400#3 (HF9): 200 MHz                     |
| SMIF clock: 100 MHz                         | SMIF clock: 100 MHz                         |
| Setup delay: 3 clock cycles                 | Setup delay: 3 clock cycles                 |
| Hold delay: 3 clock cycles                  | Hold delay: 1 clock cycle                   |
| Mode: XIP                                   | Mode: XIP                                   |
| Read latency code: 20                       | Read latency code: 4                        |
| Merge enable with timeout after 4096 cycles | Merge enable with timeout after 4096 cycles |
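
For orientation, the clock settings above can be turned into a rough peak-bandwidth estimate for the external memory bus. The sketch below is not taken from the KBA: it assumes an 8-bit, double-data-rate HyperBus interface and that the 100 MHz SMIF clock is the bus clock, and it ignores read latency, setup and hold delays, and command overhead. The function names are illustrative only.

```python
# Sketch: theoretical peak data rate of the external memory bus.
# Assumptions (not stated in this KBA): 8-bit wide bus, two transfers
# per clock (DDR), and the 100 MHz SMIF clock acting as the bus clock.
def peak_mb_per_s(bus_clock_mhz, bus_width_bits=8, transfers_per_clock=2):
    """Peak bus data rate in megabytes per second."""
    return bus_clock_mhz * transfers_per_clock * bus_width_bits / 8


def pixel_rate_mp_per_s(mb_per_s, bytes_per_pixel=4):
    """Pixel rate in MP/s for a given data rate (R8G8B8A8 uses 4 bytes per pixel)."""
    return mb_per_s / bytes_per_pixel


peak = peak_mb_per_s(100)
print(f"peak bus rate:   {peak:.0f} MB/s")
print(f"peak pixel rate: {pixel_rate_mp_per_s(peak):.0f} MP/s")
```

Under these assumptions the estimate is only a reference point for reading the external-memory results; it does not model caching, merging, or buffering in the SMIF.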

Timing results:

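Table 3  Internal flash to VRAM

| Mode | Megapixels per second | Megapixels per second | LBO/IBO |
|------|-----------------------|-----------------------|---------|
| IBO  | 240.9                 | 963.7                 |         |
| LBO  | 205.73                | 822.94                | 0.85    |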
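
In Table 3, the second throughput value in each row is close to four times the first. That would be consistent with a data rate in megabytes per second for the 4-byte R8G8B8A8 pixel format, although the KBA labels both columns "Megapixels per second" and does not state this. The sketch below, with illustrative function names, checks that interpretation and recomputes the LBO/IBO ratio from the first column.

```python
# Sketch: relate the two throughput columns of Table 3 and recompute the
# LBO/IBO ratio. Treating the second column as MB/s is an assumption.
BYTES_PER_PIXEL = 4  # R8G8B8A8: four 8-bit channels


def data_rate_mb_per_s(mp_per_s, bytes_per_pixel=BYTES_PER_PIXEL):
    """Data rate implied by a pixel rate, in megabytes per second."""
    return mp_per_s * bytes_per_pixel


def lbo_ibo_ratio(lbo_mp_per_s, ibo_mp_per_s):
    """Ratio above 1 means LBO is faster; below 1 means IBO is faster."""
    return lbo_mp_per_s / ibo_mp_per_s


# (first column, second column) as listed in Table 3
table3 = {"IBO": (240.9, 963.7), "LBO": (205.73, 822.94)}

for mode, (first, second) in table3.items():
    implied = data_rate_mb_per_s(first)
    print(f"{mode}: {first} MP/s implies {implied:.2f} MB/s (table lists {second})")

ratio = lbo_ibo_ratio(table3["LBO"][0], table3["IBO"][0])
print(f"LBO/IBO = {ratio:.2f}")
```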

 
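Table 4  VRAM to VRAM

| Mode | Megapixels per second | Megapixels per second | LBO/IBO |
|------|-----------------------|-----------------------|---------|
| IBO  | 218.85                | 875.40                |         |
| LBO  | 543.87                | 2127.5                | 2.5     |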

 
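Table 5  VRAM to HYPERRAM™

| Mode | Megapixels per second | Megapixels per second | LBO/IBO |
|------|-----------------------|-----------------------|---------|
| IBO  | 32.77                 | 131.10                |         |
| LBO  | 515.79                | 2063.16               | 15.73   |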

 
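Table 6  HYPERRAM™ to VRAM

| Mode | Megapixels per second | Megapixels per second | LBO/IBO |
|------|-----------------------|-----------------------|---------|
| IBO  | 48.9                  | 195.66                |         |
| LBO  | 36.76                 | 147.07                | 0.75    |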

 
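Table 7  HYPERRAM™ to HYPERRAM™

| Mode | Megapixels per second | Megapixels per second | LBO/IBO |
|------|-----------------------|-----------------------|---------|
| IBO  | 15.26                 | 61.069                |         |
| LBO  | 36.6                  | 146.38                | 2.4     |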

 
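Table 8  HYPERFLASH™ to VRAM

| Mode | Megapixels per second | Megapixels per second | LBO/IBO |
|------|-----------------------|-----------------------|---------|
| IBO  | 48.87                 | 195.48                |         |
| LBO  | 32.4                  | 129.7                 | 0.66    |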
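
To compare the six configurations at a glance, the first-column values from Tables 3 to 8 can be collected and the LBO/IBO ratio recomputed for each. The sketch below is only a convenience for reading the results: the dictionary layout and function names are illustrative, and recomputed ratios may differ from the listed ones in the last digit because of rounding.

```python
# Sketch: recompute LBO/IBO for every configuration in Tables 3 to 8.
# Values are the first "Megapixels per second" column: (IBO, LBO).
results = {
    "Internal flash -> VRAM": (240.9, 205.73),
    "VRAM -> VRAM": (218.85, 543.87),
    "VRAM -> HYPERRAM": (32.77, 515.79),
    "HYPERRAM -> VRAM": (48.9, 36.76),
    "HYPERRAM -> HYPERRAM": (15.26, 36.6),
    "HYPERFLASH -> VRAM": (48.87, 32.4),
}


def summarize(table):
    """Map each configuration to (LBO/IBO ratio, name of the faster mode)."""
    summary = {}
    for name, (ibo, lbo) in table.items():
        summary[name] = (lbo / ibo, "LBO" if lbo > ibo else "IBO")
    return summary


for name, (ratio, faster) in summarize(results).items():
    print(f"{name:24s} LBO/IBO = {ratio:6.2f}  faster: {faster}")
```

In these measurements, IBO is faster in the three cases where the source lies outside VRAM and the store surface is in VRAM, and LBO is faster in the other three.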

Notes:

  • Memory bandwidth limits LBO performance: Memories other than VRAM can transfer at most 32 bits per clock cycle. In addition, in LBO mode, three fetches may happen at different locations in memory, which adds the overhead of non-sequential memory access for external memories. Hence, LBO is even slower than IBO when all the source images are located outside VRAM (for example, in internal flash). The LBO/IBO ratio improves if you use sources with fewer bits per pixel (BPP).
  • When blending to a target in HYPERRAM™ (HRAM), the LBO/IBO ratio is greater than 3. This is because blending multiple images in IBO requires reading and writing the target buffer for every individual blit, whereas LBO blends all images together in an internal buffer and writes the result only once.
  • When using LBO, disabling Object Partitioning may sometimes make the blit slightly faster.
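
The blending note can be illustrated with a simplified count of target-buffer traffic. The sketch below is a model under stated assumptions, not a description of the hardware: it assumes a 400x480 R8G8B8A8 target, that every IBO blit reads and writes the full target once, and that LBO writes the target once; source fetches are ignored. The layer count and function names are hypothetical.

```python
# Sketch: target-buffer traffic when blending several images into one target.
# Simplified model: IBO reads and writes the target for each blit, LBO blends
# internally and writes the result once. Source fetches are not counted.
TARGET_BYTES = 400 * 480 * 4  # assumed target: 400x480 pixels, 4 bytes each


def ibo_target_bytes(num_layers, target_bytes=TARGET_BYTES):
    """One full read plus one full write of the target per blended layer."""
    return 2 * num_layers * target_bytes


def lbo_target_bytes(num_layers, target_bytes=TARGET_BYTES):
    """A single write of the final result, independent of the layer count."""
    return target_bytes


for layers in (1, 2, 3):  # hypothetical layer counts
    ibo = ibo_target_bytes(layers)
    lbo = lbo_target_bytes(layers)
    print(f"{layers} layer(s): IBO {ibo} B, LBO {lbo} B, factor {ibo // lbo}")
```

In this model the gap grows linearly with the number of blended layers, which matches the direction of the effect described in the note when the target sits in slow external memory.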

 
Note: This KBA applies to the following series of TRAVEO™ MCUs:

  • TRAVEO™ T2G Cluster CYT3DL series
