
I have created a working ILI9341 driver for my PSoC 4 BLE Pioneer board which is based on the Adafruit GFX/ILI9341 Arduino libraries and other code I found on the community forum.

It works well enough but, in my opinion, the refresh rate is a little slow compared to other boards I've seen, especially for screen fills and bitmap drawing.

One trick I noticed in some of the Arduino-based libraries, especially for the Teensy boards, is the use of DMA to speed things up. I've never used DMA, but as it is an option here, I was wondering whether it would help and, if so, how to do it.

I've created a rather basic video to try and illustrate update speed:

I have also attached the code.

Any tips on optimising greatly appreciated.

One optimisation question that has me perplexed: why does increasing the RX/TX buffer size (in 8-bit words) slow things down dramatically? See the top-level design and the SPI component block for an example. I have these set to the minimum, which is 4.

8 Replies

Some examples here (may or may not be helpful)

This is the fastest output I've seen on a PSoC:

Psoc 5 ILI9340 - YouTube

And this one on STM32F7 (link to code in Comments)

ILI9341 with DMA and Optimization -O1 - YouTube


P.S. After some play with the ILI9340 I switched to a Nextion display (the main problem was finding a font of the right size, and I didn't want to make my own). The Nextion is more expensive ($25 vs $2.5), but it saves a lot of time in getting something useful done (e.g. menu, text, chart, bitmap) without wasting PSoC resources. With the ILI9340, the PSoC is almost 100% busy updating the screen. The only justified use I see for it is as a cheap o-scope screen or a simple game (Tetris, etc.).

Some very nice demos. Thanks.

Now we have our benchmark for comparison.


You are absolutely right about the resource requirements when updating the screen. I found that if you add timer interrupts for other processes, it slows down dramatically. But then again, is this not what buffers are for? I never got buffers to work, as I don't understand how to use them properly.



There are a couple of examples that use DMA to transfer data from RAM to the screen. This way the DMA works in the background, leaving the CPU free.

See link to RAM-DMA-VGA here:

How to Drive a VGA Monitor on a PSoC 5LP w/Verilog

and some other examples here




So besides the limitations of the ILI9341, I would like to use this as a learning exercise too.

Having seen other MCUs handle the display, I see that it is key to get the hardware to do the SPI work.

I am now looking at the SCB block component instead as I see this has a DMA option. Is this SCB block faster?

How do you access and then use DMA for example?


Not applicable

DMA stands for "Direct Memory Access". It is essentially the ability for certain peripherals to read and write RAM without the CPU having to move each byte itself through an interrupt handler or polling loop. In practice, the CPU fills a buffer and updates it whenever it wants the SPI to send new data; the DMA then handles transferring the data from that buffer into the SPI output registers.
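To make the idea concrete, here is a minimal sketch in portable C that simulates what a DMA channel does for an SPI transmit. It is not PSoC code: `sim_spi_t`, `sim_dma_desc_t`, and `sim_dma_on_tx_request` are hypothetical names invented for illustration, and the "wire" array exists only so the simulation can record what was sent. The point it tries to show is that the source address walks through a RAM buffer while the destination (the SPI TX register) stays fixed, and that each byte moves when the SPI asks for one, not when the CPU decides to send it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped SPI block.
 * On real hardware tx_fifo would be a fixed peripheral address. */
typedef struct {
    volatile uint8_t tx_fifo;  /* each write hands one byte to the SPI */
    uint8_t wire[64];          /* simulation only: bytes "shifted out" on MOSI */
    size_t  wire_len;
} sim_spi_t;

/* What a DMA descriptor conceptually holds: source, destination, count. */
typedef struct {
    const uint8_t *src;    /* RAM buffer; address increments after each byte */
    sim_spi_t     *dst;    /* peripheral; address stays fixed */
    size_t         count;  /* bytes still to move */
} sim_dma_desc_t;

/* One DMA "beat", triggered when the SPI signals room in its TX FIFO.
 * Returns 1 while bytes remain after this beat, 0 when the transfer is done. */
static int sim_dma_on_tx_request(sim_dma_desc_t *d)
{
    assert(d != NULL && d->dst != NULL);
    if (d->count == 0) {
        return 0;
    }
    assert(d->dst->wire_len < sizeof d->dst->wire);

    d->dst->tx_fifo = *d->src++;                         /* RAM -> peripheral */
    d->dst->wire[d->dst->wire_len++] = d->dst->tx_fifo;  /* record for the simulation */
    d->count--;
    return d->count != 0;
}
```

In this sketch the CPU's only jobs are to fill the source buffer and set up the descriptor once; after that, every call to `sim_dma_on_tx_request` stands in for a hardware trigger that needs no CPU involvement.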

Sorry got pulled away on other things. However, this has helped me think a little more about DMA.

While I understand the concept of DMA, I am really struggling with the terminology.

If I start with the SCB block and the options under the SPI Advanced tab, it says "RX output" and "TX output". This is rather confusing: my brain is thinking this may be a typo that should really read "RX input/TX output" or "RX output/TX input". So why do both say "output"?

I then see that you've repeated what I have seen in the documentation: "This means that the SPI can read the data out of a buffer of data". How does the SPI do this (as in read the data)? Surely the MCU or DMA is shifting data into a peripheral device at the defined clock speed via the MOSI connection, and vice versa via the MISO connection.

I've had a quick look at the "PSoC 4 DMA SPI EXAMPLE", which is available in PSoC Creator as a code sample. This left me even more confused, for example about which DMA channel relates to MISO and which to MOSI. The DMA documentation is hard to follow, and there are a ton of options within the DMA component (under the Descriptor tab).

So, where does one start?

How do I optimise the DMA specifically for a TFT screen, which really just uses the MOSI connection? What would be the recommended buffer size, for example? I was thinking of making it at least one row long (i.e. 320 bytes).
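As a rough sizing check for the row-buffer idea, here is a minimal sketch in C. It assumes 16-bit RGB565 pixels (2 bytes each, the format the Adafruit library drives the panel in) and a 320-pixel row in landscape orientation, in which case one row is 640 bytes rather than 320. The function names are illustrative only and do not come from any PSoC API, and the timing ignores command bytes and any gaps between bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed panel geometry and pixel format (landscape, RGB565). */
enum { TFT_WIDTH = 320, TFT_HEIGHT = 240, BYTES_PER_PIXEL = 2 };

/* Bytes needed to buffer one row of pixels. */
static uint32_t row_bytes(void)
{
    return (uint32_t)TFT_WIDTH * BYTES_PER_PIXEL;
}

/* Bytes needed for one full frame. */
static uint32_t frame_bytes(void)
{
    return row_bytes() * TFT_HEIGHT;
}

/* Best-case time in microseconds to clock out n_bytes at spi_hz,
 * assuming 8 clock cycles per byte and no idle time between bytes. */
static uint32_t spi_time_us(uint32_t n_bytes, uint32_t spi_hz)
{
    assert(spi_hz != 0);
    return (uint32_t)(((uint64_t)n_bytes * 8u * 1000000u) / spi_hz);
}
```

Under those assumptions a full frame is 153600 bytes, so at a hypothetical 8 MHz SPI clock a full-screen fill cannot finish in less than about 154 ms however the bytes reach the SPI block. DMA helps by keeping the SPI busy continuously and freeing the CPU, not by exceeding that limit.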

Not applicable

Example projects like this one demonstrate the concept well: Using DMA to transfer data to SPIM of PSoC 3/5 on TX_FIFO_NOT_FULL flag

The "RX output" and "TX output" labels on the tab are probably just a misnomer. Since the DMA component does not hold the data internally, the idea may be that both buffers are external to the DMA component, hence the "output" terminology.

The DMA documentation is indeed a good place to read for more information.

Working with a display, you probably want the DMA to be refreshing the data as quickly as possible. The DMA can be wired up like a hardware component in the CySch file, and the documentation describes what each pin does.

The DMA addresses to read from and write to are set up when you initialize the DMA, and initialization returns a DMA channel reference.

I would start with a DMA example and use a debugger to see how it behaves from an application standpoint. Once you see that it works and transfers data, you can start changing pieces such as the src/dst addresses, the size of the buffers, etc.

I have not worked with the DMA personally, but it functions like a small CPU that has been programmed only to transfer data from one linear buffer to another. All of the peripheral devices (SPI, UART, etc.) expose their data registers at fixed addresses in the SoC's memory map, so setting the DMA to write from a location in general-purpose RAM to the address allocated for the SPI peripheral is a simple matter.

I would set the buffer size to either one or two rows of TFT data, depending on how much you need to optimize the refresh time on the screen. With two rows of buffer data allocated, you can fill one row while the other is being written out over the SPI, so the transfer never has to stall.
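The two-row scheme is often called ping-pong (double) buffering. Below is a minimal sketch of the bookkeeping in portable C, with the actual transfer replaced by a simulated hook. Everything here is hypothetical: `pingpong_t`, `start_tx_fn`, and `sim_start_tx` are names made up for illustration, `ROW_BYTES` assumes a 320-pixel row at 2 bytes per pixel, and on real hardware the hook would start a DMA transfer to the SPI while `pingpong_on_tx_done` would be called from the transfer-complete interrupt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ROW_BYTES = 640 };  /* assumption: 320 pixels x 2 bytes (RGB565) */

typedef struct {
    uint8_t      buf[2][ROW_BYTES];
    int          fill_idx;  /* buffer the CPU is allowed to write */
    volatile int busy;      /* nonzero while a row is being sent */
} pingpong_t;

/* Hypothetical hook that starts a background (DMA/SPI) transfer. */
typedef void (*start_tx_fn)(const uint8_t *data, size_t len);

/* Simulation only: remember what the last transfer was asked to send. */
static const uint8_t *sim_last_tx;
static size_t         sim_last_len;
static void sim_start_tx(const uint8_t *data, size_t len)
{
    sim_last_tx  = data;
    sim_last_len = len;
}

static void pingpong_init(pingpong_t *pp)
{
    pp->fill_idx = 0;
    pp->busy     = 0;
}

/* Row buffer the CPU should render the next row into. */
static uint8_t *pingpong_fill_ptr(pingpong_t *pp)
{
    return pp->buf[pp->fill_idx];
}

/* Hand the filled row to the transmitter and swap buffers.
 * Returns 0 (and does nothing) if the previous row is still in flight. */
static int pingpong_submit(pingpong_t *pp, start_tx_fn start_tx)
{
    assert(pp != NULL && start_tx != NULL);
    if (pp->busy) {
        return 0;
    }
    pp->busy = 1;
    start_tx(pp->buf[pp->fill_idx], ROW_BYTES);
    pp->fill_idx ^= 1;  /* CPU now fills the other row */
    return 1;
}

/* Call when the transfer completes (e.g. from a DMA-done interrupt). */
static void pingpong_on_tx_done(pingpong_t *pp)
{
    pp->busy = 0;
}
```

The typical loop is: render a row into `pingpong_fill_ptr()`, call `pingpong_submit()`, and immediately start rendering the next row into the other buffer while the first is still going out. If `pingpong_submit()` returns 0, rendering is outpacing the SPI and the CPU has to wait for `pingpong_on_tx_done()`.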