I can't find the answer in the application notes or in the examples included in PSoC Creator. Can you explain how I can configure the DMA to increment the address by more than one step? To be precise, I need to store the first data at address 0x00 (for example), the next data at address 0x03, and so on, but at the moment I use "TD_INC_DST_ADR" and it does not work for my application.
Thanks in advance.
INC_DST_ADR increments according to the transfer size for each block. So if you transfer 4 bytes per transaction, it will increment by 4. Or do you transfer only single bytes, but need to scatter them into memory?
(I remember seeing an AppNote explaining that, but cannot find it right now 😞)
In fact, my DMA transfers from a Status Register to SRAM one byte at a time.
Let me describe my system. I have four status registers, each with its input connected to an 8-bit bus. Each time I detect a bit change (on any of the four bytes), I need to transfer all four byte values to SRAM simultaneously, but in an organized manner:
byte 3 => SRAM address 0x00, byte 2 => address 0x01, byte 1 => address 0x02 and byte 0 => address 0x03; then for the next event, byte 3 => SRAM address 0x04, byte 2 => address 0x05, and so on. So each 4-byte packet corresponds to one event.
My idea is to end up with a single buffer that contains my event information in an organized manner, so that I can then simply transfer it over USB in 64-byte packets. I don't know if this is the best way to do it: four status registers + four DMAs + SRAM and a USB transfer.
To return to my first post, I was describing the system with only one status register and one DMA 🙂. You can see now why I need to jump from address 0x00 to address 0x04 for the same status register and DMA: at the same time, I have three other DMAs which should transfer data to addresses 0x01, 0x02 and 0x03 🙂. So you are right hli, I need to scatter them into memory :s and I don't know if I can do this. I think NOT, unfortunately.
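To make the target layout concrete, here is a small host-side sketch in plain C (not PSoC DMA code). It only models where each status byte of each event should land; the names event_byte_offset and store_event are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BYTES_PER_EVENT 4u

/* Offset in the buffer for a given event and status byte.
 * Byte 3 lands first (offset +0) and byte 0 last (offset +3),
 * matching the layout described in the post. */
static size_t event_byte_offset(size_t event, unsigned byte_index)
{
    assert(byte_index < BYTES_PER_EVENT);
    return event * BYTES_PER_EVENT + (BYTES_PER_EVENT - 1u - byte_index);
}

/* Software stand-in for the four per-register transfers of one event.
 * status[i] is the value of status byte i. */
static void store_event(uint8_t *buffer, size_t event,
                        const uint8_t status[BYTES_PER_EVENT])
{
    for (unsigned i = 0; i < BYTES_PER_EVENT; ++i) {
        buffer[event_byte_offset(event, i)] = status[i];
    }
}
```

Each byte position advances by 4 from one event to the next, which is exactly the stride that a single-byte transfer with a plain destination increment cannot produce on its own.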
Thanks in advance!!!
Not being a DMA expert, why would TD chaining not take care of this?
Here is a white paper, and one example where bytes were swapped in an intermediate
Thank you Dana!
I have finished my day 🙂. Tomorrow I will post a schematic of the system I want to build; a picture is worth a long discussion 😉
According to the link Dana sent you, you might:
Use 4 DMAs to transfer 4 single bytes to a 32-bit variable
Use 1 DMA to transfer that 4-byte value into your final destination array.
I am curious how you manage to get informed of a single bit-change within your 4 byte-wide status-registers.
To detect a bit change on one of my buses, I use an edge detector (rising and falling). I fan the bus out to 8 edge detectors; when an edge is detected, I use the generated pulse to trigger a DMA, for example, or other components. This seems to work, because I transfer 8-bit counter values to SRAM this way and then on to my PC (via USBFS, in 64-byte packets) without losing data. (1x Counter + 8x edge detectors + 1x Status Register + DMA + SRAM buffer + 1x USBFS).
I have read Dana's white paper and your explanation again carefully, Bob Marlowe. In the meantime, the attached file contains two simple schematics that show what I want to do. But Bob Marlowe, to be honest with you, I don't understand how I can configure my four DMAs to transfer into a single 32-bit variable, because the four DMAs transfer the four bytes simultaneously. Maybe I can set not only a destination address in the DMA parameters but also a position within that address. That way, each DMA would have the same destination address but write to a different location within it.
Different DMA transfers cannot occur simultaneously; there is a short time between them. My suggestion was to use the signal indicating a change to trigger 4 one-byte transfers to dest[0], dest[1], dest[2] and dest[3]. When all transfers have terminated, you transfer the 4-byte (32-bit) value from dest to the array keeping the final results.
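The two-stage idea can be sketched on a PC in plain C. This is only a software model of the data flow, not DMA configuration code; capture_t, stage_bytes and commit_event are hypothetical names, and the array capacity check is an assumption added for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EVENT_SIZE 4u

typedef struct {
    uint8_t  staging[EVENT_SIZE]; /* the intermediate "dest" variable */
    uint8_t *array;               /* final result array */
    size_t   capacity_events;     /* number of 4-byte slots in array */
    size_t   next_event;          /* index of the next free slot */
} capture_t;

/* Stage 1: four one-byte transfers, one per status register,
 * each to its own fixed byte of the staging variable. */
static void stage_bytes(capture_t *c, const uint8_t regs[EVENT_SIZE])
{
    assert(c != NULL);
    for (size_t i = 0; i < EVENT_SIZE; ++i) {
        c->staging[i] = regs[i];
    }
}

/* Stage 2: one 4-byte transfer from the staging variable to the array.
 * The source stays fixed, the destination advances by one event.
 * Returns 0 on success, -1 if the array is full. */
static int commit_event(capture_t *c)
{
    assert(c != NULL);
    if (c->next_event >= c->capacity_events) {
        return -1;
    }
    memcpy(&c->array[c->next_event * EVENT_SIZE], c->staging, EVENT_SIZE);
    c->next_event++;
    return 0;
}
```

The point of the staging variable is that only the second stage needs an incrementing destination, and it increments by the full transfer size of 4 bytes.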
While writing this I get the idea of a (more complicated) but much easier to handle solution: Within PSoC5 you can build your own components. When you create a 32-bit wide datapath component it will be able to detect a bit-change and capture the 32-bit result in a FIFO from which you can directly transfer it with one DMA. All done in hardware.
What do you need? You'll have to learn Verilog as a Hardware Description Language (HDL) and learn how to use datapath programming. This solution will use 4 (max. 8) of the 24 UDBs within the PSoC, will run with a clock of max. 48 MHz and, apart from setup, will need no CPU intervention. You may capture up to 1024 32-bit samples with a single DMA transfer.
Hope this doesn't give you headaches
Thank you Bob Marlowe,
I agree with you on the second solution. In fact, I had already explored this approach and created a custom component (symbol + Verilog file + DMA capability XML) which captures the four bytes and concatenates them into one 32-bit word. My component is detected in the DMA Wizard (Yeaaah!!), but my issue is implementing the 32-bit output register which must be transferred by DMA to SRAM. I think it comes down to the datapath configuration, but I don't understand that part... 😕
To be precise, when I choose my custom component as the source and SRAM as the destination in the DMA Wizard, I don't know how to specify which data the source points to.
You should not use the output register of the datapath, but the output FIFO, which is 4 entries deep. The fitter will assign names to the FIFOs of the UDBs used, and you can read from those. If you get stuck (which seems to be the case right now), file a MyCase and get help from a Cypress engineer (top of this page: Support&Community -> Technical Support -> Create a MyCase). Please keep us informed of this process, I'd like to see that solution.
Thank you Bob Marlowe for your advice. I will take time to explore the custom component approach further and I will contact a Cypress engineer!
When I have the solution I will share it with you!
I have finally chosen to create my own component, and I think this is the best way... As promised, I will explain how I did it.
Four 8 bits inputs, Byte0 => timer[7:0] ; Byte1 => timer[15:8]; Byte2 => timer[23:16] and Byte3 => CNT[7:0].
One output, a DRQ signal for DMA request transfer.
This component contains 4 UDBs set up as a 32-bit datapath. All UDBs are chained (basic chaining, as in this example http://www.cypress.com/?app=forum&id=2492&rID=76859) in the "Datapath Config Tool" and also chained in the Verilog file (see the example http://www.cypress.com/?rID=46730&cache=0 for Verilog chaining). In each UDB, "A0 WR SRC" and "F0 INSEL" are set to "ALU". We use the parallel input "PI" on each UDB (so PI is connected to A0 and A0 is connected to F0).
With this first component version, when I read the F0 content, I read my 32-bit words with no issue (CY_GET_REG32(MyComponent_DPConf__F0_REG), with the register name taken from cyfitter.h).
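For reference, the byte assignment above can be written as a pair of pack/unpack helpers in plain C. This sketch assumes that Byte0 ends up as the least significant byte of the 32-bit word read from F0, which the post does not state explicitly; sample_t, pack_sample and unpack_sample are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t timer; /* 24-bit timer value, timer[23:0] */
    uint8_t  cnt;   /* CNT[7:0] */
} sample_t;

/* Build the 32-bit word: Byte0..Byte2 = timer[23:0], Byte3 = CNT.
 * Assumption: Byte0 is the least significant byte of the word. */
static uint32_t pack_sample(uint32_t timer24, uint8_t cnt)
{
    assert(timer24 <= 0x00FFFFFFu);
    return ((uint32_t)cnt << 24) | timer24;
}

/* Split a captured 32-bit word back into timer and CNT fields. */
static sample_t unpack_sample(uint32_t word)
{
    sample_t s;
    s.timer = word & 0x00FFFFFFu;
    s.cnt   = (uint8_t)(word >> 24);
    return s;
}
```

On the PC side, the same unpack step could be applied to every 4-byte group received over USB, provided the byte order assumption holds.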
As a second step, I need to generate a DRQ. This DRQ is driven by two events: first, a timer overflow, and second, a bit change on the CNT byte.
Timer overflow handling: this is very easy to do. I use the UDB datapath output named ".ff0", which indicates when the A0 register is equal to FF. So when my three timer bytes are all equal to FF, my DRQ is set to '1'.
CNT bit change: to do this, I check the state of each bit with an either-edge detector (see APPENDIX B in http://www.cypress.com/?docID=42936). When a posedge or negedge is detected on one of my CNT bits, my DRQ is set to '1'.
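The DRQ condition described here can be modelled in plain C as a rough sketch. It samples once per call, which stands in for one clock cycle; the real component does this in Verilog, and the names edge_state_t, either_edge, timer_all_ones and drq are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t prev_cnt; /* CNT value seen on the previous sample */
} edge_state_t;

/* Either-edge detector: returns a mask of the CNT bits that changed
 * (rising or falling) since the previous sample. */
static uint8_t either_edge(edge_state_t *st, uint8_t cnt)
{
    assert(st != NULL);
    uint8_t changed = (uint8_t)(st->prev_cnt ^ cnt);
    st->prev_cnt = cnt;
    return changed;
}

/* Models the three chained ".ff0" outputs: all timer bytes equal 0xFF. */
static bool timer_all_ones(uint32_t timer24)
{
    return (timer24 & 0x00FFFFFFu) == 0x00FFFFFFu;
}

/* DRQ is raised on a timer overflow condition or on any CNT bit change. */
static bool drq(edge_state_t *st, uint32_t timer24, uint8_t cnt)
{
    /* Always update the edge detector, even if the timer term is true. */
    bool edge = either_edge(st, cnt) != 0u;
    return edge || timer_all_ones(timer24);
}
```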
To summarize, the F0 content is good and my DRQ generation is good... but I now have another issue. I can't transfer my 32-bit data properly with a DMA. When I read my destination array, the data is wrong. To investigate, I connected a digital constant to the CNT input, and despite this constant, the CNT byte in my destination array is not equal to my constant; worse, it varies (it changes at each clock edge).
See my DMA init:
So I have doubts about my source address... but I don't see where the problem is, because I use the same source address when I read the F0 content directly in my main and print it on my LCD screen.
To read my destination array:
I don't know if my explanation is clear... If you have questions or need more details, don't hesitate to ask!!
Sorry for bumping this thread...
On the same project, I don't understand why I can't read my UDB FIFO_0 correctly with a DMA. When I read the content of my FIFO_0 as follows:
dataout = CY_GET_REG32(SpyHandling_Byte0__F0_REG);
The data read is good: I obtain my 32-bit data.
But with the DMA, configured with 4 bytes per burst and request per burst set to 1:
source address => SpyHandling_Byte0__F0_REG.
destination address => an array named destinationArray.
And I read the SRAM content as follows:
databuffer = CY_GET_REG32(destinationArray);
Here, I read the same value four times with my DMA.
To read correctly with my DMA, I must increment the source address, but I don't think that is right, because when I do that I believe I am reading the first byte of the F0 FIFO of each of the 4 chained UDBs, and not reading from a single FIFO.
Do you have an idea of the cause of this issue?
See the attached UDB component test project (PSoC5LP CY8CKIT-050 dev kit).
Reading 32-bit registers with the DMA is tricky: it can only do 16 bits directly. For an explanation of how to do it, see AN61102. You really do need to increment the source address, so that you transfer 4 single bytes (if you don't increment the address, you always get the same byte 4 times).
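The effect of the source increment flag can be shown with a very simplified host-side model in plain C. It treats the transfer as a sequence of single-byte copies and ignores the spoke width and the real UDB register map; byte_dma_copy is a hypothetical name, not a PSoC API call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of one byte-wise transfer: count single-byte copies,
 * with optional increment of the source and destination addresses. */
static void byte_dma_copy(uint8_t *dst, const uint8_t *src, size_t count,
                          bool inc_src, bool inc_dst)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; ++i) {
        *dst = *src;
        if (inc_src) {
            ++src;
        }
        if (inc_dst) {
            ++dst;
        }
    }
}
```

With inc_src set to false and inc_dst set to true, the destination receives the first source byte four times, which matches the symptom described above; with both flags set, the four distinct bytes arrive in order.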
If you want to transfer more than one word, you need an intermediate memory. See the AN above for the '20-bit data buffering' chapter. The AN also comes with sample projects.
Thank you Hli,
I know this document, but I had completely forgotten about it!! Thanks a lot!
Now, if I understand correctly, my first mistake was to assume that my output spoke width is 32 bits when it is actually 8 bits. Is that right? That is why I need to increment my source address to read my complete word with the DMA (4 x 8 bits).
In my real project I already use the maximum number of TDs, i.e. 128. Each TD transfers 4 bytes from the UDB FIFO to SRAM, incrementing the source and destination addresses, and when the transfer is done it moves on to the next TD. This system works well, but the number of stored words is limited to 128 before it loops back to the first TD. My issue in the final project is that I have a continuous flow of data being stored in SRAM by the DMA while, "at the same time", USB reads the data from SRAM in 64-byte packets and sends it to the computer. The USB process is too slow, so the DMA catches up with the USB read process. That is why I want to optimize my DMA system to increase the number of words I can store in SRAM, so that it does not loop back to the first SRAM write address too quickly.
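The producer/consumer situation described here can be modelled as a ring buffer in plain C. This is a sketch under assumptions: the 512-byte size is an arbitrary example, the names ring_t, ring_push_word and ring_pop_packet are hypothetical, and the model refuses and counts a write that would overrun unread data, whereas a free-running DMA would simply overwrite it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORD_BYTES   4u   /* one captured 32-bit word */
#define PACKET_BYTES 64u  /* one USB packet */
#define RING_BYTES   512u /* example size; a multiple of PACKET_BYTES */

typedef struct {
    uint8_t  data[RING_BYTES];
    size_t   head;     /* total bytes written by the producer (DMA) */
    size_t   tail;     /* total bytes consumed by the reader (USB) */
    unsigned overruns; /* writes refused because the reader fell behind */
} ring_t;

/* Producer side: store one 4-byte word, or count an overrun if full. */
static bool ring_push_word(ring_t *r, const uint8_t word[WORD_BYTES])
{
    assert(r != NULL);
    if (r->head - r->tail > RING_BYTES - WORD_BYTES) {
        r->overruns++;
        return false;
    }
    memcpy(&r->data[r->head % RING_BYTES], word, WORD_BYTES);
    r->head += WORD_BYTES;
    return true;
}

/* Consumer side: take one full 64-byte packet if enough data is ready. */
static bool ring_pop_packet(ring_t *r, uint8_t packet[PACKET_BYTES])
{
    assert(r != NULL);
    if (r->head - r->tail < PACKET_BYTES) {
        return false;
    }
    memcpy(packet, &r->data[r->tail % RING_BYTES], PACKET_BYTES);
    r->tail += PACKET_BYTES;
    return true;
}
```

Because the ring size is a multiple of both the word size and the packet size, no single copy ever wraps around the end of the buffer. A larger ring only delays the overrun; if the average USB rate stays below the capture rate, data is still lost eventually.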
So, thank you for the document. I will try to draw on it to fit my project.
The output spoke of a peripheral is 16 bits. You set up a first DMA to transfer 4 bytes to a memory location (a uint32; increment source, increment dest). When that is finished, you transfer from that location to your array (4 bytes, no increment source, increment dest).
Second rule of PSoC: there is probably an AppNote about it 🙂
Your scenario is exactly what the AN describes. You need two intertwined DMAs: the first reads 4 bytes from the register into a fixed memory location, and the second then transfers that into the memory buffer you want to use (within SRAM, a 32-bit transfer is possible).