CyU3PDmaChannelSetXfer problem

ErOz_4712216

Hi,

I am trying to modify the FX3 cyfxmscdemo app to implement an interface from USB to an FPGA over GPIF.
I use the following APIs:
--
In Scsi Read10 case,
CyU3PDmaChannelSetupRecvBuffer (&glChHandleBulkPtoU, &dmaBuf) // receive from FPGA to temp buffer
CyU3PDmaChannelSetupSendBuffer (&glChHandleBulkPtoU, &dmaBuf) // send temp buffer to USB host

In Scsi Write10 case,
CyU3PDmaChannelSetupRecvBuffer (&glChHandleBulkUtoP, &dmaBuf) // receive from USB host to temp buffer
CyU3PDmaChannelSetupSendBuffer (&glChHandleBulkUtoP, &dmaBuf) // send temp buffer to FPGA
--

However, this temp buffer (glMscStorageDeviceMemory) has limited capacity because of the FX3 RAM size.
A USB 3.0 host may request up to 2048 numBlks in one command (2048 x 512 bytes = 1 MB), which cannot be stored in this temp buffer. I can correctly format the FPGA DDR memory using the functions above if I limit the numBlks count on the Linux side.

Now, I want to get rid of this buffer and send data directly from USB to the FPGA and vice versa. I noticed the CyU3PDmaChannelSetXfer() API in the cyfx3s_msc app and replaced the functions above with the following:

In Scsi Read10 case,
CyU3PDmaChannelSetXfer(&glChHandleBulkPtoU, numBlks*glLunBlkSize) // send from fpga to usb host(??)

In Scsi Write10 case,
CyU3PDmaChannelSetXfer(&glChHandleBulkUtoP, numBlks*glLunBlkSize) // send from usb host to fpga (??)

I suppose this should transfer data directly in both directions over the DMA channels without involving any intermediate buffer, right?

The problem is this: the host MSC driver first sends a Read10 command (LBA 0, numBlks 1), and the FPGA correctly sends the data to the USB host. I can see that data both in FPGA ChipScope and in the Wireshark USB capture on Linux. However, the Read10 acknowledgement does not appear in the Wireshark capture (I think it should come from the FX3 MSC firmware?). After that, the USB connection freezes and no further Read10 commands are sent.

Is there anything I am missing? Can't I use CyU3PDmaChannelSetXfer() in this case? How can I debug this further?

Thanks in advance.

 

 

JayakrishnaT_76

Hello,

Please refer to the documentation of the API CyU3PDmaChannelSetXfer () in the FX3 API guide. As mentioned there, this function just enables a DMA channel to transfer a specified amount of data before suspending again. You still need to allocate DMA buffers for the channel (when creating the channel) to receive the incoming data from the producer socket and then send it to the consumer socket.

Can you please share your project with us so that we can have a look at it?

Best Regards,
Jayakrishna
ErOz_4712216

Hi again,

I resolved the freeze problem which is explained above and I managed to send and receive data using CyU3PDmaChannelSetXfer() API.

However, I cannot send or receive data if the USB 3.0 host requests more data than my allocated dmaConfig.size (32 kB in my case). That is, if the host sends 64 numBlks (64 x 512 bytes = 32 kB), it is fine. But if the host sends 65 numBlks, the read/write transactions stop. I think the problem arises when the request exceeds dmaConfig.size (32 kB). Is this expected behavior? As far as I can tell, the maximum dmaConfig.size is 64 kB - 16. USB 3.0 hosts may send up to 2048 numBlks (1 MB). Is the transaction bounded by the DMA buffer size? How should we receive larger numBlks?
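The size arithmetic behind this observation can be sketched in plain C. This is only a model of the numbers, not FX3 firmware: `request_bytes` and `fits_in_dma_buffer` are hypothetical helper names, and the buffer size passed in stands for the dmaConfig.size value mentioned in the post.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: byte count of one Read10/Write10 request. */
static uint32_t request_bytes(uint32_t num_blks, uint32_t blk_size)
{
    return num_blks * blk_size;
}

/* Hypothetical helper: true if the whole request fits in a single
 * DMA buffer of dma_buf_size bytes. */
static bool fits_in_dma_buffer(uint32_t num_blks, uint32_t blk_size,
                               uint32_t dma_buf_size)
{
    return request_bytes(num_blks, blk_size) <= dma_buf_size;
}
```

With a 32768-byte buffer and 512-byte blocks, 64 numBlks fits exactly and 65 numBlks does not, which matches the point where the transactions were seen to stop.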

Please find the modified cyfxmscdemo app attached.

Regards,

 

 

 


Hello,

Can you please let us know how the freeze issue was resolved?

Also, please try sending 64 KB (that is, 128 numBlks) and let me know if you face any issues.

Best Regards,
Jayakrishna
ErOz_4712216

Can you please let us know how was the freeze issue resolved?
-> I removed CyU3PDmaChannelWaitForRecvBuffer() API in CyFxMscAppHandleMscTask() function.

Also, please try sending 64KB and let me know if you face any issues. That is 128 numBlks.
-> I tried 128 numBlks and received CY_USB_EVENT_RESET. I think the DMA configuration cannot handle more than 32 kB of data in this case. How can this be resolved?


Hello,

When you say you tried 128 numBlks, do you mean that two separate CY_FX_MSC_SCSI_WRITE_10 commands (64 x 512 bytes each) were sent to the device, or that a single CY_FX_MSC_SCSI_WRITE_10 (128 x 512 bytes) was sent? Can you please share the UART debug logs with us?

Best Regards,
Jayakrishna
ErOz_4712216

Hi again,

I do not send the CY_FX_MSC_SCSI_WRITE_10 commands myself. I only observe the CY_FX_MSC_SCSI_WRITE_10 / CY_FX_MSC_SCSI_READ_10 commands sent by the host PC (Ubuntu). Ubuntu detects the FX3 as a mass storage device, and once it does, it may send up to 2048 numBlks in a single CY_FX_MSC_SCSI_WRITE_10, as I noticed. I can limit the maximum numBlks in Ubuntu by setting the max_sectors attribute (to narrow down the problem). If I limit max_sectors to 64 (32 kB), everything seems fine. But then the speed is too low for a USB3 mass storage device (~1.5 Mbps).

Thus, Ubuntu should be able to send as many numBlks as it wants in order to reach faster speeds. Please find the attached log. Up to 64 numBlks, everything is fine. At the end of the log file, Ubuntu sends 120 numBlks and a reset event occurs.

 

JayakrishnaT_76

Hello,

Please let me know whether the first 32 KB of data was received properly at the host side when numBlks was set to 120. Also, please check whether the API CyU3PDmaChannelSetXfer () returned success when numBlks was set to 120.

In addition to this, please try issuing numBlks of 128 (64 KB) and call the API CyU3PDmaChannelSetXfer () twice as shown below:

CyU3PDmaChannelSetXfer (&glChHandleBulkLpPtoU, 32 * 1024);   // first 32 KB

...   // error handling and checking channel status

CyU3PDmaChannelSetXfer (&glChHandleBulkLpPtoU, 32 * 1024);   // second 32 KB

...   // error handling and checking channel status

That is, try calling CyU3PDmaChannelSetXfer () with a size that is not more than the DMA buffer. For transferring the remaining data, we can try calling the API CyU3PDmaChannelSetXfer () multiple times until the entire data is transferred. Please try this approach and let me know if you are able to find any differences.
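The splitting approach described above can be sketched as a small loop in plain C. This is a minimal model under stated assumptions, not firmware from this thread: `transfer_in_chunks`, `xfer_fn`, and `count_bytes` are hypothetical names, and the callback stands in for one CyU3PDmaChannelSetXfer () call together with its error handling and status check.

```c
#include <stdint.h>

/* Hypothetical stand-in for one CyU3PDmaChannelSetXfer() call plus its
 * error handling; returns 0 on success. */
typedef int (*xfer_fn)(uint32_t byte_count, void *ctx);

/* Split total_bytes into pieces no larger than max_chunk and hand each
 * piece to do_xfer. Returns the number of calls made, or -1 on failure. */
static int transfer_in_chunks(uint32_t total_bytes, uint32_t max_chunk,
                              xfer_fn do_xfer, void *ctx)
{
    int calls = 0;

    if (max_chunk == 0 || do_xfer == 0) {
        return -1;
    }
    while (total_bytes > 0) {
        uint32_t chunk = (total_bytes < max_chunk) ? total_bytes : max_chunk;
        if (do_xfer(chunk, ctx) != 0) {
            return -1;          /* stop at the first failed piece */
        }
        total_bytes -= chunk;
        calls++;
    }
    return calls;
}

/* Example callback that only adds up the bytes it was asked to move. */
static int count_bytes(uint32_t byte_count, void *ctx)
{
    *(uint32_t *)ctx += byte_count;
    return 0;
}
```

For a request of 128 numBlks (65536 bytes) and a 32 KB limit per call, the loop makes two calls; for 120 numBlks it also makes two, the second one shorter.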

Best Regards,
Jayakrishna
ErOz_4712216

Hello,

Please let me know if the first 32KB of data was received properly at the host side when numBlks was set to 120. Also, please check the return value of the API CyU3PDmaChannelSetXfer () was success or not when numBlks was set to 120.

-> When the host is allowed more than 64 numBlks, data is received properly for commands with fewer than 64 numBlks. It fails on the first Read10/Write10 command with more than 64 numBlks. Please have a look at the last lines of the attached uart_log_numBlks120.txt file.
The CyU3PDmaChannelSetXfer() API returned success. However, prodXferCount differs from consXferCount, and the CyU3PDmaChannelGetStatus() API returned DmaState:5 (CY_U3P_DMA_ERROR).
All previous successful commands have the same prodXferCount and consXferCount, and CyU3PDmaChannelGetStatus() returned DmaState:9 (CY_U3P_DMA_XFER_COMPLETED) for them.

 

In addition to this, please try issuing numBlks of 128 (64KB) and call the API CyU3PDmaChannelSetXfer () twice as shown below:

-> I had tried this option. Please have a look at the commented-out code in the CY_FX_MSC_SCSI_READ_10/WRITE_10 cases in cyfxmscdemo.c. I send at most 32 numBlks per CyU3PDmaChannelSetXfer() call and call CyU3PDmaChannelSetXfer() multiple times in the READ_10 and WRITE_10 cases.

This solution seemed to work initially for a couple of minutes of continuous read/write transfers, although the read/write speed did not improve. However, after a couple of minutes, CyU3PDmaChannelWaitForCompletion() returns 73 (CY_U3P_ERROR_DMA_FAILURE). Please have a look at the last lines of the attached uart_log_multiple_xfer.txt.

 

 


Hello,

1. Can you please share the source code you used when you tried continuous read/write transfers? I ask because in the source shared before, I found the following lines:

while((numBlks-tempnumBlks)>= 32){
SendLBAandNumBlks(tempAddr, 32, 0x0);

Based on my understanding, this 32 should be replaced with 64 (as 64 numBlks will be equal to 32KB). By the way, what is the use of the function SendLBAandNumBlks ()? Does it trigger the FPGA to send or receive a particular numBlks * glLunBlkSize? Also, does the following line of code add delay for the FPGA to be ready?

for(uint32_t looper = 0; looper <400; looper++){
sleepOperation(0x80);
}

2. Can you please remove the API CyU3PDmaChannelWaitForCompletion () for the continuous transfer case and implement delays as done in the single read case? Please let me know if there are any improvements seen after trying this.

3. When the CY_U3P_ERROR_DMA_FAILURE is seen, please share the channel status by using the API CyU3PDmaChannelGetStatus (). 

Best Regards,
Jayakrishna
ErOz_4712216

Hello,

1) Please find the attached file.  Yes, SendLBAandNumBlks() triggers the FPGA to send or receive a particular numBlks * glLunBlkSize. I had added sleep to wait long enough to ensure the function call completed correctly. I removed sleep loops in the attached file.

2) I removed CyU3PDmaChannelWaitForCompletion() and sleeps.

This time, I observe the following error :

When a Write10 command with more than 64 numBlks arrives, the CyU3PDmaChannelSetupSendBuffer(&glChHandleBulkUtoP, &buffer) call in the SendLBAandNumBlks() function returns 67 (CY_U3P_ERROR_ALREADY_STARTED). What does it mean? Do I have to wait longer? Please find the attached log. At the end of the UART log file, you can see the error and the current status of the DMA channels. According to my observation, if I add more delay between function calls, this error pops up less frequently.

Since more than 32kB cannot be transferred during a single CyU3PDmaChannelSetXfer () API, (due to max dmabuffer size), read/write speeds are still very low. How are we going to overcome this issue, even if all previous errors are cleared?


Hello,

The error CY_U3P_ERROR_ALREADY_STARTED will pop up for CyU3PDmaChannelSetupSendBuffer () if the previous transfer is ongoing. Can you please add enough delays between successive CyU3PDmaChannelSetupSendBuffer ()  calls and let me know if the transfer happens without any issues or not?

Let us deal with the transfer failure for now. We will deal with the data transfer issue once this is sorted out. 

Best Regards,
Jayakrishna
ErOz_4712216

Hi,

Isn't there an API to check whether the previous transfer has completed? I have added many redundant prints between API calls to get rid of the CY_U3P_ERROR_ALREADY_STARTED error.

Now read/write transactions seem to work for tens of minutes. However, I noticed that CY_U3P_USB_EVENT_RESET is generated after some time. What could be the reason for that reset? Please find the attached log, in which the reset event occurred.


Hello,

You can try making use of the API CyU3PDmaChannelWaitForCompletion () with the timeout parameter set as CYU3P_WAIT_FOREVER for getting rid of CY_U3P_ERROR_ALREADY_STARTED errors:

status = CyU3PDmaChannelSetupSendBuffer (&glChHandleBulkLpUtoP, &tempy);

status = CyU3PDmaChannelWaitForCompletion (&glChHandleBulkLpUtoP, CYU3P_WAIT_FOREVER);

Also, I hope that you are no longer seeing any transfer-related errors after following the suggestions shared from my side. The only remaining issue is the occurrence of reset events. As you may know, a reset event is triggered when the USB host issues a reset to the USB device. We are not sure of the reason for these resets. Please confirm that you are using FX3 SDK 1.3.4 for testing. If not, then please use SDK 1.3.4 as it is the latest official release.

Can you please share USB traces captured using a hardware analyzer such as a LeCroy? If one is not available at your end, can you please share traces captured using usbmon (as you are testing on a Linux PC)?

In addition to this, please share the latest sources and the UART debug logs with us along with the USB traces so that we can analyze the problem better.

Best Regards,
Jayakrishna
ErOz_4712216

Hello,

1) I called CyU3PDmaChannelWaitForCompletion () after CyU3PDmaChannelSetupSendBuffer(). Then I saw CY_U3P_ERROR_ALREADY_STARTED after CyU3PDmaChannelSetXfer(). I also called CyU3PDmaChannelWaitForCompletion() after CyU3PDmaChannelSetXfer(). Now I receive CY_U3P_ERROR_ABORTED (72). Please see the attached error aborted.zip for the log and appnote.

2) As another scenario, I removed CyU3PDmaChannelWaitForCompletion() and called so many prints to slow down the firmware. At this time it is working continuously for minutes. After a while, CyU3PDmaChannelGetStatus() returned dma_state CY_U3P_DMA_ERROR(5), although CyU3PDmaChannelSetXfer() returned successful. Please see the attached dma_error_log and related appnote.

3) I use FX3 SDK 1.3.4.


Hello,

Based on my understanding of the UART debug logs, when numBlks is greater than 64, you are able to send multiples of 64 numBlks correctly, but the last transfer (which is less than 64 numBlks) gives an error in the API CyU3PDmaChannelWaitForCompletion (). Please correct me if my understanding is wrong. Also, please let me know whether you still receive the data when the API fails.

Please try changing the following line in your code:

status = CyU3PDmaChannelSetXfer (&glChHandleBulkLpPtoU, (tempnumBlks * glLunBlkSize));

to

status = CyU3PDmaChannelSetXfer (&glChHandleBulkLpPtoU, (64 * glLunBlkSize));

And let me know if you are still facing errors.

Also, please add debug prints just before each use of the API CyU3PDmaChannelReset () in your source code. This is to track whether the DMA channel is reset while the transfers are in progress.

In addition to this, please use a global variable for tracking the number of CY_U3P_DMA_CB_XFER_CPLT events that are received in the DMA callback function CyFxMscApplnDmaCb (). When a CY_FX_MSC_SCSI_READ_10 command is received, set the global variable to 0. You need to increment the global variable whenever CY_U3P_DMA_CB_XFER_CPLT event is received inside the CyFxMscApplnDmaCb (). Towards the end of the CY_FX_MSC_SCSI_READ_10 command handling, print the value of the global variable and share the UART debug logs with us along with the modified code for review.
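The counting scheme described above can be modelled in a few lines of plain C. This is a sketch under stated assumptions rather than the actual CyFxMscApplnDmaCb () code: the enum `dma_event_t`, the names `EV_XFER_CPLT` and `EV_OTHER`, the counter `glXferCpltCount`, and both functions are hypothetical stand-ins for the SDK callback event type and the firmware's own handlers.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the SDK's DMA callback event codes. */
typedef enum {
    EV_XFER_CPLT,   /* plays the role of CY_U3P_DMA_CB_XFER_CPLT */
    EV_OTHER        /* any other DMA callback event */
} dma_event_t;

/* Global counter; volatile because the callback and the command handler
 * would run in different contexts in real firmware. */
static volatile uint32_t glXferCpltCount = 0;

/* Call when a Read10 command is received: start counting from zero. */
static void on_read10_start(void)
{
    glXferCpltCount = 0;
}

/* Model of the DMA callback: count only transfer-complete events. */
static void dma_callback(dma_event_t ev)
{
    if (ev == EV_XFER_CPLT) {
        glXferCpltCount++;
    }
}
```

At the end of the Read10 handling, the firmware would print the counter; a value lower than the number of CyU3PDmaChannelSetXfer () calls made for that command would suggest that some transfers never completed.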

Best Regards,
Jayakrishna
ErOz_4712216

Hello,

In the current situation, I am facing problems on the Read10 side when numBlks is around 30 (?). Please find the attached log and app, where I've added a counter for CY_U3P_DMA_CB_XFER_CPLT events. It works for numBlks 1, 8, 16 and 24. I experimentally set the maximum to 30 and Read10 stopped working. I was able to overcome this problem by using while((numblks-tempnumblks) >= 16) in the Read10 case (that is, sending at most 16 numBlks at a time). However, this is really strange. There should be enough space in the DMA buffer. Am I wrong? The Write10 loop can send 64 numBlks at a time.

Regarding your questions,
since I can already receive fewer than 64 numBlks correctly, calling the function below does not make any difference. (status = CyU3PDmaChannelSetXfer (&glChHandleBulkLpPtoU, (64 * glLunBlkSize));)
When the error condition happens, data does not reach the Linux side in the Read10 case.


Hello,

From the UART debug prints, I find the following when error is seen:

USB Read10 startAddr: 106
USB Read10 numBlocks: 30
USB event 4 received

ErrorSending Read10 remBlks!
USB Read10 startAddr: 106
USB Read10 numBlocks: 30
XFER_CPLT_COUNT: 0

As you can see, when the code is inside the Read 10 handling, a USB Event 4 is received (USB RESET). As part of the reset event handling, the DMA channel is Reset inside the function CyFxMscApplnResetDatapath (). This is the reason for CyU3PDmaChannelWaitForCompletion () to return error code 72. Also, this is the reason for the data loss.

We are not sure of the reason for the USB reset issued from the host side. Can you please share the USB traces captured using a LeCroy analyzer or usbmon so that we can check?

Best Regards,
Jayakrishna
JayakrishnaT_76

Hello,

Can you please update this thread by sharing the USB traces so that we can further debug this issue? 

Best Regards,
Jayakrishna
ErOz_4712216

Hello again,

I was able to resolve the "usb event 4" error in the Read10 case. I was using AN82716 for the GPIF interface. After each 512-byte read in the DO_IN_DATA state, I sent INTR_CPU (RD_SHORT_PKT). This looped until the required amount of data was read. Now I do not send any INTR_CPU and return to the RD_WR_IDLE state using only the DMA_RDY_TH0 flag. This change somehow resolved the error I was facing.

Now I can read and write numBlks up to the DMA buffer size. I set the DMA buffer size to the maximum allowed, 65520 bytes, which gives at most 127 numBlks. Although the read/write speeds increase with 127 numBlks, is it possible to make numBlks larger than the DMA buffer size allows, or are we bounded by that capacity?

Regards,


Hello,

As mentioned in Section 5.5.4 - DMA Buffer of the FX3 TRM, the maximum DMA buffer size allowed is 0xFFF0 bytes (65520 bytes). It is not possible to increase the DMA buffer size beyond this limit.
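The consequence of this limit for 512-byte blocks can be worked out with a short C sketch. It only models the arithmetic; `FX3_MAX_DMA_BUF_SIZE`, `max_blocks_per_buffer`, and `xfers_needed` are hypothetical names introduced for illustration and are not SDK symbols.

```c
#include <stdint.h>

/* Maximum DMA buffer size quoted above: 0xFFF0 = 65520 bytes. */
#define FX3_MAX_DMA_BUF_SIZE 0xFFF0u

/* Whole blocks of blk_size bytes that fit in one maximum-size buffer. */
static uint32_t max_blocks_per_buffer(uint32_t blk_size)
{
    return blk_size ? FX3_MAX_DMA_BUF_SIZE / blk_size : 0;
}

/* Number of buffer-sized pieces needed to cover num_blks blocks
 * (ceiling division). */
static uint32_t xfers_needed(uint32_t num_blks, uint32_t blks_per_xfer)
{
    return blks_per_xfer ? (num_blks + blks_per_xfer - 1) / blks_per_xfer : 0;
}
```

With 512-byte blocks this gives 127 blocks per buffer, so a 2048-block request from the host would have to be covered in 17 buffer-sized pieces.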

Best Regards,
Jayakrishna