I am working with two FX3s in a back-to-back configuration, and I am wondering if I am missing something.
When I push a buffer from Master->Slave, the correct number of bytes is received with no additional padding. However, if I send from Slave->Master, GPIF seems to append zeros to fill out the rest of my buffer, so I have to use something like the following to remove them, which significantly impacts performance.
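/* input->buffer_p.count always reports 4096 because of the GPIF padding, so scan
 * back from the end to find the actual size. A full buffer exits the loop at once.
 * The i > 0 check stops the scan if the buffer is all zeros. */
i = input->buffer_p.count;
while ((i > 0) && (input->buffer_p.buffer[i - 1] == 0x00)) {
    i--;
}
CyU3PDmaChannelCommitBuffer (chHandle, i, 0);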
Again, this method works, but I would rather use Auto channels for the speed difference, avoiding CPU modification entirely.
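For reference, the trailing-zero scan can be written as a standalone, bounds-checked helper. This is only a sketch: `trimmed_length` is a hypothetical name and not part of the FX3 SDK, and it works on a plain byte array rather than a DMA buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FX3 SDK call): returns how many bytes remain
 * in buf once trailing 0x00 bytes are dropped. Stops at index 0, so an
 * all-zero buffer yields 0 instead of reading before the start. */
size_t trimmed_length(const uint8_t *buf, size_t count)
{
    while (count > 0 && buf[count - 1] == 0x00) {
        count--;
    }
    return count;
}
```

A scan like this cannot tell padding apart from payload that really ends in zero bytes, so it is only safe when the data is known not to end in 0x00.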
I am aware of the 4 byte alignment padding, where sending 6 bytes will add 2 bytes, giving a total of 8 received bytes. This behavior is fine. The issue is when I want to send a short packet, say 84 bytes. When the Master side receives this, I receive a buffer with its count at 1024, and reading out the data shows my 84 bytes, followed by 940 zeros in the packet.
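The two kinds of padding described here can be put into numbers with a small sketch. Both function names are made up for illustration; the only inputs are the figures quoted above (6 bytes, 84 bytes, a 1024-byte buffer).

```c
#include <stdint.h>

/* Round a byte count up to the next multiple of 4 (the expected alignment padding). */
uint32_t align_up_4(uint32_t n)
{
    return (n + 3u) & ~3u;
}

/* Number of filler bytes when a payload arrives in a buffer reported as buf_size bytes. */
uint32_t pad_bytes(uint32_t payload, uint32_t buf_size)
{
    return (payload >= buf_size) ? 0u : (buf_size - payload);
}
```

With these, `align_up_4(6)` gives 8, which is the acceptable case, while 84 bytes is already 4-byte aligned and `pad_bytes(84, 1024)` gives the 940 unwanted zeros.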
I have tried a variety of settings, but then found that this behavior appears in the provided back to back example, AN87216. Can this be avoided?
As an example:
A transfer from Master->Slave works fine, and the correct number of bytes arrive on the other side.
But from Slave->Master, you can see all of the padding being added.
Hello,
I understand that you are using the default firmware shared with AN87216. Please confirm.
If so, the default firmware does allow short packets (e.g., 84 bytes) to be sent to the master. Please refer to section 6.3 of the app note, which explains that a short packet is identified when there is no more data in the DMA buffer of the slave FX3 (FLAG C asserted) and the ADDR_CNT_HIT event has not been generated.
Can you please check whether CyFxApplnGPIFEventCB is called on the master side? You can check this by incrementing a variable in the callback and printing its value later in the for(;;) loop.
If that doesn't work, please check the interface signals between the FX3 master and the FX3 slave. Also, let me know whether you are using a custom board or the Cypress FX3 kit, i.e. CYUSB3KIT-003.
Rashi
Hi,
I am using a modified version of AN87216 with four threads. I have attached the state diagrams below. Additionally, I am using two CYUSB3KIT-003 boards, connected as shown in figure 27 of the AN87216 example project.
As you can see in my state diagram, Flag C is used in handling thread 3. I have tried to create an additional flag, Flag E, but this has the same effect as before and doesn't remove the zero padding.
I can confirm that CyFxApplnGPIFEventCB is being called, but only when I leave the transition to RD_SHORT_PKT as !ADDR_CNT_HIT. However, this then seems to corrupt threads 2 and 3, as changing this transition stops any data from being transferred over those two threads.
Using a transition of !ADDR_CNT_HIT&!FLAG_A almost achieves what I want, except that buffers appear to overflow into one another; however, that makes sense given what FLAG A is.
Is there a way to add or modify a Flag E so that it works the way the flag does in AN87216? That appears to be a solution if I can get it to behave the same way. I am not quite sure where Flag C is actually coming from in that example, since the flags are set up to indicate socket availability, but there are only two configured sockets in AN87216.
Thank you for the help.
Hello,
Please explain the application in detail so that I can suggest a proper solution.
From the Master state machine, I see that Flag C is not used either to drive the data out from the master or to read the data from the slave. Please let me know why four threads are used.
Please also let me know whether you have modified the FX3 slave firmware to add the DMA channels associated with the 4 threads.
Using a transition of !ADDR_CNT_HIT&!FLAG_A almost achieves what I want, except that buffers appear to overflow into one another; however, that makes sense given what FLAG A is.
>> I didn't understand this. Can you please explain it in more detail? Please use !ADDR_CNT_HIT&!FLAG_A as the transition equation.
I am not quite sure where Flag C is actually coming from in that example, since the flags are set up to indicate socket availability, but there are only two configured sockets in AN87216.
>> FLAG C in the default state machine is the watermark flag, not the DMA ready flag, for thread 1. For more details on DMA flags, please refer to AN65974.
Rashi
For reference, the four threads are needed because two threads are set up to forward the bulk IN/OUT data from one device to the other, as in the example project, and two additional threads forward data to an interrupt endpoint on each side in response to a vendor request. Essentially, this lets me send an interrupt to the opposite device's host when I receive a vendor/class request, and vice versa.
DMA Channels configured as follows:
#define CY_FX_DMA_BUF_COUNT (3) /* Master channel buffer count */
#define CY_FX_DMA_TX_SIZE (0) /* DMA transfer size is set to infinite */
#define CY_FX_THREAD_STACK (0x0400) /* Master application thread stack size */
#define CY_FX_THREAD_PRIORITY (8) /* Master application thread priority */
#define CY_FX_INTR_BUF_COUNT (16) /* Interrupt channel buffer count */
#define CY_FX_INTR_TX_SIZE (64) /* Interrupt DMA buffer size is set to 64 */
#define CY_FX_EP_PRODUCER 0x02 /* EP 2 OUT */
#define CY_FX_EP_CONSUMER 0x81 /* EP 1 IN */
#define CY_FX_EP_CONSUMER_INTR 0x83 /* EP 3 IN */
#define CY_FX_PRODUCER_USB_SOCKET CY_U3P_UIB_SOCKET_PROD_2 /* USB socket 2 is producer */
#define CY_FX_CONSUMER_USB_SOCKET CY_U3P_UIB_SOCKET_CONS_1 /* USB socket 1 is consumer */
#define CY_FX_INTERRUPT_USB_SOCKET CY_U3P_UIB_SOCKET_CONS_3 /* USB socket 3 is consumer */
/* Used with FX3 Silicon. */
#define CY_FX_PRODUCER_PPORT_SOCKET CY_U3P_PIB_SOCKET_0 /* P-port Socket 0 is producer */
#define CY_FX_CONSUMER_PPORT_SOCKET CY_U3P_PIB_SOCKET_1 /* P-port Socket 1 is consumer */
#define CY_FX_INTERRUPT_PRODUCER_PPORT_SOCKET CY_U3P_PIB_SOCKET_2 /* P-port Socket 2 is producer */
#define CY_FX_INTERRUPT_CONSUMER_PPORT_SOCKET CY_U3P_PIB_SOCKET_3 /* P-port Socket 3 is consumer */
/* Burst length in 1 KB packets. Only applicable to USB 3.0. */
#define CY_FX_EP_BURST_LENGTH (16)
/* Multiplication factor used when allocating DMA buffers to reduce DMA callback frequency. */
#define CY_FX_DMA_SIZE_MULTIPLIER (2)
/* Create a DMA AUTO channel from the USB producer socket to the P-port consumer socket.
 * DMA size is set based on the USB speed. */
dmaCfg.prodSckId = CY_FX_PRODUCER_USB_SOCKET;
dmaCfg.consSckId = CY_FX_CONSUMER_PPORT_SOCKET;
dmaCfg.dmaMode = CY_U3P_DMA_MODE_BYTE;
dmaCfg.notification = 0;
dmaCfg.cb = NULL;
dmaCfg.prodHeader = 0;
dmaCfg.prodFooter = 0;
dmaCfg.consHeader = 0;
dmaCfg.prodAvailCount = 0;
apiRetStatus = CyU3PDmaChannelCreate (&glChHandleBulkLpUtoP,
        CY_U3P_DMA_TYPE_AUTO, &dmaCfg);
if (apiRetStatus != CY_U3P_SUCCESS)
{
    DBGPRINT ("glChHandleBulkLpUtoP create failed, Error code = %d\n", apiRetStatus);
    CyFxAppErrorHandler(apiRetStatus);
}
/* Create a DMA AUTO channel from the P-port producer socket to the USB consumer socket.
 * DMA size is set based on the USB speed. */
dmaCfg.prodSckId = CY_FX_PRODUCER_PPORT_SOCKET;
dmaCfg.consSckId = CY_FX_CONSUMER_USB_SOCKET;
dmaCfg.notification = 0;
dmaCfg.cb = NULL;
apiRetStatus = CyU3PDmaChannelCreate (&glChHandleBulkLpPtoU,
        CY_U3P_DMA_TYPE_AUTO, &dmaCfg);
if (apiRetStatus != CY_U3P_SUCCESS)
{
    DBGPRINT ("glChHandleBulkLpPtoU create failed, Error code = %d\n", apiRetStatus);
    CyFxAppErrorHandler(apiRetStatus);
}
/* Create a DMA Manual Out Channel between CPU and Interrupt Socket */
dmaCfg.size = CY_FX_INTR_TX_SIZE;
dmaCfg.count = CY_FX_INTR_BUF_COUNT;
dmaCfg.prodSckId = CY_U3P_CPU_SOCKET_PROD;
dmaCfg.consSckId = CY_FX_INTERRUPT_USB_SOCKET;
dmaCfg.dmaMode = CY_U3P_DMA_MODE_BYTE;
/* No callback is required. */
dmaCfg.notification = 0;
dmaCfg.cb = NULL;
apiRetStatus = CyU3PDmaChannelCreate (&glChHandlePushToInt,
        CY_U3P_DMA_TYPE_MANUAL_OUT, &dmaCfg);
if (apiRetStatus != CY_U3P_SUCCESS)
{
    DBGPRINT ("glChHandlePushToInt create failed, Error code = %d\n", apiRetStatus);
    CyFxAppErrorHandler(apiRetStatus);
}
/* Create a DMA MANUAL_IN channel from the Interrupt PPORT to the CPU */
dmaCfg.prodSckId = CY_FX_INTERRUPT_PRODUCER_PPORT_SOCKET;
dmaCfg.consSckId = CY_U3P_CPU_SOCKET_CONS;
dmaCfg.dmaMode = CY_U3P_DMA_MODE_BYTE;
/* Different Callback setup */
dmaCfg.notification = CY_U3P_DMA_CB_PROD_EVENT;
dmaCfg.cb = InterruptDmaPtoUCallback;
apiRetStatus = CyU3PDmaChannelCreate (&glChHandleInterruptLpPtoU,
        CY_U3P_DMA_TYPE_MANUAL_IN, &dmaCfg);
if (apiRetStatus != CY_U3P_SUCCESS)
{
    DBGPRINT ("glChHandleInterruptLpPtoU create failed, Error code = %d\n", apiRetStatus);
    CyFxAppErrorHandler(apiRetStatus);
}
/* Create a DMA MANUAL_OUT channel from the CPU to the Interrupt PPORT */
dmaCfg.prodSckId = CY_U3P_CPU_SOCKET_PROD;
dmaCfg.consSckId = CY_FX_INTERRUPT_CONSUMER_PPORT_SOCKET;
/* No callback is required. */
dmaCfg.notification = 0;
dmaCfg.cb = NULL;
apiRetStatus = CyU3PDmaChannelCreate (&glChHandleInterruptLpUtoP,
        CY_U3P_DMA_TYPE_MANUAL_OUT, &dmaCfg);
if (apiRetStatus != CY_U3P_SUCCESS)
{
    DBGPRINT ("glChHandleInterruptLpUtoP create failed, Error code = %d\n", apiRetStatus);
    CyFxAppErrorHandler(apiRetStatus);
}
Rereading AN65974, I was able to create a Flag E with the following settings:
This did reduce the amount of padding, but now I am getting some weird results that still seem to involve the padding. I am able to confirm that it is just excess padding by using the same code as before to strip the trailing zeros.
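/* input->buffer_p.count always reports 4096 because of the GPIF padding, so scan
 * back from the end to find the actual size. A full buffer exits the loop at once.
 * The i > 0 check stops the scan if the buffer is all zeros. */
i = input->buffer_p.count;
while ((i > 0) && (input->buffer_p.buffer[i - 1] == 0x00)) {
    i--;
}
CyU3PDmaChannelCommitBuffer (chHandle, i, 0);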
Using Wireshark to look at the packets, I can see that the zero padding still occurs in the trailing packet of my initial message (these messages are all the same: 128000 bytes with the value 0x80, followed by 0x0A to signal the end of the data):
The data should have stopped at the 0x0A value. From the looks of things, the zeros then overflow into the next packet's buffer, as sending the same message again starts with an excess of zeros before my data appears:
This somehow shifts the end of that packet's contents into the next packet's buffer, so the end of my second packet takes up the beginning of my third packet, and the shifting continues through every subsequent packet sent.
I should also mention that this appears to happen on 1024-byte boundaries: the first packet is padded out to 1024 bytes, and the runs of zero padding that shift my actual data around all seem to be 1024 bytes long as well.
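To make the 1024-byte pattern concrete, here is a small sketch that splits a message into full packets and a trailing short packet. It assumes the test message is 128001 bytes in total (128000 bytes of 0x80 plus one 0x0A terminator) and that it is cut into 1024-byte packets; `packet_split_t` and `split_transfer` are names invented for this illustration.

```c
#include <stdint.h>

/* Result of cutting a transfer into fixed-size packets. */
typedef struct {
    uint32_t full_packets;  /* packets of exactly max_packet bytes */
    uint32_t short_bytes;   /* bytes in the trailing short packet, 0 if none */
} packet_split_t;

/* Hypothetical helper: split total_bytes into max_packet-sized packets. */
packet_split_t split_transfer(uint32_t total_bytes, uint32_t max_packet)
{
    packet_split_t s = {0u, 0u};
    if (max_packet == 0u) {
        return s;  /* invalid packet size, report nothing */
    }
    s.full_packets = total_bytes / max_packet;
    s.short_bytes = total_bytes % max_packet;
    return s;
}
```

Under those assumptions, `split_transfer(128001, 1024)` gives 125 full packets and a 1-byte short packet, so padding that short packet out to 1024 bytes would add 1023 zeros.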
Additionally, when I said that !ADDR_CNT_HIT&!FLAG_A gave a result that almost worked, this is the same result that I got.
I have tried adjusting my DATA_COUNT and ADDR_COUNT values to match the formulas described in AN65974, but it doesn't seem to have any effect; the result is always the above.
I have reattached the updated state machines with the correct settings for Flag E.
Thank you for working through this with me.
Hello,
Thank you for the details.
I understand that only two channels are currently used for the test, i.e. those related to the BULK endpoints, and that the channels related to the interrupt endpoint are not used during this test. Is my understanding correct?
If yes, I understand that this is the test process you are following:
- Connecting two FX3 back to back
- Sending 84 bytes from Master > Slave works as expected
- Sending 84 bytes from Slave > Master doesn't work as expected (i.e. zeroes are padded)
From this, it seems that the read from the Master doesn't work as expected. I understand that you have configured the watermark value as 0 using the CyU3PGpifSocketConfigure API.
Which did reduce the amount of padding, but now I am getting some weird results that still seems to be an issue involving the padding.
>> To narrow down the problem, please do the following and let me know the results (control center snippets)
- Send small amount of data (for example 16 bytes - incrementing numbers) from Master to Slave
- Read the data from IN endpoint of Slave
- Send small amount of data (for example 16 bytes - incrementing numbers) from Slave to Master. Check input->buffer_p.count in the DMA callback of the P to U channel on the Slave side. Please do not call CyU3PDebugPrint inside the DMA callback.
- Check if CyFxApplnGPIFEventCB is called on master. When CyU3PDmaChannelSetWrapUp is called, a producer event for the glChHandleBulkLpPtoU channel will be triggered. Copy input->buffer_p.count into a variable and print that variable outside the DMA callback.
From this test we can understand at which point the zeroes are added. Also, please remove the zero-stripping code snippet for this test.
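The "record in the callback, print outside it" pattern from the steps above can be sketched in plain C. This is a host-side stand-in, not FX3 firmware: `on_prod_event` and `poll_events` are hypothetical names, and in real firmware the byte count would come from input->buffer_p.count and the printing would happen in the application thread's for(;;) loop.

```c
#include <stdint.h>

/* State shared between the callback and the main loop. */
static volatile uint32_t g_event_count;  /* how many times the callback ran */
static volatile uint32_t g_last_bytes;   /* byte count seen by the last callback */

/* Stand-in for the callback body: only record values, never print here. */
void on_prod_event(uint32_t byte_count)
{
    g_last_bytes = byte_count;
    g_event_count++;
}

/* Called from the main loop. Returns 1 and fills the outputs when the
 * callback has run since the previous poll, otherwise returns 0. */
int poll_events(uint32_t *seen, uint32_t *count_out, uint32_t *bytes_out)
{
    uint32_t now = g_event_count;
    if (now == *seen) {
        return 0;
    }
    *seen = now;
    *count_out = now;
    *bytes_out = g_last_bytes;
    return 1;
}
```

The main loop keeps its own `seen` counter and prints only when `poll_events` returns 1, which keeps slow debug output out of the callback context.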
Also, I have a query regarding the transition equations from state SELECT_1_OR_2: why does one transition use logic 1 while the other checks for a flag?
Rashi
It looks like my suspicion about 1024-byte alignment was correct. I increased the data size, and all the way up to (but not including) 1024 bytes, I receive exactly the amount of data that I sent. As an example, I sent 996 bytes:
And read 996 bytes on the other side:
However, something interesting happens at exactly 1024 bytes. When transmitting 1024 bytes, the transfer appears to succeed, but the read then fails on the other end as if no data was made available.
Then, anything larger than 1024 bytes becomes fully padded with zeros. As an example, the picture below shows a transfer of 1040 bytes:
But when read on the other end, it's fully padded out with zeros.
When reading input->buffer_p.count, I get the exact number of bytes sent when it is less than 1024 bytes, and 32768 when it is greater than 1024 bytes. However, at exactly 1024, I am not receiving the packet at all.
For additional information, I have my sockets configured as follows.
Master side:
CyU3PGpifSocketConfigure(0, CY_U3P_PIB_SOCKET_0, 4, CyFalse, 7);
CyU3PGpifSocketConfigure(1, CY_U3P_PIB_SOCKET_1, 4, CyFalse, 7);
CyU3PGpifSocketConfigure(2, CY_U3P_PIB_SOCKET_2, 4, CyFalse, 7);
CyU3PGpifSocketConfigure(3, CY_U3P_PIB_SOCKET_3, 4, CyFalse, 7);
Slave side:
CyU3PGpifSocketConfigure(0, CY_U3P_PIB_SOCKET_0, 4, CyFalse, 7);
CyU3PGpifSocketConfigure(1, CY_U3P_PIB_SOCKET_1, 0, CyFalse, 1);
CyU3PGpifSocketConfigure(2, CY_U3P_PIB_SOCKET_2, 4, CyFalse, 7);
CyU3PGpifSocketConfigure(3, CY_U3P_PIB_SOCKET_3, 0, CyFalse, 1);
I have tried a variety of combinations in addition to this, including adjusting the watermark from 0 to 4 on all sockets and the burst values from 0 to 7. Additionally, the Master side is running with pibClock.clkDiv = 4, and the Slave side with it set to 2, as suggested in the example application.
EDIT:
By setting both LD_DATA_COUNT and LD_ADDR_COUNT to 16383, I was able to remove the zero padding for all sizes both below and above 1024 bytes, but I still don't see packets that are exactly 1024 bytes long coming through. The only thing I can find that is 1024 is the bulk endpoints themselves, but I am not sure whether this would have any effect on GPIF?
Best,
Devon
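As a rough cross-check of the numbers in this post, the sketch below relates the reported 32768-byte count and the 16383 counter value. It rests on three assumptions that the thread does not state: that each bulk DMA buffer is sized as max packet size x CY_FX_EP_BURST_LENGTH x CY_FX_DMA_SIZE_MULTIPLIER, that the GPIF data bus is 16 bits wide, and that a counter limit of N means N + 1 bus words. Both function names are hypothetical.

```c
#include <stdint.h>

/* Assumed buffer sizing: max packet size * burst length * size multiplier. */
uint32_t dma_buffer_bytes(uint32_t max_packet, uint32_t burst_len, uint32_t multiplier)
{
    return max_packet * burst_len * multiplier;
}

/* Assumed counter limit for one full buffer: number of bus words minus one. */
uint32_t counter_limit(uint32_t buffer_bytes, uint32_t bus_width_bits)
{
    uint32_t bytes_per_word = bus_width_bits / 8u;
    uint32_t words;
    if (bytes_per_word == 0u) {
        return 0u;  /* invalid bus width */
    }
    words = buffer_bytes / bytes_per_word;
    return (words > 0u) ? (words - 1u) : 0u;
}
```

With the posted defines, `dma_buffer_bytes(1024, 16, 2)` is 32768, and `counter_limit(32768, 16)` is 16383, which would be consistent with the counters needing to match the full DMA buffer size. The formulas in AN65974 remain the authority here.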
Hello,
Glad to hear that the zero padding problem is resolved when the data counters on the master and slave sides are the same.
For the 1024 bytes issue, please refer to this KBA https://community.infineon.com/t5/Knowledge-Base-Articles/Data-sent-from-Host-over-USB-is-not-Commit...
Please let me know if you have any queries on this.
Rashi
Hi,
Thanks for linking that article; it seems to be exactly the issue, affecting all sizes X where (X % 1024) == 0. Is there no way to force a ZLP from the GPIF state machine, so that I do not have to modify the drivers I am using on either side?
I would think there is a solution where the DMA callback or the GPIF state machine is able to recognize that data was sent, at which point it can handle sending a ZLP.
Hello,
Please see my comments below:
Is there not a way to force a ZLP from the GPIF state machine so that I do not have to modify the drivers I am using on either side
>> As the DMA channel runs from UIB to PIB, the ZLP should come from the USB side and not from PIB/GPIF. The data is committed from the USB side, not from the GPIF side.
The host application needs to be modified to send a ZLP if (X % 1024) == 0.
Rashi
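The host-side rule from this answer can be expressed as a small check. This is a sketch only: `needs_zlp` is a hypothetical helper, and how the zero-length packet is actually sent depends on the host driver or library in use, which the thread does not specify.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when a transfer of transfer_len bytes is an exact, non-zero
 * multiple of the endpoint's max packet size, so it would need a trailing
 * zero-length packet to mark its end. */
bool needs_zlp(size_t transfer_len, size_t max_packet_size)
{
    return max_packet_size != 0
        && transfer_len != 0
        && (transfer_len % max_packet_size) == 0;
}
```

For the sizes discussed in this thread, `needs_zlp(1024, 1024)` is true, while `needs_zlp(996, 1024)` and `needs_zlp(1040, 1024)` are false because those transfers already end in a short packet.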