Minimum achievable transfer latency from Windows to FX3


MocciJ
Level 1

Hello,

We are upgrading the communication interface of our peripheral from USB2 to USB3 using an FX3 development kit (CYUSB3KIT-003).

This peripheral will be connected to a Windows computer.

In our application, the peripheral must receive packets (smaller than 512 bytes) at the lowest possible latency (< 200 us).

Is this latency figure achievable with this configuration?

1 Solution

Hello,

I have performed two tests to determine the latency. You can repeat the same tests on your side, as both were done on the CYUSB3KIT-003. The details of the tests are given below:

Test 1:

Note: For this test, use the firmware and test application attached in test1.zip.

In this test, the device enumerates with one Bulk OUT and one Bulk IN endpoint. After the device has enumerated, run the host application. The host sends two Bulk OUT packets of 1024 bytes each to the device, one after the other. The USB trace can be captured with Wireshark. The following snapshot shows the timestamps of the two transfers:

JayakrishnaT_76_0-1637728709962.png

From this, we can see that the latency (the difference between the two timestamps) is less than 200 us.

This test used XferData, which performs synchronous (i.e., blocking) I/O operations. You can find more information about XferData in CyAPI.doc, located in the following directory of the FX3 SDK:

C:\Program Files (x86)\Cypress\EZ-USB FX3 SDK\1.3\doc\SuiteUSB
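As a rough illustration of how a blocking call like this could be timed from the host side, here is a minimal sketch. It uses only the C++ standard library; `time_blocking_call_us` and the `blocking_send` callable are hypothetical names introduced for this example, not part of CyAPI. In a real host application the callable would presumably wrap the endpoint's XferData call.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Returns the wall-clock duration, in microseconds, of one blocking call.
// `blocking_send` is a hypothetical stand-in for whatever performs the
// synchronous transfer in the host application; it returns true on success.
// The result of the call is reported through `ok`.
double time_blocking_call_us(const std::function<bool()>& blocking_send, bool& ok) {
    assert(blocking_send);  // an empty std::function cannot be timed

    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto start = clock::now();
    ok = blocking_send();
    const auto stop = clock::now();

    return std::chrono::duration<double, std::micro>(stop - start).count();
}
```

Note that a figure obtained this way covers the whole call as seen by the application, including driver overhead, so it is not directly comparable to the difference between two Wireshark timestamps.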

Test 2:

Note: For test 2, use the firmware attached in test2.zip. The host application used for this test is the C++ Streamer application that comes with the FX3 SDK. The Streamer application performs asynchronous I/O operations.

After programming the device with the firmware in test2.zip, run the Streamer application. A snapshot of the Streamer application settings used for the test is given below:

JayakrishnaT_76_4-1637730000262.png

The above settings mean that 32 packets of 1024 bytes each are queued. Click the Start button to start the transfers. The snapshot of the Wireshark trace for this test is given below:

JayakrishnaT_76_3-1637729981926.png

 

The highlighted entry is the start of a new queue. You can see that the transfers within a queue complete very quickly. However, as soon as one queue of transfers has completed, there is a delay before the next packet is sent.
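The same pattern can be picked out of an exported capture programmatically. The sketch below is only an illustration: it assumes the capture timestamps are available as seconds in a vector, and the function names and the threshold are hypothetical choices for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Gaps between consecutive capture timestamps, converted to microseconds.
// Timestamps are assumed to be in seconds and in non-decreasing order.
std::vector<double> inter_transfer_gaps_us(const std::vector<double>& timestamps_s) {
    std::vector<double> gaps;
    for (std::size_t i = 1; i < timestamps_s.size(); ++i) {
        assert(timestamps_s[i] >= timestamps_s[i - 1]);
        gaps.push_back((timestamps_s[i] - timestamps_s[i - 1]) * 1e6);
    }
    return gaps;
}

// Indices of transfers that arrive more than `threshold_us` after the
// previous one. Under the interpretation above, these mark the first
// transfer of a new queue.
std::vector<std::size_t> queue_boundaries(const std::vector<double>& timestamps_s,
                                          double threshold_us) {
    std::vector<std::size_t> boundaries;
    const std::vector<double> gaps = inter_transfer_gaps_us(timestamps_s);
    for (std::size_t i = 0; i < gaps.size(); ++i) {
        if (gaps[i] > threshold_us) {
            boundaries.push_back(i + 1);  // gap i precedes transfer i + 1
        }
    }
    return boundaries;
}
```

Calling `queue_boundaries` with a threshold equal to the latency budget (for example 200 us) lists the transfers that missed it, which separates the fast transfers inside a queue from the slower first transfer of the next queue.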

So, these tests show that the latency is less than 200 us, as stated in your requirement.

Best Regards,
Jayakrishna

6 Replies
JayakrishnaT_76
Moderator

Hello,

Please let us know which class the device will bind to. Is it a vendor-specific device?

As you may know, the CYUSB3KIT-003 uses the CYUSB3014 (FX3) controller, which is a USB SuperSpeed peripheral controller. Although the theoretical bandwidth of USB 3.0 is 5 Gbps, in practice it is around 3 Gbps. The FX3 can also operate at this speed.

Best Regards,
Jayakrishna
MocciJ
Level 1

Dear Jayakrishna,

I understand that the total throughput will be around 3Gbps after overheads. However, we are not directly interested in throughput but in latency.
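To put the distinction in numbers, here is a small worked calculation. It is a simplified sketch that considers payload bits only at the practical rate quoted above and ignores protocol, driver, and scheduling overhead; `payload_time_us` is a name made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Time, in microseconds, needed to move `bytes` of payload at `gbps`
// gigabits per second. Payload bits only; all overhead is ignored.
constexpr double payload_time_us(std::size_t bytes, double gbps) {
    // bits / (gbps * 1e9 bit/s) seconds, times 1e6 to get microseconds
    return static_cast<double>(bytes) * 8.0 / (gbps * 1e3);
}
```

For a 512-byte packet at 3 Gbps this gives roughly 1.4 us, far below the 200 us target, so the time on the wire is not the limiting factor; the scheduling and software path are.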

In particular, we wish to know whether the Windows Cypress driver can send packets with latencies below 1 ms. According to the USB3 documentation, each frame is divided into 8 microframes of 125 us (at least for isochronous transfers). It is unclear whether bulk and interrupt transfers can also deliver data with such low latency. Is there a whitepaper with experimental findings on this?

In this context, latency means the time period from the driver command (C++) to the successful command reception by the FX3 controller.

We are unsure about the device class: the product is a multi-channel voltage amplifier.


Hello,

We could not fully understand the following statement in your previous response:

"In this context, latency means the time period from the driver command (C++) to the successful command reception by the FX3 controller."

Please elaborate on how you measure latency so that I can check whether we have the data you requested.

Also, please let me know how the data is sent from the host to the FX3 controller. Are you using a host application supplied with the FX3 SDK, or your own host application written with CyAPI.lib? If you are using your own host application based on CyAPI.lib, please share a snapshot of the data transfer portion of the source code so that we can understand it better.

Best Regards,
Jayakrishna

A USB3 host controller is capable of transferring packets with a time granularity of 125 us (microframes). This is the lowest latency defined by the standard, regardless of the transfer method employed, whether synchronous or asynchronous.

However, the Windows built-in USB stack (WinUSB) begins transfers at the first available full frame (scheduled with a period of 1 ms) instead of at the first available microframe (scheduled every 125 us). If multiple transfers are issued, they are scheduled into the next frame (where each transfer occupies a microframe).
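The difference between the two granularities can be put into numbers with a simplified model. The sketch below only illustrates the scheduling behaviour as described in this post: it assumes a transfer starts at the next frame or microframe boundary and ignores all driver and controller overhead. The names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kFrameUs = 1000;      // 1 ms frame
constexpr std::uint64_t kMicroframeUs = 125;  // 8 microframes per frame

// Time, in microseconds, from a request issued at `t_us` until the next
// boundary of a schedule with period `period_us`. A request issued exactly
// on a boundary waits 0 us.
constexpr std::uint64_t wait_until_next_boundary_us(std::uint64_t t_us,
                                                    std::uint64_t period_us) {
    return (period_us - t_us % period_us) % period_us;
}

// A request issued 130 us into a frame waits 870 us for the next frame,
// but only 120 us for the next microframe (at 250 us).
static_assert(wait_until_next_boundary_us(130, kFrameUs) == 870, "frame wait");
static_assert(wait_until_next_boundary_us(130, kMicroframeUs) == 120, "microframe wait");
```

In this model the worst-case wait is just under 1000 us with frame scheduling and just under 125 us with microframe scheduling, which is why the granularity matters for a 200 us target.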

Is it possible to reach the 125us USB granularity with CyAPI? We haven't used this API yet.


Thank you for those examples.

We obtained more or less the same results that you got.
