Low latency loopback application

RuPa_4353591

We're developing a low-bandwidth low-latency audio application and are investigating the FX3 as a bridge between an FPGA and a host CPU.

Our latency requirements are very strict, which makes the USB 2 uSOF interval of 125usec unacceptable.

I have investigated using CCyIsocEndPoint with the USBIsochLoopAuto firmware to measure loopback latencies to and from the host (this had a few problems; see the thread "Multiple uses of XferData for CCyIsocEndPoint").

     #1: What is the expected latency of a loopback over the isoc endpoints? E.g. are there sync intervals similar to the bulk SOF?

          (In our first setup we measure write/read latencies of 100-200usec on a real-time thread. This is lower than the uSOF interval, but the variability suggests some synchronization latency.)

     #2: Is it correct to use isoc endpoints with superspeed to achieve lowest possible latency with USB3?

     #3: The documentation mentions that the buffer length should be a multiple of 8 times the MaxPktSize. Is this still applicable?

          The documentation example for isoc transfer (page 40) uses a buffer of 4096 (= 4*MaxPktSize), and the USBIsochLoopAuto firmware uses a DMA buffer of 1024 (= 1 * MaxPktSize). Also, no errors are thrown using short buffers.

     #4: Are there firmware modes I should be using? E.g. does the DMA impose additional latency, and are there requirements for the DMA buffer length?

Any answers and suggestions are much appreciated.

//Rune

SrinathS_16

Hello Rune,

1. Calling the XferData() API on isoc endpoints is not the right way to test latency, because each XferData() call schedules a new request on the bus. Isoc requests must be scheduled relative to a frame interval, and the frame number at which each request is scheduled must be known, so there may be delays between successive transfers when XferData() is used. The alternative is asynchronous data transfer using the BeginDataXfer(), WaitForXfer() and FinishDataXfer() APIs. In this case the transfer requests are continuous, that is, a new request is not scheduled for every transfer, so frame synchronization is taken care of. Please refer to the Cypress-provided Streamer example, which implements this asynchronous data transfer.

2. Yes, isoc endpoints provide guaranteed bandwidth and latency. Note, however, that the data producer (device or host) must be capable of producing data at the rate specified in the isoc endpoint descriptor.

3. The buffer length mentioned in the documentation is the length of the data buffer on the host side, not the DMA buffer size. The statement is based on the fact that the data for 8 USB intervals (1 ms) must be ready. A data buffer smaller than 8 times MaxPktSize is treated as a short packet, but the transfer does not report an error.

4. An AUTO DMA channel does not cause any additional latency, since it only involves passing the buffer handle to the consumer socket, with no CPU intervention.
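
Editor's sketch of the queued asynchronous pattern described in point 1. It does not use CyAPI: `MockIsocEndpoint`, `begin()`, `finish()` and `streamLoop()` are hypothetical stand-ins so the control flow can be compiled and checked without hardware. In a real application `begin()` would correspond to BeginDataXfer(), and `finish()` to WaitForXfer() followed by FinishDataXfer(); their actual signatures should be taken from the CyAPI documentation or the Streamer example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for an isoc endpoint; NOT the CyAPI class.
// It only tracks request ids so the queueing pattern can be checked offline.
struct MockIsocEndpoint {
    std::deque<int> pending;  // ids of requests currently queued
    int nextId = 0;
    std::size_t minPending = static_cast<std::size_t>(-1);  // lowest queue fill seen

    // Queue a request and return its id (plays the role of BeginDataXfer).
    int begin(std::vector<unsigned char>& /*buf*/) {
        pending.push_back(nextId);
        return nextId++;
    }

    // Complete the oldest request (plays the role of WaitForXfer + FinishDataXfer).
    bool finish(int id) {
        if (pending.empty() || pending.front() != id) return false;
        if (pending.size() < minPending) minPending = pending.size();
        pending.pop_front();
        return true;
    }
};

// Keep `queueDepth` requests outstanding until `totalXfers` have completed.
// Returns the number of completed transfers.
int streamLoop(MockIsocEndpoint& ep, std::size_t queueDepth,
               std::size_t bufLen, int totalXfers) {
    assert(queueDepth > 0);
    std::vector<std::vector<unsigned char>> bufs(
        queueDepth, std::vector<unsigned char>(bufLen));
    std::vector<int> ids(queueDepth);

    // Prime the queue before waiting on anything.
    for (std::size_t i = 0; i < queueDepth; ++i) ids[i] = ep.begin(bufs[i]);

    int done = 0;
    std::size_t slot = 0;
    while (done < totalXfers) {
        if (!ep.finish(ids[slot])) break;   // wait for the oldest request
        ++done;
        ids[slot] = ep.begin(bufs[slot]);   // re-queue the same buffer right away
        slot = (slot + 1) % queueDepth;
    }
    return done;
}
```

The point of the pattern is that the queue is primed once and each completed buffer is re-submitted immediately, so the host always has requests pending instead of issuing one blocking call per transfer.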

Best regards,

Srinath S

Hi Srinath.

Thanks for the detailed answers.

Regarding #2: Can you specify or point to resources defining the guaranteed latency? Specifically, is it bounded by the 125usec uSOF intervals or is that not applicable to Isoc endpoints?

Regarding #3: Does the host buffer size affect the latency as long as it is lower than the MaxPktSize? E.g. will a short host buffer improve latency, or is the transfer bound to the USB interval (1/8 ms)? Can I ignore the statement for this application?

Best regards,

Rune


Hello Rune,

I have attached a USB trace captured during an ISOC OUT transfer using the Cypress C++ Streamer example and the IsoSourceSink firmware. I have also attached an image from the same capture showing the timestamps of the ISOC transfers.

- The trace shows that each subsequent transfer is scheduled within one service interval (125us). It is essential that the buffers on the host are filled and ready to be transmitted before the service interval. This is why the documentation says that a buffer of 8 times the MaxPktSize should be used: it prepares the data for 8 service intervals (1ms).

- As I mentioned in my previous comment, a buffer smaller than 8 times the MaxPktSize will add latency, because the buffers are not ready during the service interval.
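
Editor's sketch of the sizing arithmetic above. The helper names are hypothetical, and the code assumes one packet of MaxPktSize bytes per 125us service interval (no burst or mult), which the thread does not state explicitly.

```cpp
#include <cassert>
#include <cstdint>

// Service interval quoted in the thread, in microseconds.
constexpr std::uint32_t kServiceIntervalUs = 125;
// Packets per request recommended in the thread: 8 service intervals = 1 ms.
constexpr std::uint32_t kPacketsPerMs = 8;

// Round a requested packet count up to the next multiple of 8.
constexpr std::uint32_t roundUpPackets(std::uint32_t packets) {
    return ((packets + kPacketsPerMs - 1) / kPacketsPerMs) * kPacketsPerMs;
}

// Host buffer length in bytes for one isoc request.
inline std::uint32_t hostBufferLen(std::uint32_t maxPktSize, std::uint32_t packets) {
    assert(maxPktSize > 0);
    return maxPktSize * roundUpPackets(packets);
}

// Bus time covered by one request, in microseconds
// (assumes one packet per service interval).
constexpr std::uint32_t coveredTimeUs(std::uint32_t packets) {
    return roundUpPackets(packets) * kServiceIntervalUs;
}
```

With MaxPktSize = 1024, `hostBufferLen(1024, 8)` gives 8192 bytes and `coveredTimeUs(8)` gives 1000us, matching the 1ms figure; the 4096-byte buffer from the documentation example would be rounded up to 8192 under this rule.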

Best regards,

Srinath S

Thank you.

This was exactly the information I was looking for.

//Rune
