reworked USB CDCACM (USB serial), both Rx and Tx

stevestrong
Wed Oct 19, 2016 10:14 pm
Hi everyone,
I am posting here because I was working lately on a project which involves fast USB serial reception of huge amount of data (~150kB) on maple mini (blue pill).
@primateio found out, and I acknowledged, that because of some issues in the current version of the USB CDCACM module, some bytes were lost.
See related forum topic here, and on github.

Now I developed a version which overcomes the found issues and thus enables Rx speeds up to ~600kBps, strongly influenced by the involved USB host (PC) and its daily mood (at least on my win10 machine)… Nevertheless, it seems that I don’t have lost bytes anymore.
As a positive side-effect, the size of the Rx buffer was reduced (256 bytes now, previously 512).

Additionally, I also reworked the Tx side as well, because the current version does not allow to send more than 64 bytes at once. It waits till these are completely sent, and only after that new bytes are accepted to be sent, again. This is rather inefficient and thus slow.
Instead, I implemented a ring-buffered transmission, similar to Rx part.
The size of the Tx buffer is currently set to 256 bytes (the saved bytes form the Rx part are used here, so overall no RAM is saved). This means, the Tx is only blocking if you send more than 256 bytes very quickly.

I tested the new version (attached), no problems found so far.
But still, I would like to ask some of you, who is interested in it, to test the Rx and Tx part, as good as you can.

If you think that it is worth, we could maybe even commit it to master branch.

Any suggestion/comment is welcome.


stevestrong
Fri Oct 21, 2016 8:42 pm
UPDATE
I have further reworked the TX part, too. Actually I can reach a speed of ~300kBps for Tx, independent on the USB Tx buffer size.
Which means that the limitation is in user code.
The changes can be found here.

RogerClark
Fri Oct 21, 2016 8:47 pm
Sounds like this is ready for the big time

Can you send a PR ?


stevestrong
Fri Oct 21, 2016 9:09 pm
Hi Roger, PR is sent.

RogerClark
Fri Oct 21, 2016 11:48 pm
Thanks

I will test later


michael_l
Tue Nov 08, 2016 10:20 am
@stevestrong: would you have any test sketch available to test the performance? I’d like to try this out and compare numbers with current master and your patch. Thanks.

What baud rate do you suggest for maximum TX output kB/sec ?


stevestrong
Tue Nov 08, 2016 10:32 am
To set the baud rate is actually superfluous since the USB serial will push the bytes as fast as it can, independent on the user setting.

I will post the test code in the evening.

Please note that the Tx speed is strongly limited by the user code. In my simple test I just push a constant string to USB serial over and over again.
If you try to load data, let’s say, from SD card then the USB TX speed quickly falls below ~60kBps, limited by the reading speed from card.


michael_l
Tue Nov 08, 2016 10:50 am
Ok thanks!

I understand that SD card will limit the performance. In my case the problem seems to be that with the TX operation will block (for too long) which seem to affect other part of the program.

By the way: I had to do one small modification to get patch compiled:

usb_serial.h

line 68: size_t write(const uint8* uint32);


stevestrong
Tue Nov 08, 2016 12:04 pm
michael_l wrote:
By the way: I had to do one small modification to get patch compiled:
usb_serial.h
line 68: size_t write(const uint8* uint32);

michael_l
Tue Nov 08, 2016 12:12 pm
Thanks, I just stared at the commit but didn’t realize you’ve fixed that already. :)

This is my printing code. I wonder if the performance will be better if I create a temp buffer first filling with the data and send all that in one call to Serial.println ? I think yes, because each byte is sent in a single USB packet. I’m using this same code in ESP8266 and it does not seem to block the same way. Ok, it is a different MCU but still…

void printMsg() {
Serial.print(micros());
Serial.print(",ID: ");
Serial.print(id, HEX);
Serial.print(",Data: ");

for(int i = 0; i<len; i++) {
if(rxBuf[i] < 0x10) {
Serial.print("0");
}
Serial.print(rxBuf[i], HEX);
Serial.print(" ");
}
Serial.println();
}


stevestrong
Tue Nov 08, 2016 12:28 pm
In general, since now the TX is also buffered, it is better to send more bytes in one shot.
While the USB is sending the bytes from its own buffer, the main routine can handle other tasks.

I had best results using a temporary buffer of 1024 bytes to fill with data read from SD card. Then I just called “Serial.write(buffer, length);”. During the time USB was sending, I was reading the next 2 blocks, each 512 bytes, of data from the card. The USB TX buffer size is kept at the default value of 256.
But anyway, please keep in mind that the USB host has a strong influence on the TX speed. The faster the host application, the higher the USB TX speed.

EDIT
I am missing the millis() at the end of data transmission in your code.

EDIT2
Regarding the uC processing speed, I newly learned that the F103 family core runs with 2 additional wait states. This means, the uC actually reads code from flash and executes it with 24 MHz, not with 72, as one would expect. This is much less then the 80 MHz of the ESP core.


michael_l
Tue Nov 08, 2016 4:10 pm
First quick tests with your patch and the speed difference is “amazing” ! Thanks it really makes a difference.

stevestrong
Tue Nov 08, 2016 8:42 pm
I’m happy that someone can make any use of it.
Unfortunatelly i wont manage to post today the examples due to lack of time.

michael_l
Wed Nov 09, 2016 10:22 am
Here’s short snippet of my log. First printed is millis() and last value is millis(). As we can see it takes usually under 1ms or sometimes over (the last line) to print these to serial.

35511,00,ID: 000,Data: F5 00 CF C0 FC, 35511
35511,15,ID: 000,Data: B0 15 00 1F D0 07 00 60, 35512
35512,20,ID: 000,Data: AF 02 80 00 08 30 0A 35, 35512
35513,F0,ID: 000,Data: 01 0F FC 00 00 FF F1, 35513
35513,A0,ID: 000,Data: 2F 0A 00 00 7D 00 F0 02, 35514


stevestrong
Wed Nov 09, 2016 10:29 am
For so few data it is just the copying to TX buffer which takes time, the rest is processed in background.

RogerClark
Wed Nov 09, 2016 7:42 pm
Guys

FYi. I have not merged the PR for this, because it sounds like its still under development.

I want to wait for it to be stable and cleanly merge-able before I action the PR.


stevestrong
Wed Nov 09, 2016 7:47 pm
Roger, the part with USB serial is done, no more work on it.
I then started to work on SPI and committed some changes to my repo which landed also in the USB serial PR, I don’t know how.
Unfortunately I don’t know how to remove those late commits from the USB serial PR, so please remove it if you can and only take those relevant PRs.

martinayotte
Wed Nov 09, 2016 7:54 pm
Any Git commits from the same branch is always included in a pending PR.
The only way to distinguish different work into different PR is to have multiple branches.
So, Roger won’t have other choices to do some “cherry picking” …

RogerClark
Wed Nov 09, 2016 7:55 pm
@stevestrong

Ok.

I am not sure if there is a Git way to only take part of a PR, but i can do it manually.

I will do it later today ( its Thursday here ;-)


RogerClark
Wed Nov 09, 2016 7:56 pm
Thanks martin, thats what I thought

stevestrong
Wed Nov 09, 2016 8:00 pm
Thanks Martin for the info, it seems that I have to make another branch for my own development and put the relevant changes to the master branch only when a previous PR has been committed.
Do you guys handle it the same way?

RogerClark
Wed Nov 09, 2016 8:06 pm
I put new dev into separate branches and then merge back into master when they are complete.

I don’t commit changes to different bugs at the same time, but I know that PR contains multiple commits.


stevestrong
Wed Nov 09, 2016 8:11 pm
OK, i got the point, from now on only clean PRs from me :)

RogerClark
Wed Nov 09, 2016 8:57 pm
@stevestrong

In the Arduino STM32 repo I do commits directly to the master branch some times, it depends on the complexity and scope of the commit

e.g. if its just a new variant, its fairly self contained, and just a new folder + updated boards.txt

For PR’s I merge simple ones straight into the Master branch

But for more complex ones, I create a branch for the PR (usually based on the username) and pull to that branch, then test and then merge that branch back into Master

BTW. We have a gitter “lobby” https://gitter.im/stm32duino/Lobby, which is sometimes handy for discussing this stuff, but whether many people are online at the same time, varies


michael_l
Sat May 27, 2017 12:04 pm
So has this been merged to master already ? EDIT: looks like it is

Leave a Reply

Your email address will not be published. Required fields are marked *