I am posting here because I was working lately on a project which involves fast USB serial reception of huge amount of data (~150kB) on maple mini (blue pill).
@primateio found out, and I acknowledged, that because of some issues in the current version of the USB CDCACM module, some bytes were lost.
See related forum topic here, and on github.
Now I developed a version which overcomes the found issues and thus enables Rx speeds up to ~600kBps, strongly influenced by the involved USB host (PC) and its daily mood (at least on my win10 machine)… Nevertheless, it seems that I don’t have lost bytes anymore.
As a positive side-effect, the size of the Rx buffer was reduced (256 bytes now, previously 512).
Additionally, I also reworked the Tx side as well, because the current version does not allow to send more than 64 bytes at once. It waits till these are completely sent, and only after that new bytes are accepted to be sent, again. This is rather inefficient and thus slow.
Instead, I implemented a ring-buffered transmission, similar to Rx part.
The size of the Tx buffer is currently set to 256 bytes (the saved bytes form the Rx part are used here, so overall no RAM is saved). This means, the Tx is only blocking if you send more than 256 bytes very quickly.
I tested the new version (attached), no problems found so far.
But still, I would like to ask some of you, who is interested in it, to test the Rx and Tx part, as good as you can.
If you think that it is worth, we could maybe even commit it to master branch.
Any suggestion/comment is welcome.
I have further reworked the TX part, too. Actually I can reach a speed of ~300kBps for Tx, independent on the USB Tx buffer size.
Which means that the limitation is in user code.
The changes can be found here.
Can you send a PR ?
I will test later
What baud rate do you suggest for maximum TX output kB/sec ?
I will post the test code in the evening.
Please note that the Tx speed is strongly limited by the user code. In my simple test I just push a constant string to USB serial over and over again.
If you try to load data, let’s say, from SD card then the USB TX speed quickly falls below ~60kBps, limited by the reading speed from card.
I understand that SD card will limit the performance. In my case the problem seems to be that with the TX operation will block (for too long) which seem to affect other part of the program.
By the way: I had to do one small modification to get patch compiled:
usb_serial.h
line 68: size_t write(const uint8* uint32);
By the way: I had to do one small modification to get patch compiled:
usb_serial.h
line 68: size_t write(const uint8* uint32);
This is my printing code. I wonder if the performance will be better if I create a temp buffer first filling with the data and send all that in one call to Serial.println ? I think yes, because each byte is sent in a single USB packet. I’m using this same code in ESP8266 and it does not seem to block the same way. Ok, it is a different MCU but still…
void printMsg() {
Serial.print(micros());
Serial.print(",ID: ");
Serial.print(id, HEX);
Serial.print(",Data: ");
for(int i = 0; i<len; i++) {
if(rxBuf[i] < 0x10) {
Serial.print("0");
}
Serial.print(rxBuf[i], HEX);
Serial.print(" ");
}
Serial.println();
}
While the USB is sending the bytes from its own buffer, the main routine can handle other tasks.
I had best results using a temporary buffer of 1024 bytes to fill with data read from SD card. Then I just called “Serial.write(buffer, length);”. During the time USB was sending, I was reading the next 2 blocks, each 512 bytes, of data from the card. The USB TX buffer size is kept at the default value of 256.
But anyway, please keep in mind that the USB host has a strong influence on the TX speed. The faster the host application, the higher the USB TX speed.
EDIT
I am missing the millis() at the end of data transmission in your code.
EDIT2
Regarding the uC processing speed, I newly learned that the F103 family core runs with 2 additional wait states. This means, the uC actually reads code from flash and executes it with 24 MHz, not with 72, as one would expect. This is much less then the 80 MHz of the ESP core.
Unfortunatelly i wont manage to post today the examples due to lack of time.
35511,00,ID: 000,Data: F5 00 CF C0 FC, 35511
35511,15,ID: 000,Data: B0 15 00 1F D0 07 00 60, 35512
35512,20,ID: 000,Data: AF 02 80 00 08 30 0A 35, 35512
35513,F0,ID: 000,Data: 01 0F FC 00 00 FF F1, 35513
35513,A0,ID: 000,Data: 2F 0A 00 00 7D 00 F0 02, 35514
FYi. I have not merged the PR for this, because it sounds like its still under development.
I want to wait for it to be stable and cleanly merge-able before I action the PR.
I then started to work on SPI and committed some changes to my repo which landed also in the USB serial PR, I don’t know how.
Unfortunately I don’t know how to remove those late commits from the USB serial PR, so please remove it if you can and only take those relevant PRs.
The only way to distinguish different work into different PR is to have multiple branches.
So, Roger won’t have other choices to do some “cherry picking” …
Ok.
I am not sure if there is a Git way to only take part of a PR, but i can do it manually.
I will do it later today ( its Thursday here ![]()
Do you guys handle it the same way?
I don’t commit changes to different bugs at the same time, but I know that PR contains multiple commits.
In the Arduino STM32 repo I do commits directly to the master branch some times, it depends on the complexity and scope of the commit
e.g. if its just a new variant, its fairly self contained, and just a new folder + updated boards.txt
For PR’s I merge simple ones straight into the Master branch
But for more complex ones, I create a branch for the PR (usually based on the username) and pull to that branch, then test and then merge that branch back into Master
BTW. We have a gitter “lobby” https://gitter.im/stm32duino/Lobby, which is sometimes handy for discussing this stuff, but whether many people are online at the same time, varies



