SD / Memory bottleneck, any ideas?

ipsilondev
Wed Aug 16, 2017 9:06 pm
DISCLAIMER: i’m trying to use the platform to build a commercial product.

Hi everyone :)

first of all, sorry for so many posts, newbie here, asking for advice :)

So I have played with my Blue Pill a lot, trying to squeeze as much out of it as I could. My end goal is to build a 16-bit game console (or something like it), but I’m running into serious limitations, and even after 10 years of coding (web, games, image processing, etc., in more than 10 languages, including C and C++) I don’t see a way out (not trying to show off here, just giving a bit of context on the kind of knowledge I have :D ).

My main problem is that I can’t load the resources fast enough. I need at least 24 fps at 240×320 to have something playable. In my test I used RGB images, binary encoded, little endian, pushing the pixel colors directly to the LCD. The bottleneck in that setup is the SD transfer rate: at best I got 400 KB/s on SPI2, with a 10 KB buffer and SdFatEx (I had only 15 KB of free RAM, so I was using almost all of it). Anyway, I only see posts here of people getting 800 KB/s at most, which doesn’t really make a difference for the kind of graphics I need to load (to load directly from SD, I need at least 5 MB/s).

Keep in mind that the game background alone has to be redrawn on every frame (without counting the characters or enemies, which *maybe* I can keep in memory). And in theory, I could only read 700 bytes per frame at most to achieve 30 fps.
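
For reference, the bandwidth needed for full-frame redraws can be worked out with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and the bytes-per-pixel values (2 for RGB565, 3 for 24-bit RGB) are assumptions, since the post does not state the pixel format.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for one full frame at the given resolution and colour depth.
// bytesPerPixel is an assumption: 2 for RGB565, 3 for 24-bit RGB.
inline std::uint64_t frameBytes(std::uint32_t width, std::uint32_t height,
                                std::uint32_t bytesPerPixel) {
    assert(bytesPerPixel > 0);
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}

// Sustained bytes per second needed to redraw the whole frame `fps` times
// per second, with no compression and no partial updates.
inline std::uint64_t requiredBytesPerSecond(std::uint32_t width,
                                            std::uint32_t height,
                                            std::uint32_t bytesPerPixel,
                                            std::uint32_t fps) {
    return frameBytes(width, height, bytesPerPixel) * fps;
}
```

With 3 bytes per pixel, requiredBytesPerSecond(240, 320, 3, 24) comes to 5,529,600 bytes/s, which is in line with the 5 MB/s estimate above. With 2 bytes per pixel the same call gives 3,686,400 bytes/s.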

Using a compressed image format isn’t an option either for such a small CPU.

Of course, I could use some tricks, like building the main colors and parts of the background with primitives, or even doing a conversion like image -> SVG -> primitives -> binary instructions for the primitives -> write those instructions to the LCD, plus adding cloned sprites from RAM, but the amount of work this would take is not really worth it.

I have been looking at the SPI SRAM option (maybe I could use the RC variant with 3 SPI ports), but the biggest chips are only 128 KB, and I need at least 512 KB to have something to work with.

I don’t really have any other ideas to improve this process, which is why I’m posting here. If someone with more experience in this could give me a hint or suggestion, it would be very welcome :D

Thanks :)


victor_pv
Wed Aug 16, 2017 9:41 pm
We can get several MB/s of SD card transfer. It depends on the card class, among other factors: which SPI port you use, the speed setting on the port, etc.
There are several threads talking about sdcard transfer speeds and many tests.
This is the first post and first test I did when I added DMA capabilities to the sdfat library, just consulted it to check the speeds, and was getting close to 2MB/s back then with a 4KB buffer:
http://www.stm32duino.com/viewtopic.php … +speed#p27

With a 10KB buffer, as long as the card is fast, you should get at least the same speed.
If your images are formatted in a way that can be written directly to the display, I recommend using double buffering and DMA as much as possible.
While one DMA channel loads data from the SD card into one buffer, another channel can be writing from the other buffer to the display. You optimize the transfer speed of the peripherals, and at the same time the CPU can still be doing other things, like planning what to load next from the SD card, doing calculations, reading inputs, etc.
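
The buffer-swapping order described above can be sketched on a PC like this. It is only an illustration under stated assumptions: FakeSd, FakeLcd and streamFrame are hypothetical names, not part of SdFat or any core, and the two transfers run one after the other here, whereas on the board they would be started on two DMA channels and overlap in time.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the SD card: hands out its data in chunks.
struct FakeSd {
    std::vector<std::uint8_t> data;
    std::size_t pos = 0;
    // Copies up to `len` bytes into `dst` and returns how many were copied.
    std::size_t read(std::uint8_t* dst, std::size_t len) {
        assert(pos <= data.size());
        const std::size_t n = std::min(len, data.size() - pos);
        std::copy(data.begin() + pos, data.begin() + pos + n, dst);
        pos += n;
        return n;
    }
};

// Hypothetical stand-in for the display: records every byte written to it.
struct FakeLcd {
    std::vector<std::uint8_t> received;
    void write(const std::uint8_t* src, std::size_t len) {
        received.insert(received.end(), src, src + len);
    }
};

// Streams everything from `sd` to `lcd` through two buffers of N bytes each.
// Returns the total number of bytes sent to the display.
template <std::size_t N>
std::size_t streamFrame(FakeSd& sd, FakeLcd& lcd) {
    static_assert(N > 0, "buffer size must be non-zero");
    std::array<std::array<std::uint8_t, N>, 2> buf{};
    std::size_t total = 0;
    int fill = 0;
    // Prime the first buffer before the loop starts.
    std::size_t pending = sd.read(buf[fill].data(), N);
    while (pending > 0) {
        const int drain = fill;  // the buffer that was just filled
        fill ^= 1;               // switch filling to the other buffer
        // On hardware these two calls would be concurrent DMA transfers.
        const std::size_t next = sd.read(buf[fill].data(), N);
        lcd.write(buf[drain].data(), pending);
        total += pending;
        pending = next;
    }
    return total;
}
```

The point of the pattern is that the display never waits for a full image to be read: each buffer only needs to hold one chunk, so two small buffers fit in the Blue Pill's limited RAM.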

What core are you using btw?

It may also be time to move to a more powerful MCU such as an F4, though. They run faster and have a hardware single-precision (SP) FPU, which may help you with graphics calculations if you use single precision.


RogerClark
Wed Aug 16, 2017 10:32 pm
I thought SDIO was the best way to improve performance, but it’s not available on the Blue Pill, and libmaple does not have a built-in library for it.
Though I thought there was at least one thread on SDIO

victor_pv
Wed Aug 16, 2017 10:51 pm
[RogerClark – Wed Aug 16, 2017 10:32 pm] –
I thought SDIO was the best way to improve performance, but it’s not available on the Blue Pill, and libmaple does not have a built-in library for it.
Though I thought there was at least one thread on SDIO

There is one for the F4. I have looked at it, and it should be fairly straightforward to convert to libmaple F1: only the DMA stuff is different. The SDIO peripheral is exactly the same with a different base address, so all register and bit definitions are valid. The SDIO peripheral is only available on xC MCUs or higher, so no Blue Pill even if I ported the driver Steve wrote.


Pito
Thu Aug 17, 2017 6:32 am
@ipsilondev: try running the “bench.ino” test from the SdFat repo. Use only a 512-byte buffer and SdFatEx.
Download the latest version: https://github.com/greiman/SdFat
On SPI2 (max 18 MHz) you could achieve at most ~1.7 MB/s write and ~1.6 MB/s read, according to the measurements we did. On SPI1 (36 MHz), double that.
That test may show you the limits of your card.
viewtopic.php?f=13&t=20&start=120#p17911
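
As a rough sanity check on these numbers: an SPI bus moves one bit per clock, so the clock divided by 8 gives an upper bound on payload bytes per second, before any command or filesystem overhead. A small sketch, with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// Upper bound on SPI payload throughput: one bit per clock, 8 bits per byte.
// Ignores SD command overhead, card latency and filesystem work.
inline std::uint32_t spiCeilingBytesPerSecond(std::uint32_t clockHz) {
    return clockHz / 8;
}

// How much of that ceiling a measured rate reaches, in whole percent
// (rounded down).
inline std::uint32_t efficiencyPercent(std::uint32_t measuredBytesPerSecond,
                                       std::uint32_t clockHz) {
    const std::uint32_t ceiling = spiCeilingBytesPerSecond(clockHz);
    assert(ceiling > 0);
    return static_cast<std::uint32_t>(
        static_cast<std::uint64_t>(measuredBytesPerSecond) * 100 / ceiling);
}
```

At 18 MHz the ceiling is 2,250,000 bytes/s, so a measured ~1.6 MB/s read is about 71% of what the bus can carry. At 36 MHz the ceiling is 4,500,000 bytes/s.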

racemaniac
Thu Aug 17, 2017 8:16 am
[victor_pv – Wed Aug 16, 2017 10:51 pm] –

[RogerClark – Wed Aug 16, 2017 10:32 pm] –
I thought SDIO was the best way to improve performance, but it’s not available on the Blue Pill, and libmaple does not have a built-in library for it.
Though I thought there was at least one thread on SDIO

There is one for the F4. I have looked at it, and it should be fairly straightforward to convert to libmaple F1: only the DMA stuff is different. The SDIO peripheral is exactly the same with a different base address, so all register and bit definitions are valid. The SDIO peripheral is only available on xC MCUs or higher, so no Blue Pill even if I ported the driver Steve wrote.

We should really beg a supplier to make Maple Mini / Blue Pill style boards with an F4 on them XD
I’ve modded Maple Minis with the STM32F411CEU6, which has a 4-bit SDIO port. Imagine if you could buy those for 5-8€ from China XD (the STM32F4 costs 3€ in small volumes on AliExpress).


victor_pv
Thu Aug 17, 2017 1:53 pm
[racemaniac – Thu Aug 17, 2017 8:16 am] –
We should really beg a supplier to make Maple Mini / Blue Pill style boards with an F4 on them XD
I’ve modded Maple Minis with the STM32F411CEU6, which has a 4-bit SDIO port. Imagine if you could buy those for 5-8€ from China XD (the STM32F4 costs 3€ in small volumes on AliExpress).

I once tried sending an email to vcc-gnd.com about producing Blue Pills with the 303 (I did that transplant and it works fine), but didn’t even get a reply.
Perhaps if enough people ask them…
There is also the option of sending the design files and component list to a manufacturing house to see how much a small run would cost; likely not that cheap…


ipsilondev
Fri Aug 18, 2017 12:08 pm
[victor_pv – Wed Aug 16, 2017 9:41 pm] –
We can get several MB/s of SD card transfer. It depends on the card class, among other factors: which SPI port you use, the speed setting on the port, etc.
There are several threads talking about sdcard transfer speeds and many tests.
This is the first post and first test I did when I added DMA capabilities to the sdfat library, just consulted it to check the speeds, and was getting close to 2MB/s back then with a 4KB buffer:
http://www.stm32duino.com/viewtopic.php … +speed#p27

With a 10KB buffer, as long as the card is fast, you should get at least the same speed.
If your images are formatted in a way that can be written directly to the display, I recommend using double buffering and DMA as much as possible.
While one DMA channel loads data from the SD card into one buffer, another channel can be writing from the other buffer to the display. You optimize the transfer speed of the peripherals, and at the same time the CPU can still be doing other things, like planning what to load next from the SD card, doing calculations, reading inputs, etc.

What core are you using btw?

It may also be time to move to a more powerful MCU such as an F4, though. They run faster and have a hardware single-precision (SP) FPU, which may help you with graphics calculations if you use single precision.

Thanks for the detailed answer!
I’m using the Blue Pill (STM32F103C8T6). I will re-test everything and see what else I can get :) I would love to use an F4, but cost is another limitation: I can’t pay more than $5 per board (and the cheapest F4 boards are $10). I’m planning to use the STM32F103RCT6 for the final version.
Do you have an example of that double buffering for sending the data directly to the LCD? It would be very useful for background operations!

[RogerClark – Wed Aug 16, 2017 10:32 pm] –
I thought SDIO was the best way to improve performance, but it’s not available on the Blue Pill, and libmaple does not have a built-in library for it.
Though I thought there was at least one thread on SDIO

I’m planning to use the STM32F103RCT6 for the final version of the product. It is supposed to have one SDIO port, and with the numbers everyone is giving me, maybe it is even possible to do it :)

[Pito – Thu Aug 17, 2017 6:32 am] –
@ipsilondev: try running the “bench.ino” test from the SdFat repo. Use only a 512-byte buffer and SdFatEx.
Download the latest version: https://github.com/greiman/SdFat
On SPI2 (max 18 MHz) you could achieve at most ~1.7 MB/s write and ~1.6 MB/s read, according to the measurements we did. On SPI1 (36 MHz), double that.
That test may show you the limits of your card.
viewtopic.php?f=13&t=20&start=120#p17911

Thanks for the link! My tests were done using the SdFat-beta version, so I suppose that was my mistake! I’ll give it a try :)

Thanks, everyone, for all the answers! I’ll re-test and see what else I can achieve with all this new info :D


RogerClark
Fri Aug 18, 2017 9:00 pm
I find it interesting that you say that the cost of the F4 is too high, and you imply that you are developing a commercial product

If so, I think you should have made this clear.


ipsilondev
Fri Aug 18, 2017 9:23 pm
[RogerClark – Fri Aug 18, 2017 9:00 pm] –
I find it interesting that you say that the cost of the F4 is too high, and you imply that you are developing a commercial product

If so, I think you should have made this clear.

Yes, that is the idea, if I can achieve what I’m trying to do with the platform. Sorry that I didn’t say so earlier; I didn’t know I needed to clarify this in order to ask for ideas / tips on the forum when my objective is a commercial product.

If I offended anyone, or anyone felt cheated, I apologize; it wasn’t my intention at all.


ipsilondev
Fri Aug 18, 2017 9:28 pm
[RogerClark – Fri Aug 18, 2017 9:00 pm] –
I find it interesting that you say that the cost of the F4 is too high, and you imply that you are developing a commercial product

If so, I think you should have made this clear.

I edited the original post and put a disclaimer at the top. Is that OK?

Again, I’m sorry if I broke any rule or offended anyone with my posts; it wasn’t my intention at all.


Pito
Fri Aug 18, 2017 9:48 pm
Sorry that I didn’t say so earlier; I didn’t know I needed to clarify this in order to ask for ideas / tips on the forum when my objective is a commercial product.
Nobody here has to ask you to clarify that..

ipsilondev
Fri Aug 18, 2017 9:54 pm
[Pito – Fri Aug 18, 2017 9:48 pm] –
Sorry that I didn’t say so earlier; I didn’t know I needed to clarify this in order to ask for ideas / tips on the forum when my objective is a commercial product.
Nobody here has to ask you to clarify that..

The admin just told me that:

[RogerClark – Fri Aug 18, 2017 9:00 pm] –
I find it interesting that you say that the cost of the F4 is too high, and you imply that you are developing a commercial product

If so, I think you should have made this clear.


RogerClark
Fri Aug 18, 2017 10:07 pm
I will create some forum rules, as I know in the past people can get offended if they feel they are effectively working for free on a commercial product without knowing this.

I can see that Pito is however happy to answer questions for commercial products.

Also, there is a notice in the readme file for Libmaple, which says the code is experimental and you use it at your own risk.

I don’t know whether STM’s own core or STM32 generic have those disclaimers.

