How can we reduce the excessive RAM use by this core

RogerClark
Fri Nov 18, 2016 10:14 pm
Guys,

Rick has noticed that the RAM usage of this core is very high. Blank sketch takes 9k of RAM vs libmaple uses 2k.

The main reason for this appears to be arrays of data in RAM, most of which is static and does not need to be there

I’ve made a start at fixing some of this, looking at the pwm code which had an array of all PWM capable pins, which has to store the HAL timHandle data, but also included pin / port definitions which don’t change, as well as setup data, which doesnt change.

The WIP branch https://github.com/stm32duino/Arduino_C … 1/tree/WIP has a this fix, which halves the amount of RAM taken for the PWM config (and saves just under 1k)

But though investigating this problem, I noticed that things like this

PinDescription g_intPinConfigured[MAX_DIGITAL_IOS];


stevestrong
Sat Nov 19, 2016 8:32 am
RogerClark wrote:
In which case this doesn’t need to be an array of structs, it needs to be an array of pointers to structs

danieleff
Sat Nov 19, 2016 8:42 am
I made a .map analizer: http://danieleff.com/stm32/map_analizer … uepill.map
You can upload .map files from the build directory, and it will show the sizes of the objects. I put a few examples there.
The advantage to the standard arm-none-eabi-nm is that this also shows the source file that the object was included from.
EDIT: I mean I made it so RAM usage can be analyzed more easily.

RogerClark
Sat Nov 19, 2016 8:54 am
Thanks @danieleff

BTW.

I’ve realised that g_intPinConfigured[pin] = g_APinDescription; does actually do memcpy on the struct


stevestrong
Sat Nov 19, 2016 8:58 am
Hm, I have the feeling that the originator of this code comes from the PC world where the RAM size is never an issue.
Looking at the map from danieleff (nice feature, thank you!) it seems that the struct analog_config_str g_analog_config[NB_ANALOG_CHANNELS]

RogerClark
Sat Nov 19, 2016 9:14 am
Yes. I agree the developers don’t seem to have considered RAM usage

Actually that struct does have 1 member timHandle that needs to be in RAM (See the same file in the WIP branch)

I think the dac member also needs to be in RAM


stevestrong
Sat Nov 19, 2016 9:33 am
I see what you mean, you’re right.
Do you refer to “dacInstance” or “dacChannel”? Or both?
One could make separated structs, one part in FLASH, the other in RAM? I think it has been discussed already.

RogerClark
Sat Nov 19, 2016 9:51 am
stevestrong wrote:I see what you mean, you’re right.

One could make separated structs, one part in FLASH, the other in RAM? I think it has been discussed already.


stevestrong
Sat Nov 19, 2016 9:57 am
Looks very nice :)
In the same context, I see huge RAM saving potential in USB stuff, like reduction of data size uin32 to uint8 by features of USB_CfgTypeDef.
Reducing the number of endpoints from 15 down to 7 would be also not bad, I don’t think that all endpoints will be used simultaneously in any application.

RogerClark
Sat Nov 19, 2016 10:44 am
stevestrong wrote:Looks very nice :)
In the same context, I see huge RAM saving potential in USB stuff, like reduction of data size uin32 to uint8 by features of USB_CfgTypeDef.
Reducing the number of endpoints from 15 down to 7 would be also not bad, I don’t think that all endpoints will be used simultaneously in any application.

RogerClark
Sat Nov 19, 2016 10:51 am
BTW. I have 2 issues open about various problems with RAM usage

https://github.com/stm32duino/Arduino_C … /issues/27

https://github.com/stm32duino/Arduino_C … /issues/23

@danieleff has pointed out some other problems (in another issue) which would be fixed if we made changes to improve RAM usage

Re:dacInstance

It doesnt actually appeared to be used anywhere, probably as the Nucleo F103RB does not have a DAC

We should move it to its own array like I did for timHandle but that array is only needed for F103RC or better

And….

Reuse of pinDescription struct (in RAM) in many places where 90% of the struct should be static or not there at all, also needs to be fixed ;-(


GrumpyOldPizza
Sat Nov 19, 2016 6:33 pm
Just quickly tried out the BareMinimum.ino on the NUCLEO-L432KC with the STM32L4 core (not the STM Core). This is what I am getting:

text data bss dec hex filename
5872 8 576 6456 1938 BareMinimum.ino.elf


RogerClark
Sat Nov 19, 2016 6:44 pm
The problem seems to be the statically allocated arrays to handle all possible PWM pins running etc.

Normally I know malloc() is not generally used, due to memory fragmentation, but I think because of the way the HAL needs large data structs, that perhaps the way to handle the PWM, at least, is to only malloc the timHandle structs for each pin as its required.
But don’t call free() pin pwm_stop() , because in some circumstances it could result in something resembling a memory leak.

It would be no worse than the current code, and would mean if you are not using PWM it would save over 1k.

I have not looked at the other memory hogs, but hopefully something similar could be done with them.


Rick Kimball
Sat Nov 19, 2016 7:26 pm
The way the arduino IDE calculates RAM usage and the way the linker script attempts to warn you about the low memory usage are in conflict. There are really 1536 bytes of RAM that the linker “allocates” to the heap and the stack. However, the linker’s approach doesn’t really reflect how much heap or RAM is used. The arduino IDE knows how much RAM a board/chip has and will report low memory issues with a warning about stability. However, neither approach is accurate.

You could probably change the linker script part here from:
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(4);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(4);
} >RAM


RogerClark
Sat Nov 19, 2016 7:46 pm
Thanks Rick,

The linker stuff is outside my knowledge base ;-)


RogerClark
Sat Nov 19, 2016 10:02 pm
I’ve updated the WIP branch to only malloc the timHandle when its actually used.

i.e I have an array of pointers, which should get initialised to 0 (NULL)

TIM_HandleTypeDef *g_analog_timer_config[NB_ANALOG_CHANNELS]={};


RogerClark
Sun Nov 20, 2016 3:01 am
Rick Kimball wrote:The way the arduino IDE calculates RAM usage and the way the linker script attempts to warn you about the low memory usage are in conflict. There are really 1536 bytes of RAM that the linker “allocates” to the heap and the stack. However, the linker’s approach doesn’t really reflect how much heap or RAM is used. The arduino IDE knows how much RAM a board/chip has and will report low memory issues with a warning about stability. However, neither approach is accurate.

You could probably change the linker script part here from:
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(4);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(4);
} >RAM


danieleff
Sun Nov 20, 2016 8:21 am
stevestrong wrote:RogerClark wrote:
In which case this doesn’t need to be an array of structs, it needs to be an array of pointers to structs

RogerClark
Sun Nov 20, 2016 8:56 pm
stevestrong wrote:Looks very nice :)
In the same context, I see huge RAM saving potential in USB stuff, like reduction of data size uin32 to uint8 by features of USB_CfgTypeDef.
Reducing the number of endpoints from 15 down to 7 would be also not bad, I don’t think that all endpoints will be used simultaneously in any application.

Wi6Labs
Wed Nov 23, 2016 11:18 am
RogerClark wrote:Rick Kimball wrote:The way the arduino IDE calculates RAM usage and the way the linker script attempts to warn you about the low memory usage are in conflict. There are really 1536 bytes of RAM that the linker “allocates” to the heap and the stack. However, the linker’s approach doesn’t really reflect how much heap or RAM is used. The arduino IDE knows how much RAM a board/chip has and will report low memory issues with a warning about stability. However, neither approach is accurate.

You could probably change the linker script part here from:
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
. = ALIGN(4);
PROVIDE ( end = . );
PROVIDE ( _end = . );
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(4);
} >RAM


danieleff
Wed Nov 23, 2016 12:38 pm
It depents if one wants to have a warning or an error when there is too low memory left for stack+heap.
Arduino does the warning (at 75% ram), and _user_heap_stack reserves ram making it an error.

So currently (without the modification) both are in effect.

*EDIT: My proposal is to consider stack as essential, all programs use it, so reserve it (Leave “. = . + _Min_Stack_Size;” inside “._user_heap_stack :”), while heap as non essential, a most programs will not use it, and leave it to arduino warning (Remove “. = . + _Min_Heap_Size;” from “._user_heap_stack :”)

You can kind of sort of guess stack size for every user, but not heap size.


Rick Kimball
Wed Nov 23, 2016 2:09 pm
Like I mentioned in my earlier post, neither method is actually accurate. By using arbitrary values in the ld script you are just making assumptions about how much heap and stack the arduino might actually use. However, letting the arduino ide warn you with its arbitrary low water mark isn’t any better. The arduino seems to use a build engine parameter of 75% of ram (build.warn_data_percentage = 75). So for the 20k stm32f103c8 it will give you a stability warning when you have less than 5120 bytes left. That is even more conservative than the 1536 bytes reserved by the ldscript.

For myself, I’d rather know how much memory is actually used by the .bss and .data sections without having to guess what the .ldscript is holding in reserve.

-rick


GrumpyOldPizza
Wed Nov 23, 2016 2:51 pm
Rick Kimball wrote:Like I mentioned in my earlier post, neither method is actually accurate. By using arbitrary values in the ld script you are just making assumptions about how much heap and stack the arduino might actually use. However, letting the arduino ide warn you with its arbitrary low water mark isn’t any better. The arduino seems to use a build engine parameter of 75% of ram (build.warn_data_percentage = 75). So for the 20k stm32f103c8 it will give you a stability warning when you have less than 5120 bytes left. That is even more conservative than the 1536 bytes reserved by the ldscript.

For myself, I’d rather now how much memory is actually used by the .bss and .data sections without having to guess what the .ldscript is holding in reserve.

-rick


Rick Kimball
Wed Nov 23, 2016 3:40 pm
GrumpyOldPizza wrote:[
It might be interesting to use a similar scheme that ARM uses for CMSIS (startup_CM4.S and gcc_arm.ld below). Essentially you can define __HEAP_SIZE as compile time argument (and __STACK_SIZE). If you overflow the minimum requirement (after all .data/.bss data has been allocated), the linker will throw an error.

GrumpyOldPizza
Wed Nov 23, 2016 6:34 pm
Rick Kimball wrote:GrumpyOldPizza wrote:[
It might be interesting to use a similar scheme that ARM uses for CMSIS (startup_CM4.S and gcc_arm.ld below). Essentially you can define __HEAP_SIZE as compile time argument (and __STACK_SIZE). If you overflow the minimum requirement (after all .data/.bss data has been allocated), the linker will throw an error.

Rick Kimball
Wed Nov 23, 2016 6:38 pm
You really want to control this at the sketch level, not the board level.

Say I create a big array on the stack

void somefunction(void) {
char mybuffer[8192];

while (1) {
.. do somethign with the buffer...
}
}


RogerClark
Thu Nov 24, 2016 7:11 am
umm.

Looks like I need to revert this change until the dust settles


Wi6Labs
Thu Nov 24, 2016 9:14 am
We prefer to keep a warning base on the real RAM usage for the user. We agree that Arduino should give a solution to modify this level (75% is too low).
Reduce stack and heap under the current value is risky (we have made some tests during the development).
Maybe there isn’t any solution for the moment.

RogerClark
Thu Nov 24, 2016 10:28 am
I’ve reverted the change to all the linker files, which I changed in the F1 repo

Note.
All recent changes are only to the WIP branch (Work in progress) and should not effect anyone who downloads the master repo zip


Leave a Reply

Your email address will not be published. Required fields are marked *