Rick has noticed that the RAM usage of this core is very high. A blank sketch takes 9k of RAM, whereas libmaple uses 2k.
The main reason for this appears to be arrays of data in RAM, most of which is static and does not need to be there.
I’ve made a start at fixing some of this, looking at the PWM code, which had an array of all PWM-capable pins. That array has to store the HAL timHandle data, but it also included pin / port definitions and setup data, neither of which change.
The WIP branch https://github.com/stm32duino/Arduino_C … 1/tree/WIP has this fix, which halves the amount of RAM taken for the PWM config (and saves just under 1k).
But through investigating this problem, I noticed things like this:
PinDescription g_intPinConfigured[MAX_DIGITAL_IOS];
In which case this doesn’t need to be an array of structs, it needs to be an array of pointers to structs
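A rough sketch of the pointer idea, assuming a simplified PinDescription (the fields and table contents here are made up for illustration, they are not the core's real definition). The descriptions sit in a const table that the linker can leave in flash, and the "configured" array only holds pointers into it:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_DIGITAL_IOS 4

/* Hypothetical, simplified stand-in for the core's PinDescription. */
typedef struct {
    uint32_t arduino_id;
    uint32_t port_base;
    uint16_t pin_mask;
    uint8_t  mode;
    uint8_t  flags;
} PinDescription;

/* const: on a Cortex-M build this table can stay in flash (.rodata). */
static const PinDescription g_APinDescription[MAX_DIGITAL_IOS] = {
    {0, 0x40010800u, 1u << 0, 0, 0},
    {1, 0x40010800u, 1u << 1, 0, 0},
    {2, 0x40010C00u, 1u << 0, 0, 0},
    {3, 0x40010C00u, 1u << 1, 0, 0},
};

/* RAM cost is one pointer per pin instead of one whole struct per pin. */
static const PinDescription *g_intPinConfigured[MAX_DIGITAL_IOS];

/* Record that a pin is configured by pointing at its flash entry (no struct copy). */
void pin_mark_configured(uint32_t pin)
{
    if (pin < MAX_DIGITAL_IOS) {
        g_intPinConfigured[pin] = &g_APinDescription[pin];
    }
}

/* A NULL slot means the pin has not been configured yet. */
int pin_is_configured(uint32_t pin)
{
    return pin < MAX_DIGITAL_IOS && g_intPinConfigured[pin] != NULL;
}
```

Anything that genuinely changes at run time would still need its own small RAM variable; this only covers the part that is read-only.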
You can upload .map files from the build directory, and it will show the sizes of the objects. I put a few examples there.
The advantage over the standard arm-none-eabi-nm is that this also shows the source file that each object came from.
EDIT: I mean I made it so RAM usage can be analyzed more easily.
BTW.
I’ve realised that g_intPinConfigured[pin] = g_APinDescription; does actually do memcpy on the struct
Looking at the map from danieleff (nice feature, thank you!) it seems that the struct
analog_config_str g_analog_config[NB_ANALOG_CHANNELS]
Actually that struct does have 1 member timHandle that needs to be in RAM (See the same file in the WIP branch)
I think the dac member also needs to be in RAM
Do you refer to “dacInstance” or “dacChannel”? Or both?
One could make separated structs, one part in FLASH, the other in RAM? I think it has been discussed already.
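Something like the following is what a split could look like. This is only a sketch: the struct names and fields are invented here, and analog_ram_state just stands in for whatever really has to be writable (for example the HAL handle). The constant half is declared const so it can live in flash, and the same channel index selects both halves:

```c
#include <stdint.h>

#define NB_ANALOG_CHANNELS 3

/* Hypothetical constant part: pin / port / setup data that never changes. */
typedef struct {
    uint32_t port_base;
    uint16_t pin_mask;
    uint8_t  adc_channel;
    uint8_t  use_pwm;
} analog_const_cfg;

/* Hypothetical mutable part: stands in for the data that must be in RAM. */
typedef struct {
    uint32_t period;
    uint32_t pulse;
} analog_ram_state;

/* Flash side (const) and RAM side, indexed by the same channel number. */
static const analog_const_cfg g_analog_const[NB_ANALOG_CHANNELS] = {
    {0x40010800u, 1u << 0, 0, 1},
    {0x40010800u, 1u << 1, 1, 1},
    {0x40010800u, 1u << 4, 4, 0},
};
static analog_ram_state g_analog_state[NB_ANALOG_CHANNELS];

/* Returns 0 on success, -1 if the channel is out of range or not PWM capable. */
int analog_set_pulse(uint32_t ch, uint32_t pulse)
{
    if (ch >= NB_ANALOG_CHANNELS || !g_analog_const[ch].use_pwm) {
        return -1;
    }
    g_analog_state[ch].pulse = pulse;
    return 0;
}
```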
In the same context, I see huge RAM-saving potential in the USB stuff, such as reducing the data size of the USB_CfgTypeDef fields from uint32 to uint8.
Reducing the number of endpoints from 15 down to 7 would also not be bad; I don’t think that all endpoints will be used simultaneously in any application.
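To put rough numbers on both ideas, here is a sketch with made-up structs (these are not the real HAL definitions, and the field names are assumptions): the same eight options stored as 32-bit words versus bytes, plus a helper that shows how the endpoint tables scale with the endpoint count if one IN and one OUT entry is kept per endpoint.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical config block with every option stored as a 32-bit word. */
typedef struct {
    uint32_t dev_endpoints;
    uint32_t speed;
    uint32_t ep0_mps;
    uint32_t phy_itface;
    uint32_t sof_enable;
    uint32_t low_power_enable;
    uint32_t dma_enable;
    uint32_t vbus_sensing_enable;
} usb_cfg_wide_t;

/* The same options, assuming each value fits in a byte. */
typedef struct {
    uint8_t dev_endpoints;
    uint8_t speed;
    uint8_t ep0_mps;
    uint8_t phy_itface;
    uint8_t sof_enable;
    uint8_t low_power_enable;
    uint8_t dma_enable;
    uint8_t vbus_sensing_enable;
} usb_cfg_narrow_t;

/* Hypothetical per-endpoint state. */
typedef struct {
    uint8_t  num;
    uint8_t  type;
    uint16_t maxpacket;
    uint32_t xfer_len;
    uint32_t xfer_count;
} usb_ep_t;

/* Bytes needed for the endpoint tables: one IN and one OUT entry per endpoint. */
size_t usb_ep_table_bytes(size_t n_endpoints)
{
    return 2u * n_endpoints * sizeof(usb_ep_t);
}
```

With these made-up layouts the config block drops from 32 bytes to 8, and going from 15 to 7 endpoints removes 16 endpoint entries. The real structs are larger, so the actual saving would have to be read from the .map file.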
https://github.com/stm32duino/Arduino_C … /issues/27
https://github.com/stm32duino/Arduino_C … /issues/23
@danieleff has pointed out some other problems (in another issue) which would be fixed if we made changes to improve RAM usage
Re: dacInstance
It doesn’t actually appear to be used anywhere, probably because the Nucleo F103RB does not have a DAC.
We should move it to its own array like I did for timHandle, but that array is only needed for F103RC or better.
And….
Reuse of pinDescription struct (in RAM) in many places where 90% of the struct should be static or not there at all, also needs to be fixed ;-(
   text    data     bss     dec     hex filename
   5872       8     576    6456    1938 BareMinimum.ino.elf
I know malloc() is not generally used, because of memory fragmentation, but since the HAL needs large data structs, perhaps the way to handle PWM, at least, is to malloc the timHandle struct for each pin only when it’s required.
But don’t call free() in pwm_stop(), because in some circumstances it could result in something resembling a memory leak.
It would be no worse than the current code, and would mean if you are not using PWM it would save over 1k.
I have not looked at the other memory hogs, but hopefully something similar could be done with them.
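A minimal sketch of the allocate-on-first-use idea, under some assumptions: timer_handle_t is a small hypothetical stand-in for the HAL's TIM_HandleTypeDef, and pwm_get_handle() / pwm_start() are illustrative names, not functions from the core. The pointer array costs one pointer per channel, and a handle is only allocated the first time a channel is used. pwm_stop() deliberately keeps the allocation so it can be reused.

```c
#include <stdint.h>
#include <stdlib.h>

#define NB_ANALOG_CHANNELS 4

/* Hypothetical stand-in for the HAL's TIM_HandleTypeDef. */
typedef struct {
    uint32_t period;
    uint32_t pulse;
    uint8_t  running;
} timer_handle_t;

/* Static storage is zero-initialised, so every slot starts out as NULL. */
static timer_handle_t *g_analog_timer_config[NB_ANALOG_CHANNELS];

/* Return the handle for a channel, allocating it on first use.
 * Returns NULL if the channel is invalid or the heap is exhausted. */
timer_handle_t *pwm_get_handle(uint32_t ch)
{
    if (ch >= NB_ANALOG_CHANNELS) {
        return NULL;
    }
    if (g_analog_timer_config[ch] == NULL) {
        g_analog_timer_config[ch] = calloc(1, sizeof(timer_handle_t));
    }
    return g_analog_timer_config[ch];
}

/* Returns 0 on success, -1 if no handle could be obtained. */
int pwm_start(uint32_t ch, uint32_t pulse)
{
    timer_handle_t *h = pwm_get_handle(ch);
    if (h == NULL) {
        return -1;
    }
    h->pulse = pulse;
    h->running = 1;
    return 0;
}

/* Stop the channel but keep the handle: no free(), so restarting PWM on
 * the same pin reuses the block instead of fragmenting the heap. */
void pwm_stop(uint32_t ch)
{
    if (ch < NB_ANALOG_CHANNELS && g_analog_timer_config[ch] != NULL) {
        g_analog_timer_config[ch]->running = 0;
    }
}
```

A sketch that never touches PWM never calls calloc(), so it only pays for the pointer array.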
You could probably change the linker script part here from:
/* User_heap_stack section, used to check that there is enough RAM left */
._user_heap_stack :
{
  . = ALIGN(4);
  PROVIDE ( end = . );
  PROVIDE ( _end = . );
  . = . + _Min_Heap_Size;
  . = . + _Min_Stack_Size;
  . = ALIGN(4);
} >RAM
The linker stuff is outside my knowledge base
i.e I have an array of pointers, which should get initialised to 0 (NULL)
TIM_HandleTypeDef *g_analog_timer_config[NB_ANALOG_CHANNELS]={};
Arduino does the warning (at 75% RAM), and _user_heap_stack reserves RAM, making it an error.
So currently (without the modification) both are in effect.
EDIT: My proposal is to consider the stack as essential, since all programs use it, so reserve it (leave “. = . + _Min_Stack_Size;” inside “._user_heap_stack :”), while treating the heap as non-essential, as most programs will not use it, and leaving it to the Arduino warning (remove “. = . + _Min_Heap_Size;” from “._user_heap_stack :”).
You can kind of sort of guess stack size for every user, but not heap size.
For myself, I’d rather know how much memory is actually used by the .bss and .data sections without having to guess what the .ldscript is holding in reserve.
-rick
It might be interesting to use a scheme similar to the one ARM uses for CMSIS (startup_CM4.S and gcc_arm.ld below). Essentially you can define __HEAP_SIZE as a compile-time argument (and __STACK_SIZE). If you overflow the minimum requirement (after all .data/.bss data has been allocated), the linker will throw an error.
Say I create a big array on the stack:
void somefunction(void) {
  char mybuffer[8192];
  while (1) {
    /* ... do something with the buffer ... */
  }
}
Looks like I need to revert this change until the dust settles.
Reducing the stack and heap below their current values is risky (we made some tests during development).
Maybe there isn’t any solution for the moment.
Note.
All recent changes are only to the WIP branch (Work in progress) and should not affect anyone who downloads the master repo zip.
