I became obsessed last week with the thought that the very cheap ESP8266 could be useful if it did not have RF consuming so much power.
Crazy? Probably. But after playing around with the idea and finding a very easy mechanism for disabling the RF section to drop the power consumption, I am now finding that a non-RF ESP8266-12E module in the NodeMCU form factor takes only 20mA at 80MHz and under 40mA (30’ish) at 160MHz. That is pretty remarkable for the cost.
The specific unit I am using in my test is this one – excepting that I paid $3.11 with free shipping.
So what kind of performance does $3 and pennies provide? I am working on more examples, but I took an STM32duino example that I had ported from the 8-bit AVR world, originally written by the renowned Nick Gammon, and ported that STM32 version to the ESP8266. After adding a single unsigned long var to time a single pass of the program, these are the results on the STM32duino Maple Mini with bootloader 2.0, where the power consumption was 40+mA:
Prime Number Generator
Number of primes in prime table = 53
Last prime in table = 251
Calculating primes through 63001
Total microseconds = 199887 uS
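For anyone who wants to try something similar on a PC first, here is a rough sketch of the kind of test that output describes. To be clear, this is not Nick Gammon's code and not the ported sketch; it is only my assumption of the general shape: build a table of the odd primes up to 251 (that gives the 53 entries shown above), then use that table for trial division up to 251 x 251 = 63001, and time one pass. All function names here are made up for illustration, and std::chrono stands in for the two micros() calls you would use on the board.

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Trial division against a table of odd primes. Only meaningful for odd n
// that do not exceed the square of the largest table entry (or while the
// table is still being built in ascending order).
static bool isPrimeByTable(uint32_t n, const std::vector<uint32_t>& table) {
    for (uint32_t p : table) {
        if (p * p > n) break;          // no divisor up to sqrt(n): n is prime
        if (n % p == 0) return false;  // found a divisor
    }
    return true;
}

// Build the table of odd primes from 3 up to and including `limit`.
std::vector<uint32_t> buildPrimeTable(uint32_t limit) {
    std::vector<uint32_t> table;
    for (uint32_t n = 3; n <= limit; n += 2) {
        if (isPrimeByTable(n, table)) table.push_back(n);
    }
    return table;
}

// Count the odd primes from 3 through `limit` using the table.
uint32_t countOddPrimesThrough(const std::vector<uint32_t>& table, uint32_t limit) {
    uint32_t count = 0;
    for (uint32_t n = 3; n <= limit; n += 2) {
        if (isPrimeByTable(n, table)) ++count;
    }
    return count;
}

// Time a single pass in microseconds (host-side stand-in for micros()).
int64_t timeSinglePassMicros(const std::vector<uint32_t>& table, uint32_t limit) {
    auto start = std::chrono::steady_clock::now();
    volatile uint32_t found = countOddPrimesThrough(table, limit);
    (void)found;  // keep the compiler from discarding the work
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling buildPrimeTable(251) and then timeSinglePassMicros(table, 63001) gives one timed pass. Timings from a PC obviously say nothing about either board; the point is only to show what is being measured.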
With the Dhrystone benchmarks done last year, the ESP at 80MHz was about 10% slower than STM32F103 :
viewtopic.php?f=3&t=76&start=30#p5435
system_update_cpu_freq(FREQUENCY);
The core has 80KB DRAM (Data RAM) and 35KB IRAM (Instruction RAM). The DRAM and IRAM segments are separated (Harvard architecture).
Programs are stored in the flash memory and, thanks to the fast interface (SPI-quad), instructions can be fetched and executed “in place”, in other words without copying them to RAM first. On some occasions you cannot execute your code directly from SPI flash memory, mostly because it needs to be as fast as possible or because it handles the flash memory itself (e.g. writing to it). In that case, you can have some code, 32k in total, copied from the flash to the IRAM at startup. This code will run from IRAM, not from flash.
Unfortunately the IRAM space is mostly occupied by SDK code, so of the 32k not a lot is left for your own application: about 2k, maybe 3k.
Even with the fast SPI, execution from flash is slow compared with the normal flash (on the CPU bus) of our STM32.
There are some compiler directives to force generated code to execute in IRAM but you have to respect the limits of the memory.
Something like this is used when we want something in IRAM
#define IRAM0 __attribute__((section(".iram0.text")))
void IRAM0 myfunc(void)
{
}
So benchmarks must include this rather nasty task switching time.
Maybe the linker script does not accept your function, as there are some limitations… Anyway, I am a newbie with the ESP8266, I am just blinking an LED…
<…>
Hm, well, you’re not comparing CPU-speeds then,<…>
I think the quality of the ESP8266 postings here is far better than on esp8266.com
I think virtually all of us use the ESP8266 as well as the STM32.
I designed a PCB that has a Maple Mini, an ESP-12, an ILI9341 and an nRF905, which I use for various projects (though it was originally designed just as a display).
If the ESP8266 had a better ADC and USB, I could probably have removed the STM32 from my design entirely.
But there are still some things like real time data collection that the ESP8266 is not ideally suited for.
I figured it out ! The macro should not be :
#define IRAM0 __attribute__((section(".iram0.text")))
<…>
#define IRAM0 __attribute__((section(".iram.text")))
More here: viewtopic.php?f=45&t=996&start=40#p12370
Summary…
The NodeMCU board is a good buy at $3.11 USD (a check today shows it at $3.22). At 160MHz with the RF section disabled, the board takes less than 40mA (that is, my meter shows it above 30mA but never reaching 40mA). OTOH, a Maple Mini is in the 40 – 49mA range (never lower than 40, but never showing 50mA on the display).
But… and this is a BIG BUT… the Maple Mini is significantly faster than the ESP8266 based NodeMCU. The small price difference ($4.00 versus $3.22, both with free shipping) is not enough, in my opinion, to select a NodeMCU as a replacement for the STM32F103 unless:
- More than 20K RAM is required
- More than 128K of flash is required
- A flash-based internal filesystem is required, SPIFFS
- Sheer CPU speed is not a critical factor
Maybe a few of our members can add to the list. As a caveat, remember that at first power-up the RF will energize and the power consumption will be a minimum of 0.1A, and likely a bit higher, as I’m using a USB meter and the display is not instantaneous.
Ray
I hadn’t realized that the ESP8266 @ 80MHz (or even 160MHz) would be significantly slower.
I presume this is because of the instruction fetch time from the SPI Flash.
<…>
Your test is measuring Arduino-implementation speed, not just CPU-speed, because <…>
I’d be surprised if the SPI flash had the same performance as internal flash. For a start, the data has to come in a single bit at a time, so even if the SPI clock was 80MHz, it would only come in at 10 megabytes per second.
We know from the GD32 that instruction load time makes a big difference, as the 72MHz GD32 is definitely faster than a 72MHz STM32 even on the simple benchmarks.
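The back-of-envelope figure above can be written out as a tiny helper. This is only a sketch of the arithmetic: it gives the peak raw transfer rate and ignores command and address overhead and any caching, so real instruction fetch will be slower. The function name is made up for illustration, and the four-line case is included only because quad SPI was mentioned earlier in the thread.

```cpp
#include <cstdint>

// Peak raw transfer rate of a serial flash interface in bytes per second.
// clockHz:   SPI clock frequency in Hz
// dataLines: number of data lines clocked in parallel (1 = single, 4 = quad)
// Ignores command/address overhead and caching, so it is an upper bound.
constexpr uint64_t peakBytesPerSecond(uint64_t clockHz, unsigned dataLines) {
    return clockHz * dataLines / 8;  // 8 bits per byte
}

// Single-bit SPI at 80MHz: 10 megabytes per second, as estimated above.
static_assert(peakBytesPerSecond(80000000ULL, 1) == 10000000ULL, "single-bit case");
// Four data lines at the same clock: four times that.
static_assert(peakBytesPerSecond(80000000ULL, 4) == 40000000ULL, "quad case");
```

Either way, that is well short of what a core needs to run at full speed straight from the bus, which fits with the idea that fetch time is a big part of the difference.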
Additionally, the ESP8266 only has one ADC and it is 0 to 1V, not 0 to 3.3V. I don’t know if the ADC has the same precision as the STM32’s (it seems unlikely).
<…>
Anyway, the performance and the value of this tiny $2 device are remarkable. From my point of view, the only real limitation is the number of IO pins and peripherals.
In a few months we will know…
No matter what make of USB to serial interface I use, I can’t get reliable uploads any faster than 115200
Albeit I think the FT232 ones I have are probably clones….
And I still have to do the 2 button shuffle to put the device into upload mode.
It would be great if I could upload at the amazing speeds some people manage (900k or more), and if I didn’t need to mess around with pressing buttons prior to uploading.
Local OTA is not a lot of use to me, as a lot of the time the RF is running but the device is not signed into my network, as it’s just operating as an AP, so it would take a lot of messing around to switch the PC’s WiFi to connect to the ESP8266’s AP etc.
Far simpler to use the direct serial connection.
I think I may have a CH340g based USB to Serial kicking around somewhere, but I mainly have Prolific CP21xx modules.
PS. I did try to use a Maple Mini as a USB to Serial, but it didn’t seem to work with the ESP8266 at all
If you look at the end of the page that WereCatf linked, there is a link to :
There you can find a circuit that requires only DTR signal (like basic FTDI cable)
I haven’t checked the circuit as I have only NodeMCUs.
<…>
#define IRAM0 __attribute__((section(".iram.text")))
The really small gain with IRAM is probably a gain from the first iteration, where the non-IRAM code is brought into the cache. All other iterations become identical, since both versions are by then already inside the IRAM.
I have no issue with the basic idea of trying to compare STM32F with non-RF ESP8266 using ports of the same program.
However, I am concerned there may be some young, impressionable, folks reading the site, who may misunderstand …
mrburnette wrote:
<…>
…. With RISC hardware, the system clock determines the MIPS (excepting cores with look-ahead and branch determination and cpu pre-cache.) But for most ARM inexpensive uC’s, the max clock is the determining factor; THEREFORE, 80MHz is faster than 72MHz… faster by 80/72 = 11.11%
Thanks for your excellent post.
I totally agree, and have previously posted bits of similar information.
If the ESP8266 didn’t have wifi, people would dismiss it as an “also ran” in terms of its processing power and lack of normal peripherals found on normal MCUs.
But it has gained traction, and the production volumes allow its cost to be low.
Also it would not be anywhere near as popular if igrr had not done the Arduino port.
(I find the Open Source, non Arduino toolchain is a nightmare to install, and only seems to work on 32 bit linuxes)
<…>
Renamed master record.
However, I had not visited for several days, and read the new thread after several days of posts, and assumed it was deliberate.
So I had wondered if there was some Machiavellian plot, like flying under the radar of Espressif? After all, it was an unflattering benchmark and some companies get very squiffy about that sort of thing. However, the first rule of Machiavellian-plot-club, is …
Then again, I do love Ian Fleming’s books and movies, and I read some of Machiavelli’s “The Prince” when I was a youth.
Anyway, it was an interesting experiment, which I’d still like to understand.
I am puzzled how the code-RAM-cache works, what algorithm does it use to load code?
Also, I’m impressed that it could be overclocked 2x and still read the QSPI memory. So I’d like to read the documentation.
Where are the English-translations of the peripheral interface programming manuals? I’ve had a noodle around, but failed to find them.
However, I had not visited for several days, and read the title after several days of posts, and assumed it was deliberate.
and would there be two f’s ?
I know, this thread is really, _really_ old. I found it while searching for some info on switching off WiFi on an ESP8266. Again, this board had the best result.
I am working on a kinda arty project, involving displays and cameras and whatnot – if I can get it all running.
The benchmark results baffled me. In my experience, the ESP8266 is not really that slow. The culprit is, as was stated more than once, that the watchdog timer wants to be fed time and again.
In my view, it is absolutely valid to fiddle with that beast. If I know that something needs some time without any risk of going fubar, I set the watchdog accordingly. With a 10 second timeout (more than enough to calculate the prime numbers), I get these results:
80 MHz, w/o IRAM: 797219 µs
80 MHz, with IRAM: 760123 µs
160 MHz, w/o IRAM: 398799 µs
160 MHz, with IRAM: 380203 µs
All I did was add the line “ESP.wdtEnable(10000);” in setup() and disable the delay(0). The numbers are definitely worse than with the little STM32F103, but it’s just a factor of 2 or 4. I don’t have much code that is that speed sensitive, so it’s all about IO pins for me. If I need them, it is the STM32, but if I want to “install” a web server with a drop of hot glue, I take the ESP.
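To put numbers on the “factor of 2 or 4”, here is a small sketch that divides the four timings above by the 199887 µs Maple Mini figure from the first post. It assumes the two sets of numbers are directly comparable (same program, one pass each), which I cannot verify; the function names are made up for illustration.

```cpp
// Maple Mini single-pass time from the first post, in microseconds.
constexpr double kMapleMiniMicros = 199887.0;

// How many times slower an ESP8266 timing is than the Maple Mini figure.
constexpr double slowdownVersusMapleMini(double espMicros) {
    return espMicros / kMapleMiniMicros;
}

// Percentage of run time saved by the IRAM variant of the same run.
constexpr double iramGainPercent(double withoutIramMicros, double withIramMicros) {
    return 100.0 * (withoutIramMicros - withIramMicros) / withoutIramMicros;
}

// ESP8266 timings quoted above, in microseconds.
constexpr double kEsp80NoIram  = 797219.0;
constexpr double kEsp80Iram    = 760123.0;
constexpr double kEsp160NoIram = 398799.0;
constexpr double kEsp160Iram   = 380203.0;
```

With those values, slowdownVersusMapleMini() comes out just under 4 at 80 MHz and just under 2 at 160 MHz without IRAM, and iramGainPercent() is a little under 5% at both clock speeds.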



