c - ML model from Edge Impulse runs 5 times slower after porting it from Arduino IDE to ESP-IDF on ESP32 - Stack Overflow

I've been struggling with this one for a while, so here we go: I've been trying to match the inference speed of an ML model that I generated with Edge Impulse, originally exported for Arduino and then ported to ESP-IDF, on my ESP32-CAM device.

The algorithm takes ~1300 ms to run on Arduino and ~6600 ms on ESP-IDF, with -Os optimization in both cases. The closest I got was by setting the compile optimization to -O2, which brought the ESP-IDF build down to around 2000 ms.

In both cases the CPU frequency is set to 240 MHz. I tried to work out exactly how Arduino compiles the project and to mimic it, to see what I could be missing, but I can't figure it out.

I verified with a test sample that does matrix calculations with both volatile floats and integers, to ensure that the CPU's computing capacity is the same in both environments, and I get similar results on both projects.

I also logged everything thread-related and it matches (runs on core 1, same priority, same CPU speed).
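For reference, a sanity benchmark of this kind could look roughly like the sketch below. It is not the exact test I ran: the function names, the 32x32 size and the use of an identity matrix (which makes the checksum easy to verify by hand) are all assumptions chosen for illustration. It uses std::chrono so it also builds on a desktop; on the target, esp_timer_get_time() could be used instead.

```cpp
#include <chrono>
#include <cstdint>

constexpr int kN = 32;  // arbitrary matrix size, chosen for illustration only

// Integer N x N multiply through volatile storage, so the compiler cannot
// remove the loads and stores at -Os or -O2. B is the identity matrix,
// so C equals A and the checksum is the sum of all A[i][j] = i + j.
int32_t matmul_int_checksum() {
    static volatile int32_t a[kN][kN], b[kN][kN], c[kN][kN];
    for (int i = 0; i < kN; ++i)
        for (int j = 0; j < kN; ++j) {
            a[i][j] = i + j;
            b[i][j] = (i == j) ? 1 : 0;
        }
    for (int i = 0; i < kN; ++i)
        for (int j = 0; j < kN; ++j) {
            int32_t acc = 0;
            for (int k = 0; k < kN; ++k) acc += a[i][k] * b[k][j];
            c[i][j] = acc;
        }
    int32_t sum = 0;
    for (int i = 0; i < kN; ++i)
        for (int j = 0; j < kN; ++j) sum += c[i][j];
    return sum;
}

// Same workload with floats: A[i][j] = 0.5 * (i + j), B is the identity.
float matmul_float_checksum() {
    static volatile float a[kN][kN], b[kN][kN], c[kN][kN];
    for (int i = 0; i < kN; ++i)
        for (int j = 0; j < kN; ++j) {
            a[i][j] = 0.5f * static_cast<float>(i + j);
            b[i][j] = (i == j) ? 1.0f : 0.0f;
        }
    for (int i = 0; i < kN; ++i)
        for (int j = 0; j < kN; ++j) {
            float acc = 0.0f;
            for (int k = 0; k < kN; ++k) acc += a[i][k] * b[k][j];
            c[i][j] = acc;
        }
    float sum = 0.0f;
    for (int i = 0; i < kN; ++i)
        for (int j = 0; j < kN; ++j) sum += c[i][j];
    return sum;
}

// Wall-clock time of one call to `work`, in microseconds.
template <typename F>
int64_t elapsed_us(F&& work) {
    const auto t0 = std::chrono::steady_clock::now();
    work();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
}
```

Calling elapsed_us([] { matmul_float_checksum(); }) from the main task in each project and comparing the two numbers gives a rough check that raw compute speed is the same in both builds.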

I ensured that memory allocation is static in the TensorFlow Lite library in both projects, using the flag -DTF_LITE_STATIC_MEMORY.

I ensured that there are no parallelism shenanigans: OPEN_MP is disabled in both cases.

I switched compilers to check whether the problem comes from the compiler's libc or something similar.

I tried to get as close as possible to Arduino's compiler arguments.

Here is a dump of the Arduino compile arguments:

COLLECT_GCC_OPTIONS='-c' '-mlongcalls' '-Wno-frame-address' '-ffunction-sections' '-fdata-sections' '-Wno-error=unused-function' '-Wno-error=unused-variable' '-Wno-error=unused-but-set-variable' '-Wno-error=deprecated-declarations' '-Wno-unused-parameter' '-Wno-sign-compare' '-Wno-enum-conversion' '-gdwarf-4' '-ggdb' '-freorder-blocks' '-Wwrite-strings' '-fstack-protector' '-fstrict-volatile-bitfields' '-fno-jump-tables' '-fno-tree-switch-conversion' '-std=gnu++23' '-fexceptions' '-fno-rtti' '-w' '-Os' '-v' '-w' '-E' '-CC' '-D' 'F_CPU=240000000L' '-D' 'ARDUINO=10607' '-D' 'ARDUINO_ESP32_DEV' '-D' 'ARDUINO_ARCH_ESP32' '-D' 'ARDUINO_BOARD="ESP32_DEV"' '-D' 'ARDUINO_VARIANT="esp32"' '-D' 'ARDUINO_PARTITION_huge_app' '-D' 'ARDUINO_HOST_OS="windows"' '-D' 'ARDUINO_FQBN="esp32:esp32:esp32cam:CPUFreq=240,FlashFreq=80,FlashMode=qio,PartitionScheme=huge_app,DebugLevel=none,EraseFlash=none"' '-D' 'ESP32' '-D' 'CORE_DEBUG_LEVEL=0' '-D' 'BOARD_HAS_PSRAM' '-mfix-esp32-psram-cache-issue' '-mfix-esp32-psram-cache-strategy=memw' '-D' 'ARDUINO_USB_CDC_ON_BOOT=0' '-D' 'ESP_PLATFORM' '-D' 'IDF_VER="v5.1.4-497-gdc859c1e67-dirty"' '-D' 'MBEDTLS_CONFIG_FILE="mbedtls/esp_config.h"' '-D' 'SOC_MMU_PAGE_SIZE=CONFIG_MMU_PAGE_SIZE' '-D' 'UNITY_INCLUDE_CONFIG_H' '-D' '_GNU_SOURCE' '-D' '_POSIX_READER_WRITER_LOCKS' '-D' 'configENABLE_FREERTOS_DEBUG_OCDAWARE=1' '-D' 'TF_LITE_STATIC_MEMORY' '-I'

Here is my compile line on ESP-IDF:

 C:\Espressif\tools\xtensa-esp32-elf\esp-12.2.0_20230208\xtensa-esp32-elf\bin\xtensa-esp32-elf-g++.exe -mlongcalls -Wno-frame-address -DNDEBUG -fdiagnostics-color=always -Wno-unused-variable -Wno-deprecated-declarations -Wno-missing-field-initializers -Wno-maybe-uninitialized -Wno-error=uninitialized -DTF_LITE_STATIC_MEMORY -mlongcalls -ffunction-sections -fdata-sections -fstrict-volatile-bitfields -fno-jump-tables -fno-tree-switch-conversion -fno-rtti -w -Wall -Werror=all -Wno-error=unused-function -Wno-error=unused-variable -Wno-error=unused-but-set-variable -Wno-error=deprecated-declarations -Wextra -Wno-unused-parameter -Wno-sign-compare -Wno-enum-conversion -gdwarf-4 -ggdb -mfix-esp32-psram-cache-issue -mfix-esp32-psram-cache-strategy=memw -Os -freorder-blocks -fmacro-prefix-map=path -fmacro-prefix-map=other_path -DconfigENABLE_FREERTOS_DEBUG_OCDAWARE=1 -std=gnu++2b -fno-exceptions  -DESP32=ESP32 -MD -MT file.cpp.obj -MF file.cpp.obj.d -o file.cpp.obj
-c file.cpp
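To make the comparison of the two command lines less error-prone than reading them by eye, a small helper like the one below can list the flags that appear on one side only. This is just a sketch: split_flags, only_in and the two excerpt constants are hypothetical names, the excerpts contain only a handful of flags copied from the dumps above, and the helper assumes plain whitespace-separated flags (the quotes and the separate '-D' 'NAME' pairs of the Arduino dump would have to be normalized first). The comparison is purely textual.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Short excerpts of the two command lines above (not the full lines).
const std::string kArduinoExcerpt =
    "-Os -fstack-protector -fexceptions -fno-rtti -std=gnu++23";
const std::string kIdfExcerpt =
    "-Os -fno-exceptions -fno-rtti -std=gnu++2b -DNDEBUG";

// Split a compile line on whitespace into a sorted set of flags.
std::set<std::string> split_flags(const std::string& line) {
    std::istringstream in(line);
    std::set<std::string> flags;
    for (std::string tok; in >> tok;) flags.insert(tok);
    return flags;
}

// Flags present in `lhs` but absent from `rhs`, in sorted order.
std::vector<std::string> only_in(const std::set<std::string>& lhs,
                                 const std::set<std::string>& rhs) {
    std::vector<std::string> out;
    std::set_difference(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                        std::back_inserter(out));
    return out;
}
```

On these excerpts, only_in(split_flags(kArduinoExcerpt), split_flags(kIdfExcerpt)) yields -fexceptions, -fstack-protector and -std=gnu++23, and the reverse call yields -DNDEBUG, -fno-exceptions and -std=gnu++2b.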

What seems strangest to me is how much difference -O2 makes in my ESP-IDF build, while Arduino is still faster with -Os...

Anyway, any help would be greatly appreciated. Have a good day everyone, and thanks for reading,

Aloïs