For those having issues with OTA uploads failing early, at around 4% - 10% or so, the FIRST thing to check is your network connectivity, configuration, routing, firewall, etc.
I lost 2 days trying to sort out why OTA would fail: the watchdog gets triggered, then the device BRICKS without even resetting itself.
In some cases, power-cycling the USB power supply is enough to make it recover. But in most cases, you need to flash over USB AGAIN.
The network connectivity seems fine, although if you ping long enough you can see the latency increasing dramatically.
But there is NOT really a network "failure" per se ...
Code:
root@MYHOSTNAME:~# ping -c 20 esp32-s3-base.local
PING esp32-s3-base.local (172.27.30.1) 56(84) bytes of data.
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=1 ttl=254 time=87.4 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=2 ttl=254 time=8.65 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=3 ttl=254 time=31.1 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=4 ttl=254 time=55.2 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=5 ttl=254 time=86.5 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=6 ttl=254 time=102 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=7 ttl=254 time=30.7 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=8 ttl=254 time=53.6 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=9 ttl=254 time=77.2 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=10 ttl=254 time=102 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=11 ttl=254 time=25.2 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=12 ttl=254 time=45.5 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=13 ttl=254 time=1810 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=14 ttl=254 time=787 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=15 ttl=254 time=296 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=16 ttl=254 time=421 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=17 ttl=254 time=750 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=18 ttl=254 time=466 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=19 ttl=254 time=285 ms
64 bytes from esp32-s3-base.local (172.27.30.1): icmp_seq=20 ttl=254 time=102 ms
--- esp32-s3-base.local ping statistics ---
20 packets transmitted, 20 received, 0% packet loss, time 19036ms
rtt min/avg/max/mdev = 8.649/281.088/1809.820/418.520 ms, pipe 2
root@MYHOSTNAME:~# traceroute esp32-s3-base.local
traceroute to esp32-s3-base.local (172.27.30.1), 30 hops max, 60 byte packets
1 192.168.1.7 (192.168.1.7) 0.513 ms 0.431 ms 0.436 ms
2 192.168.4.31 (192.168.4.31) 0.825 ms 0.884 ms 0.989 ms
3 esp32-s3-base.local (172.27.30.1) 62.587 ms 66.768 ms 66.756 ms
root@MYHOSTNAME:~# tcptraceroute esp32-s3-base.local
Running:
traceroute -T -O info esp32-s3-base.local
traceroute to esp32-s3-base.local (172.27.30.1), 30 hops max, 60 byte packets
1 192.168.1.7 (192.168.1.7) 0.450 ms 0.450 ms 0.399 ms
2 192.168.4.31 (192.168.4.31) 1.086 ms 0.880 ms 0.734 ms
3 esp32-s3-base.local (172.27.30.1) <rst,ack> 53.865 ms 58.118 ms 61.274 ms
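To make latency spikes like the ones above easier to spot in a longer run, here is a minimal sketch that filters ping output and prints only the replies slower than a threshold. The function name `slow_pings` and the 200 ms default are my own assumptions, not anything from ESPHome.

```shell
#!/bin/bash
# Print every ping reply whose round-trip time exceeds a threshold (in ms).
# Reads standard Linux `ping` output on stdin.
# ASSUMPTION: slow_pings and the 200 ms default are illustrative only.
slow_pings() {
    local threshold_ms="${1:-200}"
    awk -v limit="$threshold_ms" '
        # Reply lines carry "time=<rtt> ms"; the summary lines do not.
        match($0, /time=[0-9.]+/) {
            rtt = substr($0, RSTART + 5, RLENGTH - 5)
            if (rtt + 0 > limit + 0) {
                match($0, /icmp_seq=[0-9]+/)
                seq = substr($0, RSTART + 9, RLENGTH - 9)
                printf "seq=%s rtt=%s ms\n", seq, rtt
            }
        }
    '
}
```

Usage would be something like `ping -c 100 esp32-s3-base.local | slow_pings 200`. On the run above, that would list the replies from icmp_seq=13 to icmp_seq=19.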
Out of desperation, I tried flashing the firmware from the Raspberry Pi that is running the AP, so it sits DIRECTLY in the same subnet as the esp32 device being flashed. Copy the already-compiled firmware to the Raspberry Pi first; you do NOT want to wait for all the time it takes to compile there (Raspberry Pi 2 here ...):
Bash:
#!/bin/bash
# Activate Python venv
source ~/ESPHome/venv/bin/activate
# Flash Device
esphome upload esp32-s3-base.yaml --file firmware.bin --device esp32-s3-base.local
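Since the failures seem to go hand in hand with high latency, it may be worth checking the link before starting the upload. This is a rough sketch only: the helper names `avg_rtt_ms` and `rtt_below` and the 100 ms limit are assumptions of mine, and a good average does not guarantee that OTA will succeed.

```shell
#!/bin/bash
# Extract the average RTT (ms) from the summary line of Linux ping output.
# Expects a line like: rtt min/avg/max/mdev = 8.649/281.088/1809.820/418.520 ms
# ASSUMPTION: both helper names are hypothetical, made up for this example.
avg_rtt_ms() {
    awk -F'/' '/^rtt / { print $5 }'
}

# Succeed (exit status 0) if the first number is below the second.
rtt_below() {
    awk -v rtt="$1" -v limit="$2" 'BEGIN { exit !(rtt + 0 < limit + 0) }'
}

# Example (not executed here):
#   avg="$(ping -c 10 esp32-s3-base.local | avg_rtt_ms)"
#   if rtt_below "$avg" 100; then
#       esphome upload esp32-s3-base.yaml --file firmware.bin --device esp32-s3-base.local
#   else
#       echo "Average RTT ${avg} ms looks too high, fix the network first" >&2
#   fi
```

With the ping run shown earlier, the average was about 281 ms, so a 100 ms limit would have refused to start the upload.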
The root cause turned out to be a setting I had enabled (on the Rock 5B in the garage, or on the Raspberry Pi that I have in the house for some warmer-environment testing) in /etc/sysctl.conf (for Debian/Raspberry Pi OS/Ubuntu and similar GNU/Linux distributions):
That's probably causing some serious flooding of the different subnets, which are now "merged" into one, without NAT/masquerading or anything else to take care of these conflicts.
In past versions this was done with iptables, but nowadays it has been replaced by nftables, which is configured very differently.
That's what I need to figure out right now ....
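As a possible starting point for the nftables side, here is a minimal sketch of a masquerade rule, wrapped in a small shell function that only prints the ruleset. It is untested against this setup: the function name `masquerade_ruleset` and the interface name `wlan0` are placeholders I made up, and I can't say whether masquerading alone cures the flooding.

```shell
#!/bin/bash
# Print an nftables ruleset that masquerades traffic leaving one interface.
# ASSUMPTION: the function name and the interface passed in are placeholders,
# not values taken from the setup described above.
masquerade_ruleset() {
    local out_iface="$1"
    cat <<EOF
table ip nat {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        oifname "${out_iface}" masquerade
    }
}
EOF
}
```

The output can be checked without applying it via `masquerade_ruleset wlan0 | sudo nft -c -f -`, and loaded with `masquerade_ruleset wlan0 | sudo nft -f -`. Printing the ruleset first makes it easy to review before touching the running firewall.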