Closing the Loop — The Heater Obeys
The Shelly Plus Plug UK arrived. The radiator is now under software control.
The smart plug
The Shelly Plus Plug UK sits between the wall socket and the radiator heater. It exposes a local HTTP API, so a single curl call turns the relay on or off. No cloud, no app, no subscription. Just a GET request with Digest auth.
First test from the Pi:
curl -s --digest -u "admin:$SHELLY_PASS" "http://192.168.1.200/relay/0?turn=on"
# → {"ison": true, ...}
The relay clicks. The radiator starts warming. Another curl, it stops. This is the moment the system transitions from passive monitoring to active control.
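For use from a script, the same call can be wrapped in a few lines of Python. This is a minimal sketch, not the code running on the Pi: it shells out to the same curl command shown above instead of reimplementing Digest auth, and the helper names (`relay_url`, `parse_ison`, `set_relay`) are made up for illustration. The only response field it relies on is `ison`, as seen in the output above.

```python
import json
import subprocess


def relay_url(host: str, turn_on: bool) -> str:
    """Build the same URL the curl test uses."""
    return f"http://{host}/relay/0?turn={'on' if turn_on else 'off'}"


def parse_ison(body: str) -> bool:
    """Extract the relay state from the plug's JSON reply."""
    return bool(json.loads(body)["ison"])


def set_relay(host: str, password: str, turn_on: bool, timeout: float = 5.0) -> bool:
    """Switch the relay via curl and return the state the plug reports.

    Passing the arguments as a list avoids shell quoting problems with
    the '?' in the URL. Raises if curl fails or times out.
    """
    result = subprocess.run(
        ["curl", "-s", "--digest", "-u", f"admin:{password}",
         relay_url(host, turn_on)],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return parse_ison(result.stdout)
```

A caller would run something like `set_relay("192.168.1.200", os.environ["SHELLY_PASS"], True)` and treat any exception as "relay state unknown".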
Why not let Claude control the heater?
The obvious approach: give the Claude agent direct access to the Shelly, add decision rules to its CLAUDE.md, and let it turn the heater on and off every 15 minutes.
The problem is thermal inertia. A room doesn’t respond to a heater instantly: there’s a lag of minutes between turning the heater on and seeing the temperature rise at the sensor. A 15-minute control loop is too slow. By the time the agent sees that the temperature has overshot, it may have been overshooting for up to 14 minutes. And with only 4 readings per hour, the control signal is too coarse for tight regulation.
You need a fast loop for the physics and a slow loop for the strategy.
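To put rough numbers on that, here is a toy simulation. Everything in it is an assumption made for illustration: the room is modelled as a single first-order system, and the heating rate, loss coefficient and ambient temperature are invented, not measured. It only shows the effect of the decision interval, not of sensor lag.

```python
def simulate_peak(period_s, hours=4.0, ambient=18.0, start=18.0,
                  heat_rate=0.01, loss_coeff=0.0005,
                  lower=22.6, upper=23.0):
    """Return the highest temperature reached under on/off control when
    the controller only makes a decision every `period_s` seconds.

    heat_rate is degrees C per second added while the heater is on;
    loss_coeff is the fraction of the gap to ambient lost per second.
    Both are made-up values.
    """
    temp = start
    heater_on = False
    peak = temp
    for t in range(int(hours * 3600)):
        if t % period_s == 0:  # the controller wakes up and decides
            if temp < lower:
                heater_on = True
            elif temp > upper:
                heater_on = False
        # one-second Euler step of dT/dt = heating - losses
        heating = heat_rate if heater_on else 0.0
        temp += heating - loss_coeff * (temp - ambient)
        peak = max(peak, temp)
    return peak


if __name__ == "__main__":
    for period in (30, 900):
        print(f"decision every {period:>3}s -> peak {simulate_peak(period):.2f} C")
```

With these invented parameters, the 30-second loop peaks a couple of tenths of a degree above the upper limit, while the 15-minute loop overshoots by more than two degrees. The real room will behave differently, but the direction of the effect is the point.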
Two-layer architecture
FAST LOOP (thermostat.py, every 30s)
  Reads sensor → compares to config → controls Shelly relay
  Simple, deterministic, no AI

SLOW LOOP (Claude agent, every 15 min)
  Reviews thermostat performance → adjusts config if needed
  Intelligent, adaptive, can reason about trends
Layer 1 is a Python daemon running as a systemd service. Every 30 seconds, it reads the latest temperature from the Flask API, compares it against a setpoint with hysteresis bands, and turns the relay on or off. Bang-bang control — the simplest algorithm that works for a room heater.
The configuration lives in a JSON file:
{
  "setpoint": 22.8,
  "lower_band": 22.6,
  "upper_band": 23.0,
  "enabled": true
}
The thermostat re-reads this file every cycle. Change the setpoint, and within 30 seconds the new value is active. No restart needed.
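The decision itself fits in one small function. The following is a sketch of bang-bang control with hysteresis using the config keys shown above, not the actual contents of thermostat.py; the function names are hypothetical, and treating `"enabled": false` as "heater off" is an assumption about how the daemon interprets that flag.

```python
import json


def load_config(path):
    """Re-read the JSON config; called once per cycle so edits take effect."""
    with open(path) as f:
        return json.load(f)


def decide(temp, heater_on, cfg):
    """Return the desired relay state for one thermostat cycle.

    Below the lower band: heat. Above the upper band: stop.
    In between: keep doing whatever the heater is already doing,
    which is what prevents rapid on/off chatter around the setpoint.
    """
    if not cfg["enabled"]:
        return False
    if temp < cfg["lower_band"]:
        return True
    if temp > cfg["upper_band"]:
        return False
    return heater_on
```

For example, at 22.8°C the function returns `True` if the heater was already on and `False` if it was off. That memory of the previous state is the whole difference between hysteresis and a single threshold.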
Layer 2 is the Claude agent, which already runs every 15 minutes. Instead of directly controlling the heater, it now reviews the thermostat’s log — how often is the relay cycling? Is the temperature holding near the setpoint? Are the hysteresis bands too narrow (causing rapid cycling) or too wide (causing big swings)? If adjustment is needed, it modifies the JSON config file and the thermostat picks it up on the next cycle.
The agent can also disable the thermostat entirely by setting "enabled": false — a kill switch for when something goes wrong.
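The kind of review described for Layer 2 can be approximated mechanically. The sketch below summarises thermostat log lines in the `[HH:MM:SS] [LEVEL] ...` format that appears in the log excerpts in this post. It assumes that each `[ACTION]` line marks one relay switch and each `[ERROR]` line one fail-safe event, which is an inference from those excerpts, and `summarize` is a hypothetical helper, not something the agent is known to run.

```python
import re

# Matches lines such as: [23:52:28] [INFO] temp=20.5 | heater=ON
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\] \[(\w+)\] (.*)$")
TEMP_RE = re.compile(r"temp=(-?\d+(?:\.\d+)?)")


def summarize(lines):
    """Count relay switches and fail-safes, and track the temperature range."""
    switches = 0
    fail_safes = 0
    temps = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore anything that is not a thermostat log line
        level, rest = match.group(4), match.group(5)
        if level == "ERROR":
            fail_safes += 1
            continue  # temperatures on error lines may be stale
        if level == "ACTION":
            switches += 1
        temp = TEMP_RE.search(rest)
        if temp:
            temps.append(float(temp.group(1)))
    return {
        "switches": switches,
        "fail_safes": fail_safes,
        "readings": len(temps),
        "min_temp": min(temps) if temps else None,
        "max_temp": max(temps) if temps else None,
    }
```

A high switch count over an hour would suggest the bands are too narrow; a wide gap between `min_temp` and `max_temp` would suggest they are too wide.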
Above these two sits a third, informal layer: the outer loop on the laptop, where Claude and I review both the thermostat and the agent, adjust strategy, and plan experiments. Three layers, three timescales: 30 seconds, 15 minutes, on-demand.
Fail-safe first
The thermostat was designed around one principle: if anything goes wrong, the heater turns off. A cold room is survivable. An overheated room kills cultures.
- Sensor API unreachable → heater OFF
- Sensor reading older than 5 minutes → heater OFF
- Shelly unreachable → heater OFF
- Config file unreadable → heater OFF
- Process crashes → systemd ExecStopPost curls heater OFF
- Process receives SIGTERM → signal handler turns heater OFF
Every failure mode defaults to cold.
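One way to get that behaviour is to funnel every input through a single function that can only answer ON when everything checks out. This is a sketch under assumptions, not the deployed code: `read_sensor` and `read_config` are hypothetical callables, the reading is assumed to carry an epoch `timestamp`, and the NaN check is an extra guard that is not in the list above. It does not cover the Shelly-unreachable, crash or SIGTERM cases, which need handling outside the decision logic.

```python
import math
import time

MAX_READING_AGE_S = 300  # "older than 5 minutes" from the list above


def safe_heater_state(read_sensor, read_config, heater_on, now=None):
    """Return (desired_relay_state, reason). Any failure resolves to OFF.

    read_sensor() is assumed to return {"temp": float, "timestamp": epoch_s};
    read_config() returns the config dict. Either may raise.
    """
    now = time.time() if now is None else now
    try:
        reading = read_sensor()
        cfg = read_config()
        temp = float(reading["temp"])
        age = now - float(reading["timestamp"])
        lower, upper = float(cfg["lower_band"]), float(cfg["upper_band"])
        enabled = bool(cfg["enabled"])
    except Exception as exc:  # unreachable API, unreadable file, missing key
        return False, f"FAIL-SAFE: {type(exc).__name__}"
    if math.isnan(temp):
        return False, "FAIL-SAFE: sensor returned NaN"
    if age > MAX_READING_AGE_S:
        return False, f"FAIL-SAFE: reading {age:.0f}s old (stale)"
    if not enabled:
        return False, "disabled"
    if temp < lower:
        return True, "below lower band"
    if temp > upper:
        return False, "above upper band"
    return heater_on, "inside band, holding"
```

The design choice is that ON is the only answer that has to be earned; every early return is OFF.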
The first real failure
Within minutes of deploying the thermostat, it proved its value — by doing nothing.
The ESP32 had been offline since 16:50. The last sensor reading was almost 7 hours old: 18.1°C / 59.9%. The thermostat detected the stale timestamp and refused to act:
[23:40:27] [ERROR] FAIL-SAFE — reading 24605s old (stale), heater OFF | temp=18.1
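The FUMMDUS LCD hygrometer, visible in the camera photos, showed the room was actually 20.8°C. Both values sit below the 22.6°C lower band, so the heater would have switched on either way; the danger was that a reading frozen at 18.1°C could never tell the thermostat to switch it off again. The staleness check prevented that.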
Meanwhile, the Claude agent correctly reported what was happening:
- **Thermostat:** FAIL-SAFE — reading ~24700s stale, heater held OFF; correct behavior
Two independent systems — one dumb, one smart — both arrived at the right conclusion: don’t act on bad data.
Debugging the ESP32
Why was the ESP32 offline? The USB device was still present (/dev/ttyUSB0), the ESP32 responded to ping at its WiFi IP, but it wasn’t posting readings.
First attempt: reset the ESP32 via the serial DTR line. The CP2102 USB-UART bridge connects DTR to the ESP32’s EN (reset) pin — the same mechanism esptool uses before flashing. A quick Python script:
import time
import serial  # pyserial

# Pulse DTR to reset the ESP32 through the USB-UART bridge.
s = serial.Serial("/dev/ttyUSB0", 115200)
s.dtr = False
time.sleep(0.1)
s.dtr = True
s.close()
The ESP32 rebooted. Connected to WiFi. Then: Error: failed to read from DHT22 sensor. Every 30 seconds, the same error. The sensor was returning NaN on every read.
The fix wasn’t software. The firmware has no retry logic, but that wasn’t the root cause — the DHT22 had a loose wire. A physical reseat of the jumper wires, another DTR reset, and readings started flowing immediately:
[23:51:58] [ACTION] temp=20.5 | heater=ON | ACTION=ON (temp 20.5 < lower 22.6)
[23:52:28] [INFO] temp=20.5 | heater=ON
[23:52:58] [INFO] temp=21.4 | heater=ON
[23:53:28] [INFO] temp=21.6 | heater=ON
The thermostat saw 20.5°C — below the 22.6°C lower band — and turned the heater on. Within two minutes, the temperature was climbing.
Lesson: When a sensor fails, check the wires before writing software workarounds. The DTR reset was a useful diagnostic tool (it confirmed the ESP32 itself was fine), but the fix was physical. Software can’t fix a disconnected wire, and an auto-reset watchdog would have just boot-looped the ESP32 every 10 minutes while the DHT22 stayed broken.
Flask proxy endpoints
The Shelly’s local IP (192.168.1.200) is only reachable on the home network. To control the heater remotely — from the office, from a phone, or from the outer-loop laptop — the Flask server now proxies Shelly requests:
# From anywhere via Cloudflare tunnel
ssh daniel@ssh.egc.land "curl -s 'http://localhost:5000/shelly?key=tclab2026'"
ssh daniel@ssh.egc.land "curl -s 'http://localhost:5000/shelly/on?key=tclab2026'"
The Flask proxy handles Digest authentication internally. The outer loop doesn’t need the Shelly’s password — it authenticates to Flask with the API key, and Flask talks to the Shelly.
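The logic behind such an endpoint is small. The sketch below leaves out the Flask route and the Digest call to the plug and shows only the part a view function might delegate to: check the API key, then map the requested action to a Shelly path. The mapping is an assumption: the `off` action and the bare `/relay/0` status path are guesses at how the real proxy is wired, and `resolve_proxy_request` is a hypothetical name.

```python
import hmac

# Assumed mapping from proxy action to the Shelly path it forwards to.
SHELLY_PATHS = {
    "status": "/relay/0",
    "on": "/relay/0?turn=on",
    "off": "/relay/0?turn=off",
}


def resolve_proxy_request(action, supplied_key, expected_key):
    """Return (http_status, shelly_path_or_error_message).

    The key comparison uses hmac.compare_digest so that the time taken
    does not reveal how many leading characters of the key were right.
    """
    supplied = (supplied_key or "").encode()
    if not hmac.compare_digest(supplied, expected_key.encode()):
        return 403, "invalid key"
    path = SHELLY_PATHS.get(action)
    if path is None:
        return 404, "unknown action"
    return 200, path
```

A Flask view for `/shelly/on` would call `resolve_proxy_request("on", request.args.get("key"), API_KEY)` and, on a 200, forward the returned path to the plug with the stored Digest credentials.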
Similarly, the thermostat config is accessible remotely:
# Read config and recent actions
ssh daniel@ssh.egc.land "curl -s 'http://localhost:5000/thermostat?key=tclab2026'"
# Adjust setpoint remotely
ssh daniel@ssh.egc.land 'curl -s -X POST -H "Content-Type: application/json" \
  -d "{\"setpoint\": 23.0}" "http://localhost:5000/thermostat?key=tclab2026"'
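A POST like this ends with the config file being rewritten while the thermostat may be in the middle of reading it. Since an unreadable config trips the fail-safe, a half-written file would switch the heater off for no good reason. One common way to avoid that is to write to a temporary file and rename it over the original. The sketch below assumes the config is a flat JSON object; `update_config` is a hypothetical helper, not the server's actual handler.

```python
import json
import os
import tempfile


def update_config(path, **changes):
    """Merge `changes` into the JSON config at `path` and replace it atomically.

    The new content goes to a temp file in the same directory, and
    os.replace() then swaps it in with a single rename, so a reader sees
    either the old file or the new one, never a partial write.
    """
    with open(path) as f:
        cfg = json.load(f)
    cfg.update(changes)

    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(cfg, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk first
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)  # do not leave stray temp files behind
        raise
    return cfg
```

Calling `update_config("thermostat.json", setpoint=23.0)` changes only the setpoint and leaves the bands and the `enabled` flag as they were.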
Where things stand
| Item | Status |
|---|---|
| Blog | Live at blog.egc.land |
| Agent | Running 24/7, now reviews thermostat |
| Thermostat | Running 24/7, bang-bang with hysteresis |
| Shelly | Connected at 192.168.1.200, Digest auth |
| Camera | 4608×2592, auto-capture every cycle |
| Sensor cross-ref | DHT22 vs FUMMDUS LCD, validated |
| ESP32 | Back online after wire reseat |
| Environment | Active temperature control, targeting 22.8°C |
The system now has closed-loop temperature control. The thermostat reads the sensor, decides, acts, and the room responds. The Claude agent watches the thermostat, and I watch the agent. Three loops, three timescales, one room temperature.
Next: wait for the temperature to stabilise and see how tightly the thermostat holds the setpoint. If the hysteresis bands are right, the relay should cycle a few times per hour and the temperature should stay within ±0.2°C of 22.8°C.