My building got a new intercom a while ago — a Fermax Blue system. The panel outside talks to a little screen inside each apartment, and there's an optional iOS app that does roughly the same thing from your phone. It works. I wanted to understand how.
By the end of that Saturday — I was on vacation, nothing better to do — I had a full replacement for the app running in my browser: login, door unlock, live video from the outdoor panel. I also had an accidental proof of concept that any authenticated Fermax user could join any other user's video session and watch their camera.
This is how that went.
Note on disclosure. I reported this to Fermax before publishing and I'm keeping the sensitive details out of this post on purpose — no credentials, no request payloads that aren't in a capture screenshot, no PoC code. Some bits are deliberately vague. I'll update the post once the patch is out.
No SSL pinning
The first thing I checked was whether the iOS app pinned its certificates. If it did, capturing traffic would mean patching the binary, which I wasn't going to do on a weekend.
It didn't. mitmproxy worked on the first try.
I wrote a small mitmdump addon to filter Fermax hostnames and dump each request/response as JSON. Three captures were enough: one session with login + door open + a video call, one with logout/login, one triggering "live view" from the app.
Mapping the API
From the captures I built an API_MAP.md — a full reference of every endpoint the app touches. OAuth2 with the resource owner password credentials grant (the client_id and client_secret are hardcoded in the iOS binary). REST endpoints for pairings, devices, panels, subscriptions. A single POST to open the door. And a signaling layer for video calls that I didn't understand yet.
The relevant shape, abbreviated:
AUTH
POST /oauth/token ← password grant, hardcoded client creds
POST /oauth/token/revoke
USER & PAIRING
GET /user/api/v1/users/me
GET /pairing/api/v4/pairings/me
DEVICES
GET /deviceaction/api/v1/device/{id}
GET /deviceaction/api/v1/device/{id}/panels
GET /services2/api/v1/services/{id}
SUBSCRIPTIONS
GET /subscriptionnowifi/api/v1/plans
GET /subscriptionnowifi/api/v1/subscription/{logicalId}
ACTIONS
POST /deviceaction/api/v1/device/{panelId}/directed-opendoor
POST /deviceaction/api/v2/device/{panelId}/autoon ← wakes the panel camera
NOTIFICATIONS
POST /notification/api/v1/apptoken
GET /notification/api/v1/mutedevice/me
SIGNALING (Socket.IO)
→ join_call
→ transport_consume
→ transport_connect
← on-browser-autoon broadcast
← end_up
The autoon endpoint is the door into the video flow. The on-browser-autoon event is where things go wrong later. Neither of those was obvious at this point; I just had a list.
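For the plain REST calls, the request shape is simple enough to sketch. This is a minimal illustration, not my actual client: baseUrl is a placeholder (the real host isn't in this post), and restRequest and endpoints are names made up for the example. Only the paths and the Bearer header come from the captures.

```typescript
// Paths taken from the endpoint list above; IDs are URL-encoded defensively.
const endpoints = {
  me: (): string => "/user/api/v1/users/me",
  pairings: (): string => "/pairing/api/v4/pairings/me",
  device: (id: string): string =>
    "/deviceaction/api/v1/device/" + encodeURIComponent(id),
  panels: (id: string): string =>
    "/deviceaction/api/v1/device/" + encodeURIComponent(id) + "/panels",
};

interface RestRequest {
  url: string;
  init: { method: "GET" | "POST"; headers: { Authorization: string; Accept: string } };
}

// Builds the URL and headers for one call. baseUrl is a hypothetical placeholder.
function restRequest(
  baseUrl: string,
  accessToken: string,
  method: "GET" | "POST",
  path: string,
): RestRequest {
  return {
    url: baseUrl.replace(/\/+$/, "") + path,
    init: {
      method,
      headers: {
        Authorization: "Bearer " + accessToken, // the REST side wants the Bearer prefix
        Accept: "application/json",
      },
    },
  };
}
```

The result can be handed straight to fetch(url, init). Keeping the builder pure makes it easy to test without touching the network.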
The web client
Turning the captures into a typed TypeScript client was the fastest part of the project. Auth with auto-refresh, a thin wrapper around each REST endpoint, a Next.js app with server actions and iron-session for the cookie.
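The auto-refresh part looks roughly like this. It's a sketch under assumptions: I'm assuming the token response has the standard OAuth2 fields (access_token, refresh_token, expires_in) and that refreshing uses the standard refresh_token grant. TokenManager and TokenRequest are names invented for the example, and the transport that actually posts to /oauth/token and attaches the client credentials is injected rather than shown.

```typescript
// Standard OAuth2 token response fields (assumed, not copied from a capture).
interface TokenResponse {
  access_token: string;
  refresh_token: string;
  expires_in: number; // seconds
}

// Hypothetical transport: posts the form to /oauth/token and returns parsed JSON.
type TokenRequest = (form: Record<string, string>) => Promise<TokenResponse>;

class TokenManager {
  private token: TokenResponse | null = null;
  private expiresAt = 0; // epoch milliseconds
  private inflight: Promise<string> | null = null;
  private readonly request: TokenRequest;
  private readonly now: () => number;
  private readonly skewMs: number;

  constructor(request: TokenRequest, now: () => number = Date.now, skewMs = 30_000) {
    this.request = request;
    this.now = now;
    this.skewMs = skewMs; // refresh this long before the real expiry
  }

  async login(username: string, password: string): Promise<void> {
    this.store(await this.request({ grant_type: "password", username, password }));
  }

  // Returns a usable access token, refreshing first if it is about to expire.
  getAccessToken(): Promise<string> {
    const token = this.token;
    if (!token) return Promise.reject(new Error("not logged in"));
    if (this.now() < this.expiresAt - this.skewMs) {
      return Promise.resolve(token.access_token);
    }
    if (!this.inflight) {
      // Concurrent callers share a single refresh request.
      const p = this.refresh(token.refresh_token);
      this.inflight = p;
      const clear = (): void => {
        if (this.inflight === p) this.inflight = null;
      };
      p.then(clear, clear);
    }
    return this.inflight;
  }

  private async refresh(refreshToken: string): Promise<string> {
    const next = await this.request({
      grant_type: "refresh_token",
      refresh_token: refreshToken,
    });
    this.store(next);
    return next.access_token;
  }

  private store(t: TokenResponse): void {
    this.token = t;
    this.expiresAt = this.now() + t.expires_in * 1000;
  }
}
```

Sharing the in-flight refresh matters in a Next.js app, where several server actions can ask for a token at the same moment.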
Login + dashboard + door unlock took a day.

The hard part: live video
The video signaling is Socket.IO on top of mediasoup, an SFU. There's no public documentation for the specific event protocol Fermax uses. I had to piece it together from the captures.
The steps, simplified:
- Hit the REST autoon endpoint to wake the panel camera
- The signaling server broadcasts an on-browser-autoon event with the new room ID
- Join the room over Socket.IO
- Negotiate the mediasoup transports and consume the video + audio
- ~30 seconds later the server sends end_up and the session ends
- Wait a second for the panel to release and start over
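As a control loop, the steps above come out something like the sketch below. Everything behind the LiveViewSteps interface is hypothetical: it stands in for the REST call, the Socket.IO handlers and the mediasoup negotiation, none of which are shown here. One choice in the sketch is mine, not something the captures dictate: it starts listening for the room event before waking the panel, so the broadcast can't slip past.

```typescript
// Hypothetical seams for the real work; names are invented for this sketch.
interface LiveViewSteps {
  nextRoom(): Promise<string>;         // resolves with the room ID from on-browser-autoon
  wake(): Promise<void>;               // the REST autoon call
  join(roomId: string): Promise<void>; // join_call, transports, consume
  ended(): Promise<void>;              // resolves when end_up arrives
}

const realSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs sessions back to back until keepGoing() returns false.
// Returns how many sessions were completed.
async function runLiveView(
  steps: LiveViewSteps,
  keepGoing: () => boolean,
  sleep: (ms: number) => Promise<void> = realSleep,
): Promise<number> {
  let sessions = 0;
  while (keepGoing()) {
    const room = steps.nextRoom(); // subscribe first, then wake
    await steps.wake();
    await steps.join(await room);
    await steps.ended();           // the server ends the session after ~30 seconds
    sessions += 1;
    await sleep(1000);             // give the panel a second to release
  }
  return sessions;
}
```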
Things that went wrong
A few bugs cost me real time:
- JWT without Bearer. The REST API wants Authorization: Bearer <token>. The signaling server wants the raw JWT in a different field, no prefix. I copied the REST pattern and got invalid token errors for an hour before I re-read the capture.
- Result wrapper. Responses from join_call and transport_consume look like {result: {...}, context: {...}}. I was reading fields from the top level. Everything was undefined and nothing made sense until I logged the raw payload.
- Two signaling servers. Sessions get created on either srv01 or srv02, load-balanced. You can't know which one in advance. I had to connect to both and race for the event.
- Preemptive reconnect doesn't work. My first attempt at handling the 30-second cutoff was to start a new session before the old one ended. The panel returned 409 device_busy. Reactive reconnect (wait for end_up, then restart) works.
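Two of those fixes are small enough to show in isolation. This is a sketch, not my client code: unwrap assumes only the {result, context} shape described above, and firstEvent races any number of sockets through a minimal on/off interface that I've called SignalSocket for the example (socket.io-client sockets expose on and off, but I'm not depending on the library here). The accept callback is a hypothetical hook for ignoring events that aren't the one you're waiting for.

```typescript
// Pulls the payload out of the {result, context} wrapper, or fails loudly.
function unwrap<T>(payload: unknown): T {
  if (typeof payload !== "object" || payload === null || !("result" in payload)) {
    throw new Error("unexpected signaling response: " + JSON.stringify(payload));
  }
  return (payload as { result: T }).result;
}

type Handler = (payload: unknown) => void;

// Minimal surface needed from a socket; an assumption made for this sketch.
interface SignalSocket {
  on(event: string, handler: Handler): void;
  off(event: string, handler: Handler): void;
}

// Listens on every socket and resolves with the first accepted event,
// reporting which socket delivered it. All listeners are removed afterwards.
function firstEvent(
  sockets: readonly SignalSocket[],
  event: string,
  accept: (payload: unknown) => boolean = () => true,
): Promise<{ index: number; payload: unknown }> {
  return new Promise((resolve) => {
    const subs: { socket: SignalSocket; handler: Handler }[] = sockets.map(
      (socket, index) => ({
        socket,
        handler: (payload: unknown): void => {
          if (!accept(payload)) return;
          for (const sub of subs) sub.socket.off(event, sub.handler);
          resolve({ index, payload });
        },
      }),
    );
    for (const sub of subs) sub.socket.on(event, sub.handler);
  });
}
```

The returned index tells you which server owns the session, which is the one to keep talking to for the rest of the call.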
After all that, the video came through. H.264 frames and G.711 audio from the panel, rendered in a <video> tag in my browser.
The accident
I was trying to grab my own panel ID — I needed it to connect to the right room — so I left the Socket.IO stream open and waited for on-browser-autoon to fire. Three different IDs came through. Only one of them was mine.
Hmm. Does this mean what I think it means?
It did. The server was broadcasting on-browser-autoon — the event that carries the room ID and device ID of every starting video session — to every connected socket. Not just the owner's session. Every session, to everyone.
That's already a leak. Knowing who is at someone else's door and when is not something a stranger should have.
But the worse part was whether the room itself was protected. I built a small audit page that listens to both signaling servers, displays broadcasts in real time, and puts a "Join" button next to each session. Then I tried joining a session that wasn't mine, using my own JWT.

The "Probe" button is a safer check than "Join" — it asks the server for session metadata using someone else's room/device IDs without actually joining or touching media. When privacy mode is active, the payload itself is hidden; only the shape of the response tells you whether the server answered or refused.

I clicked Join on a session that wasn't mine. The feed came up — someone's hallway, not mine.
It worked. join_call didn't check whether my token belonged to the device owner. Any authenticated Fermax user could join any active room.
What's actually wrong
Two issues, stacked:
- on-browser-autoon is scoped to the connection, not to the device. Anyone listening to the signaling server sees every starting session across all users.
- join_call authenticates the user but doesn't authorize the room. A valid JWT is enough; it doesn't have to be the right JWT.
The REST side is fine — asking for someone else's device data returns 403, the door-open endpoint returns 403, everything you'd hope for. The vulnerability lives in the signaling and media layer. Which is where the camera is.
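To make the two missing checks concrete, here is what they amount to in the abstract. I have no visibility into Fermax's server code, so this is a generic sketch and not their implementation or their fix: Pairings, Connection, Room and both functions are invented for illustration, and the only assumption is that the server can look up which devices a user is paired with (the REST side evidently can).

```typescript
// Hypothetical data model: which device IDs each user is paired with.
type Pairings = ReadonlyMap<string, readonly string[]>;

interface Connection { socketId: string; userId: string; }
interface Room { id: string; deviceId: string; }

function isPaired(pairings: Pairings, userId: string, deviceId: string): boolean {
  return (pairings.get(userId) ?? []).some((d) => d === deviceId);
}

// Issue 1: announce a starting session only to sockets whose user
// is paired with that device, instead of to every connected socket.
function recipientsFor(
  connections: readonly Connection[],
  deviceId: string,
  pairings: Pairings,
): Connection[] {
  return connections.filter((c) => isPaired(pairings, c.userId, deviceId));
}

// Issue 2: a valid token identifies the caller; joining also requires
// that the room's device belongs to that caller.
function canJoin(userId: string, room: Room | undefined, pairings: Pairings): boolean {
  return room !== undefined && isPaired(pairings, userId, room.deviceId);
}
```

Either check alone would have narrowed the problem. Scoping the broadcast hides the room IDs, and authorizing the join makes a leaked room ID useless.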
Disclosure
I've reported this to Fermax with a full write-up and a proof of concept. This post is the public-facing version and deliberately skips the details that would make exploitation easier.
I'll update this section once there's a patch.
What I took away from this
- Without certificate pinning, reverse-engineering the whole API took one Saturday. Most of that Saturday was me being confused, not clever.
- I wasn't hunting for a vulnerability. I was trying to get my panel ID. It found me.
- WebRTC is less scary than its reputation. The SFU does the media; the hard part is the protocol on top.
- A 403 on the REST side felt reassuring while I was building. It turned out to mean very little.
I'll publish the client source on GitHub once disclosure wraps up. The PoC stays private either way. If you work at Fermax and want those details, email me.