IoT Video Intercom System Development - Indeema’s Case Study

Short Overview

Our partner, a German company specializing in building communication systems (name withheld under NDA), approached Indeema to develop a next-generation indoor intercom prototype. Their goal was to bridge analog video input from a door camera with smartphone connectivity, creating a more modern, user-friendly alternative to traditional in-building communication solutions.

Project Details

  • Industry: High-Tech, Smart Home
  • Services: IoT Development, R&D Services, IoT Consulting Services, Firmware Development
  • Lifetime: 2025
  • Client's Location: Germany

Project background

The client approached Indeema with a request to modernize their indoor communication system. They aimed to develop a monitor based on the ESP32-P4 platform that could receive analog video from a front-door camera and stream it directly to a resident’s smartphone. The project required evaluating the capabilities of the new hardware and delivering a prototype that would validate system performance in real-world use.

What was the customer's request?

  • The key requirement was to enable the device to receive PAL video from an analog front-door camera, convert it, and transmit it to a mobile app on the resident’s smartphone. The goal was to ensure that residents could view visitors in real time through their phones, making the system more convenient and better aligned with modern user habits.

What did the client already have?

  • By the time we joined the project, the partner had already formed a general concept of the prototype and outlined its core functionality. However, they required an external development team with solid experience in Espressif SoC–based solutions to turn that concept into a working product. Our role was to bring the technical know-how and hands-on implementation needed to move from idea to execution.

Solution we delivered

Where did we start?

The collaboration began with a kick-off meeting where the partner’s team shared their vision for the product and outlined the user challenges it aimed to address. This session was key to aligning on the business goals and understanding the practical context behind the idea. Our team paid special attention to the rationale for selecting the Espressif ESP32-P4 platform, as this decision directly influenced the technical architecture and system capabilities moving forward.

  • Requirements phase

    Following the initial alignment, the Indeema team moved into the Discovery phase to refine the product concept into actionable requirements. Together with the client, we created a detailed Functional Requirements Document that outlined all key features and expected behaviors. In parallel, the hardware components needed for the prototype were identified and procured. Once the FRD was finalized, a technical specification was prepared — covering system constraints, implementation strategy, and hardware integration details to guide the development process.

  • Prototyping and Wireframes

    To test hardware feasibility, a prototype was built using the ESP32-P4 platform. It digitized PAL video from an analog source, displayed the stream on an LCD, encoded it to H.264, and transmitted it in real time to a PC. All components were based on manufacturer-recommended parts. The setup helped evaluate latency, performance, and video quality across the system.
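
The need for the H.264 encoding step in the prototype described above can be illustrated with a rough data-rate estimate. The sketch below is only an approximation under stated assumptions: a 720x576 active picture at 25 frames per second, YUV 4:2:2 at 2 bytes per pixel, and a hypothetical 2 Mbit/s encoder target. None of these figures come from the project itself.

```python
# Illustrative arithmetic only; the frame size, pixel format and target
# bitrate below are assumptions, not figures from the project.

PAL_WIDTH = 720          # active pixels per line (BT.601 sampling, assumed)
PAL_HEIGHT = 576         # active lines per frame
PAL_FPS = 25             # PAL frame rate
BYTES_PER_PIXEL = 2      # YUV 4:2:2 with 8 bits per sample (assumed)


def raw_bitrate_bps(width: int, height: int, fps: int, bytes_per_pixel: int) -> int:
    """Return the uncompressed video bitrate in bits per second."""
    return width * height * bytes_per_pixel * fps * 8


def compression_ratio(raw_bps: int, target_bps: int) -> float:
    """Return how many times the encoder must shrink the raw stream."""
    return raw_bps / target_bps


if __name__ == "__main__":
    raw = raw_bitrate_bps(PAL_WIDTH, PAL_HEIGHT, PAL_FPS, BYTES_PER_PIXEL)
    target = 2_000_000  # hypothetical H.264 target bitrate (2 Mbit/s)
    print(f"raw: {raw / 1e6:.3f} Mbit/s")
    print(f"required compression: about {compression_ratio(raw, target):.0f}x")
```

Under these assumptions the digitized stream is roughly 166 Mbit/s before compression, which is why encoding on the device matters before the video goes over a wireless link.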

Our Development Process

  • Firmware/Embedded Development

    To enable the key hardware features, we configured the ESP32-P4 to support a DSI display, a CSI camera input, and audio input. The video stream was encoded in H.264 and transmitted over Wi-Fi for real-time remote access. Because the standard RTSP component did not meet the project's streaming needs, we implemented a custom version tailored to the system. We then optimized this setup to reduce latency and deliver smooth, responsive video, a critical requirement for any smart intercom system.

  • DevOps and Cloud Development

    To support efficient development and version control, a GitLab repository was set up along with a lightweight CI/CD pipeline. This setup allowed for streamlined code integration, automated builds, and faster iteration throughout the project. While minimal, this DevOps approach ensured stability during development and helped maintain consistent delivery of firmware updates.
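
To give a sense of what the custom RTSP component mentioned under Firmware/Embedded Development has to deal with, here is a minimal sketch of RTSP/1.0 message handling as defined in RFC 2326. It is not the project's code (the firmware is written in C/C++); the function names are hypothetical, and the sketch only answers OPTIONS, leaving out session setup and media transport entirely.

```python
# Sketch of RTSP/1.0 request parsing and response building (RFC 2326).
# Hypothetical helper names; this is not the firmware's implementation.
from typing import Dict, Optional

# Only OPTIONS is implemented here. A streaming server would also need
# DESCRIBE, SETUP, PLAY and TEARDOWN, which require SDP and session state.
IMPLEMENTED_METHODS = ("OPTIONS",)


def parse_request(raw: str) -> dict:
    """Split an RTSP request into method, URL, version and headers."""
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    method, url, version = lines[0].split(" ", 2)
    headers: Dict[str, str] = {}
    for line in lines[1:]:
        if not line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {"method": method, "url": url, "version": version, "headers": headers}


def build_response(status: int, reason: str, cseq: str,
                   extra: Optional[Dict[str, str]] = None) -> str:
    """Build a response that echoes the CSeq header, as RTSP requires."""
    lines = [f"RTSP/1.0 {status} {reason}", f"CSeq: {cseq}"]
    for name, value in (extra or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


def handle_request(raw: str) -> str:
    """Answer OPTIONS; report every other method as not implemented."""
    request = parse_request(raw)
    cseq = request["headers"].get("cseq", "0")
    if request["method"] == "OPTIONS":
        return build_response(200, "OK", cseq,
                              {"Public": ", ".join(IMPLEMENTED_METHODS)})
    return build_response(501, "Not Implemented", cseq)


if __name__ == "__main__":
    # Example address and path are placeholders.
    demo = "OPTIONS rtsp://192.168.4.1/live RTSP/1.0\r\nCSeq: 1\r\n\r\n"
    print(handle_request(demo))
```

The text-based request/response exchange is the simple part of the protocol. Most of the effort in a real component goes into session handling and timely delivery of the encoded video, which this sketch deliberately omits.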

The Team Involved In The Project

  • Engineers: 2
  • Technical Lead: 1
  • Project Manager: 1

Project Challenges And Our Suggestions

  • RTSP Limitations and Integration Complexity

    The RTSP component provided by Espressif wasn’t designed for full multimedia use, and its source code wasn’t publicly available, which significantly limited integration flexibility. We addressed this by developing our own RTSP component, enabling more reliable video streaming within the smart intercom system.

  • Color Synchronization Issues

    Achieving accurate color output from the analog video stream proved challenging due to discrepancies at both hardware and software levels. Fine-tuning and calibration were performed to ensure consistent and true-to-source image rendering.

  • Latency Optimization for Real-Time Streaming

    Low latency was essential for a smooth user experience. By utilizing the ESP32-P4’s built-in hardware codec, we successfully reduced video transmission delays and enhanced the responsiveness of the video door intercom system.
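
One way to reason about the latency challenge above: at PAL's 25 frames per second a new frame arrives every 40 ms, so each whole frame waiting in a buffer between capture, encoding, and transmission adds another 40 ms of delay. The sketch below shows that arithmetic; the queue depths are hypothetical and are not measurements from the project.

```python
# Back-of-the-envelope buffering latency. The queue depths are hypothetical;
# only the 25 fps PAL frame rate is a property of the video signal.

PAL_FPS = 25


def frame_period_ms(fps: int) -> float:
    """Return the time between consecutive frames in milliseconds."""
    return 1000.0 / fps


def buffering_latency_ms(frames_queued: int, fps: int = PAL_FPS) -> float:
    """Return the delay added by holding whole frames in a pipeline queue."""
    return frames_queued * frame_period_ms(fps)


if __name__ == "__main__":
    for depth in (1, 2, 4):
        print(f"{depth} frame(s) queued -> {buffering_latency_ms(depth):.0f} ms")
```

This is only the buffering term. Encoding time and network delay come on top of it, which is where a hardware encoder helps by keeping the per-frame encoding time short.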

Impact

Together, we built a reliable, future-ready indoor intercom prototype. It connects an analog door camera to a smartphone, so the new video intercom can work with existing infrastructure and doesn’t require major hardware changes. Indeema’s engineers helped the partner test key features and move closer to a market-ready product.

Before And After Cooperation With Indeema

Before:

  • The customer had an idea for an indoor intercom system;

  • The client had provided detailed requirements and a preliminary architecture document;

  • ESP32-P4 suggested as the target hardware platform, but not yet validated;

  • No firmware or video processing implementation;

  • No custom RTSP streaming solution;

  • No integration with mobile app functionality.

After:

  • Fully functional indoor monitor developed and tested, with all key features implemented and integrated, ready for productization;

  • The ESP32-P4 platform has been practically demonstrated to be a suitable choice;

  • Prototype developed with PAL video digitization and H.264 encoding;

  • DSI display and CSI camera integrated and configured;

  • Custom audio driver and RTSP component developed;

  • Video stream successfully transmitted to PC via Wi-Fi;

  • Latency optimized using hardware codec;

  • Clear architecture laid out for further development of the video door intercom system.

Technical Highlights

Technologies

  • C/C++
  • ESP-ADF
  • ESP-IDF
  • RTSP

Silicon

  • ESP32-P4
  • ESP32-C6
  • TW9992 video grabber