Video Streaming/Conferencing App Development Guide
A surge in video communication tools has become a catalyst for telemedicine, entertainment, e-learning, fitness, ecommerce, and other online businesses. If you want to add real-time video communication to your product, integrating a ready-to-use solution is the easiest route. However, everything changes drastically if you decide to develop a custom solution in which video streaming is a key feature.
In this article, we will share our expertise in video streaming/conferencing app development and integration, explain what communication protocols to opt for, dive deeper into video streaming architecture patterns, and consider real-life examples.
Classification of Video Service Platforms
All live-streaming business cases can be divided into three groups of technological solutions:
- One-to-many: These are services such as broadcasting (e.g., IPTV), VOD (Video on demand), and live streaming.
- One-to-one: One-on-one video calls, video chat.
- Many-to-many: Group calls, conferences with multiple hosts.
The chosen technological solution impacts the product’s architecture, defining the software roadmap and recommended real-time communication protocol.
Real-Time Communication Protocols for Building Video Services
The key decision in planning video streaming app development is the choice of a communication protocol, which defines how data is transmitted from one device or system to another over the Internet. From the perspective of content delivery, there are three main approaches associated with different transport protocols:
- MPEG-DASH/HLS – HTTP-based media streaming protocols used for cross-platform delivery of live or on-demand video content. For example, they can be used for TV streaming.
- WebRTC – a low-latency protocol designed for one-to-one video streaming. It can also be used in other cases where low latency is required. It needs more substantial server infrastructure than MPEG-DASH/HLS, and that infrastructure has to be designed for the specific business case, since group calls, broadcast streaming, and one-on-one calls have significantly different architectures. However, if we’re talking about video calls, WebRTC app development is the way to go.
- RTMP – the Real-Time Messaging Protocol, which can be optimized for low latency. It has several implementation options, but playback requires a dedicated application (or, in browsers, a plugin). By splitting streams into fragments, RTMP can transmit more information effectively. It is primarily used for sending live streams to platforms like YouTube.
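The three options above, combined with the classification from the previous section, can be condensed into a rough lookup. The sketch below is only a starting point for discussion, not a definitive rule: the `Topology` enum and the `suggest_protocols` function are hypothetical names introduced for illustration, and a real project would weigh many more factors.

```python
from enum import Enum


class Topology(Enum):
    """The three groups of video services described in the article."""
    ONE_TO_MANY = "one-to-many"
    ONE_TO_ONE = "one-to-one"
    MANY_TO_MANY = "many-to-many"


def suggest_protocols(topology: Topology, low_latency: bool = False) -> list[str]:
    """Return candidate protocols for a use case (a rough heuristic only)."""
    if topology is Topology.ONE_TO_MANY:
        if low_latency:
            # Broadcasting where delay matters: the low-latency options.
            return ["WebRTC", "RTMP"]
        # Classic broadcasting and video on demand.
        return ["MPEG-DASH/HLS"]
    # One-to-one calls and group conferences.
    return ["WebRTC"]


if __name__ == "__main__":
    for topology in Topology:
        print(topology.value, "->", suggest_protocols(topology))
```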
Let’s now look at the types of solutions these protocols can be used for.
Video Streaming/Conferencing App Architecture Patterns
Architecture must be selected in accordance with the business case and functionality you want to incorporate into your product. Below, I will outline the characteristics of the architecture for different business cases.
Live streaming/video-on-demand applications like Netflix or Hulu
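This group of projects typically relies on RTMP to ingest the stream from the publisher and on an HTTP-based broadcast protocol such as MPEG-DASH/HLS to deliver it to viewers. The system should include several essential components: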
- Streaming Server: This server is responsible for handling incoming streams from publishers and distributing them to viewers. It often includes a built-in transcoder.
- Transcoder: The transcoder is a crucial part of the streaming server, responsible for re-encoding the stream from the publisher into the broadcast protocol used for streaming. This ensures compatibility and optimization for viewers.
- CDN (Content Delivery Network): The CDN is essential for caching and delivering content to viewers. Without a CDN, the quality of the output can fluctuate depending on the network conditions, leading to an inconsistent user experience. Choosing the right CDN ensures the availability and performance of the live stream.
- Business Logic and Billing Server: This component manages the business-related aspects of the streaming service. It handles user authentication, authorization, billing, and other business logic. It’s crucial for monetization and user management.
Figure: Example of live streaming app architecture
Other system elements are optional and depend on the specific functionality you want to implement. Typical live streaming apps rely on NGINX, Amazon services, or NodeMediaServer. A perfect fit will depend on the business requirements. For instance, ready-to-use solutions like NodeMediaServer may suit products that won’t be used by a large audience. However, branding and scaling will require assembling the product from different parts.
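To make the transcoder's role more concrete, here is a minimal sketch of the kind of artifact it can produce when the broadcast protocol is HLS: a master playlist that lists several renditions of the same stream so that players can pick one that fits their bandwidth. The `Rendition` class, the folder layout (`<name>/index.m3u8`), and the bitrates are assumptions made for illustration; only the `#EXTM3U`, `#EXT-X-VERSION`, and `#EXT-X-STREAM-INF` tags come from the HLS format itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str           # hypothetical label, reused as the sub-playlist folder
    width: int
    height: int
    bandwidth_bps: int  # peak bitrate advertised to the player


def build_master_playlist(renditions: list[Rendition]) -> str:
    """Build an HLS master playlist, lowest bitrate first."""
    lines = ["#EXTM3U", "#EXT-X-VERSION:3"]
    for r in sorted(renditions, key=lambda item: item.bandwidth_bps):
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={r.bandwidth_bps},"
            f"RESOLUTION={r.width}x{r.height}"
        )
        # Each variant points at its own media playlist.
        lines.append(f"{r.name}/index.m3u8")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    ladder = [
        Rendition("720p", 1280, 720, 2_800_000),
        Rendition("360p", 640, 360, 800_000),
    ]
    print(build_master_playlist(ladder))
```

The playlist and the media segments it references are static files, which is what makes this kind of delivery easy to cache on a CDN.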
One-to-one video chat apps
One-to-one video chat apps are the simplest option if no additional functionality is required. This functionality can be implemented in chat-roulette services, dating apps, and corporate systems. For example, we implemented one-to-one calls in an enterprise communication system.
If clients are located on the same network (except for 3G), the following parts of the backend infrastructure are required:
- Signaling Server: This server tells clients whom to call and at which address.
- Business Logic Server: This server handles the business-related aspects of the service.
Unfortunately, such cases are extremely rare. Therefore, in real-world scenarios, two additional types of servers are necessary, known as STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT) servers. There is no need to develop these servers from scratch, as open-source implementations under permissive licenses already exist and can be easily deployed.
The key point is that TURN requires a pool of UDP ports to be set up for relaying. Additionally, communication in cellular networks won’t work without TURN because these networks often use symmetric NAT and address falsification protection. In sectors like banking and healthcare, address falsification protection is also common, so the use of TURN servers is necessary.
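As an illustration of how the business logic server might hand STUN/TURN details to clients, the sketch below builds a configuration in the shape that WebRTC clients expect (`iceServers` entries with `urls`, `username`, and `credential`). It assumes a TURN server that supports time-limited shared-secret credentials (coturn is one server that offers this mode); the host name `turn.example.com`, the port, the secret, and the function names are all hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def turn_credentials(user_id: str, shared_secret: str,
                     ttl_seconds: int = 3600,
                     now: Optional[float] = None) -> dict:
    """Create short-lived TURN credentials from a secret shared with the TURN server."""
    current = time.time() if now is None else now
    expires_at = int(current + ttl_seconds)
    # The username carries the expiry time; the password is an HMAC over it.
    username = f"{expires_at}:{user_id}"
    digest = hmac.new(shared_secret.encode(), username.encode(), hashlib.sha1).digest()
    return {"username": username, "credential": base64.b64encode(digest).decode()}


def ice_configuration(user_id: str, shared_secret: str,
                      host: str = "turn.example.com") -> dict:
    """Build the ICE server list a client would pass to its peer connection."""
    creds = turn_credentials(user_id, shared_secret)
    return {
        "iceServers": [
            {"urls": [f"stun:{host}:3478"]},
            {"urls": [f"turn:{host}:3478?transport=udp"], **creds},
        ]
    }


if __name__ == "__main__":
    # "alice" and the secret are placeholder values.
    print(json.dumps(ice_configuration("alice", "replace-with-real-secret"), indent=2))
```

Issuing short-lived credentials per user means the long-term secret never leaves the backend, which matters because a TURN relay consumes paid bandwidth.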
Video conferencing apps like Zoom or Google Meet
WebRTC is the typical choice for video conferencing apps in the style of Zoom and Google Meet. There are three ways to organize the network for group calls: mesh, MCU, and SFU. The choice determines the quality and functionality of group calls.
A mesh network is a decentralized network in which devices connect to each other directly rather than through a central hub. In a group call, this means every participant sends their stream to every other participant. The architecture needs no media server and offers redundancy, but it requires significant bandwidth on the participant side and does not handle a large number of participants well.
An MCU (Multipoint Control Unit), commonly used in video conferencing, manages multiple audiovisual streams in multipoint conference calls. It combines individual participants’ audio and video data into a single stream sent to all participants. This centralized approach relies on the MCU for media stream processing and distribution.
In contrast, an SFU (Selective Forwarding Unit) doesn’t merge media streams. It receives each participant’s stream and selectively forwards specific streams to the other participants based on their needs, considering network conditions and device capabilities. Because the server only routes media and doesn’t re-encode it, SFUs are generally the most suitable choice for video conferencing applications.
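The difference between the three topologies is easiest to see by counting streams. The sketch below is a simplified model that assumes every participant publishes exactly one stream and wants to see everyone else; `StreamLoad` and `stream_load` are hypothetical names, and real systems reduce these numbers with techniques such as forwarding only the active speakers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamLoad:
    uploads_per_client: int    # streams each participant sends
    downloads_per_client: int  # streams each participant receives
    server_inbound: int        # streams arriving at the media server
    server_outbound: int       # streams leaving the media server


def stream_load(topology: str, participants: int) -> StreamLoad:
    """Count media streams for a group call under the simplified model."""
    n = participants
    if n < 2:
        raise ValueError("a call needs at least two participants")
    if topology == "mesh":
        # Everyone sends to and receives from everyone else; no server.
        return StreamLoad(n - 1, n - 1, 0, 0)
    if topology == "mcu":
        # The server mixes all inputs into one stream per participant.
        return StreamLoad(1, 1, n, n)
    if topology == "sfu":
        # The server forwards each input to every other participant.
        return StreamLoad(1, n - 1, n, n * (n - 1))
    raise ValueError(f"unknown topology: {topology}")


if __name__ == "__main__":
    for name in ("mesh", "mcu", "sfu"):
        print(name, stream_load(name, 5))
```

With five participants, mesh requires four uploads from every client, while MCU and SFU need only one. The MCU pays for that with server-side mixing, and the SFU with more downloads per client.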
Five Things to Consider When Planning a Video Communication Product
Before embarking on the video service development process, you need to conduct an analysis. This analysis will allow you to find out specific requirements and identify potential challenges.
For example, if you need MPEG-DASH/HLS broadcasting for compatibility with any browser or application and also need low-latency streaming with delays under a second, the latency requirement may be unattainable because the chosen broadcasting standard cannot deliver it.
The architectural design and technology selection stage (Technical Analysis) is crucial for the project’s success. You need to thoroughly examine all the criteria before starting, not only refining the requirements but also prioritizing each one for a specific client or project. This ensures that you choose the most suitable solution, optimal in terms of cost versus requirements, without overlooking something essential.
Features and integrations
Understanding what functionality is needed right now and may be required in the future allows for designing the right technical solution. It’s important to take into account all limitations and assess the need for additional services provided during video streaming. Such services can include recording, screenshot generation, AR/VR functionality, and machine learning capabilities (background blurring, face recognition, etc.).
Number of video session participants
Important factors to consider at the planning stage are the number of participants in media sessions and how many of them are sources and receivers of video streams. This affects the choice of technologies that can deliver video of satisfactory quality to all users.
Latency
By latency, I mean what real-time means to you. All real-time services have some delay. For example, in online conferences or live streaming, there’s typically a delay attributed to factors like broadcast standards. Even in advanced implementations, it starts at around 10-12 seconds and can go up to a minute. In practice, feedback usually occurs through chat, and delays in responses are often attributed to publisher-related physical delays (not having read a message in time, not responding in time, etc.).
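To see where a delay of this size can come from with segment-based broadcast standards, here is a back-of-the-envelope model. It assumes that the player buffers a few whole segments before starting playback and adds fixed allowances for encoding and delivery. The function name and all default values are illustrative assumptions, not measurements.

```python
def estimate_segmented_latency(segment_seconds: float,
                               buffered_segments: int = 3,
                               encode_seconds: float = 1.0,
                               delivery_seconds: float = 1.0) -> float:
    """Rough glass-to-glass delay for segment-based streaming, in seconds."""
    # The player cannot start until it has buffered whole segments.
    buffering = segment_seconds * buffered_segments
    return encode_seconds + buffering + delivery_seconds


if __name__ == "__main__":
    for seconds in (2.0, 3.0, 6.0):
        print(f"{seconds:.0f}s segments -> ~{estimate_segmented_latency(seconds):.0f}s delay")
```

Under these assumptions, 3-second segments give about 11 seconds of delay. The model also shows why shortening segments is the main lever for reducing latency in this family of protocols.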
Digital Rights Management (DRM) and Regulatory Compliance
DRM is necessary because securing the transmission channel alone does not protect content from hacking, and no protection is foolproof. Any protection measure needs to be carefully evaluated as a trade-off between implementation time and the level of protection. It’s crucial to consider that implementation time increases significantly with the level of protection.
Industry-specific requirements can impose additional demands. For example, in healthcare, complying with HIPAA involves implementing specific security measures and adhering to guidelines to protect sensitive health information.
Maintenance costs
Video services incur ongoing maintenance costs that depend on the type of service and the workload. They consume significant processing power and bandwidth, which must be paid for, either to service providers or in the form of your own servers and traffic.
Service providers might charge around 4-5 cents per minute of service (per publisher). Theoretically, self-implementation could reduce costs to about 1 cent per minute, but this depends on the services provided. Some services have non-standard billing strategies: Zoom, for example, charges per host rather than for time, and other services charge a fixed fee after a certain number of minutes. This makes it possible to choose a third-party service suited to each business case.
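The per-minute figures above translate into a simple comparison. The sketch below uses only the rates mentioned in the text (4-5 cents per publisher minute for a provider, about 1 cent for a self-implemented service); the function names and the monthly volume are made up for illustration, and the calculation deliberately ignores development and operations effort.

```python
def monthly_cost_usd(publisher_minutes: int, cents_per_minute: float) -> float:
    """Cost in US dollars for a given number of publisher minutes."""
    return publisher_minutes * cents_per_minute / 100


def compare_costs(publisher_minutes: int) -> dict:
    """Compare the provider price range with the theoretical self-hosted rate."""
    return {
        "provider_low": monthly_cost_usd(publisher_minutes, 4),
        "provider_high": monthly_cost_usd(publisher_minutes, 5),
        "self_hosted": monthly_cost_usd(publisher_minutes, 1),
    }


if __name__ == "__main__":
    # Hypothetical workload: 10,000 publisher minutes per month.
    for label, cost in compare_costs(10_000).items():
        print(f"{label}: ${cost:,.2f}")
```

At low volumes, the absolute difference is small compared with the cost of building and running your own infrastructure, which is one reason a third-party service can be the better choice when video is not the core of the product.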
MobiDev Case Study: Implementing Live Video Conferencing for a Wellness Platform
Let us share our expertise in live video chat solutions using one of our projects as an example. It is only one of many, but it illustrates that when video streaming is not a key feature, integrating a ready-made third-party solution can optimize both the budget and the development process.
GroupWell is a cloud-based platform for promoting wellness, behavior change, and mental health care through licensed therapists and coaches.
Within the GroupWell ecosystem, members are thoughtfully matched with affinity groups that share similar characteristics based on various variables. Additionally, members have the flexibility to search for groups using advanced filters, ensuring they find the perfect match for their wellness journey.
To develop this innovative platform, we devised a strategic approach that seamlessly blended open-source tools and libraries with custom development. This approach was not only time-efficient but also cost-effective, enabling us to deliver a high-quality solution.
Key Technologies Utilized: NodeJS, NestJS, PostgreSQL, RDS, ReactJS, Ant Design library, React Admin, Recharts.JS, AWS, Stripe for payment integration, Google Calendar API, Amazon S3, and Amazon SES.
Integrating Zoom for Seamless Video Communication
One of the pivotal requirements of the platform was to facilitate both peer-to-peer and group video calls on both mobile applications and within the web version. Given that video conferencing wasn’t the primary focus of the project, we made a strategic decision to leverage a third-party solution.
After careful consideration, Zoom was chosen as the ideal tool. It not only met our HIPAA compliance requirements but also proved to be a cost-effective option that aligned perfectly with the client’s business needs. Zoom’s comprehensive API and SDK documentation streamlined the development process significantly.
Enhancing Healthcare Communication
Our developers seamlessly integrated Zoom’s functionality into the platform. This allowed doctors to create conference rooms, conduct group sessions, and host video calls, all within the GroupWell environment. We also went the extra mile to ensure the security of video sessions by implementing safeguards against re-entry using the same link and tokens, tracking call start and end times meticulously.
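A safeguard against re-entry with the same link can be built around single-use join tokens. The sketch below is a generic, in-memory illustration of that idea and not the code used in the GroupWell project: the `JoinTokenRegistry` class, its method names, and the 15-minute lifetime are all assumptions, and a production system would keep the tokens in a shared store.

```python
import secrets
import time
from typing import Optional


class JoinTokenRegistry:
    """Issues join tokens that can be redeemed once, within a time limit."""

    def __init__(self, ttl_seconds: int = 900) -> None:
        self._ttl = ttl_seconds
        self._pending = {}   # token -> (session_id, expires_at)
        self.joined_at = {}  # token -> time of the successful join

    def issue(self, session_id: str, now: Optional[float] = None) -> str:
        current = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._pending[token] = (session_id, current + self._ttl)
        return token

    def redeem(self, token: str, now: Optional[float] = None) -> Optional[str]:
        """Return the session id on first valid use, otherwise None."""
        current = time.time() if now is None else now
        # pop() removes the token, so a second attempt with the same link fails.
        entry = self._pending.pop(token, None)
        if entry is None:
            return None
        session_id, expires_at = entry
        if current > expires_at:
            return None
        self.joined_at[token] = current  # record the call start time
        return session_id


if __name__ == "__main__":
    registry = JoinTokenRegistry()
    link_token = registry.issue("session-42")
    print(registry.redeem(link_token))  # first use succeeds
    print(registry.redeem(link_token))  # reuse is rejected
```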
All call management, including the choice of call type, time, duration, and description, takes place on our platform. The platform also collects and analyzes call statistics to assess the quality and reliability of the service. The exchange of video streams is handled by Zoom, so the product server carries no additional load.
The GroupWell platform now stands as a testament to how technology can empower preventive care, behavioral change, and mental health treatment while providing a seamless and secure communication experience for healthcare professionals and users alike.
This started out as a small, demo project that blossomed into a full-scale application due to the dedication, commitment, and teamwork from MobiDev. The team is smart, capable, and very strong communicators. They find a way around every technical challenge we face. I don’t think of them as outsourced, but instead as my team.
Build a Video Streaming/Conferencing App with MobiDev
Whatever video service you’re looking to develop to address your business needs, the MobiDev team will be glad to assist and provide reliable support along the way. Our business analysts will help you finalize the vision and strategy of your future product, and solution architects will design the most suitable architecture that will meet your current needs and include enough room for future scaling.