Videoconferencing Infrastructure: A Primer

David Maldow, Human Productivity Lab
Telepresence Options with Webtorials Publishing

If you are kicking around the idea of using telepresence and videoconferencing on a large scale then you'll need some "Big Iron." Here is a primer on the basics:

Infrastructure Elements

The products and solutions described below form the basic elements of a complete videoconferencing infrastructure. Keep in mind that some products combine functions (a conference manager may include a gatekeeper, some bridges do recording, etc.).

MCU / Video Bridge

A Multipoint Control Unit (MCU) enables multipoint (or multiparty) videoconferencing. MCUs are often called "video bridges," since they make connections between multiple videoconferencing endpoints. In most cases, when you see a videoconference with multiple people in a "Hollywood Squares" format, an MCU is receiving a video stream from each endpoint, compositing those video streams into a single image, compressing the new composited image and sending it back to your videoconferencing endpoint, which then displays the "Hollywood Squares."

To create successful sessions that include three or more endpoints, MCUs perform two key functions:

Signaling: Controlling the flow of data to and from each connected endpoint.

MCUs can either accept incoming calls from video endpoints or call out to the endpoints to establish a connection. Once they make the connection, the MCU must route the incoming audio and video signals from each endpoint in a conference to all other endpoints in that conference.

In the diagram above, the red, blue and purple endpoints are in one conference, while the orange, black and green endpoints are in another conference. The MCU connects to all six endpoints and routes the signals accordingly.

MCUs are expected to host multiple videoconferences simultaneously, add and drop participants from conferences, block audio and/or video to or from any particular participant in any particular conference, merge conferences, etc. They perform all these tasks by signaling control.
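The routing side of signaling can be pictured as a simple lookup: the bridge knows which conference each endpoint belongs to and forwards a sender's signal to every other member of that conference. The following is a minimal sketch in Python, not a description of any vendor's implementation; the class name `ConferenceRouter` and the endpoint labels are hypothetical.

```python
from collections import defaultdict


class ConferenceRouter:
    """Toy model of MCU signaling: tracks who is in which conference."""

    def __init__(self):
        self._members = defaultdict(set)   # conference id -> endpoint names
        self._conference_of = {}           # endpoint name -> conference id

    def add(self, conference, endpoint):
        # In this sketch an endpoint belongs to one conference at a time.
        self.drop(endpoint)
        self._members[conference].add(endpoint)
        self._conference_of[endpoint] = conference

    def drop(self, endpoint):
        conference = self._conference_of.pop(endpoint, None)
        if conference is not None:
            self._members[conference].discard(endpoint)

    def recipients(self, sender):
        """Endpoints that should receive the sender's audio and video."""
        conference = self._conference_of.get(sender)
        if conference is None:
            return set()
        return self._members[conference] - {sender}


router = ConferenceRouter()
for name in ("red", "blue", "purple"):
    router.add("conference-1", name)
for name in ("orange", "black", "green"):
    router.add("conference-2", name)

print(sorted(router.recipients("red")))    # ['blue', 'purple']
print(sorted(router.recipients("green")))  # ['black', 'orange']
```

Adding and dropping participants, as described above, amounts to updating this membership table while calls are in progress.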

Transcoding: Decoding incoming data from one endpoint and re-encoding it in a new format before sending it to a second endpoint.

Videoconferencing products vary greatly from vendor to vendor, and even from product to product, and all support some--but not all--of the many video, audio and signaling protocols in use throughout the industry. By transcoding signals, MCUs can allow two otherwise incompatible endpoints to connect. For example, if one endpoint ONLY supported AAC-LD audio and another ONLY supported G.722 audio, an MCU could transcode the AAC-LD audio to G.722 and vice versa, allowing the two endpoints to share audio.
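To make the AAC-LD / G.722 example concrete, here is a rough Python sketch of the decision a bridge faces: if two endpoints share a codec, audio can pass through unchanged; if not, the bridge has to decode one format and re-encode in the other. The function name `plan_audio_path` and the returned dictionary keys are illustrative assumptions, not part of any product's API.

```python
def plan_audio_path(codecs_a, codecs_b):
    """Decide whether two endpoints need the bridge to transcode audio.

    Each argument is a non-empty set of codec names the endpoint supports.
    """
    shared = sorted(set(codecs_a) & set(codecs_b))
    if shared:
        # Any shared codec lets the endpoints exchange audio unchanged.
        return {"transcode": False, "codec": shared[0]}
    # No overlap: each side keeps its own codec and the bridge converts.
    return {
        "transcode": True,
        "a_codec": sorted(codecs_a)[0],
        "b_codec": sorted(codecs_b)[0],
    }


print(plan_audio_path({"AAC-LD"}, {"G.722"}))
# {'transcode': True, 'a_codec': 'AAC-LD', 'b_codec': 'G.722'}
print(plan_audio_path({"AAC-LD", "G.722"}, {"G.722"}))
# {'transcode': False, 'codec': 'G.722'}
```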

In addition to breaking incompatibility barriers, transcoding can also improve quality of experience for a videoconferencing session. Without transcoding, if one low-resolution endpoint participated in a conference, the MCU would have to use low-resolution connections between all endpoints. Each endpoint would only send the MCU a low-resolution signal so that the lowest common denominator (the low-res endpoint) could handle it. By transcoding, the MCU can create a custom signal for each participating endpoint, sending high-resolution signals between high-res endpoints and transcoding them down to low resolution for the low-res endpoint. Each connected endpoint sends the bridge its highest-quality signal and the bridge sends each endpoint the highest-quality signal the endpoint can handle.
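The difference can be shown with a small calculation. In the sketch below, each endpoint is described only by the maximum picture height it can send and display; the endpoint names and numbers are made up for illustration, and the model ignores composited layouts.

```python
def received_without_transcoding(max_heights):
    """Everyone gets the same stream, limited by the weakest endpoint."""
    lowest = min(max_heights.values())
    return {name: lowest for name in max_heights}


def received_with_transcoding(max_heights):
    """Each endpoint gets the best picture it can display.

    The result is capped by the best stream any *other* endpoint sends,
    since the bridge cannot add detail that nobody transmitted.
    Requires at least two endpoints.
    """
    result = {}
    for name, own_max in max_heights.items():
        best_source = max(h for other, h in max_heights.items() if other != name)
        result[name] = min(own_max, best_source)
    return result


endpoints = {"boardroom": 1080, "branch": 720, "legacy": 288}
print(received_without_transcoding(endpoints))
# {'boardroom': 288, 'branch': 288, 'legacy': 288}
print(received_with_transcoding(endpoints))
# {'boardroom': 720, 'branch': 720, 'legacy': 288}
```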

Transcoding also allows for additional layout options. Early MCUs that didn't support transcoding only offered one video layout, commonly referred to as VAS (Voice Activated Switching). Each participant would see a full-screen view of the active speaker, while non-speaking participants would be off-screen. Transcoding bridges can create CP (Continuous Presence) layouts where all participants can be seen at all times.

Today's MCUs often offer a selection of hybrid or custom layouts that can show everyone in the conference while giving the active speaker more space.

MCUs can be hardware- or software-based and externally hosted or installed on customer premises. Like many things in life, video network infrastructure is all about trade-offs. Pricing comparisons are generally made on a per-port (or per-connection) basis. A video bridge that costs $10,000 and can support a 10-person meeting would be considered a $1,000-per-port bridge. The discussion becomes more complicated with variable-capacity bridges, where all connections are not equal. These bridges may allocate more resources (processor power) to higher-resolution connections. In effect, this means the bridge's capacity depends on what types of calls it is hosting. For example, the Polycom RMX 4000 can support 120 HD (720p) connections, or 360 lower-definition (CIF) connections, making its capacity and price-per-port calculations flexible.
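The arithmetic is easy to sketch. The first function below reproduces the $1,000-per-port figure. The second models a variable-capacity bridge using the 120 HD / 360 CIF figures quoted above, under the assumption (ours, not the vendor's) that resources trade off linearly, so that one HD call consumes as much as three CIF calls. Real bridges may allocate resources differently.

```python
def price_per_port(bridge_price, ports):
    """Simple per-port price for a fixed-capacity bridge."""
    return bridge_price / ports


# Assumed linear resource model: 360 units in total,
# 1 unit per CIF call, 3 units per HD call (360 / 120).
TOTAL_UNITS = 360
CIF_COST = 1
HD_COST = 3


def remaining_cif_ports(hd_calls):
    """CIF connections still available once some HD calls are hosted."""
    used = hd_calls * HD_COST
    if used > TOTAL_UNITS:
        raise ValueError("bridge cannot host that many HD calls")
    return (TOTAL_UNITS - used) // CIF_COST


print(price_per_port(10_000, 10))  # 1000.0
print(remaining_cif_ports(0))      # 360
print(remaining_cif_ports(40))     # 240
print(remaining_cif_ports(120))    # 0
```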

When comparing pricing, determine what types of connections you expect and the call volume you'll need. You should also consider the following:

Overall call quality
Integration with other elements of the environment
Ease of use (for conference administrators and call participants)
Scalability

Management Solutions

A conferencing manager is responsible for the devices in the environment, the conferences taking place between these devices, and the traffic on the network. Management solutions can be deployed as software or hardware, and these functions are often bundled with other conferencing infrastructure elements.

Device Management

Today's business-class videoconferencing and telepresence endpoints are not simple, plug-and-play consumer appliances. They are technical pieces of equipment that require IT and networking knowledge to set up and maintain, and an even higher level of expertise to truly optimize. Device management includes a wide variety of tasks that all fall roughly into three categories:

Provisioning: Configuring the software options of videoconferencing devices to meet network requirements and conferencing manager preferences.

Updating: Applying software updates to devices.

Monitoring: Ensuring endpoints are online and operational.

Unfortunately, configuring the software settings for a videoconferencing endpoint is not a trivial matter. Even assigning dialing addresses to the devices can be beyond the lay videoconferencing user's ability. The administrative settings menus can contain numerous submenus, each with its own submenus. Any incorrect setting can create hard-to-diagnose call experience issues (e.g. a duplex mismatch between network speed settings can cause effects resembling packet loss), or even make calls fail entirely. An unfortunate stereotype about videoconferencing is that the 3:00 meeting doesn't really start until 3:20 because someone has to spend 20 minutes bouncing through settings menus to figure out why the call isn't working. Without good device management, that stereotype becomes reality.

An organization with a small videoconferencing deployment doesn't necessarily require a device management solution. The devices generally come with remote controls and the settings can be changed on screen. In addition, most devices offer a web interface, allowing administrators to conveniently browse to the device and control it from their desktop PC. Though users can provision, update and monitor manually, management solutions provide scalability and automation. It is much more efficient to update 300 endpoints by clicking "Update Group A" on a management solution than to do it manually by browsing to each endpoint, one by one. Just keeping track of which endpoints have and haven't been updated is a task in itself when done manually.

Administrators can also provision on a group basis, giving them an extraordinary level of control. Different groups of endpoints can be provisioned to make connections using more or less bandwidth depending on their needs and usage.

Management solutions also provide automated monitoring in the form of daily sweeps (or test calls) to all devices on the network. Monitoring can also include "heartbeat monitoring," where the solution continuously pings the device to check if it is still on and connected to the network. Advanced monitoring solutions can even be alerted to specific device failures (e.g. "microphone disconnected"). All of these monitoring tools give administrators a fighting chance to address device-related issues before they can ruin a meeting.
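Heartbeat monitoring can be sketched in a few lines of Python. The example below counts consecutive missed heartbeats per endpoint and raises an alert once a threshold is reached. The `probe` callable stands in for whatever check a real product performs (a ping, a test call, a status query); it, the class name and the threshold of three are all assumptions made for illustration.

```python
class HeartbeatMonitor:
    """Alert when an endpoint misses several heartbeats in a row."""

    def __init__(self, probe, max_missed=3):
        self.probe = probe            # callable: endpoint name -> True if reachable
        self.max_missed = max_missed
        self.missed = {}              # endpoint name -> consecutive misses

    def check(self, endpoints):
        """Run one round of probes; return endpoints that just crossed the threshold."""
        alerts = []
        for name in endpoints:
            if self.probe(name):
                self.missed[name] = 0
            else:
                self.missed[name] = self.missed.get(name, 0) + 1
                if self.missed[name] == self.max_missed:
                    alerts.append(name)
        return alerts


# Stand-in probe: pretend "room-b" has dropped off the network.
def fake_probe(name):
    return name != "room-b"


monitor = HeartbeatMonitor(fake_probe)
rooms = ["room-a", "room-b", "room-c"]
print(monitor.check(rooms))  # []
print(monitor.check(rooms))  # []
print(monitor.check(rooms))  # ['room-b']
```

Requiring several consecutive misses is one way to avoid alerting on a single dropped packet; a real deployment would tune the interval and threshold to its own network.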

When choosing a management solution, keep in mind that not all solutions work equally well with all devices. The top videoconferencing device vendors all provide their own management solution: Polycom's Converged Management Application (CMA), LifeSize's LifeSize Control, and Vidyo's VidyoPortal, to name a few. Not surprisingly, these solutions tend to have a higher level of integration with the endpoints offered by that particular vendor. It can come down to a choice between a solution that has the desired functionality or one with native integration to the devices existing in the environment.

Conference Management

Users often need help with everything from dialing a call to un-muting their microphone, to say nothing of advanced features such as sharing digital media or creating a multiparty meeting. Most leading MCUs already include a fairly comprehensive user interface allowing administrators to perform essential conference management functions.

Administrators tasked with creating and controlling videoconferencing meetings may have dozens of simultaneous meetings taking place on a number of MCUs, and management solutions give them scalability and automation. These solutions let administrators manage all these meetings from a single interface. Conferences can be created ad-hoc or scheduled so that the management solution will direct the MCU to dial out to the correct endpoints at the appropriate time using the correct conference settings. Conference management solutions often leverage conference templates, or saved meetings, to allow for quick meeting creation. A conference template could include the call speed, network, protocols, resolution settings, layout preferences, and more.
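A conference template is essentially a saved bundle of settings that scheduled meetings reuse. The Python sketch below models that idea with two small data classes and a helper that finds meetings due to start. Field names and sample values (such as the 1024 kbps call speed) are hypothetical and not taken from any particular management product.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ConferenceTemplate:
    name: str
    call_speed_kbps: int
    protocol: str
    resolution: str
    layout: str


@dataclass(frozen=True)
class ScheduledConference:
    template: ConferenceTemplate
    start: datetime
    endpoints: tuple   # endpoints the MCU should dial out to


def due_conferences(schedule, now):
    """Conferences whose start time has arrived, earliest first."""
    return sorted((c for c in schedule if c.start <= now), key=lambda c: c.start)


weekly = ConferenceTemplate("weekly-staff", 1024, "H.323", "720p", "continuous-presence")
schedule = [
    ScheduledConference(weekly, datetime(2024, 1, 8, 15, 0), ("hq", "branch")),
    ScheduledConference(weekly, datetime(2024, 1, 15, 15, 0), ("hq", "branch")),
]

for conf in due_conferences(schedule, now=datetime(2024, 1, 8, 15, 0)):
    print(conf.template.name, conf.endpoints)  # weekly-staff ('hq', 'branch')
```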

Management solutions also let administrators control meetings in progress. Conference control options can include a number of items such as the ability to mute or unmute participants, change the layout, disconnect participants, and add new participants to the meeting.

Network Management

In general, video calls use a lot of bandwidth, increasing stress on the network as more and more workers turn on to the productivity benefits of videoconferencing. The rise of high-bandwidth applications such as telepresence and high-definition videoconferencing in recent years has further added to the bandwidth problem.

The ScienceLogic Telepresence and Videoconferencing Management Platform provides a number of dashboard views allowing administrators to drill down on network bandwidth, individual endpoints, or video network infrastructure components.

When we asked Erik Rudin at ScienceLogic why enterprise and managed service clients were using its solution to monitor and manage devices and networks, he explained: "We see customers looking for technology to help manage the rapid growth of video endpoints as well as tracking the performance of the network and video infrastructure in one centralized tool. Reducing the complexity created by a multi-vendor video environment saves time and money versus maintaining multiple vendor-supplied management systems which don't provide the necessary integration and data sharing flexibility they expect when monitoring mission critical applications like video."

Network management solutions help administrators protect both the network and high-priority traffic from the effects of network congestion. These solutions can be used to route videoconferencing traffic in a balanced way throughout the network, so that no particular network path becomes overburdened. Certain signals can even be prioritized and protected. This can be used, for example, to ensure that video traffic won't suffer from packet loss when someone else on the network starts a large but low-priority download.
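One simple way to picture prioritization is a strict priority queue on a congested link: video packets are always sent ahead of bulk traffic, no matter which arrived first. The Python sketch below illustrates only that idea. The traffic classes and their priority numbers are assumptions, and real network equipment uses considerably more sophisticated queuing.

```python
import heapq
from itertools import count

# Lower number = sent first. These classes are illustrative assumptions.
PRIORITY = {"video": 0, "bulk": 9}


class PriorityLink:
    """Strict-priority send queue for a single network link."""

    def __init__(self):
        self._queue = []
        self._order = count()  # tie-breaker keeps arrival order within a class

    def enqueue(self, traffic_class, packet):
        heapq.heappush(self._queue, (PRIORITY[traffic_class], next(self._order), packet))

    def dequeue(self):
        return heapq.heappop(self._queue)[2]


link = PriorityLink()
link.enqueue("bulk", "download-1")
link.enqueue("bulk", "download-2")
link.enqueue("video", "frame-1")

print([link.dequeue() for _ in range(3)])
# ['frame-1', 'download-1', 'download-2']
```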

Gateways and Gatekeepers

Gatekeepers and gateways enable connections between videoconferencing systems and devices on disparate networks. They are often bundled together and can be incorporated into a router, session border controller, MCU or management solution.

Gateways: Gateways are network translators. Gateways can, for example, allow videoconferencing devices on ISDN to participate in conferences with devices on an IP network. As videoconferencing continues to shift from ISDN to IP, gateways prevent legacy deployments from becoming isolated islands.

Gatekeepers: Gatekeepers are primarily used to create addressing schemes and dialing plans and to work around NAT. Endpoints registering to a gatekeeper can be assigned an E.164 dialing address. This allows for simplified intra-company calling via telephone-like numbers or e-mail addresses as opposed to IP addresses. Gatekeepers also allow endpoints behind a NAT to receive inbound calls. As described in the section on NAT / Firewall Traversal below, videoconferencing endpoints behind a NAT do not have an actual Internet IP address and therefore can't be dialed from external sites. However, in a properly configured environment, external endpoints can dial the gatekeeper's IP address (with an added extension belonging to an internal endpoint) and the gatekeeper will route the call.
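At its core, the addressing role of a gatekeeper is a directory: endpoints register a dialable number, and callers ask the gatekeeper where that number lives. Here is a minimal Python sketch of that lookup. The class and method names are hypothetical, the numbers and private addresses are invented, and the validation only checks the basic E.164 shape (digits only, at most 15 of them).

```python
class Gatekeeper:
    """Toy directory mapping E.164 numbers to endpoint IP addresses."""

    def __init__(self):
        self._registry = {}

    def register(self, e164_number, ip_address):
        valid = (
            e164_number.isascii()
            and e164_number.isdigit()
            and len(e164_number) <= 15
        )
        if not valid:
            raise ValueError("not a valid E.164 number: %r" % e164_number)
        self._registry[e164_number] = ip_address

    def resolve(self, e164_number):
        """Return the registered IP address, or None if the number is unknown."""
        return self._registry.get(e164_number)


gk = Gatekeeper()
gk.register("5551001", "10.10.10.21")
gk.register("5551002", "10.10.10.22")

print(gk.resolve("5551001"))  # 10.10.10.21
print(gk.resolve("5559999"))  # None
```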

Gatekeepers can work in either of two modes. In direct-routed mode the gatekeeper simply assists in initiating a "direct" connection between two endpoints. In gatekeeper-routed mode the signals travel through the gatekeeper itself. When using gatekeeper-routed mode, gatekeepers can offer advanced functionality such as bandwidth management.
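Bandwidth management of this kind can be thought of as admission control: the gatekeeper keeps a running total of the bandwidth in use and refuses new calls that would exceed a budget. The Python sketch below is a simplified illustration under that assumption; the class name, the 2000 kbps budget and the call speeds are invented for the example.

```python
class BandwidthManager:
    """Admit calls only while total bandwidth stays within a budget."""

    def __init__(self, budget_kbps):
        self.budget_kbps = budget_kbps
        self._calls = {}  # call id -> bandwidth reserved for that call

    def admit(self, call_id, requested_kbps):
        in_use = sum(self._calls.values())
        if in_use + requested_kbps > self.budget_kbps:
            return False  # would overload the link, so reject the call
        self._calls[call_id] = requested_kbps
        return True

    def release(self, call_id):
        # Free the reservation when a call ends.
        self._calls.pop(call_id, None)


manager = BandwidthManager(budget_kbps=2000)
print(manager.admit("call-a", 1024))  # True
print(manager.admit("call-b", 768))   # True
print(manager.admit("call-c", 512))   # False (1792 + 512 exceeds 2000)
manager.release("call-a")
print(manager.admit("call-c", 512))   # True
```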

NAT / Firewall Traversal

Network Address Translation (NAT) and firewalls present a common problem for videoconferencing deployments because, in the simplest of terms, they can make videoconferencing difficult. Many people find NAT and firewall discussions confusing, and understandably so. Both NAT and firewalls involve handling traffic between the Internet and private networks, and can be bundled into one device, deployed individually, or included as components of other infrastructure elements (e.g. gatekeepers, MCUs, etc.).

NAT

Devices on private networks do not have "real" Internet IP addresses. Instead, each device is assigned a private address (often 10.10.10.X as in the diagram below) with a NAT solution acting as a pass-through (translator) between public and private addresses. One real Internet address can be shared by a large number of devices in a NAT environment. All outgoing signals from any device on the network will appear to come from this one address. Returning data is then sent back to that address and the NAT must pass it through to the appropriate machine.
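The translation step can be illustrated with a small table. In the Python sketch below, outgoing traffic from a private address is assigned a port on the shared public address, and returning traffic is matched back through the same table. The class name, the starting port number and the public address (taken from a range reserved for documentation) are assumptions for the example, and real NAT devices track considerably more state.

```python
class Nat:
    """Toy port-based NAT sharing one public address among private devices."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = first_port
        self._out = {}  # (private ip, private port) -> public port
        self._in = {}   # public port -> (private ip, private port)

    def outbound(self, private_ip, private_port):
        """Translate an outgoing connection; create a mapping if needed."""
        key = (private_ip, private_port)
        if key not in self._out:
            self._out[key] = self._next_port
            self._in[self._next_port] = key
            self._next_port += 1
        return (self.public_ip, self._out[key])

    def inbound(self, public_port):
        """Find the private device for returning traffic, or None if unmapped."""
        return self._in.get(public_port)


nat = Nat("203.0.113.5")
print(nat.outbound("10.10.10.21", 5000))  # ('203.0.113.5', 40000)
print(nat.outbound("10.10.10.22", 5000))  # ('203.0.113.5', 40001)
print(nat.inbound(40001))                 # ('10.10.10.22', 5000)
print(nat.inbound(45000))                 # None
```

The final lookup returns nothing because no internal device opened that mapping first, which is the behavior that keeps unsolicited inbound traffic, including inbound video calls, from reaching devices behind the NAT.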

This type of setup provides some protection, as the devices on the network are essentially "hidden" from the Internet. Since the private IP addresses are not known to the Internet, the devices in the network should not receive any inbound traffic unless they first initiate the connection. Keep in mind that although the network may be hidden, malicious packets are not being actively blocked from entering. Therefore, NAT alone does not make a network secure.

Firewall

A firewall solution can be hardware- or software-based and often incorporates NAT functionality. Firewalls actively monitor incoming packets, traffic and application data entering a private network and block incoming network transmissions that violate their policies. Many home computer users and office workers are behind firewalls without even realizing it, as hardware routers often have incorporated firewalls and operating systems (e.g. Windows) can include one as well.

The Videoconferencing Problem: NAT / firewall solutions present a twofold problem for videoconferencing.

In a NAT environment, videoconferencing endpoints literally will not have a number for external callers to dial. Their private IP addresses can only be dialed from within the private network, behind the firewall.

The most commonly used videoconferencing protocols require access to many ports that firewalls generally close and system administrators like to keep closed.

Without some way of resolving these issues, enterprise videoconferencing deployments would be isolated communication islands, unable to conduct videoconferences with external clients, partners and associates at remote locations.
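The port problem can be made concrete by comparing what a call needs with what a firewall allows. In the Python sketch below, TCP port 1720 is the standard H.323 call signaling port; the media port range and the firewall's open ports are hypothetical values chosen only to illustrate the mismatch.

```python
def blocked_ports(required, open_ports):
    """Ports a call needs that the firewall does not allow."""
    return sorted(set(required) - set(open_ports))


# A firewall that only allows ordinary web traffic (assumed policy).
firewall_open = {80, 443}

# H.323 call signaling on 1720, plus a hypothetical range of media ports.
call_needs = {1720} | set(range(50000, 50004))

print(blocked_ports(call_needs, firewall_open))
# [1720, 50000, 50001, 50002, 50003]
```

Every port in the result would have to be opened, or the traffic carried some other way, before the call could get through.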

Getting Through the NAT / Firewall

There are a number of methodologies and protocols used to get through, or traverse, the NAT / firewall. While a discussion of IETF (Internet Engineering Task Force) traversal protocols (e.g. STUN, TURN, and ICE) is beyond the scope of this article, a network administrator must be sure to implement a methodology suited to the type of firewall in place and the type of traffic to be allowed.

Many solutions implement a hardware device in the DMZ (Demilitarized Zone), which is outside of the network's main firewall. This device establishes a trusted connection through the firewall to the videoconferencing environment. All videoconferencing traffic is then tunneled through this connection. The end result is that videoconferencing between the private network and the public Internet is now possible.

The Bottom Line

Your network most likely has a NAT and/or firewall. If you want your videoconferencing deployment to be able to communicate with the outside world, you need to ensure that an element of your proposed infrastructure (gatekeeper, management system, etc.) has NAT / firewall traversal capabilities or deploy a stand-alone traversal solution.

Recording and Streaming

Videoconferencing devices are primarily intended to be used for live, interactive meetings. However, there are obviously a large number of potential business uses for devices with high-definition video cameras. For example, a college wishing to broadcast lectures over the Internet to remote students would naturally prefer to use its existing videoconferencing equipment, rather than purchase additional cameras. Similarly, an organization wanting a video record of an important meeting should be able to simply use the videoconferencing system already in the meeting room.

Recording and streaming solutions allow organizations to leverage their existing videoconferencing infrastructure to serve as a homegrown Internet broadcast studio. Potential applications for this technology include:

Remote lectures / classes
Training videos
Sales videos / company announcements
Recording meetings (for legal compliance, etc.)
Broadcasting meetings to a larger audience

Streaming solutions address the capture, management and delivery of video content. Solutions are available in either hardware or software and may be independent capture, management and delivery solutions, or solutions with multiple elements. These features may also be bundled with other infrastructure elements. For example, some MCUs offer capture/recording functionality.

Capture solutions ingest rich media content to be recorded and streamed. They can often encode video/audio signals into formats playable by standard media players (e.g. WMA, AVI, etc.). Capture devices can store recorded content locally, or forward it to a content management system. In a videoconferencing environment, these solutions should also be able to capture any H.239 data (PowerPoint decks, etc.) shared during videoconferences.

Capture solutions can ingest content in a number of ways. For example, some solutions can be "called" from video endpoints. Once the solution answers the call, a connection is established and the solution can then record any video and audio it receives.

Similarly, these solutions can connect to MCUs and "sit in" on multipoint videoconferences, recording all other participants.

Management of rich media content includes a number of tasks including the following:

Content organization: Saving content to folders, creating channels of content
Content editing: Titling, captioning, clipping, cropping
Content access control: Allowing specific people, or groups of people, the ability to view and/or upload content
Content tagging: Associating keywords with content items for improved searching capabilities

Additional features could include everything from the ability to upload and associate files (such as PDF) with content items to automatic transcription of speech-to-text within content items. In the end, proper management of rich media content results in the creation of a well organized and indexed content library which allows users to easily find and view content of interest.

Delivery solutions use a variety of techniques to transmit live or recorded content across the Internet to viewers. Data can be transmitted by unicast or multicast streams.

Unicast streaming servers connect directly with, and send an individual data stream to, each viewer. This type of solution is easy to deploy but does not scale well with multiple simultaneous viewers.

Multicast streaming servers send out the content once, and it is replicated by "nodes" when necessary as it is passed on to groups of viewers. This is significantly more network-efficient and less burdensome on the host server, but it requires a multicast-ready network. Multicast does not allow for video-on-demand as it does not start a new stream for each viewer. Instead, it functions more like a television broadcast.
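The scaling difference between the two approaches comes down to simple arithmetic on the host server's outgoing bandwidth: unicast multiplies the stream rate by the number of viewers, while multicast sends it once. A short Python sketch, with an assumed 1,000 kbps stream and 500 viewers:

```python
def server_egress_kbps(viewers, stream_kbps, multicast=False):
    """Outgoing bandwidth the host server needs for a live stream."""
    if viewers == 0:
        return 0
    # Multicast sends one copy; unicast sends one copy per viewer.
    return stream_kbps if multicast else viewers * stream_kbps


print(server_egress_kbps(500, 1000))                  # 500000
print(server_egress_kbps(500, 1000, multicast=True))  # 1000
```

This counts only the load on the host server; with multicast, the replication work moves to the network nodes.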

Content delivery networks or content distribution networks (CDN) are systems that cache copies of content in servers near the end users. When a viewer requests a content item, it is streamed from the nearest available node in order to minimize latency.
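Node selection in a CDN can be sketched as picking the lowest-latency node that is currently available. The Python example below assumes latency measurements are already in hand; the node names and numbers are invented, and real CDNs weigh other factors as well, such as load and whether the content is already cached.

```python
def nearest_node(latencies_ms):
    """Pick the available node with the lowest measured latency.

    latencies_ms maps node name -> latency in ms, or None if unavailable.
    Returns None when no node is available.
    """
    available = {node: ms for node, ms in latencies_ms.items() if ms is not None}
    if not available:
        return None
    return min(available, key=available.get)


measurements = {"frankfurt": 18, "virginia": 95, "singapore": None}
print(nearest_node(measurements))  # frankfurt
```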

Choosing the correct delivery methodology can depend on multiple factors, such as the type of content to be delivered, the size of the intended audience, and the capture and management systems in use.

About the Author

David Maldow is a visual collaboration technologist and analyst with the Human Productivity Lab and an associate editor at Telepresence Options. David has extensive expertise in testing, evaluating, and explaining telepresence and other visual collaboration / rich media solutions. David is focused on providing third-party independent analysis and opinion of these technologies and helping end users better secure their telepresence, videoconferencing, and visual collaboration environments. You can follow David on Twitter and Google+.
