Hybrid meetings, where some participants join online and others join in person, are becoming more common in the post-pandemic world. However, hybrid meetings also pose challenges, such as ensuring that everyone can see and hear each other clearly and that online participants do not feel left out or disconnected from the in-room attendees. Fortunately, Microsoft has a solution: Microsoft IntelliFrame, an AI-driven online meeting experience that enhances the video quality and interactivity of hybrid meetings. In this blog post, we will explain what Microsoft IntelliFrame is, how it works, and how you can use it to make your hybrid meetings more engaging and productive.
To be clear, there are three different types of IntelliFrame available today:
- IntelliFrame Cloud
- IntelliFrame Multistream
- IntelliFrame OEM or Edge
1. IntelliFrame Cloud
First, IntelliFrame Cloud is a feature of Microsoft Teams Rooms that uses supported cameras to create smart video feeds of the in-room attendees. These video feeds zoom in on and frame the faces and gestures of the people in the room, making them more visible and expressive to the online participants. IntelliFrame also automatically switches between different frames based on who is speaking or gesturing, creating a dynamic and immersive meeting experience. The video processing is performed completely by Teams software running in Azure and requires no additional hardware or software on the local Teams Room system.
IntelliFrame Cloud is available for all Microsoft Teams Rooms with a Pro license that are equipped with a supported camera. Online participants on Microsoft Teams Desktop (Windows and Mac) will see the IntelliFrame video feed by default from rooms with these cameras. You don’t need to do anything special to enable IntelliFrame; just create a meeting, add a conference room, and show up. Note: IntelliFrame supports a maximum of 12 people in the Teams Room and no more than 64 total meeting participants. However, the maximum number of frames that can be shown is 8. All of these frames are composited into a single video stream sent to far-end participants, and the layout cannot be changed by the user.
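To make the composite model concrete, here is a rough sketch of the constraints described above: pick at most 8 of up to 12 detected people and tile them into one outgoing video frame. This is hypothetical pseudologic for illustration only; the selection heuristic (`last_active`) and layout math are my assumptions, not Microsoft's actual algorithm.

```python
# Illustrative sketch (NOT Microsoft's actual algorithm): a cloud service
# selecting up to 8 speaker frames from a room of up to 12 detected people
# and compositing them into a single outgoing video stream.
MAX_ROOM_PEOPLE = 12
MAX_FRAMES = 8

def select_frames(detected_people):
    """Keep the 8 most recently active people (hypothetical heuristic)."""
    if len(detected_people) > MAX_ROOM_PEOPLE:
        raise ValueError("IntelliFrame Cloud supports at most 12 in-room people")
    ranked = sorted(detected_people, key=lambda p: p["last_active"], reverse=True)
    return ranked[:MAX_FRAMES]

def composite_layout(frames, width=1920, height=1080):
    """Tile the selected frames into one fixed-layout composite frame."""
    cols = min(len(frames), 4)
    rows = -(-len(frames) // cols)  # ceiling division
    cell_w, cell_h = width // cols, height // rows
    return [
        {"name": f["name"], "x": (i % cols) * cell_w, "y": (i // cols) * cell_h,
         "w": cell_w, "h": cell_h}
        for i, f in enumerate(frames)
    ]
```

The key point the sketch illustrates: the layout is computed once on the service side, so the far-end client receives a single pre-arranged stream it cannot rearrange.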
2. IntelliFrame Multistream
IntelliFrame Multistream is available for Microsoft Teams Rooms equipped with the Yealink SmartVision 60 or Jabra PanaCast 50 cameras, which are designed for small- to medium-sized meeting spaces. Combining these intelligent cameras with the AI features in Teams enables a more engaging and flexible video layout experience. The key differences between IntelliFrame Multistream and IntelliFrame Cloud are:
- Up to 4 people frames plus a panoramic frame can be sent to far-end participants. Each of the 4 frames is a separate, individual video stream rather than part of a composite stream, which gives Teams more flexibility in how it displays each participant.
- IntelliFrame Multistream also leverages the in-room participants' face and voice recognition profiles to identify them by name in the meeting. This adds a personal touch to the video feed and the chat transcript, and makes it easier for online attendees to follow the conversation and search for specific speakers in the participants list.
- Features such as Intelligent Recap in Teams Premium and Copilot can identify who said what in the meeting transcript by recognizing each user's voice.
- Name labels are displayed for participants in the meeting room, as IntelliFrame Multistream uses the facial profile to identify each user.
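To contrast this with the composite model, here is a minimal hypothetical sketch (not the actual Teams API) of the multistream output shape: up to 4 individually labeled person streams plus a panoramic room view, which the client can lay out independently.

```python
# Hypothetical model (NOT the Teams API): IntelliFrame Multistream sends
# each framed person as a separate stream, plus a panoramic room view,
# so the receiving client can arrange them independently.
MAX_PERSON_STREAMS = 4

def build_streams(recognized_people):
    """Return one labeled stream per recognized person (up to 4) plus a panorama."""
    person_streams = [
        {"kind": "person", "label": name}  # name label comes from the face profile
        for name in recognized_people[:MAX_PERSON_STREAMS]
    ]
    return person_streams + [{"kind": "panorama", "label": "Room"}]
```

Because each person arrives as a distinct stream with a name label, features like the participants list and transcript attribution can reference individuals rather than one anonymous room feed.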
3. IntelliFrame OEM or Edge
IntelliFrame OEM or Edge relies completely on hardware OEMs to build AI features into the cameras. Microsoft Teams leverages the features built into the camera hardware but does not perform any processing of the video streams itself. Different OEM manufacturers have built their own proprietary AI features into their intelligent camera hardware, such as Poly DirectorAI, Neat Symmetry, and Logitech RightSight, to name a few. Broadly speaking, IntelliFrame OEM features can be categorized into three types:
- Front of Room (FoR) camera and Center of Room (CoR) camera combination
- Front of Room (FoR) camera only
- AI Audio enhancement features
1. Front of Room (FoR) camera and Center of Room (CoR) camera combination
In FoR and CoR camera combinations, the system uses a FoR camera together with one or more CoR cameras to detect and display the best view of each participant. Examples of this include the Logitech Sight + Rally Bar system and the Neat Center + Neat Bar system. Here are some links that describe how it works:
- https://www.logitech.com/en-us/products/video-conferencing/room-solutions/sight.html
- https://neat.no/center/
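As a loose sketch of the idea (purely illustrative; each vendor's selection logic is proprietary), a system with both camera types could score each candidate view of a participant and show the highest-scoring one. The `face_score` field here is an invented stand-in for whatever "how frontal is this face" metric the hardware computes.

```python
# Purely illustrative view selection across a front-of-room (FoR) and
# center-of-room (CoR) camera: each camera proposes a view of the same
# participant with a hypothetical "face_score" (how frontal the face is),
# and the system displays the highest-scoring view.
def best_view(candidates):
    """candidates: list of {"camera": str, "face_score": float}; pick the best."""
    return max(candidates, key=lambda v: v["face_score"])

def select_views(participants):
    """Map each participant name to the camera offering the best view of them."""
    return {name: best_view(views)["camera"] for name, views in participants.items()}
```

The practical upshot: someone facing away from the front-of-room bar but toward the table-center camera still gets shown face-on to remote attendees.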
2. Front of Room (FoR) camera only
For FoR-only solutions, OEM manufacturers have also built many AI-powered features into the camera hardware. For example, Poly MTR cameras and collaboration bars ship with Poly DirectorAI, a smart camera technology that uses AI and machine learning to deliver real-time automatic transitions, framing, and tracking that make everyone feel like they are in the room together. It offers different framing and tracking modes to create the most equitable experience for hybrid meetings. The modes include Group Framing, People Framing, Speaker Framing, and Presenter Tracking. Below is a short video showing Poly DirectorAI in action:
3. AI Audio enhancement features
Last but not least, many OEM manufacturers also build AI-powered audio enhancement features into their Teams Room solutions. One such example is Poly NoiseBlockAI, which uses machine learning to filter out non-human noises such as keyboard typing, paper shuffling, and plastic bag rustling from video calls. Because the system automatically recognizes and blocks distracting background noises, meetings stay productive and distractions are kept out of your calls. Another example is the recently introduced Poly Sound Reflection Reduction, an algorithm that reduces the reverb issues often found in rooms with glass walls or other very hard surfaces. If participant audio bounces off the walls and sounds hollow or reverberant, the device uses AI to remove the reverberations and reduce the echo sent to far-end participants. Below is a video of this in action:
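Real noise-suppression products rely on trained models, but the underlying gating idea can be sketched with a simple energy threshold. This is a deliberately simplified stand-in, not Poly's algorithm: it mutes low-energy audio blocks (faint keyboard clicks, rustling) while passing louder speech through unchanged.

```python
# Greatly simplified stand-in for ML noise suppression: a plain
# energy-based noise gate that zeroes out blocks of a mono float signal
# whose RMS level falls below a threshold. Real products use trained
# models to distinguish speech from noise, not a fixed threshold.
import math

def noise_gate(samples, block=4, threshold=0.05):
    """Mute low-energy blocks of a mono float signal in [-1.0, 1.0]."""
    out = []
    for i in range(0, len(samples), block):
        chunk = samples[i:i + block]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        out.extend(chunk if rms >= threshold else [0.0] * len(chunk))
    return out
```

A fixed threshold would also mute quiet speech, which is exactly why production systems learn what human voices sound like instead of gating on energy alone.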