Company patents
Zoom Video Communications, Inc.
Zoom Video Communications, Inc. increased its patenting activity in emerging AI-related fields in 2024: Natural Language Processing grew +92.0% YoY and Computer Vision +69.7% YoY, alongside a +105.0% YoY surge in Messaging & Email. Filings across almost all categories, including its core Streaming & Real-Time Media (33.7% of portfolio) and Pictorial / Video Communications (30.4% of portfolio), then declined sharply in 2025 and so far in 2026. Categories such as Routing, Switching & QoS and Business Methods & Fintech show 0 patents filed so far in 2026, which may indicate a shift in the company's overall IP strategy.
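For readers who want to see how year-over-year figures of this kind are typically derived, here is a minimal Python sketch. The page does not publish the underlying per-category counts, so the function names (`yoy_change`, `format_yoy`) and every count below are hypothetical, chosen only to illustrate the arithmetic.

```python
def yoy_change(previous: int, current: int) -> float:
    """Return the year-over-year change from previous to current, in percent."""
    if previous <= 0:
        # A percentage change is undefined when the baseline year has no filings.
        raise ValueError("previous-year count must be positive")
    return (current - previous) / previous * 100.0


def format_yoy(previous: int, current: int) -> str:
    """Format the change with an explicit sign and one decimal place."""
    return f"{yoy_change(previous, current):+.1f}%"


# Hypothetical counts (NOT taken from the page), for illustration only.
print(format_yoy(25, 48))  # a category growing from 25 to 48 publications
print(format_yoy(40, 10))  # a category shrinking from 40 to 10 publications
```

Note that a category with 0 publications in the baseline year has no defined YoY percentage, which is why the sketch raises an error rather than returning a number.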
Patent Trend by Technology Area
Yearly patent publications since 2023
Product themes
Product-level themes inferred from filings since 2023.
1,074 US filings (since 2023) · 12 categories · 32 themes
Technologies enabling synchronous, interactive multimedia communication sessions, including user interfaces, content sharing, and underlying session management for multiple participants.
Systems and methods for establishing, maintaining, modifying, and terminating communication sessions across various network architectures, including service discovery, resource allocation, and resilience mechanisms.
User interface designs and systems that enable multiple users to interact with shared content, provide feedback, or coordinate activities, often across different devices or locations.
Methods and systems for improving the quality of video streams, generating intermediate frames, or continuously locating and following objects within a sequence of images, even under occlusion.
Systems and methods for automatically managing telephone calls, including intelligent routing based on various criteria, scheduling callbacks, and processing emergency calls.
Features within messaging platforms that enhance user interaction and content consumption through intelligent suggestions, content persistence mechanisms, engagement analytics, and adaptive presentation of conversational media.
Methods and apparatus for improving the visual fidelity, resolution, or compression efficiency of video signals, often through advanced processing, up-scaling, or neural network-based filters.
Methods and systems for enhancing the security and privacy of electronic messages, often by integrating contextual data such as location, social network graphs, or user authentication levels to control access, filter content, or enable specific group interactions.
Techniques for enhancing, encoding, decoding, or separating speech and audio signals, often involving multi-microphone arrays, acoustic echo cancellation, beamforming, or advanced audio compression for improved clarity and quality.
Technologies that create dynamic and interactive visual content for displays, including virtual/wearable systems, by generating overlays, replacing input streams, or merging real-time user actions with digital environments.
Techniques for improving the perceived quality, synchronization, and moderation of audio and voice streams, often involving codec management, transcoding, and content analysis.
Technologies for generating artificial speech that is personalized, context-aware, or adaptable to specific virtual agents or messaging campaigns, often utilizing text-to-speech (TTS) and audio caching for efficient delivery.
Techniques for rendering, interacting with, and managing content within augmented or virtual reality environments, including spatial tracking, gaze interaction, and dynamic multi-application display management.
AI systems designed to engage in natural language dialogue, maintain conversation state, understand user intent, and generate relevant responses, often across multiple communication channels or modalities.
Methods and systems for efficiently distributing and delivering media content, including techniques for multi-source streaming, content caching, and optimizing delivery based on network conditions or device capabilities.
User interface designs and interaction methods tailored to mobile or wearable devices, enabling control of external systems, monitoring of user states, or real-world transactions.
Methods and systems for identifying, extracting, and structuring specific entities, relationships, or insights from text-based documents, often involving techniques like named entity recognition, relation extraction, or summarization.
Methods and systems for protecting network resources and data from unauthorized access, misuse, or attack, encompassing authentication, authorization, encryption, and traffic filtering mechanisms. This includes securing communication channels and validating network access.
Techniques for generating human-like text or other content using large pre-trained models, often involving prompt engineering, speculative decoding, or multi-modal inputs for content creation.
Core infrastructure and operational techniques for efficient and reliable message handling, including server-side logic for managing subscriptions, aggregating messages, optimizing network connections, and ensuring data consistency across distributed messaging services.
Methods and systems for identifying synthetic or manipulated speech (deepfake audio) using forensic analysis of audio features, such as breath patterns, vocoder signatures, or machine learning models to determine authenticity.
Techniques to improve the accuracy and robustness of Automatic Speech Recognition (ASR) systems by incorporating contextual information, dynamic hint words, or customized machine learning models for specific domains or users.
Enhancements to the physical and data link layers of network communication, focusing on hardware components, signal integrity, power efficiency, and efficient data transfer mechanisms for specific interfaces and buses.
Mobile applications and systems leveraging wireless communication and location data (e.g., GPS, RFID, geo-fencing) to provide context-specific services, transactions, or user interactions.
Systems that use user data, preferences, and machine learning to generate tailored advice, product recommendations, goal-setting plans, or contextual information for individuals across different domains.
Techniques for combining and analyzing information from multiple distinct data modalities (e.g., text, image, video, audio, sensor data) to derive richer insights or improve system performance and decision-making.
Systems and methods for automating the lifecycle of machine learning models, including pipeline deployment, model management, versioning, and configuring for different inference environments.
Systems that combine data from multiple camera sensors or capture multiple images from different perspectives or qualities, often involving image processing techniques like synthesis to create enhanced or comprehensive views.
Applications of speech processing and artificial intelligence for medical diagnosis, therapeutic interventions, or accessibility solutions, particularly for conditions affecting speech production or hearing.
Techniques and hardware architectures designed to efficiently generate and display complex 3D graphics, particularly for interactive applications like virtual reality, focusing on speed and visual quality.
Systems designed to streamline and automate various commercial transactions, including mobile-enhanced processes, secure online checkouts, customer service interactions, and privilege issuance, often leveraging digital authentication.
Engineering solutions for creating electronic devices with bendable, foldable, or stretchable form factors, often involving hinges, flexible displays, and sliding mechanisms to enable dynamic physical configurations.
Patents
Showing 1-10 of 219
Collaborative User Experiences