# Camera App Architecture and Design
This document provides a detailed overview of the design and architecture of the Matter camera application, focusing on the Linux implementation.
## High-Level Architecture
The camera application is designed with a clear separation between the generic
Matter cluster logic and the platform-specific hardware abstraction. This is
achieved through the use of a `CameraDeviceInterface`, which defines the
contract that any platform-specific camera implementation must adhere to.
The core components are:

- **Camera App** (`CameraApp`): Responsible for initializing and managing the Matter clusters related to camera functionality. It is platform-agnostic and interacts with the hardware through the `CameraDeviceInterface`.
- **Camera Device** (`CameraDevice`): The Linux-specific implementation of the `CameraDeviceInterface`. It manages the camera hardware, using V4L2 and GStreamer, and provides the necessary delegates for the Matter clusters.
- **Media Controller** (`DefaultMediaController`): The central hub for media data distribution. It receives encoded media frames from the `CameraDevice` and distributes them to the various transport managers.
- **Transport Managers** (`WebRTCProviderManager`, `PushAvStreamTransportManager`): These classes manage the specific transport protocols for streaming media to clients.
- **Matter Cluster Managers**: The various Matter cluster managers that expose the camera’s functionality to the Matter network (e.g., `CameraAVStreamManagement`, `WebRTCTransportProvider`).

This layered architecture allows for easy porting of the camera application to
other platforms by providing a new implementation of the
`CameraDeviceInterface`.
## Component Breakdown
### CameraApp
**Responsibilities:**

- Initializes and configures the Matter clusters for camera functionality (e.g., `CameraAVStreamManagementCluster`, `WebRTCTransportProviderCluster`, `ChimeServer`).
- Fetches camera capabilities and settings from the `CameraDeviceInterface` to configure the clusters.
- Manages the lifecycle of the cluster servers.

**Key Interactions:**

- Takes a `CameraDeviceInterface` pointer in its constructor.
- Calls methods on the `CameraDeviceInterface` to get delegates and hardware information.

### CameraDeviceInterface
**Responsibilities:**

- Defines the abstract interface for a camera device.
- Declares methods for accessing the various delegates required by the Matter clusters.
- Defines the `CameraHALInterface`, an inner interface that abstracts the hardware-specific operations (e.g., starting/stopping streams, taking snapshots).
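
A minimal sketch of the interface's shape; the method names and signatures here are abbreviated and illustrative (the real interface declares an accessor for every cluster delegate and many more HAL operations):

```cpp
#include <cstdint>

// Illustrative stand-ins; the real delegate types come from the Matter SDK.
enum class CameraError { Success, Busy, NotSupported };
class ChimeDelegate;
class WebRTCProviderDelegate;

class CameraDeviceInterface
{
public:
    virtual ~CameraDeviceInterface() = default;

    // Inner interface abstracting hardware-specific operations.
    class CameraHALInterface
    {
    public:
        virtual ~CameraHALInterface()                           = default;
        virtual CameraError StartVideoStream(uint16_t streamID) = 0;
        virtual CameraError StopVideoStream(uint16_t streamID)  = 0;
        virtual CameraError CaptureSnapshot(uint16_t streamID)  = 0;
    };

    // Accessors the CameraApp uses to wire delegates into the clusters.
    virtual CameraHALInterface & GetCameraHALInterface()         = 0;
    virtual ChimeDelegate & GetChimeDelegate()                   = 0;
    virtual WebRTCProviderDelegate & GetWebRTCProviderDelegate() = 0;
};
```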

### CameraDevice
**Responsibilities:**

- Implements the `CameraDeviceInterface` and `CameraDeviceInterface::CameraHALInterface` for the Linux platform.
- Manages the video device using V4L2.
- Uses GStreamer to create and manage pipelines for video and audio streaming, as well as snapshots.
- Instantiates and manages the various manager classes that implement the cluster delegate logic (e.g., `CameraAVStreamManager`, `WebRTCProviderManager`).
- Instantiates the `DefaultMediaController` and passes the encoded media frames to it.
- Maintains the state of the camera device (e.g., pan, tilt, zoom, privacy modes).
- Provides the `CameraHALInterface` implementation to the `CameraAVStreamManager`.

**Key Interactions:**

- Is instantiated in `main.cpp` and passed to `CameraAppInit` (see the wiring sketch below).
- The manager classes it contains are returned by the `Get...Delegate()` methods to the `CameraApp`.
- Its `CameraHALInterface` methods (like `StartVideoStream`, `StopVideoStream`) are called by `CameraAVStreamManager`.
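
A toy sketch of that wiring, with stand-in types (the real `main.cpp` also parses arguments and initializes the Matter stack):

```cpp
#include <memory>

// Toy stand-ins for the real classes, to show the ownership and wiring only.
struct CameraDeviceInterface { virtual ~CameraDeviceInterface() = default; };
struct CameraDevice : CameraDeviceInterface { /* Linux V4L2/GStreamer implementation */ };

void CameraAppInit(CameraDeviceInterface * device) { /* configures the cluster servers */ }

int main()
{
    // main.cpp instantiates the platform device and hands the abstract
    // interface to the platform-agnostic app layer.
    auto device = std::make_unique<CameraDevice>();
    CameraAppInit(device.get());
    // ... run the Matter event loop ...
    return 0;
}
```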

### DefaultMediaController
**Responsibilities:**

- Acts as the central distribution point for all encoded media data (video and audio).
- Receives media frames from the `CameraDevice`’s GStreamer pipeline callbacks.
- Maintains a pre-roll buffer (`PushAVPreRollBuffer`) that stores a configurable duration of recent media frames. This is crucial for event-based recording, as it allows the recording to include footage from before the event occurred.
- Manages a list of registered transports (e.g., `WebRTCTransport`, `PushAVTransport`).
- When a new media frame is received, it is pushed to the pre-roll buffer, which then multicasts the frame to all interested and registered transports.

**Key Interactions:**

- Is owned by the `CameraDevice`.
- The `WebRTCProviderManager` and `PushAvStreamTransportManager` register their transport instances with the `MediaController`.
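
The fan-out pattern can be sketched as follows; this is a toy model with illustrative types, not the SDK classes:

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <vector>

struct MediaFrame
{
    std::vector<uint8_t> data;
    int64_t ptsUs;
};

struct TransportSketch
{
    virtual ~TransportSketch() = default;
    virtual bool WantsVideo(uint16_t streamID) const = 0;
    virtual void SendVideo(const MediaFrame & frame) = 0;
};

class MediaControllerSketch
{
public:
    void RegisterTransport(TransportSketch * transport)
    {
        std::lock_guard<std::mutex> lock(mLock);
        mTransports.push_back(transport);
    }

    void DistributeVideo(uint16_t streamID, const MediaFrame & frame)
    {
        std::lock_guard<std::mutex> lock(mLock);
        mPreRoll.push_back(frame); // keep recent history for event-based recording
        if (mPreRoll.size() > kPreRollFrames)
        {
            mPreRoll.pop_front();
        }
        for (auto * transport : mTransports) // multicast to interested transports
        {
            if (transport->WantsVideo(streamID))
            {
                transport->SendVideo(frame);
            }
        }
    }

private:
    static constexpr std::size_t kPreRollFrames = 300; // e.g. ~10 s at 30 fps
    std::mutex mLock;
    std::deque<MediaFrame> mPreRoll;
    std::vector<TransportSketch *> mTransports;
};
```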

## Manager Classes and SDK Interaction
The manager classes in the camera-app are concrete implementations of the delegate interfaces defined in the Matter SDK. They act as a bridge between the generic cluster logic in the SDK and the specific hardware implementation in the camera-app.

**CameraAVStreamManager**

- **SDK Interface:** `chip::app::Clusters::CameraAvStreamManagement::CameraAVStreamManagementDelegate`
- **Responsibilities:** Implements the application-specific logic for the Camera AV Stream Management cluster. This includes:
  - Handling requests to allocate, deallocate, and modify video, audio, and snapshot streams (`VideoStreamAllocate`, `AudioStreamAllocate`, etc.).
  - Validating stream parameters against camera capabilities obtained from the `CameraDeviceHAL`.
  - Checking for resource availability (e.g., encoder slots).
  - Interacting with the `CameraDevice` (specifically its `CameraHALInterface` implementation) to start and stop the GStreamer pipelines for the various streams (e.g., calling `StartVideoStream`, `StopVideoStream`).
  - Notifying the SDK cluster of allocation/deallocation results.
- **Interaction with SDK:** The `CameraAVStreamManagementCluster` in the SDK calls the methods of the `CameraAVStreamManager` to handle the commands it receives from the Matter network. For example, when a `VideoStreamAllocate` command is received, the `CameraAVStreamManagementCluster` calls the `VideoStreamAllocate` method on the `CameraAVStreamManager`.
- **Interaction with `CameraDevice`:** It holds a `CameraDeviceInterface * mCameraDeviceHAL` pointer, set via `SetCameraDeviceHAL`. When a stream needs to be started, stopped, or modified, `CameraAVStreamManager` calls the appropriate method on `mCameraDeviceHAL->GetCameraHALInterface()`, e.g., `mCameraDeviceHAL->GetCameraHALInterface().StartVideoStream(allocatedStream)`. The sketch below illustrates this call path.
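
A toy sketch of this delegate-to-HAL call path; the types and checks are heavily simplified (the real delegate validates far more parameters):

```cpp
#include <cstdint>
#include <optional>
#include <vector>

struct VideoStreamParams
{
    uint16_t width;
    uint16_t height;
    uint16_t maxFps;
};

struct CameraHALSketch
{
    std::vector<VideoStreamParams> AvailableVideoStreams() const { return { { 1920, 1080, 30 } }; }
    bool EncoderSlotFree() const { return true; }
    void StartVideoStream(uint16_t /* streamID */) { /* build & start the GStreamer pipeline */ }
};

class AVStreamManagerSketch
{
public:
    explicit AVStreamManagerSketch(CameraHALSketch & hal) : mHAL(hal) {}

    // Called by the SDK cluster when a VideoStreamAllocate command arrives.
    std::optional<uint16_t> VideoStreamAllocate(const VideoStreamParams & req)
    {
        for (const auto & cap : mHAL.AvailableVideoStreams()) // capability check
        {
            if (req.width <= cap.width && req.height <= cap.height && mHAL.EncoderSlotFree())
            {
                return mNextStreamID++;
            }
        }
        return std::nullopt; // maps to a ResourceExhausted response
    }

    // Called by the SDK cluster once the allocation has been accepted.
    void OnVideoStreamAllocated(uint16_t streamID) { mHAL.StartVideoStream(streamID); }

private:
    CameraHALSketch & mHAL;
    uint16_t mNextStreamID = 1;
};
```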

**WebRTCProviderManager**

- **SDK Interface:** `chip::app::Clusters::WebRTCTransportProvider::Delegate`
- **Responsibilities:** Manages the lifecycle of WebRTC sessions for live streaming.
  - **Session Initiation:**
    - `HandleSolicitOffer`: When a client wants the camera to initiate the WebRTC handshake, this method creates a `WebrtcTransport`, generates an SDP Offer, and sends it to the client.
    - `HandleProvideOffer`: When a client initiates the handshake, this method processes the received SDP Offer, creates a `WebrtcTransport`, generates an SDP Answer, and sends it back.
  - **Handshake:** Manages the exchange of SDP messages and ICE candidates between the camera and the WebRTC client. It uses callbacks like `OnLocalDescription` to send the locally generated SDP, and `HandleProvideICECandidates` to process candidates from the client.
  - **Media Flow:** Once the WebRTC connection is established (`OnConnectionStateChanged(Connected)`), it registers the `WebrtcTransport` with the `DefaultMediaController` to receive and send audio/video frames.
  - **Stream Management:** Acquires and releases audio/video stream resources from `CameraAVStreamManager` using `AcquireAudioVideoStreams` and `ReleaseAudioVideoStreams`.
  - **Privacy:** Handles `LiveStreamPrivacyModeChanged` to end sessions if live stream privacy is enabled.
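
The client-offer path can be sketched like this; it is a toy model in which all types, signatures, and the SDP string are illustrative:

```cpp
#include <functional>
#include <string>

// Toy stand-in for the WebrtcTransport wrapper around a peer connection.
struct WebrtcTransportSketch
{
    void SetRemoteDescription(const std::string & sdpOffer) { /* feed the client's offer to the peer connection */ }
    void CreateAnswer(std::function<void(const std::string &)> onLocalSdp)
    {
        onLocalSdp("v=0 ..."); // in the real code this fires the OnLocalDescription callback
    }
};

void HandleProvideOfferSketch(const std::string & sdpOffer)
{
    WebrtcTransportSketch transport;
    transport.SetRemoteDescription(sdpOffer); // process the received SDP Offer
    // Acquire audio/video stream resources from CameraAVStreamManager here.
    transport.CreateAnswer([](const std::string & /* answer */) {
        // Schedule the Answer command back to the client on the Matter thread.
    });
    // After OnConnectionStateChanged(Connected) fires, register the transport
    // with the DefaultMediaController so media starts flowing.
}
```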

**PushAvStreamTransportManager**

- **SDK Interface:** `chip::app::Clusters::PushAvStreamTransport::Delegate`
- **Responsibilities:** Manages push-based AV streaming for event-based recording (e.g., CMAF).
  - **Allocation:** `AllocatePushTransport` creates a `PushAVTransport` object for a given client request. This object is configured with details like container type (CMAF), segment duration, and target streams.
  - **Registration:** The created `PushAVTransport` is registered with the `DefaultMediaController` to access media data, including the pre-roll buffer.
  - **Triggering:**
    - `ManuallyTriggerTransport`: Allows a client to force a recording.
    - `HandleZoneTrigger`: Called by `CameraDevice` when a motion zone alarm is raised. This checks which `PushAVTransport` instances are configured for that zone and initiates the recording and upload process.
  - **Bandwidth Management:** Validates that new or modified transport configurations do not exceed the camera’s maximum network bandwidth (`ValidateBandwidthLimit`; see the sketch below).
  - **Session Management:** Monitors active recording sessions and can restart them to limit maximum session duration, generating new session IDs.
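
A toy sketch of the bandwidth check; the field names and aggregation rule are illustrative:

```cpp
#include <cstdint>
#include <vector>

struct PushTransportConfig
{
    uint64_t videoBitrateBps;
    uint64_t audioBitrateBps;
};

bool ValidateBandwidthLimitSketch(const std::vector<PushTransportConfig> & active,
                                  const PushTransportConfig & candidate, uint64_t maxNetworkBandwidthBps)
{
    // Sum the candidate's bitrate with everything already allocated.
    uint64_t totalBps = candidate.videoBitrateBps + candidate.audioBitrateBps;
    for (const auto & config : active)
    {
        totalBps += config.videoBitrateBps + config.audioBitrateBps;
    }
    return totalBps <= maxNetworkBandwidthBps; // false maps to ResourceExhausted
}
```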

**ChimeManager**

- **SDK Interface:** `chip::app::Clusters::Chime::Delegate`
- **Responsibilities:** Implements the logic for the Chime cluster.
  - **Sound Management:** Provides a list of available chime sounds (`GetChimeSoundByIndex`).
  - **Playback:** The `PlayChimeSound` command handler checks if the chime is enabled and logs the intent to play the selected sound. The current Linux example does not include actual audio playback for chimes.
  - **Configuration:** Interacts with the `ChimeServer` to get the enabled state and selected chime ID.
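
A toy sketch of that behavior; the sound list and log format are illustrative:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

struct ChimeSound
{
    uint8_t id;
    std::string name;
};

class ChimeManagerSketch
{
public:
    bool GetChimeSoundByIndex(std::size_t index, ChimeSound & out) const
    {
        if (index >= mSounds.size())
        {
            return false;
        }
        out = mSounds[index];
        return true;
    }

    void PlayChimeSound(bool enabled, uint8_t selectedID) const
    {
        if (!enabled)
        {
            return; // chime disabled: nothing to do
        }
        // The Linux example only logs the intent; no audio is rendered.
        std::printf("Chime: would play sound id=%u\n", static_cast<unsigned>(selectedID));
    }

private:
    std::vector<ChimeSound> mSounds{ { 1, "Classic" }, { 2, "Doorbell" } };
};
```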

**ZoneManager**

- **SDK Interface:** `chip::app::Clusters::ZoneManagement::Delegate`
- **Responsibilities:**
  - Manages the creation, update, and removal of 2D Cartesian zones.
  - Handles the creation and management of triggers associated with zones (e.g., motion detection).
  - Receives zone event notifications from the `CameraDeviceHAL` (e.g., `OnZoneTriggeredEvent`).
  - Emits `ZoneTriggered` and `ZoneStopped` Matter events to subscribers.
  - Manages trigger logic, including initial duration, augmentation duration, max duration, and blind duration using an internal timer (see the sketch below).
- **Key Interactions:**
  - `CreateTrigger`, `UpdateTrigger`, and `RemoveTrigger` commands delegate to the `CameraHALInterface`.
  - `CameraDevice` calls `OnZoneTriggeredEvent` on this manager when the HAL detects activity in a zone.
  - `PushAvStreamTransportManager` is notified of zone triggers to start recordings.
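
One plausible reading of those duration rules, as a toy sketch; the semantics are paraphrased from the description above, and the authoritative rules live in the Zone Management cluster logic:

```cpp
#include <algorithm>
#include <cstdint>

struct TriggerTimerSketch
{
    uint32_t initialDurationS; // event length on first detection
    uint32_t augmentationS;    // extension added on each re-detection
    uint32_t maxDurationS;     // hard cap on a single triggered event
    uint32_t blindDurationS;   // quiet period after an event ends

    bool active          = false;
    uint32_t startS      = 0;
    uint32_t endS        = 0;
    uint32_t blindUntilS = 0;

    // Called on each HAL detection; returns true if ZoneTriggered should be emitted.
    bool OnMotion(uint32_t nowS)
    {
        if (nowS < blindUntilS)
        {
            return false; // still inside the blind window
        }
        if (!active)
        {
            active = true;
            startS = nowS;
            endS   = nowS + initialDurationS;
            return true;
        }
        // Re-detection extends the event, but never past maxDuration.
        endS = std::min(endS + augmentationS, startS + maxDurationS);
        return false;
    }

    // Called by the internal timer; returns true if ZoneStopped should be emitted.
    bool OnTick(uint32_t nowS)
    {
        if (!active || nowS < endS)
        {
            return false;
        }
        active      = false;
        blindUntilS = nowS + blindDurationS;
        return true;
    }
};
```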

**CameraAVSettingsUserLevelManager**

- **SDK Interface:** `chip::app::Clusters::CameraAvSettingsUserLevelManagement::Delegate`
- **Responsibilities:** Handles user-level settings, including Pan, Tilt, and Zoom.
  - **Mechanical PTZ (MPTZ):**
    - Commands like `MPTZSetPosition`, `MPTZRelativeMove`, and `MPTZMoveToPreset` are received from the SDK cluster.
    - These commands are delegated to the `CameraHALInterface` (`CameraDevice`) to interact with the physical hardware (simulated in this app).
    - The manager simulates the time taken for physical movement using a timer before responding to the command.
  - **Digital PTZ (DPTZ):**
    - `DPTZSetViewport`: Sets the digital viewport for a specific allocated video stream ID. It validates the requested viewport against the stream’s resolution, aspect ratio, and the camera sensor’s capabilities. The change is applied via `CameraHALInterface::SetViewport`.
    - `DPTZRelativeMove`: Adjusts the current viewport of a specific video stream by a delta. Calculations are done to keep the viewport within bounds and maintain the aspect ratio, as sketched below.
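
A toy sketch of the clamping arithmetic for a relative move; the field names and conventions are illustrative, and the real code also handles zoom deltas, which resize the viewport:

```cpp
#include <algorithm>
#include <cstdint>

struct Viewport
{
    int32_t x1, y1, x2, y2;
};

Viewport RelativeMoveSketch(Viewport vp, int32_t dx, int32_t dy, int32_t sensorW, int32_t sensorH)
{
    // The viewport size is unchanged, so the aspect ratio is preserved;
    // only the origin is shifted and clamped to the sensor bounds.
    const int32_t w = vp.x2 - vp.x1;
    const int32_t h = vp.y2 - vp.y1;
    const int32_t x = std::clamp(vp.x1 + dx, 0, sensorW - w);
    const int32_t y = std::clamp(vp.y1 + dy, 0, sensorH - h);
    return { x, y, x + w, y + h };
}
```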

## Interaction Diagrams
### Component Relationships
```mermaid
graph TD
    subgraph "Platform Agnostic"
        CameraApp
        CameraDeviceInterface
    end

    subgraph "Linux Platform"
        CameraDevice -- implements --> CameraDeviceInterface
        CameraDevice -- owns --> GStreamer
        CameraDevice -- uses --> V4L2
        CameraDevice -- owns --> DefaultMediaController
        CameraDevice -- owns --> CameraAVStreamManager
        CameraDevice -- owns --> WebRTCProviderManager
        CameraDevice -- owns --> PushAvStreamTransportManager
        CameraDevice -- owns --> ChimeManager
        CameraDevice -- owns --> ZoneManager
        WebRTCProviderManager --> DefaultMediaController
        PushAvStreamTransportManager --> DefaultMediaController
    end

    subgraph "Matter Clusters (SDK)"
        CameraAVStreamManagementCluster
        WebRTCTransportProviderCluster
        PushAvStreamTransportServer
        ChimeServer
        ZoneMgmtServer
    end

    Main --> CameraDevice
    Main --> CameraApp
    CameraApp -- aggregates --> CameraDeviceInterface
    CameraApp --> CameraAVStreamManagementCluster
    CameraApp --> WebRTCTransportProviderCluster
    CameraApp --> PushAvStreamTransportServer
    CameraApp --> ChimeServer
    CameraApp --> ZoneMgmtServer
    CameraAVStreamManagementCluster -- uses delegate --> CameraAVStreamManager
    WebRTCTransportProviderCluster -- uses delegate --> WebRTCProviderManager
    PushAvStreamTransportServer -- uses delegate --> PushAvStreamTransportManager
    ChimeServer -- uses delegate --> ChimeManager
    ZoneMgmtServer -- uses delegate --> ZoneManager
    CameraAVStreamManager -- calls HAL --> CameraDevice
    ZoneManager -- calls HAL --> CameraDevice
```

### Video Stream Allocation Sequence
```mermaid
sequenceDiagram
    participant Client
    participant CameraAVStreamManagementCluster
    participant Delegate as CameraAVStreamManager
    participant HAL as CameraDevice

    Client ->> CameraAVStreamManagementCluster: VideoStreamAllocate Request
    CameraAVStreamManagementCluster ->> Delegate: VideoStreamAllocate()
    Delegate ->> HAL: GetAvailableVideoStreams()
    HAL -->> Delegate: List of streams
    Delegate ->> Delegate: Find compatible stream
    Delegate ->> HAL: IsResourceAvailable()
    HAL -->> Delegate: Yes/No
    alt Resources Available
        Delegate -->> CameraAVStreamManagementCluster: Success, streamID
        CameraAVStreamManagementCluster ->> Delegate: OnVideoStreamAllocated()
        Delegate ->> HAL: StartVideoStream(streamID)
        HAL ->> HAL: Configure & Start GStreamer Pipeline
    else Resources NOT Available
        Delegate -->> CameraAVStreamManagementCluster: ResourceExhausted
    end
    CameraAVStreamManagementCluster -->> Client: VideoStreamAllocate Response
```

### Push AV Transport Allocation Sequence
```mermaid
sequenceDiagram
    participant Client
    participant SDKCluster as PushAvStreamTransportServer
    participant Delegate as PushAvStreamTransportManager
    participant MediaController
    participant CameraAVStreamManager

    Client ->> SDKCluster: AllocatePushTransport Request
    SDKCluster ->> Delegate: AllocatePushTransport()
    Delegate ->> CameraAVStreamManager: GetBandwidthForStreams()
    CameraAVStreamManager -->> Delegate: Bandwidth
    Delegate ->> Delegate: ValidateBandwidthLimit()
    alt Bandwidth OK
        Delegate ->> Delegate: Create PushAVTransport instance
        Delegate ->> MediaController: RegisterTransport(PushAVTransport, videoID, audioID)
        MediaController ->> MediaController: Add to transport list
        Delegate ->> MediaController: SetPreRollLength()
        Delegate -->> SDKCluster: Success
    else Bandwidth Exceeded
        Delegate -->> SDKCluster: ResourceExhausted
    end
    SDKCluster -->> Client: AllocatePushTransport Response
```

### WebRTC Livestream Setup Sequence (Client Offer)
```mermaid
sequenceDiagram
    participant Client
    participant SDKCluster as WebRTCTransportProviderCluster
    participant Delegate as WebRTCProviderManager
    participant WebrtcTransport
    participant MediaController
    participant CameraAVStreamManager

    Client ->> SDKCluster: ProvideOffer Request (SDP Offer)
    SDKCluster ->> Delegate: HandleProvideOffer()
    Delegate ->> Delegate: Create WebrtcTransport
    Delegate ->> WebrtcTransport: SetRemoteDescription(Offer)
    Delegate ->> CameraAVStreamManager: AcquireAudioVideoStreams()
    CameraAVStreamManager -->> Delegate: Success
    Delegate ->> WebrtcTransport: CreateAnswer()
    WebrtcTransport -->> Delegate: OnLocalDescription(SDP Answer)
    Delegate ->> Delegate: ScheduleAnswerSend()
    SDKCluster -->> Client: ProvideOffer Response
    Delegate -->> Client: Answer Command (SDP Answer)
    Client ->> SDKCluster: ProvideICECandidates Request
    SDKCluster ->> Delegate: HandleProvideICECandidates()
    Delegate ->> WebrtcTransport: AddRemoteCandidate()
    Note right of Delegate: Meanwhile...
    WebrtcTransport -->> Delegate: OnICECandidate (Local Candidates)
    Delegate ->> Delegate: ScheduleICECandidatesSend()
    Delegate -->> Client: ICECandidates Command
    Note over Client, WebrtcTransport: ICE Connectivity Establishment
    WebrtcTransport -->> Delegate: OnConnectionStateChanged(Connected)
    Delegate ->> MediaController: RegisterTransport(WebrtcTransport, videoID, audioID)
    Note over Client, MediaController: Live Stream Starts
```

### High-Level Data Flow
```mermaid
graph TD
    subgraph "Camera Hardware"
        CameraSensor
    end
    subgraph "Linux Kernel"
        V4L2_Driver
    end
    subgraph "Userspace (Camera App)"
        GStreamer_Pipeline
        CameraDevice
        DefaultMediaController
        WebRTCTransport
        PushAVTransport
    end
    subgraph "Matter Network"
        ClientDevice
    end

    CameraSensor --> V4L2_Driver
    V4L2_Driver --> GStreamer_Pipeline
    GStreamer_Pipeline -- Encoded Data --> CameraDevice
    CameraDevice -- "Encoded Data (H.264/Opus)" --> DefaultMediaController
    DefaultMediaController --> WebRTCTransport
    DefaultMediaController --> PushAVTransport
    WebRTCTransport --> ClientDevice
    PushAVTransport --> ClientDevice
```

## Data Flow
### Video Streaming
1. A Matter client requests a video stream from the `CameraAVStreamManagement` cluster.
2. The `CameraAVStreamManagementCluster` (in the SDK) receives the request and calls the `VideoStreamAllocate` method on its delegate, the `CameraAVStreamManager`.
3. The `CameraAVStreamManager` validates the request, checking for compatible stream configurations and available resources by querying the `CameraDevice` (HAL).
4. If a stream can be allocated, `CameraAVStreamManager` updates the stream state.
5. The SDK server informs the `CameraAVStreamManager` via `OnVideoStreamAllocated`. `CameraAVStreamManager` calls `StartVideoStream` on the `CameraDevice`’s `CameraHALInterface`.
6. The `CameraDevice` creates and starts a GStreamer pipeline to handle the video stream. The pipeline is configured to:
   - Read raw video frames from the V4L2 device (`v4l2src`).
   - Convert the video frames to the I420 format (`videoconvert`).
   - Encode the frames to H.264 (`x264enc`).
   - Send the encoded frames to an `appsink`.
7. The `appsink` has a callback function (`OnNewVideoSampleFromAppSink`) that is called for each new frame (see the callback sketch below).
8. In the callback, the encoded H.264 data is retrieved and passed to the `DefaultMediaController`.
9. The `DefaultMediaController` pushes the frame to its pre-roll buffer, which then distributes it to all registered transports.
10. For a live stream, the `WebRTCTransport` receives the frame and sends it over the established WebRTC connection to the client.
11. For an event-based recording, the `PushAVTransport` receives the frame and includes it in the recording that is pushed to the client.
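
The GStreamer-to-application bridge at step 7 looks roughly like this. It is a sketch in the style of `OnNewVideoSampleFromAppSink`; the media-controller hand-off is shown as a comment because its exact signature is app-specific:

```cpp
#include <gst/app/gstappsink.h>
#include <gst/gst.h>

// Sketch of an appsink "new-sample" callback delivering encoded frames.
GstFlowReturn OnNewVideoSample(GstAppSink * sink, gpointer /* userData */)
{
    GstSample * sample = gst_app_sink_pull_sample(sink);
    if (sample == nullptr)
    {
        return GST_FLOW_ERROR;
    }

    GstBuffer * buffer = gst_sample_get_buffer(sample);
    GstMapInfo map;
    if (gst_buffer_map(buffer, &map, GST_MAP_READ))
    {
        // map.data / map.size hold one encoded H.264 access unit; hand it
        // to the media controller for fan-out, e.g.:
        // mediaController->DistributeVideo(map.data, map.size, streamID);
        gst_buffer_unmap(buffer, &map);
    }
    gst_sample_unref(sample);
    return GST_FLOW_OK;
}
```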

### Snapshot
1. A Matter client requests a snapshot from the `CameraAVStreamManagement` cluster.
2. The `CameraAVStreamManagementCluster` receives the request and calls the `CaptureSnapshot` method on the `CameraAVStreamManager`.
3. `CameraAVStreamManager` delegates the call to `CameraDevice::CaptureSnapshot`.
4. The `CameraDevice` (if a snapshot stream is not running) creates a GStreamer pipeline to capture a single frame. The pipeline is configured to:
   - Read a frame from the V4L2 device (`v4l2src` or `libcamerasrc`).
   - Encode the frame as a JPEG (`jpegenc`).
   - Save the JPEG to a file (`multifilesink`).
5. The `CameraDevice` then reads the JPEG file from disk and sends the data back to the client as the response to the snapshot request.
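
For illustration, an on-demand snapshot pipeline along these lines could be built with `gst_parse_launch()`. The device path, caps, and output location here are illustrative; the app writes to `SNAPSHOT_FILE_PATH`:

```cpp
#include <gst/gst.h>

// Sketch of a single-shot JPEG capture pipeline (illustrative properties).
GstElement * CreateSnapshotPipelineSketch()
{
    GError * error        = nullptr;
    GstElement * pipeline = gst_parse_launch(
        "v4l2src device=/dev/video0 num-buffers=1 "
        "! video/x-raw,width=1280,height=720 "
        "! videoconvert ! jpegenc "
        "! multifilesink location=/tmp/snapshot.jpg",
        &error);
    if (pipeline == nullptr)
    {
        g_clear_error(&error);
        return nullptr;
    }
    // Run the pipeline; it stops after one buffer (num-buffers=1).
    gst_element_set_state(pipeline, GST_STATE_PLAYING);
    return pipeline;
}
```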

## GStreamer Integration
GStreamer is used extensively in the `CameraDevice` to handle all media
processing. The `CameraDevice` class contains helper methods
(`CreateVideoPipeline`, `CreateAudioPipeline`, `CreateSnapshotPipeline`,
`CreateAudioPlaybackPipeline`) that construct these GStreamer pipelines.
Pipelines are dynamically created, started, and stopped based on requests from
the Matter clusters, as orchestrated by the various managers (especially
`CameraAVStreamManager`).

**Video Streaming (`CreateVideoPipeline`):**

- **Source:** `v4l2src` (or `videotestsrc` for testing) to capture from the camera device.
- **Caps Negotiation:** `capsfilter` to set resolution and framerate.
- **Format Conversion:** `videoconvert` to ensure the format is suitable for the encoder (e.g., I420).
- **Encoding:** `x264enc` for H.264 encoding.
- **Sink:** `appsink` with the `OnNewVideoSampleFromAppSink` callback. This callback receives the encoded H.264 buffers and passes them to `DefaultMediaController::DistributeVideo`.
- **Lifecycle:** Started by `CameraDevice::StartVideoStream`, stopped by `CameraDevice::StopVideoStream`. A construction sketch follows this list.
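
For illustration, such a pipeline could be assembled with `gst_parse_launch()` and hooked to a callback like the one sketched in the Data Flow section. The element properties, names, and zero-latency tuning here are illustrative, not the app's exact pipeline string:

```cpp
#include <gst/app/gstappsink.h>
#include <gst/gst.h>

extern GstFlowReturn OnNewVideoSample(GstAppSink * sink, gpointer userData); // see the earlier sketch

// Sketch of building the H.264 streaming pipeline and attaching the appsink callback.
GstElement * CreateVideoPipelineSketch()
{
    GError * error        = nullptr;
    GstElement * pipeline = gst_parse_launch(
        "v4l2src device=/dev/video0 "
        "! video/x-raw,width=1920,height=1080,framerate=30/1 "
        "! videoconvert ! video/x-raw,format=I420 "
        "! x264enc tune=zerolatency "
        "! appsink name=videosink",
        &error);
    if (pipeline == nullptr)
    {
        g_clear_error(&error);
        return nullptr;
    }

    GstElement * sink             = gst_bin_get_by_name(GST_BIN(pipeline), "videosink");
    GstAppSinkCallbacks callbacks = {};
    callbacks.new_sample          = OnNewVideoSample; // delivers encoded buffers to the app
    gst_app_sink_set_callbacks(GST_APP_SINK(sink), &callbacks, nullptr, nullptr);
    gst_object_unref(sink);

    gst_element_set_state(pipeline, GST_STATE_PLAYING);
    return pipeline;
}
```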

**Audio Streaming (`CreateAudioPipeline`):**

- **Source:** `pulsesrc` (or `audiotestsrc` for testing).
- **Caps Negotiation:** `capsfilter` to set the sample rate and channels.
- **Format Conversion:** `audioconvert` and `audioresample`.
- **Encoding:** `opusenc` for Opus encoding.
- **Sink:** `appsink` with the `OnNewAudioSampleFromAppSink` callback, which passes encoded Opus buffers to `DefaultMediaController::DistributeAudio`.
- **Lifecycle:** Started by `CameraDevice::StartAudioStream`, stopped by `CameraDevice::StopAudioStream`.

**Snapshots (`CreateSnapshotPipeline`):**

- **Source:** `v4l2src` or `libcamerasrc`, depending on the camera type.
- **Caps Negotiation:** `capsfilter` for resolution/format.
- **Encoding:** `jpegenc` to create a JPEG image.
- **Sink:** `multifilesink` to save the JPEG to a temporary file (`SNAPSHOT_FILE_PATH`). The `CameraDevice::CaptureSnapshot` method then reads this file.
- **Lifecycle:** Created on demand when `CameraDevice::CaptureSnapshot` is called and a snapshot stream is active.

**Audio Playback (`CreateAudioPlaybackPipeline`):**

- **Source:** `udpsrc` to receive RTP Opus packets from the network (e.g., from a WebRTC session).
- **Jitter Buffer:** `rtpjitterbuffer` to handle network jitter.
- **Depacketization:** `rtpopusdepay` to extract Opus frames from RTP.
- **Decoding:** `opusdec` to decode Opus audio.
- **Output:** `audioconvert`, `audioresample`, and `autoaudiosink` to play the audio on the device speakers.
- **Lifecycle:** Started by `CameraDevice::StartAudioPlaybackStream`, stopped by `CameraDevice::StopAudioPlaybackStream`.

The state of these pipelines (e.g., `NULL`, `READY`, `PLAYING`) is managed using
`gst_element_set_state`. Callbacks on `appsink` elements are crucial for linking
GStreamer’s data flow to the Matter application logic.