Fix WHEP playback: track retention, vanilla-ICE offer, RTP depacketization, playout pacing #1919
Open
mkulaczkowski wants to merge 1 commit into
…n, playout pacing
Fixes a chain of issues that prevented HTTP (WHEP) playback from ever
delivering a frame, verified end-to-end against MediaMTX (0 frames -> ~30fps):
- HTTPSession: retain playback tracks. RTCTrack.deinit calls rtcDeleteTrack,
so discarding the addTrack(...) result deleted the native tracks before
negotiation ("No DataChannel or Track to negotiate" -> connect always threw).
- HTTPSession: libdatachannel's rtcCreateOffer requires disableAutoNegotiation
and returns RTC_ERR_FAILURE under the default config. Use the canonical flow:
setLocalDescription, wait for ICE gathering to complete, then read
rtcGetLocalDescription - the offer then also carries the gathered candidates,
which a non-trickle client needs for the server to reach it.
- HTTPSession: throw on non-2xx WHEP/WHIP responses instead of feeding error
bodies into setRemoteDescription ("Remote description has no ICE user
fragment").
- HTTPSession: retry playback with a video-only offer when the server rejects
the audio m-line (e.g. MediaMTX "codecs not supported by client" for any
stream without Opus audio). Opus streams keep audio+video.
- HTTPSession: convert URL userinfo into an HTTP Basic Authorization header
(URLSession does not transmit userinfo; MediaMTX authenticates WHIP this way).
- RTCPeerConnection: offer multiple H264 profile variants (42e01f/42c01f/
42001f/4d001f/64001f) like browsers do; a single hardcoded profile is
rejected for streams whose profile bytes differ. Adds
currentLocalDescription() wrapping rtcGetLocalDescription.
- RTPJitterBuffer: prime the expected sequence from the first packet (RTP
sequence numbers start at a random value per RFC 3550), drop late packets
wrap-aware, and jump over gaps once the reorder window fills. The previous
advance-by-one stale handling let expectedSequence run past the live
sequence after the first loss, permanently stalling delivery.
- RTPH264Packetizer: implement STAP-A (RFC 6184 5.7.1) - SPS/PPS commonly
arrive aggregated; without it no frame ever decodes. Accumulate FU-A NAL
units into full access units and emit on the RTP marker (multi-slice frames
were emitted per-slice -> kVTVideoDecoderBadDataErr). Build the AVCC buffer
forward from parsed units; the in-place start-code rewrite corrupted access
units whose NAL lengths fall in 256-511 (a written length 00 00 01 xx
re-matches as a start code).
- MediaLink: use the audio clock for playout pacing only while it is
advancing; an attached but silent AudioPlayerNode reports currentTime == 0
forever, pinning video at the first frame for video-only streams.
- VTDecompressionSession: log decode failures (throttled) instead of silently
dropping them.
- DisplayLinkChoreographer (macOS): fall back to the display refresh period
when frameInterval is 0 so elapsed-time consumers advance.
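The RTPJitterBuffer item above depends on comparing 16-bit sequence numbers across the wrap point. The following is a minimal sketch of that idea, not the PR's implementation: `sequenceDistance`, `SequenceTracker`, and `Verdict` are hypothetical names, and the reorder-window bookkeeping is reduced to an explicit `skip(to:)` call.

```swift
/// Signed distance from `reference` to `sequence` in 16-bit sequence space.
/// Wrapping subtraction followed by a bit-pattern cast keeps the result
/// correct across the 65535 -> 0 wrap.
func sequenceDistance(_ sequence: UInt16, from reference: UInt16) -> Int16 {
    Int16(bitPattern: sequence &- reference)
}

/// Hypothetical tracker illustrating "prime from the first packet" and
/// "drop late packets wrap-aware". Not the PR's RTPJitterBuffer.
struct SequenceTracker {
    enum Verdict: Equatable {
        case deliver  // in order: hand to the depacketizer
        case late     // older than expected: drop
        case hold     // ahead of expected: keep in the reorder window
    }

    private(set) var expected: UInt16?

    mutating func classify(_ sequence: UInt16) -> Verdict {
        guard let expected else {
            // RTP sequence numbers start at a random value, so the first
            // packet seen defines the starting point.
            self.expected = sequence &+ 1
            return .deliver
        }
        let distance = sequenceDistance(sequence, from: expected)
        if distance < 0 { return .late }
        if distance == 0 {
            self.expected = expected &+ 1
            return .deliver
        }
        return .hold
    }

    /// Jump over a gap, e.g. once the reorder window is full (assumed policy).
    mutating func skip(to sequence: UInt16) {
        expected = sequence
    }
}
```

With this shape, `expected` never moves on a late packet, so it cannot run past the live sequence the way the advance-by-one handling did.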
Summary
WHEP playback via HTTPSession currently cannot deliver a frame: connect() always throws, and even with negotiation fixed, the RTP→decode pipeline drops everything. This PR fixes the full chain, verified end-to-end against a live MediaMTX server (0 frames → sustained ~30 fps on macOS and iOS).
Fixes (in pipeline order)
- RTCTrack.deinit calls rtcDeleteTrack, and connect(.playback) discarded the addTrack(...) results; libdatachannel then threw "No DataChannel or Track to negotiate" on setLocalDescription, so connect always failed. The tracks are now retained for the session lifetime.
- rtcCreateOffer is invalid under the default auto-negotiation config (returns RTC_ERR_FAILURE; rtc.h marks it "for specific use cases only"). Replaced with the canonical flow: setLocalDescription → wait for ICE gathering to complete → read rtcGetLocalDescription. The offer then also contains the gathered candidates, which this non-trickle client (no PATCH) requires; otherwise the server may never reach it.
- requestOffer ignored the status code, so e.g. MediaMTX's JSON error bodies went into setRemoteDescription ("Remote description has no ICE user fragment"). Non-2xx now throws with status + body.
- URL userinfo (https://user:pass@host/...) is converted into an Authorization: Basic header (URLSession does not transmit userinfo; MediaMTX authenticates WHIP publishing this way).
- The H264 offer advertised a single hardcoded profile, 42e01f only; streams with different profile bytes (plain/constrained baseline from x264, main, high) were rejected. The offer now lists multiple profile variants like browsers do; the server's answer picks one and the depacketizer follows the negotiated description.
- RTPJitterBuffer could never start, then wedged permanently. It expected the first RTP sequence to be 0 (it's random per RFC 3550 §5.1), and its advance-by-one stale handling let expectedSequence run past the live sequence after the first loss/reorder, after which nothing matched again. It now primes from the first packet, drops late packets wrap-aware, and jumps over gaps once the reorder window fills.
- Multi-slice frames were emitted per slice, producing kVTVideoDecoderBadDataErr on every frame. NAL units now accumulate into the access unit and emit on the RTP marker.
- The in-place start-code rewrite corrupted access units whose NAL lengths fall in 256-511: the written length is 00 00 01 xx, which toNALFileFormat's continuing reverse scan re-matched as a start code. The AVCC buffer is now built forward from the parsed NAL units.
- MediaLink pacing preferred audioPlayer.currentTime, which is 0 forever for an attached-but-idle AudioPlayerNode; the audio clock is now used only while it advances.
- Decode failures were silently dropped (guard let imageBuffer else { return } discarded the status). They're now logged (throttled). Also: the macOS DisplayLink reported a zero frame interval with the default preferredFramesPerSecond = 0, so elapsed-time consumers never advanced; it falls back to the display refresh period.
Verification
- HTTPSessionFactory → .playback against MediaMTX (H.264 + Opus via RTSP ingest, and H.264 + AAC / video-only variants): SDP negotiation, ICE connect, sustained decoded frames; the audio-fallback matrix behaves as described.
- (… .publish mode uses the same negotiation flow).
- swift build clean on this branch.
Happy to split this into smaller PRs if preferred.
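For reviewers, here is a rough sketch of the two depacketization steps named in the commit message: splitting a STAP-A payload (RFC 6184 5.7.1) and building the AVCC buffer forward from parsed NAL units. It is an illustration under stated assumptions, not the code in this branch: `splitSTAPA` and `makeAVCC` are hypothetical names, and a 4-byte big-endian length prefix is assumed.

```swift
/// Splits a STAP-A payload into its aggregated NAL units.
/// Layout: one STAP-A header byte (NAL type 24), then repeated
/// [16-bit big-endian size][NAL unit] entries.
func splitSTAPA(_ payload: [UInt8]) -> [[UInt8]] {
    var units: [[UInt8]] = []
    guard let header = payload.first, header & 0x1F == 24 else {
        return units  // not a STAP-A packet
    }
    var offset = 1
    while offset + 2 <= payload.count {
        let size = (Int(payload[offset]) << 8) | Int(payload[offset + 1])
        offset += 2
        // Stop on an empty or truncated entry rather than reading past the end.
        guard size > 0, offset + size <= payload.count else { break }
        units.append(Array(payload[offset..<(offset + size)]))
        offset += size
    }
    return units
}

/// Builds an AVCC buffer forward: each NAL unit is prefixed with its length
/// (assumed here to be 4 bytes, big-endian). Nothing is rewritten in place,
/// so a length such as 00 00 01 xx is never rescanned as a start code.
func makeAVCC(_ units: [[UInt8]]) -> [UInt8] {
    var buffer: [UInt8] = []
    buffer.reserveCapacity(units.reduce(0) { $0 + $1.count + 4 })
    for unit in units {
        let length = UInt32(unit.count)
        buffer.append(UInt8(truncatingIfNeeded: length >> 24))
        buffer.append(UInt8(truncatingIfNeeded: length >> 16))
        buffer.append(UInt8(truncatingIfNeeded: length >> 8))
        buffer.append(UInt8(truncatingIfNeeded: length))
        buffer.append(contentsOf: unit)
    }
    return buffer
}
```

A 257-byte NAL unit gets the prefix 00 00 01 01 here, which is the pattern that tripped the reverse start-code scan. Because the buffer is only appended to, the prefix is never rescanned.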