Rtp-Rtcp
Table of Contents
- What It is?
- RTP Header Fields
- Rtcp SR message
- Rtcp RR message
- Rtcp SDES message
- Rtcp BYE message
- Rtcp APP message
- Rtcp Feedback(FB) message
- Rtcp Extended Reports (XR) Packet
- How to generate it & parse it?
- What is ntp? How to represent it?
- How to multiplex RTP packets and RTCP packets on a single port
- How to packetize an H264 bit stream?
- How to set abs-send-time?
- How to parse Rtp packet header?
- How to work around rtp timestamp wrap around
- How to calculate RTT for media sender&media receiver?
- How to calculate jitter for RR?
- How to know whether a packet is a retransmitted one or not if RTX is not enabled?
- How to know whether a packet is arriving in-order or not?
- How to generate RR rtcp packet?
- How to generate SR rtcp packet?
- How to generate NACK rtcp fb packet?
- How to generate a compound rtcp packet?
- How to generate a compound rtcp feedback packet?
- Why reduced-size rtcp packets [a=rtcp-rsize]?
- How to map from rtp timestamp to it's ntp timestamp?
- When to send it?
- Reference
What It is?
RTP Header Fields
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X| CC |M| PT | sequence number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| timestamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| synchronization source (SSRC) identifier |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| contributing source (CSRC) identifiers |
| .... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| defined by profile | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| header extension |
| .... |
eg: abs-send-time
a=extmap:3 http://www.webrtc.org/experiments/rtp-hdrext/abs-send-time
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| 0xBEDE | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ID=3 | Len=2 | 6.18 fixed point |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| other header extensions |
| .... |
Padding packets can be used to probe bandwidth:
The proposed algorithm aims at keeping the queue as small as possible without underutilizing the link. To achieve this goal the algorithm has to probe for the available bandwidth increasing its sending rate until a positive queuing delay variation is detected. At this point the sending rate is reduced. For this reason we argue that some queuing delay has to be induced in order to run the delay-based congestion control.
The sender sets the target sending bitrate A equal to the minimum between Ar and As. The target bitrate A is fed both to the Pacer and to the Encoder. The Encoder strives to produce a bitrate as close to this target as possible. However, the encoder cannot change the rate as frequently as the Pacer’s rate to avoid video quality flickering which is known to have an adverse impact on the QoE. For this reason the encoder may not be able to produce a rate precisely matching the target A. In the case the encoder produces a rate higher than the target A, the Pacer is allowed to drain its queue at a rate f · A, where f is the Pacing factor equal to 1.5. In this way the pacer can quickly empty its queue avoiding queuing delays at the sender.
On the other hand, whenever the Pacer produces a rate lower than the target A, padding or FEC might be added. As a result of this mechanism, an average sending bitrate equal to A is produced.
is the receiving rate measured in the last 500ms and is upper bounded by .
Rtcp SR message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
header |V=2|P| RC | PT=SR=200 | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of sender |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
sender | NTP timestamp, most significant word |
info +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| NTP timestamp, least significant word |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RTP timestamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| sender's packet count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| sender's octet count |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
report | SSRC_1 (SSRC of first source) |
block +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1 | fraction lost | cumulative number of packets lost |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| extended highest sequence number received |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| interarrival jitter |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| last SR (LSR) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| delay since last SR (DLSR) |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
report | SSRC_2 (SSRC of second source) |
block +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2 : ... :
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| profile-specific extensions |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Rtcp RR message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
header |V=2|P| RC | PT=RR=201 | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of packet sender |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
report | SSRC_1 (SSRC of first source) |
block +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1 | fraction lost | cumulative number of packets lost |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| extended highest sequence number received |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| interarrival jitter |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| last SR (LSR) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| delay since last SR (DLSR) |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
report | SSRC_2 (SSRC of second source) |
block +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2 : ... :
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| profile-specific extensions |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Rtcp SDES message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
header |V=2|P| SC | PT=SDES=202 | length |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
chunk | SSRC/CSRC_1 |
1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SDES items |
| ... |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
chunk | SSRC/CSRC_2 |
2 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SDES items |
| ... |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
eg: CNAME
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| CNAME=1 | length | user and domain name ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Rtcp BYE message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| SC | PT=BYE=203 | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC/CSRC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: ... :
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
(opt) | length | reason for leaving ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Rtcp APP message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| subtype | PT=APP=204 | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC/CSRC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| name (ASCII) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| application-dependent data ...
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Rtcp Feedback(FB) message
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| FMT | PT | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of packet sender |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of media source |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: Feedback Control Information (FCI) :
: :
Figure 3: Common Packet Format for Feedback Messages
Payload type (PT): 8 bits
This is the RTCP packet type that identifies the packet as being
an RTCP FB message. Two values are defined by the IANA:
Name | Value | Brief Description
----------+-------+------------------------------------
RTPFB | 205 | Transport layer FB message
PSFB | 206 | Payload-specific FB message
eg:
NACK
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| FMT=1 | PT=RTPFB | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of packet sender |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of media source |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| PID | BLP |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
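In the NACK FCI, PID is the sequence number of one lost packet, and bit i of BLP (counting the least significant bit as i = 0) marks packet PID + i + 1 as lost as well. A minimal sketch of expanding one PID/BLP pair into sequence numbers follows; the function name `ExpandNack` is hypothetical and not taken from any library.

```cpp
#include <cstdint>
#include <vector>

// Expands one NACK FCI entry (PID + BLP) into the list of lost sequence numbers.
// Results are cast to uint16_t so that sequence numbers wrap at 65536.
std::vector<uint16_t> ExpandNack(uint16_t pid, uint16_t blp) {
  std::vector<uint16_t> lost;
  lost.push_back(pid);  // PID itself is always reported as lost
  for (int i = 0; i < 16; ++i) {
    if (blp & (1u << i)) {
      lost.push_back(static_cast<uint16_t>(pid + i + 1));
    }
  }
  return lost;
}
```

For example, PID = 100 with BLP = 0x0005 reports packets 100, 101 and 103.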
Picture Loss Indication (PLI)
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| FMT=1| PT=PSFB | length=2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of packet sender |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of media source |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
REMB
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| FMT=15 | PT=PSFB | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of packet sender |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of media source |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Unique identifier 'R' 'E' 'M' 'B' |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Num SSRC | BR Exp | BR Mantissa |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC feedback |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ... |
bitrate = Mantissa[18]*2^Exp[6];
You can see an example implementation of REMB looking at chromium code: https://cs.chromium.org/chromium/src/third_party/webrtc/modules/rtp_rtcp/source/rtcp_packet/remb.cc?l=42
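The formula bitrate = Mantissa * 2^Exp can be sketched as a pair of helpers that pack and unpack the 32-bit word "Num SSRC | BR Exp | BR Mantissa". This is a rough illustration only; `RembInfo`, `DecodeRembWord` and `EncodeRembWord` are hypothetical names, not the Chromium API.

```cpp
#include <cstdint>

struct RembInfo {
  uint8_t num_ssrc;
  uint64_t bitrate_bps;
};

// Unpacks Num SSRC (8 bits), BR Exp (6 bits) and BR Mantissa (18 bits).
// A production parser should also reject exponents that push the
// mantissa past 64 bits; this sketch does not check for that.
RembInfo DecodeRembWord(uint32_t word) {
  RembInfo info;
  info.num_ssrc = static_cast<uint8_t>(word >> 24);
  const uint32_t exp = (word >> 18) & 0x3F;
  const uint64_t mantissa = word & 0x3FFFF;
  info.bitrate_bps = mantissa << exp;  // mantissa * 2^exp
  return info;
}

// Picks the smallest exponent for which the mantissa fits in 18 bits.
// Low-order bits that do not fit are dropped.
uint32_t EncodeRembWord(uint8_t num_ssrc, uint64_t bitrate_bps) {
  uint32_t exp = 0;
  while ((bitrate_bps >> exp) > 0x3FFFF) {
    ++exp;
  }
  const uint32_t mantissa = static_cast<uint32_t>(bitrate_bps >> exp);
  return (static_cast<uint32_t>(num_ssrc) << 24) | (exp << 18) | mantissa;
}
```

With a bitrate of 1000000 bps the encoder chooses Exp = 2 and Mantissa = 250000, which decodes back to exactly 1000000.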
Rtcp Extended Reports (XR) Packet
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|reserved | PT=XR=207 | length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: report blocks :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
An extended report block has the following format:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| BT | type-specific | block length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
: type-specific block contents :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
eg:
Receiver Reference Time Report Block
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| BT=4 | reserved | block length = 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| NTP timestamp, most significant word |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| NTP timestamp, least significant word |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
DLRR Report Block
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| BT=5 | reserved | block length |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| SSRC_1 (SSRC of first receiver) | sub-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
| last RR (LRR) | 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| delay since last RR (DLRR) |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| SSRC_2 (SSRC of second receiver) | sub-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
: ... : 2
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
How to generate it & parse it?
What is ntp? How to represent it?
Wallclock time (absolute date and time) is represented using the timestamp format of the Network Time Protocol (NTP), which is in seconds relative to 0h UTC on 1 January 1900. The full resolution NTP timestamp is a 64-bit unsigned fixed-point number with the integer part in the first 32 bits and the fractional part in the last 32 bits. In some fields where a more compact representation is appropriate, only the middle 32 bits are used; that is, the low 16 bits of the integer part and the high 16 bits of the fractional part. The high 16 bits of the integer part must be determined independently.
class NtpTime {
public:
static constexpr uint64_t kFractionsPerSecond = 0x100000000;
NtpTime() : value_(0) {}
explicit NtpTime(uint64_t value) : value_(value) {}
NtpTime(uint32_t seconds, uint32_t fractions)
: value_(seconds * kFractionsPerSecond + fractions) {}
NtpTime(const NtpTime&) = default;
NtpTime& operator=(const NtpTime&) = default;
explicit operator uint64_t() const { return value_; }
void Set(uint32_t seconds, uint32_t fractions) {
value_ = seconds * kFractionsPerSecond + fractions;
}
void Reset() { value_ = 0; }
int64_t ToMs() const {
static constexpr double kNtpFracPerMs = 4.294967296E6; // 2^32 / 1000.
const double frac_ms = static_cast<double>(fractions()) / kNtpFracPerMs;
return 1000 * static_cast<int64_t>(seconds()) +
static_cast<int64_t>(frac_ms + 0.5);
}
// NTP standard (RFC1305, section 3.1) explicitly state value 0 is invalid.
bool Valid() const { return value_ != 0; }
uint32_t seconds() const {
return rtc::dchecked_cast<uint32_t>(value_ / kFractionsPerSecond);
}
uint32_t fractions() const {
return rtc::dchecked_cast<uint32_t>(value_ % kFractionsPerSecond);
}
private:
uint64_t value_;
};
Unix epoch: the time point 1970-01-01 00:00:00 UTC
NTP epoch: the time point 1900-01-01 00:00:00 UTC
kUnixEpoch2NtpEpochDiff: 2208988800 seconds (the distance from the NTP epoch to the Unix epoch)
kNtpFractionsPerSecond: 0x100000000 = 2^32 = 4.294967296E+9
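As a rough sketch of how these constants are used, the helpers below convert a Unix time in milliseconds to a 64-bit NTP timestamp and extract the compact "middle 32 bits" form. The names are hypothetical; the offset 2208988800 is the number of seconds between 1900-01-01 and 1970-01-01. The sketch assumes a non-negative input and ignores the NTP era rollover in 2036.

```cpp
#include <cstdint>

constexpr uint64_t kUnixToNtpEpochSeconds = 2208988800ULL;  // 1900 -> 1970
constexpr uint64_t kFractionsPerSecond = 0x100000000ULL;    // 2^32

// Converts Unix time in milliseconds to a 32.32 fixed-point NTP timestamp.
uint64_t UnixMsToNtp(int64_t unix_ms) {
  const uint64_t seconds =
      static_cast<uint64_t>(unix_ms / 1000) + kUnixToNtpEpochSeconds;
  const uint64_t ms = static_cast<uint64_t>(unix_ms % 1000);
  const uint64_t fractions = ms * kFractionsPerSecond / 1000;
  return (seconds << 32) | fractions;
}

// Compact form: low 16 bits of the seconds and high 16 bits of the fraction.
uint32_t CompactNtp(uint64_t ntp) {
  return static_cast<uint32_t>(ntp >> 16);
}
```

The compact form is what the LSR and DLSR fields of a report block carry.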
Unsigned encoding number
eg:
Two’s complement encoding number
eg:
In a computer, negative numbers are represented in two's complement. One property of two's complement numbers is that arithmetic on positive and negative numbers is identical, because "subtracting x from y equals adding -x to y", and it is easy for the CPU to get -x from x: invert every bit of x, then add one.
The CPU has a full-adder circuit, which is combined with other logic circuits to do the calculation.
e.g.
In a 3-bit binary number system:
5 = 101(2), 3 = 011(2). 5 + 3 = 2^3, so 5 is the complement of 3, and 3 is the complement of 5.
Subtracting a number is equal to adding its complement, so:
(7-3) mod 8 = (7+5) mod 8 = 100(2) = 4
(7-5) mod 8 = (7+3) mod 8 = 010(2) = 2.
In a 3-bit two's complement encoding, the highest bit is the sign bit: 1 marks a negative number and 0 a non-negative one. So the bit pattern 101(2) (unsigned 5) encodes -3, 111(2) (unsigned 7) encodes -1, and 100(2) (unsigned 4) encodes -4.
So, to calculate (7+5) mod 8, it does not matter to the computer how we define the meaning of the bit pattern 101(2) (5 or -3). It just does mod-8 addition: (111(2) + 101(2)) mod 8 = 1100(2) mod 8 = 100(2), which is 4 as an unsigned number or -4 in two's complement encoding.
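The 3-bit example can be checked with a few lines of code. This is only an illustration; `Mod8` and `AsSigned3` are hypothetical helper names.

```cpp
#include <cstdint>

// Keeps only the low 3 bits, i.e. computes v mod 8 for unsigned v.
constexpr unsigned Mod8(unsigned v) { return v & 0x7u; }

// Interprets a 3-bit pattern as a two's complement number in [-4, 3].
constexpr int AsSigned3(unsigned bits) {
  return (bits & 0x4u) ? static_cast<int>(bits & 0x7u) - 8
                       : static_cast<int>(bits & 0x7u);
}
```

Here `Mod8(7 + 5)` and `Mod8(7 - 3)` both give 4, and `AsSigned3(5)` gives -3, matching the text.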
Fixed point number
A fixed point number is a generalization of an unsigned or two's complement number with a customized location of the binary point. The smallest unit (the precision, or fraction unit) that a fixed point number can represent is determined by the location of the binary point. For an unsigned or two's complement number the smallest unit is 2^0, which is 1. More generally, for a fixed point number with m integer bits and n fractional bits (written m.n), the smallest unit is 2^(-n). So you can change the precision by moving the binary point.
The binary point: a binary point is like the decimal point in a decimal system. It acts as a divider between the integer and the fractional part of a number, and it sits directly after the coefficient of the term 2^0.
Note: an unsigned or two's complement number is itself a special case of a fixed point number whose binary point is located after its last bit. In C/C++, as long as fixed point numbers have the same type (the same number of bits and the same binary point location), they can be added and subtracted with the ordinary integer operators. Multiplication and division additionally require rescaling the result by 2^n.
Take a 5.3(5 bits integer part and 3 bits fractional part) unsigned fixed point number as an example:
5.25 + 9.125 =00101.010 + 01001.001 = 01110.011=14.375 .
We can use a 8 bit unsigned int type to represent a 5.3 unsigned fixed point number.
Below is the c code for the above example:
uint8_t a = (5<<3)|2; //00101.010
uint8_t b = (9<<3)|1; //01001.001
uint8_t c = a + b; //01110.011
/*
from the "int" point of view a = 42, b = 73, c=a+b=115.
you can think "a=42" represents "the number of 'fraction unit' that a has is 42"
and "b = 73" represents "the number of 'fraction unit' that b has is 73".
Then the number of 'fraction unit' that c=a+b has is 73+42=115.
Because the value of a 'fraction unit' of type 5.3 unsigned fixed point number is 2^(-3), so the value of c is 115*2^(-3)=14.375
*/
How to multiplex RTP packets and RTCP packets on a single port
It is acceptable to multiplex RTP and RTCP packets on a single UDP port to ease NAT traversal for unicast sessions.
The RTCP packet type field occupies the same position in the packet as the combination of the RTP marker (M) bit and the RTP payload type (PT). This field can be used to distinguish RTP and RTCP packets.
Two restrictions are observed:
-
the RTP payload type values used are distinct from the RTCP packet types used; and
-
for each RTP payload type (PT), PT+128 is distinct from the RTCP packet types used. The first constraint precludes a direct conflict between RTP payload type and RTCP packet type; the second constraint precludes a conflict between an RTP data packet with the marker bit set and an RTCP packet.
it is RECOMMENDED to follow the guidelines in the RTP/AVP profile for the choice of RTP payload type values, with the additional restriction that payload type values in the range 64-95 MUST NOT be used. Specifically, dynamic RTP payload types SHOULD be chosen in the range 96-127 where possible. Values below 64 MAY be used if that is insufficient, in which case it is RECOMMENDED that payload type numbers that are not statically assigned by RTP/AVP profile be used first.
When the Session Description Protocol (SDP) is used to negotiate RTP sessions following the offer/answer model, the “a=rtcp-mux” attribute indicates the desire to multiplex RTP and RTCP onto a single port. The initial SDP offer MUST include this attribute at the media level to request multiplexing of RTP and RTCP on a single port. If the answerer wishes to multiplex RTP and RTCP onto a single port, it MUST include a media-level “a=rtcp-mux” attribute in the answer. If the answer does not contain an “a=rtcp-mux” attribute, the offerer MUST NOT multiplex RTP and RTCP packets on a single port.
The presence of an “a=rtcp- mux” attribute signals that the sender will multiplex RTP and RTCP on the same port. The receiver MUST be prepared to receive RTCP packets on the RTP port, and any resource reservation needs to be made including the RTCP bandwidth.
// Check the RTP payload type. If 63 < payload type < 96, it's RTCP. More range should be
// available if we carefully choose rtp pt.
// For additional details, see http://tools.ietf.org/html/rfc5761.
bool IsRtcp(const char* data, int len)
{
if (len < 2) {
return false;
}
char pt = data[1] & 0x7F;
return (63 < pt) && (pt < 96);
}
How to packetize an H264 bit stream?
a=rtpmap:96 H264/90000
See RtpPacketizerH264 and RTP Payload Format for H.264 Video.
See RTPReceiverVideo::ParseRtpPacket and PacketBuffer::InsertPacket for more detail on how to depacketize H264 RTP packets and buffer them.
The RTP payload format allows for packetization of one or more Network Abstraction Layer Units (NALUs), produced by an H.264 video encoder, in each RTP payload. This RTP payload can only carry the “naked” H.264 NAL unit stream over RTP, not the byte stream format of Annex B of H.264 (NAL units prefixed with start codes).
The NAL unit header:
+---------------+
|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+
|F|NRI| Type |
+---------------+
A receiver can identify the payload structure of a H264 RTP packet by the first byte of the RTP packet payload, which co-serves as the RTP payload header. This byte is always structured as a NAL unit header. The NAL unit type field indicates which structure is present. The possible structures are as follows.
a) Single NAL Unit Packet (One NAL unit per RTP packet).
Single NAL Unit Packet: Contains only a single NAL unit in the payload. The NAL header type field is equal to the original NAL unit type, i.e., in the range of 1 to 23, inclusive.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|NRI|Type=1~23| |
+-+-+-+-+-+-+-+-+ |
| |
| Bytes 2..n of a single NAL unit |
| |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| :...OPTIONAL RTP padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
RTP payload format for single NAL unit packet
b) Fragmentation Unit (one NAL unit across multiple RTP packets).
Fragmentation Unit: Used to fragment a single NAL unit over multiple RTP packets. Exists with two versions, FU-A and FU-B, identified with the NAL unit type numbers 28 and 29, respectively.
A fragment of a NAL unit consists of an integer number of consecutive octets of that NAL unit. Each octet of the NAL unit MUST be part of exactly one fragment of that NAL unit.
Fragments of the same NAL unit MUST be sent in consecutive order with ascending RTP sequence numbers (with no other RTP packets within the same RTP packet stream being sent between the first and last fragment). Similarly, a NAL unit MUST be reassembled in RTP sequence number order.
When a NAL unit is fragmented and conveyed within fragmentation units (FUs), it is referred to as a fragmented NAL unit. STAPs and MTAPs MUST NOT be fragmented. FUs MUST NOT be nested; that is, an FU MUST NOT contain another FU. The RTP timestamp of an RTP packet carrying an FU is set to the NALU-time of the fragmented NAL unit.
- FU-A
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| FU indicator | FU header | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| |
| FU payload |
| |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| :...OPTIONAL RTP padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The FU indicator octet of FU-A has the following format:
+---------------+
|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+
|F|NRI| Type=28 |
+---------------+
The FU header has the following format:
+---------------+
|0|1|2|3|4|5|6|7|
+-+-+-+-+-+-+-+-+
|S|E|R|Type=1~23|
+---------------+
A fragmented NAL unit MUST NOT be transmitted in one FU; that is, the Start bit and End bit MUST NOT both be set to one in the same FU header.
The NAL unit type octet of the fragmented NAL unit is conveyed in the F and NRI fields of the FU indicator octet of the fragmentation unit and in the type field of the FU header.
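uint8_t fu_indicator, fu_header, start_bit, nal_type, nal, end_bit;
fu_indicator = buf[0];
fu_header = buf[1];
start_bit = fu_header & 0x80; // S: set on the first fragment of the NAL unit
end_bit = fu_header & 0x40; // E: set on the last fragment of the NAL unit
nal_type = fu_header & 0x1f; // type of the fragmented NAL unit
// NAL unit type octet of the fragmented NAL unit:
// F and NRI come from the FU indicator, the type from the FU header.
nal = (fu_indicator & 0xe0) | nal_type;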
A receiver in an endpoint or in a MANE MAY aggregate the first n-1 fragments of a NAL unit to an (incomplete) NAL unit, even if fragment n of that NAL unit is not received. In this case, the forbidden_zero_bit of the NAL unit MUST be set to one to indicate a syntax violation.
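The reassembly described above can be sketched as follows. This is a simplified illustration that assumes the FU-A packets arrive complete and in sequence-number order; `FuResult` and `AppendFuA` are hypothetical names, and loss handling (including setting the forbidden_zero_bit on an incomplete NAL unit) is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class FuResult { kError, kNeedMore, kComplete };

// Appends one FU-A RTP payload (FU indicator, FU header, fragment bytes)
// to `nal`. On the start fragment the original NAL header is rebuilt first.
FuResult AppendFuA(const uint8_t* payload, size_t len,
                   std::vector<uint8_t>* nal) {
  if (len < 2) return FuResult::kError;  // need FU indicator + FU header
  const uint8_t fu_indicator = payload[0];
  const uint8_t fu_header = payload[1];
  if ((fu_indicator & 0x1F) != 28) return FuResult::kError;  // not FU-A
  const bool start = (fu_header & 0x80) != 0;
  const bool end = (fu_header & 0x40) != 0;
  if (start && end) return FuResult::kError;  // S and E must not both be set
  if (start) {
    nal->clear();
    // F and NRI from the FU indicator, type from the FU header.
    nal->push_back(
        static_cast<uint8_t>((fu_indicator & 0xE0) | (fu_header & 0x1F)));
  } else if (nal->empty()) {
    return FuResult::kError;  // fragment without a preceding start fragment
  }
  nal->insert(nal->end(), payload + 2, payload + len);
  return end ? FuResult::kComplete : FuResult::kNeedMore;
}
```

The caller feeds payloads in order and uses the buffer once `kComplete` is returned.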
c) Aggregation Packet (multiple NAL units per RTP packet).
Aggregation Packet: Packet type used to aggregate multiple NAL units into a single RTP payload. This packet exists in four versions, the Single-Time Aggregation Packet type A (STAP-A), the Single-Time Aggregation Packet type B (STAP-B), Multi-Time Aggregation Packet (MTAP) with 16-bit offset (MTAP16), and Multi-Time Aggregation Packet (MTAP) with 24-bit offset (MTAP24). The NAL unit type numbers assigned for STAP-A, STAP-B, MTAP16, and MTAP24 are 24, 25, 26, and 27, respectively.
- STAP-A
A single-time aggregation packet (STAP) SHOULD be used whenever NAL units are aggregated that all share the same NALU-time. The payload of an STAP-A does not include DON and consists of at least one single-time aggregation unit
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|NRI| Type=24 | |
+-+-+-+-+-+-+-+-+ |
| |
| one or more aggregation units |
| |
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| :...OPTIONAL RTP padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
RTP payload format for aggregation packets
The marker bit in the RTP header is set to the value that the marker bit of the last NAL unit of the aggregated packet would have if it were transported in its own RTP packet. An aggregation packet MUST NOT contain fragmentation units; Aggregation packets MUST NOT be nested; that is, an aggregation packet MUST NOT contain another aggregation packet.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RTP Header |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|STAP-A NAL HDR | NALU 1 Size | NALU 1 HDR |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| NALU 1 Data |
: :
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | NALU 2 Size | NALU 2 HDR |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| NALU 2 Data |
: :
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| :...OPTIONAL RTP padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
An example of an RTP packet including an STAP-A
containing two single-time aggregation units
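Following the layout in the figure, a STAP-A payload can be split by skipping the one-byte STAP-A NAL header and then reading 16-bit big-endian sizes, each followed by that many NAL unit bytes. A minimal sketch, assuming RTP padding has already been removed; `ParseStapA` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits a STAP-A RTP payload into its NAL units.
// Returns false if the payload is not STAP-A or is malformed.
bool ParseStapA(const uint8_t* payload, size_t len,
                std::vector<std::vector<uint8_t>>* nalus) {
  nalus->clear();
  if (len < 1 || (payload[0] & 0x1F) != 24) return false;
  size_t offset = 1;  // skip the STAP-A NAL header
  while (offset < len) {
    if (len - offset < 2) return false;  // truncated size field
    const size_t size =
        (static_cast<size_t>(payload[offset]) << 8) | payload[offset + 1];
    offset += 2;
    if (size == 0 || size > len - offset) return false;
    nalus->emplace_back(payload + offset, payload + offset + size);
    offset += size;
  }
  return !nalus->empty();
}
```

Each returned vector starts with the NAL unit header of that aggregated unit (NALU 1 HDR, NALU 2 HDR in the figure).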
Summary of allowed NAL unit types for each packetization
mode (yes = allowed, no = disallowed, ig = ignore)
Payload Packet Single NAL Non-Interleaved Interleaved
Type Type Unit Mode Mode Mode
-------------------------------------------------------------
0 reserved ig ig ig
1-23 NAL unit yes yes no *
24 STAP-A no yes no *
25 STAP-B no no yes
26 MTAP16 no no yes
27 MTAP24 no no yes
28 FU-A no yes yes *
29 FU-B no no yes
30-31 reserved ig ig ig
Note: NAL units of NAL unit types 1-23 can also be used in the "Non-Interleaved Mode" as aggregation units in STAP-A packets or fragmented units in FU-A packets, in addition to being directly used as packet payloads.
In the single NAL unit packetization mode, the transmission order of NAL units, determined by the RTP sequence number, MUST be the same as their NAL unit decoding order.
In the non-interleaved packetization mode, the transmission order of NAL units in single NAL unit packets, STAP-As, and FU-As MUST be the same as their NAL unit decoding order. The NAL units within an STAP MUST appear in the NAL unit decoding order.
Note: Multi-Time Aggregation Packets and the interleaved mode are NOT RECOMMENDED for real-time streams.
A picture may consist of multiple NAL units; each of them can be packetized with a payload structure chosen according to the size of that NAL unit.
How to set abs-send-time?
See RTPSender::SendToNetwork and AbsoluteSendTime::MsTo24Bits
static constexpr uint32_t MsTo24Bits(int64_t time_ms)
{
return static_cast<uint32_t>(((time_ms << 18) + 500) / 1000) & 0x00FFFFFF;
}
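The 24-bit value is a 6.18 fixed-point number of seconds, so it wraps every 64 seconds. On the receiving side, the difference between two such values can be taken modulo 2^24 and sign-extended. The sketch below assumes the true distance between the two send times is under 32 seconds; `AbsSendTimeDeltaMs` is a hypothetical name.

```cpp
#include <cmath>
#include <cstdint>

// Signed difference (later - earlier) between two 24-bit abs-send-time
// values, in milliseconds. Assumes the real distance is below 32 seconds.
int64_t AbsSendTimeDeltaMs(uint32_t later, uint32_t earlier) {
  // Wrap-around-safe difference in 24-bit arithmetic.
  const uint32_t diff = (later - earlier) & 0x00FFFFFF;
  // Sign-extend from 24 bits.
  const int64_t signed_diff = (diff & 0x00800000)
                                  ? static_cast<int64_t>(diff) - 0x01000000
                                  : static_cast<int64_t>(diff);
  // One unit is 2^-18 seconds; convert to milliseconds with rounding.
  return std::llround(static_cast<double>(signed_diff) * 1000.0 / 262144.0);
}
```

For instance, send times of 63900 ms and 64100 ms map to the 24-bit values 16751002 and 26214 (the second one has wrapped), and the helper still returns a delta of 200 ms.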
How to parse Rtp packet header?
See RtpHeaderParser::Parse .
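For orientation, the fixed header and the length of an optional header extension can be parsed with a few shifts and masks. The sketch below is a simplified stand-in, not the WebRTC implementation; the `RtpHeader` struct and `ParseRtpHeader` function are hypothetical, and individual header extension elements are not decoded.

```cpp
#include <cstddef>
#include <cstdint>

struct RtpHeader {
  bool padding;
  bool extension;
  uint8_t csrc_count;
  bool marker;
  uint8_t payload_type;
  uint16_t sequence_number;
  uint32_t timestamp;
  uint32_t ssrc;
  size_t header_length;  // fixed header + CSRCs + header extension, in bytes
};

static uint32_t ReadBE32(const uint8_t* p) {
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) | p[3];
}

bool ParseRtpHeader(const uint8_t* data, size_t len, RtpHeader* out) {
  if (len < 12) return false;             // fixed header is 12 bytes
  if ((data[0] >> 6) != 2) return false;  // V must be 2
  out->padding = (data[0] & 0x20) != 0;
  out->extension = (data[0] & 0x10) != 0;
  out->csrc_count = data[0] & 0x0F;
  out->marker = (data[1] & 0x80) != 0;
  out->payload_type = data[1] & 0x7F;
  out->sequence_number = static_cast<uint16_t>((data[2] << 8) | data[3]);
  out->timestamp = ReadBE32(data + 4);
  out->ssrc = ReadBE32(data + 8);
  size_t header_length = 12 + 4 * static_cast<size_t>(out->csrc_count);
  if (len < header_length) return false;
  if (out->extension) {
    if (len < header_length + 4) return false;
    // The extension length field counts 32-bit words after its own 4 bytes.
    const size_t ext_words =
        (static_cast<size_t>(data[header_length + 2]) << 8) |
        data[header_length + 3];
    header_length += 4 + 4 * ext_words;
    if (len < header_length) return false;
  }
  out->header_length = header_length;
  return true;
}
```

The payload then starts at `data + header_length`; if the P bit is set, the last byte of the packet gives the number of padding bytes to strip.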
How to work around rtp timestamp wrap around
int64_t TimestampWrapAroundHandler::Unwrap(uint32_t ts) {
if (num_wrap_ == -1) {
last_ts_ = ts;
num_wrap_ = 0;
return ts;
}
// case 1: wrap around.
if (ts < last_ts_) {
if (last_ts_ >= 0xf0000000 && ts < 0x0fffffff)
++num_wrap_;
}
// case 2: late packet.
/*
this is the case where last_ts_ is very small [it wrapped recently],
and ts is very large [a late packet from before the wrap].
*/
else if ((ts - last_ts_) > 0xf0000000) {
// Backwards wrap. Unwrap with last wrap count and don't update last_ts_.
return ts + ((num_wrap_ - 1) << 32);
}
last_ts_ = ts;
return ts + (num_wrap_ << 32);
}
How to calculate RTT for media sender&media receiver?
a) media sender:
Use the last SR timestamp (LSR) and the delay since last SR (DLSR) from the Receiver Report RTCP packet:
Record the time A when this reception report block is received; the round-trip time is then A - LSR - DLSR.
[10 Nov 1995 11:33:25.125 UTC] [10 Nov 1995 11:33:36.5 UTC]
n SR(n) A=b710:8000 (46864.500 s)
---------------------------------------------------------------->
lsr=46853.125s v ^
ntp_sec =0xb44db705 v ^ dlsr=0x0005:4000 ( 5.250s)
ntp_frac=0x20000000 v ^ lsr =0xb705:2000 (46853.125s)
(3024992005.125 s) v ^
r v ^ RR(n)
---------------------------------------------------------------->
|<-DLSR->|
(5.250 s)
A 0xb710:8000 (46864.500 s)
DLSR -0x0005:4000 ( 5.250 s)
LSR -0xb705:2000 (46853.125 s)
-------------------------------
delay 0x0006:2000 ( 6.125 s)
Figure 2: Example for round-trip time computation
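The subtraction in the figure can be sketched in code. All three inputs are in the compact NTP format (the middle 32 bits, i.e. 16.16 fixed-point seconds); the function names are hypothetical. A real implementation would also skip the calculation when LSR is zero, which means no SR has been received yet.

```cpp
#include <cstdint>

// RTT in compact NTP units (16.16 fixed-point seconds).
// Unsigned arithmetic makes the result correct across 16-bit second wrap.
uint32_t RttCompactNtp(uint32_t arrival, uint32_t lsr, uint32_t dlsr) {
  return arrival - dlsr - lsr;
}

// Converts a compact NTP duration to milliseconds, rounding to nearest.
int64_t CompactNtpToMs(uint32_t compact) {
  return (static_cast<int64_t>(compact) * 1000 + 0x8000) >> 16;
}
```

With the values from the figure (A = 0xb7108000, LSR = 0xb7052000, DLSR = 0x00054000), the result is 0x00062000, which is 6125 ms.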
b) media receiver:
Receiver Reference Time Report Block: carries the receiver-end wallclock timestamp. Together with the DLRR Report Block, it allows non-senders to calculate round-trip times.
/* ### At both side: webrtcvideoengine.cc AddDefaultFeedbackParams: ### */
//@zsf:add
codec->AddFeedbackParam(FeedbackParam(kRtcpFbParamRrtr, kParamValueEmpty));
namespace {
const uint32_t kRtcpAnyExtendedReports =
kRtcpXrVoipMetric | kRtcpXrReceiverReferenceTime | kRtcpXrDlrrReportBlock |
kRtcpXrTargetBitrate;
} // namespace
/* ### At receiver side: RTCPSender::PrepareReport: ### */
if ((!sending_ && xr_send_receiver_reference_time_enabled_) ||
!feedback_state.last_xr_rtis.empty() || video_bitrate_allocation_) {
SetFlag(kRtcpAnyExtendedReports, true);
}
/* ### At receiver side: RTCPSender::BuildExtendedReports: ### */
if (!sending_ && xr_send_receiver_reference_time_enabled_) {
rtcp::Rrtr rrtr;
rrtr.SetNtp(ctx.now_);
xr->SetRrtr(rrtr);
}
/* ### At sender side: RTCPReceiver::ConsumeReceivedXrReferenceTimeInfo: ### */
last_xr_rtis.emplace_back(rrtr.ssrc, rrtr.received_remote_mid_ntp_time,
now_ntp - rrtr.local_receive_mid_ntp_time);
/* ### At sender side: RTCPSender::BuildExtendedReports: ### */
for (const rtcp::ReceiveTimeInfo& rti : ctx.feedback_state_.last_xr_rtis) {
xr->AddDlrrItem(rti);
}
How to calculate jitter for RR?
int transit = arrival - r->ts;
int d = transit - s->transit;
s->transit = transit;
if (d < 0) d = -d;
s->jitter += (1./16.) * ((double)d - s->jitter);
// that is: jitter = 15/16 * old_jitter + 1/16 * |d|.
Alternatively, the jitter estimate can be kept as an integer, but scaled by 16 to reduce round-off error. The calculation is the same except for the last line:
s->jitter += d - ((s->jitter + 8) >> 4);
In this case, the estimate is sampled for the reception report as:
rr->jitter = s->jitter >> 4;
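The integer variant above can be put together as a small self-contained sketch. The struct name JitterEstimator and its members are hypothetical, and skipping the first packet (which has no previous transit time to compare against) is an assumption of this sketch rather than part of the RFC sample code.

```cpp
#include <cstdint>

// Interarrival jitter kept as an integer scaled by 16 (Q4).
struct JitterEstimator {
  bool has_transit = false;
  uint32_t last_transit = 0;
  uint32_t jitter_q4 = 0;  // jitter * 16, in RTP timestamp units

  // arrival_rtp_units: arrival time expressed in RTP timestamp units.
  // rtp_timestamp: timestamp field of the received RTP packet.
  void OnPacket(uint32_t arrival_rtp_units, uint32_t rtp_timestamp) {
    uint32_t transit = arrival_rtp_units - rtp_timestamp;
    if (!has_transit) {
      // Assumption: the first packet only initializes the transit time.
      has_transit = true;
      last_transit = transit;
      return;
    }
    // Signed difference of two wrapping 32-bit transit values.
    int64_t d = static_cast<int32_t>(transit - last_transit);
    last_transit = transit;
    if (d < 0) d = -d;
    jitter_q4 += static_cast<uint32_t>(d) - ((jitter_q4 + 8) >> 4);
  }

  // Value to place in the jitter field of the reception report block.
  uint32_t ReportedJitter() const { return jitter_q4 >> 4; }
};
```

For example, three packets with transit times 1000, 1040 and 1000 give jitter_q4 values of 0, 40 and 77, so the reported jitter after the third packet is 4.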
How to know whether a packet is a retransmitted one or not if RTX is not enabled?
See RtpVideoStreamReceiver::IsPacketRetransmitted.
bool StreamStatisticianImpl::IsRetransmitOfOldPacket(
const RTPHeader& header, int64_t min_rtt) const {
rtc::CritScope cs(&stream_lock_);
if (InOrderPacketInternal(header.sequenceNumber)) {
return false;
}
uint32_t frequency_khz = header.payload_type_frequency / 1000;
assert(frequency_khz > 0);
int64_t time_diff_ms = clock_->TimeInMilliseconds() -
last_receive_time_ms_;
// Diff in time stamp since last received in order.
uint32_t timestamp_diff = header.timestamp - last_received_timestamp_;
uint32_t rtp_time_stamp_diff_ms = timestamp_diff / frequency_khz;
int64_t max_delay_ms = 0;
if (min_rtt == 0) {
// Jitter standard deviation in samples.
float jitter_std = sqrt(static_cast<float>(jitter_q4_ >> 4));
// 2 times the standard deviation => 95% confidence.
// And transform to milliseconds by dividing by the frequency in kHz.
max_delay_ms = static_cast<int64_t>((2 * jitter_std) / frequency_khz);
// Min max_delay_ms is 1.
if (max_delay_ms == 0) {
max_delay_ms = 1;
}
} else {
max_delay_ms = (min_rtt / 3) + 1;
}
return time_diff_ms > rtp_time_stamp_diff_ms + max_delay_ms;
}
How to know whether a packet is arriving in-order or not?
template <typename U>
inline bool IsNewer(U value, U prev_value)
{
static_assert(!std::numeric_limits<U>::is_signed, "U must be unsigned");
constexpr U kBreakpoint = (std::numeric_limits<U>::max() >> 1) + 1;
// Values exactly kBreakpoint apart: treat the numerically larger one as newer.
if (value - prev_value == kBreakpoint) {
return value > prev_value;
}
return value != prev_value &&
static_cast<U>(value - prev_value) < kBreakpoint;
}
inline bool IsNewerSequenceNumber(uint16_t sequence_number,
uint16_t prev_sequence_number) {
return IsNewer(sequence_number, prev_sequence_number);
}
bool StreamStatisticianImpl::InOrderPacketInternal(uint16_t sequence_number) const {
// First packet is always in order.
if (last_receive_time_ms_ == 0){return true;}
if (IsNewerSequenceNumber(sequence_number, received_seq_max_)) {
return true;
} else {
// If we have a restart of the remote side this packet is still in order.
return !IsNewerSequenceNumber(sequence_number, received_seq_max_ -
max_reordering_threshold_);
}
}
How to generate RR rtcp packet?
See StreamStatisticianImpl::CalculateRtcpStatistics and RTCPSender::BuildRR.
About 32bit LSR Ntp:
Wallclock time (absolute date and time) is represented using the timestamp format of the Network Time Protocol (NTP), which is in seconds relative to 0h UTC on 1 January 1900. The full resolution NTP timestamp is a 64-bit unsigned fixed-point number with the integer part in the first 32 bits and the fractional part in the last 32 bits. In some fields where a more compact representation is appropriate, only the middle 32 bits are used; that is, the low 16 bits of the integer part and the high 16 bits of the fractional part. The high 16 bits of the integer part must be determined independently.
See SaturatedUsToCompactNtp and CompactNtpRttToMs.
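As a rough illustration of the "middle 32 bits" rule, the compact form can be derived from a full NTP timestamp as below. NtpTimestamp and ToCompactNtp are hypothetical names used only for this sketch.

```cpp
#include <cstdint>

// Full-resolution NTP timestamp: 32-bit seconds since 1 Jan 1900 plus a
// 32-bit binary fraction of a second.
struct NtpTimestamp {
  uint32_t seconds;
  uint32_t fractions;
};

// Keeps the low 16 bits of the seconds and the high 16 bits of the fraction.
inline uint32_t ToCompactNtp(NtpTimestamp ntp) {
  return (ntp.seconds << 16) | (ntp.fractions >> 16);
}
```

For the SR in Figure 2 (ntp_sec = 0xb44db705, ntp_frac = 0x20000000) this yields 0xb705:2000, the LSR value echoed in the RR.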
How to generate SR rtcp packet?
See ModuleRtpRtcpImpl::GetFeedbackState and RTCPSender::BuildSR.
How to generate NACK rtcp fb packet?
Keep a NACK list. Every time the receiver receives an RTP packet, it updates the NACK list as follows:
case 1: In-order packet with seq_x.
Add [newest_seq_num + 1, seq_x) to the NACK list. If the NACK list is too large, remove packets from it up to the first packet of a keyframe. If the list is still too large, clear it and request a keyframe.
case 2: Out-of-order packet with seq_y.
Remove seq_y from the NACK list if it was added before.
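The two update cases above can be sketched as follows. This is a simplified illustration, not the WebRTC NackModule: the class SimpleNackList and the helper AheadOf are hypothetical, and the size limit and keyframe handling are left out.

```cpp
#include <cstdint>
#include <set>

// True if a is newer than b, taking 16-bit wraparound into account.
inline bool AheadOf(uint16_t a, uint16_t b) {
  return a != b && static_cast<uint16_t>(a - b) < 0x8000;
}

class SimpleNackList {
 public:
  void OnPacket(uint16_t seq) {
    if (!initialized_) {
      initialized_ = true;
      newest_ = seq;
      return;
    }
    if (AheadOf(seq, newest_)) {
      // Case 1: in-order packet. Everything in (newest_, seq) is missing.
      for (uint16_t s = static_cast<uint16_t>(newest_ + 1); s != seq; ++s) {
        missing_.insert(s);
      }
      newest_ = seq;
    } else {
      // Case 2: out-of-order (late or retransmitted) packet.
      missing_.erase(seq);
    }
  }

  const std::set<uint16_t>& missing() const { return missing_; }

 private:
  bool initialized_ = false;
  uint16_t newest_ = 0;
  std::set<uint16_t> missing_;
};
```

Receiving 65534 and then 1 marks 65535 and 0 as missing; a late arrival of 0 removes it from the list again.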
std::vector<uint16_t> NackModule::GetNackBatch(NackFilterOptions options) {
bool consider_seq_num = options != kTimeOnly;
bool consider_timestamp = options != kSeqNumOnly;
int64_t now_ms = clock_->TimeInMilliseconds();
std::vector<uint16_t> nack_batch;
auto it = nack_list_.begin();
while (it != nack_list_.end()) {
if (consider_seq_num && it->second.sent_at_time == -1 &&
AheadOrAt(newest_seq_num_, it->second.send_at_seq_num)) {
nack_batch.emplace_back(it->second.seq_num);
++it->second.retries;
it->second.sent_at_time = now_ms;
if (it->second.retries >= kMaxNackRetries) {
RTC_LOG(LS_WARNING) << "Sequence number " << it->second.seq_num
<< " removed from NACK list due to max retries.";
it = nack_list_.erase(it);
} else {
++it;
}
continue;
}
if (consider_timestamp && it->second.sent_at_time + rtt_ms_ <= now_ms) {
nack_batch.emplace_back(it->second.seq_num);
++it->second.retries;
it->second.sent_at_time = now_ms;
if (it->second.retries >= kMaxNackRetries) {
RTC_LOG(LS_WARNING) << "Sequence number " << it->second.seq_num
<< " removed from NACK list due to max retries.";
it = nack_list_.erase(it);
} else {
++it;
}
continue;
}
++it;
}
return nack_batch;
}
How to generate a compound rtcp packet?
See rfc3550-6.1 RTCP Packet Format
Typically, multiple RTCP packets are sent together as a compound RTCP packet in a single packet of the underlying protocol; this is enabled by the length field in the fixed header of each RTCP packet.
Each RTCP packet begins with a fixed part similar to that of RTP data packets, followed by structured elements that MAY be of variable length according to the packet type but MUST end on a 32-bit boundary. The alignment requirement and a length field in the fixed part of each packet are included to make RTCP packets “stackable”.
Multiple RTCP packets can be concatenated without any intervening separators to form a compound RTCP packet that is sent in a single packet of the lower layer protocol, for example UDP. There is no explicit count of individual RTCP packets in the compound packet since the lower layer protocols are expected to provide an overall length to determine the end of the compound packet.
- Reception statistics (in SR or RR) should be sent as often as bandwidth constraints will allow to maximize the resolution of the statistics, therefore each periodically transmitted compound RTCP packet MUST include a report packet.
- New receivers need to receive the CNAME for a source as soon as possible to identify the source and to begin associating media for purposes such as lip-sync, so each compound RTCP packet MUST also include the SDES CNAME except when the compound RTCP packet is split for partial encryption.
Thus, all RTCP packets MUST be sent in a compound packet of at least two individual packets, with the following format:
- SR or RR: The first RTCP packet in the compound packet MUST always be a report packet to facilitate header validation. This is true even if no data has been sent or received, in which case an empty RR MUST be sent, and even if the only other RTCP packet in the compound packet is a BYE.
- Additional RRs: If the number of sources for which reception statistics are being reported exceeds 31 (the number that will fit into one SR or RR packet), then additional RR packets SHOULD follow the initial report packet.
- An SDES packet containing a CNAME item MUST be included in each compound RTCP packet.
- BYE or APP: Other RTCP packet types, including those yet to be defined, MAY follow in any order, except that BYE SHOULD be the last packet sent with a given SSRC/CSRC.
An individual RTP participant SHOULD send only one compound RTCP packet per report interval. If there are too many sources to fit all the necessary RR packets into one compound RTCP packet without exceeding the maximum transmission unit (MTU) of the network path, then only the subset that will fit into one MTU SHOULD be included in each interval. The subsets SHOULD be selected round-robin across multiple intervals so that all sources are reported.
Note that each of the compound packets MUST begin with an SR or RR packet.
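Because every RTCP packet carries its own length field, a receiver can split a compound packet without any separators. The following is a minimal sketch of that walk; RtcpBlock and SplitCompoundRtcp are hypothetical names, and only the version and length fields are checked here, not the ordering rules listed above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct RtcpBlock {
  uint8_t packet_type;  // e.g. 200 = SR, 201 = RR, 202 = SDES, 203 = BYE
  size_t offset;        // start of this RTCP packet inside the buffer
  size_t size_bytes;    // total size including the 4-byte fixed header
};

// Splits a compound RTCP packet into its individual RTCP packets.
// Returns false if the buffer is empty or malformed.
inline bool SplitCompoundRtcp(const uint8_t* data, size_t size,
                              std::vector<RtcpBlock>* out) {
  size_t pos = 0;
  while (pos < size) {
    if (size - pos < 4) return false;          // truncated fixed header
    if ((data[pos] >> 6) != 2) return false;   // version must be 2
    // The length field counts 32-bit words minus one.
    size_t words = (static_cast<size_t>(data[pos + 2]) << 8) | data[pos + 3];
    size_t block_size = (words + 1) * 4;
    if (block_size > size - pos) return false; // runs past the datagram
    out->push_back({data[pos + 1], pos, block_size});
    pos += block_size;
  }
  return !out->empty();
}
```

The end of the compound packet is simply the end of the UDP payload, which is why no packet count is needed.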
How to generate a compound rtcp feedback packet?
See rfc4585-Compound RTCP Feedback Packets
Two components constitute RTCP-based feedback:
- Status reports are contained in sender report (SR)/receiver report (RR) packets and are transmitted at regular intervals as part of compound RTCP packets (which also include source description (SDES) and possibly other messages); these status reports provide an overall indication for the recent reception quality of a media stream.
- FB messages as defined in Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)
RTCP FB messages are just another RTCP packet type. Therefore, multiple FB messages MAY be combined in a single compound RTCP packet and they MAY also be sent combined with other RTCP packets.
Compound RTCP packets containing FB messages MUST contain RTCP packets in the order:
- MANDATORY SR or RR.
- MANDATORY SDES, which MUST contain the CNAME item; all other SDES items are OPTIONAL.
- One or more FB messages.
Two types of compound RTCP packets carrying feedback packets are used:
- Minimal compound RTCP feedback packet.
- (Full) compound RTCP feedback packet
Why reduced-size rtcp packets [a=rtcp-rsize]?
A Reduced-Size RTCP packet is an RTCP packet with the following properties that make it deviate from the compound RTCP packet:
- contains one or more RTCP packet(s)
- allows any RTCP packet type
- MUST NOT be used for Regular (scheduled) RTCP report purposes
- MUST NOT be used with the RTP/AVP profile [RFC3551] or the RTP/SAVP profile
Most advantages of using Reduced-Size RTCP packets exist in cases when the available RTCP bitrate is limited. This is because they can become substantially smaller than compound packets. A compound packet is forced to contain both an RR or an SR and the CNAME SDES item. The RR containing a report block for a single source is 32 bytes, an SR is 52 bytes. Both may be larger if they contain report blocks for multiple sources. The SDES packet containing a CNAME item will be 10 bytes plus the CNAME string length. Here, it is reasonable that the CNAME string is at least 10 bytes to get a decent collision resistance. If the recommended form of user@host is used, then most strings will be longer than 20 characters. Thus, a Reduced-Size RTCP can become at least 70-80 bytes smaller than the compound packet.
How to map from rtp timestamp to its ntp timestamp?
rtp_ts(y) = rtp_fz(k)*rtp_ntp(x) + rtp_ts_offset(d) ==> y = k*x+d.
rtp_fz can be calculated after receiving at least two sr packets:
// Calculates the RTP timestamp frequency from two pairs of NTP/RTP timestamps.
bool CalculateFrequency(int64_t ntp_ms1,
uint32_t rtp_timestamp1,
int64_t ntp_ms2,
uint32_t rtp_timestamp2,
double* frequency_khz) {
if (ntp_ms1 <= ntp_ms2)
return false;
*frequency_khz = static_cast<double>(rtp_timestamp1 - rtp_timestamp2) /
static_cast<double>(ntp_ms1 - ntp_ms2);
return true;
}
void RtpToNtpEstimator::UpdateParameters() {
if (measurements_.size() != kNumRtcpReportsToUse)
return;
Parameters params;
int64_t timestamp_new = measurements_.front().unwrapped_rtp_timestamp;
int64_t timestamp_old = measurements_.back().unwrapped_rtp_timestamp;
int64_t ntp_ms_new = measurements_.front().ntp_time.ToMs();
int64_t ntp_ms_old = measurements_.back().ntp_time.ToMs();
if (!CalculateFrequency(ntp_ms_new, timestamp_new, ntp_ms_old, timestamp_old,
&params.frequency_khz)) {
return;
}
params.offset_ts = timestamp_new - params.frequency_khz * ntp_ms_new;
params_calculated_ = true;
smoothing_filter_.Insert(params);
}
// Calculate the NTP time in ms that corresponds to an RTP timestamp:
double rtp_ms =
(static_cast<double>(rtp_timestamp_unwrapped) - params.offset_ts) /
params.frequency_khz + 0.5f;
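The linear model y = k*x + d can be tried out in isolation with the sketch below. It assumes unwrapped 64-bit RTP timestamps and NTP times already converted to milliseconds; RtpNtpParams, EstimateParams and RtpToNtpMs are hypothetical names, and the smoothing filter used by the real estimator is left out.

```cpp
#include <cstdint>

struct RtpNtpParams {
  double frequency_khz;  // k: RTP ticks per millisecond
  double offset_ts;      // d: RTP timestamp at NTP time 0
};

// Fits k and d from two sender reports (old and new).
inline bool EstimateParams(int64_t ntp_ms_old, int64_t rtp_old,
                           int64_t ntp_ms_new, int64_t rtp_new,
                           RtpNtpParams* params) {
  if (ntp_ms_new <= ntp_ms_old) return false;
  double k = static_cast<double>(rtp_new - rtp_old) /
             static_cast<double>(ntp_ms_new - ntp_ms_old);
  if (k <= 0.0) return false;  // RTP timestamps must advance with time
  params->frequency_khz = k;
  params->offset_ts = static_cast<double>(rtp_new) - k * ntp_ms_new;
  return true;
}

// Inverts the model: x = (y - d) / k. Adding 0.5 rounds to the nearest ms
// for the positive values expected here.
inline int64_t RtpToNtpMs(const RtpNtpParams& params, int64_t rtp_unwrapped) {
  return static_cast<int64_t>(
      (static_cast<double>(rtp_unwrapped) - params.offset_ts) /
          params.frequency_khz + 0.5);
}
```

With SRs at (10000 ms, 1000) and (11000 ms, 91000) the estimated frequency is 90 kHz, and RTP timestamp 46000 maps to 10500 ms.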
When to send it?
Algorithm from rfc-3550
See 6.2 RTCP Transmission Interval, 6.3 RTCP Packet Send and Receive Rules, rfc3550-A.7 Computing the RTCP Transmission Interval, RTCPSender::TimeToSendRTCPReport, RTCPSender::PrepareReport, and live555/liveMedia/RTCP.cpp.
RTP is designed to allow an application to scale automatically over session sizes ranging from a few participants to thousands. The control traffic should be limited to a small and known fraction of the session bandwidth: small so that the primary function of the transport protocol to carry data is not impaired.
It is RECOMMENDED that MAX RTCP BW is 5% of the session BW (often, the session bandwidth is the sum of the nominal bandwidths of the senders expected to be concurrently active.). It is also RECOMMENDED that 1/4 of the RTCP bandwidth be dedicated to participants that are sending data(media senders). When the proportion of senders is greater than 1/4 of the participants, the senders get their proportion of the full RTCP bandwidth (that is S/(S+R)*BW).
At application startup, a delay SHOULD be imposed before the first compound RTCP packet is sent to allow time for RTCP packets to be received from other participants so the report interval will converge to the correct value more quickly. This delay MAY be set to half the minimum interval to allow quicker notification that the new participant is present.
Since the calculated interval is dependent on the number of observed group members, there may be undesirable startup effects when a new user joins an existing session, or many users simultaneously join a new session. These new users will initially have incorrect estimates of the group membership, and thus their RTCP transmission interval will be too short. This problem can be significant if many users join the session simultaneously. To deal with this, an algorithm called “timer reconsideration” is employed. This algorithm implements a simple back-off mechanism which causes users to hold back RTCP packet transmission if the group sizes are increasing.
When users leave a session, either with a BYE or by timeout, the group membership decreases, and thus the calculated interval should decrease. A “reverse reconsideration” algorithm is used to allow members to more quickly reduce their intervals in response to group membership decreases.
An implementation which is constrained to two-party unicast operation SHOULD still use randomization of the RTCP transmission interval to avoid unintended synchronization of multiple instances operating in the same environment, but MAY omit the “timer reconsideration” and “reverse reconsideration” algorithms.
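If the participant has not yet sent an RTCP packet (the variable initial is true), the constant $T_{min}$ is set to 2.5 seconds, else it is set to 5 seconds. The constants $C$ and $n$ are then chosen as follows:
a) If the number of senders is less than or equal to 25% of the membership (members):
for media senders: $C = \mathit{avg\_rtcp\_size} / (0.25 \times \mathit{rtcp\_bw})$, $n = \mathit{senders}$.
for media receivers: $C = \mathit{avg\_rtcp\_size} / (0.75 \times \mathit{rtcp\_bw})$, $n = \mathit{members} - \mathit{senders}$.
b) If the number of senders is greater than 25%, senders and receivers are treated together. The constant $C$ is set to the average RTCP packet size divided by the total RTCP bandwidth and $n$ is set to the total number of members, that is:
$C = \mathit{avg\_rtcp\_size} / \mathit{rtcp\_bw}$, $n = \mathit{members}$.
The deterministic interval is then $T_d = \max(T_{min}, n \times C)$.
An implementation MAY scale the minimum RTCP interval to a smaller value inversely proportional to the session bandwidth parameter.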
The RECOMMENDED value for the reduced minimum in seconds is 360 divided by the session bandwidth in kilobits/second. This minimum is smaller than 5 seconds for bandwidths greater than 72 kb/s.
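The choice of the constants can be sketched as below, loosely following the structure of the sample code in rfc3550-A.7. The function name DeterministicRtcpInterval is hypothetical, rtcp_bw is assumed to be in bytes per second, and the randomization (a factor in [0.5, 1.5]) and the reconsideration compensation that the RFC applies afterwards are deliberately left out so that the result is reproducible.

```cpp
#include <algorithm>

// Deterministic RTCP interval in seconds, before randomization.
inline double DeterministicRtcpInterval(int members, int senders,
                                        double rtcp_bw,  // bytes per second
                                        bool we_sent,
                                        double avg_rtcp_size,  // bytes
                                        bool initial) {
  const double t_min = initial ? 2.5 : 5.0;
  int n = members;
  if (senders <= members * 0.25) {
    if (we_sent) {
      rtcp_bw *= 0.25;  // senders share 25% of the RTCP bandwidth
      n = senders;
    } else {
      rtcp_bw *= 0.75;  // receivers share the remaining 75%
      n = members - senders;
    }
  }
  return std::max(t_min, avg_rtcp_size * n / rtcp_bw);
}
```

For 100 members of which 10 are senders, 1000 bytes/s of RTCP bandwidth and 100-byte RTCP packets, a receiver gets 12 s while a sender is clamped to the 5 s minimum.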
case 1: audio
5s interval by default.
case 2: video
1s interval by default for a BW smaller than 360 kbit/s, and a $360 / \mathit{BW}$ seconds interval (BW in kbit/s) for a BW greater than 360 kbit/s. Technically we break the max 5% RTCP BW for video below 10 kbit/s, but that should be extremely rare.
Algorithm from rfc-4585
See rfc-4585-3.5. AVPF RTCP Scheduling Algorithm.
rfc-3550 specifies the rules for when compound RTCP packets should be sent. rfc-4585 modifies those rules in order to allow applications to timely report events (e.g., loss or reception of RTP packets) and to accommodate algorithms that use FB messages.
ACK
feedback
V
:<- - - - NACK feedback - - - ->//
:
: Immediate ||
: Feedback mode ||Early RTCP mode Regular RTCP mode
:<=============>||<=============>//<=================>
: ||
-+---------------||---------------//------------------> group size
2 ||
Application-specific FB Threshold
= f(data rate, packet loss, codec, ...)
Figure 1: Modes of operation
If an Early RTCP packet has been transmitted, one regular transmission is skipped: the time slot for the next Regular RTCP packet MUST be updated accordingly to have a new tn (tn=tp+2*T_rr) and a new tp (tp=tp+T_rr) afterwards.
event to
report
detected
|
| RTCP feedback range
| (T_max_fb_delay)
vXXXXXXXXXXXXXXXXXXXXXXXXXXX ) )
|---+--------+-------------+-----+------------| |--------+--->
| | | | ( ( |
| t0 te |
tp tn
\_______ ________/
\/
T_dither_max
Figure 2: Event report and parameters for Early RTCP scheduling
Reference
RTP: A Transport Protocol for Real-Time Applications
RTP Profile for Audio and Video Conferences with Minimal Control
Multiplexing RTP Data and Control Packets on a Single Port
Sending Multiple RTP Streams in a Single RTP Session
RTP Control Protocol Extended Reports (RTCP XR)
Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)
Support for Reduced-Size Real-Time Transport Control Protocol (RTCP): Opportunities and Consequences
A General Mechanism for RTP Header Extensions
RTP Payload Format for H.264 Video
An Offer/Answer Model with the Session Description Protocol (SDP)
Protocol overview: RTP and RTCP
How to limit WebRTC bandwidth by modifying the SDP
Analysis and Design of the Google Congestion Control for WebRTC
A Google Congestion Control Algorithm for Real-Time Communication draft-ietf-rmcat-gcc-02