Problem Reading an Rgb Value at Pixel (0, 0). Most Likely Due to a Non-integer Value.
This browser is no longer supported.
Upgrade to Microsoft Edge to have advantage of the latest features, security updates, and technical support.
Recommended 8-Bit YUV Formats for Video Rendering
Gary Sullivan and Stephen Estrop
Microsoft Corporation
Apr 2002, updated November 2008
This topic describes the eight-chip YUV color formats that are recommended for video rendering in the Windows operating system. This commodity presents techniques for converting between YUV and RGB formats, and besides provides techniques for upsampling YUV formats. This commodity is intended for anyone working with YUV video decoding or rendering in Windows.
Introduction
Numerous YUV formats are defined throughout the video manufacture. This article identifies the 8-flake YUV formats that are recommended for video rendering in Windows. Decoder vendors and brandish vendors are encouraged to support the formats described in this article. This article does not address other uses of YUV color, such equally however photography.
The formats described in this article all utilise 8 $.25 per pixel location to encode the Y channel (besides chosen the luma channel), and use 8 bits per sample to encode each U or 5 chroma sample. Still, most YUV formats use fewer than 24 bits per pixel on average, because they incorporate fewer samples of U and V than of Y. This article does non cover YUV formats with 10-bit or college Y channels.
Note
For the purposes of this article, the term U is equivalent to Cb, and the term V is equivalent to Cr.
This article covers the following topics:
- YUV Sampling. Describes the almost common YUV sampling techniques.
- Surface Definitions. Describes the recommended YUV formats.
- Color Infinite and Chroma Sampling Charge per unit Conversions. Provides some guidelines for converting betwixt YUV and RGB formats and for converting between different YUV formats.
- Identifying YUV Formats in Media Foundation. Explains how to draw YUV format types in Media Foundation.
YUV Sampling
Blush channels tin have a lower sampling charge per unit than the luma channel, without any dramatic loss of perceptual quality. A note called the "A:B:C" notation is used to draw how often U and 5 are sampled relative to Y:
- 4:4:4 means no downsampling of the chroma channels.
- four:2:2 means ii:1 horizontal downsampling, with no vertical downsampling. Every scan line contains iv Y samples for every two U or V samples.
- iv:2:0 ways 2:i horizontal downsampling, with two:1 vertical downsampling.
- four:1:one means 4:one horizontal downsampling, with no vertical downsampling. Every scan line contains four Y samples for each U and V sample. 4:1:ane sampling is less common than other formats, and is not discussed in detail in this commodity.
The following diagrams shows how chroma is sampled for each of the downsampling rates. Luma samples are represented by a cross, and blush samples are represented by a circumvolve.
The dominant form of 4:ii:2 sampling is defined in ITU-R Recommendation BT.601. There are two mutual variants of 4:2:0 sampling. One of these is used in MPEG-2 video, and the other is used in MPEG-1 and in ITU-T Recommendations H.261 and H.263.
Compared with the MPEG-ane scheme, it is simpler to convert between the MPEG-2 scheme and the sampling grids defined for 4:two:two and 4:4:4 formats. For this reason, the MPEG-ii scheme is preferred in Windows, and should exist considered the default estimation of four:2:0 formats.
Surface Definitions
This section describes the 8-chip YUV formats that are recommended for video rendering. These autumn into several categories:
- 4:four:4 Formats, 32 Bits per Pixel
- four:2:2 Formats, 16 Bits per Pixel
- 4:two:0 Formats, sixteen Bits per Pixel
- 4:2:0 Formats, 12 Bits per Pixel
Commencement, you should be aware of the following concepts in order to understand what follows:
- Surface origin. For the YUV formats described in this article, the origin (0,0) is always the acme left corner of the surface.
- Stride. The footstep of a surface, sometimes called the pitch, is the width of the surface in bytes. Given a surface origin at the top left corner, the step is always positive.
- Alignment. The alignment of a surface is at the discretion of the graphics display driver. The surface must ever be DWORD aligned; that is, individual lines within the surface are guaranteed to originate on a 32-chip (DWORD) boundary. The alignment can exist larger than 32 bits, however, depending on the needs of the hardware.
- Packed format versus planar format. YUV formats are divided into packed formats and planar formats. In a packed format, the Y, U, and V components are stored in a single array. Pixels are organized into groups of macropixels, whose layout depends on the format. In a planar format, the Y, U, and V components are stored as iii split planes.
Each of the YUV formats described in this article has an assigned FOURCC lawmaking. A FOURCC lawmaking is a 32-fleck unsigned integer that is created by concatenating four ASCII characters.
- 4:iv:4 (32 bpp)
- AYUV
- four:2:two (16 bpp)
- YUY2
- UYVY
- four:2:0 (sixteen bpp)
- IMC1
- IMC3
- 4:two:0 (12 bpp)
- IMC2
- IMC4
- YV12
- NV12
four:4:iv Formats, 32 Bits per Pixel
AYUV
A unmarried four:four:4 format is recommended, with the FOURCC code AYUV. This is a packed format, where each pixel is encoded as four consecutive bytes, arranged in the sequence shown in the following illustration.
The bytes marked A incorporate values for alpha.
4:2:2 Formats, xvi Bits per Pixel
2 4:2:2 formats are recommended, with the post-obit FOURCC codes:
- YUY2
- UYVY
Both are packed formats, where each macropixel is two pixels encoded as four sequent bytes. This results in horizontal downsampling of the blush by a factor of two.
YUY2
In YUY2 format, the data tin can be treated as an array of unsigned char values, where the showtime byte contains the commencement Y sample, the 2nd byte contains the commencement U (Cb) sample, the tertiary byte contains the second Y sample, and the fourth byte contains the first V (Cr) sample, as shown in the following diagram.
If the prototype is addressed as an assortment of little-endian WORD values, the kickoff Word contains the offset Y sample in the least significant bits (LSBs) and the showtime U (Cb) sample in the well-nigh meaning bits (MSBs). The 2nd WORD contains the 2d Y sample in the LSBs and the first 5 (Cr) sample in the MSBs.
YUY2 is the preferred 4:2:2 pixel format for Microsoft DirectX Video Acceleration (DirectX VA). It is expected to be an intermediate-term requirement for DirectX VA accelerators supporting four:two:2 video.
UYVY
This format is the same as the YUY2 format except the byte lodge is reversed—that is, the chroma and luma bytes are flipped (Figure four). If the image is addressed as an array of ii little-endian Discussion values, the outset Give-and-take contains U in the LSBs and Y0 in the MSBs, and the second Word contains V in the LSBs and Y1 in the MSBs.
4:2:0 Formats, xvi Bits per Pixel
Ii 4:ii:0 sixteen-$.25 per pixel (bpp) formats are recommended, with the following FOURCC codes:
- IMC1
- IMC3
Both of these YUV formats are planar formats. The chroma channels are subsampled past a factor of two in both the horizontal and vertical dimensions.
IMC1
All of the Y samples appear first in retention as an array of unsigned char values. This is followed past all of the V (Cr) samples, and then all of the U (Cb) samples. The V and U planes take the same stride every bit the Y aeroplane, resulting in unused areas of retention, as shown in Figure 5. The U and V planes must start on memory boundaries that are a multiple of xvi lines. Figure 5 shows the origin of U and V for a 352 x 240 video frame. The starting address of the U and Five planes are calculated every bit follows:
BYTE* pV = pY + (((Height + 15) & ~fifteen) * Stride); BYTE* pU = pY + (((((Height * 3) / 2) + fifteen) & ~15) * Stride); where pY is a byte pointer to the start of the retention array, equally shown in the following diagram.
IMC3
This format is identical to IMC1, except the U and 5 planes are swapped, every bit shown in the post-obit diagram.
4:ii:0 Formats, 12 $.25 per Pixel
Four 4:2:0 12-bpp formats are recommended, with the post-obit FOURCC codes:
- IMC2
- IMC4
- YV12
- NV12
In all of these formats, the chroma channels are subsampled by a factor of ii in both the horizontal and vertical dimensions.
IMC2
This format is the aforementioned as IMC1 except for the following divergence: The V (Cr) and U (Cb) lines are interleaved at one-half-stride boundaries. In other words, each total-stride line in the chroma area starts with a line of V samples, followed by a line of U samples that begins at the next half-stride boundary (Figure seven). This layout makes more efficient use of accost space than IMC1. Information technology cuts the chroma address space in half, and thus the total address space past 25 percent. Among four:2:0 formats, IMC2 is the second-most preferred format, afterwards NV12. The following image illustrates this process.
IMC4
This format is identical to IMC2, except the U (Cb) and Five (Cr) lines are swapped, every bit shown in the following illustration.
YV12
All of the Y samples appear outset in memory every bit an assortment of unsigned char values. This array is followed immediately by all of the V (Cr) samples. The stride of the V aeroplane is half the stride of the Y plane; and the V plane contains half as many lines as the Y plane. The 5 plane is followed immediately by all of the U (Cb) samples, with the same footstep and number of lines as the 5 plane, as shown in the following illustration.
NV12
All of the Y samples appear first in retentiveness every bit an array of unsigned char values with an even number of lines. The Y aeroplane is followed immediately by an array of unsigned char values that contains packed U (Cb) and V (Cr) samples. When the combined U-V array is addressed as an assortment of little-endian WORD values, the LSBs contain the U values, and the MSBs comprise the V values. NV12 is the preferred 4:two:0 pixel format for DirectX VA. It is expected to be an intermediate-term requirement for DirectX VA accelerators supporting iv:2:0 video. The post-obit illustration shows the Y plane and the array that contains packed U and V samples.
Color Infinite and Chroma Sampling Rate Conversions
This section provides guidelines for converting between YUV and RGB, and for converting between some unlike YUV formats. Nosotros consider two RGB encoding schemes in this section: viii-bit computer RGB, likewise known as sRGB or "full-scale" RGB, and studio video RGB, or "RGB with head-room and toe-room." These are defined every bit follows:
- Computer RGB uses 8 bits for each sample of cerise, green, and blue. Blackness is represented past R = G = B = 0, and white is represented past R = Chiliad = B = 255.
- Studio video RGB uses some number of bits N for each sample of red, green, and blue, where N is 8 or more. Studio video RGB uses a dissimilar scaling gene than calculator RGB, and it has an offset. Black is represented by R = G = B = 16*2^(N-8), and white is represented by R = G = B = 235*2^(North-8). Withal, bodily values may autumn outside this range.
Studio video RGB is the preferred RGB definition for video in Windows, while estimator RGB is the preferred RGB definition for not-video applications. In either form of RGB, the chromaticity coordinates are as specified in ITU-R BT.709 for the definition of the RGB colour primaries. The (x,y) coordinates of R, Chiliad, and B are (0.64, 0.33), (0.30, 0.60), and (0.15, 0.06), respectively. Reference white is D65 with coordinates (0.3127, 0.3290). Nominal gamma is ane/0.45 (approximately 2.2), with precise gamma defined in item in ITU-R BT.709.
Conversion between RGB and 4:four:four YUV
We first describe conversion between RGB and 4:four:4 YUV. To convert 4:2:0 or 4:two:2 YUV to RGB, we recommend converting the YUV data to four:four:iv YUV, and then converting from 4:four:4 YUV to RGB. The AYUV format, which is a iv:4:4 format, uses 8 bits each for the Y, U, and V samples. YUV can also exist defined using more than than viii bits per sample for some applications.
Two dominant YUV conversions from RGB have been defined for digital video. Both are based on the specification known as ITU-R Recommendation BT.709. The first conversion is the older YUV form defined for fifty-Hz utilise in BT.709. It is the aforementioned as the relation specified in ITU-R Recommendation BT.601, also known by its older name, CCIR 601. Information technology should be considered the preferred YUV format for standard-definition TV resolution (720 x 576) and lower-resolution video. Information technology is characterized past the values of ii constants Kr and Kb:
Kr = 0.299 Kb = 0.114 The second conversion is the newer YUV course defined for threescore-Hz employ in BT.709, and should exist considered the preferred format for video resolutions above SDTV. It is characterized by different values for these 2 constants:
Kr = 0.2126 Kb = 0.0722 Conversion from RGB to YUV is defined by starting with the post-obit:
L = Kr * R + Kb * B + (1 - Kr - Kb) * 1000 The YUV values are then obtained as follows:
Y = floor(two^(Thou-8) * (219*(50-Z)/S + 16) + 0.5) U = clip3(0, (2^Thou)-1, floor(2^(M-8) * (112*(B-L) / ((i-Kb)*S) + 128) + 0.5)) V = clip3(0, (2^Chiliad)-1, floor(2^(M-viii) * (112*(R-L) / ((1-Kr)*Due south) + 128) + 0.5)) where
- M is the number of $.25 per YUV sample (M >= eight).
- Z is the black-level variable. For calculator RGB, Z equals 0. For studio video RGB, Z equals xvi*two^(Northward-viii), where N is the number of bits per RGB sample (Due north >= 8).
- Due south is the scaling variable. For computer RGB, South equals 255. For studio video RGB, South equals 219*2^(N-eight).
The role flooring(ten) returns the largest integer less than or equal to x. The function clip3(x, y, z) is defined as follows:
clip3(ten, y, z) = ((z < x) ? x : ((z > y) ? y : z)) Notation
clip3 should be implemented as a function rather than a preprocessor macro; otherwise multiple evaluations of the arguments will occur.
The Y sample represents effulgence, and the U and V samples represent the color deviations toward blueish and red, respectively. The nominal range for Y is 16*2^(Chiliad-8) to 235*2^(Grand-viii). Blackness is represented as 16*2^(M-8), and white is represented as 235*ii^(M-8). The nominal range for U and V are 16*ii^(Thou-8) to 240*2^(M-8), with the value 128*2^(M-eight) representing neutral chroma. However, bodily values may fall outside these ranges.
For input data in the form of studio video RGB, the clip performance is necessary to go along the U and 5 values within the range 0 to (2^M)-i. If the input is computer RGB, the clip operation is not required, because the conversion formula cannot produce values outside of this range.
These are the exact formulas without approximation. Everything that follows in this document is derived from these formulas. This section describes the post-obit conversions:
- Converting RGB888 to YUV 4:four:iv
- Converting 8-bit YUV to RGB888
- Converting 4:two:0 YUV to 4:2:2 YUV
- Converting iv:2:2 YUV to 4:4:four YUV
- Converting 4:ii:0 YUV to 4:4:4 YUV
Converting RGB888 to YUV 4:4:4
In the case of calculator RGB input and 8-bit BT.601 YUV output, we believe that the formulas given in the previous section tin exist reasonably approximated by the following:
Y = ( ( 66 * R + 129 * G + 25 * B + 128) >> viii) + 16 U = ( ( -38 * R - 74 * G + 112 * B + 128) >> 8) + 128 5 = ( ( 112 * R - 94 * G - 18 * B + 128) >> viii) + 128 These formulas produce eight-bit results using coefficients that require no more than 8 bits of (unsigned) precision. Intermediate results will require up to sixteen $.25 of precision.
Converting 8-bit YUV to RGB888
From the original RGB-to-YUV formulas, one tin derive the following relationships for BT.601.
Y = round( 0.256788 * R + 0.504129 * G + 0.097906 * B) + 16 U = round(-0.148223 * R - 0.290993 * Grand + 0.439216 * B) + 128 V = round( 0.439216 * R - 0.367788 * G - 0.071427 * B) + 128 Therefore, given:
C = Y - 16 D = U - 128 East = V - 128 the formulas to catechumen YUV to RGB can be derived equally follows:
R = prune( round( 1.164383 * C + 1.596027 * E ) ) G = clip( circular( 1.164383 * C - (0.391762 * D) - (0.812968 * E) ) ) B = clip( round( i.164383 * C + two.017232 * D ) ) where prune() denotes clipping to a range of [0..255]. We believe these formulas can be reasonably approximated past the post-obit:
R = prune(( 298 * C + 409 * E + 128) >> 8) G = prune(( 298 * C - 100 * D - 208 * East + 128) >> viii) B = clip(( 298 * C + 516 * D + 128) >> viii) These formulas use some coefficients that require more 8 bits of precision to produce each 8-flake result, and intermediate results volition require more than sixteen bits of precision.
To catechumen 4:ii:0 or 4:2:2 YUV to RGB, we recommend converting the YUV data to 4:4:four YUV, and then converting from 4:4:4 YUV to RGB. The sections that follow present some methods for converting four:two:0 and 4:2:ii formats to 4:four:4.
Converting 4:2:0 YUV to four:2:2 YUV
Converting 4:2:0 YUV to 4:2:ii YUV requires vertical upconversion by a factor of two. This section describes an case method for performing the upconversion. The method assumes that the video pictures are progressive scan.
Note
The 4:2:0 to 4:2:ii interlaced scan conversion process presents atypical problems and is hard to implement. This commodity does non accost the issue of converting interlaced browse from four:ii:0 to four:ii:two.
Allow each vertical line of input blush samples be an array Cin[] that ranges from 0 to N - 1. The respective vertical line on the output paradigm will be an array Cout[] that ranges from 0 to 2N - 1. To convert each vertical line, perform the post-obit process:
Cout[0] = Cin[0]; Cout[1] = clip((9 * (Cin[0] + Cin[1]) - (Cin[0] + Cin[two]) + 8) >> 4); Cout[2] = Cin[1]; Cout[iii] = clip((9 * (Cin[one] + Cin[ii]) - (Cin[0] + Cin[iii]) + 8) >> 4); Cout[4] = Cin[two] Cout[five] = prune((9 * (Cin[2] + Cin[3]) - (Cin[1] + Cin[4]) + 8) >> 4); ... Cout[ii*i] = Cin[i] Cout[two*i+1] = clip((nine * (Cin[i] + Cin[i+1]) - (Cin[i-1] + Cin[i+2]) + 8) >> 4); ... Cout[2*N-3] = clip((9 * (Cin[N-2] + Cin[Due north-i]) - (Cin[Due north-3] + Cin[Northward-one]) + 8) >> four); Cout[ii*N-2] = Cin[North-ane]; Cout[2*North-1] = clip((nine * (Cin[Due north-1] + Cin[Due north-ane]) - (Cin[N-2] + Cin[Northward-1]) + 8) >> 4); where clip() denotes clipping to a range of [0..255].
Note
The equations for handling the edges can be mathematically simplified. They are shown in this form to illustrate the clamping effect at the edges of the motion picture.
In effect, this method calculates each missing value by interpolating the curve over the 4 adjacent pixels, weighted toward the values of the two nearest pixels (Figure 11). The specific interpolation method used in this instance generates missing samples at half-integer positions using a well-known method called Catmull-Rom interpolation, also known equally cubic convolution interpolation.
In signal processing terms, the vertical upconversion should ideally include a stage shift compensation to account for the half-pixel vertical commencement (relative to the output 4:2:2 sampling grid) betwixt the locations of the iv:2:0 sample lines and the location of every other 4:2:2 sample line. However, introducing this outset would increase the amount of processing required to generate the samples, and brand it impossible to reconstruct the original 4:2:0 samples from the upsampled 4:ii:2 prototype. It would too go far impossible to decode video directly into four:2:2 surfaces and then use those surfaces as reference pictures for decoding subsequent pictures in the stream. Therefore, the method provided here does not take into business relationship the precise vertical alignment of the samples. Doing so is probably not visually harmful at reasonably high moving-picture show resolutions.
If you start with four:2:0 video that uses the sampling grid defined in H.261, H.263, or MPEG-one video, the phase of the output 4:ii:2 chroma samples will besides be shifted by a half-pixel horizontal first relative to the spacing on the luma sampling grid (a quarter-pixel offset relative to the spacing of the four:2:ii blush sampling grid). However, the MPEG-2 course of 4:2:0 video is probably more than usually used on PCs and does not suffer from this problem. Moreover, the distinction is probably non visually harmful at reasonably high picture resolutions. Trying to correct for this trouble would create the same sort of problems discussed for the vertical phase offset.
Converting 4:2:2 YUV to four:four:iv YUV
Converting 4:2:ii YUV to four:iv:4 YUV requires horizontal upconversion past a cistron of ii. The method described previously for vertical upconversion tin likewise exist applied to horizontal upconversion. For MPEG-2 and ITU-R BT.601 video, this method will produce samples with the correct stage alignment.
Converting 4:2:0 YUV to 4:iv:4 YUV
To catechumen 4:2:0 YUV to 4:four:4 YUV, you lot can just follow the ii methods described previously. Convert the 4:2:0 image to 4:2:two, and then convert the 4:2:2 image to 4:4:4. You can also switch the order of the 2 upconversion processes, as the guild of operation does non really matter to the visual quality of the result.
Other YUV Formats
Some other, less common YUV formats include the following:
- AI44 is a palettized YUV format with eight bits per sample. Each sample contains an index in the 4 most pregnant bits (MSBs) and an alpha value in the iv least meaning bits (LSBs). The index refers to an array of YUV palette entries, which must be divers in the media type for the format. This format is primarily used for subpicture images.
- NV11 is a 4:1:1 planar format with 12 bits per pixel. The Y samples appear first in retention. The Y airplane is followed by an array of packed U (Cb) and V (Cr) samples. When the combined U-5 assortment is addressed as an array of piddling-endian Word values, the U samples are contained in the LSBs of each Discussion, and the V samples are contained in the MSBs. (This memory layout is like to NV12 although the chroma sampling is dissimilar.)
- Y41P is a 4:1:1 packed format, with U and V sampled every quaternary pixel horizontally. Each macropixel contains 8 pixels in three bytes, with the following byte layout:
U0 Y0 V0 Y1 U4 Y2 V4 Y3 Y4 Y5 Y6 Y7 - Y41T is identical to Y41P, except the least-pregnant bit of each Y sample specifies the chroma key (0 = transparent, one = opaque).
- Y42T is identical to UYVY, except the least-significant bit of each Y sample specifies the chroma key (0 = transparent, 1 = opaque).
- YVYU is equivalent to YUYV except the U and V samples are swapped.
Each of the YUV formats described in this article has an assigned FOURCC code. A FOURCC lawmaking is a 32-fleck unsigned integer that is created past concatenating 4 ASCII characters.
There are various C/C++ macros that get in easier to declare FOURCC values in source code. For example, the MAKEFOURCC macro is alleged in Mmsystem.h, and the FCC macro is declared in Aviriff.h. Use them as follows:
DWORD fccYUY2 = MAKEFOURCC('Y','U','Y','2'); DWORD fccYUY2 = FCC('YUY2'); You tin can also declare a FOURCC code directly as a string literal simply past reversing the lodge of the characters. For example:
DWORD fccYUY2 = '2YUY'; // Declares the FOURCC 'YUY2' Reversing the society is necessary because the Windows operating system uses a niggling-endian architecture. 'Y' = 0x59, 'U' = 0x55, and 'two' = 0x32, so '2YUY' is 0x32595559.
In Media Foundation, formats are identified past a major type GUID and a subtype GUID. The major type for computer video formats is ever MFMediaType_Video . The subtype can be constructed by mapping the FOURCC code to a GUID, as follows:
XXXXXXXX-0000-0010-8000-00AA00389B71 where XXXXXXXX is the FOURCC lawmaking. Thus, the subtype GUID for YUY2 is:
32595559-0000-0010-8000-00AA00389B71 Constants for the most mutual YUV format GUIDs are defined in the header file mfapi.h. For a listing of these constants, see Video Subtype GUIDs.
-
About YUV Video
-
Video Media Types
Source: https://docs.microsoft.com/en-us/windows/win32/medfound/recommended-8-bit-yuv-formats-for-video-rendering
0 Response to "Problem Reading an Rgb Value at Pixel (0, 0). Most Likely Due to a Non-integer Value."
Enregistrer un commentaire