DirectX Video Acceleration Specification
for High Efficiency Video Coding (HEVC)
9 August 2013
Gary J. Sullivan and Yongjun Wu
© 2013 Microsoft Corporation. All rights reserved. Any use, distribution or public discussion of, and any
feedback to, these materials is subject to the terms of the attached license. By providing any feedback on
these materials to Microsoft, you agree to the terms of that license.
Abstract – This document specifies support for the High Efficiency Video Coding (HEVC)
codec within the Microsoft Windows DirectX Video Acceleration (DXVA) API/DDI context. This includes
support of the HEVC Main profile, Main Still Picture profile, and Main 10 profile as important special
cases. The document describes high-level design concepts and specific HEVC extensions to DXVA
interfaces and data structures for HEVC video decoding. This document specifies only off-host VLD
profiles for HEVC video decoding.
Microsoft Corporation Technical Documentation License Agreement
READ THIS! THIS IS A LEGAL AGREEMENT BETWEEN MICROSOFT CORPORATION ("MICROSOFT") AND THE RECIPIENT OF THESE
MATERIALS, WHETHER AN INDIVIDUAL OR AN ENTITY ("YOU"). IF YOU HAVE ACCESSED THIS AGREEMENT IN THE PROCESS OF
DOWNLOADING MATERIALS ("MATERIALS") FROM A MICROSOFT WEB SITE, BY CLICKING "I ACCEPT", DOWNLOADING, USING OR
PROVIDING FEEDBACK ON THE MATERIALS, YOU AGREE TO THESE TERMS. IF THIS AGREEMENT IS ATTACHED TO MATERIALS, BY
ACCESSING, USING OR PROVIDING FEEDBACK ON THE ATTACHED MATERIALS, YOU AGREE TO THESE TERMS.
For good and valuable consideration, the receipt and sufficiency of which are acknowledged, You and Microsoft agree as follows:
1. You may review these Materials only (a) as a reference to assist You in planning and designing Your product, service or technology
("Product") to interface with a Microsoft Product as described in these Materials; and (b) to provide feedback on these Materials to
Microsoft. All other rights are retained by Microsoft; this agreement does not give You rights under any Microsoft patents. You may
not (i) duplicate any part of these Materials, (ii) remove this agreement or any notices from these Materials, or (iii) give any part of
these Materials, or assign or otherwise provide Your rights under this agreement, to anyone else.
2. No part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any
means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of
Microsoft Corporation.
3. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places
and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail
address, logo, person, place or event is intended or should be inferred.
4. These Materials may contain preliminary information or inaccuracies, and may not correctly represent any associated Microsoft
Product as commercially released. All Materials are provided entirely "AS IS." To the extent permitted by law, MICROSOFT MAKES
NO WARRANTY OF ANY KIND, DISCLAIMS ALL EXPRESS, IMPLIED AND STATUTORY WARRANTIES, INCLUDING ALL WARRANTIES OF
NON-INFRINGEMENT, AND ASSUMES NO LIABILITY TO YOU FOR ANY DAMAGES OF ANY TYPE IN CONNECTION WITH THESE
MATERIALS OR ANY INTELLECTUAL PROPERTY IN THEM.
5. Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject
matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this
document does not give you any license to these patents, trademarks, copyrights, or other intellectual property. Implementation or
use of the proposed HEVC standard may require patent licenses from third parties. You are responsible for securing any such patent
licenses. No patent licenses are provided under this Agreement. Microsoft shall not be liable for any damages arising out of or in
connection with the use of these specifications, including liability for lost profit, business interruption, or any other damages
whatsoever. Some states do not allow the exclusion or limitation of liability or consequential or incidental damages; the above
limitation may not apply to you.
6. You have no obligation to give Microsoft any suggestions, comments or other feedback ("Feedback") relating to these Materials.
However, any Feedback you voluntarily provide may be used in Microsoft Products and related specifications or other
documentation (collectively, "Microsoft Offerings") which in turn may be relied upon by other third parties to develop their own
Products. Accordingly, if You do give Microsoft Feedback on any version of these Materials or the Microsoft Offerings to which they
apply, You agree: (a) Microsoft may freely use, reproduce, license, distribute, and otherwise commercialize Your Feedback in any
Microsoft Offering; (b) You also grant third parties, without charge, only those patent rights necessary to enable other Products to
use or interface with any specific parts of a Microsoft Product that incorporate Your Feedback; and (c) You will not give Microsoft
any Feedback (i) that You have reason to believe is subject to any patent, copyright or other intellectual property claim or right of
any third party; or (ii) subject to license terms which seek to require any Microsoft Offering incorporating or derived from such
Feedback, or other Microsoft intellectual property, to be licensed to or otherwise shared with any third party.
© 2013 Microsoft Corporation. All rights reserved. By using or providing feedback on these materials, you agree to
the attached license agreement.
7. This agreement is governed by the laws of the State of Washington. Any dispute involving it must be brought in the federal or
state superior courts located in King County, Washington, and You waive any defenses allowing the dispute to be litigated
elsewhere. If there is litigation, the losing party must pay the other party’s reasonable attorneys’ fees, costs and other expenses. If
any part of this agreement is unenforceable, it will be considered modified to the extent necessary to make it enforceable, and the
remainder shall continue in effect. This agreement is the entire agreement between You and Microsoft concerning these Materials; it
may be changed only by a written document signed by both You and Microsoft.
Contents
1. Introduction
1.1 Referenced Specifications and Referenced Software
1.2 General Design Considerations
1.3 Support Only for Off-Host VLD Operation
1.4 HEVC coded pictures, frame/field considerations, and memory allocation recommendations
1.5 Picture Data
1.6 Buffer Types
1.7 DXVA Decoding Operations
1.8 Status Reporting
1.9 Accelerator Internal Information Storage
1.10 Configuration Parameters
1.10.1 Syntax
1.10.2 Semantics
2. DXVA_PicEntry_HEVC Data Structure
2.1 Syntax
2.2 Semantics
3. Picture Parameters Data Structure
3.1 Syntax
3.2 Semantics
4. Quantization Matrix Data Structure
4.1 Syntax
4.2 Semantics
5. Slice Control Data Structure
5.1 Syntax
5.2 Semantics
6. Status Report Data Structure
6.1 Syntax
6.2 Semantics
7. Restricted-Mode Profiles
7.1 DXVA_ModeHEVC_VLD_Main Profile
7.2 DXVA_ModeHEVC_VLD_Main10 Profile
8. For More Information
1. Introduction
This specification defines extensions to DirectX® Video Acceleration (DXVA) to support decoding of HEVC
video, as specified in a video compression standard published jointly as Rec. ITU-T H.265 (in force April
2013) and ISO/IEC 23008-2 (to be published).
This specification assumes that you are familiar with the HEVC standard specification and with the basic
design of DXVA.
DXVA consists of a DDI for display drivers and an API for software decoders. Version 1.0 of DXVA is
supported in Windows 2000 or later versions. Version 2.0 is available starting in Windows Vista.
Considering the passage of time and the increasing prevalence of DXVA 2.0 support, this document
specifies the DXVA 2.0 operation for HEVC video decoding. We do not plan to specify HEVC video
decoding in the DXVA 1.0 context.
In DXVA, some decoding operations are implemented by the graphics hardware driver and GPU. This set
of functionality is termed the accelerator. Other decoding operations are implemented by user-mode
application software, called the host decoder or software decoder. Processing performed by the
accelerator is sometimes referred to as off-host processing. Typically the accelerator uses the GPU to
speed up some operations. When the accelerator performs a decoding operation, the host decoder
sends buffers of data to the accelerator that contain the information needed to perform the
operation.
Except where stated otherwise in this specification, DXVA operations in the accelerator shall be
stateless; the accelerator design shall not contain assumptions about the sequence of decoding
operations or about internal-memory state dependencies. This is necessary to enable good "trick play"
and loss resilience functionality.
Note – In this document, the term shall describes behavior that is required by the specification. The
term should describes behavior that is encouraged but not required. The term note refers to
observations about implications of the specification.
Questions or comments about this specification may be sent to [email protected]
1.1 Referenced Specifications and Referenced Software
The referenced HEVC video coding standard (2013 edition) is specified in the following document:
Rec. ITU-T H.265 | ISO/IEC 23008-2:2013, High efficiency video coding (HEVC)
That standard is publicly available at the following link:
http://www.itu.int/rec/T-REC-H.265 (approved 2013-04-13, published 2013-06-17)
Associated draft standard HM reference software is available at the following link:
http://hevc.info
1.2 General Design Considerations
Section 1 of this specification provides an overview of the DXVA design for HEVC video decoding. It is
intended as background information, and may be helpful in understanding the sections that follow. In
the case of conflicts, later sections of this document override this section. The initial design here is
intended to be sufficient for decoding bitstreams of the Main profile, Main Still Picture profile and Main
10 profile.
1.3 Support Only for Off-Host VLD Operation
Over time, the level of industry interest in supporting modes of DXVA operation other than off-host VLD
operation (e.g., as in the DXVA_ModeH264_MoComp_NoFGT and DXVA_ModeH264_IDCT_NoFGT
profiles of DXVA operation for H.264/AVC video decoding, and the DXVA_ModeWMV9_PostProc and
DXVA_ModeVC1_IDCT profiles of DXVA operation for WMV9/VC-1 video decoding) appears to have
waned. We therefore do not plan to specify such modes of DXVA operation for HEVC video decoding,
but only off-host VLD mode of DXVA operation for HEVC video decoding.
1.4 HEVC coded pictures, frame/field considerations, and memory allocation recommendations
The HEVC specification does not make an explicit distinction, within the decoding process, regarding
whether a video picture represents a full frame of video content or an individual field. It is generally
anticipated that when the entire video content consists of progressive-scan video, pictures would
represent complete frames. However, the progressive-versus-interlaced nature of the source material is
left outside the scope of the decoding process, and an HEVC coded video sequence may consist either of
coded frames or coded fields (but not both within the same coded video sequence), regardless of
whether the source video material used an interlaced or progressive scan. Because the HEVC
specification does not contain low-level coding features for switching between frame and field coding,
this DXVA specification also does not consider whether the pictures represent frames or fields. Any
special handling of pictures that represent individual fields must therefore be performed separately,
outside the scope of this DXVA decoding specification.
Each decoded destination surface therefore represents a decoded picture, which in general could be a
complete frame or a single field of video content.
The decoding process for HEVC decoded pictures operates using a picture size that, in general, may be
larger than the region to be displayed. The region of the decoded picture that is output for display is
selected by a window known as the conformance cropping window. However, the size and location of
the conformance cropping window do not affect the internal operation of the decoding process. Thus
the cropping operation is handled as a display operation outside the scope of this DXVA specification.
The internal operation of the decoding process is performed using a memory region that has a size that
is an integer multiple of a luma coding block size that has a width and height selected by the encoder
and is denoted by the variable MinCbSizeY. The value of MinCbSizeY may be equal to 8, 16, 32, or 64.
The value of MinCbSizeY is conveyed in the syntax using a syntax element
log2_min_luma_coding_block_size_minus3, such that MinCbSizeY =
( 1 << ( log2_min_luma_coding_block_size_minus3 + 3 ) ).
Better compression can generally be achieved when the encoder uses a smaller value of MinCbSizeY.
However, supporting a smaller value of MinCbSizeY would ordinarily increase encoder complexity. Thus,
encoders may typically be designed to operate using a specific value of MinCbSizeY, but not all encoders
may use the same value of MinCbSizeY.
To encode video with a particular source picture resolution in width and height, encoders may typically
just round the picture width and height up to the nearest multiple of the value of MinCbSizeY for which
they are designed to operate.
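The derivation of MinCbSizeY and the rounding of the source dimensions can be sketched as follows. This is a minimal illustration, not part of the DXVA interface: the function names min_cb_size_y and round_up_to_min_cb are hypothetical, and MinCbSizeY is computed as in the HEVC standard, where the syntax element carries the base-2 logarithm of the block size minus 3.

```c
#include <assert.h>
#include <stdint.h>

/* MinCbSizeY = 1 << (log2_min_luma_coding_block_size_minus3 + 3).
 * Syntax element values 0..3 give block sizes 8, 16, 32 and 64. */
uint32_t min_cb_size_y(uint32_t log2_min_luma_coding_block_size_minus3)
{
    assert(log2_min_luma_coding_block_size_minus3 <= 3);
    return 1u << (log2_min_luma_coding_block_size_minus3 + 3);
}

/* Hypothetical helper: round a source picture dimension up to the next
 * multiple of MinCbSizeY. min_cb must be a power of two. */
uint32_t round_up_to_min_cb(uint32_t source_dim, uint32_t min_cb)
{
    assert(min_cb != 0 && (min_cb & (min_cb - 1)) == 0);
    return (source_dim + min_cb - 1) & ~(min_cb - 1);
}
```

For example, a 1920x1080 source coded with MinCbSizeY equal to 16 keeps its width of 1920 and has its height rounded up to 1088.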
However, video programs may contain segments of video content that have been spliced together from
different sources. These segments may be coded with different source picture resolutions and different
values of MinCbSizeY.
When using DXVA decoding, it is desirable for host decoders to avoid glitching by avoiding any
unnecessary resource-intensive changes of memory surface allocation for accelerators caused by
changes in the source video resolution or changes of MinCbSizeY from segment to segment. It may
therefore be desirable for host decoders to allocate somewhat larger memory surfaces than the
minimum that would be necessary to decode each individual coded video sequence in the bitstream.
In particular, it is suggested that host decoders should always allocate memory surfaces with sizes that
are multiples of 64 in both height and width. This practice can avoid the need to reallocate the surfaces
if the value of MinCbSizeY increases when the source video content resolution and the number of
allocated surfaces have stayed the same.
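One way a host decoder might apply this suggestion is sketched below. The SurfaceSize type and both function names are hypothetical and exist only for illustration; the alignment of 64 is the value suggested in the text, which is also the largest permitted value of MinCbSizeY.

```c
#include <assert.h>
#include <stdint.h>

#define SURFACE_ALIGNMENT 64u /* multiple suggested for surface width and height */

/* Hypothetical container for an allocated surface size in luma samples. */
typedef struct {
    uint32_t width;
    uint32_t height;
} SurfaceSize;

/* Round both source dimensions up to multiples of 64. */
SurfaceSize surface_size_for(uint32_t source_width, uint32_t source_height)
{
    SurfaceSize s;
    assert(source_width > 0 && source_height > 0);
    s.width  = (source_width  + SURFACE_ALIGNMENT - 1) / SURFACE_ALIGNMENT * SURFACE_ALIGNMENT;
    s.height = (source_height + SURFACE_ALIGNMENT - 1) / SURFACE_ALIGNMENT * SURFACE_ALIGNMENT;
    return s;
}

/* Returns nonzero when an existing surface is large enough for a new
 * segment with the given coded picture dimensions. */
int surface_can_be_reused(SurfaceSize allocated,
                          uint32_t coded_width, uint32_t coded_height)
{
    return coded_width <= allocated.width && coded_height <= allocated.height;
}
```

With this sizing, a surface allocated for a 1920x1080 source is 1920x1088, so it can hold the coded picture for any of the four MinCbSizeY values without reallocation.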
1.5 Picture Data
The following data must be conveyed for each picture so that each picture can be decoded independently,
without serial dependencies. For simplicity, the same names as in the HEVC specification are used. For
further details, see section 3 (Picture Parameters Data Structure) of this specification.

Basic coding parameter dimension and color format information, including
o chroma_format_idc
o separate_colour_plane_flag
o PicWidthInMinCbsY and PicHeightInMinCbsY
o log2_min_luma_coding_block_size_minus3
o log2_diff_max_min_luma_coding_block_size
o log2_min_transform_block_size_minus2
o log2_diff_max_min_transform_block_size
o max_transform_hierarchy_depth_inter
o max_transform_hierarchy_depth_intra
o bit_depth_luma_minus8 and bit_depth_chroma_minus8

Picture buffering state and reference list related information, including:
o CurrPic (indicating the current destination surface)
o CurrPicOrderCntVal
o sps_max_dec_pic_buffering_minus1
o RefPicList[]
o Flags for which pictures are treated as long-term reference pictures (in this design, these
are included in RefPicList[])
o num_ref_idx_l0_default_active_minus1
o num_ref_idx_l1_default_active_minus1
o PicOrderCntValList[]
o RefPicSetStCurrBefore[]
o RefPicSetStCurrAfter[]
o RefPicSetLtCurr[]

Flags and associated data controlling particular coding features that are the same within a
picture, including
o QP control parameters, including cu_qp_delta_enabled_flag, diff_cu_qp_delta_depth,
init_qp_minus26, pps_cb_qp_offset, and pps_cr_qp_offset
o PCM control parameters, including pcm_enabled_flag and the associated data
pcm_sample_bit_depth_luma_minus1, pcm_sample_bit_depth_chroma_minus1,
log2_min_pcm_luma_coding_block_size_minus3,
log2_diff_max_min_pcm_luma_coding_block_size, and pcm_loop_filter_disabled_flag
o Quantization scaling list control parameters, including scaling_list_enabled_flag and the
associated scaling lists. (When scaling_list_enabled_flag is equal to 0 and thus "flat"
scaling lists with all entries equal to 16 are used, the host shall not send the scaling lists
to the accelerator.)
o Tiling control parameters, including tiles_enabled_flag, num_tile_columns_minus1,
num_tile_rows_minus1, uniform_spacing_flag, column_width_minus1[],
row_height_minus1[], and loop_filter_across_tiles_enabled_flag
o amp_enabled_flag
o sample_adaptive_offset_enabled_flag
o sps_temporal_mvp_enabled_flag
o strong_intra_smoothing_enabled_flag
o sign_data_hiding_enabled_flag
o constrained_intra_pred_flag
o transform_skip_enabled_flag
o transquant_bypass_enabled_flag
o weighted_pred_flag and weighted_bipred_flag
o entropy_coding_sync_enabled_flag
o pps_loop_filter_across_slices_enabled_flag
o log2_parallel_merge_level_minus2

Data that are the same within a picture that control syntax in the slice headers, including:
o log2_max_pic_order_cnt_lsb_minus4
o num_short_term_ref_pic_sets
o long_term_ref_pics_present_flag
o num_long_term_ref_pics_sps
o The value of NumDeltaPocs[ RefRpsIdx ] (herein called ucNumDeltaPocsOfRefRpsIdx)
that is used for parsing short_term_ref_pic_set( num_short_term_ref_pic_sets ) in the
slice header when short_term_ref_pic_set_sps_flag is equal to 0.
o dependent_slice_segments_enabled_flag
o output_flag_present_flag
o num_extra_slice_header_bits
o cabac_init_present_flag
o pps_slice_chroma_qp_offsets_present_flag
o deblocking_filter_override_enabled_flag
o pps_deblocking_filter_disabled_flag
o lists_modification_present_flag
o slice_segment_header_extension_present_flag

Some "helper" flags that are not specified in the standard and may not be essential, but may
possibly be helpful for optimizations in the accelerator, including:
o A flag, IrapPicFlag, indicating that the current picture is an IRAP picture.
o A flag, IdrPicFlag, indicating that the current picture is an IDR picture.
o A flag, IntraPicFlag, identifying pictures in which all slices are intra slices. (Not specified
in the standard and not essential but possibly helpful for optimizations in accelerator.)
o A flag, NoPicReorderingFlag, indicating that no picture reordering is used in the coded
video sequence.
o A flag, NoBiPredFlag, indicating that no biprediction is used in the coded video sequence.

Some initial or default values for deblocking, including:
o pps_beta_offset_div2
o pps_tc_offset_div2
1.6 Buffer Types
The host software decoder will send the following DXVA buffers to the accelerator in the off-host VLD
profile:

One picture parameters buffer.

Conditionally, one quantization matrix buffer. If scaling_list_enabled_flag is equal to 0 and thus
"flat" scaling lists with all entries equal to 16 are used, the host shall not send the scaling lists to
the accelerator; otherwise, if scaling_list_enabled_flag is equal to 1, the host shall send the
scaling lists to the accelerator in one quantization matrix data buffer.

One or more slice control buffers.

One or more bitstream data buffers.
These buffer types are defined in the prior DXVA specifications, but new data structures have been
defined herein for HEVC video decoding. The sequence of operations is described in section 1.7.
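The conditional rule for the quantization matrix buffer can be sketched as a small selection routine. The BufferKind labels and the select_buffers function are hypothetical stand-ins for illustration and are not the constants of the DXVA headers. The sketch lists one slice control buffer and one bitstream data buffer, although more than one of each is permitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the four buffer kinds named in this section. */
typedef enum {
    BUF_PICTURE_PARAMETERS,
    BUF_QUANTIZATION_MATRIX,
    BUF_SLICE_CONTROL,
    BUF_BITSTREAM_DATA
} BufferKind;

/* Fills out[] with the buffer kinds to send for one picture and returns
 * how many entries were written. out must have room for 4 entries. */
size_t select_buffers(int scaling_list_enabled_flag, BufferKind out[4])
{
    size_t n = 0;
    assert(out != NULL);
    out[n++] = BUF_PICTURE_PARAMETERS;
    if (scaling_list_enabled_flag) {
        /* Scaling lists are sent only when they are enabled; flat lists
         * (all entries equal to 16) are implied otherwise. */
        out[n++] = BUF_QUANTIZATION_MATRIX;
    }
    out[n++] = BUF_SLICE_CONTROL;
    out[n++] = BUF_BITSTREAM_DATA;
    return n;
}
```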
1.7 DXVA Decoding Operations
The basic sequence of operations for DXVA decoding consists of the following calls by the host software
decoder. In DXVA 2.0, they are part of the IDirectXVideoDecoder interface.
1. BeginFrame. Signals the start of one or more decoding operations by the accelerator, which will
cause the accelerator to write data into an uncompressed surface buffer.
2. Execute. The decoder calls Execute one or more times, sending one or more compressed data
buffers to the accelerator and specifying the operations to perform on the buffers. The
accelerator may return status information from the call. In DXVA 2.0, the command is specified
in the Function member of the optional DXVA2_DecodeExtensionData structure passed to
IDirectXVideoDecoder::Execute by the DXVA2_DecodeExecuteParams structure.
3. EndFrame. Signals that the host software decoder has sent all of the data needed for the
corresponding BeginFrame call.
For HEVC video decoding, the data passed with the Execute method includes a destination index to
indicate which uncompressed surface buffer is affected by the operation. The host software decoder can
call Execute more than once between each BeginFrame/EndFrame pair.
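The ordering of these calls can be expressed as a simple check, sketched below. The DecoderCall enumeration and is_valid_call_sequence are hypothetical names, and the check covers only decoding calls: an Execute call made after EndFrame to request a status report is outside its scope.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the three calls described above. */
typedef enum { CALL_BEGIN_FRAME, CALL_EXECUTE, CALL_END_FRAME } DecoderCall;

/* Returns 1 if calls[] consists of complete groups of the form
 * BeginFrame, one or more Execute, EndFrame; returns 0 otherwise. */
int is_valid_call_sequence(const DecoderCall *calls, size_t count)
{
    int in_frame = 0;   /* inside a BeginFrame/EndFrame pair */
    int executes = 0;   /* Execute calls seen in the current pair */
    size_t i;
    assert(calls != NULL || count == 0);
    for (i = 0; i < count; ++i) {
        switch (calls[i]) {
        case CALL_BEGIN_FRAME:
            if (in_frame) return 0;            /* nested BeginFrame */
            in_frame = 1;
            executes = 0;
            break;
        case CALL_EXECUTE:
            if (!in_frame) return 0;           /* decoding Execute outside a pair */
            ++executes;
            break;
        case CALL_END_FRAME:
            if (!in_frame || executes == 0) return 0;
            in_frame = 0;
            break;
        }
    }
    return !in_frame;                          /* last pair must be closed */
}
```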
During the BeginFrame/EndFrame sequence, the accelerator will, in some cases, access uncompressed
surfaces other than the surface being written to. For example, decoding a picture may require data from
one or more previously-decoded pictures for use as reference data for inter-picture prediction. If the
host software decoder issues a command that requires writing to a buffer, and then issues a command
that requires reading from the same buffer, it is the accelerator's responsibility to serialize these
operations. In other words, the accelerator must complete a preceding write operation before starting a
subsequent read operation on the same buffer.
The DXVA design for HEVC video decoding restricts the sequence of buffer types that can be sent to the
accelerator. For compressed picture decoding with off-host parsing (that is, in a VLD profile), the host
software decoder sends the following data buffers:

One picture parameters data buffer.

Conditionally, one quantization matrix data buffer. If scaling_list_enabled_flag in the picture
parameters data buffer is equal to 0, and thus "flat" scaling lists with all entries equal to 16 are
used, the host shall not send a quantization matrix data buffer to the accelerator.

One or more slice control data buffers.

One or more bitstream data buffers.
The host software decoder does not send buffers for status reporting feedback. Rather, it reads such
buffers when requesting status reporting feedback.
Two values of bDXVA_Func are defined, as follows:
Value   Description
1       Compressed picture decoding with off-host parsing
7       Request for status report
dwFunction shall contain exactly one of the two values listed here. Function 7 (status reporting) is
described in the next section.
Between a single pair of BeginFrame and EndFrame calls, the host software decoder can send one or
more sets of buffers with bDXVA_Func equal to 1 for off-host parsing.
The total quantity of data in any bitstream data buffer (and the amount of data reported by the host
software decoder) shall be an integer multiple of 128 bytes. The accelerator may treat any additional
padding of zero bytes for 128-byte alignment as trailing_zero_8bits (as defined in the Annex B byte
stream format of the HEVC specification) in order to identify the length of the actual NALU and
determine the actual end of the data in the bitstream.
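The 128-byte size rule amounts to rounding the payload size up and zero-filling the tail, as sketched below. Both function names are hypothetical, and the sketch assumes the caller owns a buffer whose capacity is given in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BITSTREAM_ALIGNMENT 128u /* required size multiple, in bytes */

/* Size of the payload after rounding up to a multiple of 128 bytes. */
size_t padded_bitstream_size(size_t payload_bytes)
{
    return (payload_bytes + BITSTREAM_ALIGNMENT - 1)
           / BITSTREAM_ALIGNMENT * BITSTREAM_ALIGNMENT;
}

/* Zero-fills the bytes between the end of the payload and the padded
 * size. Returns the padded size to report, or 0 if the buffer is too
 * small (an empty payload also yields 0). */
size_t pad_bitstream_buffer(unsigned char *buffer, size_t capacity,
                            size_t payload_bytes)
{
    size_t padded = padded_bitstream_size(payload_bytes);
    assert(buffer != NULL);
    if (padded > capacity) {
        return 0;
    }
    memset(buffer + payload_bytes, 0, padded - payload_bytes);
    return padded;
}
```

The zero bytes added this way are the ones the accelerator may interpret as trailing_zero_8bits.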
Whenever the host software decoder calls Execute to pass a set of compressed buffers to the
accelerator, the private output data pointer shall be NULL, as stated in other DXVA 2.0 documentation:
when the NumCompBuffers member of the DXVA2_DecodeExecuteParams structure is greater than
zero, pPrivateOutputData shall be NULL and PrivateOutputDataSize shall be zero. Alternatively, the
pExtensionData member of the DXVA2_DecodeExecuteParams structure can be NULL.
1.8 Status Reporting
After calling EndFrame for the uncompressed destination surfaces, the host software decoder may call
Execute with bDXVA_Func = 7 to get a status report. The host software decoder does not pass any
compressed buffers to the accelerator in this call. Instead, the decoder provides a private output data
buffer into which the accelerator will write status information. The decoder provides the output data
buffer as follows in DXVA 2.0: the host software decoder sets the pPrivateOutputData member of the
DXVA2_DecodeExecuteParams structure to point to the buffer. The PrivateOutputDataSize member
specifies the maximum amount of data that the accelerator should write to the buffer. The value of
cbPrivateOutputData or PrivateOutputDataSize shall be an integer multiple of
sizeof(DXVA_Status_HEVC).
When the accelerator receives the Execute call for status reporting, it should not stall operation to wait
for any prior operations to complete. Instead, it should immediately provide the available status
information for all operations that have completed since the previous request for a status report, up to
the maximum amount requested. Immediately after the Execute call returns, the host software decoder
can read the status report information from the buffer. The status report data structure is described in
section 6.
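The size constraint on the private output data buffer can be checked as sketched below. The StatusEntryPlaceholder layout is a placeholder invented for illustration; the actual DXVA_Status_HEVC structure is the one defined in section 6, and a real host decoder would use sizeof(DXVA_Status_HEVC) instead. Rejecting a size of zero is an additional assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder layout for illustration only; NOT the real DXVA_Status_HEVC. */
typedef struct {
    unsigned short feedback_number;
    unsigned char  status;
    unsigned char  reserved;
} StatusEntryPlaceholder;

/* Nonzero when size_bytes is a positive integer multiple of the entry size. */
int is_valid_status_buffer_size(size_t size_bytes)
{
    return size_bytes > 0 && size_bytes % sizeof(StatusEntryPlaceholder) == 0;
}

/* Maximum number of status entries the accelerator may write. */
size_t status_entry_capacity(size_t size_bytes)
{
    assert(is_valid_status_buffer_size(size_bytes));
    return size_bytes / sizeof(StatusEntryPlaceholder);
}
```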
1.9 Accelerator Internal Information Storage
The HEVC decoding process requires storing some additional information along with the array of
decoded pictures to be used as reference pictures for picture decoding. Rather than have the host
decoder collect this information and explicitly provide it to the accelerator, the accelerator must store
this information as it decodes each picture, so that the information is available if the picture is later used
as a reference picture.
Specifically, the accelerator shall store the set of information necessary for use in inter-picture
prediction for each coding tree unit in each decoded reference picture, such as a flag indicating whether
the coding tree unit was predicted using intra or inter prediction, a reference picture identifier, and the
motion vectors used for motion compensation for inter coding tree units. The accelerator shall also
store the PicOrderCntVal of reference pictures together with the corresponding decoded pictures when
the decoded pictures are used for reference, which is needed for collocated motion vector derivation.
1.10 Configuration Parameters
This section describes the configuration parameters for HEVC video decoding according to this
specification.
1.10.1 Syntax
In DXVA 2.0, configuration uses the DXVA2_ConfigPictureDecode structure. This syntax structure is
documented in the DXVA 2.0 documentation, available at http://msdn.microsoft.com/en-us/library/ms694823(VS.85).aspx.
© 2013 Microsoft Corporation. All rights reserved. By using or providing feedback on these materials, you agree to
the attached license agreement.
1.10.2 Semantics
The ordinary semantics of this data structure apply for HEVC video decoding according to this
specification. Some details of the usage in this context are provided below.
guidConfigBitstreamEncryption
Defines the encryption protocol type for bitstream data buffers. If no encryption is applied, the value is
DXVA_NoEncrypt.
guidConfigMBcontrolEncryption
Shall be DXVA_NoEncrypt, as ConfigBitstreamRaw is equal to 1 always.
guidConfigResidDiffEncryption
Shall be DXVA_NoEncrypt, as ConfigBitstreamRaw is equal to 1 always.
ConfigBitstreamRaw
Shall be 1, as this specification supports only off-host VLD parsing profiles, which use the
DXVA_Slice_HEVC_Short structure. DXVA_Slice_HEVC_Long is not defined in this specification.
ConfigMBcontrolRasterOrder
Shall be 0, as ConfigBitstreamRaw is equal to 1 always.
ConfigResidDiffHost
Shall be 0, as ConfigBitstreamRaw is equal to 1 always.
ConfigSpatialResid8
Shall be 0, as ConfigResidDiffHost is equal to 0 always.
ConfigResid8Subtraction
Shall be 0, as ConfigSpatialResid8 is equal to 0 always.
ConfigSpatialHost8or9Clipping
Shall be 0, as ConfigResidDiffHost is equal to 0 always.
ConfigSpatialResidInterleaved
Shall be 0, as ConfigResidDiffHost is equal to 0 always.
ConfigIntraResidUnsigned
Shall be 0, as ConfigResidDiffHost is equal to 0 always.
ConfigResidDiffAccelerator
Shall be 0, as ConfigBitstreamRaw is equal to 1 always.
ConfigHostInverseScan
Shall be 0, as ConfigResidDiffAccelerator is equal to 0 always.
ConfigSpecificIDCT
Shall be 0, as ConfigResidDiffAccelerator is equal to 0 always.
Config4GroupedCoefs
Shall be 0, as ConfigResidDiffAccelerator is equal to 0 always.
ConfigDecoderSpecific
Shall be 0.
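The fixed values listed above lend themselves to a simple host-side sanity check. The following is a minimal sketch only and is not part of this specification: HevcConfigSketch is a hypothetical stand-in that mirrors just the integer members of DXVA2_ConfigPictureDecode discussed above (the three GUID members are omitted), and hevc_config_is_acceptable is an illustrative helper name.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the integer members of DXVA2_ConfigPictureDecode
   discussed in section 1.10.2. The real structure also carries three GUIDs. */
typedef struct {
    unsigned ConfigBitstreamRaw;
    unsigned ConfigMBcontrolRasterOrder;
    unsigned ConfigResidDiffHost;
    unsigned ConfigSpatialResid8;
    unsigned ConfigResid8Subtraction;
    unsigned ConfigSpatialHost8or9Clipping;
    unsigned ConfigSpatialResidInterleaved;
    unsigned ConfigIntraResidUnsigned;
    unsigned ConfigResidDiffAccelerator;
    unsigned ConfigHostInverseScan;
    unsigned ConfigSpecificIDCT;
    unsigned Config4GroupedCoefs;
    unsigned ConfigDecoderSpecific;
} HevcConfigSketch;

/* Returns true when ConfigBitstreamRaw is 1 and every other listed member
   is 0, as required for HEVC decoding according to section 1.10.2. */
static bool hevc_config_is_acceptable(const HevcConfigSketch *c)
{
    assert(c != 0);
    return c->ConfigBitstreamRaw == 1 &&
           c->ConfigMBcontrolRasterOrder == 0 &&
           c->ConfigResidDiffHost == 0 &&
           c->ConfigSpatialResid8 == 0 &&
           c->ConfigResid8Subtraction == 0 &&
           c->ConfigSpatialHost8or9Clipping == 0 &&
           c->ConfigSpatialResidInterleaved == 0 &&
           c->ConfigIntraResidUnsigned == 0 &&
           c->ConfigResidDiffAccelerator == 0 &&
           c->ConfigHostInverseScan == 0 &&
           c->ConfigSpecificIDCT == 0 &&
           c->Config4GroupedCoefs == 0 &&
           c->ConfigDecoderSpecific == 0;
}
```

A host decoder could apply such a check to each configuration reported by the accelerator and skip any configuration that fails it.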
2. DXVA_PicEntry_HEVC Data Structure
The DXVA_PicEntry_HEVC structure specifies a reference to an uncompressed surface. It is used in other
data structures described in this document. The data structure itself is the same as the previous
DXVA_PicEntry_H264 data structure, but the associated semantics are somewhat different. It has been
given a new name so that the data structures used for HEVC have names that are associated with the new
design. For convenience, the form of this data structure is shown below.
2.1 Syntax
typedef struct _DXVA_PicEntry_HEVC {
    union {
        struct {
            UCHAR Index7Bits     : 7;
            UCHAR AssociatedFlag : 1;
        };
        UCHAR bPicEntry;
    };
} DXVA_PicEntry_HEVC, *LPDXVA_PicEntry_HEVC;
2.2 Semantics
Index7Bits
An index that identifies an uncompressed surface for the CurrPic or RefPicList member of the picture
parameters structure (section 3).
When Index7Bits is used in the CurrPic and RefPicList members of the picture parameters structure, the value
directly specifies the DXVA index of an uncompressed surface.
When Index7Bits is 127 (0x7F), this indicates that it does not contain a valid index.
AssociatedFlag
Optional 1-bit flag associated with the surface. It specifies whether the reference picture is a long-term
reference or a short-term reference for RefPicList, and it has no meaning when used for CurrPic.
bPicEntry
Accesses the entire 8 bits of the union.
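As an illustration of these semantics, the helpers below operate on the 8-bit bPicEntry value directly. This is a sketch under a stated assumption: Index7Bits is taken to occupy the low-order 7 bits and AssociatedFlag the high-order bit, which is consistent with 0x7F and 0xFF both denoting Index7Bits equal to 127. The function names are hypothetical and are not defined by DXVA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: bits 0..6 hold Index7Bits, bit 7 holds AssociatedFlag. */
static uint8_t pic_entry_pack(uint8_t index7, bool associated_flag)
{
    assert(index7 <= 0x7F);
    return (uint8_t)(index7 | (associated_flag ? 0x80 : 0x00));
}

static uint8_t pic_entry_index(uint8_t b_pic_entry)
{
    return b_pic_entry & 0x7F;
}

static bool pic_entry_associated_flag(uint8_t b_pic_entry)
{
    return (b_pic_entry & 0x80) != 0;
}

/* An entry whose Index7Bits is 127 does not contain a valid index. */
static bool pic_entry_is_valid(uint8_t b_pic_entry)
{
    return pic_entry_index(b_pic_entry) != 0x7F;
}
```

Working on the whole byte avoids depending on how a particular compiler allocates bit-fields within the union.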
3. Picture Parameters Data Structure
The DXVA_PicParams_HEVC structure provides the picture-level parameters of a compressed picture
for HEVC video decoding. This structure is used when bDXVA_Func is 1 and the buffer type is
DXVA2_PictureParametersBufferType (in DXVA 2.0).
3.1 Syntax
typedef struct _DXVA_PicParams_HEVC {
    USHORT PicWidthInMinCbsY;
    USHORT PicHeightInMinCbsY;
    union {
        struct {
            USHORT chroma_format_idc                 : 2;
            USHORT separate_colour_plane_flag        : 1;
            USHORT bit_depth_luma_minus8             : 3;
            USHORT bit_depth_chroma_minus8           : 3;
            USHORT log2_max_pic_order_cnt_lsb_minus4 : 4;
            USHORT NoPicReorderingFlag               : 1;
            USHORT NoBiPredFlag                      : 1;
            USHORT ReservedBits1                     : 1;
        };
        USHORT wFormatAndSequenceInfoFlags;
    };
    DXVA_PicEntry_HEVC CurrPic;
    UCHAR  sps_max_dec_pic_buffering_minus1;
    UCHAR  log2_min_luma_coding_block_size_minus3;
    UCHAR  log2_diff_max_min_luma_coding_block_size;
    UCHAR  log2_min_transform_block_size_minus2;
    UCHAR  log2_diff_max_min_transform_block_size;
    UCHAR  max_transform_hierarchy_depth_inter;
    UCHAR  max_transform_hierarchy_depth_intra;
    UCHAR  num_short_term_ref_pic_sets;
    UCHAR  num_long_term_ref_pics_sps;
    UCHAR  num_ref_idx_l0_default_active_minus1;
    UCHAR  num_ref_idx_l1_default_active_minus1;
    CHAR   init_qp_minus26;
    UCHAR  ucNumDeltaPocsOfRefRpsIdx;
    USHORT wNumBitsForShortTermRPSInSlice;
    USHORT ReservedBits2;
    union {
        struct {
            UINT32 scaling_list_enabled_flag                    : 1;
            UINT32 amp_enabled_flag                             : 1;
            UINT32 sample_adaptive_offset_enabled_flag          : 1;
            UINT32 pcm_enabled_flag                             : 1;
            UINT32 pcm_sample_bit_depth_luma_minus1             : 4;
            UINT32 pcm_sample_bit_depth_chroma_minus1           : 4;
            UINT32 log2_min_pcm_luma_coding_block_size_minus3   : 2;
            UINT32 log2_diff_max_min_pcm_luma_coding_block_size : 2;
            UINT32 pcm_loop_filter_disabled_flag                : 1;
            UINT32 long_term_ref_pics_present_flag              : 1;
            UINT32 sps_temporal_mvp_enabled_flag                : 1;
            UINT32 strong_intra_smoothing_enabled_flag          : 1;
            UINT32 dependent_slice_segments_enabled_flag        : 1;
            UINT32 output_flag_present_flag                     : 1;
            UINT32 num_extra_slice_header_bits                  : 3;
            UINT32 sign_data_hiding_enabled_flag                : 1;
            UINT32 cabac_init_present_flag                      : 1;
            UINT32 ReservedBits3                                : 5;
        };
        UINT32 dwCodingParamToolFlags;
    };
    union {
        struct {
            UINT32 constrained_intra_pred_flag                 : 1;
            UINT32 transform_skip_enabled_flag                 : 1;
            UINT32 cu_qp_delta_enabled_flag                    : 1;
            UINT32 pps_slice_chroma_qp_offsets_present_flag    : 1;
            UINT32 weighted_pred_flag                          : 1;
            UINT32 weighted_bipred_flag                        : 1;
            UINT32 transquant_bypass_enabled_flag              : 1;
            UINT32 tiles_enabled_flag                          : 1;
            UINT32 entropy_coding_sync_enabled_flag            : 1;
            UINT32 uniform_spacing_flag                        : 1;
            UINT32 loop_filter_across_tiles_enabled_flag       : 1;
            UINT32 pps_loop_filter_across_slices_enabled_flag  : 1;
            UINT32 deblocking_filter_override_enabled_flag     : 1;
            UINT32 pps_deblocking_filter_disabled_flag         : 1;
            UINT32 lists_modification_present_flag             : 1;
            UINT32 slice_segment_header_extension_present_flag : 1;
            UINT32 IrapPicFlag                                 : 1;
            UINT32 IdrPicFlag                                  : 1;
            UINT32 IntraPicFlag                                : 1;
            UINT32 ReservedBits4                               : 13;
        };
        UINT32 dwCodingSettingPicturePropertyFlags;
    };
    CHAR   pps_cb_qp_offset;
    CHAR   pps_cr_qp_offset;
    UCHAR  num_tile_columns_minus1;
    UCHAR  num_tile_rows_minus1;
    USHORT column_width_minus1[19];
    USHORT row_height_minus1[21];
    UCHAR  diff_cu_qp_delta_depth;
    CHAR   pps_beta_offset_div2;
    CHAR   pps_tc_offset_div2;
    UCHAR  log2_parallel_merge_level_minus2;
    INT    CurrPicOrderCntVal;
    DXVA_PicEntry_HEVC RefPicList[15];
    UCHAR  ReservedBits5;
    INT    PicOrderCntValList[15];
    UCHAR  RefPicSetStCurrBefore[8];
    UCHAR  RefPicSetStCurrAfter[8];
    UCHAR  RefPicSetLtCurr[8];
    USHORT ReservedBits6;
    USHORT ReservedBits7;
    UINT   StatusReportFeedbackNumber;
} DXVA_PicParams_HEVC, *LPDXVA_PicParams_HEVC;
3.2 Semantics
PicWidthInMinCbsY, PicHeightInMinCbsY
Correspond to the variables of the same name in the HEVC specification and affect the decoding process
accordingly.
chroma_format_idc, separate_colour_plane_flag, bit_depth_luma_minus8,
bit_depth_chroma_minus8, log2_max_pic_order_cnt_lsb_minus4
Correspond to the syntax elements of the same name in the HEVC specification and affect the decoding
process accordingly.
NoPicReorderingFlag
When NoPicReorderingFlag is equal to 1, this indicates that the maximum allowed number of pictures
preceding any picture in decoding order and succeeding that picture in output order is equal to 0, i.e.
that no picture reordering is used in the coded video sequence. When NoPicReorderingFlag is equal to 0,
picture reordering (i.e., having an output order that differs from the decoding order) may be used in the
coded video sequence. This flag does not affect the decoding process.
Note – NoPicReorderingFlag may be set to 1 by the host software decoder when
sps_max_num_reorder_pics is equal to 0. However, there is no requirement that NoPicReorderingFlag
must be derived from sps_max_num_reorder_pics.
NoBiPredFlag
When NoBiPredFlag is equal to 1, this indicates that B slices are not used in the coded video sequence.
This flag does not affect the decoding process.
Note – This flag does not correspond to any indication provided in the HEVC bitstream itself. Thus, a
host software decoder would need some external information (e.g. as determined at the application
level) to be able to set this flag to 1. In the absence of any such available indication, the host software
decoder must set this flag to 0.
ReservedBits1
Bit field added to enable WORD alignment of parameters. Shall be set to 0 by the host decoder and
accelerators shall ignore its value.
wFormatAndSequenceInfoFlags
Provides an alternative way to access the bit fields.
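To make the relationship between the individual bit fields and wFormatAndSequenceInfoFlags concrete, the sketch below assembles the 16-bit value with explicit shifts. It rests on an assumption that this document does not state: bit fields are allocated starting from the least significant bit in declaration order. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions (first-declared field in the least significant bits):
   0-1 chroma_format_idc, 2 separate_colour_plane_flag,
   3-5 bit_depth_luma_minus8, 6-8 bit_depth_chroma_minus8,
   9-12 log2_max_pic_order_cnt_lsb_minus4, 13 NoPicReorderingFlag,
   14 NoBiPredFlag, 15 ReservedBits1 (left at 0). */
static uint16_t pack_format_and_sequence_info(unsigned chroma_format_idc,
                                              unsigned separate_colour_plane_flag,
                                              unsigned bit_depth_luma_minus8,
                                              unsigned bit_depth_chroma_minus8,
                                              unsigned log2_max_pic_order_cnt_lsb_minus4,
                                              unsigned no_pic_reordering_flag,
                                              unsigned no_bi_pred_flag)
{
    unsigned w = 0;
    /* Each value must fit in the width of its bit field. */
    assert(chroma_format_idc <= 3 && separate_colour_plane_flag <= 1);
    assert(bit_depth_luma_minus8 <= 7 && bit_depth_chroma_minus8 <= 7);
    assert(log2_max_pic_order_cnt_lsb_minus4 <= 15);
    assert(no_pic_reordering_flag <= 1 && no_bi_pred_flag <= 1);
    w |= chroma_format_idc << 0;
    w |= separate_colour_plane_flag << 2;
    w |= bit_depth_luma_minus8 << 3;
    w |= bit_depth_chroma_minus8 << 6;
    w |= log2_max_pic_order_cnt_lsb_minus4 << 9;
    w |= no_pic_reordering_flag << 13;
    w |= no_bi_pred_flag << 14;
    return (uint16_t)w;
}
```

For example, under the stated assumption a 4:2:0 stream with 10-bit luma and chroma and log2_max_pic_order_cnt_lsb_minus4 equal to 4 packs to 0x0891.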
CurrPic
Specifies the destination picture buffer/surface index for the decoded picture. In this context, the
AssociatedFlag has no meaning and shall be 0, and the accelerator shall ignore its value.
sps_max_dec_pic_buffering_minus1
Corresponds to the variable of the same name in the HEVC specification, although only one value is
provided here rather than the value for each temporal sub-layer. The provided value applies to the
entire coded video sequence. The value of sps_max_dec_pic_buffering_minus1 shall be in the range of
0 to 15.
Note – The host software decoder may change the value of sps_max_dec_pic_buffering_minus1,
relative to the value in the bitstream, for the purpose of error concealment or trick play operation. The
accelerator shall honor the value sent by host software decoder.
log2_min_luma_coding_block_size_minus3, log2_diff_max_min_luma_coding_block_size,
log2_min_transform_block_size_minus2, log2_diff_max_min_transform_block_size,
max_transform_hierarchy_depth_inter, max_transform_hierarchy_depth_intra
Correspond to the syntax elements of the same name in the HEVC specification and affect the decoding
process accordingly.
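The picture dimensions are conveyed only in units of minimum luma coding blocks, so an implementation typically recovers the sizes in luma samples and in coding tree blocks from these members. The sketch below follows the variable derivations of the HEVC specification (MinCbLog2SizeY, CtbLog2SizeY, PicWidthInCtbsY); the HevcPictureGeometry type and the function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the derived picture geometry. */
typedef struct {
    uint32_t width_in_luma_samples;
    uint32_t height_in_luma_samples;
    uint32_t ctb_size_y;
    uint32_t width_in_ctbs;
    uint32_t height_in_ctbs;
} HevcPictureGeometry;

static HevcPictureGeometry hevc_picture_geometry(uint16_t PicWidthInMinCbsY,
                                                 uint16_t PicHeightInMinCbsY,
                                                 uint8_t log2_min_luma_coding_block_size_minus3,
                                                 uint8_t log2_diff_max_min_luma_coding_block_size)
{
    HevcPictureGeometry g;
    /* MinCbLog2SizeY and CtbLog2SizeY as derived in the HEVC specification. */
    uint32_t min_cb_log2 = log2_min_luma_coding_block_size_minus3 + 3u;
    uint32_t ctb_log2 = min_cb_log2 + log2_diff_max_min_luma_coding_block_size;
    assert(ctb_log2 < 16); /* keeps every shift below well defined */

    g.width_in_luma_samples = (uint32_t)PicWidthInMinCbsY << min_cb_log2;
    g.height_in_luma_samples = (uint32_t)PicHeightInMinCbsY << min_cb_log2;
    g.ctb_size_y = 1u << ctb_log2;
    /* Round up: a partial CTB at the right or bottom edge still counts. */
    g.width_in_ctbs = (g.width_in_luma_samples + g.ctb_size_y - 1) >> ctb_log2;
    g.height_in_ctbs = (g.height_in_luma_samples + g.ctb_size_y - 1) >> ctb_log2;
    return g;
}
```

For a 1920x1080 picture with 8x8 minimum coding blocks and 64x64 coding tree blocks, PicWidthInMinCbsY is 240, PicHeightInMinCbsY is 135, and the picture spans 30 by 17 coding tree blocks.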
ucNumDeltaPocsOfRefRpsIdx
When the short_term_ref_pic_set_sps_flag in the slice header is equal to 0, this shall be equal to the
value of the variable NumDeltaPocs[ RefRpsIdx ] that is appropriate for parsing
short_term_ref_pic_set( num_short_term_ref_pic_sets ) in the slice header. When the value of
short_term_ref_pic_set_sps_flag in the slice header is equal to 1, ucNumDeltaPocsOfRefRpsIdx shall be
set to 0 by the host decoder and accelerators shall ignore its value.
Note – The purpose of ucNumDeltaPocsOfRefRpsIdx is only to enable the parsing of slice headers. It is
not to be used by the accelerator for any other purpose (in order to ensure stateless operation for which
the decoded picture buffer handling is to be performed under the control of the host rather than
inferred from the bitstream by the accelerator).
wNumBitsForShortTermRPSInSlice
When the short_term_ref_pic_set_sps_flag in the slice header is equal to 0,
wNumBitsForShortTermRPSInSlice shall be equal to the number of bits used in short_term_ref_pic_set( )
syntax structure that is directly included in the slice headers of the current picture. When the value of
short_term_ref_pic_set_sps_flag in the slice header is equal to 1, wNumBitsForShortTermRPSInSlice
shall be set to 0 by the host decoder and accelerators shall ignore its value.
Note – The purpose of wNumBitsForShortTermRPSInSlice is only to enable accelerator to skip the
parsing of short_term_ref_pic_set( ) syntax structure when it is directly included in the slice headers of
the current picture. It is not to be used by the accelerator for any other purpose.
ReservedBits2
Bit field added to enable DWORD alignment of parameters. Shall be set to 0 by the host decoder and
accelerators shall ignore its value.
num_short_term_ref_pic_sets, num_long_term_ref_pics_sps,
num_ref_idx_l0_default_active_minus1, num_ref_idx_l1_default_active_minus1,
init_qp_minus26, scaling_list_enabled_flag, amp_enabled_flag,
sample_adaptive_offset_enabled_flag, pcm_enabled_flag,
pcm_sample_bit_depth_luma_minus1, pcm_sample_bit_depth_chroma_minus1,
log2_min_pcm_luma_coding_block_size_minus3,
log2_diff_max_min_pcm_luma_coding_block_size, pcm_loop_filter_disabled_flag,
long_term_ref_pics_present_flag, sps_temporal_mvp_enabled_flag,
strong_intra_smoothing_enabled_flag, dependent_slice_segments_enabled_flag,
output_flag_present_flag, num_extra_slice_header_bits, sign_data_hiding_enabled_flag,
cabac_init_present_flag
Correspond to the syntax elements of the same name in the HEVC specification and affect the decoding
process accordingly.
When scaling_list_enabled_flag is equal to 0 and thus "flat" scaling lists with all entries equal to 16 are
used, the host decoder shall not send the scaling lists to the accelerator. When
scaling_list_enabled_flag is equal to 1, the host decoder shall send the scaling lists to the accelerator.
ReservedBits3
Bit field added to enable DWORD alignment of parameters. Shall be set to 0 by the host decoder and
accelerators shall ignore its value.
dwCodingParamToolFlags
Provides an alternative way to access the bit fields.
constrained_intra_pred_flag, transform_skip_enabled_flag, cu_qp_delta_enabled_flag,
pps_slice_chroma_qp_offsets_present_flag, weighted_pred_flag, weighted_bipred_flag,
transquant_bypass_enabled_flag, tiles_enabled_flag, entropy_coding_sync_enabled_flag,
uniform_spacing_flag, loop_filter_across_tiles_enabled_flag,
pps_loop_filter_across_slices_enabled_flag, deblocking_filter_override_enabled_flag,
pps_deblocking_filter_disabled_flag, lists_modification_present_flag,
slice_segment_header_extension_present_flag
Correspond to the syntax elements of the same names in the HEVC specification and affect the decoding
process accordingly.
IrapPicFlag
Indicates whether the current picture is an IRAP picture. This flag shall be equal to 1 when the current
picture is an IRAP picture and shall be equal to 0 when the current picture is not an IRAP picture.
IdrPicFlag
Indicates whether the current picture is an IDR picture. This flag shall be equal to 1 when the current
picture is an IDR picture and shall be equal to 0 when the current picture is not an IDR picture.
IntraPicFlag
Indicates whether all slices of the current picture are I slices as follows:
Value  Description
0      Some slices of the current picture may not be I slices.
1      All slices of the current picture are I slices.
When the current picture is an IRAP picture, IntraPicFlag shall be equal to 1. When the current picture is not an
IRAP picture, the host software decoder is not required to determine whether all slices of the current picture are I
slices – i.e. it may simply set IntraPicFlag to 0 in this case.
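One way a host software decoder might set these three flags is from the nal_unit_type of the slice segments of the current picture. The sketch below is illustrative only. The numeric ranges are taken from the NAL unit type table of the HEVC specification (16 to 23 for IRAP pictures, 19 and 20 for IDR pictures) rather than from this document, and the type and function names are hypothetical. IntraPicFlag is set conservatively, as permitted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical holder for the three picture-type flags. */
typedef struct {
    bool IrapPicFlag;
    bool IdrPicFlag;
    bool IntraPicFlag;
} HevcPictureTypeFlags;

static HevcPictureTypeFlags hevc_picture_type_flags(unsigned nal_unit_type)
{
    HevcPictureTypeFlags f;
    assert(nal_unit_type <= 63); /* nal_unit_type is a 6-bit field */
    /* IRAP: BLA_W_LP (16) through RSV_IRAP_VCL23 (23). */
    f.IrapPicFlag = nal_unit_type >= 16 && nal_unit_type <= 23;
    /* IDR: IDR_W_RADL (19) or IDR_N_LP (20). */
    f.IdrPicFlag = nal_unit_type == 19 || nal_unit_type == 20;
    /* Required to be 1 for IRAP pictures; otherwise simply left at 0
       without inspecting the slice types. */
    f.IntraPicFlag = f.IrapPicFlag;
    return f;
}
```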
ReservedBits4
Bit field added to enable DWORD alignment of parameters. Shall be set to 0 by the host decoder and accelerators
shall ignore its value.
dwCodingSettingPicturePropertyFlags
Provides an alternative way to access the bit fields.
pps_cb_qp_offset, pps_cr_qp_offset
Correspond to the syntax elements of the same name in the HEVC specification and affect the decoding
process accordingly.
num_tile_columns_minus1, num_tile_rows_minus1, column_width_minus1[],
row_height_minus1[]
Correspond to the syntax elements of the same name in the HEVC specification and affect the decoding
process accordingly.
In Table A-1 General tier and level limits in the HEVC specification, MaxTileCols and MaxTileRows have
the maximum values of 20 and 22 at the highest specified level (level 6.2). Thus, the arrays
column_width_minus1[] and row_height_minus1[] are defined to be large enough to support their
maximum allowed numbers of elements, 19 and 21, respectively.
When tiles_enabled_flag is equal to 1, for Main Profile, Main 10 Profile, and Main Still Picture Profile,
ColumnWidthInLumaSamples[ i ] shall be greater than or equal to 256 for all values of i in the range of 0
to num_tile_columns_minus1, inclusive, and RowHeightInLumaSamples[ j ] shall be greater than or
equal to 64 for all values of j in the range of 0 to num_tile_rows_minus1, inclusive. The values of
ColumnWidthInLumaSamples[ i ], specifying the width of the i-th tile column in units of luma samples,
are set equal to colWidth[ i ] << Log2CtbSizeY. The values of RowHeightInLumaSamples[ j ], specifying
the height of the j-th tile row in units of luma samples, are set equal to rowHeight[ j ] << Log2CtbSizeY.
colWidth[ i ] and rowHeight[ j ] are derived according to section 6.5.1 in the HEVC specification,
according to num_tile_columns_minus1, num_tile_rows_minus1, column_width_minus1[],
row_height_minus1[].
When tiles_enabled_flag is equal to 1 and uniform_spacing_flag is equal to 1, the accelerator shall
ignore the values in column_width_minus1[] and row_height_minus1[]. In this case, it is not necessary
for the host software decoder to set specific values in column_width_minus1[] and
row_height_minus1[].
When tiles_enabled_flag is equal to 0, the accelerator shall ignore num_tile_columns_minus1,
num_tile_rows_minus1, uniform_spacing_flag, column_width_minus1[], row_height_minus1[], and
loop_filter_across_tiles_enabled_flag. In this case, it is not necessary for the host decoder to set
specific values for num_tile_columns_minus1, num_tile_rows_minus1, uniform_spacing_flag,
column_width_minus1[], row_height_minus1[], and loop_filter_across_tiles_enabled_flag.
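The column derivation and the minimum-width constraint described above can be sketched as follows. The formulas follow the tile column derivation of section 6.5.1 of the HEVC specification; the row case is analogous, with row_height_minus1[] and a minimum of 64 luma samples. The function names and the MAX_TILE_COLUMNS constant are hypothetical, and the code is a sketch rather than a conformance check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_TILE_COLUMNS 20 /* MaxTileCols at the highest specified level */

/* Derives colWidth[] in units of CTBs. Returns false when the inputs
   cannot describe a valid column layout. */
static bool derive_col_width(uint32_t pic_width_in_ctbs,
                             unsigned num_tile_columns_minus1,
                             bool uniform_spacing_flag,
                             const uint16_t column_width_minus1[19],
                             uint32_t col_width[MAX_TILE_COLUMNS])
{
    unsigned n = num_tile_columns_minus1 + 1;
    assert(pic_width_in_ctbs > 0);
    if (n > MAX_TILE_COLUMNS || n > pic_width_in_ctbs)
        return false;
    if (uniform_spacing_flag) {
        /* column_width_minus1[] is ignored in this case. */
        for (unsigned i = 0; i < n; i++)
            col_width[i] = ((i + 1) * pic_width_in_ctbs) / n
                         - (i * pic_width_in_ctbs) / n;
    } else {
        uint32_t used = 0;
        for (unsigned i = 0; i < num_tile_columns_minus1; i++) {
            col_width[i] = column_width_minus1[i] + 1u;
            used += col_width[i];
        }
        if (used >= pic_width_in_ctbs)
            return false; /* nothing left for the last column */
        col_width[num_tile_columns_minus1] = pic_width_in_ctbs - used;
    }
    return true;
}

/* Checks ColumnWidthInLumaSamples[ i ] >= 256 for every tile column. */
static bool col_widths_meet_minimum(const uint32_t col_width[], unsigned n,
                                    unsigned log2_ctb_size_y)
{
    for (unsigned i = 0; i < n; i++)
        if ((col_width[i] << log2_ctb_size_y) < 256)
            return false;
    return true;
}
```

With 30 CTB columns split uniformly into four tile columns, the derived widths are 7, 8, 7 and 8 CTBs, which satisfies the 256-sample minimum for 64x64 coding tree blocks.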
diff_cu_qp_delta_depth, pps_beta_offset_div2, pps_tc_offset_div2,
log2_parallel_merge_level_minus2
Correspond to the syntax elements of the same name in the HEVC specification and affect the decoding
process accordingly.
CurrPicOrderCntVal
Specifies the picture order count of the current picture.
Note – The accelerator must use the value of CurrPicOrderCntVal provided by the host rather
than trying to derive this information from the bitstream (in order to ensure stateless operation for
which the decoded picture buffer handling is to be performed under the control of the host rather than
inferred from the bitstream by the accelerator). Moreover, the accelerator must not store the value of
CurrPicOrderCntVal internally for subsequent use – it must instead rely on the values provided in
PicOrderCntValList[] when subsequent pictures are decoded that use the current picture as a reference
for inter-picture prediction.
RefPicList[]
Contains a list of uncompressed picture buffer surfaces. Entries that will not be used for decoding the
current picture, or any subsequent pictures, are indicated by setting bPicEntry to 0xFF or 0x7F, i.e.
setting Index7Bits equal to 127. If bPicEntry is not equal to 0xFF or 0x7F, the entry may be used as a
short-term or long-term reference surface for decoding the current picture or a subsequent picture in
decoding order. All uncompressed surfaces that correspond to pictures currently marked as "used for
reference" that may be used for reference in the decoding process of the current picture or any
subsequent picture shall appear in the RefPicList[] array (regardless of whether these pictures are
actually used in the decoding process of the current picture or not).
No particular order is specified for the ordering of the entries in the RefPicList[] array.
For each entry whose value is not equal to 0xFF or 0x7F, the value of AssociatedFlag is interpreted as
follows:
Value  Description
0      Not a long-term reference picture.
1      Long-term reference picture. The uncompressed buffer contains a reference picture
       marked as "used for long-term reference."
Note – The accelerator must use the content of the RefPicList[] as provided by the host rather
than trying to derive this information from the bitstream (in order to ensure stateless operation for
which the decoded picture buffer handling is to be performed under the control of the host rather than
inferred from the bitstream by the accelerator).
PicOrderCntValList[]
Contains the picture order counts for the reference pictures listed in RefPicList[].
If an element of the list is not relevant (for example, if the corresponding entry in RefPicList[] is empty
or is marked as "not used for reference"), the value in PicOrderCntValList[] shall be 0. Accelerators can
rely on this constraint being fulfilled.
Note – The accelerator must use the content of the PicOrderCntValList[] as provided by the host
rather than trying to derive this information from the bitstream (in order to ensure stateless operation
for which the decoded picture buffer handling is to be performed under the control of the host rather
than inferred from the bitstream by the accelerator).
RefPicSetStCurrBefore[], RefPicSetStCurrAfter[], RefPicSetLtCurr[]
Contain the indices to the RefPicList[] for all reference pictures that may be used in inter prediction of
the current picture and that may be used in inter prediction of one or more of the pictures following the
current picture in decoding order.
When an entry in RefPicSetStCurrBefore[], RefPicSetStCurrAfter[] and RefPicSetLtCurr[] is not valid, it
shall be set to 0xFF. Invalid entries shall not be present between valid entries in RefPicSetStCurrBefore[],
RefPicSetStCurrAfter[] and RefPicSetLtCurr[]. Valid entries in RefPicSetStCurrBefore[],
RefPicSetStCurrAfter[] and RefPicSetLtCurr[] shall have values in the range of 0 to 14, inclusive, and
each corresponding entry in RefPicList[] referred to by a valid entry in RefPicSetStCurrBefore[],
RefPicSetStCurrAfter[] and RefPicSetLtCurr[] shall not have bPicEntry equal to 0xFF. Any entry in
RefPicSetStCurrBefore[], RefPicSetStCurrAfter[] and RefPicSetLtCurr[] that is not equal to 0xFF shall not
be equal to the value of any other entry in RefPicSetStCurrBefore[], RefPicSetStCurrAfter[] or
RefPicSetLtCurr[].
Note – The accelerator must directly use the valid entries of the RefPicSetStCurrBefore[],
RefPicSetStCurrAfter[] and RefPicSetLtCurr[] as provided by the host for initialization of the
reference picture lists (for derivation of RefPicListTemp0 and RefPicListTemp1 and subsequent
derivation of RefPicList0 and RefPicList1 as specified in the HEVC specification) rather than trying to
derive this information from the bitstream (in order to ensure stateless operation for which the decoded
picture buffer handling is to be performed under the control of the host rather than inferred from the
bitstream by the accelerator). The host decoder may change the value of RefPicSetStCurrBefore[],
RefPicSetStCurrAfter[] and RefPicSetLtCurr[]. For example, for error concealment or trick play purposes,
the host software decoder may change some values in the parameters. The accelerator shall honor the
value sent by the host decoder, regardless of whether a short-term reference picture set is present in
slice headers of current picture or not. When a short-term reference picture set is present in slice
headers of current picture, the accelerator shall ignore it and honor the set of parameters sent by host
decoder. However, the number of valid entries in RefPicSetStCurrBefore[], RefPicSetStCurrAfter[] and
RefPicSetLtCurr[] must not be changed by host decoder, as the value of the variable NumPocTotalCurr is
derived as NumPocStCurrBefore + NumPocStCurrAfter + NumPocLtCurr and is used for parsing the
ref_pic_lists_modification() syntax in the slice header, where NumPocStCurrBefore, NumPocStCurrAfter,
and NumPocLtCurr are the number of valid entries in RefPicSetStCurrBefore[], RefPicSetStCurrAfter[]
and RefPicSetLtCurr[].
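The constraints on RefPicList[], PicOrderCntValList[] and the three reference picture subsets can be checked mechanically. The sketch below is one possible host-side consistency check and is not part of this specification. HevcRefInfoSketch is a hypothetical container that holds only the reference-related members, with RefPicList[] stored as raw bPicEntry bytes, and it treats any entry with Index7Bits equal to 127 as unused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define REF_PIC_LIST_SIZE 15
#define REF_PIC_SET_SIZE 8

/* Hypothetical container for the reference-related members only. */
typedef struct {
    uint8_t RefPicList[REF_PIC_LIST_SIZE]; /* bPicEntry values */
    int32_t PicOrderCntValList[REF_PIC_LIST_SIZE];
    uint8_t RefPicSetStCurrBefore[REF_PIC_SET_SIZE];
    uint8_t RefPicSetStCurrAfter[REF_PIC_SET_SIZE];
    uint8_t RefPicSetLtCurr[REF_PIC_SET_SIZE];
} HevcRefInfoSketch;

/* Marks every entry as unused: 0xFF for surfaces and subsets, 0 for POCs. */
static void ref_info_reset(HevcRefInfoSketch *r)
{
    assert(r != 0);
    memset(r->RefPicList, 0xFF, sizeof r->RefPicList);
    memset(r->PicOrderCntValList, 0, sizeof r->PicOrderCntValList);
    memset(r->RefPicSetStCurrBefore, 0xFF, sizeof r->RefPicSetStCurrBefore);
    memset(r->RefPicSetStCurrAfter, 0xFF, sizeof r->RefPicSetStCurrAfter);
    memset(r->RefPicSetLtCurr, 0xFF, sizeof r->RefPicSetLtCurr);
}

/* Checks one subset. `seen` records the RefPicList indices already used
   by any subset so that duplicates across the three arrays are caught. */
static bool ref_subset_ok(const HevcRefInfoSketch *r,
                          const uint8_t set[REF_PIC_SET_SIZE],
                          bool seen[REF_PIC_LIST_SIZE])
{
    bool ended = false;
    for (int i = 0; i < REF_PIC_SET_SIZE; i++) {
        uint8_t idx = set[i];
        if (idx == 0xFF) { ended = true; continue; }
        if (ended) return false;                    /* gap before a valid entry */
        if (idx >= REF_PIC_LIST_SIZE) return false; /* not a usable index */
        if ((r->RefPicList[idx] & 0x7F) == 0x7F) return false; /* unused surface */
        if (seen[idx]) return false;                /* referenced twice */
        seen[idx] = true;
    }
    return true;
}

static bool ref_info_is_consistent(const HevcRefInfoSketch *r)
{
    bool seen[REF_PIC_LIST_SIZE] = { false };
    return ref_subset_ok(r, r->RefPicSetStCurrBefore, seen) &&
           ref_subset_ok(r, r->RefPicSetStCurrAfter, seen) &&
           ref_subset_ok(r, r->RefPicSetLtCurr, seen);
}
```

A host decoder could run such a check before submitting the picture parameters buffer, for example after adjusting the subsets for error concealment or trick play.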
In ordinary sequential and conforming decoding, the array RefPicSetStFoll[] in the HEVC specification
corresponds to the short-term reference pictures in the RefPicList[] that are not referred to by
RefPicSetStCurrBefore[] and RefPicSetStCurrAfter[]. The array RefPicSetLtFoll[] in the HEVC
specification corresponds to the long-term reference pictures in RefPicList[] that are not referred to by
RefPicSetLtCurr[]. RefPicSetStFoll and RefPicSetLtFoll consist of all reference pictures that are included
in the reference picture set but are indicated not to be used for inter prediction of the current picture
(although they may be used for inter prediction of one or more of the pictures following the current
picture in decoding order).
However, in alternative decoding scenarios with HEVC DXVA, such as smooth reverse play or picture
decoding starting at the position of a random seek in the video content instead of ordinary sequential
decoding, it may be necessary for the buffer state to differ from what would be used for ordinary
forward playback starting at the beginning of an HEVC bitstream. Thus, in general, there could be
pictures in the RefPicList[] that are not included in RefPicSetStCurrBefore[], RefPicSetStCurrAfter[] and
RefPicSetLtCurr[] that also do not appear in the RefPicSetStFoll[] and RefPicSetLtFoll[] in the slice
headers of the bitstream data, and there could be pictures that appear in the RefPicSetStFoll[] and
RefPicSetLtFoll[] that do not appear in the RefPicList[] in a corresponding fashion.
Because the RefPicSetStFoll[] and RefPicSetLtFoll[] are not used in the decoding process, these lists are
not sent in the picture parameters data structure.
RefPicList[] shall still contain the full list of uncompressed picture buffer surfaces containing previously
decoded pictures that may be used as short-term or long-term reference surfaces for decoding the
current picture or any subsequent picture in the decoding order that is selected by the host, regardless of
whether this decoding order corresponds to the ordinary order of picture decoding in the original HEVC
bitstream.
ReservedBits5, ReservedBits6, ReservedBits7
Reserved bit fields. Shall be set to 0 by the host decoder and accelerators shall ignore their value.
StatusReportFeedbackNumber
Arbitrary number set by the host decoder to use as a tag in the status report feedback data. The value
should not equal 0, and should be different in each call to Execute. For more information, see section 6,
Status Report Data Structure.
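A simple way to satisfy both recommendations (non-zero, and different in each call to Execute) is a wrapping counter that skips 0. The sketch below is illustrative; the function name is hypothetical, and values repeat only after the 32-bit counter wraps around.

```c
#include <assert.h>
#include <stdint.h>

/* Advances *counter and returns the new tag. Unsigned arithmetic wraps to 0
   after UINT32_MAX, so 0 is skipped explicitly. */
static uint32_t next_status_report_feedback_number(uint32_t *counter)
{
    assert(counter != 0);
    *counter += 1u;
    if (*counter == 0)
        *counter = 1u;
    return *counter;
}
```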
Requirements
Header: Include dxva.h.
4. Quantization Matrix Data Structure
The quantization matrix data structure is used when bDXVA_Func is 1 and the buffer type is
DXVA2_InverseQuantizationMatrixBufferType (in DXVA 2.0).
4.1 Syntax
The DXVA_Qmatrix_HEVC structure, as specified for other DXVA usage cases, is sent by the host
software decoder to the accelerator to load inverse-quantization matrix data for off-host bitstream
compressed video picture decoding.
When scaling_list_enabled_flag is equal to 0 and thus "flat" scaling lists with all entries equal to 16 are
used, the host decoder shall not send the scaling lists to the accelerator; this reduces the data transfer
from the host decoder to the accelerator. When scaling_list_enabled_flag is equal to 1, the host decoder
shall send the scaling lists to the accelerator, using the DXVA_Qmatrix_HEVC structure.
The form of this data structure is shown below:
typedef struct _DXVA_Qmatrix_HEVC {
    UCHAR ucScalingList0[6][16];
    UCHAR ucScalingList1[6][64];
    UCHAR ucScalingList2[6][64];
    UCHAR ucScalingList3[2][64];
    UCHAR ucScalingListDCCoefSizeID2[6];
    UCHAR ucScalingListDCCoefSizeID3[2];
} DXVA_Qmatrix_HEVC, *LPDXVA_Qmatrix_HEVC;
4.2 Semantics
When present, the semantics of the DXVA_Qmatrix_HEVC data structure are as follows.
ucScalingList0[6][16]
Contains the scaling lists for the 4x4 scaling process, corresponding to ScalingList[ 0 ][ MatrixID ][ i ] in
the HEVC specification, where MatrixID is in the range of 0 to 5, inclusive, and i is in the range of 0 to 15,
inclusive.
ucScalingList1[6][64]
Contains the scaling lists for the 8x8 scaling process, corresponding to ScalingList[ 1 ][ MatrixID ][ i ] in
the HEVC specification, where MatrixID is in the range of 0 to 5, inclusive, and i is in the range of 0 to 63,
inclusive.
ucScalingList2[6][64]
Contains the scaling lists for the 16x16 scaling process, corresponding to ScalingList[ 2 ][ MatrixID ][ i ] in
the HEVC specification, where MatrixID is in the range of 0 to 5, inclusive, and i is in the range of 0 to 63,
inclusive.
ucScalingList3[2][64]
Contains the scaling lists for the 32x32 scaling process, corresponding to ScalingList[ 3 ][ MatrixID ][ i ] in
the HEVC specification, where MatrixID is in the range of 0 to 1, inclusive, and i is in the range of 0 to 63,
inclusive.
ucScalingListDCCoefSizeID2[6]
Contains the DC value of the scaling list for the 16x16 size (sizeID equal to 2), corresponding to
scaling_list_dc_coef_minus8[ sizeID − 2 ][ matrixID ] + 8 in the HEVC specification, with matrixID in the
range of 0 to 5, inclusive.
ucScalingListDCCoefSizeID3[2]
Contains the DC value of the scaling list for the 32x32 size (sizeID equal to 3), corresponding to
scaling_list_dc_coef_minus8[ sizeID − 2 ][ matrixID ] + 8 in the HEVC specification, with matrixID in the
range of 0 to 1, inclusive.
Note – Hypothetically, this structure could have been included in the picture parameters data structure,
but DXVA already defines a buffer type for quantization matrices and this method has been specified for
design consistency.
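The following sketch illustrates two points from this section: the quantization matrix buffer is sent only when scaling_list_enabled_flag is equal to 1, and the DC members hold scaling_list_dc_coef_minus8 plus 8. QmatrixHevcSketch is a local mirror of the structure above that uses uint8_t in place of UCHAR (an assumption about that type), and both helper names are hypothetical. The value range asserted for the DC coefficient is the one given for scaling_list_dc_coef_minus8 in the HEVC specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of DXVA_Qmatrix_HEVC, assuming UCHAR is an 8-bit unsigned type. */
typedef struct {
    uint8_t ucScalingList0[6][16];
    uint8_t ucScalingList1[6][64];
    uint8_t ucScalingList2[6][64];
    uint8_t ucScalingList3[2][64];
    uint8_t ucScalingListDCCoefSizeID2[6];
    uint8_t ucScalingListDCCoefSizeID3[2];
} QmatrixHevcSketch;

/* The host sends the quantization matrix buffer only when scaling lists
   are enabled; flat lists are implied otherwise. */
static bool host_sends_qmatrix(unsigned scaling_list_enabled_flag)
{
    return scaling_list_enabled_flag == 1;
}

/* Stores the DC value for size_id 2 (16x16) or 3 (32x32). */
static void qmatrix_set_dc(QmatrixHevcSketch *q, unsigned size_id,
                           unsigned matrix_id, int scaling_list_dc_coef_minus8)
{
    int dc = scaling_list_dc_coef_minus8 + 8;
    assert(q != 0);
    assert(dc >= 1 && dc <= 255); /* syntax element range is -7 to 247 */
    if (size_id == 2) {
        assert(matrix_id < 6);
        q->ucScalingListDCCoefSizeID2[matrix_id] = (uint8_t)dc;
    } else {
        assert(size_id == 3 && matrix_id < 2);
        q->ucScalingListDCCoefSizeID3[matrix_id] = (uint8_t)dc;
    }
}
```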
5. Slice Control Data Structure
These structures are used when bDXVA_Func is 1 and the buffer type is DXVA2_SliceControlBufferType
(DXVA 2.0). The slice control buffer is accompanied by a bitstream data buffer. The total quantity of data
in the bitstream buffer (and the amount of data reported by the host software decoder) shall be an
integer multiple of 128 bytes.
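The 128-byte requirement amounts to rounding the amount of bitstream data up to the next multiple of 128. A minimal sketch of that arithmetic is shown below; the function and macro names are hypothetical, and how the added bytes are filled is outside the scope of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITSTREAM_ALIGNMENT 128u

/* Rounds used_bytes up to the next multiple of 128. */
static size_t padded_bitstream_size(size_t used_bytes)
{
    /* Guard against overflow in the addition below. */
    assert(used_bytes <= SIZE_MAX - (BITSTREAM_ALIGNMENT - 1));
    return (used_bytes + BITSTREAM_ALIGNMENT - 1)
           / BITSTREAM_ALIGNMENT * BITSTREAM_ALIGNMENT;
}
```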
The DXVA_Slice_HEVC_Long data structure is not defined; only the DXVA_Slice_HEVC_Short structure
is defined in this specification.
5.1 Syntax
The DXVA_Slice_HEVC_Short data structure, as specified for other DXVA usage cases, is sent by the host
software decoder to the accelerator to convey slice control data. The data structure and associated
semantics are essentially the same as for the previous DXVA_Slice_H264_Short data structure. It has been
given a new name so that the data structures used for HEVC have names that are associated with the new
design. Each slice segment NAL unit has its own associated DXVA_Slice_HEVC_Short data structure,
regardless of whether it contains an independent slice segment or a dependent slice segment.
For convenience, the form of this data structure is shown below:
typedef struct _DXVA_Slice_HEVC_Short {
    UINT   BSNALunitDataLocation;
    UINT   SliceBytesInBuffer;
    USHORT wBadSliceChopping;
} DXVA_Slice_HEVC_Short, *LPDXVA_Slice_HEVC_Short;
5.2 Semantics
BSNALunitDataLocation
When the bitstream data buffer contains the first byte of the start code prefix of a byte stream NAL unit
that contains a coded slice segment NAL unit, this member locates the NAL unit with nal_unit_type in
the range of 0 to 9, inclusive, or the range of 16 to 21, inclusive, for the current slice segment. The value
is the byte offset, from the start of the bitstream data buffer, of the first byte of the start code prefix of
the byte stream NAL unit that contains the NAL unit. (The start code prefix is the
start_code_prefix_one_3bytes syntax element. The byte stream NAL unit syntax is defined in Annex B of
the HEVC specification. The current slice segment is the slice segment associated with this slice control
data structure.)
The bitstream data buffer shall not contain NAL units with values of nal_unit_type outside the range of 0
to 9, inclusive, or the range of 16 to 21, inclusive. However, the accelerator shall allow any such NAL
units to be present and ignore their content if present.
Note – The bitstream data buffer may or may not contain leading_zero_8bits, zero_byte, and
trailing_zero_8bits syntax elements. If present, the accelerator shall ignore these elements.
If wBadSliceChopping is not 0 or 1, BSNALunitDataLocation shall be 0.
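One way a host decoder might derive BSNALunitDataLocation is to scan the bitstream data buffer for the three-byte start code prefix and check the nal_unit_type that follows it. The C sketch below assumes this approach; the function names are hypothetical. In the HEVC NAL unit header, nal_unit_type occupies the six bits after the leading forbidden_zero_bit of the first header byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the offset of the first 0x000001 start code prefix at or after
   `from`, or `size` if none is found. */
static size_t find_start_code(const uint8_t *buf, size_t size, size_t from)
{
    assert(buf != NULL);
    for (size_t i = from; i + 3 <= size; ++i) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1)
            return i;
    }
    return size;
}

/* Reads nal_unit_type from the header byte that follows the start code
   prefix at `offset`. Returns -1 if the header byte is not in the buffer. */
static int nal_unit_type_at(const uint8_t *buf, size_t size, size_t offset)
{
    assert(buf != NULL);
    if (offset + 3 >= size)
        return -1;
    return (buf[offset + 3] >> 1) & 0x3F;
}

/* Nonzero for the nal_unit_type ranges named above: 0..9 and 16..21. */
static int is_slice_segment_nal(int nal_unit_type)
{
    return (nal_unit_type >= 0 && nal_unit_type <= 9) ||
           (nal_unit_type >= 16 && nal_unit_type <= 21);
}
```

In this sketch, the offset returned by find_start_code for a NAL unit that passes is_slice_segment_nal is the value that would be written to BSNALunitDataLocation.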
SliceBytesInBuffer
Number of bytes in the bitstream data buffer that are associated with this slice control data structure,
starting with the byte at the offset given in BSNALunitDataLocation.
wBadSliceChopping
Contains one of the following values:

Value   Description
0       All bits for the slice are located within the corresponding bitstream data buffer.
1       The bitstream data buffer contains the start of the slice, but not the entire slice,
        because the buffer is full.
2       The bitstream data buffer contains the end of the slice. It does not contain the
        start of the slice, because the start of the slice was located in the previous
        bitstream data buffer.
3       The bitstream data buffer does not contain the start of the slice (because the start
        of the slice was located in the previous bitstream data buffer), and it does not
        contain the end of the slice (because the current bitstream data buffer is also full).
Note – The above table refers to the bitstream data buffer contents for a slice rather than for a slice
segment. This is intentional. A slice may contain multiple NAL units, for which the first NAL unit contains
an independent slice segment and the subsequent NAL units contain dependent slice segments.
Generally the host decoder should avoid using values other than 0 for wBadSliceChopping.
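The four wBadSliceChopping values amount to two independent facts about the buffer: whether it holds the start of the slice and whether it holds the end. The C sketch below encodes that reading, together with the rule that BSNALunitDataLocation is 0 whenever the start of the slice is absent. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Maps "buffer holds the slice start" and "buffer holds the slice end"
   to the wBadSliceChopping value from the table above. */
static uint16_t bad_slice_chopping(bool has_start, bool has_end)
{
    if (has_start)
        return has_end ? 0 : 1;
    return has_end ? 2 : 3;
}

/* Checks the rule that BSNALunitDataLocation shall be 0 when
   wBadSliceChopping is not 0 or 1. */
static bool location_is_valid(uint16_t chopping, uint32_t location)
{
    assert(chopping <= 3);
    return chopping <= 1 || location == 0;
}
```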
The size of the data in the bitstream data buffer (and the amount of data reported by the host decoder)
shall be an integer multiple of 128 bytes. When wBadSliceChopping is 0 or 2, if the end of the slice data
does not fall on a multiple of 128 bytes, the host decoder should pad the end of the buffer with zeroes.
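The padding rule can be sketched in C as follows. The function names are hypothetical, and the sketch assumes the host decoder owns a buffer whose capacity is known; it rounds the used size up to the next multiple of 128 and zero-fills the gap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Rounds `used` up to the next integer multiple of 128 bytes. */
static size_t padded_size(size_t used)
{
    return (used + 127) / 128 * 128;
}

/* Zero-fills the bytes between `used` and the padded size.
   Returns the padded size, or SIZE_MAX if it would exceed `capacity`. */
static size_t pad_bitstream(uint8_t *buf, size_t used, size_t capacity)
{
    assert(buf != NULL && used <= capacity);
    size_t total = padded_size(used);
    if (total > capacity)
        return SIZE_MAX;
    memset(buf + used, 0, total - used);
    return total;
}
```

The value returned by pad_bitstream is the amount of data the host decoder would report for the bitstream data buffer.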
Any dependent slice segment shall be sent in the same bitstream data buffer as the preceding
independent slice segment in decoding order, unless wBadSliceChopping is not equal to 0, as the decoding
process for the dependent slice segment needs to infer the value of its slice header syntax elements
from the values for the preceding independent slice segment.
Note – According to Section 7.4.1.4.4 (Order of VCL NAL units and association to coded pictures) of the
HEVC specification, arbitrary slice order is prohibited by the HEVC specification.
The host decoder is recommended to send only decodable pictures to the accelerator. For example, when
the associated IRAP picture is a BLA picture or is a CRA picture that is the first coded picture in the
bitstream, the host decoder is recommended not to send the associated RASL pictures to the accelerator
if any such RASL pictures are present in the bitstream. However, the accelerator shall be robust enough
to handle any non-decodable pictures.
6. Status Report Data Structure
The DXVA_Status_HEVC data structure is sent by the accelerator to the host software decoder to
convey decoding status information. This structure is used when bDXVA_Func is 7.
The status reporting command does not use a compressed buffer. Instead, the host software decoder
provides a buffer as private output data. For more information, see section 1.9 (Status Reporting) of this
specification.
The status information command should be asynchronous to the decoding process. The host software
decoder should not wait to receive status information on a process before it proceeds to initiate another
process. After the host software decoder has received a status report for a particular operation, the
accelerator shall discard that information and not report it again. (That is, the results of each particular
operation shall not be reported to the host software decoder more than once.) Accelerators shall be
capable of providing status information for every buffer for every operation performed.
Accelerators are required to store at least 512 DXVA_Status_HEVC structures internally,
pending status requests from the host software decoder. An accelerator may (and should) exceed this
storage capacity. If the accelerator discards reporting information, it should discard the oldest data first.
The accelerator should provide status reports in approximately reverse temporal order of when the
operations were completed. That is, status reports for the most recently completed operations should
appear earlier in the list of status report data structures.
Note – As previously stated, the term "should" describes guidelines that are encouraged but are not
mandatory requirements.
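The storage and ordering behavior described above could be modeled with a fixed-size ring buffer. The C sketch below is one possible arrangement, not a description of any actual accelerator: StatusSketch is a hypothetical, reduced stand-in for DXVA_Status_HEVC, and the queue and function names are also hypothetical. A push overwrites the oldest entry when the queue is full, and a drain returns the newest entries first and removes them so they are not reported again.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { STATUS_CAPACITY = 512 }; /* minimum storage named in the text */

/* Hypothetical, reduced stand-in for DXVA_Status_HEVC. */
typedef struct {
    uint16_t StatusReportFeedbackNumber;
    uint8_t  bStatus;
} StatusSketch;

typedef struct {
    StatusSketch entries[STATUS_CAPACITY];
    size_t head;  /* index where the next entry will be written */
    size_t count; /* number of stored, unreported entries */
} StatusQueue;

/* Stores a report; when the queue is full the oldest entry is overwritten. */
static void status_push(StatusQueue *q, StatusSketch s)
{
    assert(q != NULL);
    q->entries[q->head] = s;
    q->head = (q->head + 1) % STATUS_CAPACITY;
    if (q->count < STATUS_CAPACITY)
        q->count++;
}

/* Copies up to `max` reports into `out`, newest first, and removes them
   from the queue so that each report is delivered only once. */
static size_t status_drain(StatusQueue *q, StatusSketch *out, size_t max)
{
    assert(q != NULL && out != NULL);
    size_t n = q->count < max ? q->count : max;
    for (size_t i = 0; i < n; ++i) {
        q->head = (q->head + STATUS_CAPACITY - 1) % STATUS_CAPACITY;
        out[i] = q->entries[q->head];
    }
    q->count -= n;
    return n;
}
```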
6.1 Syntax
The DXVA_Status_HEVC data structure is sent by the accelerator to the host software decoder to
convey decoding status information. The data structure and associated semantics are essentially the
same as for the previous DXVA_Status_H264 data structure. Although the data structure is the same as
the previous DXVA_Status_H264 data structure, it has been given a new name so that the data
structures used for HEVC will have names that are associated with the new design. For convenience, the
form of this data structure is shown below:
typedef struct _DXVA_Status_HEVC {
    USHORT             StatusReportFeedbackNumber;
    DXVA_PicEntry_HEVC CurrPic;
    UCHAR              bBufType;
    UCHAR              bStatus;
    UCHAR              bReserved8Bits;
    USHORT             wNumMbsAffected;
} DXVA_Status_HEVC, *LPDXVA_Status_HEVC;
6.2 Semantics
StatusReportFeedbackNumber
Contains the value of StatusReportFeedbackNumber set by the host decoder in the picture parameters
data structure or the film-grain synthesis buffer for the associated operation.
CurrPic
Specifies the uncompressed destination surface that was affected by the operation.
bBufType
Indicates the type of compressed buffer associated with this status report. If bStatus is 0, the value of
bBufType may be 0xFF. This value indicates that the status report applies to all of the compressed
buffers conveyed in the associated Execute call. Otherwise, if bBufType is not 0xFF, it must contain one
of the following values, defined in dxva.h:
Value                             Description
DXVA_PICTURE_DECODE_BUFFER (1)    Picture decoding parameter buffer.
DXVA_SLICE_CONTROL_BUFFER (6)     Slice control buffer.
DXVA_BITSTREAM_DATA_BUFFER (7)    Bitstream data buffer.
bStatus
Indicates the status of the operation.
Value   Description
0       The operation succeeded.
1       Minor problem in the data format. The host decoder should continue processing.
2       Significant problem in the data format. The host decoder may continue executing
        or skip the display of the output picture.
3       Severe problem in the data format. The host decoder should restart the entire
        decoding process, starting at a sequence or random-access entry point.
4       Other severe problem. The host decoder should restart the entire decoding
        process, starting at a sequence or random-access entry point.
If the value is 3 or 4, the host decoder should halt the decoding process unless it can take corrective
action.
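A host decoder might translate bStatus into an action along the following lines. This C sketch is illustrative only: the HostAction enumeration, the StatusFields type, and the function name are hypothetical, and treating values above 4 as severe is an assumption of the sketch, since the table does not define them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side actions corresponding to the table above. */
typedef enum {
    HOST_CONTINUE,
    HOST_MAY_SKIP_DISPLAY,
    HOST_RESTART_AT_ENTRY_POINT
} HostAction;

/* Hypothetical stand-in carrying only the status byte of a report. */
typedef struct {
    uint8_t bStatus;
} StatusFields;

static HostAction action_for_report(const StatusFields *r)
{
    assert(r != NULL);
    switch (r->bStatus) {
    case 0: /* success */
    case 1: /* minor problem: continue processing */
        return HOST_CONTINUE;
    case 2: /* significant problem: continue, or skip displaying the picture */
        return HOST_MAY_SKIP_DISPLAY;
    case 3: /* severe problem in the data format */
    case 4: /* other severe problem */
        return HOST_RESTART_AT_ENTRY_POINT;
    default: /* not defined by the table; treated as severe (assumption) */
        return HOST_RESTART_AT_ENTRY_POINT;
    }
}
```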
bReserved8Bits
This structure member has no meaning, and the value shall be 0. Accelerators shall ignore its value.
wNumMbsAffected
If bStatus is not 0, this member contains the accelerator's estimate of the number of coding tree units in
the decoded picture that were adversely affected by the reported problem. If the accelerator does not
provide an estimate, the value is 0xFFFF.
If bStatus is 0, the accelerator may set wNumMbsAffected to the number of coding tree units that were
successfully affected by the operation. If the accelerator does not provide an estimate, it shall set the
value either to 0 or to 0xFFFF.
Requirements
Header: Include dxva.h.
7. Restricted-Mode Profiles
The following restricted-mode profiles for DXVA operation for HEVC video decoding are defined. The
GUIDs that identify these profiles will be defined in the header file dxva.h.
7.1 DXVA_ModeHEVC_VLD_Main Profile
This profile supports the features necessary for a decoder that conforms to the HEVC Main profile and
Main Still Picture profile. In this profile, the accelerator performs bitstream parsing, inverse quantization
scaling, inverse transform processing, motion compensation, and deblocking.
All data buffers shall contain only data that is consistent with the constraints specified for the Main
profile and Main Still Picture profile in Annex A of the HEVC specification.
The associated GUID definition for the corresponding entry in the dxva.h header file is as follows:
// {5B11D51B-2F4C-4452-BCC3-09F2A1160CC0}
DEFINE_GUID(DXVA_ModeHEVC_VLD_Main,
0x5b11d51b, 0x2f4c, 0x4452, 0xbc, 0xc3, 0x9, 0xf2, 0xa1, 0x16, 0xc, 0xc0);
7.2 DXVA_ModeHEVC_VLD_Main10 Profile
This profile supports the features necessary for a decoder that conforms to the HEVC Main 10 profile. In
this profile, the accelerator performs bitstream parsing, inverse quantization scaling, inverse transform
processing, motion compensation, and deblocking.
All data buffers shall contain only data that is consistent with the constraints specified for the Main 10
profile in Annex A of the HEVC specification.
The associated GUID definition for the corresponding entry in the dxva.h header file is as follows:
// {107AF0E0-EF1A-4D19-ABA8-67A163073D13}
DEFINE_GUID(DXVA_ModeHEVC_VLD_Main10,
0x107af0e0, 0xef1a, 0x4d19, 0xab, 0xa8, 0x67, 0xa1, 0x63, 0x7, 0x3d, 0x13);
8. For More Information
DirectX Video Acceleration 2.0 documentation: http://go.microsoft.com/fwlink/?LinkId=94771
Web addresses can change, so you might be unable to connect to the Web site or sites mentioned here.