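Microsoft Media Foundation Official Documentation Translation (11): Media Types & Audio Media Types

Date: 2021-02-25 18:58:18

Official English documentation: https://docs.microsoft.com/en-us/windows/desktop/medfound/media-types

Based on the 05/31/2018 version.

I am trying to fit all of the pages below into this one post, so it is fairly long. The video material is large enough that it gets its own post.

This post currently covers the following pages: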

Media Types

    About Media Types

    Major Media Types

    Audio Media Types

        Audio Subtype GUIDs

        Uncompressed Audio Media Types

        AAC Media Types

 

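A media type describes the format of a media stream. In Media Foundation, a media type is represented by the IMFMediaType interface. Applications use media types to discover the format of a media file or media stream. Objects in the Media Foundation pipeline also use media types to negotiate the formats they will deliver or receive.

This section of the documentation contains the following topics: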

Topic | Description
About Media Types | General overview of media types in Media Foundation.
Media Type GUIDs | Lists the defined GUIDs for major types and subtypes.
Audio Media Types | How to create media types for audio formats.
Video Media Types | How to create media types for video formats.
Complete and Partial Media Types | Describes the difference between complete media types and partial media types.
Media Type Conversions | How to convert between Media Foundation media types and older format structures.
Media Type Helper Functions | A list of functions that manipulate or get information from a media type.
Media Type Debugging Code | Example code that shows how to view a media type while debugging.

 

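About Media Types

The IMFMediaType interface inherits the IMFAttributes interface. The details of a media type are specified as attributes.

To create a media type object, call MFCreateMediaType. This function returns a pointer to the IMFMediaType interface. The media type initially has no attributes. To describe the format, set the relevant attributes on it.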

 

Major Types and Subtypes

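Two important pieces of information in any media type are the major type and the subtype.

  • The major type is a GUID that defines the overall category of the data in a media stream. Major types include video and audio, among others. To specify the major type, set the MF_MT_MAJOR_TYPE attribute. The IMFMediaType::GetMajorType method returns the value of this attribute.
  • The subtype further defines the format. For example, within the video major type, the subtype can be RGB-24, RGB-32, YUY2, and so on. Within audio, the subtype can be PCM audio, IEEE floating-point audio, or others. The subtype carries more information than the major type, but it does not describe everything about the format. For example, a video subtype does not define the image size or the frame rate. To specify the subtype, set the MF_MT_SUBTYPE attribute.

Every media type should have a major type GUID and a subtype GUID. For the full list of GUIDs, see Media Type GUIDs.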

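Why Attributes?

Attributes have several advantages over the format structures used in earlier technologies such as DirectShow and the Windows Media Format SDK.

  • It is easier to represent "don't know" or "don't care" values. For example, if you are writing a video transform, you might know in advance which RGB and YUV formats it supports, but not the dimensions or frame rate of the video it will receive, and you might not care about them. In a structure that represents a video format, every member must hold some value (a real one or a default), and it is common to use zero as the default. That convention can cause errors if another component treats zero as a legal value. With attributes, you simply omit the values that are unknown or not relevant.

  • Requirements change over time. Earlier format structures were extended to support more formats by appending data to the end of the structure, for example extending WAVEFORMATEX to create WAVEFORMATEXTENSIBLE. This practice is error-prone because components must cast structure pointers to other structure types. Attributes can be extended safely.

  • Mutually incompatible format structures have been defined. For example, DirectShow defines both VIDEOINFOHEADER and VIDEOINFOHEADER2. Attributes are set independently of each other, so this problem does not arise.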
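The first point can be made concrete with a small sketch. Everything below is a hypothetical stand-in written in portable C++ (FixedVideoFormat, AttributeStore, and the string keys are not Media Foundation types): a fixed structure cannot tell "unknown" apart from zero, whereas an attribute store can report that a value was never set.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical fixed-layout format description: every member always holds
// some value, so 0 is ambiguous ("unknown" or a real value?).
struct FixedVideoFormat {
    std::uint32_t width = 0;
    std::uint32_t height = 0;
};

// Hypothetical attribute store: a key is either present or absent.
class AttributeStore {
public:
    void Set(const std::string& key, std::uint32_t value) { values_[key] = value; }

    // Returns false and leaves *value untouched when the attribute was never set.
    bool TryGet(const std::string& key, std::uint32_t* value) const {
        std::map<std::string, std::uint32_t>::const_iterator it = values_.find(key);
        if (it == values_.end()) {
            return false;
        }
        *value = it->second;
        return true;
    }

private:
    std::map<std::string, std::uint32_t> values_;
};

// Builds a partial description that names a pixel format but says nothing
// about the frame size.
inline AttributeStore MakePartialVideoType() {
    AttributeStore type;
    type.Set("subtype_fourcc", 0x32595559u);  // FOURCC 'YUY2'
    std::uint32_t unused = 0;
    assert(!type.TryGet("frame_width", &unused));  // omitted, not zero
    return type;
}
```

A caller that receives the store from MakePartialVideoType can distinguish "frame_width was omitted" from "frame_width is 0", which the FixedVideoFormat structure cannot express.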

 

 

Major Media Types

 

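In a media type, the major type describes the overall category of the data, such as audio or video. The subtype, if present, further refines the major type. For example, if the major type is video, the subtype might be 32-bit RGB video. Subtypes also distinguish encoded formats, such as H.264 video.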

Major type and subtype are identified by GUIDs and stored in the following attributes:

Attribute | Description
MF_MT_MAJOR_TYPE | Major type.
MF_MT_SUBTYPE | Subtype.

 

The following major types are defined.

Major Type | Description | Subtypes
MFMediaType_Audio | Audio. | Audio Subtype GUIDs.
MFMediaType_Binary | Binary stream. | None.
MFMediaType_FileTransfer | A stream that contains data files. | None.
MFMediaType_HTML | HTML stream. | None.
MFMediaType_Image | Still image stream. | WIC GUIDs and CLSIDs.
MFMediaType_Protected | Protected media. | The subtype specifies the content protection scheme.
MFMediaType_Perception | Streams from a camera sensor or processing unit that reasons and understands raw video data and provides understanding of the environment or humans in it. | None.
MFMediaType_SAMI | Synchronized Accessible Media Interchange (SAMI) captions. | None.
MFMediaType_Script | Script stream. | None.
MFMediaType_Stream | Multiplexed stream or elementary stream. | Stream Subtype GUIDs.
MFMediaType_Video | Video. | Video Subtype GUIDs.

Third-party components can define their own major types and subtypes.

 

 

Audio Media Types

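This section describes how to create and manipulate media types that describe audio data.

Topic | Description
Audio Subtype GUIDs | Contains a list of audio subtype GUIDs.
Uncompressed Audio Media Types | How to create a media type that describes an uncompressed audio format.
AAC Media Types | Describes how to specify the format of an Advanced Audio Coding (AAC) stream.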

 

 

Audio Subtype GUIDs

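The following audio subtype GUIDs are defined. To specify the subtype, set the MF_MT_SUBTYPE attribute on the media type. Except where noted, these constants are defined in the header file mfapi.h.

When these subtypes are used, set the MF_MT_MAJOR_TYPE attribute to MFMediaType_Audio.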

GUID | Description | Format Tag (FOURCC)
MEDIASUBTYPE_RAW_AAC1 | Advanced Audio Coding (AAC). This subtype is used for AAC contained in an AVI file with an audio format tag equal to 0x00FF. For more information, see AAC Decoder. Defined in wmcodecdsp.h. | WAVE_FORMAT_RAW_AAC1 (0x00FF)
MFAudioFormat_AAC | Advanced Audio Coding (AAC). Note: Equivalent to MEDIASUBTYPE_MPEG_HEAAC, defined in wmcodecdsp.h. The stream can contain raw AAC data or AAC data in an Audio Data Transport Stream (ADTS) stream. | WAVE_FORMAT_MPEG_HEAAC (0x1610)
MFAudioFormat_ADTS | Not used. | WAVE_FORMAT_MPEG_ADTS_AAC (0x1600)
MFAudioFormat_ALAC | Apple Lossless Audio Codec. Supported in Windows 10 and later. | WAVE_FORMAT_ALAC (0x6C61)
MFAudioFormat_AMR_NB | Adaptive Multi-Rate audio. Supported in Windows 8.1 and later. | WAVE_FORMAT_AMR_NB
MFAudioFormat_AMR_WB | Adaptive Multi-Rate Wideband audio. Supported in Windows 8.1 and later. | WAVE_FORMAT_AMR_WB
MFAudioFormat_AMR_WP | Supported in Windows 8.1 and later. | WAVE_FORMAT_AMR_WP
MFAudioFormat_Dolby_AC3 | Dolby Digital (AC-3). Same GUID value as MEDIASUBTYPE_DOLBY_AC3, which is defined in ksuuids.h. | None
MFAudioFormat_Dolby_AC3_SPDIF | Dolby AC-3 audio over Sony/Philips Digital Interface (S/PDIF). This GUID value is identical to KSDATAFORMAT_SUBTYPE_IEC61937_DOLBY_DIGITAL (defined in ksmedia.h) and MEDIASUBTYPE_DOLBY_AC3_SPDIF (defined in uuids.h). | WAVE_FORMAT_DOLBY_AC3_SPDIF (0x0092)
MFAudioFormat_Dolby_DDPlus | Dolby Digital Plus. Same GUID value as MEDIASUBTYPE_DOLBY_DDPLUS, which is defined in wmcodecdsp.h. | None
MFAudioFormat_DRM | Encrypted audio data used with secure audio path. | WAVE_FORMAT_DRM (0x0009)
MFAudioFormat_DTS | Digital Theater Systems (DTS) audio. | WAVE_FORMAT_DTS (0x0008)
MFAudioFormat_FLAC | Free Lossless Audio Codec. Supported in Windows 10 and later. | WAVE_FORMAT_FLAC (0xF1AC)
MFAudioFormat_Float | Uncompressed IEEE floating-point audio. | WAVE_FORMAT_IEEE_FLOAT (0x0003)
MFAudioFormat_Float_SpatialObjects | Uncompressed IEEE floating-point audio. | None
MFAudioFormat_MP3 | MPEG Audio Layer-3 (MP3). | WAVE_FORMAT_MPEGLAYER3 (0x0055)
MFAudioFormat_MPEG | MPEG-1 audio payload. | WAVE_FORMAT_MPEG (0x0050)
MFAudioFormat_MSP1 | Windows Media Audio 9 Voice codec. | WAVE_FORMAT_WMAVOICE9 (0x000A)
MFAudioFormat_Opus | Opus. Supported in Windows 10 and later. | WAVE_FORMAT_OPUS (0x704F)
MFAudioFormat_PCM | Uncompressed PCM audio. | WAVE_FORMAT_PCM (1)
MFAudioFormat_QCELP | QCELP (Qualcomm Code Excited Linear Prediction) audio. | None
MFAudioFormat_WMASPDIF | Windows Media Audio 9 Professional codec over S/PDIF. | WAVE_FORMAT_WMASPDIF (0x0164)
MFAudioFormat_WMAudio_Lossless | Windows Media Audio 9 Lossless codec or Windows Media Audio 9.1 codec. | WAVE_FORMAT_WMAUDIO_LOSSLESS (0x0163)
MFAudioFormat_WMAudioV8 | Windows Media Audio 8 codec, Windows Media Audio 9 codec, or Windows Media Audio 9.1 codec. | WAVE_FORMAT_WMAUDIO2 (0x0161)
MFAudioFormat_WMAudioV9 | Windows Media Audio 9 Professional codec or Windows Media Audio 9.1 Professional codec. | WAVE_FORMAT_WMAUDIO3 (0x0162)

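The format tags listed in the third column of this table are used in the WAVEFORMATEX structure and are defined in the header file mmreg.h.

Given an audio format tag, you can create an audio subtype GUID as follows:

  1. Start with the value MFAudioFormat_Base, which is defined in mfapi.h.
  2. Replace the first DWORD of this GUID with the format tag.

You can use the DEFINE_MEDIATYPE_GUID macro to define a new GUID constant that follows this pattern.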
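The two steps above can be sketched in portable C++. This is only an illustration: the Guid struct and both helper functions are hypothetical stand-ins (real code would use the Windows GUID type and the constants from mfapi.h), and the base value is written out here on the assumption that MFAudioFormat_Base is {00000000-0000-0010-8000-00AA00389B71}.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Portable stand-in for the Windows GUID structure (same field layout).
struct Guid {
    std::uint32_t data1;
    std::uint16_t data2;
    std::uint16_t data3;
    std::uint8_t  data4[8];
};

// Assumed value of MFAudioFormat_Base.
const Guid kAudioFormatBase = {
    0x00000000, 0x0000, 0x0010,
    {0x80, 0x00, 0x00, 0xAA, 0x00, 0x38, 0x9B, 0x71}
};

// Step 1: start from the base GUID.
// Step 2: replace the first DWORD with the format tag.
inline Guid AudioSubtypeFromFormatTag(std::uint32_t formatTag) {
    Guid g = kAudioFormatBase;
    g.data1 = formatTag;
    return g;
}

// Formats a Guid in the usual registry notation, for inspection.
inline std::string GuidToString(const Guid& g) {
    char buf[40];
    int n = std::snprintf(
        buf, sizeof(buf),
        "{%08X-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X}",
        static_cast<unsigned>(g.data1),
        static_cast<unsigned>(g.data2),
        static_cast<unsigned>(g.data3),
        static_cast<unsigned>(g.data4[0]), static_cast<unsigned>(g.data4[1]),
        static_cast<unsigned>(g.data4[2]), static_cast<unsigned>(g.data4[3]),
        static_cast<unsigned>(g.data4[4]), static_cast<unsigned>(g.data4[5]),
        static_cast<unsigned>(g.data4[6]), static_cast<unsigned>(g.data4[7]));
    assert(n == 38);  // a braced GUID string is always 38 characters
    return std::string(buf);
}
```

For example, passing the WAVE_FORMAT_MPEGLAYER3 tag (0x0055) from the table yields a GUID whose first DWORD is 0x00000055 and whose remaining fields match the base value.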

 

 

Uncompressed Audio Media Types

To create a complete media type that describes an uncompressed audio format, set at least the following attributes on the IMFMediaType interface pointer.

Attribute | Description
MF_MT_MAJOR_TYPE | Major type. Set to MFMediaType_Audio.
MF_MT_SUBTYPE | Subtype. See Audio Subtype GUIDs.
MF_MT_AUDIO_NUM_CHANNELS | Number of audio channels.
MF_MT_AUDIO_SAMPLES_PER_SECOND | Number of audio samples per second.
MF_MT_AUDIO_BLOCK_ALIGNMENT | Block alignment.
MF_MT_AUDIO_AVG_BYTES_PER_SECOND | Average number of bytes per second.
MF_MT_AUDIO_BITS_PER_SAMPLE | Number of bits per audio sample.
MF_MT_ALL_SAMPLES_INDEPENDENT | Specifies whether each audio sample is independent. Set to TRUE for MFAudioFormat_PCM and MFAudioFormat_Float formats.

 

In addition, the following attributes are required for some audio formats:

Attribute | Description
MF_MT_AUDIO_VALID_BITS_PER_SAMPLE | Number of valid bits of audio data in each audio sample. Set this attribute if the audio samples have padding, that is, if the number of valid bits in each audio sample is less than the sample size.
MF_MT_AUDIO_CHANNEL_MASK | The assignment of audio channels to speaker positions. Set this attribute for multichannel audio streams, such as 5.1. This attribute is not required for mono or stereo audio.

 

Example Code

The following code shows how to create a media type for uncompressed PCM audio.

#include <mfapi.h>
#include <mfidl.h>
#include <mferror.h>

// Helper used by the examples: releases a COM pointer and sets it to NULL.
template <class T> void SafeRelease(T **ppT)
{
    if (*ppT)
    {
        (*ppT)->Release();
        *ppT = NULL;
    }
}

HRESULT CreatePCMAudioType(
    UINT32 sampleRate,        // Samples per second
    UINT32 bitsPerSample,     // Bits per sample
    UINT32 cChannels,         // Number of channels
    IMFMediaType **ppType     // Receives a pointer to the media type.
    )
{
    HRESULT hr = S_OK;

    IMFMediaType *pType = NULL;

    // Calculate derived values.
    UINT32 blockAlign = cChannels * (bitsPerSample / 8);
    UINT32 bytesPerSecond = blockAlign * sampleRate;

    // Create the empty media type.
    hr = MFCreateMediaType(&pType);
    if (FAILED(hr))
    {
        goto done;
    }

    // Set attributes on the type.
    hr = pType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio);
    if (FAILED(hr))
    {
        goto done;
    }

    hr = pType->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_PCM);
    if (FAILED(hr))
    {
        goto done;
    }

    hr = pType->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, cChannels);
    if (FAILED(hr))
    {
        goto done;
    }

    hr = pType->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, sampleRate);
    if (FAILED(hr))
    {
        goto done;
    }

    hr = pType->SetUINT32(MF_MT_AUDIO_BLOCK_ALIGNMENT, blockAlign);
    if (FAILED(hr))
    {
        goto done;
    }

    hr = pType->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, bytesPerSecond);
    if (FAILED(hr))
    {
        goto done;
    }

    hr = pType->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, bitsPerSample);
    if (FAILED(hr))
    {
        goto done;
    }

    hr = pType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, TRUE);
    if (FAILED(hr))
    {
        goto done;
    }

    // Return the type to the caller. AddRef balances the release below.
    *ppType = pType;
    (*ppType)->AddRef();

done:
    SafeRelease(&pType);
    return hr;
}
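As a quick check of the two derived values that CreatePCMAudioType computes, the arithmetic can be isolated in a small portable function. This is a sketch only: PcmDerived and DerivePcmValues are hypothetical names, and the function assumes the bit depth is a whole number of bytes, as the integer division in the example implicitly does.

```cpp
#include <cassert>
#include <cstdint>

// Values that would go into MF_MT_AUDIO_BLOCK_ALIGNMENT and
// MF_MT_AUDIO_AVG_BYTES_PER_SECOND for uncompressed PCM.
struct PcmDerived {
    std::uint32_t blockAlign;         // bytes per frame (one sample per channel)
    std::uint32_t avgBytesPerSecond;  // bytes per second of audio
};

inline PcmDerived DerivePcmValues(std::uint32_t sampleRate,
                                  std::uint32_t bitsPerSample,
                                  std::uint32_t channels) {
    // Assumption: samples occupy a whole number of bytes.
    assert(bitsPerSample % 8 == 0);
    PcmDerived d;
    d.blockAlign = channels * (bitsPerSample / 8);
    d.avgBytesPerSecond = d.blockAlign * sampleRate;
    return d;
}
```

For 48,000 Hz, 16-bit stereo audio this gives a block alignment of 2 × 2 = 4 bytes and 4 × 48,000 = 192,000 bytes per second.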

 


 

The next example takes an encoded audio format as input and creates a matching PCM audio type. This type is suitable to set on an encoder or decoder.

//-------------------------------------------------------------------
// ConvertAudioTypeToPCM
//
// Given an audio media type (which might describe a compressed audio
// format), returns a media type that describes the equivalent
// uncompressed PCM format.
//
// Requires <mfapi.h>, <mfidl.h>, and <mferror.h>, and uses
// CreatePCMAudioType from the previous example.
//-------------------------------------------------------------------

HRESULT ConvertAudioTypeToPCM(
    IMFMediaType *pType,        // Pointer to an encoded audio type.
    IMFMediaType **ppType       // Receives a matching PCM audio type.
    )
{
    HRESULT hr = S_OK;

    GUID majortype = { 0 };
    GUID subtype = { 0 };

    UINT32 cChannels = 0;
    UINT32 samplesPerSec = 0;
    UINT32 bitsPerSample = 0;

    hr = pType->GetMajorType(&majortype);
    if (FAILED(hr))
    {
        return hr;
    }

    if (majortype != MFMediaType_Audio)
    {
        return MF_E_INVALIDMEDIATYPE;
    }

    // Get the audio subtype.
    hr = pType->GetGUID(MF_MT_SUBTYPE, &subtype);
    if (FAILED(hr))
    {
        return hr;
    }

    if (subtype == MFAudioFormat_PCM)
    {
        // This is already a PCM audio type. Return the same pointer.

        *ppType = pType;
        (*ppType)->AddRef();

        return S_OK;
    }

    // Get the sample rate and other information from the audio format.

    cChannels = MFGetAttributeUINT32(pType, MF_MT_AUDIO_NUM_CHANNELS, 0);
    samplesPerSec = MFGetAttributeUINT32(pType, MF_MT_AUDIO_SAMPLES_PER_SECOND, 0);
    bitsPerSample = MFGetAttributeUINT32(pType, MF_MT_AUDIO_BITS_PER_SAMPLE, 16);

    // Note: Some encoded audio formats do not contain a value for bits/sample.
    // In that case, use a default value of 16. Most codecs will accept this value.

    if (cChannels == 0 || samplesPerSec == 0)
    {
        return MF_E_INVALIDTYPE;
    }

    // Create the corresponding PCM audio type.
    hr = CreatePCMAudioType(samplesPerSec, bitsPerSample, cChannels, ppType);

    return hr;
}

 


 

AAC Media Types

This topic describes how to specify the format of an Advanced Audio Coding (AAC) stream in Media Foundation.

Two subtypes are defined for AAC audio:

Subtype | Description | Header
MFAudioFormat_AAC | Raw AAC or ADTS AAC. | mfapi.h
MEDIASUBTYPE_RAW_AAC1 | Raw AAC. | wmcodecdsp.h

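1. MFAudioFormat_AAC

For this subtype, the media type gives the sample rate and number of channels before the spectral band replication (SBR) and parametric stereo (PS) tools are applied, if present. The effect of the SBR tool is to double the decoded sample rate relative to the core AAC-LC sample rate. The effect of the PS tool is to decode stereo from a mono core AAC-LC stream.

This subtype is equivalent to MEDIASUBTYPE_MPEG_HEAAC, defined in wmcodecdsp.h. See Audio Subtype GUIDs.

2. MEDIASUBTYPE_RAW_AAC1

This subtype is used for AAC contained in an AVI file with an audio format tag equal to WAVE_FORMAT_RAW_AAC1 (0x00FF).

For this subtype, the media type gives the sample rate and number of channels after the SBR and PS tools are applied, if present.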
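The difference between the two subtypes can be illustrated with a short sketch. The names below (AacSubtype, AacStreamInfo, ReportedFormat, ReportedValues) are hypothetical and exist only for this example. The code simply applies the rules stated above: SBR doubles the core sample rate, and PS turns a mono core stream into stereo.

```cpp
#include <cassert>
#include <cstdint>

// Which of the two AAC subtypes the media type uses.
enum class AacSubtype {
    MpegHeAac,  // corresponds to MFAudioFormat_AAC: values before SBR/PS
    RawAac1     // corresponds to MEDIASUBTYPE_RAW_AAC1: values after SBR/PS
};

// Properties of the core AAC-LC stream and the tools layered on it.
struct AacStreamInfo {
    std::uint32_t coreSampleRate;
    std::uint32_t coreChannels;
    bool hasSbr;
    bool hasPs;
};

struct ReportedFormat {
    std::uint32_t sampleRate;
    std::uint32_t channels;
};

// Returns the sample rate and channel count the media type would carry.
inline ReportedFormat ReportedValues(const AacStreamInfo& s, AacSubtype subtype) {
    assert(s.coreSampleRate > 0 && s.coreChannels > 0);
    ReportedFormat r = { s.coreSampleRate, s.coreChannels };
    if (subtype == AacSubtype::RawAac1) {
        if (s.hasSbr) {
            r.sampleRate *= 2;  // SBR doubles the decoded sample rate
        }
        if (s.hasPs && s.coreChannels == 1) {
            r.channels = 2;     // PS decodes stereo from a mono core
        }
    }
    return r;
}
```

For a stream with a 22,050 Hz mono core that uses both SBR and PS, the first subtype would report 22,050 Hz and 1 channel, while the second would report 44,100 Hz and 2 channels.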

 

The following media type attributes apply to AAC audio.

Attribute | Description
MF_MT_MAJOR_TYPE | Major type. Must be MFMediaType_Audio.
MF_MT_SUBTYPE | Audio subtype. One of the two subtypes described above.
MF_MT_AAC_AUDIO_PROFILE_LEVEL_INDICATION | Audio profile and level. The value of this attribute is the audioProfileLevelIndication field, as defined by ISO/IEC 14496-3. If unknown, set to zero or 0xFE ("no audio profile specified").
MF_MT_AUDIO_AVG_BYTES_PER_SECOND | Bit rate of the encoded AAC stream, in bytes per second.
MF_MT_AAC_PAYLOAD_TYPE | Payload type. Applies only to MFAudioFormat_AAC. MF_MT_AAC_PAYLOAD_TYPE is optional. If this attribute is not specified, the default value 0 is used, which specifies the stream contains raw_data_block elements only.
MF_MT_AUDIO_BITS_PER_SAMPLE | Bit depth of the decoded PCM audio.
MF_MT_AUDIO_CHANNEL_MASK | Assignment of audio channels to speaker positions.
MF_MT_AUDIO_NUM_CHANNELS | Number of channels, including the low frequency (LFE) channel, if present. The interpretation of this value depends on the media subtype, as described previously.
MF_MT_AUDIO_SAMPLES_PER_SECOND | Sample rate, in samples per second. The interpretation of this value depends on the media subtype, as described previously.
MF_MT_USER_DATA | The value of this attribute depends on the subtype. For MFAudioFormat_AAC, it contains the portion of the HEAACWAVEINFO structure that appears after the WAVEFORMATEX structure (that is, after the wfx member), followed by the AudioSpecificConfig() data, as defined by ISO/IEC 14496-3. For MEDIASUBTYPE_RAW_AAC1, it contains the AudioSpecificConfig() data.
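To make the MF_MT_USER_DATA layout for MFAudioFormat_AAC more concrete, here is a minimal sketch that assembles such a blob in portable C++. It rests on an assumption that is not stated in the text above: that the members following wfx in HEAACWAVEINFO are, in order, wPayloadType, wAudioProfileLevelIndication, wStructType, wReserved1 (each 16 bits) and dwReserved2 (32 bits), stored little-endian. The helper names are hypothetical, and the result should be checked against the structure definition in mmreg.h before being relied on.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Appends a 16-bit value in little-endian byte order.
inline void AppendU16(std::vector<std::uint8_t>& out, std::uint16_t v) {
    out.push_back(static_cast<std::uint8_t>(v & 0xFF));
    out.push_back(static_cast<std::uint8_t>(v >> 8));
}

// Appends a 32-bit value in little-endian byte order.
inline void AppendU32(std::vector<std::uint8_t>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
    }
}

// Builds the bytes for MF_MT_USER_DATA on an MFAudioFormat_AAC type:
// the assumed post-wfx members of HEAACWAVEINFO, then AudioSpecificConfig().
inline std::vector<std::uint8_t> BuildAacUserData(
        std::uint16_t payloadType,
        std::uint16_t profileLevelIndication,
        const std::vector<std::uint8_t>& audioSpecificConfig) {
    assert(!audioSpecificConfig.empty());
    std::vector<std::uint8_t> blob;
    AppendU16(blob, payloadType);             // wPayloadType
    AppendU16(blob, profileLevelIndication);  // wAudioProfileLevelIndication
    AppendU16(blob, 0);                       // wStructType (assumed 0)
    AppendU16(blob, 0);                       // wReserved1
    AppendU32(blob, 0);                       // dwReserved2
    blob.insert(blob.end(), audioSpecificConfig.begin(), audioSpecificConfig.end());
    return blob;
}
```

With a payload type of 0 (raw_data_block elements only), a profile level of 0xFE ("no audio profile specified"), and a two-byte AudioSpecificConfig(), the function produces a 14-byte blob: 12 bytes of structure members followed by the 2 configuration bytes. The resulting bytes would then be stored on the media type as a blob attribute.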