    Topic: MP3 file format (English) [repost]
    Posted by: zhu_ruixian
    (The explanation does not feel very detailed, and it is in English.)
    NOTE: You cannot simply search the Internet and find the MPEG audio specs. They are copyrighted, and you have to pay quite a bit to get the paper. That is why I wrote this. The information here was gathered from the Internet, mostly from sources I found freely available. Although I normally always cite my sources, I cannot do so this time. Sorry, I did not keep the list. :(

    This is not a decoding spec; it only explains how to read the MPEG headers and the MPEG TAG. MPEG Versions 1, 2 and 2.5 and Layers I, II and III are covered, as is the ID3 TAG (ID3v1 and ID3v1.1). Those of you who use Delphi may find my MPGTools Delphi unit useful; it is where I implemented this stuff.

    MPEG Audio Frame Header

    An MPEG audio file is split into smaller parts called frames. Each frame is independent: it has its own header and audio information. There is no file header. Therefore, you can cut out any part of an MPEG file and it will still play correctly.

    When you want to read info about an MPEG file, it is usually enough to find the first frame, read its header and assume that the other frames are the same (which may not be always the case).

    The frame header consists of the very first four bytes (32 bits) of a frame. The first eleven bits of a frame header are always set and are called the "frame sync". Therefore, you can search through the file for the first occurrence of eleven set bits (that is, a byte with the value 255 followed by a byte with its three most significant bits set). Then you read the whole header and check whether the values are valid. The table below shows the exact meaning of each bit in the header and which values can be checked for validity. Any value specified as reserved, invalid, bad, or not allowed indicates an invalid header.
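
    The search described above can be sketched in Python as follows. This is only a minimal illustration: the function name find_frame_sync is hypothetical, and a hit is merely a candidate header that still has to be validated field by field.

```python
from typing import Optional


def find_frame_sync(data: bytes, start: int = 0) -> Optional[int]:
    """Return the offset of the first candidate frame header at or after start."""
    pos = start
    # A full header needs four bytes, so stop once fewer than four remain.
    while pos + 4 <= len(data):
        pos = data.find(b"\xff", pos)
        if pos == -1 or pos + 4 > len(data):
            return None
        # The byte after 0xFF must have its three most significant bits set.
        if (data[pos + 1] & 0xE0) == 0xE0:
            return pos
        pos += 1
    return None
```

    Because eleven set bits can also occur by chance inside audio data or a tag, a reader would normally keep searching from the next offset when the header that follows turns out to be invalid.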

    Frames may have a CRC check, but it's pretty rare. The CRC is 16 bits long and, if it exists, it follows the frame header. After the CRC comes the audio data. You may calculate the length of the frame and use it if you need to read other headers too or just want to calculate the CRC of the frame, to compare it with the one you read from the file. This is actually a very good method to check the MPEG header validity.

    Here is a "graphical" presentation of the header content. Letters indicate the different fields; the table that follows gives the details of each field.

    AAAAAAAA AAABBCCD EEEEFFGH IIJJKLMM

    Sign  Length  Position  Description
          (bits)  (bits)
    A     11      (31-21)   Frame sync (all bits set)
    B     2       (20,19)   MPEG Audio version
                            00 - MPEG Version 2.5
                            01 - reserved
                            10 - MPEG Version 2
                            11 - MPEG Version 1
    C     2       (18,17)   Layer description
                            00 - reserved
                            01 - Layer III
                            10 - Layer II
                            11 - Layer I
    D     1       (16)      Protection bit
                            0 - Protected by CRC (16-bit CRC follows header)
                            1 - Not protected
    E     4       (15-12)   Bitrate index

    bits  V1,L1  V1,L2  V1,L3  V2,L1  V2,L2  V2,L3
    0000  free   free   free   free   free   free
    0001  32     32     32     32     8      8
    0010  64     48     40     48     16     16
    0011  96     56     48     56     24     24
    0100  128    64     56     64     32     32
    0101  160    80     64     80     40     40
    0110  192    96     80     96     48     48
    0111  224    112    96     112    56     56
    1000  256    128    112    128    64     64
    1001  288    160    128    144    80     80
    1010  320    192    160    160    96     96
    1011  352    224    192    176    112    112
    1100  384    256    224    192    128    128
    1101  416    320    256    224    144    144
    1110  448    384    320    256    160    160
    1111  bad    bad    bad    bad    bad    bad

    NOTES: All values are in kbps
    V1 - MPEG Version 1
    V2 - MPEG Version 2 and Version 2.5
    L1 - Layer I
    L2 - Layer II
    L3 - Layer III
    "free" means free format: a constant bitrate that is not one of the listed values.
    "bad" means that this is not an allowed value

    The V2 values agree with the mpeg2.0 bitrate table in the second listing below.

    F     2       (11,10)   Sampling rate frequency index (values are in Hz)

    bits  MPEG1    MPEG2    MPEG2.5
    00    44100    22050    11025
    01    48000    24000    12000
    10    32000    16000    8000
    11    reserv.  reserv.  reserv.

    G     1       (9)       Padding bit
                            0 - frame is not padded
                            1 - frame is padded with one extra slot
                                (4 bytes for Layer I, 1 byte for Layers II and III)
    H     1       (8)       Private bit (free for application-specific use)
    I     2       (7,6)     Channel Mode
                            00 - Stereo
                            01 - Joint stereo (Stereo)
                            10 - Dual channel (Stereo)
                            11 - Single channel (Mono)
    J     2       (5,4)     Mode extension (Only if Joint stereo; values shown for Layer III)

    value  Intensity stereo  MS stereo
    00     off               off
    01     on                off
    10     off               on
    11     on                on

    K     1       (3)       Copyright
                            0 - Audio is not copyrighted
                            1 - Audio is copyrighted
    L     1       (2)       Original
                            0 - Copy of original media
                            1 - Original media
    M     2       (1,0)     Emphasis
                            00 - none
                            01 - 50/15 microseconds
                            10 - reserved
                            11 - CCITT J.17
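
    As an illustration of the table, the sketch below unpacks the fields from four header bytes with shifts and masks. The names FrameHeader and parse_header are hypothetical, the private bit is skipped, and the V2 bitrate columns are taken from the mpeg2.0 table in the second listing further down this post. Treat it as a reading aid under those assumptions, not a reference parser.

```python
from typing import NamedTuple, Optional

VERSIONS = {0b00: "2.5", 0b10: "2", 0b11: "1"}  # 0b01 is reserved
LAYERS = {0b01: 3, 0b10: 2, 0b11: 1}            # 0b00 is reserved
SAMPLE_RATES = {
    "1": (44100, 48000, 32000),
    "2": (22050, 24000, 16000),
    "2.5": (11025, 12000, 8000),
}
# kbps for bitrate index 1..14, keyed by (is_version_1, layer).
BITRATES = {
    (True, 1): (32, 64, 96, 128, 160, 192, 224, 256, 288, 320, 352, 384, 416, 448),
    (True, 2): (32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 384),
    (True, 3): (32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320),
    (False, 1): (32, 48, 56, 64, 80, 96, 112, 128, 144, 160, 176, 192, 224, 256),
    (False, 2): (8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160),
    (False, 3): (8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160),
}
CHANNEL_MODES = ("stereo", "joint stereo", "dual channel", "single channel")


class FrameHeader(NamedTuple):
    version: str
    layer: int
    crc_protected: bool
    bitrate_kbps: Optional[int]  # None stands for the "free" index 0000
    sample_rate: int
    padding: int
    channel_mode: str
    mode_extension: int
    copyrighted: bool
    original: bool
    emphasis: int


def parse_header(raw: bytes) -> Optional[FrameHeader]:
    """Decode four header bytes; return None if any field is invalid."""
    if len(raw) != 4:
        return None
    h = int.from_bytes(raw, "big")
    if ((h >> 21) & 0x7FF) != 0x7FF:           # A: frame sync
        return None
    version = VERSIONS.get((h >> 19) & 0x3)    # B
    layer = LAYERS.get((h >> 17) & 0x3)        # C
    bitrate_index = (h >> 12) & 0xF            # E
    rate_index = (h >> 10) & 0x3               # F
    emphasis = h & 0x3                         # M
    if version is None or layer is None:
        return None
    if bitrate_index == 0xF or rate_index == 0x3 or emphasis == 0x2:
        return None
    bitrate = None
    if bitrate_index != 0:
        bitrate = BITRATES[(version == "1", layer)][bitrate_index - 1]
    return FrameHeader(
        version=version,
        layer=layer,
        crc_protected=((h >> 16) & 0x1) == 0,        # D: 0 means protected
        bitrate_kbps=bitrate,
        sample_rate=SAMPLE_RATES[version][rate_index],
        padding=(h >> 9) & 0x1,                      # G
        channel_mode=CHANNEL_MODES[(h >> 6) & 0x3],  # I
        mode_extension=(h >> 4) & 0x3,               # J
        copyrighted=bool((h >> 3) & 0x1),            # K
        original=bool((h >> 2) & 0x1),               # L
        emphasis=emphasis,
    )
```

    For example, the bytes FF FB 90 64 decode as MPEG Version 1, Layer III, no CRC, 128 kbps, 44100 Hz, joint stereo.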


    How to calculate frame size

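    Read the BitRate (in bits per second), SampleRate (in Hz) and Padding (as a value of one or zero) from the frame header and use the formula:

    FrameSize = 144 * BitRate / SampleRate + Padding

    The division is an integer division (the result is truncated), and FrameSize is in bytes. The constant 144 applies to Layers II and III of MPEG Version 1; the second listing below gives the formulas for the other cases.

    Example: BitRate = 128000, SampleRate = 44100, Padding = 0  ==>  FrameSize = 417 bytes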
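
    The formula and the example can be checked with a few lines of Python. The function name frame_size is hypothetical; the only assumption beyond the text is that the division truncates, which is what yields 417 in the example.

```python
def frame_size(bitrate_bps: int, sample_rate_hz: int, padding: int) -> int:
    """Frame size in bytes for the 144 * BitRate / SampleRate + Padding formula."""
    # Integer division truncates the fractional part.
    return 144 * bitrate_bps // sample_rate_hz + padding


print(frame_size(128000, 44100, 0))  # 417
print(frame_size(128000, 44100, 1))  # 418
```

    At 44100 Hz the exact quotient is not an integer, which is why encoders set the padding bit on some frames to keep the average bitrate on target.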

    MPEG Audio Tag ID3v1

    The TAG is used to describe the MPEG Audio file. It contains information about artist, title, album, publishing year and genre. There is some extra space for comments. It is exactly 128 bytes long and is located at the very end of the file, after the audio data. You can get it by reading the last 128 bytes of the MPEG audio file.

    AAABBBBB BBBBBBBB BBBBBBBB BBBBBBBB
    BCCCCCCC CCCCCCCC CCCCCCCC CCCCCCCD
    DDDDDDDD DDDDDDDD DDDDDDDD DDDDDEEE
    EFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFG
    Sign  Length   Position  Description
          (bytes)  (bytes)
    A     3        (0-2)     Tag identification. Must contain 'TAG' if the tag exists and is correct.
    B     30       (3-32)    Title
    C     30       (33-62)   Artist
    D     30       (63-92)   Album
    E     4        (93-96)   Year
    F     30       (97-126)  Comment
    G     1        (127)     Genre


    The specification asks for all fields to be padded with null characters (ASCII 0). However, not all applications respect this (an example is WinAmp, which pads fields with spaces, ASCII 32).

    A small change is proposed in the ID3v1.1 structure. The last byte of the Comment field may be used to specify the track number of a song in an album, in which case the byte before it must be a null character. The track byte should contain a null character (ASCII 0) if the information is unknown.
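
    A minimal sketch of reading this 128-byte tag, including the track-number convention, might look like the following. The names parse_tag and _text are hypothetical, and decoding the text as Latin-1 is an assumption, since the text above does not name a character set.

```python
from typing import Optional


def _text(field: bytes) -> str:
    # Cut at the first null, then drop the trailing spaces some writers use.
    return field.split(b"\x00", 1)[0].decode("latin-1").rstrip(" ")


def parse_tag(block: bytes) -> Optional[dict]:
    """Parse the last 128 bytes of a file; return None if no tag is present."""
    if len(block) != 128 or block[0:3] != b"TAG":
        return None
    comment = block[97:127]
    track = None
    # Track convention: comment byte 28 is null and byte 29 holds the track.
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]
        comment = comment[:28]
    return {
        "title": _text(block[3:33]),
        "artist": _text(block[33:63]),
        "album": _text(block[63:93]),
        "year": _text(block[93:97]),
        "comment": _text(comment),
        "track": track,
        "genre": block[127],
    }
```

    To use it on a file, one would seek to 128 bytes before the end (for example `f.seek(-128, 2)` on a file opened in binary mode) and pass the bytes read from there.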

    Genre is a numeric field which may have one of the following values:

     0 'Blues'        20 'Alternative'  40 'AlternRock'        60 'Top 40'
     1 'Classic Rock' 21 'Ska'          41 'Bass'              61 'Christian Rap'
     2 'Country'      22 'Death Metal'  42 'Soul'              62 'Pop/Funk'
     3 'Dance'        23 'Pranks'       43 'Punk'              63 'Jungle'
     4 'Disco'        24 'Soundtrack'   44 'Space'             64 'Native American'
     5 'Funk'         25 'Euro-Techno'  45 'Meditative'        65 'Cabaret'
     6 'Grunge'       26 'Ambient'      46 'Instrumental Pop'  66 'New Wave'
     7 'Hip-Hop'      27 'Trip-Hop'     47 'Instrumental Rock' 67 'Psychadelic'
     8 'Jazz'         28 'Vocal'        48 'Ethnic'            68 'Rave'
     9 'Metal'        29 'Jazz+Funk'    49 'Gothic'            69 'Showtunes'
    10 'New Age'      30 'Fusion'       50 'Darkwave'          70 'Trailer'
    11 'Oldies'       31 'Trance'       51 'Techno-Industrial' 71 'Lo-Fi'
    12 'Other'        32 'Classical'    52 'Electronic'        72 'Tribal'
    13 'Pop'          33 'Instrumental' 53 'Pop-Folk'          73 'Acid Punk'
    14 'R&B'          34 'Acid'         54 'Eurodance'         74 'Acid Jazz'
    15 'Rap'          35 'House'        55 'Dream'             75 'Polka'
    16 'Reggae'       36 'Game'         56 'Southern Rock'     76 'Retro'
    17 'Rock'         37 'Sound Clip'   57 'Comedy'            77 'Musical'
    18 'Techno'       38 'Gospel'       58 'Cult'              78 'Rock & Roll'
    19 'Industrial'   39 'Noise'        59 'Gangsta'           79 'Hard Rock'

    Any other value should be considered as 'Unknown'
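
    The genre byte can be turned into a name with a simple lookup that falls back to 'Unknown' for any value outside the list. The names GENRES and genre_name are hypothetical; the strings are copied from the list above, spelling included.

```python
GENRES = (
    "Blues", "Classic Rock", "Country", "Dance", "Disco", "Funk", "Grunge",
    "Hip-Hop", "Jazz", "Metal", "New Age", "Oldies", "Other", "Pop", "R&B",
    "Rap", "Reggae", "Rock", "Techno", "Industrial", "Alternative", "Ska",
    "Death Metal", "Pranks", "Soundtrack", "Euro-Techno", "Ambient",
    "Trip-Hop", "Vocal", "Jazz+Funk", "Fusion", "Trance", "Classical",
    "Instrumental", "Acid", "House", "Game", "Sound Clip", "Gospel", "Noise",
    "AlternRock", "Bass", "Soul", "Punk", "Space", "Meditative",
    "Instrumental Pop", "Instrumental Rock", "Ethnic", "Gothic", "Darkwave",
    "Techno-Industrial", "Electronic", "Pop-Folk", "Eurodance", "Dream",
    "Southern Rock", "Comedy", "Cult", "Gangsta", "Top 40", "Christian Rap",
    "Pop/Funk", "Jungle", "Native American", "Cabaret", "New Wave",
    "Psychadelic", "Rave", "Showtunes", "Trailer", "Lo-Fi", "Tribal",
    "Acid Punk", "Acid Jazz", "Polka", "Retro", "Musical", "Rock & Roll",
    "Hard Rock",
)


def genre_name(code: int) -> str:
    """Map the one-byte genre code to its name, or 'Unknown' if not listed."""
    if 0 <= code < len(GENRES):
        return GENRES[code]
    return "Unknown"
```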

    MPEG Audio Tag ID3v2

    This is a newly proposed TAG format that differs from ID3v1 and ID3v1.1. Complete tech specs for it may be found at http://www.id3.com/.

    --------------------------------------------------------------------------------

    MPEG 1.0/2.0 Layers I, II and III header and trailer formats
    ------------------------------------------------------------
    Laurent.Clevy@alcatel.fr

    * HEADER

    bits name              comments
    --------------------------------------------------
    12   sync              0xFFF
    1    version           1=mpeg1.0, 0=mpeg2.0
    2    lay               4-lay = layerI, II or III
    1    error protection  0=yes, 1=no
    4    bitrate_index     see table below
    2    sampling_freq     see table below
    1    padding
    1    extension         see table below
    2    mode              see table below
    2    mode_ext          used with "joint stereo" mode
    1    copyright         0=no 1=yes
    1    original          0=no 1=yes
    2    emphasis          see table below
    --------------------------------------------------
    - bitrate_index
    . mpeg1.0
                1  2  3   4   5   6   7   8   9  10  11  12  13  14
    layer1     32 64 96 128 160 192 224 256 288 320 352 384 416 448
    layer2     32 48 56  64  80  96 112 128 160 192 224 256 320 384
    layer3     32 40 48  56  64  80  96 112 128 160 192 224 256 320
    . mpeg2.0
                1  2  3   4   5   6   7   8   9  10  11  12  13  14
    layer1     32 48 56  64  80  96 112 128 144 160 176 192 224 256
    layer2      8 16 24  32  40  48  56  64  80  96 112 128 144 160
    layer3      8 16 24  32  40  48  56  64  80  96 112 128 144 160

    - sampling_freq
    . mpeg1.0
        0     1     2     
    44100 48000 32000
    . mpeg2.0
        0     1     2     
    22050 24000 16000

    - mode:
    0 "stereo"
    1 "joint stereo"
    2 "dual channel"
    3 "single channel"

    - mode extension:

    0      MPG_MD_LR_LR
    1      MPG_MD_LR_I
    2      MPG_MD_MS_LR
    3      MPG_MD_MS_I
    jsbound :
       mode_ext     0  1   2   3
    layer
    1               4  8  12  16
    2               4  8  12  16
    3               0  4   8  16

    - emphasis:
    0 "none"
    1 "50/15 microsecs"
    2 "reserved"            must not be used !
    3 "CCITT J 17"

    * TRAILER
    at end of file - 128 bytes
    offset  type  len   name
    --------------------------------------------
    0       char  3                   "TAG"
    3       char  30    title
    33      char  30    artist
    63      char  30    album
    93      char  4     year
    97      char  30    comments
    127     byte  1     genre
    --------------------------------------------
    - genre :
    0    "Blues"
    1    "Classic Rock"
    2    "Country"
    3    "Dance"
    4    "Disco"
    5    "Funk"
    6    "Grunge"
    7    "Hip-Hop"
    8    "Jazz"
    9    "Metal"
    10    "New Age"
    11    "Oldies"
    12    "Other"
    13    "Pop"
    14    "R&B"
    15    "Rap"
    16    "Reggae"
    17    "Rock"
    18    "Techno"
    19    "Industrial"
    20    "Alternative"
    21    "Ska"
    22    "Death Metal"
    23    "Pranks"
    24    "Soundtrack"
    25    "Euro-Techno"
    26    "Ambient"
    27    "Trip-Hop"
    28    "Vocal"
    29    "Jazz+Funk"
    30    "Fusion"
    31    "Trance"
    32    "Classical"
    33    "Instrumental"
    34    "Acid"
    35    "House"
    36    "Game"
    37    "Sound Clip"
    38    "Gospel"
    39    "Noise"
    40    "AlternRock"
    41    "Bass"
    42    "Soul"
    43    "Punk"
    44    "Space"
    45    "Meditative"
    46    "Instrumental Pop"
    47    "Instrumental Rock"
    48    "Ethnic"
    49    "Gothic"
    50    "Darkwave"
    51    "Techno-Industrial"
    52    "Electronic"
    53    "Pop-Folk"
    54    "Eurodance"
    55    "Dream"
    56    "Southern Rock"
    57    "Comedy"
    58    "Cult"
    59    "Gangsta"
    60    "Top 40"
    61    "Christian Rap"
    62    "Pop/Funk"
    63    "Jungle"
    64    "Native American"
    65    "Cabaret"
    66    "New Wave"
    67    "Psychadelic"
    68    "Rave"
    69    "Showtunes"
    70    "Trailer"
    71    "Lo-Fi"
    72    "Tribal"
    73    "Acid Punk"
    74    "Acid Jazz"
    75    "Polka"
    76    "Retro"
    77    "Musical"
    78    "Rock & Roll"
    79    "Hard Rock"
    80    "Unknown"

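    - frame length (in bytes; bitrate in kbps, sampling_freq in Hz, integer division) :
    . layer1 (mpeg1.0 and mpeg2.0) :
    ((12000*bitrate)/sampling_freq + padding) * 4
    . layer2 (mpeg1.0 and mpeg2.0) and layer3 (mpeg1.0) :
    (144000*bitrate)/sampling_freq + padding
    . layer3 (mpeg2.0) :
    (72000*bitrate)/sampling_freq + padding

    The constants follow from the number of samples per frame: 384 for layer1, 1152 for layer2 and for mpeg1.0 layer3, and 576 for mpeg2.0 layer3.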
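
    The frame length rules can be folded into one function, sketched below. The function name frame_length and its string version argument are hypothetical. The constants are derived from the samples per frame (384 for Layer I, 1152 for Layer II and for MPEG 1 Layer III, 576 for MPEG 2 Layer III), so Layers I and II use the same constant for mpeg1.0 and mpeg2.0; Layer I padding is counted as one 4-byte slot.

```python
def frame_length(version: str, layer: int, bitrate_kbps: int,
                 sampling_freq: int, padding: int) -> int:
    """Frame length in bytes; version is "1" or "2", layer is 1, 2 or 3."""
    if layer == 1:
        # 384 samples per frame; Layer I slots are 4 bytes wide.
        return (12000 * bitrate_kbps // sampling_freq + padding) * 4
    if layer == 2 or version == "1":
        # 1152 samples per frame.
        return 144000 * bitrate_kbps // sampling_freq + padding
    # MPEG 2 Layer III: 576 samples per frame.
    return 72000 * bitrate_kbps // sampling_freq + padding


print(frame_length("1", 3, 128, 44100, 0))  # 417
print(frame_length("2", 3, 64, 22050, 0))   # 208
```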


    From:http://ustcers.com/blogs/devzhao/articles/11123.aspx


    Posted: 2007/3/9 9:35:00
     