uploaded 6 months ago
Espressif audio encoder and decoder

# ESP_AUDIO_CODEC

Espressif Audio Codec (ESP_AUDIO_CODEC) is the official audio encoding and decoding module developed by Espressif Systems for its SoCs.

The ESP Audio Encoder provides a common encoder interface through which you can register multiple encoders, such as AAC, AMR-NB, AMR-WB, ADPCM, G711A, G711U, PCM, OPUS, and ALAC. You can create one or more encoder instances through this interface, and the instances can encode simultaneously. You can also call a specific encoder's API directly to reduce call depth.
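
The register-then-dispatch pattern described above can be sketched in plain C. This is only an illustration of the idea: every name here (`demo_enc_register`, `demo_enc_process`, `demo_pcm_encode`) is hypothetical and is not the `esp_audio_enc` API, whose real signatures are in the header files of this component.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names throughout: this shows the pattern only and is
 * NOT the esp_audio_enc API. */
typedef int (*demo_encode_fn)(const uint8_t *in, size_t in_len,
                              uint8_t *out, size_t out_cap);

typedef struct {
    uint32_t       type;   /* codec identifier chosen by the caller */
    demo_encode_fn encode; /* codec-specific entry point */
} demo_enc_entry_t;

#define DEMO_MAX_ENCODERS 8
static demo_enc_entry_t s_registry[DEMO_MAX_ENCODERS];
static size_t s_count;

/* Register a codec, or overwrite the entry if the type already exists. */
int demo_enc_register(uint32_t type, demo_encode_fn fn)
{
    if (fn == NULL) {
        return -1;
    }
    for (size_t i = 0; i < s_count; i++) {
        if (s_registry[i].type == type) {
            s_registry[i].encode = fn;
            return 0;
        }
    }
    if (s_count == DEMO_MAX_ENCODERS) {
        return -1;
    }
    s_registry[s_count].type = type;
    s_registry[s_count].encode = fn;
    s_count++;
    return 0;
}

/* Common entry point: costs one lookup and one indirect call per frame. */
int demo_enc_process(uint32_t type, const uint8_t *in, size_t in_len,
                     uint8_t *out, size_t out_cap)
{
    for (size_t i = 0; i < s_count; i++) {
        if (s_registry[i].type == type) {
            return s_registry[i].encode(in, in_len, out, out_cap);
        }
    }
    return -1; /* type not registered */
}

/* Stand-in "PCM" encoder: copies input to output unchanged. */
int demo_pcm_encode(const uint8_t *in, size_t in_len,
                    uint8_t *out, size_t out_cap)
{
    if (in_len > out_cap) {
        return -1;
    }
    memcpy(out, in, in_len);
    return (int)in_len;
}
```

Calling `demo_pcm_encode` directly skips the lookup in `demo_enc_process`, which is the kind of call-depth saving the direct encoder APIs offer.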

The ESP Audio Decoder provides a common decoder interface through which you can register multiple decoders, such as AAC, MP3, AMR-NB, AMR-WB, ADPCM, G711A, G711U, VORBIS, OPUS, and ALAC. You can create one or more decoder instances through this interface, and the instances can decode simultaneously. You can also call a specific decoder's API directly to reduce call depth. The ESP Audio Decoder only processes audio frame data, which means the input must be aligned to frame boundaries.

To spare you the effort of parsing the stream and locating audio frames, the ESP Audio Simple Decoder is provided. For common audio containers, it uses a parser to gather input into complete audio frames and then decodes each frame with the ESP Audio Decoder, so you can feed in input data of any length. The supported audio containers include AAC, MP3, WAV, FLAC, AMRNB, AMRWB, and M4A.
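
To illustrate what "gathering frames from input of any length" involves, here is a minimal sketch of a frame finder for AAC in ADTS framing. It is not the Simple Decoder's parser: the names (`demo_parser_t`, `demo_parser_feed`, `demo_parser_next_frame`) and the buffer size are assumptions made for this example, and a production parser would also validate that consecutive frames agree before trusting a sync word.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BUF_SIZE 16384 /* assumed: larger than the 8191-byte ADTS maximum */

typedef struct {
    uint8_t buf[DEMO_BUF_SIZE]; /* bytes received but not yet consumed */
    size_t  len;
} demo_parser_t;

/* Append a chunk of any length. Returns 0, or -1 if it does not fit. */
int demo_parser_feed(demo_parser_t *p, const uint8_t *data, size_t n)
{
    if (n > DEMO_BUF_SIZE - p->len) {
        return -1;
    }
    memcpy(p->buf + p->len, data, n);
    p->len += n;
    return 0;
}

/* Try to extract one complete ADTS frame into `frame`.
 * Returns the frame length, 0 if more input is needed,
 * or -1 if `frame_cap` is too small for the frame. */
int demo_parser_next_frame(demo_parser_t *p, uint8_t *frame, size_t frame_cap)
{
    size_t pos = 0;
    while (pos + 7 <= p->len) {
        const uint8_t *h = p->buf + pos;
        /* 12-bit sync word 0xFFF with the 2-bit layer field equal to 0 */
        if (h[0] == 0xFF && (h[1] & 0xF6) == 0xF0) {
            /* 13-bit frame length, header included */
            size_t flen = ((size_t)(h[3] & 0x03) << 11) |
                          ((size_t)h[4] << 3) | (size_t)(h[5] >> 5);
            if (flen >= 7) {
                if (pos + flen > p->len) {
                    break; /* header found, wait for the rest of the frame */
                }
                if (flen > frame_cap) {
                    return -1;
                }
                memcpy(frame, h, flen);
                memmove(p->buf, h + flen, p->len - pos - flen);
                p->len -= pos + flen;
                return (int)flen;
            }
        }
        pos++;
    }
    /* Drop bytes that cannot start a frame and keep the unparsed tail. */
    memmove(p->buf, p->buf + pos, p->len - pos);
    p->len -= pos;
    return 0;
}
```

The caller feeds chunks as they arrive and calls `demo_parser_next_frame` in a loop until it returns 0; each returned frame is then frame-aligned input suitable for a frame-based decoder.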

The licenses of the third-party copyrights are recorded in [Copyrights and Licenses](http://docs.espressif.com/projects/esp-adf/en/latest/COPYRIGHT.html).

# Features

The ESP Audio Codec supports the following features:   

## Encoder   

* Supports encoding to AAC, AMR-NB, AMR-WB, ADPCM, G711A, G711U, PCM, OPUS, ALAC, etc.
* Supports operating all encoders through a common API; see [esp_audio_enc.h](include/encoder/esp_audio_enc.h)
* Supports adding a customized encoder through `esp_audio_enc_register`, or overriding a default encoder
* Supports registering all supported encoders through `esp_audio_enc_register_default` and managing them through menuconfig

Details of the supported encoders are as follows:

AAC     
- AAC low complexity profile encode (AAC-LC)
- Encoding sample rates (Hz): 96000, 88200, 64000, 48000, 44100, 32000, 24000, 22050, 16000, 12000, 11025, 8000    
- Encoding channel num: mono, dual     
- Encoding bit per sample: 16 bits    
- Constant bitrate encoding from 12 Kbps to 160 Kbps    
- Choosing whether to write ADTS header or not   
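
For readers unfamiliar with the ADTS option, the following sketch shows what a 7-byte ADTS header (without CRC) for an AAC-LC frame looks like. It is written from the ADTS header layout, not taken from this component; the function names are hypothetical, and the rate table covers only the sample rates listed above.

```c
#include <stddef.h>
#include <stdint.h>

/* ADTS sampling-frequency indices 0..11 for the rates listed above. */
static const uint32_t k_adts_rates[] = {
    96000, 88200, 64000, 48000, 44100, 32000,
    24000, 22050, 16000, 12000, 11025, 8000
};

/* Returns the ADTS sampling-frequency index, or -1 if not in the table. */
int demo_adts_rate_index(uint32_t sample_rate)
{
    for (size_t i = 0; i < sizeof(k_adts_rates) / sizeof(k_adts_rates[0]); i++) {
        if (k_adts_rates[i] == sample_rate) {
            return (int)i;
        }
    }
    return -1;
}

/* Write a 7-byte ADTS header (no CRC) for one AAC-LC frame.
 * payload_len is the size of the raw AAC frame that follows the header.
 * Returns 0 on success, -1 on unsupported parameters. */
int demo_adts_write_header(uint8_t hdr[7], uint32_t sample_rate,
                           uint8_t channels, size_t payload_len)
{
    int idx = demo_adts_rate_index(sample_rate);
    size_t frame_len = payload_len + 7; /* length field includes the header */
    if (idx < 0 || channels < 1 || channels > 2 || frame_len > 0x1FFF) {
        return -1;
    }
    hdr[0] = 0xFF;                      /* sync word, high 8 bits */
    hdr[1] = 0xF1;                      /* sync low 4 bits, MPEG-4, layer 0, no CRC */
    hdr[2] = (uint8_t)((1u << 6) |      /* profile field 1 = AAC-LC */
                       ((unsigned)idx << 2) |
                       (unsigned)(channels >> 2));
    hdr[3] = (uint8_t)(((channels & 3u) << 6) | (frame_len >> 11));
    hdr[4] = (uint8_t)((frame_len >> 3) & 0xFF);
    hdr[5] = (uint8_t)(((frame_len & 7u) << 5) | 0x1F); /* buffer fullness: VBR marker */
    hdr[6] = 0xFC;                      /* buffer fullness low bits, 1 raw block */
    return 0;
}
```

With ADTS enabled, each encoded frame carries its own sample rate, channel count, and length, so a stream can be decoded from any frame boundary. Without it, the consumer has to receive that configuration some other way.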

AMR       
- Encoding narrow band (NB) and wide band (WB)   
- AMRNB encoding at the sampling rate of 8 kHz       
- AMRWB encoding at the sampling rate of 16 kHz     
- Encoding channel num: mono    
- Encoding bit per sample: 16 bits    
- AMRNB encoding bitrate (Kbps): 4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2    
- AMRWB encoding bitrate (Kbps): 6.6, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85      
- Discontinuous transmission (DTX)     
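
The bitrates above map to fixed frame sizes because AMR always encodes 20 ms of audio per frame. The sketch below computes the PCM input and encoded output size per frame. It assumes the common AMR storage layout in which each frame is a 1-byte frame header followed by the speech bits padded to a whole byte; whether this component emits exactly that layout is not stated here, so treat the byte counts as an estimate. The function names are hypothetical.

```c
#include <stdint.h>

/* Samples in one 20 ms AMR frame: 160 at 8 kHz (NB), 320 at 16 kHz (WB). */
uint32_t demo_amr_samples_per_frame(uint32_t sample_rate_hz)
{
    return sample_rate_hz * 20u / 1000u;
}

/* Encoded bytes for one 20 ms frame, assuming a 1-byte frame header
 * followed by the speech bits rounded up to a whole byte. */
uint32_t demo_amr_frame_bytes(uint32_t bitrate_bps)
{
    uint32_t bits = bitrate_bps * 20u / 1000u; /* speech bits per frame */
    return (bits + 7u) / 8u + 1u;
}
```

For example, AMR-NB at 12.2 Kbps carries 244 speech bits per frame, which gives 31 payload bytes plus the header byte.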

ADPCM   
- Encoding sample rates (Hz): all    
- Encoding channel num: mono, dual    
- Encoding bit per sample: 16 bits    

G711    
- Encoding A-LAW and U-LAW      
- Encoding sample rates (Hz): all    
- Encoding channel num: all    
- Encoding bit per sample: 16 bits    
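
G.711 is stateless and compresses each 16-bit sample to 8 bits independently, which is why it works at any sample rate and channel count. As an illustration, here is the widely used reference-style U-LAW conversion for one sample. It is a sketch of the technique, not this component's implementation, and `demo_linear_to_ulaw` is a hypothetical name.

```c
#include <stdint.h>

/* Convert one 16-bit linear PCM sample to an 8-bit U-LAW code. */
uint8_t demo_linear_to_ulaw(int16_t pcm)
{
    const int bias = 0x84;  /* shifts the segment boundaries */
    const int clip = 32635; /* largest magnitude that fits after biasing */
    int sample = pcm;
    int sign = 0;

    if (sample < 0) {
        sign = 0x80;
        sample = -sample; /* safe: computed in int, not int16_t */
    }
    if (sample > clip) {
        sample = clip;
    }
    sample += bias;

    /* Segment number = position of the highest set bit, counted from bit 7. */
    int exponent = 7;
    for (int mask = 0x4000; (sample & mask) == 0 && exponent > 0; mask >>= 1) {
        exponent--;
    }
    int mantissa = (sample >> (exponent + 3)) & 0x0F;

    /* U-LAW codes are transmitted bit-inverted. */
    return (uint8_t)~(sign | (exponent << 4) | mantissa);
}
```

Encoding a buffer is a loop over samples, so the output is half the size of the 16-bit input.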

OPUS    
- Encoding sample rates (Hz): 8000, 12000, 16000, 24000, 48000    
- Encoding channel num: mono, dual    
- Encoding bit per sample: 16 bits    
- Constant bitrate encoding from 20Kbps to 510Kbps      
- Encoding frame duration (ms): 2.5, 5, 10, 20, 40, 60       
- Application mode for VoIP and music       
- Encoding complexity adjustment, from 0 to 10      
- Inband forward error correction (FEC)     
- Discontinuous transmission (DTX)
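
The frame duration and sample rate together determine how much PCM must be supplied per encode call. The helpers below sketch that arithmetic. The names are hypothetical, durations are passed in tenths of a millisecond so that 2.5 ms stays an integer, and the CBR figure is the nominal bitrate times the duration, not a value reported by the encoder.

```c
#include <stdint.h>

/* Samples per channel in one frame. duration_x10_ms is the frame
 * duration in tenths of a millisecond (25 = 2.5 ms, 200 = 20 ms). */
uint32_t demo_opus_samples_per_frame(uint32_t sample_rate_hz,
                                     uint32_t duration_x10_ms)
{
    return sample_rate_hz * duration_x10_ms / 10000u;
}

/* Bytes of 16-bit interleaved PCM needed for one frame. */
uint32_t demo_opus_pcm_bytes_per_frame(uint32_t sample_rate_hz,
                                       uint32_t duration_x10_ms,
                                       uint32_t channels)
{
    return demo_opus_samples_per_frame(sample_rate_hz, duration_x10_ms)
           * channels * 2u;
}

/* Nominal encoded bytes per frame at a constant bitrate. */
uint32_t demo_opus_cbr_bytes_per_frame(uint32_t bitrate_bps,
                                       uint32_t duration_x10_ms)
{
    return bitrate_bps * duration_x10_ms / 80000u; /* bits/s * s / 8 */
}
```

For example, 48000 Hz at 20 ms gives 960 samples per channel, and 64 Kbps at 20 ms gives a nominal 160 bytes per frame.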

ALAC    
- Encoding sample rates (Hz): 8000, 12000, 16000, 24000, 48000    
- Encoding channel num: mono, dual    
- Encoding bit per sample: 16 bits  
  
## Decoder   

* Supports decoding of AAC, MP3, AMR-NB, AMR-WB, ADPCM, G711A, G711U, PCM, OPUS, VORBIS, ALAC, etc.
* Supports operating all decoders through a common API; see [esp_audio_dec.h](include/decoder/esp_audio_dec.h)
* Supports adding a customized decoder through `esp_audio_dec_register`, or overriding a default decoder
* Supports registering all supported decoders through `esp_audio_dec_register_default` and managing them through menuconfig

Details of the supported decoders are as follows:

| Codec   | Notes                                            |
| --      | --                                               |
| AAC     | Supports AAC and AAC-Plus (mono/dual channel only) |
| MP3     |                                                  |
| AMRNB   | Supports 8 kHz sample rate only                  |
| AMRWB   | Supports 16 kHz sample rate only                 |
| G711A   |                                                  |
| G711U   |                                                  |
| ADPCM   | Supports IMA-ADPCM mono channel only             |
| FLAC    |                                                  |
| OPUS    | Also supports the self-delimited format          |
| VORBIS  | User must provide the common header information  |
| ALAC    | User must provide the magic cookie information   |



## Simple Decoder   

* Supports audio frame finding and decoding
* Supports common parsers; users can add customized parsers that follow the parser rules
* Supports customized simple decoders to handle new file formats
* Supports customized parser and decoder pairs, for example the default parser with a customized decoder
* Supports streaming decode only; seeking is not supported

Details of the supported audio formats are as follows:

| File Format | Notes                                                     |
| --          | --                                                        |
| AAC         |                                                           |
| MP3         |                                                           |
| AMRNB       |                                                           |
| AMRWB       |                                                           |
| FLAC        |                                                           |
| ADPCM       | Supports IMA-ADPCM only                                   |
| WAV         | Supports G711A, G711U, PCM, ADPCM                         |
| M4A         | Supports MP3, AAC, ALAC <br> Supports mdat after moov only |


# Performance

The following results were measured on an ESP32-S3R8 using internal RAM.

## Encoder 

AAC     
| Sample Rate (Hz)    | Memory (KB) | CPU loading (%)|
|       --            |  --         |     --         |  
|       8000          |  52         |    3.5         | 
|       11025         |  52         |    4.9         | 
|       12000         |  52         |    5.6         | 
|       16000         |  52         |    6.0         | 
|       22050         |  52         |    8.1         | 
|       24000         |  52         |    8.2         | 
|       32000         |  52         |    12.1        | 
|       44100         |  52         |    15.7        | 
|       48000         |  52         |    16.4        | 
|       64000         |  52         |    20.2        | 
|       88200         |  52         |    25.9        | 
|       96000         |  52         |    27.7        |      

Note:       
    The CPU loading values in the table pertain to the mono channel, while the CPU loading for the dual channel is approximately 1.6 times that of the mono channel.   

AMR     
| Type    | Memory (KB)  | CPU loading (%) |
|   --    |  --          |     --          |  
|  AMR-NB |  3.4         |    24.8         | 
|  AMR-WB |  5.8         |    57.6         |     

Note:   
    1) The CPU loading in the table is an average number.       
    2) The CPU loading of AMR is related to the bitrate. The higher the bitrate is set, the higher the CPU loading will be.     

ADPCM     
| Channel | Memory (B)    | CPU loading (%) |
|   --    |  --           |     --          |  
|  mono   |  120          |    < 2          | 
|  dual   |  120          |    < 4          | 

G711     
| Type    | Memory (B)    | CPU loading (%) |
|   --    |  --           |     --          |  
|  G711-A |  40           |    < 4          | 
|  G711-U |  40           |    < 4          | 

Note:   
    The CPU loading in the table is for mono, and the CPU loading of dual is about 2 times that of mono.     

OPUS
| Sample Rate (Hz)     | Memory (KB) | CPU loading (%) |
|       --             |  --         |     --          |  
|       8000           |  43         |    15.9         | 
|       12000          |  43         |    16.7         | 
|       16000          |  43         |    16.8         | 
|       24000          |  43         |    17.8         | 
|       48000          |  43         |    19.9         | 

Note:   
    1) The data in the table was measured with the following configuration: mono channel, complexity 1, VoIP application mode, and a 20 ms frame duration.    
    2) The dual channel encoding consumes about 13 KB more memory compared to the mono channel.     
    3) The CPU loading for the dual channel is about 1.6 times that of the mono channel.     
    4) The chosen complexity level directly impacts CPU loading, with 0 being the lowest and 10 being the highest.
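
The notes above can be combined into a rough budget estimate. This sketch hard-codes the OPUS table and applies the stated dual-channel adjustments (about 13 KB more memory and about 1.6 times the CPU loading). The function name is hypothetical, and the result only holds for the measured configuration (complexity 1, VoIP mode, 20 ms frames).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t sample_rate_hz;
    double   cpu_mono_pct; /* mono CPU loading from the OPUS table */
} demo_opus_row_t;

static const demo_opus_row_t k_opus_rows[] = {
    { 8000, 15.9 }, { 12000, 16.7 }, { 16000, 16.8 },
    { 24000, 17.8 }, { 48000, 19.9 },
};

/* Estimate OPUS encoder memory (KB) and CPU loading (%) for 1 or 2 channels.
 * Returns 0 on success, -1 if the rate or channel count is not in the table. */
int demo_opus_estimate(uint32_t sample_rate_hz, int channels,
                       double *mem_kb, double *cpu_pct)
{
    if (channels < 1 || channels > 2) {
        return -1;
    }
    for (size_t i = 0; i < sizeof(k_opus_rows) / sizeof(k_opus_rows[0]); i++) {
        if (k_opus_rows[i].sample_rate_hz == sample_rate_hz) {
            /* Dual channel: about 13 KB more memory, about 1.6x CPU. */
            *mem_kb  = 43.0 + (channels == 2 ? 13.0 : 0.0);
            *cpu_pct = k_opus_rows[i].cpu_mono_pct * (channels == 2 ? 1.6 : 1.0);
            return 0;
        }
    }
    return -1;
}
```

For 48000 Hz dual channel this gives roughly 56 KB and about 32% CPU loading under the measured configuration.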

# ESP_AUDIO_CODEC Release and SoC Compatibility

The following table shows the support of ESP_AUDIO_CODEC for Espressif SoCs. The "&#10004;" means supported, and the "&#10006;" means not supported. 

|Chip         |         v1.0.0     |
|:-----------:|:------------------:|
|ESP32        |       &#10004;     |
|ESP32-S2     |       &#10004;     |
|ESP32-C3     |       &#10004;     |
|ESP32-C6     |       &#10004;     |
|ESP32-S3     |       &#10004;     |
|ESP32-P4     |       &#10004;     |

# Usage

## Encoder Usage
For sample usage, see [audio_encoder_test.c](test_apps/audio_codec_test/main/audio_encoder_test.c).

## Decoder Usage
For sample usage, see [audio_decoder_test.c](test_apps/audio_codec_test/main/audio_decoder_test.c).

## Simple Decoder Usage
For sample usage, see [simple_decoder_test.c](test_apps/audio_codec_test/main/simple_decoder_test.c).

# Change log

## Version 1.0.0
- Added ESP Audio Encoder

## Version 2.0.0
- Added encoder registration API; removed `esp_audio_enc_install` and `esp_audio_enc_uninstall`
- Added ALAC encoder
- Added `esp_audio_enc_register_default` to register all encoders
- Added ESP Audio Decoder
- Added ESP Audio Simple Decoder

Supports all targets

License: Custom

To add this component to your project, run:

idf.py add-dependency "jason-mao/esp_audio_codec^0.0.4"
