b0e59497733098d2db7bfa37253048d8.ppt
- Number of slides: 31
Leveraging Wideband Codecs for VoIP Development
Laurent Amar, President, VoiceAge Corporation
Contents
- Speech Communication/Coding Basics
- Wideband Speech Description and Applications
- Wideband Speech Codec Standards
- Real-World Wideband VoIP Deployment
- Wideband Momentum
- What’s Next & Wrap Up
Speech Signal Basics
Understanding Speech Communication
Human Physiology and Perception are Key
- Encode and exchange primarily the speech signal information that is important for human perception
- Use human speech production and comprehension parameters to reduce bit rate and enhance communication quality
Speech Coding Attributes
- Bit rate
  - As low as possible
- Delay
  - As little as possible
- Quality
  - As high as possible
- Complexity
  - As algorithmically simple as possible to constrain platform processing and memory requirements
- Robustness
  - Effective operation under background noise and channel impairment conditions
- Standards compliance
  - Open, tested and interoperable solutions
Note: each target is set as required by specific applications. It is difficult to attain all of these often divergent objectives at the same time.
Speech Synthesis Model Used in CELP (Code Excited Linear Prediction) Speech Coding
A very successful speech compression algorithm is based on Algebraic CELP: ACELP®
Speech production model: 1 = air from lungs; 2 = vocal cords (periodicity); 3 = vocal tract (mouth + lips)
[Figure: synthesis chain. c(n) innovative excitation (1) -> long-term prediction (2) -> v(n) -> short-term prediction (3) -> ŝ(n) synthesized speech]
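The synthesis chain on this slide can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the ACELP decoder from any standard: the function name `celp_synthesize` and its parameters are hypothetical, the pitch lag is an integer, all gains are fixed for the whole block, and the filter memories start at zero. Real codecs update these parameters every subframe.

```python
from typing import List, Sequence, Tuple


def celp_synthesize(
    innovation: Sequence[float],   # c(n): innovative (fixed codebook) excitation
    code_gain: float,              # gain applied to c(n)
    pitch_lag: int,                # long-term predictor delay, in samples
    pitch_gain: float,             # long-term predictor gain
    lpc: Sequence[float],          # a_1..a_p of A(z) = 1 + sum_k a_k z^-k
) -> Tuple[List[float], List[float]]:
    """Return (excitation, synthesized speech) for one block of samples."""
    excitation: List[float] = []
    speech: List[float] = []
    for n, c in enumerate(innovation):
        # Long-term prediction: reuse the excitation from one pitch period ago
        # (models the periodicity contributed by the vocal cords).
        past = excitation[n - pitch_lag] if n >= pitch_lag else 0.0
        u = code_gain * c + pitch_gain * past
        excitation.append(u)

        # Short-term prediction: all-pole filter 1/A(z)
        # (models the spectral envelope shaped by the vocal tract).
        s = u
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                s -= a * speech[n - k]
        speech.append(s)
    return excitation, speech
```

Feeding a single pulse through the long-term predictor alone produces a decaying pulse train at the pitch period, which is the periodicity the slide attributes to stage 2.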
CELP Decoder Principles
ACELP at the heart (overview)
- Ask Redwan – glean from his presentation
CELP Encoder Principles
More on ACELP implementation
- Ask Redwan or take block diagrams from the old poster.
  - The excitation parameters (codebook indices and gains) are determined by minimizing the perceptually weighted error between original and synthesized speech.
  - Analysis-by-synthesis, where a ‘local decoder’ (the orange part) exists inside the encoder.
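The analysis-by-synthesis idea on this slide can be illustrated with a toy codebook search. The sketch below is an assumption-laden simplification: the names `synthesize` and `search_codebook` are hypothetical, the codebook is a plain list of candidate vectors rather than an algebraic codebook, and an unweighted squared error stands in for the perceptually weighted error. Each candidate is run through the same synthesis filter the decoder would use (the "local decoder"), and the encoder keeps the index and gain that best match the target.

```python
from typing import List, Optional, Sequence, Tuple


def synthesize(excitation: Sequence[float], lpc: Sequence[float]) -> List[float]:
    """All-pole synthesis filter 1/A(z), A(z) = 1 + sum_k a_k z^-k, zero initial state."""
    out: List[float] = []
    for n, e in enumerate(excitation):
        s = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                s -= a * out[n - k]
        out.append(s)
    return out


def search_codebook(
    target: Sequence[float],
    codebook: Sequence[Sequence[float]],
    lpc: Sequence[float],
) -> Optional[Tuple[int, float, float]]:
    """Return (index, gain, squared_error) of the best candidate, or None."""
    best: Optional[Tuple[int, float, float]] = None
    for index, code in enumerate(codebook):
        y = synthesize(code, lpc)              # local decoder output for this candidate
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue                           # an all-zero candidate cannot match anything
        corr = sum(t * v for t, v in zip(target, y))
        gain = corr / energy                   # least-squares optimal gain for this candidate
        error = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best
```

Only the winning index and gain would need to be transmitted, since the decoder holds the same codebook and filter and can regenerate the signal from them.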
International Standards Using ACELP
AMR Standard Codec Family at a Glance
Built on a solid, market-proven ACELP® technology foundation
A Complete Suite of Low Bit Rate Speech and Audio Coding Solutions
[Figure: AMR codec family diagram; recoverable label: G.722.2]
What is Wideband Telephony?
An Emerging Opportunity to Deliver Vastly Improved Speech Quality
- Substantially increases transmitted speech information
- Double the bandwidth
- Enables digital end-to-end packet-based telephony services to deliver much better speech quality than traditional PSTN circuit-switched telephony
- VoIP quality differentiator
Hearing is believing! Visit VoiceAge at booth #305 for a demo. Also visit the listening room at www.voiceage.com to hear samples.
Why Wideband VoIP Telephony Now?
Enabling Technologies and Consumer Perceptions are Converging
Wideband Telephony Benefits:
- Improved presence, naturalness and intelligibility
  - Reduces listener fatigue
  - Improved hands-free/speakerphone sound quality
- Improves speaker and speech recognition
- High-quality low bit rate wideband codecs
  - e.g., G.722.2/AMR-WB & VMR-WB at rates ranging from 7–24 kbps
- Rising user awareness of enhanced sound quality
  - Wideband teleconferencing
  - Wideband enterprise IP telephony
  - Wireless/VoIP multimedia services
  - Driving up expectations!
- Interoperable wideband codec solutions over end-to-end digital networks help pave the way for fixed/mobile convergence
Typical Speech Signal Acoustics
Wideband telephony covers much more speech signal information
- Improved voice quality and intelligibility (e.g., s & f differentiation)
- Improved speech naturalness, presence and comfort
[Figure: example utterance “Everyone looked extremely confused about the news”]
Wideband Telephony Applications
Scope is much wider than VoIP Telephony
- VoIP hi-fi telephony (G.722.2)
- Cellular wireless hi-fi telephony (AMR-WB & VMR-WB)
- Wi-Fi VoIP telephony
- Converged wireless/wire-line telephony
- Multi-point audio and video teleconferencing
- Video telephony audio coding
- Call center conversation recording and archiving
- Speech and speaker recognition-based systems
- Digital radio broadcasting and field reporting
- Hi-fi ringtones
The Standard Solution Advantage
Interoperable, Open and Fully Tested
- Open, collaborative and competitive process
- Requirements specifically address target applications
- Published algorithms and source code
  - Permits wider and more effective scrutiny
- Rigorous comparative testing under diverse conditions
  - Background noise types and levels
  - Spoken languages
  - Speaker types
  - Various network impairments
Ensures that the best technologies are chosen
Evolution of Wideband Standards
A steady progression of high-quality speech coding technologies
Narrowband
- ITU-T: 1972 G.711 (64 kb/s); 1984 G.726 (32 kb/s); 1992 G.728 (16 kb/s); 1995 G.729 (6.4, 8, 11.8 kb/s)
- 3GPP: 1987 FR (13 kb/s); 1994 HR (5.6 kb/s); 1995 EFR (12.2 kb/s); 1999 AMR-NB (4.75-12.2 kb/s)
- 3GPP2: 1993 IS-96A (Rate-Set I); 1995 IS-96A (Rate-Set II); 1997 EVRC (Rate-Set I); 2000 SMV (Rate-Set I)
Wideband
- ITU-T: 1988 G.722 (48, 56, 64 kb/s); 1999 G.722.1 (24, 32 kb/s)
Interoperable Wideband
- 2001-2002 3GPP/ITU-T AMR-WB/G.722.2 (6.6-23.85 kb/s)
- 2004 3GPP2 VMR-WB (Source Controlled), Rate-Set I & II
G.722.2/AMR-WB and VMR-WB Standards
- 3GPP 1999: TS 26.111 recommends AMR-WB for (3G-324H) multimedia telephone handsets
- 3GPP 2001: TS 26.190 defines the AMR-WB codec
- ITU-T 2002: G.722.2 recommended for wideband speech
- 3GPP2 (2004): C.S0052, “Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB), Service Options 62 and 63 for Spread Spectrum Systems,” specifies the VMR-WB codec for cdma2000® systems
- 3GPP 2005: TS. 235 requires packet-switched multimedia terminals at 16 kHz and PoC terminals to support AMR-WB
- OMA 2005: Push-to-Talk User Plane states the PoC server must support AMR and AMR-WB media parameters
Widespread success in international standards competitions
G.722.2 Subjective Testing Results
AMR-WB Characterization Test: Clean Condition Test (English Language)
[Chart: MOS (scale 1.0 to 5.0) for No Tandem and Self-Tandem conditions at -26 dBov; codecs shown: G.722 @ 64 kbps, G.722 @ 48 kbps, G.722.2 @ 8.85, 12.65, 18.25 and 23.05 kbps]
G.722.2/AMR-WB Delivers Excellent Wideband Speech Quality Even at Low Bit Rates (e.g., MOS at 8.85 kbps exceeds G.722 at 48 kbps)
Enabling Wideband VoIP Telephony
The Key Underpinnings
Wideband speech coding technology is ready – what else is needed for mass adoption?
- Wideband capable terminal device speakers and microphones
- More and more network elements and end-devices equipped with compatible wideband codecs
  - Standard wideband codecs ensure smooth interoperability
  - Software-driven terminals enable downloading of the latest enhancements to standard wideband codecs
  - Relevant application servers and network infrastructure gear need to support the necessary wideband standard codecs
- Fully digital packet-based VoIP networks that are readily configurable to support wideband telephony
Implementation Considerations
- Interoperability
  - Important to eliminate or reduce transcoding
    - Transcoding adds cost, delay and jitter
    - Degrades speech quality
- Complexity
  - Tradeoff between bit rate and complexity/memory
  - An important design consideration for handheld devices
  - Miniaturization trends, Moore’s law and other innovations are still going strong though
- Quality of Service
  - Robust real-world performance, need to consider:
    - Packet loss – Counter with concealment and FEC methods
    - Background noise – Mitigate with noise suppression
    - Delay and jitter – Minimize delay and manage jitter
- Total bit rate available
  - Codec & system/channel coding both contribute
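The "total bit rate" point can be made concrete for VoIP with some simple arithmetic. The sketch below is an illustration under stated assumptions, not a figure from the slides: it treats uncompressed IPv4 + UDP + RTP headers (20 + 8 + 12 bytes) as the system-side contribution, sends one 20 ms frame per packet, and ignores RTP payload-format headers, link-layer framing and silence suppression. The function name `voip_bit_rate_bps` is hypothetical.

```python
# Assumed per-packet overhead: uncompressed IPv4 (20) + UDP (8) + RTP (12) bytes.
IPV4_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


def voip_bit_rate_bps(
    codec_bps: int,
    frame_ms: int = 20,
    header_bytes: int = IPV4_UDP_RTP_HEADER_BYTES,
) -> int:
    """Bit rate on the wire when each packet carries exactly one codec frame."""
    frame_bits = codec_bps * frame_ms // 1000   # codec bits produced per frame
    payload_bytes = -(-frame_bits // 8)         # round up to whole bytes
    packet_bits = (payload_bytes + header_bytes) * 8
    return packet_bits * 1000 // frame_ms       # packets per second times bits per packet


if __name__ == "__main__":
    for name, rate in [("G.722.2 @ 12.65 kbps", 12650), ("64 kbps codec", 64000)]:
        print(f"{name}: {voip_bit_rate_bps(rate) / 1000:.1f} kbps on the wire")
```

Under these assumptions, G.722.2 at 12.65 kbps occupies 28.8 kbps on the wire and a 64 kbps codec occupies 80 kbps, so at low codec rates the packet headers account for more than half of the total.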
Enabling Transcoder-Free Interoperability
Enabling Seamless Communication across Wireline, Wireless and Wi-Fi Networks
Growing Real-World Wideband Deployment Momentum
- Teleconferencing system vendors
  - Wideband telephony deployment pioneers – have a very compelling wideband speech application
- Hi-fi ringtones (True Tones)
  - Increasing deployment in newer mobile phones from major vendors
- Enterprise IP phone systems
  - Campus LAN environments provide an ideal platform for rolling out wideband telephony
- Emerging wideband VoIP services for the masses
  - Provide an opportunity for service providers to differentiate their VoIP offering to the mass market
  - Broadband Internet access is quickly becoming the norm, helping VoIP become mainstream
  - Increasing availability of wideband speech capable devices
  - Softphone clients like Xten™’s eyeBeam™ are integrating wideband codecs (G.722.2)
Wideband in Enterprise VoIP
- Enterprises are deploying wideband VoIP telephony
  - Intra-site GbE/10GbE LANs widely deployed
    - Facilitate converged IT corporate data and VoIP voice communications over a common network infrastructure
    - Intra-site networks primed for VoIP with wideband
  - Improves communications effectiveness and productivity within a corporate network
    - Little or no additional cost needed
    - Improves mission-critical communications (e.g., hospitals)
  - Compression and robustness are important for cost-effective communications between sites over a WAN
    - Also significant when reaching out to mobile employees (either over cellular at remote sites or over a WLAN connection within a site or campus)
Wideband VoIP over Xten Softphones
Xten™ eyeBeam™ has readily implemented and demonstrated G.722.2 VoIP
- Enabling a higher quality conversation with the same/similar bandwidth as narrowband codecs
- Service providers can provide a higher value service for the same cost
- Supports interoperability between SIP and 3G cellular network devices without audio signal transcoding
- G.722.2 readily integrated and demonstrated on the eyeBeam
- Reduces the need for operators to purchase, operate and maintain additional equipment such as transcoders and wideband capable hard-phones
- Enables service providers to roll out VoIP services with superior voice quality
[Image: Xten™ eyeBeam™]
Contents
- Speech Communication/Coding Basics
- Wideband Speech Description and Applications
- Wideband Speech Codec Standards
- Wideband VoIP Implementation Considerations
- Real-World Wideband Momentum
- What’s Next & Wrap Up
Beyond Wideband Speech, what’s next?
- Teleconferencing solution pioneers are introducing new audio enhancements:
  - Ultra-wideband, which typically increases the transmitted speech bandwidth to 14–16 kHz
    - Further increases the richness of conversational voice quality
  - Stereo sound and spatial sound
    - Gives a better sense of speaker directionality for remote meetings
- Audio improvements also driven by multimedia services, such as:
  - On-line gaming, audiovisual telephony and rich messaging
- Emerging hybrid speech and stereo audio codecs effectively meet these emerging needs with efficient use of channel capacity, e.g.:
  - The AMR-WB+ hi-fi audio compression codec (selected by the 3GPP for mobile multimedia services) encompasses essentially the full human audio spectrum with parametric stereo, even at low bit rates.
Summary
- Wideband speech is beginning to gain real-world momentum
  - The key enablers are widely available (end-user devices, end-to-end digital networks, interoperable standard WB codecs, …)
- User expectations for improved audio quality are rising
  - Video telephony, audiovisual conferencing, remote collaboration and other multimedia services are expected to be extremely popular for both business and residential use
- Once wideband speech communication is widely deployed and available, users will increasingly expect it as the norm
- The stage is set for widespread wideband VoIP – it is time for the main players (you, the developers) to make it happen
What are you waiting for? Go make it happen!
Abbreviations/Glossary
- 3GPP: Third Generation Partnership Project (standards body defining GSM evolution to 3G networks)
- 3GPP2: Third Generation Partnership Project 2 (standards body defining CDMA evolution to 3G)
- AMR: Adaptive Multi-Rate (standard narrowband speech codec for GSM and WCDMA networks)
- AMR-WB/G.722.2: Adaptive Multi-Rate Wideband (standard wideband speech codec for GSM and WCDMA networks and ITU-T (as G.722.2))
- AMR-WB+: Extended Adaptive Multi-Rate Wideband (standard wideband speech and hi-fi audio codec)
- CDMA: Code Division Multiple Access (technology behind the second most popular cellular networks)
- BTS: Base Transceiver Station
- BSS: Base Station System
- CNG: Comfort Noise Generation (a decoder feature that generates comfort noise to avoid listener annoyance when the encoder at the far end is not transmitting due to silence)
- GSM: Global System for Mobile (most widely deployed cellular mobile technology)
- ITU-T: International Telecommunication Union – Telecommunication Standardization Sector
- MOS: Mean Opinion Score (a subjective test methodology for evaluating speech quality)
- OMA: Open Mobile Alliance (an organization formed to facilitate the global user adoption of mobile data services)
- PoC: Push-to-talk over Cellular (walkie-talkie-like service over cellular networks)
- VAD: Voice Activity Detection (an encoder feature that detects when the user is speaking)
- VMR-WB: Variable Rate Multi-mode Wideband (standard wideband speech codec for CDMA2000®)
- WCDMA: Wideband CDMA (technology adopted by GSM networks for their evolution to 3G)
- wMOPS: weighted Million Operations Per Second (measure of codec complexity)