EE Times-Asia

Achieving better smartphone voice quality

Posted: 18 Sep 2013

Keywords: mobile phones, voice quality, inter-aural level difference, smartphones, microphone

Since the introduction of mobile phones, voice quality has not improved significantly. This is due primarily to network and device limitations: the sound frequency range used by mobile devices has historically been constrained by narrowband, circuit-switched networks. The result is lower voice quality than that of a face-to-face conversation.

In addition, mobile devices have been unable to adequately separate the user's voice from background noise, forcing users to tolerate noisy, poor-quality voice communication. After years of investing in mobile network bandwidth and connectivity, mobile network operators (MNOs) are now turning their attention to voice and audio quality as a way to differentiate their service offerings by improving user experience, satisfaction, and loyalty.

The transition from narrowband to wideband communications (e.g., HD Voice, VoLTE) yields networks capable of carrying signals of higher sound quality. As users become aware of these network and device improvements, we expect they will demand improved voice quality in the devices they depend upon, even when those devices are used on legacy narrowband networks.

Dedicated voice and audio processors are expanding rapidly as a new product category, not only in mobile handsets but also in market segments such as automobile infotainment systems, desktop PCs, digital cameras, digital televisions, headsets, and set-top boxes. IDC, an independent market research firm, estimates that voice and audio processor unit sales will grow from 63 million units in 2012 to over 1.6 billion units in 2015, representing a CAGR of 92%.

A variety of trends are driving demand for high-quality voice and audio solutions in mobile devices, including:

• Users requiring more freedom in how and where they communicate;
• Users expecting high-quality voice and audio from their mobile devices;
• Voice becoming a preferred interface for mobile device applications;
• Users increasingly relying on their mobile devices for far-field interaction, where the mobile device is held at a distance from the user, such as in speakerphone mode or video conferencing;
• Users' perception of the HD video experience being negatively impacted by poor-quality audio;
• OEMs continuing to expand functionality in mobile devices; and
• MNOs deploying wideband communications networks.

These trends, in turn, introduce challenges to delivering high-quality voice and audio in mobile devices, including:

• Providing high quality even in noisy environments;
• Working with the significant limitations on acoustics and signal processing imposed by the size, cost and power constraints of mobile devices; and
• Implementing voice and audio signal processing techniques that are scalable and adaptable to dynamic sound environments in a way traditional technology has not been able to provide.

Figure 1: Time and level differences between sounds arriving at both ears provide binaural cues.

Enabling technologies
A field of auditory research called computational auditory scene analysis (CASA) aims to mimic the ability of the human auditory system to separate sound sources, using conventional digital signal processing principles. The classic example is the 'cocktail party effect': a listener is able to home in on a particular conversation and separate it from the others. From the perspective of speech intelligibility, these other conversations are considered 'babble' noise, a standard type of 'distractor' used to test phones equipped with voice processing.

In a single-microphone system, monaural cues such as pitch and onset time can be used to separate sound sources. But humans have two ears for a reason: the additional binaural information lets the brain distinguish subtle differences in both the time of arrival, known as the inter-aural time difference (ITD), and the level of the audio signals arriving at each ear, known as the inter-aural level difference (ILD). The principle of binaural processing in CASA relies on interpreting ITD and ILD cues to separate sound sources.
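
To make the two binaural cues concrete, the following is a minimal sketch of how ITD and ILD might be estimated from a pair of microphone signals. It is an illustration only, not the method used in any particular voice processor. The function name estimate_itd_ild, the brute-force cross-correlation search, the RMS-based level comparison, and the synthetic test signal are all assumptions made for this example.

```python
import math
import random


def estimate_itd_ild(left, right, sample_rate, max_lag):
    """Return (itd_seconds, ild_db) for two equal-length channels.

    ITD > 0 means the sound reached the left channel first.
    ILD > 0 means the left channel is louder.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    # Brute-force cross-correlation over a small range of candidate lags.
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    itd = best_lag / sample_rate

    # ILD: ratio of the RMS levels of the two channels, in decibels.
    rms_left = math.sqrt(sum(x * x for x in left) / n)
    rms_right = math.sqrt(sum(x * x for x in right) / n)
    ild = 20.0 * math.log10(rms_left / rms_right)
    return itd, ild


# Synthetic test: a noise burst that reaches the right channel
# 4 samples later and at half the amplitude of the left channel.
rng = random.Random(0)
fs = 16000  # sample rate in Hz (assumed for this example)
source = [rng.uniform(-1.0, 1.0) for _ in range(512)]
delay = 4
left = source
right = [0.0] * delay + [0.5 * x for x in source[:-delay]]

itd, ild = estimate_itd_ild(left, right, fs, max_lag=16)
print(f"ITD = {itd * 1e3:.3f} ms, ILD = {ild:.2f} dB")
```

In this synthetic case the estimated ITD corresponds to the 4-sample delay (0.25 ms at 16 kHz) and the ILD comes out at roughly 6 dB, reflecting the halved amplitude. A practical system would typically compute such cues per frequency band and per short time frame rather than over a whole signal.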
