
Prospects for the development of speech recognition systems (excerpt from the study)

In lieu of an introduction

Market research on speech recognition is quite expensive to buy (from 2 to 10 thousand dollars and more). Not everyone can afford it, especially among developers. Translating a study in full is not possible either, since the license agreement imposes restrictions. Still, I would very much like to share the information with the interested public, as speech technology has more and more fans. So I decided to publish part of a digest of the international Technavio study, which we acquired some time ago - in our own free "Goblin-style" retelling, of course. I hope the information will be useful. Admittedly, many of the figures had to be dropped, except those already available online. Our version comes without graphs, tables and, unfortunately, SWOT analyses. Anyone who is seriously interested can always purchase the most recent research here.

The study mainly analyzes companies from North America and Europe; the Asian market is poorly represented. We will leave those details aside for now. What is really interesting is the description of the industry's trends and current characteristics, all the more so because it can be retold in various ways without losing the overall point. Without further ado, let us go through the most interesting points: where the speech recognition industry is heading and what awaits us in the near future (2012-2016), according to the researchers.


Voice recognition systems are computing systems that can identify a speaker within a general audio stream. This technology is related to speech recognition, which converts spoken words into digital text. The two technologies are used in parallel: on the one hand, to identify the voice of a specific user; on the other hand, to recognize voice commands through speech recognition. Voice recognition is used for biometric security, to identify the voice of a particular person. It has become very popular in mobile banking, which needs to authenticate users, and it is combined there with voice commands that help them make transactions.

The global speech recognition market is one of the fastest growing markets in the voice industry. Geographically, most of the growth comes from the Americas, followed by Europe, the Middle East and Africa (EMEA) and the Asia-Pacific region (APAC). By sector, most of the growth comes from health care, financial services and the public sector. However, other segments, such as telecommunications and transport, are expected to grow significantly in the next few years. The market is forecast to keep growing at an average annual rate of 22.07 percent over 2012-2016 (based on the growth figures of current companies).

Market growth drivers

The growth of the global voice recognition market depends on many factors. One of the main factors is the increase in demand for voice biometrics services. With the increasing complexity and frequency of security breaches, security remains one of the basic requirements for enterprises as well as government organizations. Voice biometrics, which are unique to each person, play a crucial role in establishing a person's identity, and demand for them is high. Another key factor for the market is the wider use of speaker identification for forensic purposes.

Some of the main growth drivers of the global speech recognition market are:
• Increased demand for voice biometrics
• Increased use of speaker identification for forensic purposes
• Demand for military speech recognition
• High demand for voice recognition in healthcare

Increased demand for voice biometrics

Initially, the word "biometrics" was found only in medical theory. However, growing security needs have led enterprises and government agencies to adopt biometric technologies, and their use is one of the key factors in the global speech recognition market. Voice recognition is used to authenticate a person, since each person's voice is different, which provides a high level of accuracy and security. Voice recognition is of great importance in financial institutions such as banks, as well as in health care enterprises. Currently, the speech recognition segment accounts for 3.5% of the global biometrics technology market, but this share is growing steadily. The low cost of biometric devices is also increasing demand from small and medium businesses.

Increased use of speaker identification for forensic purposes

The use of speaker identification technology for forensic purposes is one of the main driving forces in the global voice recognition market. Determining whether the voice of a person suspected of committing a crime matches a voice from forensic samples is a difficult process. This technology allows law enforcement agencies to identify criminals by one of the most distinctive characteristics of a person, the voice, and thereby offers a relatively high level of accuracy. Forensic experts compare the suspect's voice against the samples until the culprit is found. Recently, this technology has been used to help solve some criminal cases.

Demand for military speech recognition

Military departments in most countries maintain highly restricted areas that intruders must be kept out of. To ensure secrecy and security in these areas, the military uses voice recognition systems, which help military institutions detect unauthorized entry into a protected area. The system contains a database of the voices of military personnel and government officials who have access to the protected area. These people are identified by the voice recognition system, which prevents the admission of people whose voices are not in the system's database. In addition, the US Air Force uses voice commands to control aircraft. Military departments also use speech recognition and voice-to-text to communicate with citizens of other countries. For example, the US military is actively using speech recognition systems in its operations in Iraq and Afghanistan. Thus, there is a high demand for speech and voice recognition for military purposes.
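The database check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of any real military system: the names `cosine_similarity`, `identify_speaker` and `ENROLLED`, the three-number "voiceprints" and the 0.9 threshold are all hypothetical, and a real system would derive voiceprints from audio with a trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_speaker(sample, enrolled, threshold=0.9):
    """Return the best-matching enrolled name, or None if no match clears the threshold."""
    best_name, best_score = None, threshold
    for name, voiceprint in enrolled.items():
        score = cosine_similarity(sample, voiceprint)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical database of enrolled voiceprints (toy numbers, not real data).
ENROLLED = {
    "officer_a": [0.9, 0.1, 0.3],
    "officer_b": [0.2, 0.8, 0.5],
}

granted = identify_speaker([0.88, 0.12, 0.31], ENROLLED)  # close to officer_a
denied = identify_speaker([0.1, 0.1, 0.9], ENROLLED)      # resembles nobody enrolled
print(granted, denied)
```

A voice that does not come close enough to any enrolled voiceprint yields `None`, which corresponds to refusing entry.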

High demand for voice recognition in healthcare

Biometric technologies such as vascular recognition, voice recognition and retinal scanning are being widely adopted in the healthcare industry. Voice recognition is expected to become one of the main identification modes in medical institutions. Many US healthcare companies, in order to comply with the Health Insurance Portability and Accountability Act (HIPAA), also use biometric technologies such as voice recognition and fingerprint recognition to register patients more securely and efficiently, collect patient information and protect patient medical records. Clinical trial institutions are likewise introducing voice recognition to identify individuals recruited for clinical trials. Thus, voice biometrics is one of the main modes for client identification in the health sector in the Asia-Pacific region.

Market challenges

The impact of the four main trends and problems on the global recognition market is shown in the figure.

The impact of problems and trends is estimated based on the intensity and duration of their impact on the current market. Impact magnitude classification:
• Low - little or no impact on the market
• Medium - average level of market influence
• Moderately high - significant market impact
• High - very strong impact with a radical effect on market growth.

Despite the positive trends, the global voice recognition market continues to face some serious brakes on growth. One of the major problems is the difficulty of suppressing ambient noise. Although the speech recognition market has seen several technological advances, the inability to suppress ambient noise is still an obstacle to the acceptance of voice recognition applications. Another challenge for this market is the high cost of voice recognition applications.

Some of the main challenges facing the global voice recognition market are:
• Inability to suppress external noise
• High cost of voice recognition applications
• Problems with recognition accuracy
• Low security in speaker verification

Inability to suppress external noise

Despite technical advances in voice recognition, noise continues to be one of the major problems in the global voice recognition market. Voice biometrics is particularly sensitive compared to other types of biometrics: voice recognition, voice biometrics and speech recognition applications are all very sensitive to environmental noise. As a result, any noise interference reduces recognition accuracy and also disrupts the automated response to a voice command. The inability to suppress ambient noise is the only factor that prevents voice recognition systems from achieving high results and taking a large share of the global biometric technology market.

High cost of voice recognition applications

One of the main problems hindering the development of speech recognition technologies is the large investment required for development and implementation. Large-scale deployment of voice recognition technology in an enterprise is a time-consuming process that requires a huge investment. Budget cuts lead to limited testing of the technology, so any failure can cause large losses for the enterprise. Therefore, alternatives to voice recognition such as swipe cards and keypads are still actively used in many companies, especially small and medium businesses, because they are cost-effective. Thus, voice recognition applications require large financial investments, including the cost of system integration, additional equipment and other expenses.

Problems with recognition accuracy

In the global voice recognition market, low recognition accuracy is a common problem, even though voice recognition systems can now recognize different languages and determine voice authenticity. Since the system involves a complex process of matching spoken commands against a database, with integrated speech recognition and voice verification technology, even a minor error in any part of the process can lead to an incorrect result. Errors in speech recognition are one of the main limitations of voice recognition applications. However, some manufacturers have begun developing systems with a very low error rate in voice recognition. They have developed systems with less than 4% inaccurate results (for example, cases where voice biometrics fail to identify, and therefore reject, the voice of a person who does have access).

Low security in speaker verification

A high level of inaccuracy in speaker verification leads to a low level of security. Currently, voice recognition systems have a high percentage of inaccurate results. The higher the rate of wrong decisions, the higher the likelihood that, for example, a stranger will be granted entry. Since voice recognition systems are very sensitive and pick up everything, including throat problems, coughs, colds and voice changes due to illness, there is a high probability that an unauthorized person can gain access to a restricted area. The cause is the low level of security of voice-based person recognition.
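The two kinds of wrong decision discussed in this section, rejecting a person who does have access and admitting a stranger, are commonly measured as the false rejection rate (FRR) and the false acceptance rate (FAR). The sketch below is a toy illustration only: the function name `error_rates`, the score lists and the thresholds are assumptions invented for the example, not data from the Technavio study.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for a verifier that accepts any score >= threshold."""
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    frr = false_rejects / len(genuine_scores)
    far = false_accepts / len(impostor_scores)
    return far, frr


# Hypothetical match scores: higher means "sounds more like the enrolled user".
# The low genuine score stands for a legitimate user speaking with a cold.
genuine = [0.95, 0.91, 0.88, 0.97, 0.62]
impostor = [0.30, 0.45, 0.71, 0.20, 0.55]

for threshold in (0.5, 0.7, 0.9):
    far, frr = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold:.1f}  FAR={far:.0%}  FRR={frr:.0%}")
```

Raising the threshold lowers the chance of admitting a stranger but rejects more legitimate users, which is the trade-off behind the accuracy and security problems described above.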

Market trends

The effect of the problems facing the market is expected to be offset by the various trends that are emerging in it. One such trend is the increase in demand for speech recognition on mobile devices. Realizing the enormous potential of mobile devices, manufacturers in the global voice recognition market are developing innovative applications specifically for them. This is one of the future driving factors. The increasing demand for voice-authenticated mobile banking is another positive trend in the voice recognition market.

Some of the major trends in the global voice recognition market are:
• Increased demand for speech recognition on mobile devices
• Growing demand for voice authentication services for mobile banking
• Integration of voice verification and speech recognition
• Increasing mergers and acquisitions

Increased demand for speech recognition on mobile devices

The growing number of traffic regulations prohibiting the use of mobile devices while driving has increased the demand for speech recognition applications. Countries that have imposed strict restrictions include Australia, the Philippines, the United States, the United Kingdom, India and Chile. In more than 13 US states, despite the introduction of the Regulation on the Use of Mobile Devices, hands-free (speakerphone) use is still allowed while driving. Consequently, customers are increasingly choosing mobile devices equipped with voice recognition applications, which let them use the device without being distracted by the device itself. To meet the growing demand for speech recognition applications in mobile devices, manufacturers have stepped up research and development of voice command options for mobile devices. As a result, a large number of speech recognition applications have been built into mobile devices, for example managing a music playlist, reading out an address or a subscriber's name, voice SMS messages, etc.

Growing demand for voice authentication services for mobile banking

The need for stronger verification is leading to the widespread integration of voice authentication in mobile banking. In regions such as North America and Western Europe, a large number of banking customers use telephone banking. Many of these financial institutions rely on voice authentication of the user to decide whether to accept or reject mobile transactions. In addition, building voice authentication into mobile devices is cost-effective and at the same time provides a higher level of security. Thus, the trend towards integrating voice authentication into mobile banking will continue to grow over the coming years. Indeed, telephone banking institutions work with voice authentication solution providers and incorporate voice biometrics, which is a key competitive advantage.

Integration of voice verification and speech recognition

Some manufacturers are working towards the integration of voice verification and speech recognition technology. Instead of offering voice verification as a separate product, manufacturers offer voice verification and speech recognition functionality together. Voice verification helps determine who is speaking, while speech recognition determines what is being said. Most manufacturers have launched, or are in the process of launching, speech recognition applications that integrate the two technologies described above.

Increasing mergers and acquisitions

In the global voice recognition market, there is a strong trend towards mergers and acquisitions. The dominant market leader, Nuance Communications Inc., which holds more than 50% market share, has acquired a large number of small companies in the speech recognition market. Acquisition has thus become a new approach to company growth; Nuance alone made six acquisitions in 2007. This trend is expected to continue in the next few years due to the presence of numerous smaller players that can be acquired by larger companies like Nuance. Since the market is technology-driven, small companies develop innovative solutions, but due to a lack of resources they are unable to scale up their business. Thus, large companies such as Nuance use takeovers as their main strategy for entering new markets and industries. For example, Nuance acquired Loquendo Inc. to enter the EMEA region.


Speech recognition systems are developing along 2 branches (total market volume from $1.09 billion in 2012 to $2.42 billion in 2016, annual growth rate +22.07%):
• Speech-to-text conversion (market size from $860 million in 2012 to $1,727 million in 2016; share of the total falls from 79% to 71%)
• Verification and identification of a person's voice (market size from $229 million in 2012 to $697 million in 2016; share of the total rises from 21% to 28.8%)
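The figures above can be cross-checked with the usual compound annual growth rate formula, (end / start) ** (1 / years) - 1. The short sketch below does this; the function names `cagr` and `share` are illustrative, and treating 2012-2016 as four compounding years is an assumption, chosen because it brings the headline rate out close to the quoted 22.07%.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values, as a fraction."""
    return (end_value / start_value) ** (1 / years) - 1


def share(part, total):
    """Share of a segment in the total market, as a fraction."""
    return part / total


YEARS = 4  # assumption: 2012 -> 2016 counted as four compounding periods

# Market sizes in millions of dollars, taken from the text above.
total_2012, total_2016 = 1090, 2420
stt_2012, stt_2016 = 860, 1727      # speech-to-text conversion
voice_2012, voice_2016 = 229, 697   # voice verification and identification

print(f"Total market CAGR:   {cagr(total_2012, total_2016, YEARS):.2%}")
print(f"Speech-to-text CAGR: {cagr(stt_2012, stt_2016, YEARS):.2%}")
print(f"Voice ID CAGR:       {cagr(voice_2012, voice_2016, YEARS):.2%}")
print(f"Speech-to-text share: {share(stt_2012, total_2012):.1%} -> {share(stt_2016, total_2016):.1%}")
print(f"Voice ID share:       {share(voice_2012, total_2012):.1%} -> {share(voice_2016, total_2016):.1%}")
```

The same calculation shows why the voice verification segment gains share: it grows noticeably faster than speech-to-text conversion over the period.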

In this competition, the companies that will develop most actively are those working at the junction of these two directions: on the one hand, improving the accuracy of programs that recognize speech and convert it into text; on the other hand, solving the same problem by identifying the speaker and verifying his speech using an additional channel (for example, video) as a source of information.

- According to Technavio, the main problem of existing speech recognition programs is their sensitivity to ambient noise, which they are unable to suppress;
- The main trend is the spread of speech technologies driven by the growing number and quality of mobile devices and the development of mobile banking solutions;
- A major role in the development of speech recognition technology is currently played by government organizations, the military sphere, medicine and the financial sector. However, great demand for this kind of technology has also emerged in mobile applications and voice navigation tasks, as well as in biometrics;
- The main market for speech recognition systems is the USA; however, the fastest-growing and most solvent audience lives in the countries of Southeast Asia, especially in Japan (due to full voice automation of call centers). It is assumed that a strong player should emerge in this region, which will be a great help for the global power of Nuance Communications (current global market share is 70%);
- The most common policy in the speech recognition systems market is mergers and acquisitions (M&A): market leaders often buy small technology laboratories or firms around the world to maintain hegemony;
- The cost of applications is falling rapidly, accuracy is increasing, filtering of extraneous noise is improving and security is increasing; the estimated date for the arrival of ultra-precise speech recognition technology is 2014.

Thus, according to Technavio forecasts, the speech recognition systems market is expected to more than double over 2012-2016 (from $1.09 billion to $2.42 billion). A large share of one of the most dynamic and fast-moving IT markets will go to players who can solve 2 problems simultaneously in their product: recognizing speech and converting it into text, and identifying the speaker's voice, picking it out of the general audio stream and verifying it. Major competitive advantages will include dumping (artificially lowering the price of such technologies) and programs with a user-friendly interface and a quick adaptation process, combined with high-quality performance. Over the next 5 years, new players are expected to appear on the market that could challenge less agile large corporations like Nuance Communications.

Source: https://habr.com/ru/post/232613/
