Microsoft is developing a new program that will be able to “listen” to, learn and imitate human voices after being exposed to a voice sample only a few seconds long.
VALL-E, as this artificial intelligence is called, is based on a digital tool that produces voices from text through a process of analysis and conversion to a “digitized voice”.
The voice recordings needed to generate new speech serve only as a guide: the software produces completely different words while trying to imitate not only the timbre of people’s voices but also their accents, intonation and verbal expressions of humor, among other variants that were tested in the earliest phase of this artificial intelligence's development.
“VALL-E has in-context learning capabilities and can be used to synthesize high-quality custom speech with just a 3-second recording of an unseen speaker as an acoustic prompt,” Microsoft states.
The acoustic environment is another variable in the results of Microsoft's artificial intelligence, since it can imitate how voices sound when recorded during telephone calls, so that the custom voices come even closer to the physical environments in which the messages were originally recorded.
The voices produced by this artificial intelligence may sometimes sound unrealistic, with slow pronunciation or slurred diction. This is a product of the voice synthesis process, so computer-generated messages can still be identified, at least in this first test version.
The company also acknowledges that there may be an ethical issue at stake when it comes to using this technology in broader fields.
Microsoft said that each experiment carried out as part of this work was conducted with the consent of the speakers who lent their voices to be imitated by the artificial intelligence. It also stated that it is important for people to agree to the software capturing their voices.
This reveals an ethical problem arising from the use of this virtual tool, since a user could request that a famous person's voice be imitated in order to spread a fake speech that could be used in different contexts, including ones that are illegal or that may harm the owners of the voices involved.
This is not the only application the technology company is developing to integrate into its services. In October 2022 it announced that it was integrating the DALL-E software into the Bing search engine so that users can generate their own image search results without having to resort to other services.
Microsoft's Image Creator will work just like other programs that transform text descriptions into images in different styles. The company indicated that the tool was not yet available globally, but that it could already be seen in some markets in a test or preview version.