Tech Computers

Teaching and Learning to the computers 
           This article was originally published on the Tec Republic.
      More recently, the combination of automatic speech recognition (ASR), natural language understanding (NLU) and text-to-speech (TTS) has come to mainstream attention in virtual assistants such as Apple's Siri, Google Now, Microsoft's Cortana, and Amazon's Alexa.

How speech and language work

To get a handle on how speech technologies work, we clearly need to know something about the mechanics of human speech and the structure of language.
When we speak, air from the lungs passes through the vocal tract to produce "voiced" or "unvoiced" sounds (depending on whether the vocal cords are vibrating or not) that may then be modulated by the tongue, teeth, and lips. At its most "atomic," speech is a stream of audio segments called "phones," among which are characteristic frequencies called "formants" that can be used to identify vowels (sounds produced with an open vocal tract, all of which, in English, are voiced). In this spectrogram, which is produced by applying a Fast Fourier Transform (FFT) to the speech waveform, the author is saying: "a, e, I, o, u" and it's clear that each one has a distinctive signature:

            The characteristic formant frequencies (F1 and F2) for the English vowels a, e, I, o and u are 850Hz and 1610Hz (a); 390Hz and 2300Hz (e); 240Hz and 2400Hz (i); 360Hz and 640Hz (o) and 250Hz and 595Hz (u).
           The basic unit of language is the phoneme, which is defined as the smallest part of a word that, if changed, alters its meaning: /p/ and /b/ are phonemes in English, for example, because "pack" and "back" are words with different meanings. Phonemes can be thought of as the conceptual building blocks of words, whereas phones are the actual sounds that we make. To illustrate the difference, consider the words "pin" and "spin": in the former, the "p" sound is aspirated, while in the latter it is not. The phoneme /p/ therefore has two "allophones," usually written as [ph] and [p], that sound slightly different but will not change the meaning of a word if substituted (instead, such substitutions merely result in an odd pronunciation).

Addressing Cyber Security Concerns of Data Center Remote Monitoring Platforms :
Digital remote monitoring services provide real-time monitoring and data analytics for data center physical infrastructure systems. However, with the cost of cyber security crime projected to quadruple over the next few years reaching $2 trillion by 2019, there is a concern these systems could be a successful avenue of attack for cyber criminals. This paper describes key security aspects of developing and operating digital, cloud-based remote monitoring platforms that keep data private and infrastructure systems secure from attackers. 

Downloaded from : 
  • White Papers provided by APC by Schneider Electric
  • Phonemes can be vowels, consonants or diphthongs, and there are different numbers of them in different languages. Here's a comparison for a sample of European languages:

modified : Suman- k 09/23/2016 

Comments

Popular posts from this blog

Microsoft's cloud data centers