Elektrotehniški vestnik 76(5): 311-317, 2009
Electrotechnical Review: Ljubljana, Slovenija

Impact of the Echo Canceller and VAD System on Data Transmission over the GSM System Voice Channel

Zdenko Mezgec 1, Amor Chowdhury 1, Rajko Svečko 2 and Bojan Kotnik 3
1 Margento R&D d.o.o., Gosposvetska cesta 84, 2000 Maribor
2 Univerza v Mariboru, Fakulteta za elektrotehniko, računalništvo in informatiko, Smetanova 17, 2000 Maribor
3 Ultra d.o.o., Cesta Otona Zupančiča 23A, 1410 Zagorje ob Savi
E-mail: zdenko.mezgec@margento.com

Abstract. The presented mobile payment system uses the GSM speech channel for data transmission. The speech channel is optimized for human speech, so the transmission of modulated data is affected by various factors. The echo canceller and the VAD system have the greatest impact on the performance of data transmission over the voice channel. For mobile phone payment, the tempo-spectral characteristics of the transmitted modulation signals must resemble those of human speech. Otherwise, these signals can be interpreted as non-speech and blocked by the echo canceller or the VAD. In this work, we investigated the impact of the ETSI VAD (European Telecommunications Standards Institute Voice Activity Detector) and the echo canceller on different types of test signals. By implementing the ETSI VAD system in MATLAB and running tests with mobile phones and different types of test signals, we determined the causes of unpredictable changes in the quality of audio data transmission. Finally, we give some proposals for avoiding these factors and achieving robust data transmission quality.

Keywords: GSM system, echo cancellation, voice activity detection, data over voice, data transmission

Impact of the Echo Canceller and the Voice Activity Detection Module on Data Transmission over the Voice Channel of the GSM System

Summary.
The presented mobile payment system is based on data transmission over the voice channel of the GSM system. The GSM voice channel is intended primarily for speech transmission, so the transmission of audio-modulated data is strongly and negatively affected by the echo canceller and the voice activity detection stage. For the proposed mobile payment concept to work, the modulation procedure must produce an audio-modulated signal whose tempo-spectral and dynamic characteristics resemble those of a speech signal as closely as possible. This paper presents the results of simulating and analysing the responses of the standardised voice activity detection module (ETSI VAD) to various input test signals. Based on these analyses, we defined a set of recommendations for an audio modulation procedure that is as insensitive as possible to the unwanted effects of the echo canceller and the voice activity detection stage.

Keywords: GSM system, echo canceller, voice activity detection, audio-modulated data, data transmission

1 Introduction

The GSM mobile phone is an indispensable device in everyday life. Its basic function is to enable remote voice communication. People living in the modern information technology era often come across services such as mobile trading, mobile business, and mobile payments. These services usually use different interfaces, payment instruments, and/or data systems for data transmission. One of the many types of data transmission is voice-modulated data transmission over the GSM (Global System for Mobile Communications) voice channel [11]. This principle is already incorporated in the internationally patented Margento mobile payment system, developed by Ultra, d.o.o., and Margento R&D d.o.o. [1]. The basis of the Margento system is voice-modulated data transmission between the Margento centre and the payment terminal (see Fig. 1).
The Margento terminal is a device that lets the user's GSM mobile phone serve as a universal payment instrument. It works much like an ordinary POS (Point of Sale) terminal, where the buyer presents his or her credit card to make a payment. Two types of Margento terminals are on the market, the MPOS and the VEND terminal, which are intended for different applications (Fig. 2).

Figure 1. Concept of the Margento system.

Received 12 August 2009
Accepted 18 November 2009

Data transmission over the GSM voice channel is subject to various disturbances. The main reason is that the GSM voice channel is not intended to carry arbitrary acoustic signals other than speech itself [2], [3]. Little has been done to avoid these problems because of the lack of competition and the patent protection afforded to Margento [1].

The intention of this paper is to highlight some problems of voice-modulated data transmission over the GSM voice channel and some solutions for improving it. The first part of the paper introduces the problems that cause difficulties with the data transmission. Section 3 describes the analysis and implementation of the ETSI VAD system in the MATLAB environment. Section 4 presents results and the observed system impacts. The conclusion suggests some solutions for increasing the performance of data transmission over the GSM voice channel.

2 Description of the problem

The Margento terminal is a device that simultaneously modulates and demodulates data signals. Data communication between the Margento centre and the Margento terminal can run either in full duplex mode, which increases the data transmission bit rate, or in the less effective and slower half duplex mode.
The transmission quality of the audio-modulated data is limited by the following impacts and interferences:
• the audio coupling of the Margento terminal and the surrounding acoustic environment,
• the GSM system (speech codecs, compression, packet loss), and
• the mobile phone (VAD, echo canceller).

2.1 The Margento Terminal and its Surrounding Impact

Because mobile phones come in different shapes, modulated data transmission takes place under unsatisfactory acoustic conditions, which lowers the transmission quality. The unsatisfactory acoustic connection between the Margento terminal and the mobile phone (Fig. 3) lets in additional signals that disturb the communication between the terminal and the phone. The terminal works in noisy surroundings, such as restaurants and shops. In some places the surrounding signal that is added to the modulated signal can be stronger than the modulated signal itself (SNR < 0 dB). Because of this surrounding noise, the VAD of the GSM phone can block the modulated signal.

The terminal microphone also receives a distorted signal. In addition to the modulated signal sent by the Margento centre, the microphone picks up the surrounding noise and the sound emitted by the speaker inside the terminal. The same happens to the modulated signal sent to the Margento centre, so the surrounding noise can be stronger than the modulated signal in that direction as well.

The hardware of the MPOS and VEND terminals is very similar (see Fig. 2). The only difference is that the VEND terminal has no support for communication between the user and the terminal (keyboard, screen, printer, etc.). Both types of terminal use the Texas Instruments 32-bit TMS320F2812 DSP processor, with the core running at 120 MHz.
2.2 GSM System Impact

One of the most important impacts of the GSM system is the voice coder, which was developed and intended primarily for speech transmission and not for the transmission of arbitrary modulation signals. The GSM speech channel is tuned to human speech, which has very specific features. Three standard voice coders are used most frequently in the GSM system: GSM EFR, GSM FR, and GSM HR [2], [3]. All of these voice coders affect every mobile phone equally. The impact of the GSM system and of the Margento terminal can be mitigated by using adapted modulation methods and by implementing forward error correction (FEC) algorithms. The major factor degrading the quality of data transmission is the mobile phone itself.

2.3 Mobile Phone Impact

The most important mobile phone impacts are:
• the echo canceller, and
• the voice activity detector (VAD) system.

From the point of view of audio-modulated data transmission, the echo canceller and the voice activity detector are unwanted systems that degrade the quality of data transmission over the GSM voice channel. Both systems can distort the transmitted signal to the point where the signal received by the Margento centre is useless for demodulation.

Figure 2. VEND terminal (left) and MPOS terminal (right).

2.3.1 Echo Canceller

The basic function of the echo canceller is to eliminate all signals that are not similar to speech, as well as the secondary signals that appear during speech transmission. One such unwanted component of the speech signal is echo [10]. Echo and the development of echo reduction appeared long before mobile telephony. The first demands to eliminate echo arose in the 1960s, more precisely in satellite telecommunication. Echoes can be divided into hybrid and acoustic echoes.
In mobile phone technology, the hybrid echo is eliminated well with FIR (Finite Impulse Response) filters [14]. The acoustic echo arises in both analogue and digital devices. Echo cancellers are realised with complex algorithms that have to predict the echo path, that is, the delay and the strength of the echo. A high-quality canceller needs a mathematical model of the acoustic echo. The model approximation is adaptive, which means that the canceller adjusts its parameters continuously to the environment of the particular mobile phone. The time needed for the parameters to settle is called the convergence time [20]. The canceller's efficiency depends on the strength and the maximum delay of the echo signal received by the microphone. The echo estimate is subtracted from the original signal, so the person speaking cannot hear the echo of his or her own speech.

2.3.2 Voice Activity Detector

Generally speaking, human speech consists of sequences of silences and sounds. The silent parts occur between words and sentences, and when we do not talk but just listen to the other speaker [4], [5], [6], [18]. Most of the time, the mobile phone user is in an area with various acoustic background noises. The signal picked up by the phone's microphone is either the surrounding noise alone or the sum of the surrounding noise and speech. Most speech applications use algorithms that automatically distinguish between useful speech and background noise (or silence). These are the so-called Voice Activity Detectors (VAD). VAD systems are used in different communication systems, and the VAD is quite often one of the most important components in a mobile phone and in similar systems. The communication quality increases with the quality of speech detection; in extremely noisy surroundings, however, the VAD can prevent normal communication.
Without the VAD system, high-quality communication in modern mobile networks is not possible. Together with the echo canceller, the VAD system has become an integral component of every mobile phone. Its primary purpose is to estimate the tempo-spectral characteristics of the input speech signal captured by the microphone [7]. The noise spectrum is not stationary and can change quite quickly, so the noise spectrum estimate has to track these changes. The noise can be estimated whenever speech is not active and the microphone picks up only noise. At such times, the system estimates the spectrum and suppresses the signal for as long as only noise is present at the input [8]. There are also algorithms that estimate the noise spectrum continuously and need no speech pauses to do so [9].

The VAD system is used effectively for discontinuous speech transmission. Its drawback is that it can reduce speech quality: the VAD may mistake speech for noise, in which case the mobile phone does not transmit the speech. On the other hand, a high-quality VAD can increase the perceived quality of the transmitted speech and at the same time reduce the power consumption of the mobile device [16], because non-speech frames need not be transmitted through the GSM network. When the surrounding noise is weak compared with the speech (high SNR, Signal-to-Noise Ratio), the VAD can quickly and easily detect the signal segments that contain no speech. At low SNR the situation is the opposite, and quite complex adaptive VAD algorithms are needed to cope with such highly dynamic conditions [15].
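The adaptive echo estimation outlined in Section 2.3.1 (an FIR model of the echo path whose parameters converge over time, with the echo estimate subtracted from the microphone signal) can be illustrated with a minimal sketch. Since manufacturers do not publish their algorithms, the normalised LMS (NLMS) filter below is only a textbook stand-in and not the method of any particular handset. The function names, the number of taps, the step size `mu`, and the echo path used in `demo` are all assumptions made for illustration.

```python
import random


def nlms_echo_canceller(far_end, mic, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive FIR estimate of the echo from the microphone signal.

    far_end: samples played by the loudspeaker (the echo source)
    mic:     samples captured by the microphone (here: echo only)
    Returns the residual signal and the final filter weights.
    """
    w = [0.0] * taps          # adaptive FIR weights (echo-path estimate)
    x = [0.0] * taps          # most recent far-end samples, newest first
    residual = []
    for s, d in zip(far_end, mic):
        x = [s] + x[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, x))
        e = d - echo_est      # what remains after echo subtraction
        norm = sum(xi * xi for xi in x) + eps
        # Normalised LMS update: step scaled by the input energy.
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        residual.append(e)
    return residual, w


def demo():
    rng = random.Random(0)
    far = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
    # Hypothetical echo path: two reflections at delays of 2 and 5 samples.
    path = [0.0, 0.0, 0.6, 0.0, 0.0, -0.3, 0.0, 0.0]
    mic = [sum(path[k] * far[n - k] for k in range(len(path)) if n - k >= 0)
           for n in range(len(far))]
    residual, w = nlms_echo_canceller(far, mic)
    return residual, w, path
```

In this noise-free toy case the weights settle on the assumed echo path and the residual decays towards zero; the number of samples this takes corresponds to the convergence time mentioned above.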
VAD systems and echo cancellers hinder audio-modulated data transmission over the voice channel, because voice-modulated data signals such as FSK (Frequency-Shift Keying), ASK (Amplitude-Shift Keying), QAM (Quadrature Amplitude Modulation), and OFDM (Orthogonal Frequency-Division Multiplexing) usually have tempo-spectral characteristics very different from those of a speech signal. It is therefore necessary to analyse the impact of the VAD and the echo canceller on voice-modulated data signals. The documentation and specifications of the GSM ETSI VAD system are publicly available, so an ETSI VAD simulator can be implemented, e.g. in the MATLAB environment. An exact simulation of the echo canceller, however, cannot be realised because, in contrast to the VAD, there is no official standard for the echo canceller. Each mobile phone manufacturer has its own secret algorithm for echo cancellation. The VAD together with the echo canceller can therefore be analysed only by testing on real devices.

Figure 3. Margento terminal and corresponding acoustic impacts and interferences.

3 Implementation of the GSM ETSI-VAD system within MATLAB

3.1 Description of the GSM ETSI VAD system

The GSM VAD system proposed by ETSI is basically a signal-energy detector with an adaptively adjusted speech/non-speech threshold (Fig. 4).
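The idea of an energy detector with an adaptive threshold can be sketched as follows. This is a deliberately simplified illustration and not the ETSI algorithm: the function names, the `margin` factor, the smoothing constant `alpha`, and the initial noise estimate are assumptions. Frames of 160 samples (20 ms at 8000 Hz) are used.

```python
import math

FRAME_LEN = 160  # 20 ms at a sampling frequency of 8000 Hz


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def energy_vad(samples, margin=4.0, alpha=0.9, initial_noise=1e-4):
    """Return one boolean per complete frame: True means 'speech'.

    The noise-floor estimate is updated only in frames classified as
    non-speech, so the decision threshold follows slow noise changes.
    """
    noise = initial_noise
    decisions = []
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        energy = frame_energy(samples[start:start + FRAME_LEN])
        is_speech = energy > margin * noise
        if not is_speech:
            noise = alpha * noise + (1.0 - alpha) * energy
        decisions.append(is_speech)
    return decisions


def demo():
    # Ten quiet frames (a weak tone standing in for background noise)
    # followed by ten loud frames (standing in for active speech).
    quiet = [0.01 * math.sin(2 * math.pi * 300 * n / 8000)
             for n in range(10 * FRAME_LEN)]
    loud = [0.5 * math.sin(2 * math.pi * 300 * n / 8000)
            for n in range(10 * FRAME_LEN)]
    return energy_vad(quiet + loud)
```

A detector of this kind looks only at energy; the spectral stationarity and periodicity checks used by the standardised VAD are not modelled here.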
The VAD system relies on the following characteristics of speech and surrounding noise:
• speech is a discontinuous signal,
• the speech signal spectrum changes over short periods of 20 to 30 ms [19],
• the surrounding noise is usually more stationary than speech,
• the surrounding noise spectrum usually changes over longer time spans than the speech spectrum,
• the amplitude dynamics of speech are more pronounced than those of the surrounding noise, and
• the surrounding noise usually has a spectrum similar to white or coloured noise, which differs from the spectrum of speech; the tempo-spectral characteristics of speech are usually more complex.

The VAD system is roughly divided into nine components, and the logical decision about the presence or absence of speech is based on various parameters (see Fig. 4) computed over frames of 160 samples (20 ms); GSM speech coders operate at a sampling frequency of 8000 Hz [13]. The input signal is passed through adaptive noise reduction filters, which reduce the level of the surrounding noise in the captured waveform. The filter coefficients are calculated every four frames (80 ms) by averaging the autocorrelation coefficients. This method contributes to better cancellation of the surrounding noise [12]. The energy threshold and the adaptive filter coefficients are adapted only when there is no speech in the input signal. In this case the input signal energy is usually low, or the input signal spectrum is stationary and contains no periodic component, which could be the result of a special network information tone. The stationarity of the signal is verified by the LHR (Likelihood Ratio) measure between the average linear-predictive coefficients of the last four frames and the current LPC (Linear Predictive Coding) filter. The LPC filter coefficients represent the formants of the human vocal tract.
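As an illustration of how linear-predictive coefficients can be derived from the autocorrelation of a frame, the following sketch uses the Levinson-Durbin recursion. It is one common textbook method and is not claimed to be the exact procedure of the ETSI specification; the function names are assumptions, and the prediction convention is $\hat{s}[n] = \sum_k a_k\, s[n-k]$.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    r must hold at least order + 1 autocorrelation values with r[0] > 0.
    Returns (coefficients, residual prediction-error energy).
    """
    a = [0.0] * (order + 1)   # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error       # reflection coefficient of stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a[1:], error
```

For the autocorrelation sequence `[1.0, 0.5, 0.25]` of a first-order process, `levinson_durbin` returns the coefficients `[0.5, 0.0]` and a residual energy of `0.75`: one past sample already explains all of the predictable structure.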
Linear prediction presumes that the value of a signal sample at a given moment is a linear combination of a fixed number of past signal samples. Once these coefficients are calculated, human speech can be reproduced by inverse filtering of the received signal. If the result of the LHR measure is smaller than a fixed threshold, the signal is considered stationary. The presence of periodic components is checked every 5 ms. To prevent the pauses between syllables from being misclassified as noise or silence, which would lower the speech quality, the VAD system makes its decisions every five frames (100 ms). In addition, a hangover is applied: a few noise-only or silence frames after every sufficiently long speech interval are still marked as speech. The GSM VAD system follows a fail-safe decision principle, which means that whenever the system is in doubt about the classification of the input signal, the VAD decides that the input signal is human voice (false acceptance is preferred to false rejection) [17].

Figure 4. VAD system diagram according to the ETSI standard.

3.2 Implementation in the MATLAB environment

The GSM VAD system standardised by ETSI has been implemented in the MATLAB environment. The program code is written as an M-file. ETSI has prescribed 16 test vectors (input and output signals) that are used to check the suitability of a VAD algorithm implementation and its compliance with the standard. The MATLAB implementation has been checked against these test vectors so that the impact of the ETSI VAD system on voice-modulated data signals can be analysed. To illustrate the operation of the ETSI VAD system, the most important intermediate signals of the VAD algorithm are shown in the diagrams in Fig. 5. The selected input test signal is the standardised ETSI test vector number 5 (female speech). Fig. 5 shows the following signals (from top to bottom):
1. the input test signal,
2. the output signal, i.e. the result of the ETSI VAD system,
3. the computed average autocorrelation, i.e. the input signal energy,
4. the output of the adaptive algorithm that adjusts the decision threshold, and
5. the VAD system result and the result of the post-processed VAD decision algorithm (hangover).

The ETSI VAD implementation in the MATLAB environment makes it possible to analyse the disturbances on other test signals, which replicate real-world scenarios in the operating Margento system. Six new test signals were used, each 30 seconds long:
• white noise,
• QAM modulated signal at 400 bps (carrier at 800 Hz),
• FSK at 200 bps (carriers at 800 Hz and 1100 Hz),
• ASK at 200 bps (carrier at 800 Hz),
• LPC modulation based on LPC speech synthesis, and
• speech signal (male and female voice).

The results of the ETSI VAD analysis give the percentage of the total input signal length that was marked as "speech" and therefore was not cut off by the VAD system. The results presented in Fig. 6 show that the white noise, QAM, and FSK test signals are more often classified as "non-speech". It can also be seen that human speech and the LPC modulated signal are, as expected, entirely marked as "speech" by the ETSI VAD. The proposed digital modulation scheme must therefore have tempo-spectral characteristics similar to those of human speech in order to be transmitted properly through the GSM speech channel.

4 Real GSM system experimental results and discussion

The ETSI VAD system performance was also checked in the real system with six test signals that were transmitted over the voice channel of various GSM mobile phones. The experiment was carried out in a controlled environment with a high SNR.
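The following sketch shows how a test signal of the kind listed in Section 3.2 could be generated and how the "percentage marked as speech" could be computed from per-frame VAD decisions, including a simple hangover. It illustrates the evaluation idea only: the function names, the mapping of bit values to carriers, the amplitude, and the hangover length are assumptions and are not taken from the MATLAB implementation or from the ETSI specification.

```python
import math

FS = 8000          # sampling frequency of the GSM speech channel, Hz
FRAME_LEN = 160    # one 20 ms frame


def fsk_signal(bits, bit_rate=200, f0=800.0, f1=1100.0, amplitude=0.5):
    """Continuous-phase binary FSK; f0 encodes 0 and f1 encodes 1 (assumed)."""
    samples_per_bit = FS // bit_rate
    phase = 0.0
    out = []
    for bit in bits:
        step = 2.0 * math.pi * (f1 if bit else f0) / FS
        for _ in range(samples_per_bit):
            out.append(amplitude * math.sin(phase))
            phase += step
    return out


def apply_hangover(flags, hang_frames=5):
    """Keep the 'speech' decision for hang_frames frames after speech ends."""
    out = []
    remaining = 0
    for flag in flags:
        if flag:
            remaining = hang_frames
            out.append(True)
        elif remaining > 0:
            remaining -= 1
            out.append(True)
        else:
            out.append(False)
    return out


def speech_percentage(frame_flags):
    """Share of frames marked as speech, in percent."""
    if not frame_flags:
        return 0.0
    return 100.0 * sum(1 for f in frame_flags if f) / len(frame_flags)
```

At 200 bps and 8000 Hz each bit spans 40 samples, so four bits fill exactly one 20 ms frame. The per-frame flags would come from the VAD under test; `speech_percentage(apply_hangover(flags))` then yields the kind of figure reported per test signal.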
Each mobile phone was tested multiple times in a two-phase test, and the average of the results was taken in each phase. The two phases were:
• one-way communication (half duplex), and
• simultaneous two-way communication (full duplex).

Figure 5. VAD system intermediate operation signals (see description in Section 3.2).

Figure 6. Simulated ETSI VAD performance using different test signals. [Bar chart "ETSI VAD effect on test signals", 0% to 100%, for white noise, QAM 400 bps 800 Hz, FSK 200 bps 800-1100 Hz, ASK 200 bps 800 Hz, LPC modulation, and human speech.]

The purpose of this testing was to analyse the impact of the VAD and the echo canceller on signal reception. Fig. 7 shows the average VAD system impact of various mobile phones on the test signals listed above in half-duplex communication. The results show that the mobile phones pass the test signals over the GSM voice channel almost entirely. Some difficulties were observed only with the Siemens SL55 and BENQ Athena mobile phones.

Fig. 8 shows the average VAD system impact of various mobile phones in full-duplex communication. All mobile phones were set to their maximum volume, and the Margento centre was sending the FSK 300 bps test signal (carriers at 1650 Hz and 1850 Hz) with a relative amplitude of 30,000 at 16-bit resolution. As a consequence, the echo canceller distorted the input signal to the point where the VAD algorithm quickly interpreted the test signal as surrounding noise. With some mobile phones even speech was cut unexpectedly, so the average speech pass rate dropped from over 90% to 70%.

The results show that the echo canceller impact is very important for further analysis. The Siemens SL45 mobile phone was therefore chosen because its results fluctuated the most. Fig. 9 shows the combined impact of the echo canceller and the VAD system as the volume of the data transmission from the Margento centre to the mobile phone (and consequently to the Margento terminal) is varied. The volume of the full-duplex transmission is divided into three levels, with the half-duplex result as a reference. The results show that the performance of the VAD and the echo canceller depends strongly on the signal volume; the operation of the echo canceller thus depends on the data transmission volume. It can also be seen that the difference in signal pass rate between one-way communication and simultaneous two-way communication reaches up to 56% for the FSK test signal and 64% for the ASK test signal.

To improve the data transmission quality over the voice channel of the real GSM system, a new adaptive algorithm is proposed (see Fig. 10). Data transmission always starts with simultaneous two-way communication (full duplex). When the bit-error rate (BER) increases, the data transmission volume is decreased, which reduces the negative impact of the echo canceller (see Fig. 9). If decreasing the signal volume does not have the expected effect, the transmission mode is switched to half duplex. This eliminates the echo canceller impact, but increases the overall data transmission time.

5 Conclusion

In this paper we studied the effects of the VAD and echo canceller modules on audio-modulated digital data transmission over the GSM speech channel. First, we analysed the ETSI VAD procedure and implemented it in the MATLAB environment. Next, the simulated VAD was evaluated with different modulation signals in order to check the VAD's output decision. The best performance is achieved when the modulation signals have speech-like tempo-spectral characteristics. Such signals will usually not be cut off by the VAD.
Next, we performed real GSM system tests (VAD and echo canceller) using multiple mobile phones in two different transmission modes: half duplex and full duplex. The half duplex mode performs more or less flawlessly, while the full duplex mode in some cases degrades the transmission of the modulated signals because of the combined impact of the VAD and the echo canceller. The VAD algorithm is standardised by ETSI and as such is implemented by all manufacturers. The echo canceller, however, is not standardised, so the performance of different mobile phones varies in full duplex communication. Furthermore, a higher volume of the modulation signal transmitted in one direction degrades the reception of the signal in the opposite direction more strongly than a lower volume.

Figure 7. Average GSM phone VAD+echo canceller impact at half-duplex communication over the GSM speech channel. [Bar chart "Mobile phones VAD effect on test signals, semi duplex, average of all mobile phones", 0% to 100%, for white noise, QAM 400 bps 800 Hz, FSK 200 bps 800-1100 Hz, ASK 200 bps 800 Hz, LPC modulation, and human speech.]

Figure 8. Average GSM phone VAD+echo canceller impact at full-duplex communication over the GSM speech channel. [Bar chart "Mobile phones VAD effect on test signals, full duplex, average of all mobile phones", same axis and test signals as Figure 7.]

Figure 9. Signal volume dependence of average GSM phone VAD+echo canceller impact at full-duplex communication over the GSM speech channel. [Bar chart "Mobile phone VAD effect on test signals, varying volume", 0% to 100%, for semi duplex and full duplex at minimum, medium, and maximum volume; test signals FSK 200 bps 800-1100 Hz and ASK 200 bps 800 Hz.]
Finally, a new adaptive data transmission scheme is proposed to optimise the data transmission performance over the GSM speech channel and to compensate for the negative impacts of the mobile phone's VAD and echo canceller.

6 References

[1] Ultra Margento patent 1 and 2, WO 02/33669, WO 03/088165, 2002
[2] ETSI EN 301 245 v4.1.1, "Digital cellular telecommunications system (Phase 2)-enhanced full rate speech transcoding (GSM 06.60)", 2000
[3] ETSI EN 300 730 v7.0.1, "Digital cellular telecommunications system-voice activity detector for enhanced full rate speech traffic channels (GSM 06.82)", 2000
[4] L. Hanzo, F.C.A Somerville, J. P. Woodard, "Voice compression and communications", 1999
[5] Huan M. Huerta, "Speech recognition in mobile environments", 2000
[6] Mark Marzinzik, Birger Kollmeier, "Speech pause detection for noise spectrum estimation by tracking power envelope dynamics", 2002
[7] Wang Fan, Zheng Fang, Wu Wenhu, "Speech detection in non-stationary noise based on 1-f process", 2002
[8] J. Rosca, R. Balan, N. P. Fan, "Multichannel voice detection in adverse environments", 2002
[9] J. Ramirez, J. C. Segura, C. Benitez, A. Rubio, "Efficient voice activity detection algorithms using long-term speech information", 2003
[10] Peter Eneroth, "Stereophonic acoustic echo cancellation", 2001
[11] John Scourias, "Overview of the global system for mobile communications", 1995
[12] Richard V. C., Peter Kroon, "Low bit-rate speech coders for multimedia communication", 1996
[13] L. Besacier, S. Grassi, A. Dufaux, M. Ansorge, F. Pellandini, "GSM speech coding and speaker recognition", 2000
[14] Jan Mark De Han, "Filter bank design for digital speech signal processing", 2004
[15] Kristo Lehtonen, "Digital signal processing and filtering – GSM Codec", 2004
[16] Arvind Raman Kizhanatham, "Detection of cochannel speech and usable speech – GSM Codec", 2002
[17] Khaled El-Maleh, Peter Kabal, "Comparison of voice activity detection algorithms for wireless personal communications systems", 1997
[18] Mark D. Skowronski, "Biologically inspired noise-robust speech recognition for both man and machine", 2004
[19] F. Beritelli, S. Casale, A. Cavallaro, "A robust voice activity detector for wireless communications using soft computing", 1998
[20] Peter Eneroth, Tomas Gansler, "A frequency domain adaptive echo canceller with post-processing residual echo suppression by decorrelation", 1997

Zdenko Mezgec received his BSc degree in Electrical Engineering in 2004 and his PhD in 2009, both at the University of Maribor, Slovenia. He works at Margento R&D as Chief of Embedded Systems Development.

Amor Chowdhury received his MSc degree in Electrical Engineering in 1997 and his PhD in Robust Control in 2001, both at the University of Maribor, Slovenia. Since 2008 he has been CEO of Margento R&D d.o.o., and he also continues to work at the University of Maribor, Faculty of Electrical Engineering and Computer Science, Slovenia.

Rajko Svečko received his MSc degree in Electrical Engineering in 1984 and his PhD in Robust Control in 1998, both at the University of Maribor, Slovenia. He works as an associate professor and researcher at the University of Maribor, Faculty of Electrical Engineering and Computer Science, Slovenia.

Bojan Kotnik obtained his BSc degree in Electrical Engineering in 2000 and his PhD in Automatic Speech Recognition in 2004, both at the University of Maribor, Slovenia.
His research interests include digital signal processing, digital modulation and demodulation algorithms, and statistical methods for data classification. Since 2008 he has been working as Chief Scientific Officer at Margento R&D.

Figure 10. Proposed adaptive data transmission scheme to deal with the negative effects of VAD and echo canceller.
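The decision logic of the adaptive scheme described in Section 4 (start in full duplex, lower the volume when the bit-error rate rises, fall back to half duplex if that does not help) can be sketched as a small state machine. The names `LinkState` and `adapt`, the three discrete volume levels, the choice of starting at maximum volume, and the threshold `ber_limit` are assumptions for illustration; the paper gives no numerical values for them.

```python
from dataclasses import dataclass

VOLUME_LEVELS = ("maximum", "medium", "minimum")  # assumed discrete levels


@dataclass
class LinkState:
    mode: str = "full_duplex"   # transmission always starts in full duplex
    volume_index: int = 0       # index into VOLUME_LEVELS

    @property
    def volume(self):
        return VOLUME_LEVELS[self.volume_index]


def adapt(state, ber, ber_limit=1e-2):
    """Return the next link state for a measured bit-error rate.

    ber_limit is a hypothetical threshold; the source specifies none.
    """
    if ber <= ber_limit or state.mode == "half_duplex":
        return state            # keep the current settings
    if state.volume_index < len(VOLUME_LEVELS) - 1:
        # First remedy: lower the transmission volume by one level.
        return LinkState("full_duplex", state.volume_index + 1)
    # Last resort: the volume cannot be lowered further.
    return LinkState("half_duplex", state.volume_index)
```

Calling `adapt` once per measurement interval steps the link from full duplex at the assumed starting volume through the lower volume levels and finally to half duplex, trading transmission time for robustness against the echo canceller.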