ISSN 0352-9045

*Informacije* MIDEM

*Electronic Components and Materials Vol. 49, No. 4(2019), December 2019*

*Revija za mikroelektroniko, elektronske sestavne dele in materiale letnik 49, številka 4(2019), December 2019*


# *Informacije* MIDEM *4-2019*

Journal of Microelectronics, Electronic Components and Materials

#### VOLUME 49, NO. 4(172), LJUBLJANA, DECEMBER 2019 | LETNIK 49, NO. 4(172), LJUBLJANA, DECEMBER 2019

Published quarterly (March, June, September, December) by Society for Microelectronics, Electronic Components and Materials - MIDEM. Copyright © 2019. All rights reserved. | Revija izhaja trimesečno (marec, junij, september, december). Izdaja Strokovno društvo za mikroelektroniko, elektronske sestavne dele in materiale – Društvo MIDEM. Copyright © 2019. Vse pravice pridržane.

#### **Editor in Chief | Glavni in odgovorni urednik**

Marko Topič, University of Ljubljana (UL), Faculty of Electrical Engineering, Slovenia

#### **Editor of Electronic Edition | Urednik elektronske izdaje**

Kristijan Brecl, UL, Faculty of Electrical Engineering, Slovenia

#### **Associate Editors | Odgovorni področni uredniki**

Vanja Ambrožič, UL, Faculty of Electrical Engineering, Slovenia Arpad Bürmen, UL, Faculty of Electrical Engineering, Slovenia Danjela Kuščer Hrovatin, Jožef Stefan Institute, Slovenia Franc Smole, UL, Faculty of Electrical Engineering, Slovenia Matjaž Vidmar, UL, Faculty of Electrical Engineering, Slovenia

#### **Editorial Board | Uredniški odbor**

Mohamed Akil, ESIEE PARIS, France Giuseppe Buja, University of Padova, Italy Gian-Franco Dalla Betta, University of Trento, Italy Martyn Fice, University College London, United Kingdom Ciprian Iliescu, Institute of Bioengineering and Nanotechnology, A\*STAR, Singapore Marc Lethiecq, University of Tours, France Teresa Orlowska-Kowalska, Wroclaw University of Technology, Poland Luca Palmieri, University of Padova, Italy Goran Stojanović, University of Novi Sad, Serbia

#### **International Advisory Board | Časopisni svet**

Janez Trontelj, UL, Faculty of Electrical Engineering, Slovenia - Chairman Cor Claeys, IMEC, Leuven, Belgium Denis Đonlagić, University of Maribor, Faculty of Elec. Eng. and Computer Science, Slovenia Zvonko Fazarinc, CIS, Stanford University, Stanford, USA Leszek J. Golonka, Technical University Wroclaw, Wroclaw, Poland Jean-Marie Haussonne, EIC-LUSAC, Octeville, France Barbara Malič, Jožef Stefan Institute, Slovenia Miran Mozetič, Jožef Stefan Institute, Slovenia Stane Pejovnik, UL, Faculty of Chemistry and Chemical Technology, Slovenia Giorgio Pignatel, University of Perugia, Italy Giovanni Soncini, University of Trento, Trento, Italy Iztok Šorli, MIKROIKS d.o.o., Ljubljana, Slovenia Hong Wang, Xi´an Jiaotong University, China

#### **Headquarters | Naslov uredništva**

Uredništvo Informacije MIDEM MIDEM pri MIKROIKS Stegne 11, 1521 Ljubljana, Slovenia T. +386 (0)1 513 37 68 F. +386 (0)1 513 37 71 E. info@midem-drustvo.si www.midem-drustvo.si

Annual subscription rate is 160 EUR, separate issue is 40 EUR. MIDEM members and Society sponsors receive current issues for free. The Scientific Council for Technical Sciences of the Slovenian Research Agency has recognized Informacije MIDEM as a scientific journal for microelectronics, electronic components and materials. Publishing of the Journal is cofinanced by the Slovenian Research Agency and by Society sponsors. Scientific and professional papers published in the journal are indexed and abstracted in the COBISS and INSPEC databases. The Journal is indexed by ISI® for Sci Search®, Research Alert® and Materials Science Citation Index™. |

Letna naročnina je 160 EUR, cena posamezne številke pa 40 EUR. Člani in sponzorji MIDEM prejemajo posamezne številke brezplačno. Znanstveni svet za tehnične vede je podal pozitivno mnenje o reviji kot znanstveno-strokovni reviji za mikroelektroniko, elektronske sestavne dele in materiale. Izdajo revije sofinancirajo ARRS in sponzorji društva. Znanstveno-strokovne prispevke objavljene v Informacijah MIDEM zajemamo v podatkovne baze COBISS in INSPEC. Prispevke iz revije zajema ISI® v naslednje svoje produkte: Sci Search®, Research Alert® in Materials Science Citation Index™.

Design | Oblikovanje: Snežana Madić Lešnik; Printed by | tisk: Biro M, Ljubljana; Circulation | Naklada: 1000 issues | izvodov; Slovenia Taxe Percue | Poštnina plačana pri pošti 1102 Ljubljana






https://doi.org/10.33180/InfMIDEM2019.401



Journal of Microelectronics, Electronic Components and Materials Vol. 49, No. 4(2019), 193 – 201

# *Digital Implementation of a Spiking Convolutional Neural Network for Tumor Detection*

*Ayoub Adineh-vand¹, Gholamreza Karimi¹ and Mozafar Khazaei²*

*¹ Razi University, Faculty of Engineering, Electrical Engineering Department, Kermanshah, Iran*

*² Kermanshah University of Medical Sciences, Reproduction Research Center, Kermanshah, Iran*

**Abstract:** The structural variation of brain tissue makes tumor detection in MRI images challenging. In this paper, an architecture for spiking convolutional neural networks (SCNNs) is implemented in an embedded system, and its potential is evaluated in terms of hardware utilization and power consumption in a complex application, tumor detection. The proposed SCNN is implemented on a field-programmable gate array (FPGA) using fixed-point arithmetic. To evaluate the speed, accuracy and flexibility of the proposed SCNN, the Izhikevich neuron model is used with the spike-timing-dependent plasticity (STDP) learning rule. The feasibility and cost of a digital implementation of the network are explored, and the results of hardware synthesis and digital implementation on an FPGA are presented.

**Keywords:** Brain tissue; MRI images; Spiking Neural Network; Digital Implementation; STDP

# *Digitalna implementacija sunkovnih nevronskih mrež za detekcijo tumorjev*

**Izvleček:** Strukturne razlike v možganskem tkivu predstavljajo izziv pri detekciji tumorjev v MRI slikah. Članek opisuje arhitekturo implementacije sunkovnih konvolucijskih nevronskih mrež v vgradnih sistemih. Njihov potencial je ocenjen na osnovi strojne uporabnosti in porabe pri detekciji tumorjev. Struktura je implementirana v FPGA okolju. Ocena hitrosti, natančnosti in fleksibilnosti je opravljena z Izhikevichevim nevronskim modelom.

**Ključne besede:** možgansko tkivo; MRI slike; sunkovne nevronske mreže; digitalizacija; STDP

*\* Corresponding Author's e-mail: ghkarimi@razi.ac.ir*

# *1 Introduction*

Clinical practice generates a large volume of patient data in the form of medical images such as MRI and CT scans. Analyzing and processing these data efficiently is an open research problem; solving it would help physicians reach a better diagnosis, particularly in emergencies when no expert is available. Convolutional neural networks (CNNs), a type of deep neural network (DNN), deliver excellent performance in machine learning tasks including pattern recognition, speech and image processing, and natural language processing.

Simulation and implementation of brain-like networks are vital for understanding how the brain processes information. The third generation of neural network models, called spiking neural networks (SNNs), improved the level of biological realism in neural simulations and opened up a new field in artificial intelligence research. SNNs have gained popularity because of their biological plausibility. In practice, when a neuron model is selected for a large SNN, there is always a trade-off between biological plausibility and computational efficiency [6].

Alan Lloyd Hodgkin and Andrew Huxley proposed the first scientific model of spiking neurons, the Hodgkin-Huxley (HH) model, in 1952 [7]. This model describes spike generation with a set of four differential equations that capture how action potentials are initiated and propagated. Depending on the required accuracy and computational complexity, various other biological models are available, such as the Izhikevich model [8], [9], the Integrate and Fire model [8], the FitzHugh-Nagumo (FHN) model [10], [11], and the Hindmarsh-Rose model [12]. Spiking models provide effective tools for analyzing elementary processes in the brain and suggest solutions for a wide range of engineering problems, including fast signal processing and pattern and speech recognition [13].

Data processing in the brain can be reproduced outside the brain with analog or digital circuits if an effective model and a detailed description of the neuron connections are chosen. Hardware realization of biological neuron models has been examined on various platforms. VLSI systems are notable options for the direct implementation of neural systems. A VLSI implementation offers high performance and a mature technology, which enables rapid prototyping of neural algorithms to test theories of computational neuroscience, network architecture, and learning systems [14]. Digitally implemented neurobiological networks have shorter development times and are more flexible, but they consume more silicon area and power than their analog counterparts.

Application specific integrated circuits (ASICs), graphics processing units (GPUs), and custom hardware accelerators have been proposed as methods for implementing CNNs in practical applications [1], [2]. A high-accuracy digital implementation makes it possible to develop networks with high dynamic range and stability. Recently, reconfigurable digital platforms have been used to realize neural system models [15]-[22]. A critical challenge of digital implementation is power consumption; with an FPGA it is possible to achieve lower power consumption [3]-[5].

In this paper, a spiking convolutional network based on the Izhikevich neuron is proposed. The network is trained with STDP as the learning rule to achieve higher accuracy in tumor detection. Furthermore, a hardware architecture is presented for the Izhikevich neuron, and synthesis and physical implementation are carried out on an FPGA board.

# *2 The proposed neural network*

Given the biological plausibility and power efficiency of neuromorphic platforms, developing deep SNNs for these platforms is a natural step. These networks are not precise and are not usually regarded as deep learning methods. On the other hand, SNNs are efficient for simulating brain function to solve complicated problems in the field of intelligent objects. The proposed architecture is the simplest deep structure: it is fully connected and consists of input, hidden, and output layers. Fig. 1 shows the overall structure of the deep SNN with its layers. The structure provides a neuronal population with hidden layers that can be applied to medical images. The input layer learns to pre-process the input. Information is then sent to a series of hidden layers, which can vary in number. As the information propagates through the hidden layers, more complex features are extracted and learned. The output layer performs classification and detects the tissues in the input images, usually by softmax. The proposed SNN contains Izhikevich neurons. Data flow strictly one way, from the input to the output units; processing can take place in several layers of neurons, but there is no feedback.

Deep spiking neural networks can be an ideal choice for bridging biologically plausible learning algorithms and traditional learning methods in neural networks. An important limitation is the lack of training algorithms that exploit the specific capabilities of spiking neuron models; most methods use rate-based approximations of conventional DNNs. Deep SNNs might still be suitable because approximate results can be obtained faster and more efficiently than with traditional systems, especially if the SNN is implemented on a neuromorphic hardware platform. Furthermore, designing and analyzing training algorithms for SNNs is more difficult because SNNs compute in a discontinuous and asynchronous manner [23].

In recent decades, a new approach to cellular learning has emerged that focuses on the temporal order of spikes instead of their frequency. This learning rule is known as spike-timing-dependent plasticity (STDP). STDP describes the activity-dependent development of neural systems in terms of long-term potentiation and long-term depression, and it has become popular because it combines computational power with biological plausibility [24]. To train the network weights, an STDP algorithm is applied together with a gradient algorithm, which is a kind of reinforcement training. The network is used to improve the classification of benign and malignant tumors. The output of the SNN categorizes two types of images, which are recognized at frequencies of 11 Hz and 80 Hz for images with and without tumors, respectively.



**Figure 1:** The suggested deep spiking neural network

By using Izhikevich neurons as biologically plausible units, the equations of the neuron are as follows [8], [9]:

$$
\frac{dv}{dt} = (0.04 \times v^2) + (5 \times v) + 140 - u + I \tag{1}
$$

$$
\frac{du}{dt} = a \times ((b \times v) - u) \tag{2}
$$

$$
\text{if } v \ge v_{th}: \quad \begin{cases} v \leftarrow c \\ u \leftarrow u + d \end{cases} \tag{3}
$$
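As a minimal software sketch of Eqs. (1)-(3), the following Python function integrates the model with a forward-Euler step. The parameter values (a = 0.02, b = 0.2, c = -65, d = 8, the regular-spiking set given in [9]), the 30 mV threshold, the 0.5 ms step, and the name `simulate_izhikevich` are illustrative assumptions; they are not values reported in this paper.

```python
def simulate_izhikevich(current, t_ms=1000.0, dt=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0, v_th=30.0):
    """Forward-Euler integration of Eqs. (1)-(3); returns spike times in ms.

    All defaults are assumptions (regular-spiking parameters from [9]).
    """
    v = c          # membrane potential, started at the reset value
    u = b * v      # recovery variable
    spikes = []
    for n in range(int(t_ms / dt)):
        if v >= v_th:                  # Eq. (3): spike detected, apply reset
            spikes.append(n * dt)
            v = c
            u = u + d
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current   # Eq. (1)
        du = a * (b * v - u)                                 # Eq. (2)
        v += dt * dv
        u += dt * du
    return spikes
```

Calling `simulate_izhikevich(10.0)` returns the list of spike times for a constant input of 10, while a zero input leaves the neuron at rest under these assumed parameters.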

The STDP learning rule is given by [25]:

$$
W(x) = \begin{cases} A_+ \exp\left(-\frac{x}{\tau_+}\right) & \text{for } x > 0 \\ -A_- \exp\left(-\frac{x}{\tau_-}\right) & \text{for } x < 0 \end{cases} \tag{4}
$$

where $A_+$ and $A_-$ are the amplitudes of the weight changes, $\tau_+$ and $\tau_-$ are 10 ms, $x$ is the time difference between the postsynaptic and presynaptic spikes, and $W$ is the change in synaptic weight. To evaluate the speed, accuracy, and flexibility of the proposed spiking neural network, it is implemented on an FPGA.
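The learning window of Eq. (4) can be sketched in a few lines of Python. The amplitudes `a_plus` and `a_minus` are placeholder values chosen for illustration, and only the 10 ms time constants come from the text. The depression branch is written with $|x|$ so that it decays as the spikes move apart, which is the usual form of the STDP window; this is an assumption about the intended sign of the exponent for $x < 0$.

```python
import math

def stdp_window(x_ms, a_plus=0.01, a_minus=0.012,
                tau_plus=10.0, tau_minus=10.0):
    """Weight change for a spike-time difference x_ms (assumed post minus pre)."""
    if x_ms > 0:       # potentiation branch of Eq. (4)
        return a_plus * math.exp(-x_ms / tau_plus)
    if x_ms < 0:       # depression branch, decaying with |x| (assumption)
        return -a_minus * math.exp(-abs(x_ms) / tau_minus)
    return 0.0         # Eq. (4) leaves x = 0 undefined; zero is assumed here
```

With these placeholder amplitudes, a pairing at $x = 10$ ms yields a weight increase of $A_+ e^{-1}$, and the magnitude of the change shrinks as $|x|$ grows on either side.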

# *3 Hardware design*

Many analog and digital brain-inspired electronic systems have been proposed for fast simulation of spiking neural networks. While these architectures are suitable for realizing the computational features of large-scale models of the nervous system, constructing physical devices that operate intelligently in the real world and display cognitive competence remains an open challenge. Designing and efficiently implementing these structures in hardware yields a processing system based on the structure of the brain. Analog circuits are sensitive to fabrication process variations and ambient temperature, and designing circuits that perform reliably under a wide range of external factors is difficult. As a result, simulation results and the behavior of the analog implementation often disagree. Furthermore, a very large scale integrated (VLSI) implementation is not easily reconfigured. Consequently, a rapid prototyping platform for neural models with flexibility comparable to that of general-purpose microprocessors is essential, and an FPGA is an ideal technology for this purpose. Digital computation does consume more silicon area and power per operation than its analog counterpart, but it offers other merits: low cost, flexibility, reliability, and digital precision make FPGAs a promising alternative to analog VLSI approaches for designing neuromorphic systems. A digital implementation of the spiking neural network is chosen here for its speed, high precision, and flexible storage structure. In practice both analog and digital implementations are used, but analog implementations are more restricted than digital ones, so a reconfigurable and programmable device such as an FPGA can be an ideal option.

The small, smart systems used in modern day-to-day applications, and the possibility of connecting them to a computer, require neural network hardware in a small volume. In this structure, a large number of neurons are packed together in order to implement the network at a large scale. The differential equations of the neuron model are solved with the recursive Euler method.
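Applied to Eqs. (1) and (2), the Euler method replaces each derivative with a finite difference over a step of size $h$. A sketch of the resulting recursion is given below; the step size $h$ and the sample index $n$ are notation introduced here for illustration and do not appear in the original equations.

$$
v[n+1] = v[n] + h\left(0.04\,v[n]^2 + 5\,v[n] + 140 - u[n] + I[n]\right)
$$

$$
u[n+1] = u[n] + h\,a\left(b\,v[n] - u[n]\right)
$$

After each update, the reset of Eq. (3) is applied whenever the membrane potential reaches the threshold. Each step needs only multiplications and additions, which is what makes the recursion suitable for a digital datapath.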



**Figure 2:** General structure of the proposed hardware

Fig. 2 presents an overview of the proposed hardware, which consists of the training unit (TU), the coefficient matrix (CM), the control unit (CU), and the neural population. The TU trains the neurons by updating their weights. The CM stores the weight values, neuron parameters, and other network variables. The CU produces the control signals needed for training the proposed network and controls the conditions for passing data through the computational units. The neuronal population consists of biological neuron models that are used in all three layers of the network. This unit is employed to evaluate the proposed CNN on an FPGA, with tumor detection as a case study discussed below. As shown in Fig. 3, each Izhikevich neuron is scheduled in six states, and this scheduling diagram is described in Verilog HDL as a hardware neuron unit. Fig. 4 shows the general structure of the Izhikevich neuron in terms of logic units. Accordingly, a hardware architecture is proposed based on combinational circuits such as multiplexers, multipliers, and arithmetic logic units (ALUs). Each part of this architecture operates on the Izhikevich neuron equations discretized with the Euler method. Using these functional units, the spiking neuron model can be described in HDL code.

**Figure 3:** Scheduling diagram for the Izhikevich neuron
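To illustrate how one Euler update maps onto multipliers and adders working on fixed-point numbers, the following Python sketch emulates the arithmetic with integers. The Q16.16 format (`FRAC_BITS = 16`), the helper names, and the operand values in the usage note are assumptions made for illustration; the word length and scheduling of the actual design are not stated here, and this is not the authors' Verilog description.

```python
FRAC_BITS = 16                  # assumed format: 16 fractional bits
ONE = 1 << FRAC_BITS

def to_fixed(x):
    """Convert a float to a fixed-point integer (round to nearest)."""
    return int(round(x * ONE))

def to_float(q):
    """Convert a fixed-point integer back to a float."""
    return q / ONE

def fx_mul(p, q):
    """Fixed-point multiply: full-width product, then arithmetic right shift."""
    return (p * q) >> FRAC_BITS

# Constants of Eq. (1); in hardware these would be precomputed words.
K004 = to_fixed(0.04)
K5 = to_fixed(5.0)
K140 = to_fixed(140.0)

def euler_step_fixed(v, u, i_in, a, b, h):
    """One Euler step of Eqs. (1)-(2); every argument is a fixed-point integer."""
    dv = fx_mul(K004, fx_mul(v, v)) + fx_mul(K5, v) + K140 - u + i_in
    du = fx_mul(a, fx_mul(b, v) - u)
    return v + fx_mul(h, dv), u + fx_mul(h, du)
```

For example, starting from v = -65 and u = -13 with I = 10, a = 0.02, b = 0.2 and h = 0.5, the floating-point update gives v = -61.5 and leaves u at -13; the fixed-point step reproduces this to within the quantization error of the assumed format.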

Figs. 5 and 6 show the output-layer neuron converging to 80 Hz for non-cancerous tumors and to 11 Hz for cancerous tumors, respectively. Accordingly, the proposed hardware system is designed to recognize cancerous and non-cancerous tumors at two different frequencies. Fig. 7 shows how the weights change for 8 different neurons.

# *4 Brain tumors in MR images and the proposed convolutional neural network*

A spiking convolutional neural network (SCNN) was used for tumor detection as a signal processing application in magnetic resonance imaging (MRI). Fig. 8 illustrates the basic architecture: after preprocessing, the input images are transformed into spikes. The weights trained by a non-spiking CNN are used in the spiking layers, and the neuron with maximum activity (spike frequency) is selected as the class of the image. The structure in Fig. 8 is one of the most significant CNN-to-SNN approaches for energy-efficient recognition; it employs weight normalization to reduce the performance loss. The MRI images are applied to the SCNN using STDP learning. There are 700 experimental MRI images as input, 80% of which are used for training and 20% for testing. All simulations were carried out in MATLAB.

**Figure 4:** The general structure of the Izhikevich neuron and the used blocks

**Figure 5:** Convergence of the neuron of the network output layer to 80 Hz frequency

**Figure 6:** Convergence of the neuron of the network output layer to 11 Hz frequency

**Figure 8:** Architecture - The CNN is a two-way path in which the rotation on the input piece is performed in two ways.
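The 80/20 split and the frequency-coded decision described in the text can be sketched as follows. The nearest-frequency rule, the function names, and the labels are assumptions made for illustration; only the image count, the split ratio, and the two reference frequencies (11 Hz with tumor, 80 Hz without) come from the text.

```python
def split_counts(n_images=700, train_fraction=0.8):
    """Number of training and test images for the stated 80/20 split."""
    n_train = round(n_images * train_fraction)
    return n_train, n_images - n_train

def firing_rate_hz(spike_times_ms, window_ms):
    """Mean firing rate of an output neuron over the observation window."""
    return 1000.0 * len(spike_times_ms) / window_ms

def classify_by_rate(rate_hz, tumor_hz=11.0, no_tumor_hz=80.0):
    """Pick the label whose reference frequency is closest (assumed rule)."""
    if abs(rate_hz - tumor_hz) <= abs(rate_hz - no_tumor_hz):
        return "tumor"
    return "no tumor"
```

With 700 images the split gives 560 training and 140 test images. An output neuron that fires 11 times in a one-second window is labelled "tumor" under this rule.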

Fig. 9 illustrates the performance of the proposed architecture during feature shaping; according to the Dice criterion, it is an improvement in all areas of the tumor.
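The Dice criterion measures the overlap between a predicted tumor mask $P$ and a reference mask $T$ as $2|P \cap T| / (|P| + |T|)$. A minimal Python sketch for flattened binary masks is shown below; the function name and the convention of returning 1.0 for two empty masks are assumptions.

```python
def dice_coefficient(pred, truth):
    """Dice overlap of two flat binary masks (sequences of 0/1 values)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (assumed convention).
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A value of 1 means the masks coincide and 0 means they do not overlap at all.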

# *5 Physical Implementation*

The proposed SCNN is used to recognize tumors in MRI images, which are categorized by the presence or absence of a tumor as shown in Fig. 10. Based on this figure, a physical implementation can be proposed for detecting the difference between these two categories. Fig. 11 shows the final results of the physically implemented network, with which the images with and without tumors (displayed in Fig. 10) are recognized at frequencies of 11 Hz and 80 Hz, respectively. The outcomes of the physical implementation are presented as experimental results that verify and validate the accuracy of the proposed method. The hardware utilization summary in Table 1 shows that an efficient implementation has been obtained. Table 2 compares the synthesis results of the present study with those of two other studies reported in the literature, and suggests that the SCNN costs less than a CNN on a Virtex-6 FPGA ML605 board. The merits of the SCNN as a choice for hardware implementation are evident.

**Figure 9:** Results of applying the proposed method based on the criterion on one of the database images

**Figure 10:** Two MRI images: a) without and b) with tumors

**Table 1:** Device utilization summary





**Figure 11:** Results of the physical implementation of the proposed SNN: (a) for images with tumor at frequency 11 Hz (b) for images without tumor at frequency 80 Hz

**Table 2:** Comparison of the synthesis results of the study with the previously published works



# *6 Conclusions*

In this paper, an automated segmentation of brain lesions based on deep convolutional neural networks was presented. The proposed architecture is a considerable improvement in voxel-based classification with regard to neighboring information and the number of features. Besides MR images, the proposed method can also be applied to contrast-enhanced scan images. Moreover, with appropriate training, a wide range of medical images taken with different devices can also be covered. A spiking neural network was used to detect benign and malignant tumors. Furthermore, hardware design and digital implementation on the FPGA improved the speed, accuracy, and flexibility of the proposed spiking network, which combines the Izhikevich neuron model and the STDP learning rule. The experimental test results confirmed the accuracy of the proposed model. To the best of our knowledge, while there are research studies on medical image processing by deep learning, particularly convolutional neural networks, no hardware implementation of an SCNN that uses medical images as a dataset has been reported. Our dataset is a special one based on experimental work, not a general dataset. Our hardware implementation, based on a Virtex-6 FPGA ML605 board, provides a hardware module that is appropriate for MRI-embedded devices and reduces human error. Based on the results of the study, physical implementation is feasible and recommended.

# *7 References*

- 1. A. Putnam et al., "A reconfigurable fabric for accelerating large-scale datacenter services," ACM SIGARCH Computer Architecture News, vol. 42, no. 3, pp. 13–24, 2014. https://doi.org/10.1145/2678373.2665678
- 2. A. Pullini et al., "A heterogeneous multi-core system-on-chip for energy efficient brain inspired computing," IEEE Transactions on Circuits and Systems II (TCAS-II): Express Briefs, 2017. https://doi.org/10.1109/TCSII.2017.2652982
- 3. C. Zhang et al., "Optimizing FPGA-based accelerator design for deep convolutional neural networks," Proceedings of the ACM/SIGDA Int Symp on Field-Programmable Gate Arrays, pp. 161-170, 2015. https://doi.org/10.1145/2684746.2689060
- 4. M. Motamedi et al., "Design space exploration of FPGA-based deep convolutional neural networks," Proceedings of Asia and South Pacific Design Automation Conference (ASP-DAC), pp. 575-580, 2016. https://doi.org/10.1109/ASPDAC.2016.7428073
- 5. A. Rahman, J. Lee, and K. Choi, "Efficient FPGA acceleration of Convolutional Neural Networks using logical-3D compute array," Proceedings of Design, Automation and Test in Europe (DATE), pp. 1393-1398, 2016. https://doi.org/978-3-9815370-7-9/DATE16
- 6. F. Ponulak and A. Kasinski, "Introduction to spiking neural networks: Information processing, learning and applications," Acta Neurobiologiae Experimentalis, vol. 71, no. 4, pp. 409–433, 2011. https://www.ncbi.nlm.nih.gov/pubmed/22237491
- 7. A. L. Hodgkin and A. F. Huxley, "A quantitative description of membrane current and its application to conduction and excitation in nerve," Journal of Physiology, vol. 117, no. 4, pp. 500–544, Aug. 1952. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1392413/
- 8. E. M. Izhikevich, "Which model to use for cortical spiking neurons?" IEEE Trans. Neural Netw., vol. 15, no. 5, pp. 1063-1070, Sep. 2004. https://doi.org/10.1109/TNN.2004.832719
- 9. E. M. Izhikevich, "Simple model of spiking neurons," IEEE Trans. Neural Netw., vol. 14, no. 6, pp. 1569-1572, Nov. 2003. https://doi.org/10.1109/TNN.2003.820440
- 10. R. FitzHugh, "Impulses and physiological states in theoretical models of nerve membrane," Biophysical Journal, vol. 1, no. 6, pp. 445–466, Jul. 1961. https://doi.org/10.1016/S0006-3495(61)86902-6
- 11. J. Nagumo, S. Arimoto and S. Yoshizawa, "An active pulse transmission line simulating nerve axon," Proc. Inst. Radio Eng., vol. 50, no. 10, pp. 2061–2070, Oct. 1962. https://doi.org/10.1109/JRPROC.1962.288235
- 12. R. M. Rose and J. L. Hindmarsh, "The assembly of ionic currents in a thalamic neuron I. The three-dimensional model," Proceeding of the Royal Society, vol. 237, no. 1288, pp. 267–288, Aug. 1989. https://doi.org/10.1098/rspb.1989.0049
- 13. H. Paugam-Moisy and S. Bohte, "Computing with spiking neuron networks," in Handbook of natural computing, Springer Berlin Heidelberg, pp. 335-376, 2012. https://doi.org/10.1007/978-3-540-92910-9_10
- 14. G. Indiveri, E. Chicca, and R. Douglas, "A VLSI array of low-power spiking neurons and bistable synapses with spike-timing dependent plasticity," IEEE Trans. Neural Netw., vol. 17, no. 1, pp. 211–221, Jan. 2006. https://doi.org/10.1109/TNN.2005.860850
- 15. R. Tapiador-Morales, A. Linares-Barranco, A. Jimenez-Fernandez, G. Jimenez-Moreno, "Neuromorphic LIF row-by-row multiconvolution processor for FPGA," IEEE Transactions on Biomedical Circuits and Systems, vol. 13, no. 1, pp. 159-169, Feb. 2019. https://doi.org/10.1109/TBCAS.2018.2880012
- 16. T. Naka and H. Torikai, "A Novel Generalized Hardware-Efficient Neuron Model Based on Asynchronous CA Dynamics and Its Biologically Plausible On-FPGA Learnings," IEEE Trans. Circuits Syst. II, Express Briefs, vol. 66, no. 7, pp. 1247–1251, Jul. 2019. https://doi.org/10.1109/TCSII.2018.2876974
- 17. G. Karimi, M. Gholami, and E. Z. Farsa, "Digital implementation of biologically inspired Wilson model, population behavior, and learning," *Int. J. Circuit Theory Appl.*, vol. 46, no. 4, pp. 965–977, Apr. 2018. https://doi.org/10.1002/cta.2457
- 18. E. Z. Farsa, S. Nazari, and M. Gholami, "Function approximation by hardware spiking neural network," *J. Comput. Electron.*, vol. 14, no. 3, pp. 707–716, Sep. 2015. https://doi.org/10.1007/s10825-015-0709-x
- 19. A. Jiménez-Fernández et al., "A binaural neuromorphic auditory sensor for FPGA: A spike signal processing approach," IEEE Trans. Neural Netw. Learn. Syst., vol. 28, no. 4, pp. 804–818, Apr. 2016. https://doi.org/10.1109/TNNLS.2016.2583223
- 20. S. Yang et al., "Digital implementations of thalamocortical neuron models and its application in thalamocortical control using FPGA for Parkinson's disease," Neurocomputing, vol. 177, pp. 274-289, 2016. https://doi.org/10.1016/j.neucom.2015.11.026
- 21. E. Zaman Farsa, A. Ahmadi, M. A. Maleki, M. Gholami, and H. Nikafshan Rad, "A low-cost high-speed neuromorphic hardware based on spiking neural network," IEEE Trans. Circuits Syst. II, Express Briefs, Jan. 2019. https://doi.org/10.1109/TCSII.2019.2890846
- 22. E. I. Guerra-Hernandez, A. Espinal, P. Batres-Mendoza, C. Garcia-Capulin, R. D. J. Romero-Troncoso, H. Rostro-Gonzalez, "A FPGA-based neuromorphic locomotion system for multi-legged robots," IEEE Access, vol. 5, pp. 8301-8312, Apr. 2017. https://doi.org/10.1109/ACCESS.2017.2696985
- 23. M. Pfeiffer and T. Pfeil, "Deep learning with spiking neurons: Opportunities & challenges," *Frontiers in Neuroscience*, vol. 12, pp. 774, 2018. https://doi.org/10.3389/fnins.2018.00774
- 24. H. Markram, W. Gerstner, P. J. Sjöström, "A history of spike-timing-dependent plasticity," *Front. Synap. Neurosci.*, vol. 3, 2011. https://doi.org/10.3389/fnsyn.2011.00004
- 25. M. R. Azghadi et al., "Efficient design of triplet based spike-timing dependent plasticity," IEEE International Joint Conference on Neural Networks (IJCNN), 2012. https://doi.org/10.1109/IJCNN.2012.6252820
- 26. A. Azarmi Gilan et al., "FPGA-based Implementation of a Real-Time Object Recognition System using Convolutional Neural Network," IEEE Transactions on Circuits and Systems II: Express Briefs, 2019. https://doi.org/10.1109/TCSII.2019.2922372



Copyright © 2019 by the Authors. This is an open access article distributed under the Creative Commons Attribution (CC BY) License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Arrived: 27. 09. 2019 Accepted: 17. 12. 2019

https://doi.org/10.33180/InfMIDEM2019.402



Journal of Microelectronics, Electronic Components and Materials Vol. 49, No. 4(2019), 203 – 210

# *Design of Priority Based Reconfigurable Router in Network on Chip*

*D. David Neels Ponkumar¹, V. Sushmitha², K. Saravanan³*

*¹ Dr. Sivanthi Aditanar College of Engineering, Department of Electronics Communication Engineering, Tamil Nadu, India*

*² Chandy College of Engineering, Department of Electronics and Communication Engineering, Tamil Nadu, India*

*³ SAINTGITS College of Engineering and Technology, Department of Electronics and Communication Engineering, Kerala, India*

**Abstract:** Network on Chip (NoC) is an advanced integration approach for on-chip communication networks that also provides an alternative to traditional bus-based System on Chip (SoC) designs. The router is a key component and is considered the backbone of communication in an NoC. The objective of this work is to design a priority-based reconfigurable router for NoC. Initially, a 4x4 baseline router is designed and synthesised, and then the channels inside the router are modified to achieve reconfiguration and improve the router's efficiency. In the 4x4 reconfigurable router the slots are well utilized, but prioritization is not considered. Routers are associated with switches that take data transfer decisions, resulting in high power consumption. To overcome this problem, a new priority-based reconfigurable router is designed. The router is designed in Verilog HDL, synthesized with Xilinx ISE Design Suite 14.3, and simulated with ModelSim-Altera 6.5b. The results in terms of power, energy efficiency, area and delay are analysed, and the proposed design gives better results than the conventional baseline router.

**Keywords:** Router; NoC; On Chip Power; Area; Energy Efficiency; Delay; Priority; Reconfigurable Router

# *Dizajn prioritetno nastavljivega usmerjevalnika v omrežju na čipu*

**Izvleček:** Omrežje na čipu (NoC) je napredna integracija komunikacijskih omrežij z uporabo tradicionalnih sistemov na čipu na osnovi vodil (SoC). Usmerjevalnik predstavlja ključno komponento v komunikaciji preko NoC. Namen dela je razvoj prioritetno nastavljivega usmerjevalnika za NoC. V osnovi je uporabljen 4x4 usmerjevalnik, v katerem so, za doseganje učinkovitosti, kanali naknadno razširjeni. Reže v nastavljivem usmerjevalniku so dobro uporabljene, vendar ne upoštevajo prioritet. Usmerjevalniki so povezani s stikali za prenos podatkov, ki imajo visoko porabo. Nastavljivi usmerjevalniki rešujejo problem visoke porabe. Dizajn usmerjevalnika je realiziran v Verilog HDL okolju in simuliran z Xilinx ISE Design Suite 14.3 in ModelSim-Altera 6.5b programsko opremo. Rezultati izkazujejo boljšo učinkovitost, nižjo porabo in zakasnitev predlaganega usmerjevalnika.

**Ključne besede:** Router; NoC; On Chip Power; Area; Energy Efficiency; Delay; Priority; Reconfigurable Router

*\* Corresponding Author's e-mail: david26571@gmail.com*

# *1 Introduction*

Moore's law has been the key driver of Integrated Circuit technology for almost five decades. Although the rate is projected to slow to a doubling every 3 years in the next few years for fixed chip sizes, the exponential trend is still in force. Because of this evolution, the system-level focus moves in steps: as the technology for a given implementation style matures, it leads to a paradigm shift. Past examples of such shifts were moving from room to rack level systems (LSI, 1970s) and later from rack to board level systems (VLSI, 1980s). This trend allowed the introduction of Systems on Chip (SoC) in the 1990s, integrating many components such as microprocessors and custom IPs, together with many processing elements and memory cores, in a single chip [1, 2]. In turn, this created a communication overhead that traditional bus-based architectures cannot handle for a number of reasons. Network on Chip (NoC) [3, 4] is a good paradigm for solving these problems. A NoC is an integrated network that uses routers to allow communication among those blocks. It applies methods from networking theory to on-chip communication, where the blocks exchange information on a chip [5].

# *2 Network on chip*

NoC technology is a new approach to communication [6, 7] that enables not only more efficient interconnects but also more efficient design and verification processes for modern SoCs. It addresses the signalling needs of various communication protocols while reducing the complexity of the chip's interconnects. Communication through the NoC is performed by Processing Elements (PEs) over a network fabric composed of switches or routers [8] connected by physical channels. A typical NoC based MPSoC is composed of a number of PEs such as CPUs, custom IPs, DSPs and storage elements (embedded memory blocks).

# *3 Proposed design*

The proposed architecture is a Priority based Reconfigurable Router for NoC. The architecture consists of a Reconfigurable Router with four channels, a Priority based Scheduler, a Buffer and an Arbiter Unit. This section describes the design of the Reconfigurable Router along with the buffer and arbiter unit.

#### *3.1 Router design*

Router architectures dominated early NoC research [9], and the first NoC designs proposed the use of simplistic routers with deterministic routing algorithms in terms of RTL design. Since the router [10] is a component that will be used in every future version of the system, and its architecture options may be either revised or coexist in the same architecture (heterogeneous NoCs [11]), it should be designed as a reusable IP block [12].

The basic block diagram of a NoC Router is shown in figure 1 and its major components are listed below.

- The input/output buffers that temporarily store flits.
- The output port allocation logic which selects the output port for each flit/packet.
- The switch fabric which makes the physical connection from input to output port.
- The control logic that is responsible for overall synchronization.

#### *3.2 Buffer*

Buffers are the greatest power consumers [13, 14]. Thus, efficient buffer design is critical for achieving a good performance/area/power trade-off. To minimize the implementation cost, the on-chip network has to be implemented with little area overhead. Thus, unlike off-chip networks, which feature large memories, NoCs typically use small registers for buffering. A major advantage of using registers over large memories is the significant reduction of the address decoding/encoding latency and the access latency [15]. The FIFO scheduler is designed to fit inside the router. Four FIFOs are used in the design of this router, each 9 bits wide and 16 locations deep. The FIFO operates on the system clock, and its active-low reset is synchronized to that clock. Sizing the buffers for the worst-case scenario compromises routing area and power consumption while improving the throughput. If the depth is small, latency increases, leading to a compromise in QoS.

Figure 2 shows the internal design of a FIFO Buffer module of our proposed design. It has two major units, namely the Data Path unit and the Control unit. The Control unit receives control signals such as Credit\_in, In\_request and Out\_request. It has sub-modules such as an FSM, a Counter and a Decoder. The minimum buffer dimensions (width and depth) are functions of the switching mode, packet size, flit size and expected traffic. One of the vital goals in designing NoCs is to minimize the buffer size by striking a trade-off between latency and throughput degradation.

**Figure 1:** NoC Router Block Diagram

**Figure 2:** Internal design of a FIFO Buffer module

Upon arrival of a packet at the input port, it is stored in the FIFO Buffer flit by flit, one flit per clock cycle, provided that free space is available. Simultaneously, the FIFO Buffer is evacuated as long as the output port has free space. The evacuation stops when the control signal credit\_in entering the arbiter becomes invalid, indicating that no free space is available. However, storage in the current buffer continues. The control signal credit\_out representing this port is then sent as back pressure to the adjacent source node when the FIFO Buffer becomes almost full. As soon as the first flit reaches the first location (Shift Register SR) of the FIFO, which is connected directly to the router, the header flit notifies the arbiter about the arrival of the flit/packet at the port and provides the destination address to the Direction Decoder to calculate the direction.

Using this mechanism, eight buffer locations are sufficient to obtain the minimum latency in the absence of blocking. However, the buffer size is made parameterizable and chosen as 16 to allow fair reallocation among the four channels, to accommodate the reconfiguration process during contention, and to make a fair comparison with other designs. Hence, after a systematic Design Space Exploration (DSE), the buffer depth is fixed at 16 locations for our design.

#### **Write Operation:**

- Signal data is sampled at the rising edge of the clock when write\_enable is high.
- Write operation takes place only when the FIFO is not full, in order to avoid an overrun condition.

#### **Read operation:**

- The data is read from data\_out at rising edge of the clock, when read\_enable is high.
- Read operation takes place only when the FIFO is not empty, in order to avoid an underrun condition.
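The write and read rules above can be illustrated with a minimal behavioural sketch in Python. This is not the authors' Verilog implementation: the class name `Fifo`, the method `clock` and the choice to evaluate the full and empty conditions on the state before the clock edge are assumptions made for illustration. Only the 9-bit width, the depth of 16 and the full/empty guards come from the text.

```python
class Fifo:
    """Behavioural model of one router FIFO (9-bit words, 16 locations)."""

    def __init__(self, width=9, depth=16):
        self.width = width
        self.depth = depth
        self.slots = []  # oldest word first

    @property
    def full(self):
        return len(self.slots) == self.depth

    @property
    def empty(self):
        return len(self.slots) == 0

    def clock(self, write_enable=False, data_in=0, read_enable=False):
        """Model one rising clock edge; return the word read, or None."""
        # Assumption: both guards use the state seen before the edge.
        do_read = read_enable and not self.empty    # avoids underrun
        do_write = write_enable and not self.full   # avoids overrun
        data_out = self.slots.pop(0) if do_read else None
        if do_write:
            # Keep only the lowest `width` bits of the incoming word.
            self.slots.append(data_in & ((1 << self.width) - 1))
        return data_out


fifo = Fifo()
for word in range(20):          # try to write 20 words into 16 locations
    fifo.clock(write_enable=True, data_in=word)
print(fifo.full, len(fifo.slots))
```

In this sketch the last four writes are ignored because the FIFO is already full, which mirrors the overrun guard described in the list.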

#### *3.3 Arbiter unit*

A Finite State Machine (FSM) is, in general, simply another name for a sequential circuit. Finite refers to the fact that the number of states the circuit can take is finite. A synchronous clocked FSM changes state only when a triggering edge occurs on the clock signal. It selects the operation to be carried out by the router. The FSM controller here is used to indicate the work to be done by the router [12]. In general, a synchronous circuit is a digital circuit in which the changes in the state of memory elements are synchronized by a clock signal. In a sequential digital logic circuit, data is stored in simple memory devices termed latches.

A synchronizer is a digital circuit that transfers an asynchronous signal into the recipient clock domain without causing any stability issues. This module provides synchronization between the router Finite State Machine and the router FIFO modules. Thus, it provides reliable communication between all four input ports and output ports. The register module is designed using four internal registers that hold a header byte, a FIFO full state byte, an internal parity byte and a packet parity byte. All the registers are latched on the rising edge of the clock.

#### *3.4 Reconfigurable router*

If a NoC router has a larger FIFO buffer, the network will have higher throughput [16] and smaller latency, as there will be fewer flits stagnant on the network. Since each communication has its own peculiarities, sizing the FIFO for the worst case communication scenario will compromise not only the routing area, but power [17] as well. However, if the router has a small FIFO depth, the latency will be larger and the quality of service (QoS) can be compromised. The solution is to have a heterogeneous router [18], in which each channel can have a different buffer size. In this situation, if a channel has a communication rate smaller than its neighbour, it may lend some of its unused buffer slots, as explained in stages in figure 3. In figure 3.a all the channels are designed with a depth of 4 slots. The south channel is filled with 9 slots by borrowing three free slots from west and two free slots from east, along with its own four slots, as shown in figure 3.b. Thus, the final reconfiguration of the south channel with depth 9 is shown in figure 3.c. In a different communication pattern, the roles may be reversed or changed at run time, without a redesign step [19]. When the traffic is light, there is no waiting time and no reconfiguration, so the buffer depth has no impact on latency.

**Figure 3:** a) Router Design with depth 4; b) Need for Reconfiguration; c) Reconfiguration of the Buffer to attend the Need

Our proposed idea is to design a router having 4 channels, namely the East, West, South and North Channels. All four channels have separate buffer slots along with FSM and Registers, as in the 4x4 Baseline Router. The channels inside the router are then modified to achieve reconfiguration. Reconfiguration in a router works according to the bandwidth needed in the channel. Initially, the buffer depth of all channels is decided during design. In our design, the buffer depth is fixed at 16. If the east channel is filled with data, the data can be transferred to a neighbouring channel that is using fewer slots. Data loss when storing large amounts of data is avoided by this process, known as the reconfigurable mechanism.
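The slot lending shown in figure 3 can be illustrated with a small behavioural sketch in Python. It is not the authors' hardware design: the function name `lend_slots`, the per-channel occupancy values and the order in which neighbours are asked are assumptions chosen so that the result matches the figure (south grows from 4 to 9 slots by borrowing 3 from west and 2 from east).

```python
def lend_slots(depths, used, borrower, needed):
    """Return new per-channel depths after `borrower` grows to `needed` slots.

    depths: channel -> slots currently allocated to that channel
    used:   channel -> slots currently occupied in that channel
    Lenders give away only free slots (assumed policy: dictionary order).
    """
    new_depths = dict(depths)
    shortfall = needed - new_depths[borrower]
    for lender in depths:
        if shortfall <= 0:
            break
        if lender == borrower:
            continue
        free = new_depths[lender] - used[lender]
        take = min(free, shortfall)
        new_depths[lender] -= take
        new_depths[borrower] += take
        shortfall -= take
    if shortfall > 0:
        raise ValueError("not enough free slots to lend")
    return new_depths


# Figure 3.a: every channel starts with a depth of 4 slots.
depths = {"north": 4, "east": 4, "west": 4, "south": 4}
# Hypothetical occupancy: north is busy, east has 2 free, west has 3 free.
used = {"north": 4, "east": 2, "west": 1, "south": 4}

result = lend_slots(depths, used, borrower="south", needed=9)
print(result)  # {'north': 4, 'east': 2, 'west': 1, 'south': 9}
```

The total number of slots stays at 16 before and after lending; only their assignment to channels changes.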

The design of the east channel with the novel reconfiguration mechanism is shown in figure 4. Priority [20], a crucial part in networking, was not considered in our primary design. Routers are associated with switches that take data transmission decisions, resulting in higher power consumption [21]. To overcome this problem of higher power consumption, a new priority based reconfigurable router is designed. Here, the priority is assigned at the slot allocator stage which is incorporated in the router.

**Figure 4:** Proposed Design of Reconfigurable East Channel

Device: Spartan3e family, part xc3s100e, package cp132, commercial temperature grade, typical process, ambient temperature 25.0 °C.

| On-Chip      | Power (mW) | Used | Available | Utilization (%) |
|--------------|------------|------|-----------|-----------------|
| Clocks       | 2.62       |      |           |                 |
| Logic        | 0.14       | 476  | 1920      | 25              |
| Signals      | 0.39       | 878  |           |                 |
| IOs          | 0.56       | 43   | 83        | 52              |
| Static Power | 33.70      |      |           |                 |
| Total        | 37.41      |      |           |                 |

| Supply Source | Voltage (V) | Total Current (A) | Dynamic Current (A) | Quiescent Current (A) |
|---------------|-------------|-------------------|---------------------|-----------------------|
| Vccint        | 1.200       | 0.011             | 0.003               | 0.008                 |
| Vccaux        | 2.500       | 0.008             | 0.000               | 0.008                 |
| Vcco25        | 2.500       | 0.002             | 0.000               | 0.002                 |

Supply power: 0.037 W total, 0.004 W dynamic, 0.034 W quiescent.

**Figure 5:** Power results of 4x4 Baseline Router

During typical router operation, an incoming flit is first successfully received and possibly stored in the input queue. Second, the output port request for the incoming flit is determined based on the flit destination address according to the routing algorithm. Third, the output port allocator receives the flit [22, 23] requests and allocates the output ports according to priority. Finally, as soon as a flit is granted a port it is routed through the switch fabric to the granted output port to reach the neighbouring router or destination node.
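The third step, allocation of output ports according to priority, can be sketched as follows. This is a simplified Python illustration and not the authors' slot allocator: the function name `allocate_output_ports`, the convention that a larger number means higher priority, and the tie-breaking rule (the earlier request wins) are all assumptions.

```python
def allocate_output_ports(requests):
    """Grant each requested output port to its highest-priority requester.

    requests: list of (input_port, output_port, priority) tuples.
    Returns a dict mapping output_port -> granted input_port.
    Assumption: larger priority wins; on a tie the earlier request wins.
    """
    best_priority = {}
    grants = {}
    for input_port, output_port, priority in requests:
        if output_port not in grants or priority > best_priority[output_port]:
            best_priority[output_port] = priority
            grants[output_port] = input_port
    return grants


# Two flits compete for the east output; a third asks for north alone.
requests = [
    ("north", "east", 1),
    ("west", "east", 3),
    ("south", "north", 2),
]
print(allocate_output_ports(requests))  # {'east': 'west', 'north': 'south'}
```

A flit that loses the allocation would stay in its input FIFO and request the port again in a later cycle; that retry loop is left out of the sketch.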

Delay: 9.012ns (Levels of Logic = 4); Source: resetn (PAD); Destination: vld\_out\_1 (PAD); Data Path: resetn to vld\_out\_1

| Cell:in->out | Fanout | Gate Delay (ns) | Net Delay (ns) | Logical Name (Net Name)                 |
|--------------|--------|-----------------|----------------|-----------------------------------------|
| IBUF:I->O    | 17     | 1.218           | 1.226          | resetn\_IBUF (resetn\_IBUF)             |
| LUT2:I0->O   | 130    | 0.704           | 1.468          | fifo[2].f/ or00001 (fifo[2].f/ or0000   |
| LUT4:I0->O   | 1      | 0.704           | 0.420          | s1/vld\_out\_21 (vld\_out\_2\_OBUF)     |
| OBUF:I->O    |        | 3.272           |                | vld\_out\_2\_OBUF (vld\_out\_2)         |
| Total        |        |                 |                | 9.012ns (5.898ns logic, 3.114ns route), 65.4% logic, 34.6% route |

**Figure 6:** Timing results of 4x4 Baseline Router

# *4 Results and discussion*

The 4x4 Baseline Router is designed in Verilog using Xilinx ISE software and simulated using ModelSim. The on-chip power consumption is found to be 37.41 mW, as shown in figure 5. The timing report in figure 6 shows that the synthesized design has a delay of about 9.012 ns.

For efficient utilization of the slots in the 4x4 Baseline Router, the need for reconfiguration of the slots in the channel is identified. The reconfiguration of slots is done according to the bandwidth requirement of the channel. For example, if the south channel is overflowing with data, the data can be transferred to a neighbouring channel that is using fewer slots. Four channels, one for each of the four directions, are used in the reconfigurable router. The register inside the east channel is provided with push and pop as input signals to perform the write and read operations respectively. Moreover, the output of the register is provided with full, empty, half, almost full and almost empty indications to show the level of traffic. From these, the availability of free space in a channel can be inferred. Similarly, the entire data transfer which takes place inside all four channels is analysed separately. Once the push operation is done, the data out fills the output channel, driving the full indication high. From the various full and empty indications, the need for reconfiguration is validated.

**Figure 7:** Simulation Waveform showing the functionality of 4x4 Reconfigurable Router

**Figure 8:** 4x4 Priority based Reconfigurable router
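The occupancy indications named above can be derived from a simple slot counter, as in the following Python sketch. The thresholds are not given in the text: the margin of two slots for "almost full" and "almost empty", the meaning of "half" as at least half of the depth, and the helper `reconfiguration_needed` are all hypothetical choices made for illustration.

```python
def status_flags(count, depth=16, almost_margin=2):
    """Derive the five occupancy indications from the number of used slots.

    `almost_margin` is an assumed threshold, not a value from the paper.
    """
    return {
        "empty": count == 0,
        "almost_empty": count <= almost_margin,
        "half": count >= depth // 2,
        "almost_full": count >= depth - almost_margin,
        "full": count == depth,
    }


def reconfiguration_needed(counts, depth=16):
    """Assumed rule: lend slots when one channel is full and another has room."""
    flags = {ch: status_flags(n, depth) for ch, n in counts.items()}
    any_full = any(f["full"] for f in flags.values())
    any_room = any(not f["almost_full"] for f in flags.values())
    return any_full and any_room


counts = {"north": 16, "east": 3, "west": 10, "south": 16}
print(status_flags(counts["east"]))
print(reconfiguration_needed(counts))  # True: north is full, east has room
```

In hardware these flags would be registered outputs of the FIFO control unit; here they are recomputed on demand to keep the example short.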

The function of the 4x4 Reconfigurable Router is obtained from the simulation results in figure 7. It shows that the west data inputs 77 and 74 cannot be stored in the corresponding west output port due to a scarcity of memory slots in south. These two data, 77 and 74, are instead stored in north's and south's empty slots respectively. Thus, the concept of reconfiguration is achieved in the simulations. Also, the on-chip power consumption is found to be 33.66 mW, which is a 38% reduction compared with the Baseline Router. The timing delay is found to be 4.731 ns, a further reduction in delay.

**Figure 9:** Comparison of performance for various routers with proposed Priority based Reconfigurable router a) On Chip Power; b) Energy Efficiency; c) Delay; d) Area in LUTs

The function of the 4x4 Priority based Reconfigurable Router is obtained from the simulation results shown in figure 8. The north data inputs 20, 21 and 22 cannot be stored in the corresponding north output port due to the scarcity of memory in the north channel. So the data 20, 21 and 22 are stored in east's, west's and south's empty slots respectively. Priority based reconfiguration is achieved in the sense that the data inputs of any channel can be accessed from any of the slots. The on-chip power consumption of the 4x4 Priority based Reconfigurable Router is found to be much lower at 15.54 mW, which is a 44% reduction in comparison with the 4x4 Baseline Router, and it has a further 36% reduction in delay at 3.050 ns.

Figure 9 shows the overall comparison of the designed routers. It is found that the designed Priority based Reconfigurable Router consumes 58.5% less power than the 4x4 Baseline Router. The number of Look Up Tables (LUTs) in the 4x4 Priority based Reconfigurable Router is reduced from 86 to 38, leading to an optimized router architecture with reduced area and power consumption. When the proposed design is implemented on a Spartan3e FPGA, the delay among the configurable logic blocks is reduced from 9.012 ns to 3.05 ns. Thus, this design leads to a high efficiency router too. The utilization of the available energy is also significantly improved: the energy efficiency rises from 42.02% to a maximum of 86.46%.
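The relative reductions can be recomputed from the raw power, delay and LUT figures reported in this section. The short Python sketch below does so for the comparison between the Baseline Router and the Priority based Reconfigurable Router; the helper name `reduction_percent` and the dictionary layout are assumptions made for illustration, and the same function can be applied to any other pair of reported values.

```python
def reduction_percent(before, after):
    """Relative reduction of `after` with respect to `before`, in percent."""
    return 100.0 * (before - after) / before


# Raw figures as reported in the text.
power_mw = {"baseline": 37.41, "priority": 15.54}
delay_ns = {"baseline": 9.012, "reconfigurable": 4.731, "priority": 3.050}
luts = {"baseline": 86, "priority": 38}

# Power, baseline to priority based router.
print(round(reduction_percent(power_mw["baseline"], power_mw["priority"]), 1))      # 58.5
# Delay, baseline to priority based router.
print(round(reduction_percent(delay_ns["baseline"], delay_ns["priority"]), 1))      # 66.2
# Delay, reconfigurable to priority based router.
print(round(reduction_percent(delay_ns["reconfigurable"], delay_ns["priority"]), 1))  # 35.5
# Area in LUTs, baseline to priority based router.
print(round(reduction_percent(luts["baseline"], luts["priority"]), 1))              # 55.8
```

The first value reproduces the 58.5% power reduction quoted above, and the third is close to the roughly 36% further delay reduction quoted for the priority based design.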

# *5 Conclusion*

The Baseline Router is designed using FIFOs, an FSM, Synchronizers and Registers with a single input channel and three output channels. Then, the channels are extended in all four directions to form the 4x4 Baseline Router. The design entry of this router is done using Verilog. The corresponding test fixtures are synthesized and the implementation is obtained using Xilinx ISE Design Suite 14.3. The on-chip supply power consumption is obtained using XPower Analyzer. The functionality of the router is verified from simulation waveforms in ModelSim-Altera 6.5b. The analysis shows that reconfiguration of the buffer slots reduces the power consumption of the router. Also, the efficient utilization of slots in the 4x4 Baseline Router improves its performance. So, the channels of the 4x4 Baseline Router are modified to achieve reconfiguration. The on-chip power consumption and delay of the 4x4 Reconfigurable Router are found to be reduced, with the power at 33.66 mW, which is a 44% reduction from the Baseline Router.

In the 4x4 Reconfigurable Router the slots are utilized, but priority, a crucial part in networking, was not considered. Routers are associated with switches that take data transmission decisions, resulting in higher power consumption. To overcome this problem, a new priority based reconfigurable router is designed, and it is found that the power consumption is reduced by 58.5% while the delay is reduced to 3.05 ns. The number of LUTs is reduced to 38. The utilization of the available energy is also significantly improved: the energy efficiency rises from 42.02% to a maximum of 86.46%. Thus, this design has led to a high efficiency router. The design of a Multiple Priority Traffic Class Router [24] can be considered for future work.

# *6 References*

- 1. A. Hemani, A. Jantsch, S. Kumar, A. Postula, J. Oberg, M. Milberg and D. Lindqvist, (2000), "Network on chip: Architecture for billion transistor era". In Proceedings of the IEEE Norchip Conference. https://doi.org/10.1109/ISVLSI.2002.1016885
- 2. Park, Dongkook, et al.(2007) "Design of a dynamic priority-based fast path architecture for on-chip interconnects. "15th Annual IEEE Symposium on-High-Performance Interconnects (HOTI 2007), IEEE Computer Society, 2007, Proceedings, 15 - 20. https://doi.org/10.1109/HOTI.2007.1
- 3. Ogras, U. Y., Marculescu, R., Gyu Lee, H., Choudhary, P., Marculescu, D., Kaufman, M., & Nelson, P. (2007). "Challenges and Promising Results in NoC Prototyping Using FPGAs". IEEE Micro, 27(5), 86–95. https://doi.org/10.1145/1255456.1255460.
- 4. Kiasari, Abbas Eslami, Zhonghai Lu, and Axel Jantsch, (2012), "An analytical latency model for networks-on-chip." IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume No: 21 Issue No: 1, 113-123. https://doi.org/10.1109/TVLSI.2011.2178620.
- 5. Gaurav Verma, Harsh Agarwal, Shreya Singh, Shaheem Nighat Khanam, Prateek kumar Gupta and Vishal Jain,(2016), "Design and implementation of router for NOC on FPGA", International Journal of Future Generation communication and networking Vol.9.No.12(2016),pp.263-272 https://doi.org/10.14257/ijfgcn.2016.9.12.24
- 6. Oveis-Gharan, M., & Khan, G. N. (2018). "Flexible Reconfigurable On-chip Networks for Multi-core SoCs". Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies - HEART 2018. https://doi.org/10.1145/3241793.3241813
- 7. Kakoee, Mohammad Reza, Valeria Bertacco, and Luca Benini, (2011), "ReliNoC: A reliable network for priority-based on-chip communication." Design, Automation & Test in Europe. IEEE, 2011. https://doi.org/10.1.1.281.3548.

- 8. Kumar, S., Jantsch, A., Soininen, J.-P., Forsell, M., Millberg, M., Oberg, J., Hemani, A. (n.d.).(2002), "A network on chip architecture and design methodology", Proceedings IEEE Computer Society Annual Symposium on VLSI. New Paradigms for VLSI Systems Design. ISVLSI 2002. https://doi.org/10-7695-1486-3
- 9. Concatto, C., Matos, D., Carro, L., Kastensmidt, F., Susin, A., & Kreutz, M. (2009), "NoC Optimization Using a Reconfigurable Router", IEEE Computer Society Annual Symposium on VLSI. https://doi.org/10.1109/ISVLSI.2009.7
- 10. Pau,R., & Manjikian, N. (2008),"High-level specification and logic implementation of single-chip multiprocessor systems based on a configurable router" Canadian Conference on Electrical and Computer Engineering , 2008 https://doi.org/10.1109/CCECE.2008.4564695
- 11. Ben-Itzhak, Y., Cidon, I., Kolodny, A., Shabun, M., &Shmuel, N. (2015), "Heterogeneous NoC Router Architecture" . IEEE Transactions on Parallel and Distributed Systems, 26(9), 2479–2492. https://doi.org/10.1109/TPDS.2014.2351816
- 12. Chan, Cheng-Hao, Kun-Lin Tsai, Feipei Lai, and Shun-Hung Tsai (2011), "A priority based output arbiter for NoC router." 2011 IEEE International Symposium of Circuits and Systems (ISCAS). IEEE, 1928 – 1931. https://doi.org/10.1109/ISCAS.2011.5937966.

- 13. Debore Matos, Caroline Concatto, Luigi Carro, Fernanda Kastensmidt and Maccio Kreutz, (2015), "Highly efficient Reconfigurable Routers in Network-on-Chip". IEEE Transaction on Very Large Scale Integration (VLSI) Systems. https://doi.org/10.1109/VLSISOC.2009.6041348
- 14. Kashif, Hany, and Hiren Patel, (2016), "Buffer space allocation for real-time priority-aware networks." 2016 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS). https://doi.org/10.1109/RTAS.2016.7461324.
- 15. Deepak S, Divyaprabha, M Z Kurian (2015), "Design and Verification Of Network-On-Chip Router Architecture". International Journal of Advanced Computational Engineering and Networking, ISSN: 2320-2106, Volume-3, Issue9. https://doi.org/10.2106/ijacen.2320.3.15.9.1
- 16. Amit Bhanwala, Mayank Kumar and Yogendera Kumar( 2015), "FPGA based Design of Low Power Reconfigurable Router for Network on Chip (NoC)". International Conference on Computing, Communication and Automation2015, IEEE 1320. https://doi.org/10.1109/CCAA.2015.7148581
- 17. Nirmaladevi, K., & Sundararajan, J. (2017). "Low power NoC architecture based dynamic reconfigurable system". Cluster Computing. https://doi.org/10.1007/s10586-017-1413-3
- 18. Mostafa S. Sayeda, Ahmed Shalabya , Mohamed El-Sayeda , Victor Goulart, (2012), "Flexible router architecture for network-on-chip". Computers and Mathematics with Applications 64 1301–1310 at SciVerse Science Direct https://doi.org/10.1016/j.camwa.2012.03.074
- 19. Zhang, Z., Greiner, A., & Taktak, S. (2008). "A reconfigurable routing algorithm for a fault-tolerant 2D-Mesh Network-on-Chip". Proceedings of the 45th Annual Conference on Design Automation - DAC '08. https://doi.org/10.1145/1391469.1391584

- 20. Nguyen, H. K., & Tran, X.-T. (2018), "A novel priority-driven arbiter for the router in reconfigurable Network-on-Chips". International Conference on IC Design & Technology (ICICDT). https://doi.org/10.1109/ICICDT.2018.8399747
- 21. Meenu Anna George, Aravindhan and Lakshminarayanan (2017),"Design of five port priority based router with port selection logic for noc". ICTACT Journal On Microelectronics, Volume: 02, Issue: 04 https://doi.org/10.21917/ijme.2017.0051
- 22. Li, X., Duraisamy, K., Baylon, J., Majumder, T., Wei, G., Bogdan, P.,Pande, P. (2017), "A Reconfigurable Wireless NoC for Large Scale Microbiome Community Analysis" IEEE Transactions on Computers, 66(10), 1653–1666. https://doi.org/10.1109/TC.2017.2706278
- 23. Sinduri.M N. Nagaraja Kumar, (2013),"Design and Implementation of Reconfigurable Router Architecture for Dynamic NoC's". IJESC Volume No: 1 Issue No: 10 ISSN: 2250 - 1371 Paper IJESC116 . https://doi.org/10.1371/ijesc.2250.10.13.1.3
- 24. Mandal, Sumit K., Raid Ayoub, Michael Kishinevsky, Umit Y. Ogras, "Analytical Performance Models for NoCs with Multiple Priority Traffic Classes." ACM Transactions on Embedded Computing Systems, Volume 1, Issue 1, https://doi.org/10.1145/3126530.

Copyright © 2019 by the Authors. This is an open access article distributed under the Creative Commons Attribution (CC BY) License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Arrived: 18. 06. 2019 Accepted: 06. 01. 2020 https://doi.org/10.33180/InfMIDEM2019.403

Informacije MIDEM, Journal of Microelectronics, Electronic Components and Materials Vol. 49, No. 4(2019), 211 – 217

*Electronically Adjustable Capacitance Multiplier Circuit with a Single Voltage Differencing Gain Amplifier (VDGA)*

*Worapong Tangsrirat<sup>1</sup>, Orapin Channumsin<sup>2</sup>, Jirapun Pimpol<sup>2</sup>*

*<sup>1</sup> King Mongkut's Institute of Technology Ladkrabang (KMITL), Faculty of Engineering, Bangkok, Thailand*

*<sup>2</sup> Rajamangala University of Technology Isan, Faculty of Engineering, Khonkaen Campus, Khonkaen, Thailand*

**Abstract:** In this work, we propose a resistorless realization of a simple electronically adjustable capacitance multiplier circuit using a single voltage differencing gain amplifier (VDGA) as an active building block. The circuit utilizes one VDGA and only one capacitor in a simple circuit configuration. The proposed capacitance multiplier circuit can be tuned electronically by adjusting the transconductance gains of the VDGA. To emphasize the applicability of the proposed circuit, a second-order RC low-pass filter is constructed as an application example. PSPICE simulations are performed to verify the theory.

**Keywords:** Voltage Differencing Gain Amplifier (VDGA); capacitance multiplier; electronically adjustable; active circuits

# *Elektronsko nastavljiv kapacitivni množilnik z enojnim napetostnim diferencialnim ojačevalnikom (VDGA)*

**Izvleček:** V članku je predstavljen enostaven elektronsko nastavljiv kapacitiven množilnik z diferencialnim ojačevalnikom (VDGA) brez uporabe uporov. Vezje vključuje en VDGA in le en kondenzator. Predlagan kapacitivni množilnik je elektronsko nastavljiv s pomočjo spreminjanja transkonduktančnega ojačenja VDGA. Kot primer uporabe je predstavljen nizkopasovni RC filter drugega reda. Teoretični zaključki so preverjeni s PSPICE simulacijami.

**Ključne besede:** Napetostni diferencialni ojačevalnik (VDGA); kapacitiven množilnik; elektronska nastavljivost; aktivna vezja

*\* Corresponding Author's e-mail: j.pimpol.rmuti@gmail.com*

# *1 Introduction*

Integrable high-valued capacitances are needed in several analog integrated applications, such as sensor interfacing circuits, monolithic phase-locked loops, sample-and-hold data systems, and implantable biomedical systems [1-3]. In fully integrated circuit design, however, the fabrication of large-valued capacitors is a fundamental problem, because they occupy a large fraction of the die area in standard silicon-based technology [4-5]. A possible solution is to implement high capacitance values from smaller ones by means of a capacitance multiplication technique. This has motivated the development, over the last few decades, of various circuit techniques to implement grounded and floating capacitance multiplier circuits with active elements such as second-generation current conveyors (CCIIs) [6-11], current feedback operational amplifiers (CFOAs) [12-13], operational transconductance amplifiers (OTAs) [14-15], current follower transconductance amplifiers (CFTAs) [16], current differencing transconductance amplifiers (CDTAs) [17], current backward transconductance amplifiers (CBTAs) [18], differential voltage current conveyors (DVCCs) [19], DVCC-transconductance amplifiers (DVCCTAs) [20], and voltage differencing buffered amplifiers (VDBAs) [21].

The circuits from [8-16], [18-19], [21] employ at least two active components. The active element-based capacitance multipliers proposed in [6], [8-10], [12-13], [17-21] are designed with two or more passive elements. Such designs tend to require relatively high power dissipation and a large silicon area on the chip. The capacitance multiplier realizations in [8-9], [14-15] use different types of active components. Moreover, the available active capacitance multipliers in [6-10], [12-13], [19], [21] are not programmable.

This paper proposes a voltage differencing gain amplifier (VDGA)-based capacitance multiplier circuit. The circuit is realized with only one VDGA together with one floating capacitor. No strict component-matching conditions are required. The equivalent capacitor value is electronically tunable by changing the ratio of the VDGA transconductances. The effects of the VDGA non-idealities are also discussed and evaluated. To confirm the analytical calculation, simulation results with TSMC 0.25-µm CMOS technology are reported.

# *2 Description of the VDGA*

The VDGA, whose circuit symbol is represented in Fig.1, is a recently reported active building block introduced in [22]. In ideal operation, the behavior of the VDGA element can be characterized by the matrix equation:

$$
\begin{bmatrix} i_p \\ i_n \\ i_z \\ i_x \\ v_w \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ g_{mA} & -g_{mA} & 0 & 0 \\ 0 & 0 & g_{mB} & 0 \\ 0 & 0 & \beta & 0 \end{bmatrix} \begin{bmatrix} v_p \\ v_n \\ v_z \\ v_x \end{bmatrix}, \tag{1}
$$

where $g_{mA}$ and $g_{mB}$ denote the transconductance gains and β represents the voltage gain of the VDGA. The internal structure of the VDGA based on MOS transistors is depicted in Fig.2 [22]. The circuit comprises three floating current sources (FCSs), $M_{1A}$-$M_{4A}$, $M_{1B}$-$M_{4B}$ and $M_{1C}$-$M_{4C}$. Each of them realizes an independently tunable transconductance gain $g_{mk}$ ($k = A$, $B$ and $C$). The value of $g_{mk}$ can be determined by [23]:

$$
g_{mk} \cong \left(\frac{g_{1k}g_{2k}}{g_{1k} + g_{2k}}\right) + \left(\frac{g_{3k}g_{4k}}{g_{3k} + g_{4k}}\right) \tag{2}
$$

where

$$
g_{ik} = \sqrt{KI_{Bk}} \,, \text{ for } i = 1, 2, 3, 4. \tag{3}
$$

In equation (3), *K* is the transconductance parameter of the transistor and $I_{Bk}$ is an external DC bias current. It is worth mentioning that the transconductance $g_{mk}$ is tuned electronically by changing the bias current $I_{Bk}$. The FCS $M_{1A}$-$M_{4A}$ acts as a differential-input voltage-to-current converter with $i_z = g_{mA}(v_p - v_n)$, while the FCS $M_{1B}$-$M_{4B}$ performs the transconductance amplifier action between the z and x terminals (i.e. $i_x = g_{mB}v_z$). Furthermore, the pair of FCSs $M_{1B}$-$M_{4B}$ and $M_{1C}$-$M_{4C}$ provides a current-controlled voltage amplifier behavior ($v_w = \beta v_z$) with the voltage transfer gain equal to $\beta = g_{mB}/g_{mC}$. The gain $\beta$ can therefore be adjusted simply by setting the $g_{mB}$ to $g_{mC}$ ratio.



**Figure 1:** Schematic symbol of the VDGA.

# *3 Proposed capacitance multiplier circuit*

The proposed topology of the capacitance multiplier with a single VDGA is shown in Fig.3. It consists of only one VDGA and one floating capacitor. Although a floating capacitor is required, it can be implemented using a metal-oxide-metal (MOM), double-poly (poly1-poly2) or metal-insulator-metal (MIM) capacitor process [24]. Using the VDGA port relation (1), the input impedance of the proposed capacitance multiplier circuit is

$$
Z_{in}(s) = \frac{V_{in}(s)}{I_{in}(s)} = \frac{1}{sC_{eq}} = \frac{1}{sC\left(1 + \frac{g_{mA}}{g_{mC}}\right)} \tag{4}
$$

where the simulated equivalent capacitance  $C_{eq}$  is equal to:

$$
C_{eq} = C \left( 1 + \frac{g_{mA}}{g_{mC}} \right). \tag{5}
$$



**Figure 2:** CMOS implementation of the VDGA circuit obtained from the one in [22].

It is clear that the proposed circuit in Fig.3 implements a variable capacitance multiplier with a capacitance multiplication factor given by:

$$
K = 1 + \frac{g_{mA}}{g_{mC}}.\tag{6}
$$

With this expression, the capacitance multiplication factor *K* is scaled electronically by setting the transconductance ratio $g_{mA}/g_{mC}$.
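The tuning law in equations (2), (3), (5) and (6) can be checked numerically. The following minimal sketch assumes that all four $g_{ik}$ of one FCS are equal, as equation (3) implies, so that $g_{mk} = \sqrt{K I_{Bk}}$. The value of the transconductance parameter (`K_PARAM`) is not stated in the paper; it is an assumption, back-calculated from the pair $I_{BC} = 4$ µA and $g_{mC} = 0.2$ mA/V reported in the simulation section. The function and variable names are illustrative only.

```python
import math


def fcs_transconductance(k_param, i_bias):
    """g_mk per Eqs. (2)-(3), with g_1k = g_2k = g_3k = g_4k = sqrt(K * I_Bk)."""
    g_i = math.sqrt(k_param * i_bias)
    # Two series pairs in parallel: each pair contributes g_i / 2.
    return (g_i * g_i) / (g_i + g_i) + (g_i * g_i) / (g_i + g_i)


def multiplication_factor(g_ma, g_mc):
    """Capacitance multiplication factor per Eq. (6)."""
    return 1.0 + g_ma / g_mc


def equivalent_capacitance(c, g_ma, g_mc):
    """Ideal equivalent capacitance per Eq. (5)."""
    return c * multiplication_factor(g_ma, g_mc)


# Assumed transconductance parameter in A/V^2 (not given in the paper):
# chosen so that I_B = 4 uA yields g_m = 0.2 mA/V.
K_PARAM = (0.2e-3) ** 2 / 4e-6

g_ma = fcs_transconductance(K_PARAM, 100e-6)  # I_BA = 100 uA
g_mc = fcs_transconductance(K_PARAM, 4e-6)    # I_BC = 4 uA
c_eq = equivalent_capacitance(50e-12, g_ma, g_mc)

print(f"g_mA = {g_ma * 1e3:.3f} mA/V, g_mC = {g_mc * 1e3:.3f} mA/V")
print(f"multiplication factor = {multiplication_factor(g_ma, g_mc):.3f}")
print(f"C_eq = {c_eq * 1e9:.3f} nF")
```

Because $g_{mk}$ scales with the square root of the bias current, the multiplication factor under this assumption depends only on $\sqrt{I_{BA}/I_{BC}}$, so the assumed value of `K_PARAM` cancels out of $C_{eq}$.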



**Figure 3:** Proposed capacitance multiplier circuit and its equivalent circuit.



**Figure 4:** Non-ideal behavior model of the VDGA with terminal parasitics.

# *4 Analysis of non-ideal behavior*

Deviations from the ideal circuit performance are mainly due to the voltage and current transfer inaccuracies and the parasitics of the VDGA. The non-ideal transfer gains of an actual VDGA are expressed as:

$$
\begin{bmatrix} i_p \\ i_n \\ i_z \\ i_x \\ v_w \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ \alpha_A g_{mA} & -\alpha_A g_{mA} & 0 & 0 \\ 0 & 0 & \alpha_B g_{mB} & 0 \\ 0 & 0 & \delta \beta & 0 \end{bmatrix} \begin{bmatrix} v_p \\ v_n \\ v_z \\ v_x \end{bmatrix}, \tag{7}
$$

where $\alpha_A$ and $\alpha_B$ are the non-ideal transconductance gains and $\delta$ is the non-ideal voltage transfer gain. If these non-ideal transfer gains are considered, then the non-ideal input impedance can be rewritten as:

$$
Z_{in}(s) = \frac{1}{sC\left(1 + \frac{\delta \alpha_A g_{mA}}{\alpha_B g_{mC}}\right)} \tag{8}
$$

In this case, the equivalent capacitance value changes to:

$$
C_{eq} = C \left( 1 + \frac{\delta \alpha_{A} g_{mA}}{\alpha_{B} g_{mC}} \right) \tag{9}
$$
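To get a feeling for the size of this effect, equation (9) can be evaluated for a few tracking errors. The sketch below is only illustrative: the values chosen for $\alpha_A$, $\alpha_B$ and $\delta$ are hypothetical and are not taken from the paper, and the ratio $g_{mA}/g_{mC} = 5$ with $C = 50$ pF is the setting that gives the $C_{eq} = 0.3$ nF reported in the simulation section.

```python
import math


def nonideal_equivalent_capacitance(c, gm_ratio, alpha_a=1.0, alpha_b=1.0, delta=1.0):
    """Equivalent capacitance per Eq. (9); gm_ratio stands for g_mA / g_mC.

    With alpha_a = alpha_b = delta = 1 the expression reduces to Eq. (5).
    """
    return c * (1.0 + (delta * alpha_a / alpha_b) * gm_ratio)


C = 50e-12       # external capacitor in farads
GM_RATIO = 5.0   # g_mA / g_mC

ideal = nonideal_equivalent_capacitance(C, GM_RATIO)

# Hypothetical tracking errors of a few percent (assumed values).
nonideal = nonideal_equivalent_capacitance(
    C, GM_RATIO, alpha_a=0.98, alpha_b=0.97, delta=0.99
)
relative_error = (nonideal - ideal) / ideal

print(f"ideal C_eq     = {ideal * 1e12:.2f} pF")
print(f"non-ideal C_eq = {nonideal * 1e12:.2f} pF")
print(f"relative error = {relative_error * 100:.3f} %")
```

Only the combined factor $\delta\alpha_A/\alpha_B$ enters equation (9), so errors of similar size in $\alpha_A$ and $\alpha_B$ largely cancel.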

Another non-ideality is introduced by the parasitic impedances at the VDGA terminals (Fig.4). The parasitic resistances $R_p$, $R_n$, $R_z$, $R_x$ and the parasitic capacitances $C_p$, $C_n$, $C_z$, $C_x$ are connected between the high-impedance terminals (p, n, z and x) and ground. A series parasitic resistance $R_w$ is associated with the w-terminal. If these parasitic impedances are taken into consideration, the non-ideal performance of the proposed circuit in Fig.3 can be evaluated as follows.

If only the p-terminal parasitic impedances are considered, the equivalent capacitance is obtained as:

$$
C_{eq} = \left(C + C_p + \frac{1}{R_p}\right) K \quad . \tag{10}
$$

Similarly, if only the z- and x-terminal parasitic impedances are taken into account, the equivalent capacitance can be computed as:

$$
C_{eq} = C \left\{ 1 + \frac{g_{mA}}{g_{mB}} \left[ \frac{1}{1 - \left( \frac{1}{g_{mB} R_{zx}} \right) \left( 1 + s R_{zx} C_{zx} \right)} \right] \right\}, \tag{11}
$$

where $R_{zx} = R_z \parallel R_x$ and $C_{zx} = C_z + C_x$.

If only the effect of $R_w$ is considered, the equivalent capacitance is:

$$
C_{eq} = CK \left(\frac{1}{1 + sR_w C}\right) \tag{12}
$$

**Figure 5:** Simulated transient responses for $v_{in}$ and $i_{in}$ of Fig.3.

**Figure 6:** Simulated frequency responses for $Z_{in}$ of Fig.3.

By considering equations (10)-(12), it can be seen that the parasitics at the different terminals of the VDGA affect the high-frequency behavior of the proposed circuit. However, from equation (10), the influence of the p-terminal parasitic impedances on the simulated capacitance can be reduced sufficiently under the assumption that $R_p \gg 1$, and by choosing the external capacitor such that $C \gg C_p$. We also observe from equations (11) and (12) that the presence of parasitic impedances at terminals z, x and w introduces two extra poles, which reduce the useful bandwidth of the proposed circuit. Therefore, the circuit behaves as a capacitor for frequencies:

$$
f \ll \min\left\{\frac{1}{2\pi R_{zx}C_{zx}}, \frac{1}{2\pi R_{w}C}\right\} \quad . \tag{13}
$$
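The bound in equation (13) is straightforward to evaluate once the parasitics are known. The sketch below uses purely hypothetical parasitic values, since the paper does not list them, and assumes that $R_{zx}$ denotes the parallel combination of $R_z$ and $R_x$. It returns both pole frequencies and the smaller of the two.

```python
import math


def parallel(r1, r2):
    """Parallel combination of two resistances."""
    return r1 * r2 / (r1 + r2)


def capacitive_band_limit(r_z, r_x, c_z, c_x, r_w, c):
    """Return (f_limit, f_zx, f_w) according to Eq. (13).

    f_zx is the pole set by the z/x-terminal parasitics and f_w the pole
    set by the w-terminal series resistance with the external capacitor.
    """
    r_zx = parallel(r_z, r_x)  # assumption: R_zx = R_z || R_x
    c_zx = c_z + c_x
    f_zx = 1.0 / (2.0 * math.pi * r_zx * c_zx)
    f_w = 1.0 / (2.0 * math.pi * r_w * c)
    return min(f_zx, f_w), f_zx, f_w


# Hypothetical parasitics (assumed, not from the paper), with C = 50 pF.
f_limit, f_zx, f_w = capacitive_band_limit(
    r_z=200e3, r_x=200e3, c_z=0.1e-12, c_x=0.1e-12, r_w=50.0, c=50e-12
)

print(f"f_zx = {f_zx / 1e6:.2f} MHz, f_w = {f_w / 1e6:.2f} MHz")
print(f"capacitive behavior expected well below {f_limit / 1e6:.2f} MHz")
```

Equation (13) requires the operating frequency to be much smaller than this limit, so the usable band ends well before the lower pole.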

# *5 Simulation results and application*

The behavior of the proposed circuit in Fig.3 has been simulated with PSPICE using the transistor model parameters of a 0.25-µm TSMC CMOS process. Transistor dimensions are given in Table 1 and symmetrical supply voltages are  $+V = -V = 1$  V.

**Table 1:** Transistor aspect ratios of CMOS VDGA in Fig.2.



The proposed capacitance multiplier circuit depicted in Fig.3 was simulated with the following component values: $I_{BA} = I_{BB} = 100\ \mu$A ($g_{mA} = g_{mB} = 1$ mA/V), $I_{BC} = 4\ \mu$A ($g_{mC} = 0.2$ mA/V) and $C = 50$ pF, which results in $C_{eq} = 0.3$ nF. The quiescent power consumption of the circuit was 1.09 mW. In Fig.5, the simulated transient waveforms for $v_{in}$ and $i_{in}$ at a frequency of 10 MHz are given; the phase difference between them was found to be 89°. Fig.6 shows the simulated frequency response of $Z_{in}$, compared with that of an ideal capacitor. It is observed from Fig.6 that the simulation results are in very close agreement with the theoretically predicted response far beyond 10 MHz. In addition, to illustrate the variation of $C_{eq}$ with the capacitance multiplication factor, the impedance magnitude responses for different values of $g_{mA}$ are depicted in Fig.7. The results are plotted for the circuit parameters listed in Table 2.



**Figure 7:** Magnitude-frequency responses of $Z_{in}$ with tuning $I_{BA}$.

**Table 2:** Detailed circuit component settings for Fig.7, where  $C = 50$  pF.

An illustrative application of the proposed capacitance multiplier circuit in Fig.3 is the realization of a second-order RC low-pass filter depicted by Fig.8. The cut-off frequency is determined by $f_c = 1/[2\pi(R_1R_2C_{eq1}C_{eq2})^{1/2}]$. In this realization, the capacitors $C_{eq1}$ and $C_{eq2}$ are realized with the proposed capacitance multiplier circuit in Fig.3. The simulations of the illustrative low-pass filter have been performed by keeping $R_1 = R_2 = 1$ k$\Omega$, and varying the values of $C_{eq} = C_{eq1} = C_{eq2}$. Fig.9 shows the simulated voltage-gain responses for $C_{eq}$ = 0.13 nF, 0.30 nF and 0.67 nF, where the detailed $C_{eq}$ settings are the same as those given in Table 2. The results indicate that the value of $f_c$ is 1.37 MHz, 0.53 MHz and 0.23 MHz, respectively, for the three $C_{eq}$ values.

**Figure 8:** Second-order RC low-pass filter.

**Figure 9:** Simulated frequency responses of the filter in Fig.8 with tuning $C_{eq}$.

# *6 Conclusions*

In this work, a simple realization of an adjustable grounded capacitance multiplier is introduced. The configuration uses only one VDGA as an active element and one floating capacitor as a passive element. The capacitance multiplication factor is electronically tunable by the ratio of the VDGA's transconductance gains. The effects of the VDGA non-idealities, including voltage and current transfer errors and parasitic elements, on the realized capacitor are investigated. The feasibility of the proposed capacitance multiplier is demonstrated on a second-order RC low-pass filter. Simulation results employing TSMC 0.25-µm CMOS process parameters are provided to verify the theoretical analysis.

# *7 Acknowledgments*

This work is supported by the Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang (KMITL), Project No.2563-02-01-002. Partial support for the work by the Institute of Research and Development, Rajamangala University of Technology Isan, and by the Faculty of Engineering, Rajamangala University of Technology Isan, Khonkaen Campus, is also gratefully acknowledged. The authors are deeply indebted to the anonymous reviewers for their helpful comments and suggestions.

# *8 Conflict of interest*

The authors confirm that this article content has no conflict of interest.

# *9 References*

- 1. S. Pennisi, "CMOS multiplier for grounded capacitor", *Electron. Lett.*, vol.38, no.15, pp.765-766, 2002, https://doi.org/10.1049/el:20020517.
- 2. J. Choi, J. Park, W. Kim, K. Lim and J. Laskar, "High multiplication factor capacitor multiplier for an on-chip PLL loop filter", *Electron. Lett.*, vol.45, no.5, pp.239-240, 2009, https://doi.org/10.1049/el:20092874.
- 3. M. A. Al-Absi, E. S. Al-Suhaibani, and M. T. Abuelma'atti, "A new compact CMOS C multiplier", *Analog Integr. Circ. Sig. Process*., vol.90, no.3, pp.653-658, 2017, https://doi.org/10.1007/s10470-016-0822-1.

- 4. R. Aparicio, and A. Hajimiri, "Capacity limits and matching properties of integrated capacitors", *IEEE J. Solid-State Circ.*, vol.37, no.3, pp.384-393, 2002, https://doi.org/10.1109/4.987091.
- 5. L. C. Hwang, "Area-efficient and self-biased capacitor multiplier for on-chip loop filter", *Electron. Lett.*, vol.42, no.24, pp.1392-1393, 2006, https://doi.org/10.1049/el:20062486.
- 6. C. Premont, R. Grisel, N. Abouchi, and J. P. Chante, "A current conveyor based capacitive multiplier", in *Proc. IEEE Midwest Symposium on Circuits and Systems*, 1997, pp.146-147, https://doi.org/10.1109/mwscas.1997.666054.
- 7. G. D. Cataldo, G. Ferri, and S. Pennisi, "Active capacitance multipliers using current conveyors", in *Proc. IEEE Int. Symp. Circ. Syst.*, 1998, pp.343-346, https://doi.org/10.1109/iscas.1998.706935.
- 8. G. Ferri, and S. Pennisi, "A 1.5-V current-mode capacitance multiplier", in *Proc. IEEE Int. Conf. Microelectron.*, 1998, pp.9-12, https://doi.org/10.1109/icm.1998.825555.
- 9. A. Yesil, E. Yuce, and S. Minaei, "Grounded capacitance multipliers based on active elements", *Int. J. Electron. Commun. (AEU)*, vol.79, no.9, pp.243-249, 2019, https://doi.org/10.1016/j.aeue.2017.06.006.
- 10. P. V. A. Mohan, "Floating capacitance simulation using current conveyors", *J. Circuits Syst. Comput.*, vol.14, no.1, pp.123-128, 2005, https://doi.org/10.1142/s0218126605002209.
- 11. M. T. Abuelma'atti and N. A. Tasadduq, "Electronically tunable capacitance multiplier and frequency-dependent negative-resistance simulator using the current-controlled current conveyor", *Microelectron. J*., vol.30, no.9, pp.869-873, 1999, https://doi.org/10.1016/s0026-2692(99)00025-7.

- 12. R. Verma, N. Pandey, and R. Pandey, "Novel CFOA based capacitance multiplier and its application", *Int. J. Electron. Commun. (AEU)*, vol.107, pp.192-198, 2019, https://doi.org/10.1016/j.aeue.2019.05.010.
- 13. A. Lahiri and M. Gupta, "Realizations of grounded negative capacitance using CFOAs", *Circ. Syst. Sig. Process*., vol.30, no.1, pp.143-155, 2011, https://doi.org/10.1007/s00034-010-9215-3.
- 14. I. A. Khan, and M. T. Ahmed, "OTA-based integrable voltage/current-controlled ideal C multiplier", *Electron. Lett.*, vol.22, no.7, pp.365-366, 1986, https://doi.org/10.1049/el:19860248.
- 15. M. T. Ahmed, I. A. Khan, and N. Minhaj, "Novel electronically tunable C-multipliers", *Electron. Lett.*, vol.31, no.1, pp.9-11, 1995, https://doi.org/10.1049/el:19950018.
- 16. Y. A. Li, "A series of new circuits based on CFTAs", *Int. J. Electron. Commun. (AEU)*, vol.66, no.7, pp.587-592, 2012, https://doi.org/10.1016/j.aeue.2011.11.011.

- 17. D. Biolek, J. Vavra, and A. U. Keskin, "CDTA-based capacitance multipliers", *Circ. Syst. Sig. Process*., vol.38, no.4, pp.1466-1481, 2019, https://doi.org/10.1007/s00034-018-0929-y.
- 18. U. E. Ayten, M. Sagbas, N. Herencsar, and J. Koton, "Novel floating general element simulators using CBTA", *Radioengineering*, vol.21, no.1, pp.11-19, 2012.
- 19. H. Alpaslan, "DVCC-based floating capacitance multiplier design", *Turkish J. Electr. Eng. Comput. Sci.*, vol.25, no.2, pp.1334-1345, 2017, https://doi.org/10.3906/elk-1509-112.
- 20. W. Tangsrirat, "Floating simulator with a single DVCCTA", *Indian J. Eng. Mater. Sci*., vol.20, no.2, pp.79-86, 2013.
- 21. P. Mongkolwai and W. Tangsrirat, "Generalized impedance function simulator using voltage differencing buffered amplifiers (VDBAs)", in *Proc. Int. MultiConf. Engineers Comput. Scientists 2016*, 2016, pp.609-612.
- 22. J. Satansup and W. Tangsrirat, "CMOS realization of voltage differencing gain amplifier (VDGA) and its application to biquad filter", *Indian J. Eng. Mater. Sci*., vol.20, no.6, pp.457-464, 2013.
- 23. A. F. Arbel and L. Goldminz, "Output stage for current-mode feedback amplifiers, theory and applications", *Analog Integr. Circ. Sig. Process.*, vol.2, no.3, pp.243–255, 1992, https://doi.org/10.1007/bf00276637.
- 24. G. Komanapalli, R. Pandey, and N. Pandey, "New sinusoidal oscillator configurations using operational transresistance amplifier", *Int. J. Circ. Theor. Appl.*, vol.47, pp.666-685, 2019, https://doi.org/10.1002/cta.2619.



Copyright © 2019 by the Authors. This is an open access article distributed under the Creative Commons Attribution (CC BY) License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Arrived: 08. 11. 2019 Accepted: 07. 01. 2020

https://doi.org/10.33180/InfMIDEM2019.404



Journal of Microelectronics, Electronic Components and Materials Vol. 49, No. 4(2019), 219 – 227

# *The Development of Thermal Coefficients of Photovoltaic Devices*

*Stefan Mitterhofer, Boštjan Glažar, Marko Jankovec, Marko Topič*

*University of Ljubljana, Faculty of Electrical Engineering, Laboratory of Photovoltaics and Optoelectronics, Ljubljana, Slovenia*

**Abstract:** Photovoltaic modules installed in the field exhibit a wide range of operating temperatures, depending on the meteorological and environmental conditions. Their temperature influences their output power and conversion efficiency. Temperature dependence is typically described as a linear function with the temperature coefficients of open-circuit voltage, short-circuit current and maximal output power. To analyse the development of these parameters over time, the data of modules of several technologies, monitored for five to eight years at our outdoor test site, is evaluated. The measurement data is cleaned of outliers and systematic measurement errors and then translated to the irradiance of 1000 W/m<sup>2</sup> at standard test conditions. Several discrepancies compared to the datasheet values of the photovoltaic modules are found. These discrepancies are attributed to the parameters' sensitivity to other factors, most pronounced in the short-circuit current. One such factor is the spectrum of the incident light. The temperature coefficients are then analysed each month to evaluate their development over time. A seasonality is found, showing a higher temperature sensitivity of the short-circuit current in the winter and a correspondingly lower sensitivity of the output power at maximum power point. However, no systematic change over time due to possible influences of module degradation was observed in the timeframe of up to eight years.

**Keywords:** photovoltaics; thermal coefficients; performance monitoring

# *Dolgoročna odvisnost temperaturnih koeficientov fotonapetostnih modulov*

**Izvleček:** Fotonapetostni moduli v času obratovanja doživijo širok razpon obratovalnih temperatur glede na pogoje okolice, kjer so nameščeni. Temperatura celic , katere vpliv opisujemo s temperaturnimi koeficienti napetosti odprtih sponk, kratkostičnega toka in maksimalne moči, vpliva na njihovo zmogljivost in učinkovitost. V članku smo analizirali dolgoročne poteke temperaturnih koeficientov fotonapetostnih modulov različnih tehnologij, ki jih dolgoročno spremljamo na našem testnem poligonu od pet do osem let. Merilne rezultate smo preračunali na standardno obsevanje 1000 W/m2 in izločili izstopajoče meritve in sistematične merilne napake. Opazili smo razlike v primerjavi z vrednostmi temperaturnih koeficientov, ki jih navajajo proizvajalci. Razlike so najbolj izrazite pri kratkostičnem toku in jih pripisujemo občutljivosti parametrov na druge dejavnike, kot npr. spekter vpadne svetlobe. Temperaturne koeficiente smo analizirali na mesečni ravni, da smo ocenili njihovo časovno spreminjanje. V rezultatih so najbolj razvidni sezonski vplivi, kjer je pozimi temperaturni koeficient kratkostičnega toka višji in posledično maksimalne moči ustrezno nižji. Vendar pa ni bilo opaziti sistematične časovne spremembe zaradi možnih vplivov degradacije fotonapetostnega modula.

**Ključne besede:** fotovoltaika; temperaturni koeficienti; spremljanje učinkovitosti

*\* Corresponding Author's e-mail: stefan.mitterhofer@fe.uni-lj.si* 

## *1 Introduction*

Photovoltaic (PV) devices convert energy emitted from the sun or other light sources directly into electrical energy. Incident photons excite electrons from the valence band into the conduction band of a semiconductor, creating electron-hole pairs that generate voltage and current at the contacts, resulting in the well-known current-voltage (IV) characteristic under illumination [1]. A depiction of an IV curve of a PV cell is shown in Figure 1. The most important parameters describing the device's performance are the short-circuit current $I_{SC}$, the open-circuit voltage $V_{OC}$ and the fill factor FF. $I_{SC}$ is the current through the cell when it is short-circuited and the voltage across the cell is zero. $V_{OC}$ is the maximum possible voltage across the cell, reached at zero current. The power output P of a PV cell is given as the product of voltage and current. The maximum power output $P_{MPP}$ is achieved at the maximum power point (MPP), with the corresponding voltage $V_{MPP}$ and current $I_{MPP}$. The fill factor FF of a solar cell is the ratio between $P_{MPP}$ and the product of $I_{SC}$ and $V_{OC}$. It is the ratio between the areas of the yellow and blue rectangles in Figure 1.

$$
FF = \frac{V_{MPP} \cdot I_{MPP}}{V_{OC} \cdot I_{SC}}\tag{1}
$$
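As an illustration of equation (1), the sketch below computes the fill factor from a sampled IV curve. The curve is generated with an ideal single-diode model without series or shunt resistance, which is an assumption made here only to obtain example data. The cell parameters (`I_SC`, `I_0`, `N_IDEAL`) are hypothetical and the function names are illustrative.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant in J/K
Q_E = 1.602176634e-19  # elementary charge in C


def ideal_diode_current(v, i_sc, i_0, n, t_kelvin):
    """Cell current of an ideal single-diode model (no R_s, no shunt)."""
    v_t = K_B * t_kelvin / Q_E
    return i_sc - i_0 * (math.exp(v / (n * v_t)) - 1.0)


def fill_factor(voltages, currents):
    """FF per Eq. (1) for a curve sampled from V = 0 (I_SC) to I = 0 (V_OC)."""
    i_sc = currents[0]
    v_oc = voltages[-1]
    p_mpp = max(v * i for v, i in zip(voltages, currents))
    return p_mpp / (v_oc * i_sc)


# Hypothetical cell parameters (assumed for illustration only).
I_SC, I_0, N_IDEAL, T_CELL = 8.0, 1e-9, 1.0, 298.15

# Open-circuit voltage of the ideal model, used as the end of the sweep.
v_oc = N_IDEAL * (K_B * T_CELL / Q_E) * math.log(I_SC / I_0 + 1.0)

samples = 2000
voltages = [v_oc * k / samples for k in range(samples + 1)]
currents = [ideal_diode_current(v, I_SC, I_0, N_IDEAL, T_CELL) for v in voltages]

ff = fill_factor(voltages, currents)
print(f"V_OC = {v_oc:.4f} V, FF = {ff:.4f}")
```

For measured curves, the maximum over the sampled points slightly underestimates $P_{MPP}$ when the voltage grid is coarse, so a finer sweep or interpolation around the maximum may be preferable.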



**Figure 1:** IV curve of a PV cell.

#### *1.1 Loss mechanisms in PV devices*

The efficiency of this energy conversion process in an idealized case is limited by several fundamental loss mechanisms. The first approach to calculate this limit for single-junction solar cells was taken by Shockley and Queisser using the detailed balance principle [2]. It is therefore referred to as Shockley-Queisser limit. Similar results were later obtained using a thermodynamic approach [3]. More recent work yielded the same results analyzing the impact of single intrinsic loss processes [4]. These loss mechanisms are:

- Thermalisation: Charge carriers excited by photons with an energy E above the semiconductor bandgap $E_g$ quickly lose the excess energy and relax to the band edge. This process is usually many orders of magnitude faster than charge carrier extraction [5].
- Below $E_g$: Photons with $E < E_g$ are generally not absorbed and do not create an electron-hole pair.
- Emission: The PV device emits photons according to the generalized Planck equation [6], [7].
- Boltzmann or angle mismatch: Entropy is generated and the corresponding energy lost. This process is linked to an angle mismatch between incident and emitted photons.
- Carnot: As an energy converter, the solar cell is limited by the Carnot efficiency. The sun is the hot reservoir, the cell is the cold one.

There are several ways to reduce their impact and achieve efficiencies above the Shockley-Queisser limit, for example multi-junction cells [6], intermediate band cells [8] or concentrating sunlight [9]. However, these losses are unavoidable in single-junction devices.

Additional loss mechanisms in real world devices reduce the efficiency further. Thus, the conversion efficiency η of even the best laboratory single-junction solar cells is a few percent below the Shockley-Queisser limit [10]. These additional loss mechanisms are:

- Non-radiative recombination: Shockley-Read-Hall [11], [12], Auger [13] and surface recombination.
- Optical losses: Reflection on material interfaces, transmission of photons with energy $E > E_g$.
- Parasitic resistances: A series resistance within the cell, in metal contacts and on the interface can cause further losses. A shunt resistance in the device can create alternative current pathways. This is commonly connected to manufacturing defects or cell degradation [14].
- Parasitic absorption: Absorption of photons with $E > E_g$, which do not result in the creation of charge carriers.

#### *1.2 Temperature sensitivity of PV cell parameters*

Because all fundamental and several additional loss mechanisms depend on the device temperature T, the output of PV modules and their efficiency are functions of T. The temperature coefficients analysed are usually the coefficient $\alpha$ of the short-circuit current density $J_{SC}$, the coefficient $\beta$ of the open-circuit voltage $V_{OC}$ and the coefficient $\gamma$ of the output power $P_{MPP}$ at the MPP.

The major part of the overall temperature sensitivity of solar cells is caused by the coefficient $\beta$ [15]. At $V_{OC}$, generation equals recombination, and the current density is zero. $\beta$ is therefore an indication of the temperature dependence of the generation-recombination balance. It is [15–17]:

$$
\beta = \frac{1}{V_{OC}} \frac{dV_{OC}}{dT} = -\frac{1}{V_{OC}} \frac{\frac{E_{g0}}{q} - V_{OC} + \frac{kT}{q} \gamma \frac{f}{\xi} \frac{d\xi}{df}}{T}. \tag{2}
$$

Here, n is the diode ideality factor, k the Boltzmann constant and q the elementary charge. The function f describes individual recombination processes dependent on ξ. This parameter is introduced for mathematical convenience and depends on the intrinsic, electron and hole carrier concentrations, as well as $E_g$ and T.

The coefficient $\alpha$ is [15]:

$$
\alpha = \frac{1}{J_{sc}} \frac{dJ_{sc}}{dT} = \frac{1}{J_{sc,ideal}} \frac{dJ_{sc,ideal}}{dE_g} \frac{dE_g}{dT} + \frac{1}{f_c} \frac{df_c}{dT}. \tag{3}
$$

Here, $J_{sc,ideal}$ is an ideal short-circuit current density, only dependent on the incident photon flux and disregarding loss mechanisms. $f_c$ is a collection factor describing the impact of these loss mechanisms. There are various issues with an accurate determination of α [16–18]. It depends on the incident light intensity and spectrum. However, solar simulators used to determine the electrical characteristics of PV modules are classified by integrating their total irradiance over several broad parts of the spectrum and comparing them individually to the AM1.5 spectrum [19]. Thus, the spectral irradiance of such solar simulators can vary from one simulator to another and deviate from the AM1.5 spectrum [20]. Correspondingly, the determined values of α can show large variations [21], [22]. The incident spectrum in the field also changes depending on time and location of the PV installation, leading to similar discrepancies. Another issue is the temperature difference between single cells of a module, which can lead to further inaccuracies in the determination of $\alpha$ [23], [24].

There is no similar equation for the coefficient  $\gamma$  available in literature. It is dependent on the other temperature coefficients, as well as on the temperature coefficient of the fill factor FF, since:

$$
P_{MPP} = I_{SC} \cdot V_{OC} \cdot FF. \tag{4}
$$

An analysis of the latter is given in [25], resulting in:

$$
\begin{aligned}
\frac{1}{FF} \frac{dFF}{dT} \approx{} & (1 - 1.02 FF_0) \left( \frac{1}{V_{OC}} \frac{dV_{OC}}{dT} - \frac{1}{T} \right) \\
& - \frac{R_s}{V_{OC}} \left( \frac{1}{R_s} \frac{dR_s}{dT} \right)
\end{aligned} \tag{5}
$$

Here, $FF_0$ is an approximation of the fill factor neglecting the series resistance $R_s$ and the shunt resistance, and assuming a one-diode model with diode ideality factor n, given in equation (6) [26].

$$
FF_0 = \frac{v_{OC} - \ln(v_{OC} + 0.72)}{v_{OC} + 1} \tag{6}
$$

$$
v_{OC} = \frac{q}{nkT_C} V_{OC} \tag{7}
$$

However, equation (5) only accounts for the series resistance, and neglects the impact of the shunt resistance and the diode ideality factor on the temperature coefficient. A corresponding evaluation including all these factors is still missing in literature. Thus, it is not surprising that large variations between experimental values and calculations were found [18].
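The relations above can be combined into a rough numerical estimate. Taking the logarithmic derivative of equation (4) gives $\gamma = \alpha + \beta + \frac{1}{FF}\frac{dFF}{dT}$ when all coefficients are expressed as relative values; for a fixed cell area, the relative coefficient of $I_{SC}$ equals that of $J_{SC}$. The sketch below evaluates equations (6) and (7) and the first term of equation (5), deliberately neglecting the series-resistance term. All input values are hypothetical and serve only as an illustration.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant in J/K
Q_E = 1.602176634e-19  # elementary charge in C


def normalized_voc(v_oc, n, t_kelvin):
    """Normalized open-circuit voltage per Eq. (7)."""
    return Q_E * v_oc / (n * K_B * t_kelvin)


def ff0(v_oc, n, t_kelvin):
    """Idealized fill factor per Eq. (6)."""
    v = normalized_voc(v_oc, n, t_kelvin)
    return (v - math.log(v + 0.72)) / (v + 1.0)


def ff_temperature_coefficient(v_oc, n, t_kelvin, beta):
    """Relative dFF/dT from Eq. (5) with the R_s term neglected (1/K)."""
    return (1.0 - 1.02 * ff0(v_oc, n, t_kelvin)) * (beta - 1.0 / t_kelvin)


# Hypothetical single-cell values (assumed, not measured data).
V_OC = 0.65     # open-circuit voltage in V
N_IDEAL = 1.0   # diode ideality factor
T_CELL = 298.15 # cell temperature in K
ALPHA = 0.0005  # relative coefficient of the short-circuit current in 1/K
BETA = -0.003   # relative coefficient of the open-circuit voltage in 1/K

kappa_ff = ff_temperature_coefficient(V_OC, N_IDEAL, T_CELL, BETA)
gamma = ALPHA + BETA + kappa_ff

print(f"FF_0        = {ff0(V_OC, N_IDEAL, T_CELL):.4f}")
print(f"dFF/dT / FF = {kappa_ff * 100:.4f} %/K")
print(f"gamma       = {gamma * 100:.4f} %/K")
```

Since the series-resistance, shunt-resistance and ideality-factor contributions are ignored here, the result should be read as an order-of-magnitude estimate rather than a prediction for a real module.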

#### *1.3 PV modules in the field*

Many models have been developed to predict the temperature of PV modules in the field, as reviewed in [27]. It is dependent on climatic factors, for example ambient temperature, irradiance and wind speed, but also technological factors, for example type of module and mounting configuration. Depending on these factors, T shows large variations at different times at different locations. Thus, the total yield of PV installations strongly depends on their temperature sensitivity. Module datasheets therefore commonly contain one value for each of the three temperature coefficients  $α$ ,  $β$  and  $γ$ . They are usually obtained with the methods defined in the IEC 60891 standard [28]. This standard requires multiple measurements within a range of at least 30°C at a single irradiance. The results are valid at  $\pm 30\%$  of this irradiance. Datasheets usually give the values under standard test condition (STC) irradiance: the AM1.5 spectrum at 1000 W/m<sup>2</sup>. They rarely include an error range of these values, or values for multiple irradiances. Note that another standard exists containing a more in-depth evaluation at different irradiances and temperatures, the IEC 61853-1 [29].

PV modules are expected to last for 25+ years in the field. Their STC output power usually degrades by between 0.5% and 1% annually, varying between PV technologies, climate and installation conditions [30]. This degradation is caused by various degradation modes, affecting the PV modules differently [31]. The thermal coefficients are assumed to be stable and unaffected by these degradation modes. However, a recent analysis of field-aged PV modules has shown the inadequacy of this assumption for γ [32]. Thus, the development of the temperature coefficients over time has to be considered for an accurate lifetime analysis and yield estimation of a PV system. To the best of our knowledge, no such analysis has been published yet.

In this paper, we propose and carry out such an analysis. Chapter 2 presents the PV test site, containing modules of various technologies installed for several years, and the methods used to evaluate the measured data. Chapter 3 gives the results of the data analysis. Chapter 4 contains a discussion of the results.

## *2 Methods*

#### *2.1 Measurements*

The data is taken from the PV test site of the Laboratory of Photovoltaics and Optoelectronics (LPVO) [33] as well as the adjacent PV power plant on the roof of the Faculty of Electrical Engineering, University of Ljubljana in Ljubljana, Slovenia, shown in Figure 2. The analysed period is between five and eight years, depending on the time the corresponding module was installed.



**Figure 2:** Monitored PV test site and adjacent PV power plant in Ljubljana, Slovenia.

Modules of various PV technologies, including amorphous silicon (a-Si), cadmium telluride (CdTe), copper indium gallium selenide (CIGS), micromorph silicon (µ-Si), and poly- and mono-crystalline silicon (c-Si), are monitored. One measurement is taken every 10 minutes and contains T, the short-circuit current  $I_{\text{SC}}$ ,  $V_{\text{OC}}$  and  $P_{\text{MPP}}$  of the modules, as well as the irradiance G. The temperature is measured on the backside of every module with an attached Dallas DS18B20 digital temperature sensor. The irradiance in the plane of array is measured with a Kipp & Zonen CMP21 pyranometer on the test site. The output of the modules is measured with a module monitoring system [33]. The measurements are stored in a database along with further quantities (IV curve, FF, R<sub>s</sub>, etc.), which are, however, not used in this analysis. Between the measurements, the modules are kept at their respective MPP. Additionally, the spectral irradiance is measured with an EKO MS-711 spectroradiometer.

#### *2.2 Data cleaning*

The data is extracted from the database using Python 3.6.5. It is then translated to STC irradiance of 1000 W/m<sup>2</sup> to remove the dependency of  $I_{\text{SC}}$ ,  $V_{\text{OC}}$  and  $P_{\text{MPP}}$  on the irradiance. There are various possibilities for such a translation, which are generally empirical formulas approximating the real behavior. Thus, a small error depending on the difference of the irradiance to STC conditions is expected. Several possible translation formulas are reviewed for the single-diode model in [34]. The formulas used here are:

$$
I_{SC,STC} = \frac{G_{STC}}{G} \cdot I_{SC} \left(G\right) \tag{8}
$$

$$
V_{OC,STC} = V_{OC}(G) - \frac{N_S k T n}{q} \ln\left(\frac{G}{G_{STC}}\right) \tag{9}
$$

$$
P_{MPP,STC} = \frac{G_{STC}}{G} \cdot P_{MPP}\left(G\right) \tag{10}
$$

Here,  $N_S$  is the number of cells in one string, k the Boltzmann constant, T the module temperature, q the elementary charge and n the diode ideality factor.  $N_S$  is obtained from the corresponding datasheets or, if missing, counted on the modules. For simplicity, n is set to 1. The index STC denotes the translated values. An example of this translation is shown in Figure 3 and Figure 4 for the  $I_{\text{SC}}$  of a crystalline silicon module.
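Equations (8) to (10) can be sketched in Python as follows. This is a minimal illustration and not the code used in the study: the function name `translate_to_stc` and its arguments are hypothetical, and the sketch assumes that T in Eq. (9) is the absolute module temperature in kelvin, so the backside temperature measured in °C is converted first.

```python
import math

K_BOLTZMANN = 1.380649e-23      # Boltzmann constant in J/K
Q_ELEMENTARY = 1.602176634e-19  # elementary charge in C
G_STC = 1000.0                  # STC irradiance in W/m^2


def translate_to_stc(i_sc, v_oc, p_mpp, g, t_module_c, n_cells, n_ideality=1.0):
    """Translate one measurement taken at irradiance g (W/m^2) to STC irradiance.

    Hypothetical helper following Eqs. (8)-(10). t_module_c is the module
    temperature in deg C (assumed to be converted to kelvin for Eq. 9).
    """
    if g <= 0:
        raise ValueError("irradiance must be positive")
    scale = G_STC / g
    t_kelvin = t_module_c + 273.15
    # N_S * k * T * n / q: thermal voltage summed over the cells of one string
    v_thermal = n_cells * K_BOLTZMANN * t_kelvin * n_ideality / Q_ELEMENTARY
    i_sc_stc = scale * i_sc                              # Eq. (8)
    v_oc_stc = v_oc - v_thermal * math.log(g / G_STC)    # Eq. (9)
    p_mpp_stc = scale * p_mpp                            # Eq. (10)
    return i_sc_stc, v_oc_stc, p_mpp_stc


# Made-up example values: a 60-cell module measured at 500 W/m^2 and 25 deg C
i_stc, v_stc, p_stc = translate_to_stc(4.0, 36.0, 100.0, 500.0, 25.0, 60)
```

A measurement taken exactly at 1000 W/m<sup>2</sup> is returned unchanged, which is a quick sanity check for any implementation of these formulas.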



**Figure 3:** I<sub>sc</sub> before translation to STC irradiance. The red curve shows the trendline of the data.



**Figure 4:** I<sub>sc</sub> after translation to STC irradiance. The black curve shows the trendline of the data, showing it is independent of the irradiance. The red curve, for comparison, shows the trendline of the data before translation (same as in Figure 3).

The data is then filtered. Over the range of several years, the measurement setup proved to be very stable and provide high quality data. However, several statistical and systematic measurement errors occurred. In a first filtering step, the raw data is analysed to identify the systematic errors. Several periods had to be excluded due to faulty measurements.

One example of such an error can be found when evaluating I<sub>sc,STC</sub>. The values during July - October 2016 are offset compared to all others, as shown in Figure 5. The reason was determined to be problems with the irradiance measurements in that period.



**Figure 5:** An example of a systematic measurement error. A further analysis of the highlighted data showed that the irradiance measurements during this time were incorrect.

However, some effects prove more difficult to handle, for example degradation, which is usually a gradual process. Filtering the data in such cases is a trade-off. On the one hand, a shorter analysed period contains fewer measurements, which reduces the statistical significance and can increase the influence of seasonal effects, skewing the data. On the other hand, a longer time frame increases the impact of degradation on the results. An example is shown in Figure 6, where a gradual degradation of the  $V_{\text{OC}}$  of a CIGS module can be observed. It was especially pronounced in the first year after the module was installed at the test site.



**Figure 6:** Gradual degradation of V<sub>oc</sub> over time. The arrow shows the trend of the development over time. The red points mark data from the first year after installation (10.2014 - 10.2015), the blue points from the following years.

In a second filtering step, the statistical outliers of the measurements are removed. For this purpose, a linear regression over the data is carried out using the linear regression model of the scipy.stats module in Python. The results are used to remove outliers above three standard deviations in  $V_{OC}$ , and two in  $I_{SC}$  and  $P_{MPP}$ . Generally  $I_{SC}$  and  $P_{MPP}$  show far more outliers, requiring the lower threshold to clean the data. An example is shown in Figure 7.
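The second filtering step can be illustrated with the following sketch. It is not the authors' code: the study used the linear regression of `scipy.stats`, whereas this dependency-free version uses a hand-written least-squares fit, and it assumes that the threshold refers to the standard deviation of the residuals around the fitted line. The names `fit_line` and `remove_outliers` and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def remove_outliers(temps, values, n_sigma):
    """Drop points whose residual exceeds n_sigma residual standard deviations."""
    slope, intercept = fit_line(temps, values)
    residuals = [v - (slope * t + intercept) for t, v in zip(temps, values)]
    sigma = pstdev(residuals)
    keep = [abs(r) <= n_sigma * sigma for r in residuals]
    kept_temps = [t for t, k in zip(temps, keep) if k]
    kept_values = [v for v, k in zip(values, keep) if k]
    return kept_temps, kept_values


# Synthetic data: a linear V_OC(T) trend with one corrupted point at 30 deg C
temps = list(range(20, 60))
values = [40.0 - 0.1 * t for t in temps]
values[10] += 5.0
clean_temps, clean_values = remove_outliers(temps, values, n_sigma=3)
```

With the thresholds given above, `n_sigma` would be 3 for  $V_{OC}$  and 2 for  $I_{SC}$  and  $P_{MPP}$ .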

**Figure 7:** Removed statistical outliers (red).

#### *2.3 Data analysis*

The cleaned data is then analysed in more depth. In the first analysis, the temperature coefficients of the modules are extracted over the entire timespan using a linear regression on the cleaned data. Measurements taken with an irradiance between 950 W/m<sup>2</sup> and 1050 W/m<sup>2</sup> are analysed. Additional evaluations using larger irradiance ranges are carried out to evaluate the sensitivity of the results to this filter. In each step, the minimum irradiance considered is reduced by 50 W/m<sup>2</sup> and the maximum irradiance increased by the same amount. In the case of a very strong degradation over a short timespan, for example as shown in Figure 6, the data is not considered in this evaluation.

In the second analysis, the monthly coefficients are extracted using a linear regression on monthly data. This approach enables the evaluation of the development of the thermal coefficients over time by comparing them during the same month over several years. Some sources of inaccuracies of the determination of these coefficients described in the introduction are thus minimized. Furthermore, possible seasonal variations of these coefficients over a year can be analysed.
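A minimal sketch of such a monthly evaluation is given below. It is an illustration under stated assumptions rather than the code used in the study: the record layout, the function names and the synthetic data are hypothetical, and the slope of each monthly fit is normalized to the fitted value at 25 °C to obtain a relative coefficient, which is an assumed convention.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def monthly_coefficients(records, g_min, g_max, t_ref=25.0):
    """Relative temperature coefficient (1/K) per (year, month).

    records: iterable of (timestamp, irradiance, temperature, value_stc).
    Only measurements with g_min <= irradiance <= g_max are used.
    """
    groups = defaultdict(list)
    for stamp, g, temp, value in records:
        if g_min <= g <= g_max:
            groups[(stamp.year, stamp.month)].append((temp, value))
    result = {}
    for key, points in sorted(groups.items()):
        temps = [p[0] for p in points]
        vals = [p[1] for p in points]
        if len(points) < 2 or max(temps) == min(temps):
            continue  # a line cannot be fitted to this month
        slope, intercept = fit_line(temps, vals)
        result[key] = slope / (slope * t_ref + intercept)
    return result


# Synthetic data: January points with a coefficient of -0.3 %/K at 600 W/m^2,
# plus one July point at 900 W/m^2 that the irradiance filter removes.
records = [(datetime(2018, 1, 15, 12, 0), 600.0, float(t),
            40.0 * (1 - 0.003 * (t - 25.0))) for t in range(10, 21)]
records.append((datetime(2018, 7, 15, 12, 0), 900.0, 45.0, 37.0))
coefficients = monthly_coefficients(records, g_min=500.0, g_max=700.0)
```

Comparing `coefficients[(year, month)]` for the same month across several years then gives the development over time described above.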

Three different approaches for the choice of the analysed irradiance range are taken. The corresponding results are compared and the impact of the used irradiance filter is analysed. In the first approach, the range is set to 500 W/m<sup>2</sup> – 1200 W/m<sup>2</sup>. Using these values, on average ~100 measurements are available each month in the winter. Most of them are close to the lower limit of this range. However, many more measurements are available at higher irradiances in the summer. Thus, in the second approach only measurements with an irradiance between 500 W/m<sup>2</sup> and 700 W/m<sup>2</sup> are considered. In the third approach, different filters are chosen during different months of the year. The settings are summarized in Table 1. The chosen irradiance filters ensure that most measurements are obtained under a clear-sky condition. Approximately 100 measurements or more are available each month using these filters, depending on the weather and on the periods filtered out due to systematic measurement errors.

#### **Table 1:** Irradiance filters used



Narrower ranges would reduce the number of measurements, which reduces the statistical significance and yields a worse result from the linear regression. Wider ranges would increase the variation of the conditions under which the measurements are taken. This leads to a larger spread of the data and correspondingly a larger error margin of the fit. In certain conditions, this could even lead to a systematic error of the determined coefficient. All irradiance filters given here are thus chosen carefully. They are specific to the location of the analysed modules and should not be taken as a general guideline.

Because the spectral irradiance has been shown to influence α, it is analysed and compared in different conditions. The first analysis compares the spectra in the winter and summer at a similar total irradiance. The second analysis evaluates differences in the spectrum during the summer with similar total irradiance, but at different times and conditions: clear sky condition in the early morning and in the evening as well as cloudy sky condition at noon.

#### *3 Results*

#### *3.1 Analysis of temperature coefficients*

The results of the analysis over the entire measured timespan are given in Table 2. Several discrepancies between the results and the corresponding datasheet values can be observed. The obtained values of α show on average the largest differences from the datasheet. They are suspected to stem from the high sensitivity of  $I_{\text{SC}}$  to various environmental parameters and the corresponding issues of determining α accurately, as described in the introduction. In addition to the differences between the experimental results and the datasheet, this sensitivity causes a large spread of the data. The corresponding fit can show a large error margin, even after the rigorous data filtering. Figure 8 and Figure 9 show examples of a good fit, commonly observed with  $V_{\text{OC}}$ , and a bad fit, commonly observed with  $I_{\text{SC}}$  and  $P_{\text{MPP}}$ . Extending the analysed irradiance range increases the temperature coefficients of most modules. A reason for this behavior was not found.



**Figure 8:** Example of a good fit and clean data.



**Figure 9:** Example of a large data spread and a correspondingly bad fit.

Another interesting result is the positive value of  $\gamma$  for the a-Si module. It is connected to the large increase in I<sub>sc</sub> with increasing temperature, as well as an increase of the fill factor. This effect has been reported before and is connected to lower resistance effects and decreased recombination [15], [35]. Note that this module's datasheet contains a negative value of γ. A possible reason for this discrepancy, in addition to the issues with an accurate determination of  $I_{\text{SC}}$ , is the thermal history of the modules. In thin-film devices, this history has been shown to influence the obtained values of the temperature coefficients. Thus, the corresponding standard specifies that the irradiation history and the thermal history have to be indicated in a report presenting the measurements of these coefficients in a laboratory setting [28].

**Table 2:** The temperature coefficients of several modules obtained from the analysis using the 950 W/m<sup>2</sup> – 1050 W/m<sup>2</sup> filter, and from the corresponding datasheets. The error margins given are obtained from the scipy.stats module in Python and, in the case of multiple modules, calculated.

#### *3.2 Change of the temperature coefficients over time*

Using a single irradiance filter during the entire year leads to a large spread of the data and a correspondingly larger error margin of the coefficients. The spectral irradiance analysis shows that the chosen irradiance filters contain different incident spectra. While these differences are only in the range of a few percent at various wavelengths, they are suspected to be one of the reasons for the large error margins. Reducing the size of the analysed irradiance range to reduce these inaccuracies would remove almost all measurements in the winter.

Thus, the third approach described in chapter 2.3 using different irradiance filters during the year yields the best results. The resulting thermal coefficients show no systematic changes during the analysed period. An example of a CIGS module is shown in Figure 10. However, they show large seasonal variations. There are various possible reasons for this behavior. Non-linear effects of the irradiance translation become more apparent for larger differences to the STC irradiance. Furthermore, different thermal coefficients at various temperatures are possible. Only a single datasheet of the analysed modules contains a second value for  $\beta$ . It is higher at lower temperatures, which is in line with the results found in this study.



**Figure 10:** Development of the temperature coefficients of a CIGS module over time. The lines show the standard error obtained from the scipy.stats Python module.

## *4 Discussion*

The open-circuit voltage  $V_{\text{OC}}$  is, in general, the parameter that varies most strongly with a change in temperature. This is in line with the literature presented in the introduction. The generation-recombination balance is, of all the processes influencing the power extracted from the module, the most sensitive to a temperature change. Furthermore, the results show a small spread of the data and a reasonably good agreement with the datasheet values, indicating a comparably low dependency on other climatic parameters.

The short-circuit current generally increases with increasing temperature. This is caused by a higher collection efficiency at higher temperatures [17]. The measured data exhibits a large spread, leading to possible inaccuracies of the fitting process. This also agrees with the literature, which emphasizes the high sensitivity of the short-circuit current to outside influences besides the temperature. Thus, a narrow irradiance filter is required for an accurate determination of  $\alpha$ , as long as enough measurements are available using this filter.

The same issues cause a large inaccuracy in the determination of γ. The calculated values differ from the datasheet for several modules. These difficulties in determining α and γ accurately, and their dependence on other environmental parameters, show that a single value obtained in a laboratory according to IEC 60891 cannot be taken for an accurate prediction of module behavior in the field. Similarly, an extrapolation of such behavior from one climate and location to another can prove difficult and carries a certain inaccuracy. The multiple values at different temperatures and irradiances defined in IEC 61853-1 would serve as a much better basis for such a prediction.

The modules at the test site will be monitored further. The methodology presented in this paper can be applied to larger systems and modules installed in different climates for a better statistical analysis of the development of the temperature coefficients, as well as their correlation to other parameters.

# *5 Conclusions*

The temperature coefficients of the analysed photovoltaic modules do not change in the installed climate over several years. They exhibit a seasonal variation, which can be linked to larger errors stemming from the translation of the measurements to STC irradiance, as well as the sensitivity of these parameters to other factors. Examples include the spectrum of the incident light, the temperature and temperature variations between the cells inside a module. These results emphasize the requirement of a more in-depth evaluation of these parameters for an accurate lifetime and yield analysis of PV modules installed in different locations in the field.

# *6 Acknowledgments*

This project has received funding from the European Union's Horizon 2020 programme in the framework of the project "SolarTrain" under GA No. 721452 and the Slovenian Research Agency under the research programme P2-0197. The authors thank Kristijan Brecl, Matevž Bokalič and Julian Ascencio-Vazquez, University of Ljubljana, for useful discussion.

# *7 Conflict of Interest*

The authors declare no conflict of interest.

# *8 References*

- 1. M. A. Green, Solar cells: operating principles, technology, and system applications. Englewood Cliffs, NJ: Prentice-Hall, 1982.
- 2. W. Shockley and H. J. Queisser, 'Detailed Balance Limit of Efficiency of p-n Junction Solar Cells', Journal of Applied Physics, vol. 32, no. 3, pp. 510–519, Mar. 1961, https://doi.org/10.1063/1.1736034.
- 3. R. T. Ross, 'Some Thermodynamics of Photochemical Systems', The Journal of Chemical Physics, vol. 46, no. 12, pp. 4590–4593, Jun. 1967, https://doi.org/10.1063/1.1840606.
- 4. L. C. Hirst and N. J. Ekins-Daukes, 'Fundamental losses in solar cells', Progress in Photovoltaics: Research and Applications, vol. 19, no. 3, pp. 286–293, May 2011, https://doi.org/10.1002/pip.1024.
- 5. P. Würfel and U. Würfel, Physics of solar cells: from basic principles to advanced concepts, 2nd, updated and expanded ed. Weinheim: Wiley-VCH, 2009.
- 6. C. H. Henry, 'Limiting efficiencies of ideal single and multiple energy gap terrestrial solar cells', Journal of Applied Physics, vol. 51, no. 8, pp. 4494–4500, Aug. 1980, https://doi.org/10.1063/1.328272.
- 7. W. Ruppel and P. Wurfel, 'Upper limit for the conversion of solar energy', IEEE Transactions on Electron Devices, vol. 27, no. 4, pp. 877–882, Apr. 1980, https://doi.org/10.1109/T-ED.1980.19950.

- 8. A. Luque and A. Martí, 'Increasing the Efficiency of Ideal Solar Cells by Photon Induced Transitions at Intermediate Levels', Physical Review Letters, vol. 78, no. 26, pp. 5014–5017, Jun. 1997, https://doi.org/10.1103/PhysRevLett.78.5014.
- 9. W. H. Press, 'Theoretical maximum for energy from direct and diffuse sunlight', Nature, vol. 264, no. 5588, p. 734, Dec. 1976, https://doi.org/10.1038/264734a0.
- 10. M. A. Green et al., 'Solar cell efficiency tables (Version 53)', Progress in Photovoltaics: Research and Applications, vol. 27, no. 1, pp. 3–12, Jan. 2019, https://doi.org/10.1002/pip.3102.
- 11. W. Shockley and W. T. Read, 'Statistics of the Recombinations of Holes and Electrons', Physical Review, vol. 87, no. 5, pp. 835–842, Sep. 1952, https://doi.org/10.1103/PhysRev.87.835.
- 12. R. N. Hall, 'Electron-Hole Recombination in Germanium', Physical Review, vol. 87, no. 2, pp. 387–387, Jul. 1952, https://doi.org/10.1103/PhysRev.87.387.

- 13. P. V. Auger, 'Sur les rayons β secondaires produits dans un gaz par des rayons X', C.R.A.S., vol. 177, pp. 169–171, 1923.
- 14. V. Naumann et al., 'Explanation of potential-induced degradation of the shunting type by Na decoration of stacking faults in Si solar cells', Solar Energy Materials and Solar Cells, vol. 120, pp. 383–389, Jan. 2014, https://doi.org/10.1016/j.solmat.2013.06.015.

- 15. M. A. Green, 'General temperature dependence of solar cell performance and implications for device modelling', Progress in Photovoltaics: Research and Applications, vol. 11, no. 5, pp. 333–340, Aug. 2003, https://doi.org/10.1002/pip.496.

- 16. O. Dupré, R. Vaillon, and M. A. Green, Thermal Behavior of Photovoltaic Devices. Cham: Springer International Publishing, 2017.
- 17. O. Dupré, R. Vaillon, and M. A. Green, 'Physics of the temperature coefficients of solar cells', Solar Energy Materials and Solar Cells, vol. 140, pp. 92–100, Sep. 2015, https://doi.org/10.1016/j.solmat.2015.03.025.

- 18. O. Dupre, R. Vaillon, and M. A. Green, 'Experimental Assessment of Temperature Coefficient Theories for Silicon Solar Cells', IEEE Journal of Photovoltaics, vol. 6, no. 1, pp. 56–60, Jan. 2016, https://doi.org/10.1109/JPHOTOV.2015.2489864.
- 19. International Electrotechnical Commission, 'IEC 60904-9:2007 - Photovoltaic devices - Part 9: Solar simulator performance requirements', International Standard, 2007.

- 20. G. Leary, G. Switzer, G. Kuntz, and T. Kaiser, 'Comparison of xenon lamp-based and led-based solar simulators', in 2016 IEEE 43rd Photovoltaic Specialists Conference (PVSC), Portland, OR, USA, 2016, pp. 3062–3067, https://doi.org/10.1109/PVSC.2016.7750227.

- 21. G. Landis, 'Solar Cell Temperature Coefficients', presented at the SPRAT 13, Washington, DC, 1994, p. 16.
- 22. J. H. Fatehi, C. Kedir, C. Tumengko, J. L. R. Watts, and N. Riedel, 'Results from flash testing at multiple irradiances and temperatures across five photovoltaic testing labs', in 3rd PV Performance Modeling Workshop, Santa Clara, CA, 2014, https://doi.org/10.13140/rg.2.2.21370.24007.
- 23. C. R. Osterwald, M. Campanelli, G. J. Kelly, and R. Williams, 'On the reliability of photovoltaic short-circuit current temperature coefficient measurements', in 2015 IEEE 42nd Photovoltaic Specialist Conference (PVSC), New Orleans, LA, 2015, pp. 1–6, https://doi.org/10.1109/PVSC.2015.7355842.

- 24. A. Pavgi, 'Temperature coefficients and thermal uniformity mapping of PV modules and plants', Master Thesis, Arizona State University, Tempe, AZ, 2016.
- 25. J. Zhao, A. Wang, S. J. Robinson, and M. A. Green, 'Reduced temperature coefficients for recent high-performance silicon solar cells', Progress in Photovoltaics: Research and Applications, vol. 2, no. 3, pp. 221–225, Jul. 1994, https://doi.org/10.1002/pip.4670020305.
- 26. M. A. Green, 'Solar cell fill factors: General graph and empirical expressions', Solid-State Electronics, vol. 24, no. 8, pp. 788–789, Aug. 1981, https://doi.org/10.1016/0038-1101(81)90062-9.
- 27. E. Skoplaki and J. A. Palyvos, 'Operating temperature of photovoltaic modules: A survey of pertinent correlations', Renewable Energy, vol. 34, no. 1, pp. 23–29, Jan. 2009, https://doi.org/10.1016/j.renene.2008.04.009.

- 28. International Electrotechnical Commission, 'IEC 60891:2009 - Photovoltaic devices - Procedures for temperature and irradiance corrections to measured I-V characteristics', International Standard, 2009.
- 29. International Electrotechnical Commission, 'IEC 61853-1:2011 - Photovoltaic (PV) module performance testing and energy rating - Part 1: Irradiance and temperature performance measurements and power rating', International Standard, 2011.
- 30. D. C. Jordan, S. R. Kurtz, K. VanSant, and J. Newmiller, 'Compendium of photovoltaic degradation rates: Photovoltaic degradation rates', Progress in Photovoltaics: Research and Applications, vol. 24, no. 7, pp. 978–989, Jul. 2016, https://doi.org/10.1002/pip.2744.

- 31. D. C. Jordan, T. J. Silverman, J. H. Wohlgemuth, S. R. Kurtz, and K. T. VanSant, 'Photovoltaic failure and degradation modes: PV failure and degradation modes', Progress in Photovoltaics: Research and Applications, vol. 25, no. 4, pp. 318–326, Apr. 2017, https://doi.org/10.1002/pip.2866.

- 32. T. Curtis et al., 'Temperature coefficient of power (Pmax) of field aged PV modules: impact on performance ratio and degradation rate determinations', in Reliability of Photovoltaic Cells, Modules, Components, and Systems X, San Diego, United States, 2017, p. 22, https://doi.org/10.1117/12.2281840.

- 33. J. Kurnik, M. Jankovec, K. Brecl, and M. Topič, 'Development of outdoor photovoltaic module monitoring system', Informacije MIDEM, vol. 38, no. 2, pp. 75–80, 2008.
- 34. H. Ibrahim and N. Anani, 'Variations of PV module parameters with irradiance and temperature', Energy Procedia, vol. 134, pp. 276–285, Oct. 2017, https://doi.org/10.1016/j.egypro.2017.09.617.
- 35. Y. Riesen, M. Stuckelberger, F.-J. Haug, C. Ballif, and N. Wyrsch, 'Temperature dependence of hydrogenated amorphous silicon solar cell performances', Journal of Applied Physics, vol. 119, no. 4, p. 044505, Jan. 2016, https://doi.org/10.1063/1.4940392.



Copyright © 2019 by the Authors. This is an open access article distributed under the Creative Commons Attribution (CC BY) License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Arrived: 19. 11. 2019 Accepted: 08. 01. 2020

https://doi.org/10.33180/InfMIDEM2019.405



Journal of Microelectronics, Electronic Components and Materials Vol. 49, No. 4(2019), 229 – 239

# *Grammatical Evolution-based Analog Circuit Synthesis*

*Matevž Kunaver* 

*University of Ljubljana, Faculty of Electrical Engineering, Ljubljana, Slovenia*

**Abstract:** Computer-aided circuit design is becoming one of the mainstream methods for helping circuit designers. Multiple new methods have been developed in this field, including Evolutionary Electronics. A lot of work has been done in this field, but there is still room for improvement, since some of the solutions lack flexibility in circuit design (diversity of components, limited topology etc.) or lack complex fitness functions that would enable the synthesis of more complex circuits. The research presented in this article aims to improve this by introducing a Grammatical Evolution-based approach for circuit synthesis. Grammatical Evolution offers great flexibility since it is rule based: adding a new element is as simple as writing one additional line of initialization code. In addition, the use of a complex multi-criteria function allows us to create circuits that can be as complex as required, thus further increasing the flexibility of the approach. To achieve this, we use a combination of Python and SPICE to create a series of netlists, evaluate them in the PyOpus environment, and select the best possible circuit for the task. We demonstrate the efficiency of our approach in three different case studies where we automatically generate oscillators and high/low-pass filters of second and third order.

**Keywords:** Automated synthesis; analog circuits; grammatical evolution; computer-aided design; evolutionary algorithms

# *Analog Circuit Synthesis Using Grammatical Evolution*

**Abstract:** Computer-aided circuit design is becoming one of the key tools of electronic circuit designers. Many new approaches have recently emerged in this field, for example evolutionary electronics. Although the field is developing rapidly, many improvements are still possible and necessary, mainly because many existing techniques do not offer enough flexibility in circuit synthesis (limited set of elements, topology restrictions etc.) or do not allow the development of more complex circuits. With an approach based on grammatical evolution, we aim to remove these shortcomings. Grammatical evolution is an extremely flexible technique that works on the basis of rules (i.e. commands with which an individual circuit element is built). Adding a new element type to a system that uses grammatical evolution is therefore no more complicated than entering an additional line into the initialization code. In addition, our approach uses a multi-criteria function that enables the synthesis of an arbitrarily complex circuit. The whole system was developed and tested in the Python and SPICE environments, with which we created a series of circuit description files (netlists), evaluated them in the PyOpus environment and used the criterion function to select the circuit that is best suited to the given task. We demonstrate the usefulness of our method on three examples in which we automatically synthesized oscillators and high-pass and low-pass filters of second and third order.

**Keywords:** Automated synthesis, analog circuits, grammatical evolution, computer-aided design, evolutionary algorithms

*\* Corresponding Author's e-mail: matevz.kunaver@fe.uni-lj.si*

# *1 Introduction*

Analog circuit design has gradually changed from manual design (in the 1970s) to computer assisted design where a highly skilled engineer uses an assortment of computer tools to create a circuit with the specified characteristics. These tools range from simple schematic design to accurate circuit simulators (Simulation Program with Integrated Circuit Emphasis – SPICE [1]) that can simulate circuit behaviors and thus show if a circuit is actually worth implementing. This of course greatly reduces both the material costs (since one must only implement the final, best circuit) and the time to production since the simulations require only a fraction of time compared to actually creating and measuring each potential circuit.

These simulations, however, still require a lot of expertise from the user since he/she must still be able to manually specify things like the desired topology and necessary elements [2, 3, 4]. The rise of easily accessible (and more powerful) computers led to the research and development of tools that are capable (to some degree) of creating the desired topology automatically given a list of possible elements (resistors, coils, generators etc.) [5, 6] and circuit characteristics. Such tools are of great interest since they streamline the design process, save a lot of time and also grant the possibility of circuit design to a user who might not be well-versed in physical circuit design.

These tools then led to the development of Electronic Design Automation (EDA) [7] and Evolutionary Electronics [8], which allow automatic circuit synthesis and optimization. The idea behind this approach is to have the engineer simply specify the characteristics in an appropriate format (a cost function) and have the system create the appropriate circuit without any further interaction. This was first made possible when Koza [9] created topologies in 1992 with arbitrary connections using genetic programming (GP). Genetic programming allows a great diversity of developed topologies and offers great flexibility when selecting a cost function and the topology components. The approach quickly spread to other research groups [10, 11, 12], who tackled issues such as bloat (an excessive growth of circuits with surplus elements, such as two or more elements of the same type in series or parallel) and alternative topology representations [12, 13].

One such topology representation takes the form of a formal grammar used by Grammatical Evolution (GE), an evolutionary computation technique related to GP. GE offers great flexibility and simplicity during the initialization of the problem. It has already been used successfully for simple circuit generation by Castejon et al. [6], but that work lacked a more complex fitness function. In this paper, we propose combining the GE approach with a complex fitness function, similar to the one proposed by Rojec et al. [5]. Such a function allows several parameters (slope, gain, cut-off frequency etc.) to be fine-tuned at the same time, but has so far been limited to a matrix-based GP approach. We believe that the combination of GE and an appropriate cost function should result in a tool that allows a simple and efficient generation of complex circuits that meet several different criteria at once. The structure of the paper is as follows. Section 2 covers the methods used for our circuit representation: SPICE syntax, fitness functions, GE production rules and population manipulation techniques. Section 3 presents the results of three different case studies that we performed during the development process, and Section 4 summarizes our findings and compares them to three other techniques.

## *2 Materials and methods*

Our goal is to create a system that can synthesize a circuit given two input parameters: (i) a list of acceptable components (resistors, capacitors etc.) and (ii) the desired circuit characteristic interpreted as a GE fitness function.

The final system should not require any in-depth knowledge of circuit design beyond being capable of formulating the desired circuit response and finding the circuit models in the SPICE environment.

We rely on a combination of Python (in which we implemented the GE algorithm), SPICE (for circuit evaluation) and PyOpus [14] as the link between them. During initialization we:

- Specify the components that can be used in the circuit (resistor, capacitor, source…) using the SPICE format.
- Define the fitness (RMSE, multi-criteria, PyOpus measurements etc.).
- Define the GE grammar production rules.
- Set the GE parameters:
	- Number of generations: how many groups of programs to evaluate.
	- Population size: how many individuals (subcircuits) are produced in each generation.
	- Crossover rules: which nodes can be replaced.
	- Mutation probability.
	- Elite size: what percentage of best individuals make it into the next generation.

Once all the parameters are set, we run the experiment by performing the following steps:

- 1. Evolve a sub-circuit for each genome sequence in the current generation.
- 2. Create a netlist for the generated circuit.
- 3. Evaluate the sub-circuit using the selected fitness function.
- 4. Trim the population: keep the best 10%.
- 5. Create a new generation of circuits by combining the previous best 10% and generating the rest with mutation, crossover and selection (see 2.2.4).
- 6. Repeat this until we have created and evaluated the desired number of generations.
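The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `evolve`, the one-point crossover, and the choice of parents from the better half of the population are hypothetical, and netlist creation and SPICE evaluation (steps 2 and 3) are replaced by an arbitrary `cost` callable.

```python
import random

def evolve(cost, pop_size=30, genome_len=10, generations=20,
           elite_frac=0.1, mutation_rate=0.05, seed=0):
    """Toy GE-style loop: evaluate, keep the elite, refill by crossover and mutation."""
    rng = random.Random(seed)

    def new_genome():
        return [rng.randint(1, 256) for _ in range(genome_len)]

    population = [new_genome() for _ in range(pop_size)]
    history = []  # best cost seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=cost)      # steps 1-3: evaluate and sort
        history.append(cost(ranked[0]))
        n_elite = max(1, int(elite_frac * pop_size))
        elite = ranked[:n_elite]                   # step 4: keep the best 10%
        children = []
        while len(children) < pop_size - n_elite:  # step 5: refill the population
            a, b = rng.sample(ranked[:pop_size // 2], 2)  # assumed parent selection
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]              # assumed one-point crossover
            if rng.random() < mutation_rate:
                child[rng.randrange(genome_len)] = rng.randint(1, 256)
            children.append(child)
        population = elite + children              # step 6: next generation
    return min(population, key=cost), history
```

Because the elite is copied unchanged, the best cost recorded in `history` can never get worse from one generation to the next.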

### *2.1 Circuit representation*

Since we use SPICE to simulate and evaluate our circuits, we use the SPICE notation (netlists) for our circuits to make the procedure as simple as possible. Our circuit consists of a "main" circuit that features the necessary evaluation elements (sources, loads, ground ports etc.) and the "evaluation" subcircuit, which was generated by our algorithm. The "main" circuit is shown in Figure 1.



**Figure 1:** Main circuit for evaluation of the generated subcircuits.

The two circuits are stored in separate netlist files and used as input parameters for our PyOpus simulation.

### *2.2 Grammatical evolution*

Grammatical Evolution is one of the emerging methods from the field of Evolutionary computation [6, 15, 16]. The approach is based on using a grammar that consists of production rules for each possible circuit element and its characteristics. The definition of the grammar structure is one of the most important steps when using the GE approach. Each circuit (i.e. filter, oscillator, amplifier circuit) requires a different grammar structure since it can contain different elements, ports, and so on. The grammar is usually defined using the Backus-Naur form (BNF).

Once we select the grammar, we must also select the evolution hyper-parameters such as the population size, crossover type and mutation probability. These parameters affect the duration of the simulation (a larger population requires more time for evaluation) and the success rate of each run (a larger population and more frequent mutations can cover more of the search space and possibly find a better solution).

All our case studies featured a population size of 300 individuals per generation, 250 generations per run and a fixed mutation rate of 5%.

#### *2.2.1 The Grammar and production rules*

The grammar used for this article was designed to accommodate future expansions and modifications, i.e. to be as flexible as possible when adding new components. An interesting thing to note is that although we created these rules from scratch we ended up with rules that were quite similar to those used by Castejon et al. [6].

The grammar consists of rules formulated in BNF format. These rules generate either non-terminal nodes (components) or terminal nodes that represent component characteristics such as component type (resistor, coil and capacitor), values (e.g. resistance, capacitance) and ports. The names of elements are generated later (see 2.3 for more details). The common rules used in all our case studies are listed in Table 1.

**Table 1:** Genetic grammar production rules.



The starting symbol (see Table 2) is then used to generate the number of elements present in the subcircuit. This can be done recursively where one generates nodes until a terminal node is reached or one achieves the maximum possible depth. Alternatively, we can use an iterative approach where we set a maximum number of components (as opposed to maximum depth in the recursive approach). Here we deviated from the grammar form used in [6] as we used a different approach to keeping the number of elements within a preset maximum number. Castejon uses a dynamic option where a codon can either add an additional element or not. In our approach, we set the maximum number of elements and let the codons select whether or not an element exists. This was done to limit the circuit bloat.

### *2.2.2 Individuals and chromosomes*

Grammatical Evolution creates individuals using a sequence of chromosomes. Each chromosome sequence contains 300 randomly generated chromosomes (a random integer between 1 and 256). These chromosomes are then interpreted using the GE rules (see 2.2.3 for an example). The same sequence of chromosomes will always generate the same subcircuit as long as the production rules remain the same. This greatly simplifies storage and reproduction of our results since we do not need to store the actual netlists, files or objects but simply need to store simple sequences of 300 integers, thus greatly reducing the required storage space.

### *2.2.3 Demo Sequence*

An example of individual generation would therefore proceed as follows. The system first creates a chromosome sequence {229, 52, 125, 40, 60, 99, 100….} and uses a starting sequence as shown below.

### **"<part><part><part>"**

The algorithm then focuses on replacing the first symbol in the sequence. The symbol "<part>" has three possible values which is why the algorithm then uses modulo operation on the current genome (229%3) which gives the result 1 which is the second of the possible values. The "<part>" symbol is therefore replaced with the "<res>" symbol.

### **"<res><part><part>"**

The resistor has only one possible value so the algorithm proceeds with a modulo 1 operation (52%1) which returns 0 and thus selects the only possible resistor type.

### **"rXX (<gPair>)<num>e<n><part><part>"**

The next symbol in the sequence is then "<gPair>", which has 20 possible values. Using modulo 20 on the next genome in the sequence (125%20) returns 5, which means that the resistor is connected between the input and output ports of the subcircuit.

### **"rXX (input output)<num>e<n><part><part>"**

The next genome (40%2) sets the "<num>" part to a single digit of "<n>".

### **"rXX (input output)<n>e<n><part><part>"**

Then the next genome selects one of the ten possible values for the digit (60%10) and sets it to "1".

### **"rXX (input output) 1e<n><part><part>"**

Lastly, the value of exponent is set to "0" using the next genome and selecting between the 10 possible values (99%10).

### **"rXX (input output)1e0<part><part>"**

The first element is therefore a resistor with a resistance of 1 Ohm, connected between the input and output ports.

Having set all the parameters of the first "<part>" symbol, the algorithm then returns to the start symbol and uses the next genome (100%3) to select the type of the next element, which would in this case again be a resistor. This continues until all the symbols have been replaced with their parameter values.
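The mapping walked through above can be reproduced with a short Python sketch. Since Table 1 is not reproduced here, the grammar below is a hypothetical stand-in: the `GRAMMAR` dictionary, the node list, the `<cap>` rule and the ordering of the `<gPair>` and `<n>` options are assumptions chosen only so that the option counts and selected indices agree with the example (3 options for `<part>`, 20 for `<gPair>`, 2 for `<num>`, 10 for `<n>`).

```python
import re
from itertools import combinations

# Hypothetical grammar; the real production rules are those of Table 1.
NODES = ["input", "0", "1", "2", "3", "4", "output"]
GRAMMAR = {
    "<part>": ["", "<res>", "<cap>"],             # index 0: element not present
    "<res>": ["rXX (<gPair>) <num>e<n> "],
    "<cap>": ["cXX (<gPair>) <num>e-<n> "],
    "<gPair>": [f"{a} {b}" for a, b in combinations(NODES, 2)][:20],
    "<num>": ["<n>", "<n><n>"],
    "<n>": list("1234567890"),
}
SYMBOL = re.compile(r"<[A-Za-z]+>")

def decode(codons, start):
    """Replace the leftmost symbol using (codon % number_of_options) until none remain."""
    text = start
    used = 0
    while True:
        match = SYMBOL.search(text)
        if match is None:
            return text.strip()
        options = GRAMMAR[match.group(0)]
        codon = codons[used % len(codons)]        # wrap around if codons run out
        used += 1
        text = text[:match.start()] + options[codon % len(options)] + text[match.end():]

print(decode([229, 52, 125, 40, 60, 99], "<part>"))
```

With the codons from the example and a single `<part>` start symbol, the sketch consumes one codon per replaced symbol (including the single-option `<res>` rule) and yields the same 1 Ohm resistor between the input and output ports.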

### *2.2.4 Population manipulation*

Once all the individuals in the current population are evaluated and sorted, the question of producing the next generation occurs. This is done using several manipulation techniques. Using experience and advice from other experiments [17, 5, 18] we take the best individuals from the current generation and move them into the next until we fill one tenth of it (so in our case of 300 individuals per generation we allow the best 30 individuals to proceed into the next one). Since we already evaluated these individuals, we will not need to do so again.

Next, we check if any of the (non-elite) individuals will mutate (a 5% chance in our case). When mutating, the algorithm selects one random node of the individual and removes any nodes connected to it. The chromosome corresponding to this node is then randomly changed to a new value. Afterwards, the GE rules are used to re-create the mutated individual. A mutation can therefore result in a completely new circuit or in a minor change such as a different numeric value of an element or different ports to which it is connected. So for example, if we begin with the following sequence:

### **"rXX (input output)1e10 cXX(1 2) 4e-3"**

Mutation can either change one of the parameters of the elements (for example the capacitance of the capacitor)

### **"rXX (input output)1e10 cXX(1 2) 10e-6"**

or completely replace one of the elements with a new one (for example replace the resistor with a new capacitor)

### **"cXX (1 output)2e-9 cXX(1 2) 4e-3"**
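At the chromosome level, a mutation of this kind can be sketched as below. This is a simplified illustration, not the implementation used in this work: it resamples one randomly chosen chromosome rather than first selecting a node of the derivation, and the function names `mutate` and `maybe_mutate` are hypothetical.

```python
import random

def mutate(codons, rng, low=1, high=256):
    """Return a copy of `codons` with one randomly chosen value resampled."""
    mutated = list(codons)
    position = rng.randrange(len(mutated))
    mutated[position] = rng.randint(low, high)
    return mutated

def maybe_mutate(codons, rng, probability=0.05):
    """Apply `mutate` with the given probability (5% in our case studies)."""
    return mutate(codons, rng) if rng.random() < probability else list(codons)
```

Whether the change is minor or replaces a whole element depends on which symbol the affected chromosome was used for when the individual is re-created with the GE rules.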

Following the mutation sequence, the algorithm will perform a crossover function on all the remaining individuals. This is done by selecting two individuals and randomly selecting a node in the first one. We then check if we can find a node of the same type in the second individual. If we do, we switch them between the two individuals. If not, we leave the individuals as they are. The level of exchange can be set to high level elements only (exchange complete elements with all attributes) or any level desired (exchange ports, numeric values etc.). The resulting individuals replace the originals. An example of crossover can be shown using the following two sequences which represent two individuals:

### **"cXX (1 output) 2e-9 cXX (1 2) 4e-3" "rXX (input output) 1e10 cXX (1 2) 10e-6"**

Our algorithm would then decide to crossover on the "<cap>" node, meaning that it would try to find a node of this type in each individual. Let's assume that it selects the first capacitor in the first individual and the only capacitor in the second individual. The algorithm then swaps the two and stores them as the new individuals resulting in:

### **"cXX (1 2) 10e-6 cXX (1 2) 4e-3" "rXX (input output) 1e10 cXX (1 output) 2e-9"**
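The exchange in this example can be sketched as follows, with the level of exchange set to complete elements. The representation of an individual as a list of `(kind, nodes, value)` tuples and the names `swap_elements` and `crossover` are assumptions made for illustration; the actual algorithm operates on the nodes generated by the grammar.

```python
import random

def swap_elements(first, second, i, j):
    """Swap element i of `first` with element j of `second` if their types match."""
    a, b = list(first), list(second)
    if a[i][0] == b[j][0]:
        a[i], b[j] = b[j], a[i]
    return a, b

def crossover(first, second, rng):
    """Pick a random element in `first` and a same-type partner in `second`, if any."""
    i = rng.randrange(len(first))
    partners = [j for j, element in enumerate(second) if element[0] == first[i][0]]
    if not partners:
        return list(first), list(second)   # no matching type: leave both unchanged
    return swap_elements(first, second, i, rng.choice(partners))

ind1 = [("c", "1 output", "2e-9"), ("c", "1 2", "4e-3")]
ind2 = [("r", "input output", "1e10"), ("c", "1 2", "10e-6")]
new1, new2 = swap_elements(ind1, ind2, 0, 1)  # first capacitor <-> only capacitor
```

Calling `swap_elements(ind1, ind2, 0, 1)` reproduces the two resulting individuals listed above.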

Lastly, after selection we check if the new generation contains enough diversity. Without doing this, we would quickly find that most of the individuals contain the same circuit with only minor differences (for example, a resistor of 10 Ohms instead of 9 Ohms). While this could be useful when optimizing the final solution, it can quickly lead us into a dead-end of the search space (a local minimum of the fitness function). We therefore check the diversity of the population every 5 generations and remove any duplicate individuals that we find. All such duplicates are then replaced with "fresh" randomly generated circuits, which will (hopefully) increase the diversity and thus the chance of finding the best possible solution. The frequency of this check can be as high or as low as required, but we found that when working with simple circuits, it is beneficial to do this as frequently as possible.
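A minimal sketch of such a diversity check is given below. It assumes that duplicates are detected by comparing chromosome sequences directly, which may differ from the comparison used in practice, and the name `replace_duplicates` is hypothetical.

```python
import random

def replace_duplicates(population, rng, genome_len=300):
    """Keep the first copy of each genome; replace repeats with fresh random genomes."""
    seen = set()
    result = []
    for genome in population:
        key = tuple(genome)
        while key in seen:  # duplicate found: draw a fresh random individual
            key = tuple(rng.randint(1, 256) for _ in range(genome_len))
        seen.add(key)
        result.append(list(key))
    return result
```

In a run, this function would be called every few generations (every 5 in our case) just before the new generation is evaluated.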

Once all these steps are done, the next generation is complete and ready for evaluation.

#### *2.3 Netlists and PyOPUS*

Once an individual is transformed into a string sequence using the production rules (for example "cXX (1 output) 3e-8 cXX (1 2) 39e-5 cXX (input 2) 73e-2 cXX (3 output) 2e-7 rXX (1 0) 8e3 rXX (2 0) 4e7 rXX (1 3) 07e8 rXX (3 0) 96e2 rXX (input output) 06e9 cXX (2 0) 07e-7") we need to transform this sequence into a suitable subcircuit for the PyOpus simulator. Only then are we able to evaluate it (calculate its cost function). To do this, we utilize a simple string parser that performs two important tasks:

- Create a unique name for each of the circuit elements (change the first cXX to c01, the second to c02 and so on).
- Flag any circuit containing illegal nodes as faulty and not appropriate for simulation (for example, cXX (2 2) 07e-9 is a capacitor whose two terminals are connected to the same node). Faulty circuits can otherwise cause the simulator to loop, resulting in lost processing time.

If the circuit is faulty, the individual is not evaluated and has its cost set to the maximum possible value. Otherwise, the processed string is stored into a temporary file along with a header containing all the required SPICE subcircuit characteristics. This file is then used with the SPICE simulator during the evaluation procedure.
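The two parser tasks described above can be sketched as follows. The regular expression, the per-type numbering and the function name `to_netlist_lines` are assumptions for illustration; the header with the SPICE subcircuit characteristics and the file handling are omitted.

```python
import re

# One element: type letter, the placeholder "XX", two nodes in parentheses, a value.
ELEMENT = re.compile(r"([rcl])XX\s*\(\s*(\S+)\s+(\S+)\s*\)\s*(\S+)")

def to_netlist_lines(sequence):
    """Return uniquely named element lines, or None if the circuit is faulty."""
    counters = {}
    lines = []
    for kind, node_a, node_b, value in ELEMENT.findall(sequence):
        if node_a == node_b:
            return None  # both terminals on the same node: flag as faulty
        counters[kind] = counters.get(kind, 0) + 1
        lines.append(f"{kind}{counters[kind]:02d} ({node_a} {node_b}) {value}")
    return lines

print(to_netlist_lines("cXX (1 output) 3e-8 cXX (1 2) 39e-5 rXX (1 0) 8e3"))
```

An individual for which the function returns `None` would skip simulation and receive the maximum cost.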

#### *2.4 Fitness functions*

The core of the GE approach is the fitness function used to evaluate individuals. This function can be as simple or as complex as desired but must provide a clear (numerical) fitness value of each individual in the generation. The better the value (usually meaning the lowest possible value) the better the individual and the better the chance for this individual to be the best possible solution for the problem at hand.

In our first case study, we used the same approach as Castejon et al. [6]: a curve fitting metric. However, instead of using a custom weighted function, we used the "standard" form of curve comparison, the root-mean-square error (RMSE). This approach has proven to be viable in the past [19] and meets the GE fitness function requirements, i.e. a smaller value is better. An example result when using RMSE as the fitness function is shown in Figure 2.
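For reference, an RMSE fitness over two sampled curves can be computed as in the sketch below. It assumes that the reference and the generated response are sampled at the same points; the function name `rmse` is illustrative.

```python
import math

def rmse(reference, candidate):
    """Root-mean-square error between two equally sampled curves (lower is better)."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("curves must be non-empty and of equal length")
    squared = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    return math.sqrt(squared / len(reference))
```

A perfect match gives a value of 0, and every point contributes equally regardless of where on the curve it lies.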





Once we moved to more complex circuits (filters of second and third order), we quickly discovered that simple fitness functions do not work sufficiently and lead to a low success rate. An example of a third-order filter design using RMSE is shown in Figure 3.

**Figure 3:** A failure when creating the third-order filter using an RMSE fitness function.

The problem stems from the fact that such a fitness function simply compares the difference between two curves on a point-by-point basis and is unable to include any additional design requirements, such as the desired level of damping. Even using a weighted version of such a function (as in [6]) does not help.

We therefore designed a different (multi criteria) fitness function as suggested by Rojec et al. [5]. The proposed function allows focus on several characteristics at the same time and also allows assigning different priorities to each of them. Once we switched to the new fitness function, we were able to generate working filters which matched the desired characteristics quite nicely as seen in Figure 4.



**Figure 4:** A successfully generated low-pass filter of the second order.

An additional advantage of the new fitness function was also the fact that we no longer required a comparison curve. When using RMSE we had to manually create a comparison curve, which meant that we had to have a comparison circuit ready. Using this baseline circuit we performed the PyOpus analysis and stored the results for further comparison. This is of course a bit controversial since it means that we had to have at least one example of a working circuit in order to be able to find other possible solutions. This can of course become a problem when dealing with more complex circuits or even when dealing with a user who does not have necessary knowledge.

Using the new fitness function we simply had to specify the desired characteristics (see 3.2.2) and the GE algorithm was able to run. As an additional bonus, we also sped up the evaluation procedure by performing our evaluations during the simulation itself. The speed increase was noticeable since we were now able to produce the final circuit in ten minutes or less (as compared to one hour reported by Rojec et al.).

# *3 Case studies*

#### *3.1 Oscillator circuit*

In the first case study, we wanted to test several GE rule sets and see if they can produce feasible and workable circuits. We focused on replicating the performance of an oscillator circuit. We used RMSE as the fitness function and compared the voltage curve of the original oscillator with the GE generated curve. Figure 5 shows an example result where the solid line represents the original circuit, while the dashed one represents the GE generated circuit.



**Figure 5:** An RMSE generated oscillator voltage response.

In the evolution process, we used three different rule sets. The first set featured pre-set elements (one resistor, capacitor and coil – see rule (i) in Table 2) with the GE algorithm focusing on finding the correct element values. The idea behind this set was proving that our proposed technique actually finds a possible solution even when faced with severe limitations (in the form of a fixed circuit).

The second set allowed the algorithm to create as many components as possible (up to 20). We only limited the number of available ports and combinations by using the <gPair> element from Table 1. The start symbol (shown as (ii) in Table 2) therefore featured  $20$  <part> elements for which the GE algorithm chose whether or not they translated into an actual component.

The last rule set further relaxed the constraints and allowed any number of elements and any number of

ports. The start symbol for this case became recursive as shown in Table 2. This means that, each time the GE algorithm created the next component, there was a 50% chance (since there are two possibilities in the <p> symbol) of creating an additional component and 50% chance of this being the last component in the circuit.

**Table 2:** Oscillator circuit grammar.



Each of the rule sets resulted in a circuit that matched the original curve almost perfectly as seen in Figure 5. A sample circuit generated with the last set of rules is shown in Figure 6.



**Figure 6:** A GE generated oscillator circuit.

The results show that the GE algorithm is up to the task of creating the desired circuit. We can, however, see that there is a potential problem with bloat, since the third set of rules created a large number of components in most cases (an oscillator normally requires only three components). This can be alleviated during post-processing by analyzing the netlist and replacing parallel/serial elements with their equivalents.

### *3.2 Second-order filters*

The first case study showed that we can generate simple circuits using our GE based approach. We then moved to a more complex example: second-order low/high-pass filters. As before, the aim was to automatically generate a filter with the desired characteristics. At the beginning, we retained the RMSE fitness function (and had to generate a comparison circuit for each example), but we quickly discovered that this fitness function did not have a high success rate: while we were always able to create a circuit with filter-like characteristics (i.e., with a cut-off frequency and damping), we were unable to do so consistently. Upon reflection (see 2.4 for more detail) we switched to a multi-criteria fitness function as suggested by Rojec et al. [5].

At the beginning of the study, we focused on high-pass filters but later also generated low-pass versions to demonstrate the flexibility of our approach.

### *3.2.1 Using RMSE*

We used a pre-set circuit to generate a comparison curve that was used to evaluate the GE generated circuits. We also used experience from the first case study to limit our rules to using up to 12 different circuit components to reduce bloat (limiting the upper number of components was also suggested in [5]). In addition, we removed the coil element from the component list, since a filter circuit usually consists of only resistors and capacitors.

After several extensive runs, we noticed issues with the success rate of the algorithm. While it did find a possible solution in some of the runs, it quite often either failed completely or produced something that did not resemble a second-order filter at all, exhibiting quite a high cost function value. After analyzing several of the results, we concluded that the issue lies in the nature of the RMSE fitness function, as discussed in 2.4.

We therefore switched to a more complex fitness function that allowed us to emphasize important aspects of the filter transfer functions and hopefully produce better (and more consistent) results.

### *3.2.2 Multi-criteria fitness function*

We based our approach on the fitness function presented by Rojec et al. [5]. For the second-order filter, we focused on gain, cut-off frequency, ripple, and damping. Gain measures an increase (i.e., amplification) in the voltage level before the dampening begins. Gain of an ideal filter is equal to zero, meaning that the input level is stable before any changes applied by the filter. We calculated gain using this equation:

$$
g = |0dB - gain| \tag{1}
$$

The cut-off frequency indicated the frequency at which the damping begins. We set it to 20 kHz in our case and calculated the difference between this value and the frequency created by our algorithm using the following equation:

$$
f_{\text{off}} = \left| \log_{10} 20 \, kHz - \log_{10} f_{\text{pass}} \right| \tag{2}
$$

Ripple indicates whether or not the input level before the cut-off frequency remains stable (i.e. the whole bandwidth is amplified at the same level). We calculated the ripple level using equation 3.

$$
r = \begin{cases} ripple - 0.5 \, dB, & ripple > 0.5 \, dB \\ 0, & ripple \le 0.5 \, dB \end{cases} \tag{3}
$$

Last but not least the damping indicates whether or not the designed filter actually achieved the desired level of damping after the cut-off frequency (i.e., a drop of 40 dB for a second-order filter). This was verified using the following equation:

$$
d = \begin{cases} 40 \, dB - damping \, , damping < 40 \, dB \\ 0, damping \ge 40 \, dB \end{cases} \tag{4}
$$

A graphical representation of these four criteria is shown in Figure 7.



**Figure 7:** Multi-criteria fitness function components as shown in [5].

In the end, we combined the four characteristics into a single cost function:

$$
cost = w_1 r + w_2 d + w_3 f_{off} + w_4 g \tag{5}
$$

The four weights ($w_1$ to $w_4$) allow us to select which of the characteristics is more important to us. For example, if we favor achieving the desired level of damping and care less about hitting the filter frequency precisely, we raise the value of $w_2$ and decrease the value of $w_3$. During our experiments, we emphasized gain and ripple since this produced the best results. We selected the weights experimentally by using values from 1 to 20 and then chose the set that produced the best results in several runs. The four weights were set to 15, 10, 5, and 4.
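Equations (1) to (5) translate directly into a few lines of Python. The sketch below is illustrative only: the function name `filter_cost` and its parameter names are hypothetical, the four measurements are assumed to be already extracted from the simulation, and the default weights assume that the listed values 15, 10, 5 and 4 correspond to $w_1$ to $w_4$ in that order.

```python
import math

def filter_cost(gain_db, f_pass_hz, ripple_db, damping_db,
                weights=(15, 10, 5, 4), f_target_hz=20e3,
                max_ripple_db=0.5, target_damping_db=40.0):
    """Weighted sum of the four penalties from equations (1) to (5)."""
    g = abs(0.0 - gain_db)                                        # eq. (1)
    f_off = abs(math.log10(f_target_hz) - math.log10(f_pass_hz))  # eq. (2)
    r = max(0.0, ripple_db - max_ripple_db)                       # eq. (3)
    d = max(0.0, target_damping_db - damping_db)                  # eq. (4)
    w1, w2, w3, w4 = weights
    return w1 * r + w2 * d + w3 * f_off + w4 * g                  # eq. (5)
```

As a worked example under these assumptions, a circuit with 1 dB gain, a cut-off at 2 kHz, 1.5 dB ripple and 30 dB damping gives $r = 1$, $d = 10$, $f_{off} = 1$ and $g = 1$, so the cost is $15 \cdot 1 + 10 \cdot 10 + 5 \cdot 1 + 4 \cdot 1 = 124$, while a circuit that meets all four targets has a cost of 0.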

The new fitness function also simplified our algorithm in two important ways: (i) we no longer need a comparison circuit (which means fewer requirements for prior knowledge) and (ii) the evaluation of the four characteristics can be done automatically during the PyOpus simulation, thus reducing the amount of post-processing. This resulted in a noticeable speed increase during the test runs.

We soon discovered that the new fitness function finds workable circuits a lot more frequently (practically always) and works a lot more consistently during runs. Thus, we can conclude that it is crucial for a complex circuit to have a complex fitness function in order to be able to generate results consistently.

An example of a generated transfer function (compared to the idealized transfer function) is shown in Figure 8 with the matching generated circuit in Figure 9.



**Figure 8:** A second-order high-pass filter transfer function.



**Figure 9:** A second-order high-pass filter circuit generated by our GE algorithm.

We were also able to generate a low-pass filter with slight modifications to the cost function or, to be more precise, the PyOpus simulation parameters. Namely, we used the PyOpus measurement module to extract the cut-off frequency using the following expression: m.ACbandwidth(abs(v('out')),abs(scale()),filter='hp')

To design a low-pass filter we simply switched the filter parameter to 'lp' and were able to proceed. The resulting transfer function and circuit are shown in Figures 10 and 11.



**Figure 10:** A second-order low-pass filter transfer function.



**Figure 11:** A second-order low-pass filter circuit generated using our method.

Looking at the circuits, we can see that there is a certain level of redundancy (for example two parallel resistors R1 and R2 in Figure 11), but (in our view) still not something to worry about. As already mentioned, much of this can be removed using a post processing algorithm that analyzes and optimizes the final netlist. This is, however, not possible while the algorithm is running, since it would require an extensive reworking of the chromosome structure. Nevertheless, we will consider this as a part of possible future improvements of our algorithm.
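As an illustration of what such a post-processing step might look like for parallel elements, consider the sketch below. It is not part of the presented system: the element representation as `(kind, node_a, node_b, value)` tuples and the name `merge_parallel` are hypothetical, all values are assumed to be positive, and serial combinations are not handled.

```python
from collections import defaultdict

def merge_parallel(elements):
    """Merge same-type elements that share both nodes into a single equivalent."""
    groups = defaultdict(list)
    for kind, node_a, node_b, value in elements:
        nodes = tuple(sorted((node_a, node_b)))  # node order does not matter
        groups[(kind, nodes)].append(value)
    merged = []
    for (kind, nodes), values in groups.items():
        if kind == "c":
            value = sum(values)                        # parallel capacitances add
        else:
            value = 1.0 / sum(1.0 / v for v in values)  # parallel resistors (or coils)
        merged.append((kind, nodes[0], nodes[1], value))
    return merged
```

Two parallel 10 Ohm resistors between the same pair of nodes would, for example, be replaced by a single 5 Ohm resistor.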

### *3.3 Third-order filters*

For the last case study, we increased the circuit complexity by raising the filter order from second to third. Implementing this change was straightforward, since we only had to change the target damping value from 40 dB to 60 dB in equation 4. The rest of the experiment used the same parameters (i.e., the grammar rules remained unchanged, no additional elements were added, and the number of runs and generations remained the same).

We were again able to consistently generate filter circuits with the desired characteristics in most of the runs. An example transfer function that we obtained from one of the evolution runs can be seen in Figure 12. This function belongs to the circuit shown in Figure 13. Interestingly enough, we did not need to add additional components into the algorithm during the initialization phase (meaning that we were able to create a third-order filter using up to 14 components).









We are able to make the same observations about the obtained circuits as we were during the previous case study – there is a certain level of redundancy (and bloat) but, due to an upper limit on the number of components, this remains on a manageable level and can be further reduced during post-processing.

### *4 Comparison with other methods*

Compared to the original genetic programming based approach proposed by Koza [10], our approach offers more flexibility since it is not limited by the types of embryonic circuits introduced in the initialization phase. This means that we do not need to specify any starting topology or/and circuit and can leave the algorithm to find its own solution. This also reduces the amount of prior knowledge required to use our approach.

The approach presented in this paper also builds on findings of Castejon et al. [6] and Rojec et al. [5]. The former research group also used a GE approach and created separate rule sets similar to the ones presented in our work. They did not, however, limit bloat in the circuit (they allowed any number of elements) nor did they tackle more complex circuit examples. The latter is the consequence of them using only a very rudimentary cost function which, as we have demonstrated in this article, severely reduces the algorithm's success rate. In our approach, we used a multi-criteria cost function (similar to the one used in [5]) and were consequently able to produce more complex circuits as well.

An additional improvement of our approach is a considerable increase in the speed with which we generate the circuits. While Castejon et al. do not explicitly state the amount of time required for their experiments, the approach used by Rojec et al. is reported to take anywhere from one to 12 hours. All of the case studies presented in this work took less than 15 minutes per run to complete while producing comparable circuits. We could probably reduce this further by using multiple processors and hyper-threading, but since the process already took such a small amount of time, we left this for future work.

# *5 Conclusions*

We successfully developed a GE based system for automated topology synthesis that works with a high-level rule set and a complex (or simple) fitness function. We were able to generate several circuits in a small amount of time with appropriate grammar modification. As a consequence we believe that this approach shows merit and can be of benefit to other engineers. It can also be developed further to improve its performance even more.

An additional improvement that we plan to develop is automatic post-processing of the evolved circuits. At the moment, we are only able to make sure that the evolved netlist contains correct component names and does not contain any illegal connections. This could be further improved by automatically detecting and removing any redundancies (e.g., replacing two or more serial or parallel elements of the same type with a single one).

Another option would be a repairing mechanism that would be used before the fitness function evaluation. Such a mechanism could detect faulty circuits, useless circuit branches and other defects even before evaluation and either try to correct them or flag the circuit as faulty and eliminate the individual. This could significantly improve the approach, but will require some time to develop since we would also have to modify the individuals' chromosome sequence to reflect the repairs.

We believe that the presented GE technique has considerably more potential for the efficient evolution of useful and complex circuits than has been explored so far. We will therefore aim to further develop the approach by increasing the complexity of the generated circuits, expanding the rule sets to include additional elements (transistors, amplifiers, etc.) and experimenting with different options offered by the PyOpus environment (i.e., alternative modes of evaluation of the fitness function, parallelization, and others). Last but not least, we plan to work towards creating an open-source library to be available for other researchers and research groups in the community as a part of the PyOpus package.

# *6 Acknowledgments*

The authors acknowledge the financial support from the Slovenian Research Agency (research core funding No. P2-0246 Algorithms and Optimization Methods in Telecommunications).

# *7 Conflicts of interest*

The authors declare no conflict of interest.

The founding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

### *8 References*

- 1. L. W. Nagel and D. O. Pederson, "SPICE (Simulation Program with Integrated Circuit Emphasis)," 1973.
- 2. J. Olenšek, T. Tuma, J. Puhan and Á. Bűrmen, "A New Asynchronous Parallel Global Optimization Method Based on Simulated Annealing and Differential Evolution," Applied Soft Computing, vol. 11, pp. 1481-1489, 2011.
- 3. Á. Bűrmen, F. Bratkovič, J. Puhan, I. Fajfar and T. Tuma, "Extended global convergence framework for unconstrained optimization," Acta mathematica Sinica, vol. 20, pp. 433-440, 2004.
- 4. Á. Bűrmen, T. Tuma and I. Fajfar, "A combined simplex-trust-region method for analog circuit optimization," Journal of circuits, systems, and computers, vol. 17, pp. 123-140, 2008.
- 5. Ž. Rojec, Á. Bűrmen and I. Fajfar, "Analog circuit topology synthesis by means of evolutionary computation," Engineering Applications of Artificial Intelligence, vol. 80, pp. 48-65, 2019.
- 6. E. Castejon and F. J. Carmona, "Automatic design of analog electronic circuits using grammatical evolution," Applied Soft Computing, vol. 62, pp. 1003-1018, 2018.
- 7. G. Gielen and R. Rutenbar, Computer-aided design of analog and mixed-signal integrated circuits, New York: John Wiley & Sons, 2002.
- 8. R. Zebulum, M. Pacheco and M. Vellasco, "Comparison of different evolutionary methodologies applied to electronic filter design," in IEEE International Conference on Evolutionary Computation Proceedings, 1998.
- 9. J. R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection, Cambridge, MA: MIT Press, 1992.
- 10. J. R. Koza, F. H. Bennett III, D. Andre, M. A. Keane and F. Dunlap, "Automated Synthesis of Analog Electrical Circuits by Means of Genetic Programming," IEEE Transactions on Evolutionary Computation, vol. 1, pp. 109-128, Jul 1997.
- 11. G. Györök, "Crossbar network for automatic analog circuit synthesis," in 2014 IEEE 12th International Symposium on Applied Machine Intelligence and Informatics (SAMI), 2014.
- 12. Z. Gan, Z. Yang, T. Shang, T. Yu and M. Jiang, "Automated synthesis of passive analog filters using graph representation," Expert Systems with Applications, vol. 37, no. 3, pp. 1887-1898, 2010.
- 13. L. Torres-Papaqui, D. Torres-Munoz and T.-C. E., "Synthesis of VFs and CFs by manipulation of generic cells," Analog Integr. Circuits Signal Process, vol. 46, pp. 99-102, 2006.
- 14. A. Bűrmen, J. Puhan, J. Olenšek, G. Cijan and T. Tuma, "PyOPUS - Simulation, Optimization, and Design," EDA Laboratory, Faculty of Electrical Engineering, University of Ljubljana, 2016.
- 15. M. O'Neill and C. Ryan, "Grammatical evolution," IEEE Transactions on Evolutionary Computation, vol. 5, pp. 349-358, 2001.
- 16. I. Fajfar, Á. Bűrmen and J. Puhan, "Grammatical evolution as a hyper-heuristic to evolve deterministic real-valued optimization algorithms," Genetic programming and evolvable machines, vol. 19, pp. 473-504, 2018.
- 17. I. Fajfar, J. Puhan and Á. Bűrmen, "Evolving a Nelder–Mead Algorithm for Optimization with Genetic Programming," Evolutionary Computation, vol. 5, no. 3, pp. 351-373, 2017.
- 18. I. Fajfar and T. Tuma, "Creation of numerical constants in robust gene expression programming," Entropy, vol. 20, pp. 1-15, 2018.
- 19. M. Kunaver and T. Požrl, "Diversity in recommender systems - a survey," Knowledge-based systems, vol. 123, pp. 154-162, 2017.



Copyright © 2019 by the Authors. This is an open access article distributed under the Creative Commons Attribution (CC BY) License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Arrived: 22. 11. 2019 Accepted: 08. 01. 2020

https://doi.org/10.33180/InfMIDEM2019.406



Journal of Microelectronics, Electronic Components and Materials Vol. 49, No. 4(2019), 241 – 254

# *Design of an Optimized Twin Mode Reconfigurable Adaptive FIR Filter Architecture for Speech Signal Processing*

*Padmapriya S.1 , Jagadeeswari M.1 , Lakshmi Prabha V.2*

*1 Sri Ramakrishna Engineering College, Department of ECE, Coimbatore, Tamilnadu, India 2 PSG College of Technology, Department of ECE, Coimbatore, Tamilnadu, India*

**Abstract:** Reconfigurability, low complexity and low power are the key requirements of FIR filters employed in multi-standard wireless communication systems. Digital filters are used to filter the audio data stream and increase the reliability of the speech signal. It is therefore imperative to design area-optimized, low-power reconfigurable FIR filter architectures.

The reconfigurable architecture designed in this research achieves a lower adaptation delay and an area-delay-power efficient implementation of a Delayed Least Mean Square (DLMS) adaptive filter with reversible logic gates. Optimized Adaptive Reconfigurable (OAR) FIR filter architectures are proposed. The optimizations are applied across the combinational blocks by reducing the pipeline delays, sampling period, energy consumption and area, in order to improve the Power-Delay Product (PDP) and Energy Per Sample (EPS). The efficiency of the proposed architectures is verified on speech signals corrupted by various real-time noises at different Signal to Noise Ratios (SNRs).

**Keywords:** Adaptive Filter; Least Mean Square Algorithm; Reconfigurable Filtering; Speech Signal Processing

# *Dizajn optimiranega nastavljivega FIR filtra v za procesiranje govornega signala*

**Izvleček:** Nastavljivost, enostavnost in nizka poraba so glavne zahteve FIR filtrov v multistandardnih sistemih brezžične komunikacije. Digitalni filtri se uorabljajo za filtriranje zvokovnega prenosa in izboljšanje zanesljivosti govora v signal. Zato je potrebno načrtovati majhne filtre z nizko porabo energije. Predlagana arhitektura je sposobna dosegati nizko porabo in majhno uporabo prostora za DLMS filter z vrati z reverzno logiko. Optimizirana arhitektura je implementirana v bloke za zmanjševanje zakasnitev, period vzorčenja in porabe energije ter prostora in povečanja PDP in EPS. Za verifikacijo so bili uporabljeni številni vzorci govora z veliko šuma v realnem času.

**Ključne besede:** adaptiven filter; algoritem najmanjših povprečnih kvadratov; procesiranje govornega signala

*\* Corresponding Author's e-mail: padmapriya.s@srec.ac.in*

# *1 Introduction*

FIR digital filters are widely used as a main tool in various signal processing and image processing applications [1]. Most portable electronic devices such as cellular phones, personal digital assistants, and hearing aids require DSP for high performance. The miniaturization of handheld devices with good performance increases the demand for sophisticated DSP algorithm implementations that are area optimized and consume as little power as possible.

An important branch of signal and information processing is adaptive signal processing. An adaptive filter models the relationship between two signals iteratively and in real time. A filter can be realized either as a set of program instructions running on an arithmetic processing device such as a microprocessor or a DSP chip, or in a semi-custom or custom VLSI integrated circuit. The fundamental operation of an adaptive filter depends on the specific physical realization that it takes. Adaptive signal processing is developing rapidly, has attracted wide research attention, and expands the application range of digital signal processing [2].

The least mean square (LMS) algorithm of Widrow and Hoff, described by Widrow and Stearns [18], is a key algorithm for adaptive signal processing. It is widely applied because of its good stability, computational efficiency and ease of implementation.

Ting et al. [3] have proposed a fine-grained pipelined design that limits the critical path to at most one addition time, which supports a high sampling frequency. However, it involves considerable area overhead for pipelining and higher power consumption than the systolic architecture for the Delayed Least Mean Square Adaptive Digital Filter (DLMS ADF) proposed by Van & Feng [4].

To reduce the number of adaptation delays, Meher & Maheshwari [5] have proposed a 2-bit multiplication cell and used it with an efficient adder tree for pipelined inner-product computation, minimizing the critical path and area without increasing the number of adaptation delays. To achieve a lower adaptation delay and an area-delay-power efficient implementation, Park & Meher [6] have proposed a novel partial product generator and an optimization of the previously reported design. It is found to be more efficient in terms of the Power Delay Product (PDP) and Energy Per Sample (EPS).

In the above works, critical-path analysis and the necessary design considerations are not taken into account. The designs of Meher & Park [7, 8] still consume considerable area, which could be substantially reduced. Meher & Park [8] presented a critical-path analysis of the LMS adaptive filter, conditional signed carry-save accumulation to reduce the sampling period, and a fast bit clock for carry-save accumulation to reduce area complexity and power consumption. An architecture for the LMS adaptive filter with minimal use of pipeline stages was derived, which results in lower area complexity and power consumption without compromising the desired processing throughput. A fixed-point implementation of the LMS adaptive filter was proposed by Meher & Park [9]; it uses an efficient implementation of multiplications and inner-product computation with computation sharing, built on a novel Partial Product Generator (PPG) block.

The Block Least Mean Square (BLMS) ADF proposed by Clark et al. [10] and Baghel & Shaik [11] is one of the useful derivatives of the LMS ADF for fast and computationally efficient implementation of ADFs. The BLMS ADF accepts a block of input samples to compute a block of output samples and updates the weights using a block of errors in every training cycle.

Baghel & Shaik [12] suggested a Distributed-Arithmetic (DA)-based structure for FPGA implementation of BLMS ADFs. A low-complexity design that uses a single Multiply-ACcumulate (MAC) cell to compute both the filter output and the weight-increment term, supporting a low sampling rate, has been proposed by Jayashri et al. [13] for BLMS ADFs.

The throughput of DA-LMS ADFs can be too low for real-time applications because of the bit-serial nature of DA computation. The DA-LMS scheme shares a Look Up Table (LUT) between the computation of the filter output and the weight-increment term, but it cannot be applied to derive a DA-based structure for BLMS ADFs, because in the BLMS ADF a separate Inner-Product Computation (IPC) is performed for the filter output and for the weight-increment term. In the LMS ADF, IPC is performed only to calculate the filter output. Mohanty & Meher [14, 15] derived a DA formulation of the BLMS algorithm in which both convolution and correlation are performed using a common LUT for the computation of filter outputs and weight-increment terms. This saves a significant number of LUT words and adders, which constitute the major hardware components in DA-based computing structures. A parallel architecture for the implementation of the DA-based BLMS ADF is derived.

Mohanty et al. [16] proposed a DA-based architecture that is scalable to higher filter lengths and block sizes. Maximal sharing of the parallel LUT update operation and of the LUT contents was also proposed. However, the structural complexity increases with filter length and block size.

Tasleem Khan & Shaik [17] proposed pipelined, optimal-complexity DA structures for the least-mean-square (LMS) adaptive filter. Offset-Binary-Coding (OBC) combinations of input samples were used to reduce the hardware complexity of the proposed structures. Novel low-complexity implementations of the offset term, the weight-update block and the shift-accumulate unit were also proposed.

# *2 Adaptive filtering and principle*

### *2.1 Adaptive filter*

### *2.1.1 Adaptive filtering principle*

The FIR filter weights are updated by the Widrow-Hoff Least Mean Square (LMS) algorithm described by Widrow & Stearns [18], chosen for its low complexity, stability and satisfactory convergence performance as analyzed by Haykin & Widrow [19]. The filter output signal is compared to a second signal $d_n$, called the desired response signal, by subtracting the two samples at time *n*. The weights of the LMS adaptive filter during the $n^{th}$ iteration are updated according to the following equations.

$$
W_{n+1} = W_n + \mu e_n X_n \tag{1}
$$

where  $\mu$  is the step size,

$$
e_n = d_n - y_n \tag{2}
$$

$$
y_n = W_n^T X_n \tag{3}
$$

with the input vector $X_n$ and weight vector $W_n$ at the $n^{th}$ iteration given by

$$
X_n = [x_n, x_{n-1}, \dots, x_{n-N+1}]^T
$$

$$
W_n = [w_n(0), w_n(1), \dots, w_n(N-1)]^T
$$

where $d_n$ is the desired response, $y_n$ is the filter output and $e_n$ denotes the error signal. The error signal is fed into a procedure which alters or adapts the parameters of the filter from time *n* to time (*n*+1) in a well-defined manner. As the time index *n* is incremented, the output of the adaptive filter is expected to become a progressively better match to the desired response signal through the adaptation process, such that the magnitude of $e_n$ decreases over time. In the adaptive filtering task, adaptation refers to the method by which the parameters of the system are changed from time index *n* to time index (*n*+1). The number and types of parameters within this system depend on the computational structure chosen for the system. The error $e_n$ becomes available after *m* cycles, where *m* is called the adaptation delay for pipelined designs with *m* pipeline stages. For an $N^{th}$-order FIR filter, the generation of each output sample $y_n$ takes *N*+1 Multiply-Accumulate (MAC) operations.
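The recursion of Eqs. (1) to (3) can be illustrated with a short floating-point sketch. It is only a behavioural model, not the hardware architecture discussed in this paper; the function name `lms_filter`, the zero initial weights and the zero-padded input history are assumptions made for the example.

```python
def lms_filter(x, d, num_taps, mu):
    """Behavioural LMS sketch following Eqs. (1)-(3).

    x: input samples, d: desired-response samples, mu: step size.
    Returns the final weights, the filter outputs and the errors.
    """
    w = [0.0] * num_taps        # W_n, assumed to start at zero
    window = [0.0] * num_taps   # X_n = [x_n, x_{n-1}, ..., x_{n-N+1}]
    outputs, errors = [], []
    for x_n, d_n in zip(x, d):
        window = [x_n] + window[:-1]                            # shift in the new sample
        y_n = sum(wi * xi for wi, xi in zip(w, window))         # Eq. (3)
        e_n = d_n - y_n                                         # Eq. (2)
        w = [wi + mu * e_n * xi for wi, xi in zip(w, window)]   # Eq. (1)
        outputs.append(y_n)
        errors.append(e_n)
    return w, outputs, errors
```

With a single tap, a constant input of 1 and a constant desired response of 2, the error in this sketch halves at every iteration for `mu = 0.5`, which shows the weight converging towards the value that matches the desired response.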

#### *2.1.2 Adaptive filtering with minimal delay and pipelining*

In the direct-form LMS adaptive filter, the critical path for computing the inner product (in order to obtain the filter output) is long. Pipelined implementation is required to reduce the critical path when it exceeds the desired sample period. The conventional LMS algorithm does not support pipelining because of its recursive structure. Meyer & Agrawal [20] and Long et al. [21] proposed the Delayed LMS (DLMS) algorithm, which supports pipelined implementation. In DLMS, the correction terms for updating the filter weights of the current iteration are calculated from the error corresponding to a past iteration. Many architectures were proposed by Ramanathan & Visvanathan [22], Van & Feng [4], Yi et al. [23] and Ting et al. [3] to reduce the adaptation delay, but area overhead and high power consumption were the trade-offs in all the designs. To overcome these drawbacks, a 2-bit multiplication cell was proposed by Meher & Maheshwari [5] and used with an adder tree for pipelined inner-product computation.

#### *2.1.3 Drawbacks of the pipelined adaptive filters*

The algorithms based on pipelined LMS have high computational complexity. For higher-order filters, the adaptation delay increases, and there is a trade-off between the adaptation delay, area, and power consumption. The 2-bit multiplication cell design [5] does not increase the number of adaptation delays, but it still occupies considerable area. Hence, an area-optimized, low-power adaptive reconfigurable FIR filter architecture is proposed in this research. In the proposed adaptive architecture, thresholding is performed to support reconfiguration in the existing adaptive designs. The proposed adaptive reconfigurable architectures are simple to design and suitable for filtering speech signals.

The Proposed Optimized Adaptive Reconfiguration (OAR) FIR Filter consumes less power, the area overhead is reduced, and the adaptation delay is decreased. Thresholding is done with the help of the adaptive filter coefficient values.

# *3 Delayed least mean square (DLMS) algorithm*

The conventional LMS algorithm does not favor pipelined implementation at high sampling rates, because the feedback error needed for updating the weights becomes available only after a delay. Cohen et al. [24] and Parhi [25] have proposed the DLMS algorithm for pipelined implementation. In DLMS, the error of a past iteration is used for calculating the correction terms for updating the filter weights of the current iteration.

#### *3.1 Implementation of direct-form DLMS algorithm:*

The error computation path is implemented in *m* pipelined stages, so the latency of error computation is *m* cycles. The error computed by the structure at the $n^{th}$ cycle is $e_{n-m}$. This error is used with the input samples delayed by *m* cycles to generate the weight-increment term. The weight-update equation of the DLMS algorithm is given by

$$
W_{n+1} = W_n + \mu e_{n-m} X_{n-m} \tag{4}
$$

where $e_{n-m} = d_{n-m} - y_{n-m}$ and $y_n = W_n^T X_n$.
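The effect of the adaptation delay in Eq. (4) can be sketched in the same behavioural style. The FIFO below stands in for the *m* pipeline stages; the function name `dlms_filter`, the zero initial weights and the choice to skip updates during the first *m* cycles are assumptions of this example, not details taken from the architecture.

```python
from collections import deque


def dlms_filter(x, d, num_taps, mu, m):
    """Behavioural DLMS sketch following Eq. (4) with adaptation delay m."""
    w = [0.0] * num_taps
    window = [0.0] * num_taps
    pending = deque()   # (e_k, X_k) pairs still travelling through the pipeline
    errors = []
    for x_n, d_n in zip(x, d):
        window = [x_n] + window[:-1]
        y_n = sum(wi * xi for wi, xi in zip(w, window))   # y_n = W_n^T X_n
        e_n = d_n - y_n
        errors.append(e_n)
        pending.append((e_n, window))
        if len(pending) > m:
            # The pair leaving the pipeline is (e_{n-m}, X_{n-m}).
            e_old, x_old = pending.popleft()
            w = [wi + mu * e_old * xi for wi, xi in zip(w, x_old)]
    return w, errors
```

Setting `m = 0` reduces the sketch to the ordinary LMS recursion, which makes it easy to compare the two algorithms on the same data.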

A generalized block diagram of the direct-form DLMS adaptive filter is shown in Fig. 1. It consists of an error-computation block as shown in Fig. 2 and a weight-update block as shown in Fig. 3. The number of delays *m* shown in Fig. 1 corresponds to the pipeline delays introduced by pipelining the error-computation block.



**Figure 1:** Generalized block diagram of Direct-form DLMS Adaptive Filter



**Figure 2:** Error Computation block of Direct-form DLMS Adaptive Filter



**Figure 3:** Weight-update block of Direct-form DLMS Adaptive Filter

#### *3.2 Structure of error computation block*

The structure of the error-computation unit of an *N*-tap DLMS adaptive filter is shown in Fig. 4. It consists of *N* 2-bit Partial Product Generators (PPG) corresponding to *N* multipliers, and a cluster of *L*/2 binary adder-trees followed by a single shift-add tree.

#### *3.3 Structure of partial product generator*

The structure of each PPG is shown in Fig. 5. It involves *L*/2 2-to-3 decoders and the same number of AND-OR Cells (AOC). The input word-length *L* is taken to be even. Each of the 2-to-3 decoders takes a 2-bit digit $(u_1 u_0)$ as input and produces three outputs $b_0 = u_0 \overline{u_1}$, $b_1 = \overline{u_0} u_1$, and $b_2 = u_0 u_1$, such that $b_0 = 1$ for $(u_1 u_0) = 1$, $b_1 = 1$ for $(u_1 u_0) = 2$, and $b_2 = 1$ for $(u_1 u_0) = 3$. The decoder outputs $b_0$, $b_1$ and $b_2$ along with *w*, 2*w* and 3*w* are fed to an AOC, where *w*, 2*w* and 3*w* are in 2's complement representation and sign-extended to have (*w*+2) bits each. While computing the partial product corresponding to the Most-Significant Digit (MSD), i.e., $(u_{L-1} u_{L-2})$ of the input sample, the AOC-(*L*/2-1) is fed with *w*, -2*w* and -*w* as input, to account for the sign of the input.

**Figure 4:** Structure of Error-Computation Block



**Figure 5:** Structure of Partial Product Generator (PPG)

#### *3.4 Structure of AND-OR cells*

The structure and function of the AOC are depicted in Fig. 6. Each AOC consists of 3 AND cells and 2 OR cells. The structure and function of the AND cells and OR cells are depicted in Fig. 6(b) and 6(c) respectively. Each AND cell takes an *n*-bit input D and a single-bit input *b*, and consists of *n* AND gates. It distributes all *n* bits of input D to its *n* AND gates as one of the inputs. The other inputs of all the *n* AND gates are fed with the single-bit input *b*. Each OR cell similarly takes a pair of *n*-bit input words, B and D, and has *n* OR gates. A pair of bits in the same bit position in B and D is fed to the same OR gate, so that $r_i = b_i + d_i$ for $i = 0, 1, \dots, n-1$.

**Figure 6:** (a) Structure of AND/OR cell, (b) AND gate and (c) OR gate

The output of an AOC is *w*, 2*w* and 3*w* corresponding to the decimal values 1, 2 and 3 of the 2-bit input  $(u_1u_0)$ respectively. The decoder along with the AOC performs a multiplication of input operand *w* with two-bit digit ( $u_1 u_0$ ), such that the PPG of Fig. 5 performs *L*/2 parallel multiplications of input word *w* with a 2-bit digit to produce *L*/2 partial products of the product word *wu*.
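The decoder and AOC behaviour described above can be checked with a small integer model. This is a sketch under stated assumptions: the helper names are hypothetical, Python integers replace fixed-width buses, and the AND/OR selection is modelled arithmetically, which is valid because at most one of $b_0$, $b_1$, $b_2$ is 1 at a time.

```python
def two_bit_digits(x, L):
    """Split an L-bit two's-complement integer into L/2 2-bit digits, least significant first."""
    if L % 2:
        raise ValueError("word length L must be even")
    u = x & ((1 << L) - 1)   # two's-complement bit pattern of x
    return [(u >> (2 * k)) & 0b11 for k in range(L // 2)]


def decode(digit):
    """2-to-3 decoder: exactly one output is 1 for digit values 1, 2 and 3."""
    u1, u0 = digit >> 1, digit & 1
    b0 = u0 & (1 - u1)        # digit == 1
    b1 = (1 - u0) & u1        # digit == 2
    b2 = u0 & u1              # digit == 3
    return b0, b1, b2


def and_or_cell(b, multiples):
    """Select one of three precomputed multiples (models the AND cells followed by OR cells)."""
    return b[0] * multiples[0] + b[1] * multiples[1] + b[2] * multiples[2]


def partial_products(w, x, L):
    """L/2 partial products of w * x; the most significant digit uses (w, -2w, -w)."""
    products = []
    for k, digit in enumerate(two_bit_digits(x, L)):
        is_msd = k == L // 2 - 1
        multiples = (w, -2 * w, -w) if is_msd else (w, 2 * w, 3 * w)
        products.append(and_or_cell(decode(digit), multiples))
    return products


def multiply(w, x, L):
    """Recombine the partial products according to their place values (4**k)."""
    return sum(p * 4 ** k for k, p in enumerate(partial_products(w, x, L)))
```

For example, with `L = 4` the input `x = -3` has the bit pattern `1101`, so the digits are 1 and 3, and `partial_products(5, -3, 4)` yields `[5, -5]`, that is $5 - 4 \cdot 5 = -15$.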

### *3.5 Proposed latched carry select adder (LCSA)*

A Carry Select Adder is used in the proposed reconfigurable FIR filter since it is a fast adder. A D-latch with an enable signal is introduced in the design. Latches are used to store one bit of information. As long as the enable signal is asserted, the latch outputs follow the inputs.

The architecture of the carry select adder with D-latch consists of five groups of Ripple Carry Adders (RCA) of different bit sizes, together with D-latches. The Carry Select Adder (CSA) proposed by Ramkumar & Kittur [26] has two RCAs, one for carry input $C_{in}=1$ and another for carry input $C_{in}=0$. Instead of using two separate adders as in the regular CSA, the proposed LCSA uses only one RCA block, which reduces area and power consumption; the other RCA block is replaced with a D-latch. Each of the two additions is performed in one clock cycle. When the clock goes high, the addition for carry input $C_{in}=1$ is performed. Otherwise the carry input is assumed to be zero and the sum is held in the adder itself.

The latch is used to store the sum and carry for $C_{in}=1$, as shown in Fig. 7. The carry out from the least significant bit adder is used as the control signal for the multiplexer that selects the final output carry and sum of the *n*-bit adder. Fig. 7 shows the *n*-bit adder, in which the LSB adder is a 2-bit wide ripple carry adder.



**Figure 7:** Proposed n-bit Latched Carry Select Adder



**Figure 8:** Internal structure of group 2

The upper half of the adder, i.e., the most significant part, is (*n*-2) bits wide and works according to the clock. Whenever the clock goes high, the addition for carry input one is performed. When the clock goes low, the carry input is assumed to be zero and the sum is held in the adder itself.

As Fig. 7 shows, the latch stores the sum and carry for $C_{in}=1$. The carry out from the previous stage, i.e., the least significant bit adder, is used as the control signal for the multiplexer that selects the final output carry and sum of the *n*-bit adder. If the actual carry input is one, the latched sum and carry are selected; for carry input zero, the output of the MSB adder is selected. $C_{out}$ is the output carry.



**Figure 9:** The adder structure of the filtering unit for  $N = 8$  and  $L = 16$ 

Fig. 8 shows the internal structure of group 2 of the proposed *n*-bit LCSA. Group 2 performs two one-bit additions, $a_2$ with $b_2$ and $a_3$ with $b_3$, using two Full Adders (FA) named $FA_2$ and $FA_3$ respectively. The third input to the full adder $FA_2$ is the clock instead of the carry, and the third input to the full adder $FA_3$ is the carry output from $FA_2$. The group 2 structure has three D-latches: two store $Sum_2$ and $Sum_3$ from $FA_2$ and $FA_3$ respectively, and the third stores the carry. A multiplexer is used for selecting the actual sum and carry according to the carry from the previous stage. The 6:3 multiplexer is a combination of 2:1 multiplexers. When the clock is low, $a_2$ and $b_2$ are added with carry input equal to zero. Because the clock is low, the D-latches are not enabled. When the clock is high, the addition is performed with carry input equal to one. All the D-latches are enabled and store the sum and carry for carry input equal to one. Depending on whether $C_{in}$ is 0 or 1, the multiplexer selects the actual sum and carry.
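The selection principle of the LCSA can be sketched behaviourally. The model below is a simplification and not the five-group circuit: it uses only a 2-bit LSB group and one (*n*-2)-bit upper group, and the two clock phases are represented by two successive calls to the same ripple-carry routine, the second result playing the role of the latched value. All function names are hypothetical.

```python
def ripple_carry_add(a, b, carry_in, width):
    """Bit-by-bit ripple-carry addition; returns (sum, carry_out) for `width` bits."""
    total, carry = 0, carry_in
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i             # full-adder sum bit
        carry = (ai & bi) | (carry & (ai ^ bi))     # full-adder carry out
    return total, carry


def latched_carry_select_add(a, b, n, low_width=2):
    """Add two unsigned n-bit operands with one reused upper adder and a latch.

    Assumes 0 <= a, b < 2**n. Returns (n-bit sum, carry_out).
    """
    low_mask = (1 << low_width) - 1
    high_width = n - low_width
    low_sum, low_carry = ripple_carry_add(a & low_mask, b & low_mask, 0, low_width)
    a_high, b_high = a >> low_width, b >> low_width
    # Clock high: the upper adder runs with carry input 1 and the result is latched.
    latched = ripple_carry_add(a_high, b_high, 1, high_width)
    # Clock low: the same adder runs with carry input 0 and holds its own result.
    direct = ripple_carry_add(a_high, b_high, 0, high_width)
    # The carry out of the LSB group drives the multiplexer.
    high_sum, carry_out = latched if low_carry else direct
    return (high_sum << low_width) | low_sum, carry_out
```

For instance, adding 255 and 1 with `n = 8` produces a carry out of the LSB group, so the latched upper result is selected and the call returns a sum of 0 with an output carry of 1.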



**Figure 10:** Structure of weight-update block

#### *3.5.1 Structure of adder-tree*

The shift-add operation is used for obtaining the desired inner product, but the adder size increases as the word length increases. *N*-1 additions are required to add *N* product values. To avoid such an increase in the word size of the adders, all the *N* partial products of the same place value, one from each of the *N* Partial Product Generators (PPGs), are added by one adder-tree.

All the *L*/2 partial products generated by each of the *N* PPGs are thus added by *L*/2 binary carry select adder-trees. The outputs of the *L*/2 adder-trees are then added by a shift-add tree according to their place values. Each of the binary carry select adder-trees requires $\log_2 N$ stages of adders to add *N* partial products, and the shift-add tree requires $\log_2 L - 1$ stages of adders to add the *L*/2 outputs of the *L*/2 binary carry select adder-trees. The addition scheme for the error-computation block for an 8-tap filter and input word size *L*=16 is shown in Fig. 9. For *N*=8 and *L*=16, the adder network requires eight binary carry select adder-trees with three stages each and a three-stage shift-add tree. Pipeline latches (represented by dashed lines) are introduced to reduce the critical path to one addition time. Pipelining is performed by a feed-forward cut-set retiming of the error-computation block.
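The regrouping of partial products by place value can be illustrated with an integer sketch. It assumes plain pairwise addition in place of the carry select adders, models each partial product as the weight times a signed 2-bit digit, and uses hypothetical function names; the stage counters simply count the levels of each tree.

```python
def signed_digit_products(w, x, L):
    """Partial products of w * x for the L/2 2-bit digits of x (least significant first)."""
    u = x & ((1 << L) - 1)
    digits = [(u >> (2 * k)) & 0b11 for k in range(L // 2)]
    if digits[-1] >= 2:          # the most significant digit carries the sign
        digits[-1] -= 4
    return [w * digit for digit in digits]


def adder_tree_sum(values):
    """Pairwise binary-tree addition; returns (sum, number of adder stages)."""
    values, stages = list(values), 0
    while len(values) > 1:
        if len(values) % 2:
            values.append(0)
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
        stages += 1
    return values[0], stages


def shift_add_tree(values):
    """Combine place-value sums (weights 4**k) pairwise; returns (sum, stages)."""
    values, stages, shift = list(values), 0, 2
    while len(values) > 1:
        if len(values) % 2:
            values.append(0)
        values = [values[i] + (values[i + 1] << shift) for i in range(0, len(values), 2)]
        shift *= 2               # each level doubles the distance between place values
        stages += 1
    return values[0], stages


def inner_product(weights, samples, L):
    """Inner product computed by L/2 adder-trees followed by one shift-add tree."""
    rows = [signed_digit_products(w, x, L) for w, x in zip(weights, samples)]
    column_sums, tree_stages = [], 0
    for k in range(L // 2):
        total, tree_stages = adder_tree_sum(row[k] for row in rows)
        column_sums.append(total)
    value, shift_stages = shift_add_tree(column_sums)
    return value, tree_stages, shift_stages
```

Because every adder-tree only sums partial products of one place value, its operands stay narrow, and the wide additions are confined to the final shift-add tree.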

### *3.6 Structure of weight-update block*

The proposed structure for the weight-update block is shown in Fig. 10. It performs *N* multiply-accumulate operations of the form $(\mu \times e) \times x_i + w_i$ to update *N* filter weights. The step size $\mu$ is taken as a negative power of two, so that the multiplication with the most recently available error is realized by a shift operation only. Each of the MAC units therefore performs the multiplication of the shifted value of the error with the delayed input samples $x_i$, followed by the addition of the corresponding old weight values $w_i$. All the *N* multiplications for the MAC operations are performed by *N* PPGs followed by *N* shift-add trees. Each of the PPGs generates *L*/2 partial products corresponding to the product of the recent shifted error value $\mu \times e$ with the *L*/2 2-bit digits of the input word $x_i$, where the subexpression $3\mu \times e$ is shared within the multiplier. Since the scaled error $(\mu \times e)$ is multiplied with all the *N* delayed input values in the weight-update block, this subexpression can be shared across all the multipliers as well. This leads to a substantial reduction of adder complexity. The final outputs of the MAC units constitute the desired updated weights, which are used as inputs to the error-computation block as well as the weight-update block for the next iteration.
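The use of a power-of-two step size can be illustrated with an integer sketch of one weight update. It assumes integer (fixed-point) data and models $\mu = 2^{-s}$ as an arithmetic right shift by `s` bits, which rounds towards minus infinity; the function name and the rounding choice are assumptions of the example rather than details of the proposed block.

```python
def update_weights(weights, delayed_samples, error, shift):
    """One weight update of the form (mu * e) * x_i + w_i with mu = 2**-shift."""
    scaled_error = error >> shift   # mu * e realised by a shift, shared by all N taps
    return [w + scaled_error * x for w, x in zip(weights, delayed_samples)]
```

Here `scaled_error` is computed once and reused for every tap, mirroring the sharing of the scaled error across all multipliers described above.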

#### *3.7 Filter coefficient monitoring*

In the proposed adaptive reconfigurable FIR filter, the filter order is changed depending on the amplitude of the input signal as well as the filter coefficients. In a linear symmetric FIR filter, the middle filter coefficient has the largest value. The threshold for the filter coefficient and the input is fixed by taking the average of the first half of the filter coefficients. The filter coefficients vary depending on the characteristics of the FIR filter. The filter input and the filter coefficient are denoted as $X_n$ and $C_k$ respectively. The same threshold is used for the input and the filter coefficient. The threshold value is denoted as *Th*.

### *3.8 Decision block*

The amplitude of the input samples as well as the filter coefficients are monitored by using the Decision Block, also referred to as the Amplitude Detector (AD) as shown in Fig.11.



**Figure 11:** Amplitude Detector (AD)

The proposed Optimized Adaptive Reconfigurable FIR filter works in two modes. When the input signal $X$ and the filter coefficient $C_k$ are both smaller than the threshold (the average of the filter coefficients), the multiplierless implementation of the adaptive FIR filter is used. In the other case, if both $X$ and $C_k$ are greater than the threshold value (*Th*), the area-optimized implementation is used. The modes are switched dynamically at runtime. Speech signals with added noise are considered for analysis. The signal value (*x*) varies, and the modes are switched depending on this variation. The threshold value can be changed depending on the designer's considerations. The AD can be implemented using a comparator. The decision diagram of the proposed Optimized Adaptive Reconfigurable FIR filter is shown in Fig. 12.

**Figure 12:** Decision block diagram of Proposed Optimized Adaptive Reconfigurable (OAR) FIR Filter

The filter order can be changed depending on the amplitude of both filter coefficients and the inputs. The product of the data inputs of the filter, with the coefficients, has large variations in amplitude. When the data sample multiplied with the coefficient is small, the FIR filter with multiplierless implementation is used. When the samples as well as the coefficients are large, the proposed Latched Carry Select Adder (LCSA) based Optimized Adaptive FIR filter is used.

#### *3.9 Multiplier control signal decision(MCSD) window*

When the number of filter taps increases the switching problem occurs because the multipliers must be turned on/off. Multiplier Control Signal Decision (MCSD) window is used to solve the switching problem. The filter coefficients under this window alone are monitored and the decision is made.

The amplitude of the input samples as well as the filter coefficients are monitored by using the AD. The proposed reconfigurable FIR filter architecture has two modes of operation. When the input sample value $X_n$ and the filter coefficient $C_k$ are both less than *Th*, the multiplierless FIR filter is used. Otherwise, the area-optimized FIR filtering operation is performed.
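The two-mode decision can be summarised in a few lines. This is a minimal sketch, assuming that amplitudes are compared as absolute values and that the threshold is the mean absolute value of the first half of the coefficients; the function names and the mode labels are hypothetical, and the MCSD window is not modelled.

```python
def threshold_from_coefficients(coeffs):
    """Threshold Th taken as the average amplitude of the first half of the coefficients."""
    half = coeffs[: len(coeffs) // 2]
    return sum(abs(c) for c in half) / len(half)


def select_mode(x_n, c_k, th):
    """Amplitude detector: choose the filter mode for one sample/coefficient pair."""
    if abs(x_n) < th and abs(c_k) < th:
        return "multiplierless"
    return "area_optimized"
```

For a symmetric coefficient set such as `[1, 3, 5, 3, 1]` the sketch gives a threshold of 2.0, so a pair with both amplitudes below 2.0 selects the multiplierless mode and any other pair selects the area-optimized mode.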

# *4 Experimental evaluation*

#### *4.1 Performance validation metrics*

#### **Data arrival time (DAT) in ns**

Data arrival time is the time required for the data to propagate from a source sequential circuit, through combinational logic and routing and arrive at the destination sequential circuit (which must happen before the next clock edge occurs).

#### **Minimum sample period (MSP) in ns**

The minimum sample period is the shortest period at which a signal can be sampled without introducing errors; the corresponding sampling rate must be at least twice the highest frequency present in the signal.

#### **Area delay product (ADP) (sq.mm × ns)**

Area Delay Product is defined as the product of area occupied by the design and Minimum Sampling Period.

$$
ADP = Area \times MSP \tag{5}
$$

#### **Energy per sample (EPS) (nW× ns)**

Energy Per Sample is defined as the product of Total power consumed by the design and Minimum Sampling Period.

$$
EPS = Total power \times MSP \tag{6}
$$

#### **Maximum sampling frequency (MSF) (MHz)**

Maximum Sampling Frequency is defined as the reciprocal of the Minimum Sampling Period, multiplied by 100.

$$
MSF = \frac{1}{MSP} \times 100 \tag{7}
$$

#### **Maximum usable frequency (MUF) (MHz)**

Maximum usable frequency (MUF) is the highest frequency that can be used between two ends of an architecture.

For discussions, as a metric of power savings, Power Consumption Ratio,  $P_r$  is used.  $P_r$  is the ratio of the proposed reconfigurable filter power consumption to the existing filter power consumption.

$$
P_r = \left(\frac{P_{\text{Proposed}}}{P_{\text{existing}}}\right) \tag{8}
$$

The Power Saving Ratio (PSR) is defined as $(1 - P_r)$.

A significant factor that has a major effect on the proposed filter performance and power consumption is *Th*. Increasing the value of *Th* results in greater power savings. On the other hand, if *Th* is too small, power savings become trivial, but area optimization is carried out.

The Area Saving Ratio (ASR) is defined as $(1 - A_r)$, where $A_r$ is the area utilization ratio, defined as the ratio of the proposed reconfigurable filter area to the existing filter area.

$$
A_r = \left(\frac{A_{\text{Proposed}}}{A_{\text{existing}}}\right) \tag{9}
$$

The Mean Square Error (MSE) of the filter output, as given by Garcia [27], is used as a metric of filter performance degradation.

$$
M = \frac{1}{n} \sum_{i=1}^{n} (y - y_1)^2 \tag{10}
$$

Where *n* is the number of samples, *y* is the expected output and  $y_1$  is the proposed reconfigurable filter output.

The comparison is made between the existing adaptive filter output (the expected output $y$) and the proposed adaptive reconfigurable filter output ($y_1$).

The degradation in filter performance is analysed by Signal Power to MSE Ratio of the filter output (SMR). SMR realized by Proakis [28] is defined as the ratio of the desired signal to the distorted error signal power, measured in dB.
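The metrics defined in this section reduce to a few arithmetic expressions, sketched below. The function names are hypothetical, the inputs are assumed to be plain numbers or equal-length lists, and the SMR is computed as $10 \log_{10}$ of the mean signal power over the MSE, which is one common reading of the definition above rather than a formula stated in the text.

```python
import math


def area_delay_product(area, msp):
    """Eq. (5): ADP = Area x MSP."""
    return area * msp


def energy_per_sample(total_power, msp):
    """Eq. (6): EPS = Total power x MSP."""
    return total_power * msp


def power_saving_ratio(p_proposed, p_existing):
    """PSR = 1 - P_r, with P_r from Eq. (8)."""
    return 1 - p_proposed / p_existing


def area_saving_ratio(a_proposed, a_existing):
    """ASR = 1 - A_r, with A_r from Eq. (9)."""
    return 1 - a_proposed / a_existing


def mean_square_error(expected, actual):
    """Eq. (10): mean of the squared differences between the two outputs."""
    return sum((y - y1) ** 2 for y, y1 in zip(expected, actual)) / len(expected)


def signal_to_mse_ratio_db(expected, actual):
    """SMR in dB (assumed form); undefined when the two outputs are identical."""
    signal_power = sum(y * y for y in expected) / len(expected)
    return 10 * math.log10(signal_power / mean_square_error(expected, actual))
```

As a usage example, a proposed design drawing 30 units of power against an existing design drawing 40 units gives `power_saving_ratio(30, 40) == 0.25`, i.e. a 25 % power saving.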

### *4.2 Evaluation datasets description*

The following database signals are used for analysing the performance of the Proposed Adaptive Reconfigurable FIR filter.

**NOIZEUS -** A Noisy Speech Corpus: The noisy database contains 30 IEEE sentences, produced by three male and three female speakers corrupted by eight different real-world noises at different SNRs. The noise signals were added to the speech signals at SNRs of 0dB, 5dB, 10dB and 15dB. (Speech Processing Lab [29]).

**Speech Enhancement and Assessment Resource (SpEAR) database:** The SpEAR database contains carefully selected samples of noise-corrupted speech with clean speech references. It includes Lombard speech samples, in which a live speech signal is recorded in a noisy environment, and monaural recordings, in which two recorded speech signals are acoustically combined and re-recorded (SpEAR Noisy Speech Database Beta Release v1.0 [30]).

**TIMIT:** TIMIT is an acronym composed of TI (Texas Instruments) and MIT. TIMIT provides speech data useful for both acoustic and phonetic studies. The database contains 6300 utterances produced by 630 speakers, both male and female, of the main US regional varieties, in .wav format (Becchetti & Ricotta [31]).

**Table 1:** Comparison of various speech database in terms of MSE measure for Bohman filter characteristics.



**ITU test database:** International Telecommunication Union, ITU-T Recommendation P56. The dataset includes 16 recorded sentences in each of 20 languages and sentences recorded in the laboratories of some ITU members (International Telecommunication Union [32]).



**Figure 13:** MATLAB simulation results: (a) original noisy input signal from the SpEAR database (Scholars\_f16r1\_16.wav, noise addition from f16 flight), (b) DA based Adaptive Filter method (Park & Meher [8]), (c) Proposed OAR FIR Filter result.

### *5 Result analysis and discussion*

The performance of the Proposed Optimized Adaptive Reconfigurable (OAR) FIR filter has been analyzed and discussed in the following sections.

The MSE was analyzed for filters of 8, 16 and 32 taps with four filter types, namely equiripple, least square, Hamming window and Bohman window characteristics. The Bohman characteristics result in a smaller MSE than the other filter characteristics.
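To make the Bohman characteristics concrete, the following sketch generates a Bohman window and uses it in a windowed-sinc low-pass FIR design for 8, 16 and 32 taps. The window formula is the standard Bohman definition; the windowed-sinc design method and the cutoff value are assumptions for illustration, since the pass band and stop band specifications of the evaluated filters are not restated here.

```python
import math


def bohman_window(num_taps):
    """Bohman window: w(x) = (1 - |x|) cos(pi |x|) + sin(pi |x|) / pi, with x in [-1, 1]."""
    if num_taps < 1:
        raise ValueError("num_taps must be positive")
    if num_taps == 1:
        return [1.0]
    window = []
    for n in range(num_taps):
        x = abs(2.0 * n / (num_taps - 1) - 1.0)
        window.append((1.0 - x) * math.cos(math.pi * x) + math.sin(math.pi * x) / math.pi)
    return window


def lowpass_fir(num_taps, cutoff):
    """Windowed-sinc low-pass design; cutoff in cycles/sample (0 < cutoff < 0.5)."""
    window = bohman_window(num_taps)
    center = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - center
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        taps.append(ideal * window[n])
    gain = sum(taps)
    return [h / gain for h in taps]  # normalise to unit gain at DC


# Hypothetical cutoff of 0.125 cycles/sample
for num_taps in (8, 16, 32):
    print(num_taps, [round(h, 4) for h in lowpass_fir(num_taps, 0.125)])
```

Because the Bohman window tapers to zero at both ends, the first and last coefficients of each designed filter are practically zero.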

### *5.1 MSE analysis*

Table 1 compares the speech databases in terms of the MSE measure for the Bohman filter characteristics. Because the Bohman characteristics give the best performance, they are chosen for the comparison across speech databases. It is inferred from Table 1 that, among the four databases considered, the ITU database results in the smallest MSE value. The Mean Square Error (MSE) of the filter output is used as the metric of filter performance degradation. As the number of taps increases, the MSE value also increases.

Among the four speech databases considered, the NOIZEUS database contains noise at different SNRs. The performance of the architecture is therefore analysed and tabulated using the NOIZEUS database.

MSE1 is the MSE value of the DA based adaptive filter design, MSE2 that of the New DA formulation of Block LMS design, MSE3 that of the Area-Delay-Power efficient LMS Adaptive filter, and MSE4 that of the proposed OAR-FIR filter design. Table 2 shows the MSE performance measure for various filter taps with respect to different filter characteristics. The MSE values are tabulated for the NOIZEUS database input "sp12\_airport\_sn15.wav", using window monitoring with window size $m = 4$. It is inferred from the table that the Bohman window filter characteristics result in a smaller MSE than the other filter characteristics. As the number of taps increases, the MSE value also increases. Table 3 shows the Power Analysis of the 32-tap proposed OAR-FIR filter using the Bohman filter characteristics for various speech databases. Table 4 shows the Power Saving Ratio of the Optimized Adaptive Reconfigurable FIR filter design. Fig. 13 shows the MATLAB simulation of the existing and the proposed filter techniques.

#### *5.2 Power Analysis*

PSR<sub>OA1</sub>, PSR<sub>OA2</sub> and PSR<sub>OA3</sub> are the Power Saving Ratios of the Optimized Adaptive Reconfigurable FIR filter designs. Equations (11)-(16) give the expressions for the Power Saving Ratios and the Power consumption ratios.

$$
PSR_{OA1} = (1 - P_{rOA1}) \tag{11}
$$

**Table 2:** MSE measure considering the Window based Coefficient Monitoring of the Proposed Area Optimized Adaptive Reconfigurable FIR filter with NOIZEUS speech as input with window size m = 4 for various pass band, stop band and critical frequencies



\*EQ – equiripple characteristics, LS – least square characteristics, Ham – Hamming window characteristics, Boh – Bohman window characteristics



**Table 3:** Power Analysis of the 32 Taps Proposed OAR-FIR considering the Bohman filter characteristics for different speech database.

LP : Leakage Power in nano watts; DP : Dynamic Power in nano watts; TP : Total Power in nano watts

where $P_{rOA1}$ is the Power consumption ratio for the OAR design between the Proposed Optimized Adaptive Reconfigurable FIR filter and the DA based Adaptive filter design.

$$
P_{rOA1} = \frac{P_{\text{Proposed OAR-FIR}}}{P_{\text{DA based Adaptive Filter}}} \tag{12}
$$

$$
PSR_{OA2} = (1 - P_{rOA2}) \tag{13}
$$

where $P_{rOA2}$ is the Power consumption ratio for the OAR design between the Proposed Optimized Adaptive Reconfigurable FIR filter and the DA based formulation of Block LMS.

$$
P_{rOA2} = \frac{P_{\text{Proposed OAR-FIR}}}{P_{\text{DA based formulation of Block LMS}}} \tag{14}
$$

$$
PSR_{OA3} = (1 - P_{rOA3}) \tag{15}
$$

where $P_{rOA3}$ is the Power consumption ratio for the OAR design between the Proposed Optimized Adaptive Reconfigurable FIR filter and the Optimal pipelined DA based Adaptive filter.

$$
P_{rOA3} = \frac{P_{\text{Proposed OAR-FIR}}}{P_{\text{Optimal pipelined DA based Adaptive Filter}}} \tag{16}
$$

When comparing the various filter designs, the Power Saving Ratio is highest for the 32-tap filter with the NOIZEUS speech signal as input (which implies that the multiplierless implementation is used while switching the modes). $PSR_{OA1}$ is equal to 19.3941%, as shown in Table 3.
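The saving ratios above all share the form 1 minus the ratio of the proposed design's figure to the reference design's figure. A minimal sketch of that arithmetic follows; the function name and the power values are hypothetical placeholders, not measurements from the tables.

```python
def saving_ratio(proposed, reference):
    """Saving ratio = 1 - proposed / reference, as in PSR = 1 - P_r."""
    if reference <= 0:
        raise ValueError("reference value must be positive")
    return 1.0 - proposed / reference


# Hypothetical total power figures in nW (placeholders, not from Table 3)
p_proposed = 80.0
p_reference = 100.0
print(f"PSR = {100.0 * saving_ratio(p_proposed, p_reference):.4f} %")
```

A positive result means the proposed design consumes less than the reference design; a negative one would mean it consumes more.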

#### *5.3 Area analysis*

The Area Saving Ratios of the Optimized Adaptive Reconfigurable FIR Filter are ASR<sub>OA1</sub>, ASR<sub>OA2</sub>, and ASR<sub>OA3</sub>. They are calculated using Equations (17)-(22).

$$
ASR_{OA1} = (1 - A_{rOA1}) \tag{17}
$$

where $A_{rOA1}$ is the Area utilization ratio for the OAR design between the Proposed Optimized Adaptive Reconfigurable FIR filter and the DA based Adaptive FIR filter.

$$
A_{rOA1} = \frac{A_{\text{Proposed OAR-FIR}}}{A_{\text{DA based Adaptive FIR Filter}}} \tag{18}
$$

$$
ASR_{OA2} = (1 - A_{rOA2}) \tag{19}
$$

where $A_{rOA2}$ is the Area utilization ratio for the OAR design between the Proposed Optimized Adaptive Reconfigurable FIR filter and the DA based formulation of Block LMS Adaptive FIR filter.

$$
A_{rOA2} = \frac{A_{\text{Proposed OAR-FIR}}}{A_{\text{DA based formulation of Block LMS Adaptive FIR Filter}}} \tag{20}
$$

$$
ASR_{OA3} = (1 - A_{rOA3}) \tag{21}
$$

where $A_{rOA3}$ is the Area utilization ratio for the OAR design between the Proposed Optimized Adaptive Reconfigurable FIR filter and the Optimal DA Based Adaptive FIR filter.



**Table 4:** Power Saving Ratio (PSR) of the Optimized Adaptive Reconfigurable (OAR) FIR Filter

**Table 5:** Comparison of area for OAR-FIR filter design



**C1** – DA based Adaptive FIR filter method; **C2** – DA based formulation of Block LMS Adaptive FIR filter; **C3** – Optimal DA based Adaptive FIR filter; **P1** – Proposed OAR-FIR

**Table 6:** SMR analysis for OAR designs



$$
A_{rOA3} = \frac{A_{\text{Proposed OAR-FIR}}}{A_{\text{Optimal DA based Adaptive FIR Filter}}} \tag{22}
$$

Table 5 shows the area comparison of the OAR-FIR filter design.

### *5.4 Signal power to mean square error ratio analysis*

Table 6 presents the SMR results of the proposed OAR-FIR designs. The SMR is analyzed for 8-tap, 16-tap and 32-tap FIR filters. In most cases the SMR value is larger than 35 dB, meaning that the MSE is almost negligible. For speech applications, the degradation can be tolerated if the SMR is comparable to or larger than the signal to quantization error power ratio or the Signal to Noise Ratio (SNR) of the given system, which is usually less than 30 dB (Proakis [28]).

**Table 7:** Performance comparison of adaptive filter characteristics based on synthesis using TSMC 180-nm library

DAT: Data Arrival Time in ns; MSP: Minimum Sample Period in ns; ADP: Area Delay Product expressed in (sq. mm × ns); EPS: Energy Per Sample expressed in (nW × ns); MSF: Maximum Sampling Frequency expressed in (MHz) and MUF: Maximum Usable Frequency expressed in (MHz)

From Table 7 it is inferred that the Proposed OAR-FIR design has the lowest ADP and EPS, about 0.14582 sq. mm $\times$ ns and 2287.4614 nW $\times$ ns respectively, when compared with the other proposed designs and the existing methods. The analysis shows that the proposed adaptive reconfigurable FIR filter is more efficient in terms of power and area than the conventional reconfigurable designs.
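The figures of merit in Table 7 can be related to one another as sketched below. The sketch assumes that ADP is the product of area and minimum sample period, that EPS is the product of total power and minimum sample period (consistent with the units in the table legend), and that the maximum sampling frequency is the reciprocal of the minimum sample period. The numeric inputs are hypothetical and are not taken from Table 7.

```python
def area_delay_product(area_mm2, min_sample_period_ns):
    """ADP in sq. mm x ns (assumed definition: area times MSP)."""
    return area_mm2 * min_sample_period_ns


def energy_per_sample(power_nw, min_sample_period_ns):
    """EPS in nW x ns (assumed definition: total power times MSP)."""
    return power_nw * min_sample_period_ns


def max_sampling_frequency_mhz(min_sample_period_ns):
    """MSF in MHz as the reciprocal of MSP (1 / ns = 1000 MHz)."""
    return 1000.0 / min_sample_period_ns


# Hypothetical synthesis results (placeholders)
area, power, msp = 0.05, 500.0, 4.0
print(area_delay_product(area, msp), energy_per_sample(power, msp),
      max_sampling_frequency_mhz(msp))
```

Under these definitions, a design can lower both ADP and EPS either by shrinking area and power or by shortening the sample period.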

## *6 Conclusion*

An OAR-FIR filter design has been proposed. The proposed architecture allows an efficient trade-off between filter performance and computation energy. In the proposed reconfigurable filter, the input data and the filter coefficients are monitored. A maximum Power Saving Ratio of about 19.3941% and a maximum Area Saving Ratio of about 41.9413% are achieved for the OAR-FIR filter with 32 taps (with NOIZEUS input and ITU input respectively). Because the MCSD window size is fixed (here $m = 4$) and only the filter coefficients lying within the window are monitored, multipliers are turned off less often, which leads to smaller power and area saving ratios. The proposed architecture gives better performance than the existing designs.

In the future, the proposed approach can be applied to other areas of signal processing, where a trade-off between power savings and filter performance degradation is considered.

## *7 References*

- 1. Proakis, J.G., Manolakis, D.K.: Digital Signal Processing: Principles, Algorithms, and Applications. Macmillan Publishing Company (2007).
- 2. Ranran Liu, Hong xiang Xu, Enxing Zheng, Yi feng Jiang: Adaptive filtering for intelligent sensing speech based on multi-rate LMS algorithm, Cluster Computing, Springer, pp. 1-11 (2017). https://doi.org/10.1007/s10586-017-0871-y
- 3. Ting, L.K., Woods, R., Cowan, C.F.N.: Virtex FPGA implementation of a pipelined adaptive LMS predictor for electronic support measures receivers, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 13, no. 1, pp. 86-99 (2005). https://doi.org/10.1109/TVLSI.2004.840403
- 4. Van, L.D., Feng, W.S.: An efficient systolic architecture for the DLMS adaptive filter and its applications, IEEE Transactions on Circuits and Systems II, Analog and Digital Signal Processing, vol. 48, no. 4, pp. 359-366 (2001). https://doi.org/10.1109/82.933794
- 5. Meher, P.K., Maheshwari, M.: A high-speed FIR adaptive filter architecture using a modified delayed LMS algorithm, in Proceedings of IEEE International Symposium on Circuits and Systems, pp. 121-124 (2011). https://doi.org/10.1109/ISCAS.2011.5937516
- 6. Park, S.Y., Meher, P.K.: Low-Power, High-Throughput, and Low-Area Adaptive FIR Filter Based on Distributed Arithmetic, IEEE Transactions on Circuits and Systems-II: Express Briefs, vol. 60, no. 6, pp. 346-350 (2013). https://doi.org/10.1109/TCSII.2013.2251968
- 7. Meher, P.K., Park, S.Y.: Low adaptation-delay LMS adaptive filter part-II: An optimized architecture, in Proceedings of 54th IEEE International Midwest Symposium Circuits and Systems, pp. 1-4 (2011). https://doi.org/10.1109/MWSCAS.2011.6026643
- 8. Meher, P.K., Park, S.Y.: Critical-Path Analysis and Low-Complexity Implementation of the LMS Adaptive Algorithm, IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 63, no. 3, pp. 778-788 (2013). https://doi.org/10.1109/TCSI.2013.2284173
- 9. Meher, P.K., Park, S.Y.: Area-Delay-Power Efficient Fixed-Point LMS Adaptive Filter With Low Adaptation-Delay, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 2, pp. 362-371 (2014). https://doi.org/10.1109/TVLSI.2013.2239321

- 10. Clark, G.A., Mitra, S.K., Parker, S.: Block implementation of adaptive digital filters, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 29, no. 3, pp. 744-752 (1981). https://doi.org/10.1109/TASSP.1981.1163603
- 11. Baghel, S., Shaik, R.: FPGA implementation of Fast Block LMS adaptive filter using Distributed Arithmetic for high-throughput, in Proceedings of International Conference on Communications and Signal Processing, pp. 443-447 (2011). https://doi.org/10.1109/iccsp.2011.5739356
- 12. Baghel, S., Shaik, R.: Low power and less complex implementation of fast block LMS adaptive filter using distributed arithmetic, in Proceedings of IEEE Students Technology Symposium, pp. 214-219 (2011). https://doi.org/10.1109/TECHSYM.2011.5783848
- 13. Jayashri, R., Chitra, H., Kusuma, S., Pavitra, A.V., Chandrakanth, V.: Memory based architecture to implement simplified block LMS algorithm on FPGA, in Proceedings of International Conference on Communications and Signal Processing, pp. 179-183 (2011). https://doi.org/10.1109/ICCSP.2011.5739296
- 14. Mohanty, B.K., Meher, P.K.: A High-Performance Energy-Efficient Architecture for FIR Adaptive Filter Based on New Distributed Arithmetic Formulation of Block LMS Algorithm, IEEE Transactions on Signal Processing, vol. 61, no. 4, pp. 921-932 (2013). https://doi.org/10.1109/TSP.2012.2226453
- 15. Mohanty, B.K., Meher, P.K.: A High-Performance FIR Filter Architecture for Fixed and Reconfigurable Applications, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 2, pp. 444-452 (2015). https://doi.org/10.1109/TVLSI.2015.2412556
- 16. Mohanty, B.K., Meher, P.K., Patel, S.K.: LUT Optimization for Distributed Arithmetic-Based Block Least Mean Square Adaptive Filter, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 5, pp. 1926-1935 (2016). https://doi.org/10.1109/TVLSI.2015.2472964
- 17. Khan, M.T., Shaik, R.A.: Optimal Complexity Architectures for Pipelined Distributed Arithmetic-Based LMS Adaptive Filter, IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 2, pp. 630-642 (2019). https://doi.org/10.1109/TCSI.2018.2867291
- 18. Widrow, B., Stearns, S.D.: Adaptive Signal Processing. Englewood Cliffs, NJ, USA: Prentice-Hall (1985).
- 19. Haykin, S., Widrow, B.: Least-Mean-Square Adaptive Filters, Hoboken, NJ: Wiley-Interscience (2003).
- 20. Meyer, M.D., Agrawal, D.P.: A modular pipelined implementation of a delayed LMS transversal adaptive filter, in Proc. IEEE International Symposium on Circuits and Systems, vol. 3, pp. 1943-1946 (1990).
- 21. Long, G.H., Ling, F., Proakis, J.G.: The LMS algorithm with delayed coefficient adaptation, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, no. 9, pp. 1397-1405 (1989). https://doi.org/10.1109/29.31293
- 22. Ramanathan, S., Visvanathan, V.: A systolic architecture for LMS adaptive filtering with minimal adaptation delay, in Proc. International Conference on Very Large Scale Integration (VLSI) Design, pp. 286-289 (1996). https://doi.org/10.1109/ICVD.1996.489612
- 23. Yi, Y., Woods, R., Ting, L.K., Cowan, C.F.N.: High Speed FPGA-Based Implementations of Delayed-LMS Filters, Journal of Very Large Scale Integration (VLSI) Signal Processing, Springer, vol. 39, no. 1, pp. 113-131 (2005). https://doi.org/10.1023/B:VLSI.0000047275.54691.be
- 24. Cohen, R.H., Herzberg, H., Beery, Y.: Delayed adaptive LMS filtering: current results, in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, Albuquerque, NM, vol. 3, pp. 1273-1276 (1990). https://doi.org/10.1109/ICASSP.1990.115604
- 25. Parhi, K.K.: VLSI Digital Signal Processing Systems – Design and Implementation, Wiley (1999).
- 26. Ram Kumar, Kittur, H.M.: Low-Power and Area-Efficient Carry Select Adder, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, no. 2, pp. 371-375 (2011). https://doi.org/10.1109/TVLSI.2010.2101621
- 27. Garcia, A.L.: Probability, Statistics and Random Processes for Electrical Engineering, Upper Saddle River, NJ: Pearson Education (2009).
- 28. Proakis, J.G.: Digital Communications, 3rd ed. New York: McGraw-Hill (1995).
- 29. Speech Processing Lab, UT Dallas. Available from: <http://ecs.utdallas.edu/loizou/speech/noizeus/>. (2008).
- 30. SpEAR Noisy Speech Database Beta Release v1.0. CSLU, Oregon Graduate Institute of Science and Technology. Available from: <http://www.cslu.ogi.edu/nsel/data/SpEAR\_database.html> (2009).
- 31. Becchetti, C., Ricotta, L.P.: Speech Recognition Theory and C++ Implementation, John Wiley & Sons (1999).
- 32. International Telecommunication Union. Available from: <http://www.itu.int/net/itu-t/sigdb/menu.aspx>. (2015).


Copyright © 2019 by the Authors. This is an open access article distributed under the Creative Commons Attribution (CC BY) License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Arrived: 21. 08. 2019 Accepted: 13. 01. 2020 https://doi.org/10.33180/InfMIDEM2019.407



Journal of Microelectronics, Electronic Components and Materials Vol. 49, No. 4(2019), 255 – 260

# *Blood Sugar Level Monitoring*

*Rok Ražman, Aleksander Sešek, Jurij Tasič, Janez Trontelj*

*University of Ljubljana, Faculty of Electrical Engineering, Ljubljana, Slovenia*

**Abstract:** Blood glycemic level, also known as blood sugar level or blood glucose level, is strongly linked to the development of type 2 diabetes, especially when it reaches high values (hyperglycaemia) that persist over time, and consequently to serious medical conditions such as neuropathy, cardiovascular diseases, sensitivity to infections etc. Nowadays the only effective and reliable way of monitoring blood sugar level is to directly analyze the blood (capillary or venous), interstitial or other body fluids. The former method is the most widely used. Its main disadvantage is the puncturing of the patient's skin (finger pricking, for example), which frequently causes pain and carries the risk of viruses and bacteria entering the body. The development of an effective and accurate noninvasive method for blood glucose monitoring has been recognized as a crucial goal for future studies of blood sugar and for the implementation of such methods in wearable devices. In this paper, we propose monitoring of blood glucose level by means of skin impedance measurement. A measurement system featuring an Application-Specific Integrated Circuit (ASIC) is presented and analyzed. The ASIC, fabricated in 350 nm CMOS technology with dimensions 1223 µm × 1388 µm, typically consumes 450 µA at a 3.3 V supply voltage and operates in the frequency range from 5 kHz to 16 MHz. The system exhibits a good linear response for loads up to a few kΩ, making it suitable for skin impedance measurements.

**Keywords:** blood sugar; bioimpedance; noninvasive method; ASIC

# *Merjenje ravni krvnega sladkorja*

**Izvleček:** Dlje časa trajajoča povišana raven krvnega sladkorja oziroma raven glukoze v krvi je tesno povezana s pojavom diabetesa ali sladkorne bolezni tipa 2 in posledično resnih zdravstvenih težav kot so nevropatija, srčno-žilne bolezni, nagnjenost k okužbam itd. Dandanes edini učinkovit in zanesljiv način merjenja krvnega sladkorja je neposredna analiza krvi (kapilarne ali venozne), intersticijske in drugih telesnih tekočin. Najpogostejša je prva omenjena metoda. Njena glavna slabost je prebadanje pacientove kože (npr. blazinic na prstih), ki pogosto povzroča bolečino in predstavlja tveganje, saj lahko virusi in bakterije prodrejo v telo. Razvoj učinkovite in natančne neinvazivne metode za merjenje ravni krvnega sladkorja je bil prepoznan kot ključni korak za bodoče študije krvnega sladkorja in implementacije takih metod v prenosne naprave. V tem prispevku je predstavljeno merjenje ravni krvnega sladkorja na osnovi merjenja impedance kože. Merilni sistem z integriram vezjem (ASIC) je predstavljen in analiziran. Izdelano integrirano vezje v 350 nm CMOS tehnologiji z dimenzijami 1223 µm x 1388 µm tipično troši 450 µA pri napajalni napetosti 3,3 V in deluje v frekvenčnem razponu 5 kHz do 16 MHz. Sistem izkazuje dober linearni odziv za bremena do nekaj kΩ, kar je primerno za meritve impedance kože.

**Ključne besede:** krvni sladkor; bioimpedance; neinvazivna metoda; ASIC

*\* Corresponding Author's e-mail: rok.razman@fe.uni-lj.si*

# *1 Introduction*

Insight into the human body, its internal structures, processes and functions is a relatively young invention. It could be, without exaggeration, labeled a revolution in terms of medicine. During the whole history of humankind, the actual processes in the human body were a mystery and often explained by supernatural forces and other theories (for example the four humors: blood, yellow bile, black bile and phlegm).

The only way to study the internal composition of the human body was dissection of cadavers (which for a long time was banned due to religious conventions). The above-mentioned method was not applicable to live patients and therefore their diagnosis was often not possible.

The development of science, engineering, physics, medicine etc. allowed the creation of so-called noninvasive methods for the study of the human body. One of those revolutionary methods was X-ray imaging, discovered by Roentgen in 1895. In the 20th century new inventions followed, such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). These methods focus on the internal structure of the body (bones, organs etc.).

The same principle was also applied to body functions and mechanisms. Some noteworthy methods are pulse oximetry [1], electroencephalography (EEG) [2], electrocardiography (ECG) [3], pulse transit time [4] and others.

Bioimpedance analysis of body tissues, especially skin, has been proposed as a noninvasive tool for healing process monitoring in tissue transplants [5], blood glucose monitoring [7], [8], skin cancer identification [9], body composition determination [10] etc.

Diabetes mellitus is a chronic metabolic disease characterized by raised blood sugar or glycaemia. Undiagnosed and untreated diabetes has severe consequences for health. It causes various complications, for example vision loss (retinopathy), nerve damage, increased risk of cardiovascular, peripheral vascular and cerebrovascular disease, foot ulcers, limb amputation, kidney disease (nephropathy) [11] etc.

The estimated number of people affected by diabetes rose from 108 million in 1980 to 422 million in 2014 [12]. One in two adults with diabetes is undiagnosed [13].

Diabetes and the conditions directly caused by it are a burden to the people affected, their families, national health systems and national economies [12]. The estimated cost of diabetes treatment in the United States of America in 2017 was \$327 billion, of which \$237 billion was direct cost and \$90 billion was reduced productivity. One in every four health care dollars in the USA is spent on diabetes [14].

Early diagnosis or detection of diabetes and prompt treatment of the disease are, besides diabetes prevention, two crucial ways to improve the health and assure the longevity of patients [12].

Continuous blood glucose monitoring is necessary for better management of the disease. Nowadays patients use portable glucose meters to monitor their blood glucose level at home. The skin must be punctured with a lancet to obtain a droplet of capillary blood, which is applied to a disposable strip inserted into the glucose meter (Figure 1). Although this method is precise, it is invasive; consequently it causes pain and discomfort and poses a risk of contamination with blood-borne pathogens (viruses, bacteria). Many patients tend to skip individual blood glucose measurements or refrain from SMBG (Self Measurement of Blood Glucose) altogether [15], rendering the treatment of diabetes less effective.



**Figure 1:** a) skin puncturing/pricking with a lancet, b) obtaining a droplet of capillary blood and c) blood glucose determination with a portable glucose meter.

This paper proposes a noninvasive blood glucose level monitoring (measurement) system based on an ASIC. Chapter 2 presents the architecture of the proposed system (besides the ASIC, it also features a custom PCB, a microcontroller and a Graphical User Interface (GUI)). Chapter 3 presents some information about the system response, and Chapter 4 provides the main conclusions and results of the presented work.

# *2 System design*

The proposed noninvasive system is centered around a fully integrated analog front-end (AFE) based on synchronous bioimpedance measurement. The circuitry injects an AC current into the subject under test (SUT), in this case human skin. The voltage across the SUT is measured, pre-amplified, multiplied by an in-phase and a quadrature signal, and finally filtered to obtain the real and imaginary parts of the impedance.
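The measurement principle described above can be illustrated numerically. The sketch below is a simplified software model, not the ASIC implementation: it assumes a sinusoidal excitation current and sinusoidal in-phase and quadrature references (the chip itself is clocked by square-wave signals), and it replaces the low-pass filters by averaging over a whole number of periods. All names and values are hypothetical.

```python
import math


def simulate_voltage(resistance, reactance, freq_hz, sample_rate_hz,
                     current_amplitude_a, num_samples):
    """Voltage across Z = R + jX for a current I0*cos(wt): v = I0*(R*cos(wt) - X*sin(wt))."""
    samples = []
    for k in range(num_samples):
        phase = 2.0 * math.pi * freq_hz * k / sample_rate_hz
        samples.append(current_amplitude_a *
                       (resistance * math.cos(phase) - reactance * math.sin(phase)))
    return samples


def iq_demodulate(samples, freq_hz, sample_rate_hz, current_amplitude_a):
    """Multiply by in-phase and quadrature references, then average (low-pass)."""
    in_phase = 0.0
    quadrature = 0.0
    for k, v in enumerate(samples):
        phase = 2.0 * math.pi * freq_hz * k / sample_rate_hz
        in_phase += v * math.cos(phase)
        quadrature += v * -math.sin(phase)
    scale = 2.0 / (len(samples) * current_amplitude_a)
    return in_phase * scale, quadrature * scale  # (real part, imaginary part) in ohms


# Hypothetical load: 1 kOhm in series with a capacitive reactance of -200 Ohm
v = simulate_voltage(1000.0, -200.0, 10e3, 1e6, 1e-4, 1000)  # 10 full periods
print(iq_demodulate(v, 10e3, 1e6, 1e-4))
```

For a purely resistive load the quadrature output vanishes and the in-phase output is proportional to the resistance.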

### *2.1 ASIC*

The AFE was implemented in an ASIC. Figure 2 presents a simplified schematic of the ASIC. The voltage excitation, applied through the buffers buf and the external resistors Rexc, forms a current source (pads EXCP and EXCN). The AC current is injected into the skin through a pair of electrodes and the resulting voltage across the electrodes is sampled (IM\_P and IM\_N). DIA\_BP is a differential band-pass instrumentation amplifier, which amplifies the voltage drop across the SUT, in this case skin.

The excitation voltage is either generated internally by a digital ring oscillator (DRO) or supplied by an external signal generator. The core of the DRO (Figure 4) is a ring of inverters and capacitors. A reference current is generated by Iref and mirrored by Imir to the inverters. Imir is composed of 4 current mirrors with binary-weighted dimensions. The inverter ring output signal osc\_out is fed to a series of 7 dividers. Each divider reduces the osc\_out frequency ($f_{osc\_out}$) by a factor of two. Thus, the last divider yields a frequency equal to $f_{osc\_out}/2^7$. The output frequency osc\_clk is selected with an 8:1 multiplexer (MUX 8:1).

The oscillator generates square wave pulses with frequencies in the range from $f_{min}$ = 20 kHz to $f_{max}$ = 14.24 MHz. It is controlled by an 8-bit register (4 bits for Imir, 3 bits for MUX 8:1, and 1 free bit), which in turn is controlled externally via SPI (marked as SPI\_cfg in the schematic). The clk\_sel block is used to switch between the clock sources.

Figure 3a) shows the ASIC layout, designed in Cadence Virtuoso in 350 nm CMOS technology. The chip size is 1223 µm × 1388 µm. The core of the chip is the AFE, which contains a Serial Peripheral Interface (SPI) block, a differential instrumentation amplifier, an I/Q block (in-phase/quadrature clock generator), two multipliers, two low-pass filters, two unity gain amplifiers, a bias and references block, a multi-frequency generator block etc. The core is surrounded by the bonding pads (from the top left corner clockwise: vss, EN\_TIM, vdda, CLK\_OUT, EXCN, EXCP, OUQ, SSn, SCLK, VBAL, EXT\_SEL, MOSI, MISO, OUI, SIGN, SIGP, EXT\_CLK).
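The divider chain and multiplexer described above can be modelled in a few lines. In the sketch below, the bit positions inside the 8-bit register are an assumption made purely for illustration (the text states only that 4 bits control Imir, 3 bits control the multiplexer and 1 bit is free), and the ring frequency is passed in as a parameter because its dependence on the Imir code is not specified here.

```python
def decode_dro_register(value):
    """Split the 8-bit control word into (imir_code, mux_select).

    Hypothetical layout: bits 0-3 = Imir code, bits 4-6 = MUX select, bit 7 unused.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("register value must fit in 8 bits")
    imir_code = value & 0x0F
    mux_select = (value >> 4) & 0x07
    return imir_code, mux_select


def dro_output_frequency(f_osc_out_hz, mux_select):
    """osc_clk frequency: the ring output after mux_select divide-by-two stages."""
    if not 0 <= mux_select <= 7:
        raise ValueError("mux_select must be between 0 and 7")
    return f_osc_out_hz / (2 ** mux_select)


# With the ring running at 14.24 MHz, list the eight selectable frequencies
for select in range(8):
    print(select, dro_output_frequency(14.24e6, select))
```

In this model, select value 0 passes the undivided ring output and select value 7 picks the output of the last divider.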



**Figure 2:** ASIC schematic.



**Figure 3:** a) ASIC layout and b) fabricated ASIC.

Figure 3b) shows the fabricated ASIC placed in a 20-pin ceramic dual in-line package (DIP) and bonded with gold wires. The chip could also be placed in a smaller surface mount package, for example SOIC-18, to significantly reduce the ASIC's footprint on a Printed Circuit Board (PCB).

### *2.2 PCB*

A PCB has been designed in Altium Designer and fabricated with a milling machine (Figure 5a). The PCB carries all the connectors, decoupling capacitors, external resistors, etc. required by the ASIC.

### *2.3 Microcontroller*

The SAMD21 Xplained Pro Evaluation Kit, containing the SAMD21J118A microcontroller by Atmel (now Microchip), serves as a bridge between the ASIC's SPI and a personal computer (Figure 5b). It also samples the OUI and OUQ signals and provides the supply voltage (3.3 V) to the PCB and the ASIC.



**Figure 4:** DRO schematic.

#### *2.4 GUI*

The system includes the ASIC, the PCB and the microcontroller, which was initially controlled by a Matlab script. Subsequently, to improve the user experience, a GUI was developed in C# using Visual Studio 2015 (Figure 6). The total size of the setup file setup.exe is 774 KB.



**Figure 5:** a) PCB and b) microcontroller.

### *2.5 Skin interface*

The employed electrodes are Skintact T-60 by Leonhard Lang GmbH, which are Ag/AgCl wet electrodes. Figure 7a) shows the placement of the electrodes (the distance between the centers of the electrodes is 60 mm), and Figure 7b) shows an in-vivo placement of the electrodes on the left arm of a subject.



**Figure 6:** Graphical User Interface.



**Figure 7:** Electrodes placement.

### *2.6 System linearity*

The system exhibits good linearity for small loads (Figure 8), while for large loads it shows significant nonlinearity (Figure 9). Because of this error, the system must be calibrated prior to measurements (Figure 10). The calibration is achieved simply by substituting the target SUT (human skin) with a series of low-tolerance resistors and measuring the system response signals OUI and OUQ at the desired excitation frequencies. Theoretically, OUQ should be equal to zero, while OUI should be linearly proportional to the resistance value.
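One simple way to use such calibration data, within the range where the response is linear, is to fit a straight line to the OUI readings obtained with the known resistors and then invert it for later measurements. The sketch below shows this with an ordinary least-squares fit; the fitting approach, the function names and the readings are assumptions for illustration and do not reproduce the procedure or the data behind Figure 10.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibration resistances must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def oui_to_resistance(oui, slope, intercept):
    """Invert the calibration line to estimate the load resistance from OUI."""
    return (oui - intercept) / slope


# Hypothetical calibration points: resistances in ohms, OUI readings in volts
resistors = [100.0, 1000.0, 2200.0, 3300.0]
oui_readings = [0.15, 0.60, 1.20, 1.75]
slope, intercept = fit_line(resistors, oui_readings)
print(oui_to_resistance(0.85, slope, intercept))
```

A separate line would be fitted for each excitation frequency, since the system response is measured at each desired frequency.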



**Figure 8:** Linearity for loads 0 - 3.3 kΩ.



**Figure 9:** Linearity for loads 0 - 22kΩ.



**Figure 10:** System calibration with resistors 0 - 10 kΩ.

# *3 Blood glucose vs impedance*

A series of measurements has been performed to quantify the effect of increased blood glucose level on skin impedance. Measurements were taken simultaneously with a glucose meter and with the proposed noninvasive system, both before ingestion of a 75 g dose of glucose (corresponding to fasting blood glucose) and after it; this procedure is known as the Oral Glucose Tolerance Test (OGT). The measurement times were -30 minutes (when the electrodes were placed), 0 minutes (when the glucose was ingested orally), and 15, 30, 60, 90 and 120 minutes after glucose ingestion.

Figure 11 shows the OGT of two subjects measured with our system, with the corresponding blood glucose levels (BGL) given in the legend in the upper right corner. Figures 11.a) and 11.b) were recorded for a female subject (in-phase and quadrature components respectively). Figures 11.c) and 11.d) correspond to a male subject. The measurements that belong to the highest BGL are marked with five-pointed stars.

The female subject exhibits an increased OUI around 100 kHz, which can be correlated with high BGL. The male subject does not exhibit any peaks, but shows an increased OUI for high BGL (except at 30 min). The OUQ in both cases decreases steadily with time and is not influenced by BGL.

## *4 Conclusions*

In this paper, a noninvasive blood glucose measurement system has been presented. Preliminary analysis of the results (Figure 11) shows that variations of skin impedance are observable before and after glucose ingestion. The interindividual differences in skin impedance during the glucose tests are large, and a simple equation or algorithm for blood glucose estimation cannot be formulated without taking into account a large number of factors that affect skin impedance, including electrode-skin impedance, electrode type, stratum corneum thickness, sweat gland density, hair follicles, variations of impedance, water content of skin [16] etc.

Resorting to a multisensory system would significantly benefit the accuracy and selectivity of the proposed system. In addition, a larger group of subjects must be included in future studies and a multivariate analysis must be performed to obtain higher accuracy.

# *5 Acknowledgments*

The authors would like to express their gratitude to the staff of the Laboratory of Microelectronics FE, University of Ljubljana for their help.

# *6 Conflict of interest*

The authors declare no conflict of interest.

### *7 References*

- 1. H. Lee, H. Ko, and J. Lee, 'Reflectance pulse oximetry: Practical issues and limitations', ICT Express, vol. 2, no. 4, pp. 195–198, Dec. 2016. https://doi.org/10.1016/j.icte.2016.10.004
- 2. M. Sawan et al., 'Wireless Recording Systems: From Noninvasive EEG-NIRS to Invasive EEG Devices', IEEE Trans. Biomed. Circuits Syst., vol. 7, no. 2, pp. 186–195, Apr. 2013. https://doi.org/10.1109/TBCAS.2013.2255595



**Figure 11:** Skin impedance (I and Q component respectively) of the a-b) female subject, c-d) male subject.

- 3. J. Liu and Y. Zhou, 'Design of a Novel Portable ECG Monitor for Heart Health', in 2013 Sixth International Symposium on Computational Intelligence and Design, 2013, vol. 2, pp. 257–260. https://doi.org/10.1109/ISCID.2013.178
- 4. B. Ibrahim, A. Akbari, and R. Jafari, 'A novel method for pulse transit time estimation using wrist bio-impedance sensing based on a regression model', in 2017 IEEE Biomedical Circuits and Systems Conference (BioCAS), 2017, pp. 1–4. https://doi.org/10.1109/BIOCAS.2017.8325054
- 5. M. Min, P. Annus, R. Land, T. Paavle, E. Haldre, and R. Ruus, 'Bioimpedance Monitoring of Tissue Transplants', in 2007 IEEE Instrumentation Measurement Technology Conference IMTC 2007, 2007, pp. 1–4. https://doi.org/10.1109/IMTC.2007.379341

- 6. I. Bodén, D. Nilsson, P. Naredi, and B. Lindholm-Sethson, 'Characterization of healthy skin using near infrared spectroscopy and skin impedance', Med. Biol. Eng. Comput., vol. 46, no. 10, p. 985, Oct. 2008. https://doi.org/10.1007/s11517-008-0343-x

- 7. C. J. Cordero, L. C. L. Landicho, J. C. D. Cruz, and R. G. Garcia, 'Quantifying blood glucose level using S11 parameters', in TENCON 2017 - 2017 IEEE Region 10 Conference, 2017, pp. 1481–1486. https://doi.org/10.1109/TENCON.2017.8228091
- 8. C. E. Ferrante do Amaral and B. Wolf, 'Current development in non-invasive glucose monitoring', Med. Eng. Phys., vol. 30, no. 5, pp. 541–549, Jun. 2008. https://doi.org/10.1016/j.medengphy.2007.06.003

- 9. P. Aberg, I. Nicander, J. Hansson, P. Geladi, U. Holmgren, and S. Ollmar, 'Skin cancer identification using multifrequency electrical impedance-a potential screening tool', IEEE Trans. Biomed. Eng., vol. 51, no. 12, pp. 2097–2102, Dec. 2004. https://doi.org/10.1109/TBME.2004.836523
- 10. M. A. Riyadi, A. N. Muthouwali, and T. Prakoso, 'Design of automatic switching bio-impedance analysis (BIA) for body fat measurement', in 2017 4th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI), 2017, pp. 1–5. https://doi.org/10.1109/EECSI.2017.8239105

- 11. K. G. Alberti and P. Z. Zimmet, 'Definition, diagnosis and classification of diabetes mellitus and its complications. Part 1: diagnosis and classification of diabetes mellitus provisional report of a WHO consultation', Diabet. Med. J. Br. Diabet. Assoc., vol. 15, no. 7, pp. 539–553, Jul. 1998. https://doi.org/10.1002/(SICI)1096-9136(199807)15:7<539::AID-DIA668>3.0.CO;2-S
- 12. G. Roglic and World Health Organization, Eds., Global report on diabetes. Geneva, Switzerland: World Health Organization, 2016.
- 13. 'IDF Diabetes Atlas'. [Online]. Available: https://www.idf.org/e-library/epidemiology-research/diabetes-atlas/134-idf-diabetes-atlas-8th-edition.html. [Accessed: 16-Aug-2019].
- 14. American Diabetes Association, 'Economic Costs of Diabetes in the U.S. in 2017', Diabetes Care, p. dci180007, Mar. 2018. https://doi.org/10.2337/dci18-0007
- 15. L. Heinemann, 'Finger Pricking and Pain: A Never Ending Story', J. Diabetes Sci. Technol. Online, vol. 2, no. 5, pp. 919–921, Sep. 2008. https://doi.org/10.1177/193229680800200526
- 16. G. Li, S. Wang, and Y. Y. Duan, 'Towards gel-free electrodes: A systematic study of electrode-skin impedance', Sens. Actuators B Chem., vol. 241, pp. 1244–1255, Mar. 2017. https://doi.org/10.1016/j.snb.2016.10.005



Copyright © 2019 by the Authors. This is an open access article distributed under the Creative Commons Attribution (CC BY) License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Arrived: 31. 07. 2019 Accepted: 13. 01. 2020


Journal of Microelectronics, Electronic Components and Materials Vol. 49, No. 4(2019), 261 – 261

# *50-letnica Laboratorija za mikroelektroniko, Univerze v Ljubljani, Fakultete za elektrotehniko v Ljubljani*

*50th anniversary of Laboratory of Microelectronics, University of Ljubljana, Faculty of Electrical Engineering, Ljubljana*



On 20 December 2019, the Faculty of Electrical Engineering of the University of Ljubljana hosted a ceremonial academy marking the 50th anniversary of the Laboratory of Microelectronics (LMFE). Distinguished speakers addressed the event.

In his address, the Rector of the University of Ljubljana, Prof. Dr. Igor Papič, placed the event among the celebrations of the 100th anniversary of our university. He stressed the importance of the transfer of technology, innovation and knowledge between the University of Ljubljana and industry, and thanked Prof. Janez Trontelj and the members of the Laboratory of Microelectronics for their successful work. The Dean of the Faculty of Electrical Engineering, Prof. Dr. Gregor Dolinar, likewise thanked the leadership and members of the laboratory for their many years of successful work and wished them equal success in the future.

The former dean, Prof. Dr. Janez Nastran, kindly and warmly addressed the head of the laboratory and recalled the actions that, during his time as dean, wove a strong and indelible bond between the faculty and the laboratory. His term at the head of the faculty belongs to the period of the most pleasant atmosphere in LMFE.

Janez Novak, director of RLS Merilna tehnika, illustrated the importance of cooperation with LMFE by recalling some outstanding projects through which RLS became a world leader.

Mr. Robert Žerjal, former director of development at Iskra Avtoelektrika, recalled that LMFE is exceptional and unique for Slovenian industry and has meant a great deal for its development. At the same time, the members of the laboratory almost never appeared in public, but tirelessly pursued their set goals and their mission.

The head of the laboratory, Prof. Dr. Janez Trontelj, briefly presented the path LMFE has travelled since its beginnings, with some successful examples and pictures of interesting products.

Among the most important achievements, he highlighted:

- Development of hybrid technology (circuits for Iskra Elektrooptika)
- Development of monolithic technology (first chip in 1976)
- Development of 3 µm CMOS technology together with International Microelectronic Products
- The contribution of LMFE to the founding of Iskra Mikroelektronika
- Development of AMR technology at LMFE

Among the world firsts, he presented:

- The first circuit for a telephone set (licence: USA and Japan)
- The first 8-bit microcomputer (EMZ1001 / S2000)
- The first book on the design of mixed-signal ASICs (published by McGraw Hill)
- The first video cassette on the design of mixed-signal integrated circuits (published by IEEE)
- The first integrated circuit for a smart card

Outstanding achievements for domestic industry:

- Circuits for the defence industry
- Invention and fabrication of the first circuit for a magnetic angle and position encoder (copied in multi-million series around the world)
- Circuit for baggage screening at airports
- Circuit for passenger safety in car crashes
- Circuit for a precision touch sensor (Renishaw)
- THz sensor operating at room temperature
- Meter for extremely low concentrations of molecules of (hazardous) substances in air

A pleasant surprise at the end was provided by Prof. Dr. Marko Topič, who presented a plaque of the MIDEM Society with an award to the Laboratory of Microelectronics for its outstanding contribution in the fields of scientific research, development and innovation, and for its continuous support of professional and society activities.

## *Boards of* MIDEM *Society | Organi društva* MIDEM

## MIDEM *Executive Board | Izvršilni odbor* MIDEM

**President of the MIDEM Society | Predsednik društva MIDEM** 

Prof. Dr. Marko Topič, University of Ljubljana, Faculty of Electrical Engineering, Slovenia

#### **Vice-presidents | Podpredsednika**

Prof. Dr. Barbara Malič, Jožef Stefan Institute, Ljubljana, Slovenia Dr. Iztok Šorli, MIKROIKS, d. o. o., Ljubljana, Slovenia

#### **Secretary | Tajnik**

Olga Zakrajšek, UL, Faculty of Electrical Engineering, Ljubljana, Slovenia

#### **MIDEM Executive Board Members | Člani izvršilnega odbora MIDEM**

Darko Belavič, HIPOT-RR d.o.o., Otočec, Slovenia Dr. Slavko Bernik, Jožef Stefan Institute, Ljubljana, Slovenia Dr. Miha Čekada, Jožef Stefan Institute, Ljubljana, Slovenia Prof. DDr. Denis Đonlagič, UM, Faculty of Electrical Engineering and Computer Science, Maribor, Slovenia Prof. Dr. Leszek J. Golonka, Technical University Wroclaw, Poland Dr. Vera Gradišnik, Tehnički fakultet Sveučilišta u Rijeci, Rijeka, Croatia Leopold Knez, Iskra TELA d.d., Ljubljana, Slovenia mag. Mitja Koprivšek, ETI Elektroelementi, Izlake, Slovenia Prof. Dr. Miran Mozetič, Jožef Stefan Institute, Ljubljana, Slovenia Prof. Dr. Janez Trontelj, UL, Faculty of Electrical Engineering, Ljubljana, Slovenia Dr. Danilo Vrtačnik, UL, Faculty of Electrical Engineering, Slovenia

### *Supervisory Board | Nadzorni odbor*

Prof. Dr. Franc Smole, UL, Faculty of Electrical Engineering, Ljubljana, Slovenia Prof. Dr. Drago Strle, UL, Faculty of Electrical Engineering, Ljubljana, Slovenia Igor Pompe, Ljubljana, Slovenia

### *Court of honour | Častno razsodišče*

Darko Belavič, Slovenia Dr. Marko Hrovat, Slovenia Dr. Miloš Komac, Slovenia

**Informacije MIDEM** *Journal of Microelectronics, Electronic Components and Materials* ISSN 0352-9045

*Publisher* / Založnik: *MIDEM Society* / Društvo MIDEM *Society for Microelectronics, Electronic Components and Materials, Ljubljana, Slovenia* Strokovno društvo za mikroelektroniko, elektronske sestavne dele in materiale, Ljubljana, Slovenija

**www.midem-drustvo.si**