UDK 533.5:681.2.08
Original scientific article/Izvirni znanstveni članek
ISSN 1580-2949
MTAEC9, 43(2)85(2009)

MODELLING THE CHARACTERISTICS OF AN INVERTED MAGNETRON USING NEURAL NETWORKS
MODELIRANJE KARAKTERISTIKE INVERTNEGA MAGNETRONA Z NEVRONSKIMI SISTEMI

Igor Belič
Institute of Metals and Technology, Lepi pot 11, 1000 Ljubljana, Slovenia
igor.belic@imt.si

Prejem rokopisa - received: 2009-02-10; sprejem za objavo - accepted for publication: 2009-03-10

The inverted magnetron or cold cathode gauge (CCG) is a device used as a coarse pressure gauge in vacuum systems. It is a very robust device with a whole range of good properties. The problem with its use lies in its highly nonlinear, time-variable characteristic and in the fact that the theory of its operation is not thoroughly understood. Neural networks are, therefore, an ideal tool for building a nonlinear model of the characteristic from a set of measured points. Such a model is valid for a certain period of time. When the characteristic of the CCG is altered significantly (due to aging and contamination), the gauge needs to be recalibrated, and here again neural networks provide a very easy-to-use and robust tool. The article presents a simulation of the CCG characteristic. It is meant to provide sufficiently large sets of data to enable a study of the modelling properties of the neural networks used. The CCG characteristic was split into several segments, each of which was modelled by a separate neural network. The results of the study are presented. The study resulted in a practically usable methodology for employing neural networks to calibrate (or recalibrate) CCGs.

Keywords: inverted magnetron, CCG, modelling, approximation, neural networks, calibration

1 INTRODUCTION

The inverted magnetron or cold cathode gauge (CCG) is normally used as a coarse pressure gauge in the range from $1 \cdot 10^{-12}$ to $1 \cdot 10^{-2}$ mbar. During our work the range from $1 \cdot 10^{-9}$ to $1 \cdot 10^{-5}$ mbar was used. (In vacuum physics the mbar is commonly used, although the SI unit is the Pa: 1 bar = $10^{5}$ N/m$^2$ = $10^{5}$ Pa; 1 mbar = 1 hPa.)

Our research group has already published several articles on the principles of CCG operation [1-5], so only a very brief overview of the operating principles is given here. In the inverted magnetron the electrons are trapped in perpendicular magnetic and electric fields [5]. The electrons move on cycloid trajectories around the anode, which is placed inside the discharge cell.
The kinetic energy of the electrons is high enough to ionize the atoms and molecules of the vacuum chamber's atmosphere inside the magnetron cell. After an electron collides with an atom or molecule, its kinetic energy decreases, so it is drawn into a cycloid trajectory closer to the anode. After a series of collisions, the electron reaches the anode and contributes to the anode current. Because of their higher mass/charge ratio, the ions take wider cycloid trajectories than the electrons and hit the cathode. In doing so, they release new electrons from the cathode surface, which add to the electron cloud within the magnetron cell [6,7]. Some ions are trapped on the cathode surface and no longer contribute to the chamber's atmosphere. This causes an unwanted pumping effect: the gauge itself lowers the pressure inside the vacuum chamber.

Inverted magnetrons are very robust devices. They use very little power, have a very high sensitivity, operate without a hot cathode and are relatively cheap [8]. Usually, they act as relative pressure gauges in large vacuum systems, such as accelerators, as well as in vacuum systems where the additional RF pollution caused by a gauge (for example, by a hot filament cathode) cannot be tolerated [9-12]. Compared to hot cathode gauges, cold cathode gauges also show a very low level of thermal outgassing, they do not emit unwanted x-rays, nor do they cause electron-stimulated desorption. The electron cloud is provided solely by the self-sustaining ionisation of the atoms and molecules of the vacuum chamber's atmosphere. Although there are always enough electrons in the rotating field of the CCG, the anode current rises with the pressure.

Materiali in tehnologije / Materials and technology 43 (2009) 2, 85-95
The property that makes the use of the inverted magnetron problematic is the nonlinear relationship between the registered ion current and the actual pressure in the vacuum chamber. At very low pressures the device does not start easily, and it can take some time to form the space charge in the CCG cell. Devices without a starter that provides the initial electron cloud might not start in UHV (ultra-high-vacuum) conditions. Despite their many advantages, CCGs are restricted in their use by a measuring accuracy that decreases over the operating time [13,14] as the internal electrodes become contaminated. Thus, to guarantee a consistently high measuring accuracy, this gauge type needs to be calibrated regularly after a fixed operating period. The calibration process can be improved by the use of neural-network modelling.

This article presents the process of modelling the characteristics of the inverted magnetron (CCG - cold cathode gauge) using neural networks. The characteristics were obtained on a calibration ultra-high-vacuum system consisting of a test chamber, an extractor gauge, a spinning rotor gauge, and a gas manifold with a precision valve. The magnetron ion current was measured simultaneously with the high voltage between the cathode and the anode, at different pressures that varied from $10^{-9}$ to $10^{-5}$ mbar. The working voltage (cathode-anode) was varied in the range from 1.2 kV to 9 kV. For all measurements, the magnetic flux density remained at 1.3 T. A very positive attribute of the CCG is its extremely low thermal outgassing, so it can be used for low-pressure measurements in ultra-high-vacuum systems.

Figure 1: The nonlinear characteristic I/U of the magnetron; horizontal axis U/kV. (Measurements were conducted by dr. Alenka Vesel and dr. Miran Mozetič, both from the Jožef Stefan Institute, Ljubljana, Slovenia.)
An unwanted property of the inverted magnetron is the highly nonlinear dependence between the ion current and the pressure in the vacuum system (Figure 1). In some areas the CCG characteristic can also show discontinuities. The mechanisms of operation of such a complicated device are not understood in detail. Consequently, classical mathematical modelling cannot cover the analytical needs of a device that serves as measurement equipment.

For the inverted magnetron to serve as a vacuum-system pressure gauge, the dependence between the ion current, the operating voltage and the pressure in the vacuum system must be known. In addition, its characteristic must be measured during calibration. For practical reasons the number of measured points is usually limited. The role of the neural network is to model the characteristic in the whole usable space between the measured points. The complete set of measured points is used as the training set for a multi-layer neural network with the classical error-backpropagation training scheme. The resulting model must be able to reconstruct the input-output relationship, where the input consists of the ion current and the working voltage, and the pressure in the vacuum system is the output value. Such a model makes it possible to use the inverted magnetron as a pressure gauge. During the CCG's lifetime its characteristic changes, so it needs to be recalibrated several times. The use of a neural network to model the CCG's characteristic is proposed.

The nonlinear CCG characteristic (I/p) is normally approximated piecewise using Equation (1).
$$I = k\,p^{n} \tag{1}$$

with the sensitivity

$$I/p = k\,p^{\,n-1} \tag{2}$$

where I represents the ion current, p is the pressure, and n and k are constants that differ for each observed part of the CCG characteristic. In the literature [15] the values of n are listed from 1.05 to 2. The constant k depends on the magnetic flux density, the geometry of the discharge chamber and the gases present in the chamber. The constant n depends primarily on the magnetic flux density, the operating voltage and, again, on the geometry of the device [7].

The theory of magnetron operation is not known in enough detail for a theoretical mathematical model to cover the device's operation for measurement purposes [16]. The relationship between the ion current and the pressure above the so-called "magnetron knee" is usually given in logarithmic tables of measured values [17,18]. The tables are formed in a time-consuming calibration process. Furthermore, the use of such tables makes operating the magnetron clumsy. The values between those covered in the table are usually calculated by linear interpolation, which introduces additional errors into the measurements.

The introduction of neural networks reduces the number of measured points needed for the calibration process. Since the neural network builds the nonlinear CCG characteristic, the linear interpolation is no longer needed and, consequently, any error produced by the linear interpolation is avoided.
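As a rough illustration of the interpolation error discussed above, the following Python sketch builds a two-point lookup table from Equation (1) and compares a linearly interpolated pressure with the exact inverse of Equation (1). The values k = 1.5 and n = 1.2, the table points and all function names are assumptions chosen for illustration only; they do not come from a real calibration table.

```python
def ion_current(p, k, n):
    """Equation (1): I = k * p**n."""
    return k * p ** n


def sensitivity(p, k, n):
    """Equation (2): I/p = k * p**(n - 1)."""
    return k * p ** (n - 1)


def pressure_from_current(i, k, n):
    """Exact inverse of Equation (1): p = (I/k)**(1/n)."""
    return (i / k) ** (1.0 / n)


def interpolate_pressure(i, table):
    """Linear interpolation of p in a table of (I, p) pairs sorted by I."""
    for (i0, p0), (i1, p1) in zip(table, table[1:]):
        if i0 <= i <= i1:
            t = (i - i0) / (i1 - i0)
            return p0 + t * (p1 - p0)
    raise ValueError("current outside the tabulated range")


# Assumed, illustrative constants (not measured values).
K, N = 1.5, 1.2

# A very coarse "calibration table" with one point per decade of pressure.
table = [(ion_current(p, K, N), p) for p in (1e-7, 1e-6)]

# Query the table between the two tabulated points.
p_true = 3e-7
i_query = ion_current(p_true, K, N)
p_interp = interpolate_pressure(i_query, table)
rel_error = (p_interp - p_true) / p_true

print(f"true p = {p_true:.3e} mbar, interpolated p = {p_interp:.3e} mbar")
print(f"relative interpolation error = {rel_error:.1%}")
```

Under these assumptions the interpolated pressure comes out lower than the true one, because the inverse of Equation (1) is concave in I when n > 1. A denser table reduces the error, at the cost of more calibration points.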
The important properties of the inverted magnetron can be summarized in several points:
• The principle of the magnetron's operation is not known in enough detail to enable a concise mathematical model;
• The CCG characteristic is nonlinear and in some places even discontinuous;
• Usually, a very coarse piecewise mathematical model is applied;
• The operation of the CCG as a pressure gauge is stable and repeatable, although it needs recalibration because of contamination and aging;
• It is usually used as a coarse relative pressure gauge.

1.1 A testing ground for the neural network modelling

In the process of modelling it is vital to have a reasonably large amount of data, first to build the model and second to validate its operation. It is rare to have a large amount of data readily to hand. Therefore, it is good practice to build a generator that can provide the amount of data needed to assess all the necessary aspects of the model. The simulated data is intended only to enable a thorough analysis of the modelling process itself, before the process is used on "live" data (Figure 2). By no means is the simulation intended to clarify the physical phenomena that take place in the CCG.

Figure 2: The measured characteristic of the inverted magnetron (log values for p and I). (Measurements were obtained by dr. Bojan Erjavec, IMT, Ljubljana, Slovenia.) The operating range of the device spans several decades, which complicates the modelling process and makes the logarithmic presentation necessary for the graph to be meaningful.

The simulation of the inverted magnetron characteristic uses the basic Equation (1), which relates the pressure in the vacuum chamber and the ion current of the inverted magnetron. Equation (1) also includes two parameters that depend on the magnetic flux density, the operating voltage, the geometry, and the materials used to fabricate the device. Figure 2 depicts the characteristic (U-p-I) measured at four different pressure values, with operating voltages between 2.5 kV and 7.5 kV. Therefore, for each operating voltage we have four ion-current values, one for each chamber pressure. First we have to assess the values of the parameters k and n. From the four characteristic points, the least-squares method was used to calculate k and n at each voltage. The upper part of Table 1 contains the values assessed from the measured CCG data. These values represent the initial assessment of where the simulated values should be. In the simulated characteristic, a slightly narrower range was used (Table 1 - lower part). The values used to create the simulation are printed in the same table.

Figure 3: Simulation of the ideal characteristic of the inverted magnetron. The curves differ in the operating voltage U, with the corresponding parameters k and n. The right-hand plot is the base-10 logarithm of the left-hand plot.
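The least-squares assessment of k and n described above can be sketched as follows. The article does not state in which space the fit was made; this sketch assumes a straight-line fit in log space, lg(I) = lg(k) + n lg(p), which follows from Equation (1). The four pressures and the function name `fit_power_law` are hypothetical, and the data points are generated from Equation (1) rather than measured.

```python
import math


def fit_power_law(pressures, currents):
    """Least-squares fit of lg(I) = lg(k) + n*lg(p); returns (k, n)."""
    xs = [math.log10(p) for p in pressures]
    ys = [math.log10(i) for i in currents]
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    # Slope and intercept of the ordinary least-squares line in log space.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    n = sxy / sxx
    k = 10 ** (y_mean - n * x_mean)
    return k, n


# Hypothetical data: four points generated from Equation (1) with the
# values listed in Table 1 for U = 4 kV (k = 2.2576, n = 1.0657).
true_k, true_n = 2.2576, 1.0657
pressures = [1e-8, 1e-7, 1e-6, 1e-5]          # assumed pressures in mbar
currents = [true_k * p ** true_n for p in pressures]

k_fit, n_fit = fit_power_law(pressures, currents)
print(f"k = {k_fit:.4f}, n = {n_fit:.4f}")
```

With noise-free synthetic points the fit simply recovers the generating constants; with measured points it returns the best straight line through the four values in log space.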
Table 1: Measured and simulated values for the parameters k and n

MEASURED VALUES
U/kV  k       n
2.5   0.9130  1.0026
3     1.3866  0.9973
3.5   1.8302  1.0371
4     2.2576  1.0657
4.5   2.7023  1.1448
5     3.0021  1.1127
5.5   3.2945  1.0753
6     3.5140  1.0412
6.5   3.9903  1.0319
7     4.3364  1.0218
7.5   4.9055  1.0398

SIMULATED VALUES
U/kV  k       n
1     1.0000  1.0000
1.3   1.0500  1.0200
1.6   1.1000  1.0400
1.9   1.1500  1.0600
2.2   1.2000  1.0800
2.5   1.2500  1.1000
2.8   1.3000  1.1200
3.1   1.3500  1.1400
3.4   1.4000  1.1600
3.7   1.4500  1.1800
4     1.5000  1.2000

For the selected values of k and n (lower part of Table 1), the ideal characteristic U-p-I is generated. Figure 3 shows it as the base-10 logarithm of the ion current I against the base-10 logarithm of the pressure p, with one curve for each constant operating voltage. It is not our goal to simulate the ideal characteristic; in fact, we need a characteristic that includes departures from such an idealization. The ideal characteristic is therefore modified in a few steps. All the modifications are made on the data in log space.

The first modification changes the value of the ion current versus the pressure (Figure 4). The modification follows Equation (3).

$$\lg(I_2) = \lg(I_1) + \sin\left(\frac{\pi\,(\lg(p) + 7)}{4}\right) \tag{3}$$

Here, $I_1$ represents the ion current prior to the modification, while $I_2$ represents the value after it. The second modification bends the CCG characteristic with regard to the operating voltage - Equation (4).

$$\lg(I_3) = \lg(I_2) + \sin\left(\frac{\pi\,(U - 1)}{3}\right) \tag{4}$$

The current $I_2$ is the value prior to, and $I_3$ the value after, the second modification (Figure 5). The third modification introduces random fluctuations into the characteristic modified so far. The modification follows Equation (5).

$$\lg(I_4) = \lg(I_3) + m \cdot \mathrm{Rand}() \tag{5}$$

As in the previous modifications, $I_3$ holds the value prior to, and $I_4$ the value after, the modification. The generator of random numbers is denoted by Rand().
It generates pseudo-random numbers with values from -1 to +1, while the parameter m sets the magnitude of the randomization. The result of the third modification is shown in Figure 6. The three modifications together form the simulated CCG characteristic, which is presented in 3D in Figure 7. The data of the simulated CCG characteristic is gathered in Table 2. The same data can also be presented as the parameterized graph shown in Figure 8. The similarity between the characteristics in Figure 1 and Figure 8 is obvious: the simulated characteristic is close enough to the actual one to enable a study of the modelling properties of the neural network.

Figure 4: The first modification of the ideal CCG characteristic - it bends the characteristic with regard to the pressure. The curves differ in the operating voltage U, with the corresponding parameters k and n. The right-hand plot is the base-10 logarithm of the left-hand plot.

Figure 5: The second modification - the characteristic is bent with regard to the operating voltage. The curves differ in the operating voltage U, with the corresponding parameters k and n. The right-hand plot is the base-10 logarithm of the left-hand plot.

Figure 6: The third modification - the randomization process. The curves differ in the operating voltage U, with the corresponding parameters k and n. The right-hand plot is the base-10 logarithm of the left-hand plot.

1.2 The neural-network modelling of the CCG characteristic

The calibration of the inverted magnetron is a time-consuming task. The neural-network model of the characteristic must reduce the required number of calibration points, and it should model the characteristic in the whole usable space. The central problem of modelling the CCG is that its operation spans a large range, both for the current ($10^{-11}$ A to $10^{-4}$ A) and for the pressure ($10^{-9}$ to $10^{-6}$ mbar).

Figure 7: An example of the simulated characteristic of the magnetron

Figure 8: The parameterized view of the simulated characteristic of the CCG. The pressure p in the vacuum chamber is the parameter of the presented curves; a higher curve is obtained at a higher pressure.

Table 2: The simulated CCG characteristic. The ion current of the inverted magnetron I/A in relation to the pressure p/mbar and the operating voltage U/kV. The CCG characteristic is divided into 10 segments for further processing. The central part of the table holds the ion current I/A. The areas where the segments overlap with regard to the pressure p are shaded in gray in the original table.
Ion current I/A; columns are the operating voltage U/kV, rows are the pressure p/mbar.

p/mbar   4        3.7      3.4      3.1      2.8      2.5      2.2      1.9      1.6      1.3      1
1.00E-09 1.00E-10 7.37E-11 5.07E-11 3.05E-11 1.53E-11 6.19E-12 2.00E-12 5.20E-13 1.13E-13 2.14E-14 3.79E-15
2.00E-09 2.13E-10 1.63E-10 1.16E-10 7.22E-11 3.75E-11 1.57E-11 5.24E-12 1.41E-12 3.17E-13 6.24E-14 1.14E-14
3.00E-09 3.52E-10 2.74E-10 1.99E-10 1.27E-10 6.71E-11 2.87E-11 9.77E-12 2.69E-12 6.16E-13 1.24E-13 2.31E-14
4.00E-09 5.15E-10 4.07E-10 3.00E-10 1.93E-10 1.04E-10 4.51E-11 1.56E-11 4.35E-12 1.01E-12 2.06E-13 3.91E-14
5.00E-09 7.01E-10 5.60E-10 4.17E-10 2.72E-10 1.48E-10 6.49E-11 2.27E-11 6.40E-12 1.51E-12 3.10E-13 5.95E-14
6.00E-09 9.10E-10 7.34E-10 5.52E-10 3.63E-10 1.99E-10 8.81E-11 3.11E-11 8.86E-12 2.10E-12 4.37E-13 8.46E-14
7.00E-09 1.14E-09 9.27E-10 7.03E-10 4.66E-10 2.58E-10 1.15E-10 4.09E-11 1.17E-11 2.80E-12 5.87E-13 1.15E-13
8.00E-09 1.39E-09 1.14E-09 8.70E-10 5.81E-10 3.23E-10 1.45E-10 5.19E-11 1.50E-11 3.61E-12 7.61E-13 1.50E-13
9.00E-09 1.67E-09 1.37E-09 1.05E-09 7.07E-10 3.96E-10 1.79E-10 6.44E-11 1.87E-11 4.53E-12 9.60E-13 1.90E-13
1.00E-08 1.96E-09 1.62E-09 1.25E-09 8.46E-10 4.76E-10 2.16E-10 7.82E-11 2.28E-11 5.56E-12 1.19E-12 2.36E-13
2.00E-08 6.01E-09 5.15E-09 4.11E-09 2.88E-09 1.67E-09 7.87E-10 2.95E-10 8.92E-11 2.25E-11 4.96E-12 1.02E-12
3.00E-08 1.20E-08 1.05E-08 8.52E-09 6.08E-09 3.61E-09 1.73E-09 6.63E-10 2.05E-10 5.26E-11 1.18E-11 2.49E-12
4.00E-08 1.97E-08 1.75E-08 1.44E-08 1.05E-08 6.30E-09 3.07E-09 1.19E-09 3.73E-10 9.72E-11 2.22E-11 4.73E-12
5.00E-08 2.92E-08 2.61E-08 2.19E-08 1.60E-08 9.75E-09 4.80E-09 1.88E-09 5.96E-10 1.57E-10 3.63E-11 7.82E-12
6.00E-08 4.03E-08 3.64E-08 3.07E-08 2.27E-08 1.40E-08 6.93E-09 2.74E-09 8.77E-10 2.34E-10 5.44E-11 1.18E-11
7.00E-08 5.29E-08 4.83E-08 4.10E-08 3.05E-08 1.89E-08 9.47E-09 3.78E-09 1.22E-09 3.27E-10 7.67E-11 1.68E-11
8.00E-08 6.72E-08 6.16E-08 5.27E-08 3.95E-08 2.47E-08 1.24E-08 4.99E-09 1.62E-09 4.37E-10 1.03E-10 2.28E-11
9.00E-08 8.29E-08 7.65E-08 6.58E-08 4.96E-08 3.12E-08 1.58E-08 6.38E-09 2.08E-09 5.65E-10 1.34E-10 2.98E-11
1.00E-07 1.00E-07 9.28E-08 8.03E-08 6.09E-08 3.84E-08 1.96E-08 7.95E-09 2.61E-09 7.12E-10 1.70E-10 3.79E-11
2.00E-07 3.43E-07 3.30E-07 2.95E-07 2.32E-07 1.51E-07 7.98E-08 3.36E-08 1.14E-08 3.22E-09 7.97E-10 1.84E-10
3.00E-07 6.97E-07 6.83E-07 6.25E-07 5.00E-07 3.33E-07 1.79E-07 7.70E-08 2.67E-08 7.70E-09 1.94E-09 4.58E-10
4.00E-07 1.14E-06 1.14E-06 1.05E-06 8.55E-07 5.79E-07 3.16E-07 1.38E-07 4.83E-08 1.41E-08 3.63E-09 8.66E-10
5.00E-07 1.66E-06 1.67E-06 1.57E-06 1.29E-06 8.81E-07 4.86E-07 2.14E-07 7.61E-08 2.25E-08 5.84E-09 1.41E-09
6.00E-07 2.25E-06 2.28E-06 2.16E-06 1.79E-06 1.24E-06 6.88E-07 3.06E-07 1.10E-07 3.28E-08 8.57E-09 2.09E-09
7.00E-07 2.89E-06 2.96E-06 2.82E-06 2.36E-06 1.64E-06 9.20E-07 4.12E-07 1.49E-07 4.48E-08 1.18E-08 2.90E-09
8.00E-07 3.58E-06 3.69E-06 3.54E-06 2.98E-06 2.09E-06 1.18E-06 5.32E-07 1.93E-07 5.86E-08 1.55E-08 3.85E-09
9.00E-07 4.32E-06 4.48E-06 4.32E-06 3.66E-06 2.58E-06 1.46E-06 6.64E-07 2.43E-07 7.40E-08 1.98E-08 4.92E-09
1.00E-06 5.09E-06 5.31E-06 5.15E-06 4.38E-06 3.10E-06 1.77E-06 8.08E-07 2.97E-07 9.11E-08 2.44E-08 6.11E-09
2.00E-06 1.43E-05 1.54E-05 1.55E-05 1.36E-05 9.97E-06 5.90E-06 2.78E-06 1.06E-06 3.36E-07 9.34E-08 2.42E-08
3.00E-06 2.48E-05 2.73E-05 2.80E-05 2.51E-05 1.88E-05 1.13E-05 5.46E-06 2.12E-06 6.87E-07 1.95E-07 5.15E-08
4.00E-06 3.58E-05 3.99E-05 4.16E-05 3.79E-05 2.87E-05 1.76E-05 8.60E-06 3.39E-06 1.11E-06 3.20E-07 8.59E-08
5.00E-06 4.69E-05 5.29E-05 5.57E-05 5.13E-05 3.94E-05 2.44E-05 1.21E-05 4.81E-06 1.60E-06 4.64E-07 1.26E-07
6.00E-06 5.79E-05 6.60E-05 7.01E-05 6.52E-05 5.05E-05 3.15E-05 1.57E-05 6.33E-06 2.12E-06 6.22E-07 1.70E-07
7.00E-06 6.88E-05 7.90E-05 8.45E-05 7.92E-05 6.18E-05 3.89E-05 1.96E-05 7.93E-06 2.68E-06 7.92E-07 2.18E-07
8.00E-06 7.95E-05 9.18E-05 9.89E-05 9.33E-05 7.33E-05 4.65E-05 2.35E-05 9.60E-06 3.26E-06 9.72E-07 2.70E-07
9.00E-06 8.99E-05 1.04E-04 1.13E-04 1.07E-04 8.49E-05 5.41E-05 2.76E-05 1.13E-05 3.87E-06 1.16E-06 3.24E-07
1.00E-05 1.00E-04 1.17E-04 1.27E-04 1.21E-04 9.65E-05 6.19E-05 3.16E-05 1.31E-05 4.49E-06 1.35E-06 3.79E-07
2.00E-05 1.88E-04 2.27E-04 2.56E-04 2.53E-04 2.08E-04 1.38E-04 7.31E-05 3.12E-05 1.11E-05 3.46E-06 1.01E-06
3.00E-05 2.56E-04 3.16E-04 3.63E-04 3.66E-04 3.07E-04 2.08E-04 1.12E-04 4.91E-05 1.78E-05 5.67E-06 1.68E-06
4.00E-05 3.11E-04 3.89E-04 4.54E-04 4.64E-04 3.96E-04 2.72E-04 1.49E-04 6.59E-05 2.43E-05 7.84E-06 2.36E-06
5.00E-05 3.56E-04 4.51E-04 5.33E-04 5.51E-04 4.75E-04 3.30E-04 1.83E-04 8.18E-05 3.05E-05 9.94E-06 3.02E-06
6.00E-05 3.95E-04 5.05E-04 6.02E-04 6.28E-04 5.46E-04 3.83E-04 2.14E-04 9.67E-05 3.64E-05 1.20E-05 3.68E-06
7.00E-05 4.29E-04 5.53E-04 6.64E-04 6.98E-04 6.11E-04 4.32E-04 2.44E-04 1.11E-04 4.20E-05 1.39E-05 4.31E-06
8.00E-05 4.59E-04 5.95E-04 7.20E-04 7.62E-04 6.71E-04 4.78E-04 2.71E-04 1.24E-04 4.74E-05 1.58E-05 4.93E-06
9.00E-05 4.86E-04 6.33E-04 7.70E-04 8.20E-04 7.27E-04 5.20E-04 2.97E-04 1.37E-04 5.25E-05 1.76E-05 5.53E-06
1.00E-04 5.09E-04 6.68E-04 8.16E-04 8.74E-04 7.79E-04 5.60E-04 3.22E-04 1.49E-04 5.75E-05 1.94E-05 6.11E-06

The neural-network approximation requires that both the input and the output values are mapped into the range from 0 to 1 (or, in some versions, from -1 to +1). The main problem with this mapping is that small values are modelled with very low precision. The problem is addressed in detail in [19]. Basically, there are two strategies for dealing with it: one is to transform the data into log space, and the other is to split the characteristic into an appropriate number of segments [5,20]. A solution to the problem of modelling a large data range with neural networks cannot be found in the literature. In such cases the usual approach is to apply the log transformation to the whole data space and then carry out the modelling in log space.
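The loss of precision for small values can be made concrete with a short Python sketch. It maps the current range quoted above ($10^{-11}$ A to $10^{-4}$ A) onto the interval from 0 to 1, once linearly and once in log space, and shows what a fixed absolute error of the network output means for a small and for a large current. The output error of 0.01 and all function names are assumptions made for illustration; the sketch does not involve an actual neural network.

```python
import math

I_MIN, I_MAX = 1e-11, 1e-4   # current range quoted in the text, in A


def scale_linear(x):
    """Map a current linearly onto the interval [0, 1]."""
    return (x - I_MIN) / (I_MAX - I_MIN)


def unscale_linear(s):
    return I_MIN + s * (I_MAX - I_MIN)


def scale_log(x):
    """Map lg(current) linearly onto the interval [0, 1]."""
    lo, hi = math.log10(I_MIN), math.log10(I_MAX)
    return (math.log10(x) - lo) / (hi - lo)


def unscale_log(s):
    lo, hi = math.log10(I_MIN), math.log10(I_MAX)
    return 10 ** (lo + s * (hi - lo))


def relative_error(x, scale, unscale, output_error):
    """Relative error in x caused by an absolute error of the scaled output."""
    return abs(unscale(scale(x) + output_error) - x) / x


OUTPUT_ERROR = 0.01   # assumed absolute error of the model output

rel_lin_small = relative_error(1e-9, scale_linear, unscale_linear, OUTPUT_ERROR)
rel_lin_large = relative_error(1e-4, scale_linear, unscale_linear, OUTPUT_ERROR)
rel_log_small = relative_error(1e-9, scale_log, unscale_log, OUTPUT_ERROR)
rel_log_large = relative_error(1e-4, scale_log, unscale_log, OUTPUT_ERROR)

print(f"linear scaling: {rel_lin_small:.3g} at 1e-9 A, {rel_lin_large:.3g} at 1e-4 A")
print(f"log scaling:    {rel_log_small:.3g} at 1e-9 A, {rel_log_large:.3g} at 1e-4 A")
```

With linear scaling the same output error is negligible at the top of the range but exceeds the small current itself by orders of magnitude. In log space the relative error is the same across the whole range, which is why the log transformation is the usual remedy.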
Nowadays computers are very fast and provide very large memory capacities, so there is no difficulty in addressing the problem from another perspective. Instead of performing the log transformation, the data space can be segmented into convenient sub-spaces, and the modelling process is then executed for each segment separately (Figure 9). Thus, a separate model is created for each segment. It is vital that the data is segmented in such a way that the segments are not too wide and that each segment contains enough data for the modelling.

Figure 9: The segmentation strategy - the complete data space is segmented into several sub-spaces. Each segment is then modelled separately.

The segmentation theory shows the following important details:
• Both the input and output spaces are divided into several sub-spaces called segments. Each segment has its own multiplication constant to map the area close to the 0, 1 interval.
• The neural-network training tolerance is valid for each segment only.
• When all the models are formed, the process of merging them again into a single characteristic has to be accomplished. The modelling error is again valid for each segment separately.

The actual output value of the model $y_a$ is calculated from the output value y of the neural network using the equation

$$y_a = k_i\,y \tag{6}$$

where $k_i$ denotes the multiplication constant of the i-th segment and $y_a$ is the scaled value of the value y produced by the neural-network model. The error produced by the model of the i-th segment can be calculated as in Equation (7).

$$\Delta y_a = \frac{\partial F}{\partial y}\,\Delta y \;\Rightarrow\; \Delta y_a = k_i\,\Delta y; \qquad k_i \ll 1 \tag{7}$$

where the function F represents the modelled function and the meaning of the other symbols is the same as in Equation (6).

We have found that reasonably good modelling results can be achieved if at least five data points are available for each segment (this holds for our CCG example). The division of the modelled space depends on various parameters. The most important are the shape of the modelled function and the admissible relative error that the model should fulfil. Table 2 shows the segmentation of the CCG characteristic into 10 segments. The segments should overlap in order to allow them to be merged once the separate segments are modelled. The overlapping region is shown with the gray background (Table 2).

Figure 10: The CCG characteristic has been split into 10 segments (Segment 1 to Segment 10). Each segment is modelled on its own by a separate neural network (Neural network 1 to Neural network 10).

Neural networks (due to the pre-set training criteria) perform well at higher values. When two segments are to be merged, the overlap holds locally high values for one segment and locally low values for the other. During the merging process it is more likely that the data from the segment that contributes its locally high values is more accurately modelled, so some kind of weighting (linear, nonlinear, etc.) should be used.
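The segment scaling of Equations (6) and (7) and the weighting in the overlap can be sketched in Python as follows. The sketch assumes that the multiplication constant $k_i$ is simply the largest value in the segment, that the training tolerance acts as an absolute error on the scaled output, and that the overlap is merged with a linear cross-fade in lg(p). None of these choices is stated in the article, and all function names are hypothetical.

```python
import math


def to_actual(y_network, k_i):
    """Equation (6): actual value from the scaled network output."""
    return k_i * y_network


def actual_error(delta_y, k_i):
    """Equation (7): error of the actual value for a scaled-output error."""
    return k_i * delta_y


def blend_overlap(p, p_start, p_end, y_lower_segment, y_upper_segment):
    """Linear cross-fade (in lg p) between two segment models in their overlap.

    The lower segment, which contributes its locally high values, dominates
    near p_start; the upper segment takes over towards p_end.
    """
    if p <= p_start:
        return y_lower_segment
    if p >= p_end:
        return y_upper_segment
    w = (math.log10(p) - math.log10(p_start)) / (
        math.log10(p_end) - math.log10(p_start))
    return (1.0 - w) * y_lower_segment + w * y_upper_segment


# Segment 1 of Table 3: ion current from 3.79E-15 A to 8.86E-12 A.
i_min, i_max = 3.79e-15, 8.86e-12
k_1 = i_max                      # assumed choice of the multiplication constant
tolerance = 0.05                 # 5 % training tolerance, taken as absolute

delta_i = actual_error(tolerance, k_1)
worst_relative = delta_i / i_min

print(f"absolute error in segment 1: {delta_i:.3g} A")
print(f"relative to the smallest current: {worst_relative:.3g}")

# Hypothetical overlap between two segment models, merged at p = 1e-7 mbar.
merged = blend_overlap(1e-7, 1e-8, 1e-6, 2.0, 4.0)
print(f"merged value in the middle of the overlap: {merged:.3g}")
```

Under these assumptions the error allowed in segment 1 is far larger than the smallest current in the segment, which illustrates why the segments should not be too wide. The linear weight is only one of the weighting options mentioned in the text.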
2 EXPERIMENTAL

Table 2 holds the data of the CCG characteristic, which has been divided into 10 segments. Figure 10 represents the segmentation graphically. The segments were formed so that, within each segment, the ion current I spans as narrow a range as possible. The main idea is that each segment should cover only as much of the data space as the model can reproduce with acceptable errors.

Table 3: The data space covered by the separate segments

Segment  p/mbar                I/A
1        1.00E-09 to 6.00E-09  3.79E-15 to 8.86E-12
2        6.00E-09 to 5.00E-08  8.46E-14 to 5.96E-10
3        5.00E-08 to 4.00E-07  7.82E-12 to 4.83E-08
4        4.00E-07 to 4.00E-06  8.66E-10 to 3.39E-06
5        4.00E-06 to 1.00E-04  8.59E-08 to 1.49E-04
6        1.00E-09 to 6.00E-09  5.20E-13 to 9.10E-10
7        6.00E-09 to 5.00E-08  8.86E-12 to 2.92E-08
8        5.00E-08 to 4.00E-07  5.96E-10 to 1.14E-06
9        4.00E-07 to 4.00E-06  4.83E-08 to 4.16E-05
10       4.00E-06 to 1.00E-04  3.39E-06 to 8.74E-04

Table 3 shows the chosen segments and the ranges they cover for the pressure p as well as for the CCG ion current I. A brief inspection reveals that some segments still span well over two decades. This is a necessary trade-off: introducing even more segments would require more data points, which is not a problem in the simulated environment but can be one in a real CCG calibration. For testing purposes a 5 % training tolerance was selected.

2.1 The testing environment

In the experimental work we studied the modelling capabilities of neural networks, while at the same time seeking the neural-network architecture that would show the best modelling properties for the given problem. The approximation theory for use with neural networks was corrected and published in [20]. In the same publication, the concept of neural-network training stability was introduced.
Training stability deals with the variability of the possible neural-network models and sets the boundaries within which all the possible models (obtained with different configurations) give their results. The testing of the various neural-network architectures was organised in an orderly fashion (Table 4), where each set of numbers gives the number of artificial neural cells in the corresponding layer. For clarification please refer to 20. For example, the notation 2 10 20 1 means that the input layer consists of 2 neurons, the first hidden layer of 10 neurons, the second hidden layer of 20 neurons, and the output layer of 1 neuron.

Since the experiment took quite some time to complete, it was necessary to develop a system that controls the experiments and, in the case of a power failure, resumes the work where it was interrupted. A log file was created in which all the events relating to the experiment were stored.

Table 4: The organisation of the different neural-network configurations included in the experiment

No.  Configuration    No.  Configuration    No.  Configuration    No.  Configuration
1    2 5 5 1          21   2 5 10 5 1       41   2 10 15 5 1      61   2 15 20 5 1
2    2 5 10 1         22   2 5 10 10 1      42   2 10 15 10 1     62   2 15 20 10 1
3    2 5 15 1         23   2 5 10 15 1      43   2 10 15 15 1     63   2 15 20 15 1
4    2 5 20 1         24   2 5 10 20 1      44   2 10 15 20 1     64   2 15 20 20 1
5    2 10 5 1         25   2 5 15 5 1       45   2 10 20 5 1      65   2 20 5 5 1
6    2 10 10 1        26   2 5 15 10 1      46   2 10 20 10 1     66   2 20 5 10 1
7    2 10 15 1        27   2 5 15 15 1      47   2 10 20 15 1     67   2 20 5 15 1
8    2 10 20 1        28   2 5 15 20 1      48   2 10 20 20 1     68   2 20 5 20 1
9    2 15 5 1         29   2 5 20 5 1       49   2 15 5 5 1       69   2 20 10 5 1
10   2 15 10 1        30   2 5 20 10 1      50   2 15 5 10 1      70   2 20 10 10 1
11   2 15 15 1        31   2 5 20 15 1      51   2 15 5 15 1      71   2 20 10 15 1
12   2 15 20 1        32   2 5 20 20 1      52   2 15 5 20 1      72   2 20 10 20 1
13   2 20 5 1         33   2 10 5 5 1       53   2 15 10 5 1      73   2 20 15 5 1
14   2 20 10 1        34   2 10 5 10 1      54   2 15 10 10 1     74   2 20 15 10 1
15   2 20 15 1        35   2 10 5 15 1      55   2 15 10 15 1     75   2 20 15 15 1
16   2 20 20 1        36   2 10 5 20 1      56   2 15 10 20 1     76   2 20 15 20 1
17   2 5 5 5 1        37   2 10 10 5 1      57   2 15 15 5 1      77   2 20 20 5 1
18   2 5 5 10 1       38   2 10 10 10 1     58   2 15 15 10 1     78   2 20 20 10 1
19   2 5 5 15 1       39   2 10 10 15 1     59   2 15 15 15 1     79   2 20 20 15 1
20   2 5 5 20 1       40   2 10 10 20 1     60   2 15 15 20 1     80   2 20 20 20 1

Figure 11: The formation of the training stability belt - the configuration 2 10 10 10 1, segment 3

Figure 12: The formation of the training stability belt - the configuration 2 10 20 15 1, segment 6

Figure 13: The formation dynamics of the neural network training stability belt - configuration 2 10 10 10 1, segment 3. From the graph we can conclude that after the 90th repetition of the experiment, the training stability belt does not change significantly.

The directory/file structure was organised to store the experimental data. Each directory holds the data on one neural-network configuration (80 directories). In each directory there are 10 files, one for each segment. For each segment and for each neural-network configuration, the neural cell connection weights were randomly generated and the network was trained. To obtain the training stability belt, 100 different weight sets were probed, meaning that 100 neural networks with randomly different connection weights were generated and trained.
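How a training stability belt might be extracted from the repeated training runs can be sketched as follows. The sketch assumes that the belt is the envelope (minimum and maximum) of the outputs of all the trained networks at the same test points, and that its width is averaged over those points. The exact definition is given in 20, so this reading, the function names and the toy numbers are assumptions made for illustration only.

```python
def stability_belt(runs):
    """Return (lower, upper) envelopes over all repetitions.

    runs: list of repetitions; each repetition is a list of model outputs
    evaluated at the same test points.
    """
    lower = [min(column) for column in zip(*runs)]
    upper = [max(column) for column in zip(*runs)]
    return lower, upper


def belt_width_history(runs):
    """Average belt width after 1, 2, ..., len(runs) repetitions."""
    history = []
    for k in range(1, len(runs) + 1):
        lower, upper = stability_belt(runs[:k])
        widths = [u - l for l, u in zip(lower, upper)]
        history.append(sum(widths) / len(widths))
    return history


# Toy outputs of three hypothetical networks at two test points.
runs = [[1.0, 2.0], [1.5, 1.0], [0.5, 2.0]]
print(belt_width_history(runs))
```

The history can only grow or stay constant as repetitions are added. Once it stops growing, further repetitions no longer widen the belt, which is the kind of behaviour used to judge how many repetitions are sufficient.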
The data gathered from the 100 separate experiments is stored in a file. The complete experiment is therefore saved in 800 files. All the generated neural networks were trained with the same parameters controlling the behaviour of the training process: the learning rate (0.7), the momentum (0.5), and the training tolerance (0.1). The training process is stopped when the training tolerance is reached for all the training points. A further limit was imposed because the training process may fail to reach the preset training tolerance. In such cases the training is stopped, another set of weights is generated and the training is repeated.

2.2 The formation of the training stability belt

Before the experiment commences, it is necessary to assess the number of repetitions of the training process needed to obtain information on the width of the training stability belt. Two configurations were used for the assessment: 2 10 10 10 1 - 3rd segment (Figure 11) and 2 10 20 15 1 - 6th segment (Figure 12). The dynamics of the training stability belt was assessed, and it was found that for the configuration 2 10 10 10 1 (Figure 13) the training stability belt remains stable after the 90th repetition. For the configuration 2 10 20 15 1 (Figure 14) the case is almost the same. Therefore, the number of training repetitions needed to form the training stability belt was set to 100.

Figure 14: The formation dynamics of the neural network training stability belt - configuration 2 10 20 15 1, segment 6. Again, we can conclude that after the 90th repetition of the experiment, the training stability belt does not change significantly.

3 RESULTS AND DISCUSSION - THE COMPARISON OF THE NEURAL-NETWORK CHARACTERISTICS

We are searching for the neural-network configuration that gives the best results in two respects: how fast it is capable of learning the function, and how narrow a training stability belt its models produce (the modelled data is as close as possible to the original).

Figure 15: Segment 1. pmax/pmin = 6.00; Imax/Imin = 2337.73; p - 1.00E-09 to 6.00E-09 mbar; I - 3.79E-15 to 8.86E-12 A

Table 5: The data on modelling segment 1

The analysis of models   Number of epochs   Training stability belt width (×10^-9)
Average                  325130             0.55578
Minimum                  184900             0.45526
Configuration            2 15 5 5 1         2 20 15 10 1
Maximum                  753820             0.67061
Configuration            2 5 20 1           2 20 5 1
Standard deviation       120651.7           0.05773

The result of the analysis of the performance of the different neural networks modelling one of the ten segments is shown in Figure 15. Each point on the graph represents one configuration. The x axis shows the average number of epochs needed to train the network, while the y axis shows the average width of the training stability belt expressed in mbar. For segment 1, the ratio between the highest and the lowest value of the pressure, pmax/pmin, is 6.0; the current ratio Imax/Imin is 2337.73. On average, the neural networks needed 325130 epochs to satisfy the training tolerance. The trained models give, on average, a training stability belt width of 5.5 × 10^-10 mbar.
The minimum number of epochs is 184900, and it is reached for the configuration 2 15 5 5 1. The lowest value of the training stability belt width is 4.5 × 10^-10 mbar, which is obtained for the configuration 2 20 15 10 1. On the other hand, the worst results were obtained with the configurations 2 5 20 1 (753820 epochs) and 2 20 5 1 (training stability belt width 6.7 × 10^-10 mbar). The calculated standard deviation of the number of required epochs for segment 1 is 120651.7, while for the training stability belt width it is 5.7 × 10^-11 mbar.

The neural-network configurations that show the best results for both the number of required epochs and the training stability belt width for segment 1 are: 2 15 10 5 1; 2 20 10 20 1; 2 20 15 10 1 etc. The red points in Figure 15 represent the configurations that show the best modelling capabilities, as well as those with the poorest results.

Such assessments were made for all 10 segments. The analysis gives the neural-network configurations that model each separate CCG characteristic segment as well as possible (Figure 16).

Figure 16: The best configurations to model the segmented characteristic of the CCG

From the analysis we can conclude that there is no obvious rule that would point to a concrete neural-network architecture with the narrowest training stability belt and, at the same time, the fastest learning. The most favourable configuration always depends on the nature of the modelled dependence.

Table 6 summarizes the most important data on the modelling properties of the neural networks for the separate CCG characteristic segments. The ratio between the highest and the lowest number of required epochs for all the experiments, regardless of the configuration, was 15.6, and the ratio between the highest and the lowest value of the training stability belt width was 7.1.
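One way to formalise the search for configurations that do well on both criteria (few epochs and a narrow training stability belt) is to keep only the non-dominated configurations, i.e. a Pareto front. The source does not state that this procedure was used; it is offered here only as a sketch. The configuration labels and the numbers in the example are hypothetical and are not taken from the measurements.

```python
def pareto_front(results):
    """Return the configurations not dominated on (epochs, belt width).

    results: dict mapping a configuration label to a tuple
    (average number of epochs, average belt width); lower is better for both.
    """
    front = []
    for name, (epochs, width) in results.items():
        dominated = any(
            e2 <= epochs and w2 <= width and (e2 < epochs or w2 < width)
            for other, (e2, w2) in results.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical values for illustration only (not measured data).
example = {
    "cfg_a": (100, 0.50),
    "cfg_b": (200, 0.40),
    "cfg_c": (300, 0.60),  # slower and wider than cfg_a, so it is dominated
    "cfg_d": (150, 0.45),
}
print(pareto_front(example))
```

Choosing a single favourable configuration from such a front still requires a preference between training speed and belt width, which is consistent with the observation that no general rule emerges.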
For the study of the segmentation strategy during the CCG characteristic modelling, 80 000 models were formed and 9 720 484 000 epochs were used.

Table 6: The modelling of the CCG characteristic segments - the data analysis

Segment  Epochs    Belt width  Epochs   NN configuration  Epochs   NN configuration  Belt width  NN configuration
no.      average   average     min.     (epochs min.)     max.     (epochs max.)     min.        (belt width min.)
1        325130    0.55578     184900   2 15 5 5 1        753820   2 5 20 1          0.45526     2 20 15 10 1
2        64178     2.84227     46430    2 5 15 1          146450   2 5 5 1           2.05883     2 20 20 20 1
3        87066     1.71938     56560    2 10 5 1          146420   2 5 5 1           1.27799     2 15 10 5 1
4        352009    1.99775     324390   2 5 5 20 1        482020   2 5 20 1          1.33548     2 15 10 1
5        27128     3.38247     8770     2 10 20 5 1       137350   2 5 20 1          1.54748     2 20 20 20 1
6        286188    0.25606     175580   2 10 5 20 1       726440   2 15 5 1          0.18384     2 20 10 5 1
7        19852     0.88654     14350    2 20 15 1         25330    2 20 5 10 1       0.59725     2 15 15 10 1
8        10571     0.50022     8230     2 10 20 15 1      21660    2 5 5 1           0.33963     2 15 20 20 1
9        36663     0.62073     25930    2 10 20 5 1       71420    2 10 20 1         0.41891     2 20 20 5 1
10       6276      1.31892     4270     2 10 20 15 1      13840    2 5 20 1          0.30653     2 5 15 1

Segment  Belt width  NN configuration   Epochs    Belt width  Max. value  Max. belt width,   Min. belt width,   Favourable
no.      max.        (belt width max.)  st. dev.  st. dev.    of signal   % of max. signal   % of max. signal   configuration
1        0.67061     2 20 5 1           120652    0.05773     6           11.18              7.59               2 15 10 5 1
2        3.69692     2 10 15 1          11231     0.42213     50          7.39               4.12               2 20 20 20 1
3        2.33175     2 5 10 1           16105     0.22013     40          5.83               3.19               2 20 20 1
4        2.88486     2 5 15 5 1         39022     0.32619     40          7.21               3.34               2 10 5 10 1
5        5.61412     2 5 5 1            26164     0.86263     100         5.61               1.55               2 20 20 20 1
6        0.40764     2 20 5 1           172304    0.04225     6           6.79               3.06               2 20 10 5 1
7        1.21092     2 5 20 5 1         2449      0.15081     50          2.42               1.19               2 20 20 1
8        0.67534     2 20 10 1          3097      0.08581     40          1.69               0.85               2 15 20 20 1
9        0.78866     2 20 10 20 1       11317     0.06955     40          1.97               1.05               2 20 20 5 1
10       2.17188     2 10 20 5 1        2074      0.54849     100         2.17               0.31               2 5 10 20 1

4 CONCLUSION

In the presented study neural networks were used as the modelling tool for the nonlinear CCG characteristic. To build the CCG model, a reasonable amount of measured data must be available to serve as the training set for the neural network. The CCG characteristic is split into several segments, each of which is modelled by its own neural network. The created model is then used as the interface between the measured ion current, the operating voltage, and the actual pressure readout of the CCG. However, due to contamination and aging, the CCG needs to be recalibrated. The process is the same as for the first calibration.

The presented methodology is now fully developed and ready for use in practical applications. Since neural networks run on computers, it is a matter of convenience whether the model is realized on a separate computer or on a specially developed microcomputer system (for example PIC 32 or similar) that then becomes an integral part of the CCG device.

5 REFERENCES

1 B. Erjavec, J. Setina, L. Irmančnik-Belič, Mater. tehnol., 35 (2001) 3/4, 143-150 (in Slovene)
2 B. Erjavec, J. Setina, L. Irmančnik-Belič, Mater. tehnol., 35 (2001) 5, 1-257 (in Slovene)
3 L. Irmančnik-Belič, I. Belič, B. Erjavec, J. Setina, Mater. tehnol., 35 (2001) 6, 15-420 (in Slovene)
4 L. Irmančnik-Belič, I. Belič, B. Erjavec, J. Setina, Mater. tehnol., 36 (2002) 6, 401-405 (in Slovene)
5 L. Irmančnik-Belič, I. Belič, B. Erjavec, J. Setina, Vacuum, 71 (2003), 505-515
6 P. J. Bryant, W. W. Longley, C. M. Gosselin, J. Vac. Sci. Technol., 3 (1965) 2, 62
7 R. N. Peacock, N. T. Peacock, D. S. Hauschulz, J. Vac. Sci. Technol., 3 (1991), 1977
8 L. Cusco, Guide to the Measurement of Pressure and Vacuum, The Institute of Measurement and Control, London (1998)
9 A. Vesel, M. Mozetič, Vacuum, 67 (2002) 3-4, 629-633
10 A. Vesel, M. Mozetič, A. Zalar, Vacuum, 71 (2003) 1-2, 225-228
11 A. Vesel, M. Mozetič, Vacuum, 73 (2004) 2, 281-284
12 A. Vesel, M. Mozetič, M. Žumer, V. Nemanič, B. Zajc, Vacuum, 78 (2005) 1, 13-17
13 S. Wilfert, N. Schindler, Applied Physics A: Materials Science & Processing, 78 (2004) 5, 663-666
14 S. Wilfert, C. Edelmann, Vacuum, 82 (2008), 412-419
15 N. T. Peacock, R. N. Peacock, J. Vac. Sci. Technol., 8 (1990), 2806
16 N. T. Peacock, R. N. Peacock, J. Vac. Sci. Technol., 3 (1988), 1141
17 B. R. F. Kendall, E. Drubetsky, J. Vac. Sci. Technol., 3 (1997), 740
18 P. A. Redhead, Vacuum, 38 (1988) 8-10, 901
19 I. Belič, L. Irmančnik-Belič, B. Erjavec, Strojarstvo, 48 (2006) 1/2, 5-12
20 I. Belič, Vacuum, 80 (2006) 10, 1107-1122

Materiali in tehnologije / Materials and technology 43 (2009) 2, 85-95