Informatica 19 (1995) 257-264 257 
Comparing Inference Control Mechanisms for Statistical Databases with Emphasis on Randomizing 
Ernst L. Leiss 
University of Houston, Department of Computer Science, Houston, Texas, 77004, U.S.A.
coscel@cs.uh.edu
AND
Jurij Jaklič
University of Ljubljana, Faculty of Economics, Kardeljeva pl. 17, 61000 Ljubljana, Slovenia
jurij.jaklic@uni-lj.si
Keywords: statistical databases, security, comparison, randomizing 
Edited by: Jerzy R. Nawrocki
Received: July 27, 1994 Revised: February 24, 1995 Accepted: April 12, 1995

Statistical databases are primarily collected for statistical analysis purposes. They usually contain information about persons and organizations which is considered confidential, and only aggregate statistics on this confidential attribute are permitted. However, deduction of confidential data (inference) is frequently possible. In order to prevent this possibility several security-control mechanisms have been developed, among them the randomizing method.
We compare randomizing with other methods using several evaluation criteria. The evaluation shows that randomizing has several advantages in comparison with other methods, such as a high level of security, robustness, and low cost. On the other hand, the problem of bias for small query sets can be considerable for some applications.

1 Introduction

Databases often contain confidential information. Various security-control mechanisms, such as encryption, identification of users, and access authorization, deal with different kinds of security problems. Here we will study inference, i.e., the deduction of confidential information from legal (i.e., permitted) statistical summary queries.
A statistical database (SDB) is a database where certain users are authorized to issue only aggregate (statistical summary) queries, such as sum, maximum, minimum, count, average, median, variance, standard deviation, and k-moment. These users (researchers) cannot retrieve information about an individual entry.
Typical examples are Census Bureau databases, salaries of individual persons in a company, and diagnoses of patients in a hospital.
We must enable these users to retrieve aggregate statistics, but prevent them from retrieving values of confidential attributes of individual records. In this way we protect the individual's right to privacy while still being able to process information needed by society [22].
Problems arise when certain users (snoopers) try to derive or infer confidential information from legal aggregate queries. If they are successful, we say that the SDB is compromised. There is more than one way to compromise an SDB, for example the linear system attack [10] or the use of a tracker [8].
Several methods have been developed in order to protect SDBs against compromise. Most of them are described in [2], where they are also classified and evaluated. Randomizing security-control mechanisms such as the one described in [17] are only mentioned briefly in [2]. Our goal is to evaluate this method and compare it with others.

2 Overview of the Methods
Two models offer frameworks for dealing with the problem of SDB security [2]: one is called the conceptual model [5] and the other the lattice model [9]. They support the research in this field but do not offer methods for security control.
There are two approaches to the problem of inference for statistical databases:
- query restriction approach and
- perturbation approach.

Methods belonging to the same group have similar characteristics and are therefore easier to compare. In this section a brief overview of the two approaches is given.
2.1 Query Restriction Approach 
The idea is that if we do not permit every aggregate query to be executed, we can achieve better security of the database. The proposed methods differ in the way in which it is decided whether a query is permitted.
The first method (query set size control) is to limit the query set size [13]. It has been noticed [10] that if the issued queries have many common entities (records) in their query sets, there is a higher possibility of compromising the database. So, the query set overlap control method proposes to restrict the number of records that any pair of issued queries can have in common. One of the methods which can provide a very high level of security is auditing [27]. Auditing keeps track of issued queries and checks for each new query whether the database can be compromised. The partitioning method ([4], [25]) and the cell suppression method [6] are other techniques.
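The two simplest restriction rules can be sketched as follows. This is only an illustrative sketch: the function names, the two-sided form of the size rule (a query set must contain at least `k_min` and at most `n - k_min` records) and the `max_overlap` threshold are assumptions made for this example, not definitions taken from [13] or [10].

```python
from typing import Iterable, List, Set

def size_control_allows(query_set: Set[int], n: int, k_min: int) -> bool:
    """Query set size control: reject query sets that are too small or too large."""
    # Assumption: both very small sets and their complements are refused.
    return k_min <= len(query_set) <= n - k_min

def overlap_control_allows(query_set: Set[int],
                           history: Iterable[Set[int]],
                           max_overlap: int) -> bool:
    """Query set overlap control: compare the new query set with every earlier one."""
    return all(len(query_set & old) <= max_overlap for old in history)

# Hypothetical usage on a database of n = 10 records (indices 0..9).
history: List[Set[int]] = [{0, 1, 2, 3}]
candidate = {2, 3, 4, 5}
print(size_control_allows(candidate, n=10, k_min=3))              # size 4 lies in [3, 7]
print(overlap_control_allows(candidate, history, max_overlap=1))  # overlap is {2, 3}
```

Note that the overlap check has to scan the whole query history, so its cost grows with the number of queries already answered.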

2.2 Perturbation Approach 
The perturbation approach includes two subgroups of methods. One subgroup replaces original data in the SDB with other data and uses these perturbed data to compute statistics (data perturbation), while the other subgroup computes statistics from the original data, but noise is added in one way or another during the computation of the statistics (output perturbation).
The probability distribution methods [19] replace the original SDB by another sample with the same (assumed) probability distribution. A variant of this method (the analytical method) approximates the data distribution by orthogonal polynomials (see [15]). The other proposed approach [23] is to replace the true values of a given attribute with perturbed values once and for all (fixed data perturbation). There are two different methods, one for numerical and one for categorical attributes.
An example of the output perturbation methods is the random sample queries method [7]; a statistic of a randomly selected subset of the original query set is given as a response to the issued query.
In the randomizing method [17] a response to the given query is computed from a superset of the query set. Records are randomly selected from the database and added to the query set.
Let us assume that a user is interested in a certain statistic of a given query set of size $k$. Let $i_1, \ldots, i_k$ be the indices of the records in the query set, and let $DK$ be the name of the confidential attribute in whose statistic the user is interested. Instead of computing the true statistic, another $v > 0$ entities of the database are selected randomly and added to the query set. Then the statistic of this superset of the original query set with $k + v$ records is computed and returned as the result. Thus for a query of type average we have the perturbed response

$$A = \frac{\sum_{j=1}^{k} DK_{i_j} + \sum_{j=1}^{v} DK_{s_j}}{k + v} = \frac{k \cdot a + \sum_{j=1}^{v} DK_{s_j}}{k + v}$$

where $s_j$ is the index of the $j$-th randomly selected record and $a$ is the true response $\left(\sum_{j=1}^{k} DK_{i_j}\right)/k$.
$v$ should be (much) smaller than $k$, otherwise the precision of the result can be very bad, although the security of this method increases with higher values of $v$. That yields a security-accuracy trade-off. Here we select $v$ to be equal to 1, justified by the observation (see [17]) that even with small introduced noise, the relative error of the inferred values for $DK$ will be considerable for large values of $k$.
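The perturbed average defined above can be sketched in a few lines. This is a minimal illustration, not the implementation from [17]: the function name is hypothetical, the database is modelled as a plain list of values of the confidential attribute, and the extra records are assumed to be drawn uniformly from the whole database with replacement.

```python
import random
from typing import Optional, Sequence

def randomized_average(values: Sequence[float],
                       query_idx: Sequence[int],
                       v: int = 1,
                       rng: Optional[random.Random] = None) -> float:
    """Average over the query set enlarged by v randomly selected records."""
    rng = rng or random.Random()
    k = len(query_idx)
    if k == 0:
        raise ValueError("empty query set")
    true_sum = sum(values[i] for i in query_idx)   # equals k * a
    # Assumption: uniform selection over all records, with replacement.
    extra_sum = sum(values[rng.randrange(len(values))] for _ in range(v))
    return (true_sum + extra_sum) / (k + v)

# Hypothetical usage: salaries of six persons, query set = first three records.
salaries = [1000.0, 1200.0, 1100.0, 5000.0, 900.0, 1500.0]
print(randomized_average(salaries, [0, 1, 2], v=1, rng=random.Random(42)))
```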




3 Comparison
We use the evaluation criteria proposed in [2] to compare the different methods; they cover the important objectives of a good security control mechanism. Some of the criteria conflict with each other, and thus an effort must be directed towards balancing them.

3.1 Security 
We consider different kinds of disclosures:
- Exact disclosure is possible when a user can obtain the exact value of the confidential attribute.
- Partial disclosure is possible when a user can obtain an estimate $DK'_i$ of the value of the confidential attribute $DK$ for the $i$-th record, such that $\mathrm{Var}(DK'_i) < c_1^2$ (i.e., the variance of the estimate $DK'_i$ is less than $c_1^2$), where the parameter $c_1$ is set by the DBA (Data Base Administrator). In the case of categorical attributes, partial disclosure is possible when a user can infer that the confidential attribute does not have a certain value.
- Statistical disclosure occurs when the same query is issued several times in order to obtain a small variance of the estimate of the true response (filtering).
Let us look first at the exact compromisability of an SDB under different protection methods. There is the possibility of exact disclosure for some methods belonging to the query restriction approach group, namely query set size control and query set overlap control.
Exact compromise is not possible for an SDB protected by auditing, because of the nature of auditing. This is also true for cell suppression and partitioning (see [6] and [26]).
For methods belonging to the perturbation group, including randomizing, there is no possibility of exact disclosure, except in some rare cases for the analytical method (see [2]) and for rounding [1]. The conditions under which exact disclosure is possible for these two methods are very severe.
There are more possibilities for partial disclosure. Regardless of which method we use, it is possible to achieve partial disclosure. As for most of the perturbation approach methods, it is also possible for randomizing to balance security against precision. The parameters which influence the security (and precision) and can be set by the DBA are $v$, the number of records to be added to the query set, and $j$, the parameter which is used in the method to solve the problem of accuracy (see Section 3.2).
The problem of statistical disclosure has to be considered only in the cases where answers to the same query issued several times differ. For these (perturbation) methods one can achieve a better estimate of the real answer to the query using the method of filtering. The idea is that the user repeats the same query several times. Let $A_n$ denote the perturbed response to the $n$-th repetition of the query with the true response $a$. In general $A_i \neq A_j$ for $i \neq j$. Then the user can repeat the query $m$ times and compute the average

$$a' = \frac{A_1 + A_2 + \cdots + A_m}{m}.$$

The result of this expression will converge to a certain value $a^*$ with increasing $m$. If the perturbed response (after one repetition) is $A$ then we have

$$\mathrm{Var}(a') = \frac{\mathrm{Var}(A)}{m},$$

if the answers to repeated queries are independent. Thus, we see that the more times one repeats the query, the better an estimate $a'$ of the true response $a$ can be achieved.
Let us see how many queries ($m_r$) we have to issue if we want to get statistical disclosure of the SDB protected by the basic randomizing method, if the criterion for the statistical disclosure is

$$\mathrm{Var}(a') < c_1^2,$$

where $c_1$ is the parameter set by the DBA. Let us consider a fixed query of type average with the true value $a$. The perturbed value of that query is

$$A = \frac{a \cdot k + DK_s}{k + 1},$$

where $k$ is the size of the original query set and $DK_s$ is the value of the confidential attribute $DK$ of the randomly selected record. If the random number generator we use is uniform, then we can expect that on the average the value for $DK_s$ is equal to the average of the values over the database ($DK^*$). Thus the expected value of $A$ is

$$E(A) = \frac{a \cdot k + DK^*}{k + 1}.$$

Now we can compute the variance of $A$:

$$\mathrm{Var}(A) = E\left[\left(\frac{a \cdot k + DK_s}{k + 1} - \frac{a \cdot k + DK^*}{k + 1}\right)^2\right] = \frac{E\left[(DK_s - DK^*)^2\right]}{(k + 1)^2} = \frac{\mathrm{Var}(DK)}{(k + 1)^2}.$$

Because the responses to the queries are independent of each other, we have

$$\mathrm{Var}(a') = \frac{\mathrm{Var}(A)}{m} = \frac{\mathrm{Var}(DK)}{m \cdot (k + 1)^2}.$$

As we expected, the variance of the estimate depends on the variance of the confidential attribute and it is smaller for larger query set sizes. Thus we have:

$$m_r > \frac{\mathrm{Var}(DK)}{c_1^2 \cdot (k + 1)^2}.$$
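The filtering attack and the bound on the number of repetitions can be illustrated with a short simulation. This is a sketch under stated assumptions: the function names are hypothetical, `c` stands for the disclosure threshold set by the DBA, the basic method with one added record is used, and the added record is drawn uniformly from the whole database.

```python
import math
import random
from typing import Sequence

def queries_needed(var_dk: float, k: int, c: float) -> int:
    """Smallest integer m with Var(DK) / (m * (k + 1)**2) < c**2."""
    bound = var_dk / (c ** 2 * (k + 1) ** 2)
    return math.floor(bound) + 1

def filtered_estimate(values: Sequence[float],
                      query_idx: Sequence[int],
                      m: int,
                      rng: random.Random) -> float:
    """Average of m independent randomized responses to the same average query."""
    k = len(query_idx)
    true_sum = sum(values[i] for i in query_idx)
    responses = [(true_sum + values[rng.randrange(len(values))]) / (k + 1)
                 for _ in range(m)]
    return sum(responses) / m

# Hypothetical numbers: Var(DK) = 100, query set size k = 9, threshold c = 0.5.
print(queries_needed(100.0, 9, 0.5))
```

With many repetitions the filtered estimate settles near $E(A) = (a \cdot k + DK^*)/(k + 1)$, which is the convergence behaviour the improved method described next is meant to break.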

Even if the number of queries needed to compromise an SDB can be quite large, it is possible to do it. The reason is that with increasing $m$ the average introduced noise of the query always converges to the same value $DK^*$. In order to avoid this one can use the following method.
For each issued query two calls to the random number generator are made, and therefore two indices for the additional record are proposed. The selection of the single record to be added to the query set depends on the values of the confidential attribute of the records which are already in the query set. Let us say that the two proposed indices are $x_1$ and $x_2$. Then we choose $\max\{x_1, x_2\}$ if the value of the boolean expression

$$E = (DK_{i_1} < DK_{i_2}) \oplus (DK_{i_2} < DK_{i_3}) \oplus \cdots \oplus (DK_{i_{k-1}} < DK_{i_k})$$

is true, and $\min\{x_1, x_2\}$ otherwise. Here $\oplus$ denotes the XOR operator. Thus the average introduced noise may differ for two different queries and therefore the database cannot be compromised.
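The two-candidate selection rule can be sketched as follows. The function names are hypothetical, and the sketch assumes that the records of the query set are taken in the order in which their indices are listed, since the value of the XOR expression depends on that order.

```python
import random
from functools import reduce
from operator import xor
from typing import Sequence

def comparison_parity(qs: Sequence[float]) -> bool:
    """XOR of the comparisons (qs[0] < qs[1]), (qs[1] < qs[2]), ..."""
    return reduce(xor, (a < b for a, b in zip(qs, qs[1:])), False)

def select_extra_index(values: Sequence[float],
                       query_idx: Sequence[int],
                       rng: random.Random) -> int:
    """Propose two random indices and let the query set decide between them."""
    x1 = rng.randrange(len(values))
    x2 = rng.randrange(len(values))
    e = comparison_parity([values[i] for i in query_idx])
    return max(x1, x2) if e else min(x1, x2)

# Hypothetical usage: the same two proposals lead to different choices
# for query sets whose comparison parity differs.
data = [1.0, 2.0, 1.0, 5.0]
print(select_extra_index(data, [0, 1, 2], random.Random(7)))
print(select_extra_index(data, [0, 1, 3], random.Random(7)))
```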
3.2 Accuracy of Responses
The problem of the accuracy of responses occurs when we use perturbation methods. With query restriction methods, the responses to queries are always equal to the true responses. We consider two criteria, namely bias and precision, i.e., the variance of the estimator.
As stated in [2] the main disadvantage of the randomizing method in comparison with other output perturbation methods is the problem of bias. A bias occurs when

$$E(a \mid A = w) \neq w$$

where again $a$ is the true response to a fixed query and $A$ is the perturbed value for that query. In our case of the randomizing method and for queries of type average, we have

$$a = \frac{A \cdot (k + 1) - DK_s}{k}.$$

From this we can compute

$$E(a \mid A = w) = \frac{(k + 1) \cdot w}{k} - \frac{E(DK_s \mid A = w)}{k}.$$

Since the selection of the random record does not depend on the query (for the basic method), we can obtain the final result

$$E(a \mid A = w) = \frac{(k + 1) \cdot w}{k} - \frac{DK^*}{k} = w + \frac{w - DK^*}{k}$$

where $DK^*$ is again the average over the whole database. It follows that in the limit, $k \to \infty$, the bias will be zero.
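As a purely illustrative check of the bias expression $E(a \mid A = w) = w + (w - DK^*)/k$, consider hypothetical numbers (not taken from the paper): a database average $DK^* = 30$ and an observed response $w = 50$.

```latex
\[
k = 4:\quad E(a \mid A = 50) = \frac{5 \cdot 50}{4} - \frac{30}{4} = 62.5 - 7.5 = 55,
\]
\[
k = 99:\quad E(a \mid A = 50) = 50 + \frac{50 - 30}{99} \approx 50.2 .
\]
```

In this example the bias shrinks from 5 to about 0.2 as the query set grows, in line with the limit $k \to \infty$.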
The variance of the perturbed value $A$ for randomizing is $\mathrm{Var}(DK)/(k + 1)^2$. So, for large values of $k$ (query set size) this method gives us quite good results, which means precise and without bias. The only parameter which influences the precision is the query set size; the variance of the perturbed value is proportional to the variance over the whole database.
The problem of accuracy is that the maximal error introduced by randomizing can be arbitrarily bad [17]. The average error is not so bad, but the maximal error can be very unpleasant, especially for smaller $k$.
This problem can be solved if we do not permit the additional value to be very different from the values of the records in the query set. Thus, for a query of type average we stipulate that the chosen record satisfy the condition

$$avg - \frac{mx + mn}{2 \cdot j} \le DK_s \le avg + \frac{mx + mn}{2 \cdot j}$$

where $s$ is the index of the selected record, $avg$, $mx$ and $mn$ are the average, maximal and minimal values of the confidential attribute in the specific query set, and $j$ is a parameter set by the DBA. If the first selected record does not satisfy the condition, then we select another record, and so on. Of course we must set a limit on the number of repetitions. Here the selection of $j$ is essential. If $j$ is too small then we do not restrict randomizing; on the other hand if $j$ is too large, possibly no record will satisfy the condition.
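The restricted selection can be sketched as a simple rejection loop. The function name and the `max_tries` limit are hypothetical, the interval is taken as avg ± (mx + mn)/(2j) with inclusive bounds, and what happens when the limit is reached is an assumption of this sketch (it returns `None` and leaves the fallback to the caller), since the text only says that a limit must be set.

```python
import random
from typing import Optional, Sequence

def restricted_extra_index(values: Sequence[float],
                           query_idx: Sequence[int],
                           j: float,
                           rng: random.Random,
                           max_tries: int = 100) -> Optional[int]:
    """Draw random records until one lies close enough to the query set average."""
    qs = [values[i] for i in query_idx]
    avg = sum(qs) / len(qs)
    half_width = (max(qs) + min(qs)) / (2 * j)   # larger j -> narrower interval
    for _ in range(max_tries):
        s = rng.randrange(len(values))
        if avg - half_width <= values[s] <= avg + half_width:
            return s
    return None  # assumption: no admissible record found within the limit

# Hypothetical usage: the outlier 1000.0 is never accepted for j = 11.
data = [10.0, 12.0, 11.0, 1000.0, 9.0]
print(restricted_extra_index(data, [0, 1, 2], j=11, rng=random.Random(3)))
```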
It is difficult to say which of the perturbation methods is more precise, because the precision depends to a great deal on the selection of the parameters of a given method.
All data perturbation methods, except the fixed data perturbation method for categorical attributes, suffer from the problem of bias. On the other hand, among the output perturbation approaches only randomizing has this problem. This problem might be considerable for small databases with high variance of the confidential attribute.


3.3 Consistency 
A security control method is consistent if there are no contradictions or paradoxical results. Examples of contradictions are different responses to repetitions of the same query, or a response to an average query that differs from the quotient of the responses to the sum and count queries over the same query set.
All query restriction methods are consistent. The only thing we have to take care of is the possibility that the same query is restricted on one occasion and permitted on another (e.g., with query set overlap control). Data perturbation methods also do not give contradictory results, but we can obtain some paradoxical results, such as negative salaries.
On the other hand, all probability distribution methods, random sample queries and varying output perturbation methods are inconsistent. Since the randomizing method belongs to the random sample queries methods it is inconsistent too. But we can overcome this problem if we use quasi-randomizing [18] instead of the basic method described in [17].
In order to select a random record to be added to the query set we use a random number generator. Each random number generator is a function of a parameter seed, and for the same seed the same sequence of random numbers is generated. So, when we want to generate a random index of a record, we can use as a seed a function of the query set; thus for the same query set the randomly selected index will always be the same. The requirement for the function which maps a query set into a seed is that it does not change for any permutation of the query set. A simple solution is the sum of the values of the confidential attribute. If

$$QS = \{DK_{i_1}, \ldots, DK_{i_k}\},$$

then

$$seed(QS) = DK_{i_1} + \cdots + DK_{i_k}$$

or

$$s = g(Rand(DK_{i_1} + \cdots + DK_{i_k}))$$

where $s$ is the randomly selected index and $g$ is some function which maps random numbers into the set of indices of the records in the database. Another advantage of this method is that statistical disclosure is not possible, because responses to the same query issued several times are always the same. The problem of paradoxical values for randomizing is not as severe as for the fixed data perturbation method.
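Seeding the generator with the sum over the query set can be sketched as follows. The function names are hypothetical; `random.Random(seed).randrange(n)` plays the role of the composition of Rand and g, and `math.fsum` is used so that the floating-point sum does not depend on the order of the records, which is an implementation choice of this sketch rather than part of the method in [18].

```python
import math
import random
from typing import Sequence

def quasi_random_index(values: Sequence[float], query_idx: Sequence[int]) -> int:
    """Index of the extra record, determined only by the query set itself."""
    seed = math.fsum(values[i] for i in query_idx)   # permutation-invariant seed
    return random.Random(seed).randrange(len(values))

def quasi_randomized_average(values: Sequence[float],
                             query_idx: Sequence[int]) -> float:
    """Randomized average (one added record) that repeats for the same query set."""
    s = quasi_random_index(values, query_idx)
    total = math.fsum(values[i] for i in query_idx) + values[s]
    return total / (len(query_idx) + 1)

# Hypothetical usage: repeating or reordering the query gives the same response.
data = [3.5, 1.25, 7.0, 2.0, 9.5]
print(quasi_randomized_average(data, [0, 1, 2]) ==
      quasi_randomized_average(data, [2, 0, 1]))
```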


3.4 Robustness 
We say that a security control method is robust if supplementary knowledge does not help a user who wants to compromise the SDB. Supplementary knowledge is considered to be all the information about the database which a user knows from a source other than the system [2]. In general the robustness of the query restriction approach methods is very low, since the responses to queries for these methods are always correct. So with supplementary knowledge about the database one can easily compute other values. Robustness can be controlled for some methods such as partitioning [26]. The problem of robustness is very severe for auditing because queries with very small query set sizes may be permitted.
The perturbation methods are more robust, clearly because perturbed answers are returned to the user. Their robustness varies from moderate (in most cases) to high for the data swapping method and can usually be controlled by the parameters of a particular method.
The robustness of the randomizing method is high. In fact, if the number of elements known by the user is small in comparison to the query set size, the SDB is still secure. As stated in [16, p. 15], even when the number of known elements is approximately equal to or larger than $k$ (the query set size), the number of repetitions of queries one has to issue in order to compromise the database is quite large. This is even more pronounced if quasi-randomizing is used.
3.5 Cost
We consider the cost of the implementation of the security control mechanism, the processing overhead per query and the education of the user necessary to understand the responses.
For randomizing we can say that the initial implementation effort is very low for the basic method and somewhat higher for the mechanisms which provide higher security (see Section 3.1), higher accuracy (see Section 3.2) and consistency (see Section 3.3).
Some methods have very low processing overhead, for example query set size control, cell suppression, all data perturbation methods and random sample queries. On the other hand there are methods such as query set overlap control and auditing, which are time and space consuming. Even more disturbing than the average complexity of comparisons is the fact that the processing overhead is much larger for the queries issued later than for the queries issued at the beginning.
The randomizing method has low processing overhead, the time overhead required per query is constant and there is no need for additional space. From this point of view all variants of randomizing are basically low in cost.
The randomizing method follows the majority of the methods regarding the cost of the education of the user, which means that it is low in cost, being simple to understand.


3.6 Generality 
Some of the methods described in [2] are not developed for all types of aggregate statistic queries. For example, the varying output perturbation is developed only for sum, count and percentile statistics.
The randomizing method can be used for almost all types of queries, with the exception of count. But there are obviously some differences between the usage of randomizing for statistics such as sum or average and the usage of the same method for selector functions such as median, min, and max.
While the precision for queries of type average is good, i.e., $\mathrm{Var}(A) = \mathrm{Var}(DK)/(k + 1)^2$, we cannot say the same for the selector functions. In the worst case for the max and min queries the difference between the true response and the randomized response can be as large as the difference between the maximal and minimal value of the confidential attribute in the database, $\max_{i=1,\ldots,N} DK_i - \min_{i=1,\ldots,N} DK_i$. On the other hand, a user can also get the true response. If we choose an arbitrary query set of size $k$, then the probability that the answer given to the user will be exact is equal to

$$\sum_{m=k}^{N} \frac{\binom{m-1}{k-1}}{\binom{N}{k}} \cdot \frac{m}{N}.$$


In order to solve this problem one can use the restricted randomizing method. However, we know that the value of the perturbed response to a query of type max is greater than or equal to the true response. Thus, the problem of bias is more severe here.
The situation is the same for queries of type min, but not for queries of type median: here the answer will change often, but the error is usually small and depends on the variance of the database and the selection of the query set.
Since there are few perturbation methods which can be used for all types of queries, we can consider randomizing as a general method, even though it is not equally good for all types of queries.
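The probability that a randomized max query still returns the true maximum can be checked by brute force on a small example. The sketch below rests on assumptions that are not spelled out in the text: all values are distinct, one record (v = 1) is added, every record of the database (including those already in the query set) is equally likely to be chosen, and every query set of size k is equally likely. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def prob_exact_max(n: int, k: int) -> Fraction:
    """Exact probability that a randomized max query returns the true maximum."""
    values = range(n)                 # distinct values; only their order matters
    hits = 0
    total = 0
    for qs in combinations(values, k):
        true_max = max(qs)
        for extra in values:          # each record is equally likely to be added
            total += 1
            hits += max(true_max, extra) == true_max
    return Fraction(hits, total)

print(prob_exact_max(3, 2))
```

For n = 3 and k = 2 the enumeration gives 8/9. Under the same assumptions the result agrees with k(n + 1)/((k + 1)n), so the chance of an exact max response approaches k/(k + 1) for large databases.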
3.7 Suitability 
It is desirable that a method be applicable to both numeric and categorical attributes. All query restriction approaches are able to deal with both. Two fixed data perturbation methods have been developed, one for numeric attributes [28] and the other for categorical attributes [2]. The probability distribution and the random sample queries methods can be used for both. The randomizing method is not suitable for categorical attributes, but only for numeric ones.
In addition, some methods are suitable for more than one attribute and others for only one attribute. Again, all the query restriction methods are basically suitable for more than one attribute. The only problem is with auditing, where the processing overhead can become very high, so that this method may have no practical value. Among the perturbation methods the following are suitable for more than one attribute: data swapping, the analytical method, random sample queries, the varying output perturbation (if the attributes are independent), and randomizing.
The last criterion is suitability for dynamic SDBs. For some purposes an SDB can be static, for example a census database. For other purposes it is very important that the database be dynamic and on-line. For such a database the method used must provide security also when the database changes. Moreover, it must not require too much processing overhead per change in the database. Randomizing is completely suitable for on-line dynamic databases and requires neither additional implementation effort nor any processing overhead per change in the database. There are some methods which are not suitable for on-line dynamic SDBs at all: cell suppression, data swapping, and probability distribution. Note that all the output perturbation methods are suitable for on-line dynamic SDBs.

3.8 Information Loss
Information loss is defined as the "amount of non-confidential information that is unnecessarily eliminated, as well as, in the case of perturbation methods, the statistical quality of the information provided to the users" [2]. This means that in the case of perturbation methods the term information loss corresponds to precision. In other words: the higher the precision, the lower the information loss. This also applies to the randomizing method.
The situation is different for the query restriction methods. Information loss can be very high for the query set size restriction and for the query overlap restriction approaches.
The auditing method restricts by its nature only queries that can lead to compromise. Following the definition of information loss (see above) we can say that this method causes no information loss. However, the price that must be paid for this is very high (see Section 3.5).
It is more difficult to give an estimate of information loss for the partitioning and cell suppression methods because it depends on the data in the database. Some empirical results (see [26]) show that information loss can be very severe for partitioning. In order to reduce it, dummy records have been proposed.
3.9 Conclusions

In summary we can say that the randomizing method has more than one advantage in comparison with other methods. It assures high security, robustness, and precision if the improvements described in this section are used. Another very important advantage is that the method is among the lowest in cost.
The disadvantages are: its bias (which, however, tends to be small for large query sets); it is not suitable for categorical data; and it is not suitable for some types of queries.
References 
[1] Achugbue J.O. and F.Y. Chin. "The effectiveness of output modification by rounding for protection of statistical databases." INFOR Journal, Vol. 17, No. 3, August 1979, pp. 209-218.
[2] Adam N.R. and J.C. Wortmann. "Security-Control Methods for Statistical Databases: A Comparative Study." ACM Computing Surveys, Vol. 21, No. 4, December 1989, pp. 515-556.
[3] Beck L.L. "A security mechanism for statistical databases." ACM Trans. Database Syst., Vol. 5, No. 3, September 1980, pp. 316-338.
[4] Chin F.Y. and G. Ozsoyoglu. "Security in partitioned dynamic statistical databases." In Proceedings of the IEEE COMPSAC, 1979, pp. 594-601.
[5] Chin F.Y. and G. Ozsoyoglu. "Statistical database design." ACM Trans. Database Syst., Vol. 6, No. 1, March 1981, pp. 113-139.
[6] Cox L.H. "Suppression methodology and statistical disclosure control." Journal of American Statistical Association, Vol. 75, No. 370, June 1980, pp. 377-385.
[7] Denning D.E. "Secure statistical databases with random sample queries." ACM Trans. Database Syst., Vol. 5, No. 3, September 1980, pp. 291-315.
[8] Denning D.E., P.J. Denning and M.D. Schwartz. "The tracker: A threat to statistical database security." ACM Trans. Database Syst., Vol. 4, No. 1, March 1979, pp. 76-96.
[9] Denning D.E. and J. Schlorer. "Inference control for statistical databases." Computer, Vol. 7, No. 16, July 1983, pp. 69-82.
[10] Dobkin D., A.K. Jones and R.J. Lipton. "Secure databases: Protection against user influence." ACM Trans. Database Syst., Vol. 4, No. 1, March 1979, pp. 97-106.
[11] Fellegi I.P. "On the question of statistical confidentiality." Journal of American Statistical Association, Vol. 67, No. 337, March 1972, pp. 7-18.
[12] Friedman A.D. and L.J. Hoffman. "Towards a fail-safe approach to secure databases." In Proceedings of the IEEE Symposium on Security and Privacy, 1980.
[13] Hoffman L.J. and W.F. Miller. "Getting a personal dossier from a statistical data bank." Datamation, Vol. 16, No. 5, May 1970, pp. 74-75.
[14] Jaklič J. "Protecting statistical databases by randomizing and other methods: Comparison and simulation." Master's thesis, Dept. of Computer Science, University of Houston, December 1992.
[15] Lefons D., A. Silvestri and F. Tangorra. "An analytic approach to statistical databases." In Proceedings of 8th International Conference on Very Large Databases, 1983, pp. 260-273.
[16] Leiss E.L. "Protecting statistical databases through randomizing." Technical Report #UH-CS-81-07, University of Houston, December 1981.
[17] Leiss E.L. "Randomizing, a practical method for protecting statistical databases against compromise." In Proceedings of 8th International Conference on Very Large Databases, 1982, pp. 189-196.
[18] Leiss E.L. "Principles of data security." Plenum Press, 1982.
[19] Liew C.K., W.J. Choi, and C.J. Liew. "A data distortion by probability distribution." ACM Trans. Database Syst., Vol. 10, No. 3, pp. 395-411.
[20] Matloff N.E. "Another look at the use of noise addition for database security." In Proceedings of the IEEE Symposium on Security and Privacy, 1986.
[21] Morris J.L. "Computational methods in elementary numerical analysis." John Wiley & Sons, 1983.
[22] Palley M.A. and J.S. Simonoff. "The use of regression methodology for compromise of confidential information in statistical databases." ACM Trans. Database Syst., Vol. 12, No. 4, December 1987, pp. 593-608.
[23] Reiss S.P. "Practical data-swapping: The first steps." ACM Trans. Database Syst., Vol. 9, No. 1, March 1984, pp. 20-37.
[24] Reiss S.P. "Practical data-swapping: The first steps." In Proceedings of the IEEE Symposium on Security and Privacy, 1980.
[25] Schlorer J. "Information loss in partitioned statistical databases." Comput. Journal, Vol. 26, No. 3, 1983, pp. 218-223.
[26] Schlorer J. "Disclosure from statistical databases: Quantitative aspects of trackers." ACM Trans. Database Syst., Vol. 5, No. 4, December 1980, pp. 467-492.
[27] Schlorer J. "Confidentiality of statistical records: A threat monitoring scheme of on-line dialogue." Methods Inform. Med., Vol. 1, No. 15, 1976, pp. 36-42.
[28] Traub J.F., Y. Yemini and H. Wozniakowski. "The statistical security of a statistical database." ACM Trans. Database Syst., Vol. 9, No. 4, December 1984, pp. 672-679.