https://doi.org/10.31449/inf.v43i2.2133

Informatica 43 (2019) 281–289

A Comparative Study of Automatic Programming Techniques

Sibel Arslan and Celal Öztürk
Erciyes University, Engineering Faculty, Computer Engineering Department, Kayseri, Turkey
E-mail: sibel.arslan2@icisleri.gov.tr, celal@erciyes.edu.tr

Keywords: automatic programming, genetic programming, artificial bee colony programming, symbolic regression, prediction, feature selection

Received: January 3, 2018

Automatic programming, an evolutionary computing technique, generates programs automatically from high-level specifications that are easier to state than code in conventional programming languages. Genetic Programming (GP) is the first and best-known automatic programming technique and has been applied to many practical problems. Artificial Bee Colony Programming (ABCP) is one of the most recently proposed automatic programming methods and combines an evolutionary approach with swarm intelligence. GP is an extension of the Genetic Algorithm (GA), while ABCP is based on the Artificial Bee Colony (ABC) algorithm. The main difference between these automatic programming techniques and their underlying algorithms (GA and ABC) is how solutions are modeled. In ABC, as in GA, solutions are represented as fixed-size code blocks. In GP and ABCP, solutions (the positions of food sources in ABCP) are expressed as tree structures composed of combinations of terminals and functions defined specifically for the problem. This paper presents a review of GP and ABCP and applies them to symbolic regression, prediction, and feature selection problems, which are widely tackled by researchers. The comparison of ABCP results with GP results shows that ABCP is a powerful optimization technique for structural design.

Povzetek: A comparative analysis of automatic programming techniques is presented.

1 Introduction

Computer programming is the process of producing a program that can be executed by a machine, using the necessary information to perform a task. Automatic programming is a computer programming technique that generates the program code automatically [1]. Like machine learning methods such as Artificial Neural Networks (ANN), Decision Trees (DT), and Support Vector Machines (SVM), automatic programming techniques such as Genetic Programming (GP) and Artificial Bee Colony Programming (ABCP) provide practical solutions to many problems. GP, the most well-known automatic programming method, was developed by Koza [2]. GP is an extension of the Genetic Algorithm (GA), and the basic steps of GP are similar to those of GA. ABCP is a recently proposed automatic programming technique based on the Artificial Bee Colony (ABC) algorithm [3]. The goal of this paper is to evaluate the success of the models obtained by these automatic programming methods on symbolic regression, prediction, and feature selection problems, and to review papers related to these problems.

Symbolic regression is a type of regression problem that seeks the mathematical model that best fits the data in terms of accuracy and complexity. Several works have investigated symbolic regression with automatic programming techniques, mostly with GP [3-10]. ABCP was first proposed as a new method for the symbolic regression problem and compared with GP [3]. Faris applied GP to a symbolic regression problem, estimating the parameters of the nonlinear regression curve of a metal cutting tool; the GP model was compared with least squares estimation, GA, and particle swarm optimization models [4].
According to the benchmarks in that paper, GP delivered superior performance. In [5], two versions of GP (standard GP and multi-population GP) were compared with ANN on a pharmaceutical-formulation symbolic regression problem. Compared to successful ANN models, the GP models provided a significant advantage: parametric equations that can be interpreted and analyzed more easily. Gene Expression Programming (GEP) [6], Immune Programming (IP) [7], and Ant Colony Programming (ACP) [8-10] are other automatic programming techniques that have been used for symbolic regression. GEP, which builds on both GA and GP, is flexible in its genetic operations because it combines linear code blocks with parse trees [6]. IP is based on the Artificial Immune System (AIS) and is a domain-independent approach [7] in which antigens represented as tree structures express solutions, similarly to GP. ACP [8, 9] and Dynamic Ant Programming (DAP), a dynamic version of ACP [10], are the main examples inspired by the ant colony algorithm.

Automatic programming techniques have also been applied to prediction problems, where most applications are based on evolutionary optimization techniques [11-19]. Seidy proposed a new stock market prediction model using Particle Swarm Optimization with the Center of Mass Technique (PSOCoM), which achieved higher prediction accuracy than other particle swarm optimization based models [11]. Manjusha et al. used the Naive Bayes and J48 algorithms to diagnose potentially fatal dermatological diseases with similar symptoms [12]. They developed an interface in which the probability of recurrence of each disease is predicted. In [13], Box-Jenkins (BJ) models and ANN were used together to model monthly water consumption in Kuwait; the input-layer variables of the neural network were obtained with BJ, the average error was considered, and more accurate results were obtained than with traditional methods. The Ant Colony Optimization (ACO) technique was used to generate qualitative bankruptcy prediction rules [15], and the Association Rule Miner (ARM) technique was used to group rules and eliminate irrelevant ones. Etemadi et al. compared Multiple Discriminant Analysis (MDA) and GP for bankruptcy prediction modeling [18]; the GP model produced more accurate results than the MDA model with respect to both the quality of the sample companies and the estimation period. Searson et al. demonstrated multigene GP with an application in which a predictive symbolic Quantitative Structure Activity Relationship (QSAR) model of T. pyriformis aqueous toxicity was as successful as existing QSAR models on the same data [19].

In recent research, the growing number of features in data sets has made feature selection methods necessary. These methods eliminate noisy and redundant features from collected data so that the data set can be represented more reliably while classifiers achieve high success rates. Various optimization methods have been applied to feature selection problems [20-33]. Rodriguez-Lujan et al. proposed a feature selection technique based on quadratic function optimization for multiclass problems [22]; it was found more efficient than Maximal Relevance (MaxRel) and minimal-Redundancy-Maximal-Relevance (mRMR) on large data sets.
In [23], statistical and entropy-based feature ranking methods were compared on different data sets; it was shown that classifier accuracy is influenced by the choice of ranking index. Brown et al. investigated the implicit statistical assumptions of feature selection criteria based on mutual information [24]. They derived an objective function from the conditional likelihood of the training labels. When the results were evaluated, the Joint Mutual Information (JMI) criterion provided the best balance of accuracy, stability, and flexibility for small data sets. In [27], a two-stage technique for Named Entity Recognition (NER) based on differential evolution was proposed. In the first stage, Conditional Random Field (CRF) and SVM classifiers were used for feature selection; in the second stage, classifiers were selected according to their F-measure scores and combined using a differential evolution based classifier ensemble technique. The technique was more successful than other traditional methods. Yu et al. showed that GP can be used as a feature selector and cancer classifier [30]. GP selected discriminative genes and expressed the relations between the genes as mathematical equations, evidence that GP can be used in this field. In addition, GP classifiers trained on the training sets and tuned on the validation set successfully classified tumor classes and were more successful than various classification methods. k-Nearest Neighbor (k-NN) and GP-based decision trees were applied to feature selection and compared in terms of classification performance in [31]. Abu-Arqub et al. proposed an algorithm based on GA for solving nonlinear systems of second-order boundary value problems [32]; the results show that the algorithm is effective and convenient for both linear and nonlinear cases, requiring fewer generations and less computation time. The Continuous Genetic Algorithm (CGA) was introduced for solving systems of second-order boundary value problems [33]. That paper studied the influence of different parameters, including the initialization method, the selection method, the rank-based ratio, the evolution of nodal values, the population size, the crossover and mutation probabilities, the step size, and the maximum nodal residual; the algorithm performed better than several modern methods.

GP and ABCP are successful automatic programming techniques based on GA and ABC, respectively. In this paper, we compare GP and ABCP on the main applications of automatic programming: symbolic regression, prediction, and feature selection. The organization of the paper is as follows: GP is described in Section 2, ABCP is introduced in Section 3, the experimental design is presented in Section 4, and the results are discussed in Section 5. Section 6 concludes the paper and outlines future work.

2 Genetic programming

GP, the most well-known automatic programming method, expresses solutions as tree structures. The trees are randomly generated up to a previously determined tree depth. Tree nodes are drawn from terminals (constants or variables such as x, y, 5) and functions (operators such as +, -, /, max). The representation of a tree is shown in Fig. 1 [34]. The root node connects to branches, each of which may consist of more than one component. In all cases, the solution model is obtained by evaluating the entire tree.

Figure 1: Representation of tree.
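To make this representation concrete, the following minimal Python sketch (not from the paper) shows one way such a solution tree can be encoded and evaluated. The Node class, the tiny primitive set, and the worked example, the expression f(x) = x^2 + 2x + cos(3x) that appears later as Eq. (1), are illustrative assumptions rather than the authors' implementation.

```python
import math

class Node:
    """One node of a GP-style expression tree: a function or a terminal."""
    def __init__(self, symbol, children=()):
        self.symbol = symbol            # function name, constant, or variable
        self.children = list(children)

    def evaluate(self, variables):
        """Recursively evaluate the tree for one input point."""
        if self.symbol == "+":
            return sum(c.evaluate(variables) for c in self.children)
        if self.symbol == "*":
            result = 1.0
            for c in self.children:
                result *= c.evaluate(variables)
            return result
        if self.symbol == "cos":
            return math.cos(self.children[0].evaluate(variables))
        if isinstance(self.symbol, (int, float)):
            return float(self.symbol)   # constant terminal
        return variables[self.symbol]   # variable terminal, e.g. "x"

# f(x) = x^2 + 2x + cos(3x), written as a tree of functions and terminals:
tree = Node("+", [
    Node("*", [Node("x"), Node("x")]),               # x^2
    Node("*", [Node(2), Node("x")]),                 # 2x
    Node("cos", [Node("*", [Node(3), Node("x")])]),  # cos(3x)
])
print(tree.evaluate({"x": 1.5}))  # evaluates f(1.5)
```

As in the paper's description, the model is obtained by evaluating the entire tree from the root down to the terminals.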
A flow chart of GP is given in Fig. 2. An initial population is produced, and the fitness of each solution is assessed according to the chosen fitness function. The individuals in the initial population are produced with the full method, the grow method, or the ramped half-and-half method [35]. In the full method, nodes are selected from the function set until the maximum tree depth is reached. In the grow method, nodes are randomly selected from the set of all terminals and functions. In the ramped half-and-half method, the full and grow methods are each used for 50% of the population, producing trees of various widths and depths [2]; a sketch of these methods is given after this section's flow chart.

GP aims to increase the number of high-quality individuals that survive and to decrease the number of low-quality ones. High-quality individuals are more likely to pass on to the next generation. Almost all selection operators of GA can be used in GP, most commonly tournament selection [36]. Individuals are evolved with operators such as reproduction, crossover, and mutation. The crossover operator combines two selected individuals to produce new ones: subtrees rooted at two randomly selected crossover points of the parent trees are exchanged to obtain new trees. The mutation operator introduces novel, unexplored genetic material: a node or subtree selected from the tree is usually replaced by a randomly generated one. The best individuals of the previous generation are transferred to the current generation through elitism. The stopping criterion checks whether the individuals reach a certain fitness value or whether the predetermined number of generations has been reached; the program terminates when the criterion is satisfied.

Figure 2: The flow chart of Genetic Programming.
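The initialization methods described above can be sketched as follows. This is a minimal illustrative sketch reusing the Node class from the earlier listing, with an assumed primitive set; the rule used by grow to choose between terminals and functions is one common convention, not necessarily the one used in the paper's experiments.

```python
import random

FUNCTIONS = {"+": 2, "*": 2, "cos": 1}   # symbol -> arity (illustrative)
TERMINALS = ["x", 1.0, 2.0, 3.0]

def full(depth):
    """Full method: choose only functions until the maximum depth."""
    if depth == 0:
        return Node(random.choice(TERMINALS))
    f, arity = random.choice(list(FUNCTIONS.items()))
    return Node(f, [full(depth - 1) for _ in range(arity)])

def grow(depth):
    """Grow method: any primitive may be chosen before the depth limit."""
    p_terminal = len(TERMINALS) / (len(TERMINALS) + len(FUNCTIONS))
    if depth == 0 or random.random() < p_terminal:
        return Node(random.choice(TERMINALS))
    f, arity = random.choice(list(FUNCTIONS.items()))
    return Node(f, [grow(depth - 1) for _ in range(arity)])

def ramped_half_and_half(pop_size, max_depth):
    """Half the trees via full, half via grow, over a ramp of depths."""
    population = []
    for i in range(pop_size):
        depth = 2 + i % (max_depth - 1)        # ramp the depth from 2 up
        method = full if i % 2 == 0 else grow
        population.append(method(depth))
    return population
```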
3 Artificial bee colony programming

ABC is a swarm intelligence optimization algorithm that simulates the foraging behavior of honeybees and provides solutions to multi-variable problems [37]. ABCP, based on the ABC algorithm, was first introduced as a new method for symbolic regression [3]. In ABC, the positions of the food sources are represented by fixed-size arrays. In ABCP, the positions of food sources are expressed as tree structures composed of combinations of terminals and functions defined specifically for the problem [41]. The mathematical relationship encoded by the example solution tree in Fig. 3 is given in Eq. (1), where x is the independent variable and f(x) is the dependent variable.

Figure 3: Representation of an example solution in ABCP with tree structure.

f(x) = x² + 2x + cos(3x)    (1)

There are three types of bees in ABCP: employed bees, onlooker bees, and scout bees; each employed bee is responsible for one food source. The position of a food source expresses a solution (a single parse tree). The number of employed bees is equal to the number of onlooker bees. The quality of a food source, in terms of nectar, is expressed through the fitness of the solution. Employed bees search for new food sources and share information about them with the onlooker bees, which tend toward high-quality food sources in line with the information they receive from the employed bees. If a source is abandoned, its employed bee becomes a scout bee that starts to look for a new source randomly.

The exhaustion of food sources is controlled by a parameter called the "limit". For each source, the number of unsuccessful improvement trials is recorded, and in each cycle it is checked whether this number exceeds the "limit" parameter. If a food source is exhausted, the source is abandoned; its employed bee turns into a scout bee and looks randomly for a new source. The algorithmic steps of ABCP are given in detail in Fig. 4. ABCP starts by producing food sources in the initialization phase. As with GP solutions, the food sources are produced with the full method, the grow method, or the ramped half-and-half method [2].

The main difference between ABCP and ABC is the neighborhood mechanism used to generate candidate solutions [3]. A candidate solution v_i is generated from the current solution tree x_i and a neighbor solution x_k randomly chosen from the population; a node of x_k is selected, considering a predetermined probability p_ip, and this node determines what information will be shared with the current solution and how much of it will be shared. The sharing mechanism is shown in Fig. 5: Figures 5a and 5b show the current solution x_i and the neighbor x_k, respectively, while the neighboring information and the generated candidate solution are given in Figures 5c and 5d. If the quality of the candidate solution v_i is better than that of the current solution x_i, v_i replaces it; otherwise x_i is kept.

Figure 4: The flow chart of Artificial Bee Colony Programming.

Figure 5: Example of the information sharing mechanism in ABCP.

Employed bees share the information they have gained during their search with the onlooker bees, which select sources according to the probability values calculated by Eq. (2), depending on the nectar amounts of the sources:

p_i = 0.9 · (fit_i / fit_best) + 0.1    (2)

where fit_i is the quality of solution i and fit_best is the quality of the best current solution. The higher the quality of a source, the higher the probability of selecting it. After selecting the sources to search, the onlooker bees look for new sources in the same way as the employed bees. The nectar amount of a newly found source is checked; if the new source has more nectar, it is taken into memory and the old source is deleted from memory. The onlooker bees therefore apply the same greedy selection as the employed bees. In ABCP, the penalty counter of a source is increased by one whenever an employed bee or onlooker bee fails to find a better source, and it is reset when a better source is discovered. Once all the employed and onlooker bees have completed their search operations in a cycle, the penalty counters of the sources are checked [42].
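A simplified Python sketch of one ABCP cycle is given below, assuming the Node and grow definitions from the earlier listings. The names quality() and share_information() are placeholders: quality is assumed to map a tree to a fitness where larger is better (for example 1 / (1 + RMSE)), and share_information(xi, xk) is assumed to implement the neighborhood mechanism of Fig. 5. The probability gate below is a simplification of the proportional selection of Eq. (2), and the employed-bee phase is omitted for brevity; this is not the authors' implementation.

```python
import random

def selection_probability(fit_i, fit_best):
    """Eq. (2): p_i = 0.9 * fit_i / fit_best + 0.1."""
    return 0.9 * fit_i / fit_best + 0.1

def abcp_onlooker_cycle(sources, trials, limit, quality, share_information):
    """One simplified onlooker/scout cycle over the food sources."""
    qualities = [quality(s) for s in sources]
    fit_best = max(qualities)
    for i, xi in enumerate(sources):
        # Visit source i with the probability of Eq. (2) (simplified gate).
        if random.random() > selection_probability(qualities[i], fit_best):
            continue
        xk = random.choice(sources)            # random neighbour solution
        candidate = share_information(xi, xk)  # Fig. 5 mechanism (placeholder)
        if quality(candidate) > qualities[i]:  # greedy selection
            sources[i] = candidate
            trials[i] = 0                      # improvement: reset penalty
        else:
            trials[i] += 1                     # penalty point
    for i in range(len(sources)):
        if trials[i] > limit:                  # exhausted source is abandoned;
            sources[i] = grow(random.randint(2, 6))  # a scout grows a new tree
            trials[i] = 0
    return sources
```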
4 Experimental design

In this section, three experiments with benchmark data sets are studied; the performance of GP and ABCP is compared using similar parameter values, and the results are discussed.

4.1 Experiments

In the first experiment, the performance of the models evolved by GP and ABCP is evaluated on symbolic regression. The training data set is generated from the 4-input nonlinear Cherkassy function expressed in Eq. (3) [38]:

y = exp(2 x1 sin(π x4)) + sin(x2 x3)    (3)

The objective of GP and ABCP is to evolve a symbolic function of x1, x2, x3, and x4 that closely approximates y.

In the second experiment, the output values of the pH data set [39] are predicted. The data come from a simulation of a pH neutralization process with one output (pH), which is a nonlinear function of the four inputs.

The concrete compressive strength data set [40] is used in the last experiment to study the feature selection performance of GP and ABCP. Concrete compressive strength is a highly nonlinear function of the input values. The output modeled by the automatic programming methods is the compressive strength, and the independent variables are cement (x1), blast furnace slag (x2), fly ash (x3), water (x4), superplasticizer (x5), coarse aggregate (x6), fine aggregate (x7), and age (x8). The noise added to the concrete data set consists of 50 input variables (x9, x10, …, x58) with random values in the range [-500, 500]. The numbers of inputs and instances of the problems are shown in Table 1.

| Name      | #Inputs | #Total Instances | #Training Instances | #Test Instances | Noise |
|-----------|---------|------------------|---------------------|-----------------|-------|
| Cherkassy | 4       | 500              | 400                 | 100             | -     |
| pH        | 4       | 990              | 700                 | 299             | -     |
| Concrete  | 8       | 1030             | 773                 | 257             | 50 input variables [-500, 500] |

Table 1: Benchmark problems.

4.2 Fitness function and parameters

The performance of the models is evaluated by the Root Mean Square Error (RMSE) on both the training set and the test set. The fitness function is shown in Eq. (4):

fitness = sqrt( (1/n) · Σ_{t=1}^{n} (y_pred − y_actual)² )    (4)

where n is the data size, y_actual is the y value from the data set, and y_pred is the y value predicted by the obtained solution. The complexity of a solution is calculated as in Eq. (5), in proportion to the depth of the tree and the number of nodes:

C = Σ_{k=1}^{d} n_k · k    (5)

where C is the tree complexity, d is the depth of the solution, and n_k is the number of nodes at depth k.

The parameters for GP and ABCP are summarized in Table 2. The add3 function is the sum of three variables (x1 + x2 + x3) and the mult3 function is the product of three variables (x1 · x2 · x3). The rdivide function performs normal division unless the divisor equals zero, in which case the result is 1. The ifbte and iflte functions implement conditional nodes; Eq. (6) and Eq. (7) describe how they evaluate condition expressions:

X = ifbte(A, B, C, D): if (A ≥ B) then X = C else X = D    (6)

X = iflte(A, B, C, D): if (A < B) then X = C else X = D    (7)

| Parameter                 | Cherkassy (GP / ABCP) | pH (GP / ABCP) | Concrete (GP / ABCP) |
|---------------------------|-----------------------|----------------|----------------------|
| Population / colony size  | 100 / 100             | 200 / 200      | 300 / 300            |
| Iteration size            | 100 / 100             | 200 / 200      | 500 / 500            |
| Maximum tree depth        | 12 / 12               | 12 / 12        | 12 / 12              |
| Tournament size           | 6 / -                 | 25 / -         | 15 / -               |
| Mutation ratio            | 0.14 / -              | 0.14 / -       | 0.14 / -             |
| Crossover ratio           | 0.84 / -              | 0.84 / -       | 0.84 / -             |
| Direct reproduction ratio | 0.02 / -              | 0.02 / -       | 0.02 / -             |
| Functions                 | +, -, *, tan, sin, cos, square, max, min, exp, ifbte, iflte | +, -, *, tanh, add3, mult3 | +, -, *, rdivide, sin, cos, exp, rlog, add3, mult3 |
| Constants                 | [-10, 10]             | [-10, 10]      | [-10, 10]            |

Table 2: Parameters for GP and ABCP.

The population size was set to a relatively high value in view of the dimensionality of the data sets, following the literature. When the experiments are evaluated, the optimum results were obtained with a population/colony size of 100 for Cherkassy, 200 for pH, and 300 for Concrete. The other parameters, such as the number of generations and the maximum tree depth, were set to the same values as in the literature to make the tests fair. In this work, the stopping criterion for both GP and ABCP is the maximum generation number.
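As one concrete reading of Eqs. (3)-(7), the following Python sketch is illustrative: the function names mirror the primitive names of Table 2, tree.evaluate refers to the Node sketch of Section 2, and the data layout (a list of variable dictionaries) is an assumption of this sketch.

```python
import math

def rdivide(a, b):
    """Protected division from Table 2: the result is 1 when the divisor is 0."""
    return 1.0 if b == 0 else a / b

def add3(a, b, c):
    """Sum of three arguments: x1 + x2 + x3."""
    return a + b + c

def mult3(a, b, c):
    """Product of three arguments: x1 * x2 * x3."""
    return a * b * c

def ifbte(a, b, c, d):
    """Eq. (6): X = C if A >= B, else X = D."""
    return c if a >= b else d

def iflte(a, b, c, d):
    """Eq. (7): X = C if A < B, else X = D."""
    return c if a < b else d

def rmse(tree, points, targets):
    """Eq. (4): root mean square error of a tree over a data set.
    `points` is a list of variable dictionaries, one per instance."""
    total = sum((tree.evaluate(x) - y) ** 2 for x, y in zip(points, targets))
    return math.sqrt(total / len(targets))

def cherkassy(x1, x2, x3, x4):
    """Eq. (3): the 4-input target function of the first experiment."""
    return math.exp(2 * x1 * math.sin(math.pi * x4)) + math.sin(x2 * x3)
```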
5 Results & discussions

This section demonstrates the symbolic regression, prediction, and feature selection abilities of ABCP and GP through the set of experiments conducted.

5.1 Simulation results

The experiments were run 30 times independently for ABCP and GP, and the obtained results are given in Table 3 for all problems. The R² values of the best cases of GP and ABCP on the training and test sets are also presented in the table.

| Metric          | Cherkassy (GP / ABCP) | pH (GP / ABCP) | Concrete (GP / ABCP) |
|-----------------|-----------------------|----------------|----------------------|
| Mean RMSE       | 0.07 / 0.03           | 0.92 / 0.77    | 11.64 / 10.50        |
| Std RMSE        | 0.03 / 0.02           | 0.14 / 0.07    | 2.38 / 1.51          |
| Best RMSE       | 0.02 / 0.01           | 0.60 / 0.59    | 8.83 / 8.26          |
| Worst RMSE      | 0.11 / 0.09           | 12.60 / 0.88   | 16.77 / 14.57        |
| Best R² (train) | 0.97 / 1.00           | 0.90 / 0.96    | 0.72 / 0.76          |
| Best R² (test)  | 0.98 / 1.00           | 0.91 / 0.96    | 0.70 / 0.73          |

Table 3: Obtained results of GP and ABCP.

To obtain the GP results on the symbolic regression problem, GPTIPS (an open source symbolic regression toolbox) [39] was modified and used in this work. Table 3 indicates that ABCP has much better training performance than GP on these data sets. The best mean fitness value over all runs, 0.0323, was obtained by ABCP on the Cherkassy symbolic regression problem. For the pH data set, the standard deviation of ABCP is in the 10% range, whereas that of GP is in the 20% range. Although all criteria are worse on the concrete data set than on the other data sets because of the added noise, ABCP still performs comparably to, and slightly better than, GP.

The fit between the y_actual and y_pred values of the best solutions on the training and test data sets is shown in Figure 6 for Cherkassy, Figure 7 for pH, and Figure 8 for Concrete (ABCP).

Figure 6: Predicted and actual data points on Cherkassy in best ABCP.

Figure 7: Predicted and actual data points on pH in best ABCP.

Figure 8: Predicted and actual data points on Concrete in best ABCP.

As seen from the curves, the y_actual and y_pred values are close to each other for both training and test data.

5.2 Analysis of evolved models

The evolved models of the best ABCP solutions over all runs are shown in Table 4. For the Cherkassy and pH data sets, the mathematical models use all inputs. In the evolved model for Concrete, the Blast Furnace Slag, Age, Cement, and Plastic (superplasticizer) inputs are selected from the 8 input parameters. It should be noticed that only the x17 input is taken from the 50 added noise parameters. The presence of only one noise parameter in the equation indicates that ABCP is successful at feature selection.

Dataset: Cherkassy (4 inputs)
y = ((exp(3 x4 x1) · (exp(x2 x3) + x4 x1)) + x4 x1) · exp(x4 x1)

Dataset: pH (4 inputs)
y = A + B + C, where
A = (tanh(tanh(x4 + x3 · tanh(x2)) + 2 x3) · tanh(tanh(x2)) − x2)
B = x2² · tanh((x4 − (x1 − 8.935 + (x1 + x3 + x4) · tanh(tanh(x2)))) · tanh(x2)) + x2
C = (x4 + (x2 + tanh(tanh(tanh(x2))) + tanh(tanh(2 x2 + x4))) + (x2² · x3 + x2)) + tanh(x2) + tanh(tanh(x2) − (x1 − x3)) − x4

Dataset: Concrete (5 inputs)
y = |log(A/B) + C|, where
A = |(tanh(tanh(sqrt(sqrt(Age)))) + (sqrt(log(Cement))/(−2.997) + Slag) · Age · tanh(tanh(log(Plastic/(−3.877)))) / (tanh(tanh(x17))/(−3.877)))|
B = (sqrt(Cement))² + |(|Cement| · (−3.877))²|
C = (−3.877) + |sqrt(log(Age)²) · sqrt(Cement) − tanh(tanh(Plastic)) / tanh(tanh(Cement)/(−3.877))| − tanh(tanh(log(Plastic/(−3.877)))) / tanh(tanh(tanh(tanh(Cement)/(−3.877))))

Table 4: Models of the best ABCP runs.

The total number of nodes, the tree depth, and the tree complexity of the best solution for each data set are given in Table 5. As seen in Table 5, the noise parameters added to the inputs increase the difficulty of the problems; increasing difficulty enlarges the solution trees and increases their complexity. Since the Cherkassy function is easier than the other problems, its solution tree has the lowest complexity.

| Problem   | Total number of nodes | Depth of the best solution tree | Best solution tree complexity |
|-----------|-----------------------|---------------------------------|-------------------------------|
| Cherkassy | 25                    | 7                               | 118                           |
| pH        | 72                    | 11                              | 467                           |
| Concrete  | 80                    | 12                              | 657                           |

Table 5: Best solution tree information for each data set.
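For illustration, the complexity measure of Eq. (5) can be computed from a solution tree as in the sketch below, reusing the Node class from Section 2. Counting the root as depth 1 is an assumption of this sketch; the paper does not state the depth convention explicitly.

```python
def complexity(tree):
    """Eq. (5): C = sum over depths k of n_k * k, where n_k is the
    number of nodes at depth k (root taken to be at depth 1)."""
    counts = {}

    def walk(node, k):
        counts[k] = counts.get(k, 0) + 1
        for child in node.children:
            walk(child, k + 1)

    walk(tree, 1)
    return sum(n_k * k for k, n_k in counts.items())
```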
6 Conclusion

In this paper, automatic programming methods have been examined on symbolic regression, prediction, and feature selection. The results of symbolic regression on the Cherkassy function, prediction on the pH data set, and feature selection on the concrete data set are used to compare GP and ABCP. In all three experiments, ABCP demonstrated higher performance than GP: it found more accurate mathematical models in symbolic regression, fit the data better in prediction, and was effective at finding the important features in the presence of redundant ones.

In the future, we intend to investigate several research directions. Simulation work will be done to model fundamental classification problems (cancer, diabetes, heart, and gene diseases, etc.) with GP and ABCP and obtain performance results. In addition, we plan to study Multi-Gene Genetic Programming and Multi-Hive ABCP and compare the results with standard GP and ABCP to enhance the symbolic regression, prediction, and feature selection abilities of the solutions.

Acknowledgement

This project was supported by the Scientific Research Project Foundation of Erciyes University (Project ID: FBA-12-4029).

References

[1] Alan W. Biermann. Automatic programming: a tutorial on formal methodologies, Journal of Symbolic Computation, pp. 119-142, 1985. https://doi.org/10.1016/s0747-7171(85)80010-9
[2] John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection, MIT Press, Cambridge, MA, USA, 1992. https://doi.org/10.1007/BF00175355
[3] Dervis Karaboga, Celal Ozturk, Nurhan Karaboga, Beyza Gorkemli. Artificial bee colony programming for symbolic regression, Information Sciences, 209, pp. 1-15, 2012. https://doi.org/10.1016/j.ins.2012.05.002
[4] Hossam Faris. A symbolic regression approach for modeling the temperature of metal cutting tool, International Journal of Control and Automation, 6(4), 2013.
[5] Panagiotis Barmpalexis, Kyriakos Kachrimanis, Athanasios Tsakonas, E. Georgarakis. Symbolic regression via genetic programming in the optimization of a controlled release pharmaceutical formulation, Chemometrics and Intelligent Laboratory Systems, 107, pp. 75-82, 2011. https://doi.org/10.1016/j.chemolab.2011.01.012
[6] Xin Li. Self-Emergence of Structures in Gene Expression Programming, Ph.D. Thesis, University of Illinois at Chicago, 2006.
[7] Petr Musilek, Adriel Lau, Marek Reformat, Loren Wyard-Scott. Immune programming, Information Sciences, 176, pp. 972-1002, 2006.
https://doi.org/10.1016/j.ins.2005.03.009
[8] Mariusz Boryczka. Ant colony programming: application of ant colony system to function approximation, in Intelligent Systems for Automated Learning and Adaptation: Emerging Trends and Applications, pp. 248-272, 2010. https://doi.org/10.4018/978-1-60566-798-0.ch011
[9] Olivier Roux, Cyril Fonlupt. Ant programming: or, how to use ants for automatic programming, Proceedings of ANTS'2000, pp. 121-129, 2000.
[10] Shinichi Shirakawa, Shintaro Ogino, Tomoharu Nagao. Dynamic ant programming for automatic construction of programs, IEEJ Transactions on Electrical and Electronic Engineering, pp. 540-548, 2008. https://doi.org/10.1002/tee.20311
[11] Essam El Seidy. A new particle swarm optimization based stock market prediction technique, International Journal of Advanced Computer Science and Applications (IJACSA), 7(4), 2016. https://doi.org/10.14569/ijacsa.2016.070442
[12] K. K. Manjusha, K. Sankaranayanan, P. Seena. Data mining in dermatological diagnosis: a method for severity prediction, International Journal of Computer Applications, 117(11), 2015.
[13] Sana BuHamra, Nejib Smaoui, Mahmoud Gabr. The Box-Jenkins analysis and neural networks: prediction and time series modelling, Applied Mathematical Modelling, 27, pp. 805-815, 2003. https://doi.org/10.1016/s0307-904x(03)00079-9
[14] Gianluca Bontempi. Machine Learning Strategies for Time Series Prediction, lecture notes, Machine Learning Group, Computer Science Department, Université Libre de Bruxelles, Hammamet, 2013. Retrieved from http://www.ulb.ac.be/di
[15] A. Martin, V. Aswathy, V. Prasanna Venkatesan. Framing qualitative bankruptcy prediction rules using ant colony algorithm, International Journal of Computer Applications, 41(21), 2012. https://doi.org/10.5120/5827-8143
[16] Shuzhan Wan, Shengwu Xiong, Yi Liu. Prediction based multi-strategy differential evolution algorithm for dynamic environments, 2012 IEEE Congress on Evolutionary Computation (CEC), pp. 10-15, 2012. https://doi.org/10.1109/cec.2012.6256628
[17] Dandan Li, Wanxin Xue, Yilei Pei. A high-precision prediction model using ant colony algorithm and neural network, International Conference on Logistics, Informatics and Service Sciences (LISS), 2015. https://doi.org/10.1109/liss.2015.7369696
[18] Hossein Etemadi, Ali Asghar Anvary Rostamy, Hassan Farajzadeh Dehkordi. A genetic programming model for bankruptcy prediction: empirical evidence from Iran, Expert Systems with Applications, 36, pp. 3199-3207, 2009. https://doi.org/10.1016/j.eswa.2008.01.012
[19] Dominic P. Searson, David E. Leahy, Mark J. Willis. Predicting the toxicity of chemical compounds using GPTIPS: a free open source genetic programming toolbox for MATLAB, in Intelligent Control and Computer Engineering, Lecture Notes in Electrical Engineering, Vol. 70, Springer, pp. 83-93, 2011. https://doi.org/10.1007/978-94-007-0286-8_8
[20] Yudong Zhang, Shuihua Wang, Preetha Phillips, Genlin Ji. Binary PSO with mutation operator for feature selection using decision tree applied to spam detection, Knowledge-Based Systems, 64, pp. 22-31, 2014. https://doi.org/10.1016/j.knosys.2014.03.015
[21] Mark A. Hall. Correlation-based Feature Selection for Machine Learning, PhD Thesis, The University of Waikato, 1999.
[22] Irene Rodriguez-Lujan, Ramon Huerta, Charles Elkan, Carlos Santa Cruz. Quadratic programming feature selection, Journal of Machine Learning Research, 11, pp. 1491-1516, 2010.
[23] Jasmina Novaković, Perica Strbac, Dusan Bulatović. Toward optimal feature selection using ranking methods and classification algorithms, Yugoslav Journal of Operations Research, 21(1), pp. 119-135, 2011. https://doi.org/10.2298/yjor1101119n
[24] Gavin Brown, Adam Pocock, Ming-Jie Zhao, Mikel Luján. Conditional likelihood maximisation: a unifying framework for information theoretic feature selection, Journal of Machine Learning Research, 13, pp. 27-66, 2012.
[25] Riyaz Sikora, Selwyn Piramuthu. Framework for efficient feature selection in genetic algorithm based data mining, European Journal of Operational Research, 180, pp. 723-737, 2007. https://doi.org/10.1016/j.ejor.2006.02.040
[26] Shital C. Shah, Andrew Kusiak. Data mining and genetic algorithm based gene/SNP selection, Artificial Intelligence in Medicine, 31, pp. 183-196, 2004. https://doi.org/10.1016/j.artmed.2004.04.002
[27] Utpal Kumar Sikdar, Asif Ekbal, Sriparna Saha. Differential evolution based feature selection and classifier ensemble for named entity recognition, Proceedings of COLING 2012: Technical Papers, Mumbai, pp. 2475-2490, 2012. https://doi.org/10.1007/s10032-011-0155-7
[28] Yuanning Liu, Gang Wang, Huiling Chen, Hao Dong, Xiaodong Zhu, Sujing Wang. An improved particle swarm optimization for feature selection, Journal of Bionic Engineering, 8, 2011. https://doi.org/10.1016/s1672-6529(11)60020-6
[29] Bing Xue, Mengjie Zhang, Will N. Browne. Particle swarm optimization for feature selection in classification: a multi-objective approach, IEEE Transactions on Cybernetics, 43(6), 2013. https://doi.org/10.1109/tsmcb.2012.2227469
[30] Jianjun Yu, Jindan Yu, Arpit A. Almal, Saravana M. Dhanasekaran, Debashis Ghosh, William P. Worzel, Arul M. Chinnaiyan. Feature selection and molecular classification of cancer using genetic programming, Neoplasia, 9(4), pp. 292-303, 2007. https://doi.org/10.1593/neo.07121
[31] Jacques-Andre Landry, Luis Da Costa, Thomas Bernier. Discriminant feature selection by genetic programming: towards a domain independent multi-class object detection system, Systemics, Cybernetics and Informatics, 3(1), pp. 76-81, 2006.
[32] Omar Abu-Arqub, Zaer Abo-Hammour, Shaher Momani. Application of continuous genetic algorithm for nonlinear system of second-order boundary value problems, Applied Mathematics & Information Sciences, 8(1), pp. 235-248, 2014. https://doi.org/10.12785/amis/080129
[33] Omar Abu-Arqub, Zaer Abo-Hammour. Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm, Information Sciences, 279, pp. 396-415, 2014. https://doi.org/10.1016/j.ins.2014.03.128
[34] Riccardo Poli, William B. Langdon, Nicholas F. McPhee, John R. Koza. A Field Guide to Genetic Programming, 2008. http://cswww.essex.ac.uk/staff/rpoli/gp-field-guide/
[35] Zhaohui Gan, Tommy W. S. Chow, W. N. Chau. Clone selection programming and its application to symbolic regression, Expert Systems with Applications, 36, pp. 3996-4005, 2009. https://doi.org/10.1016/j.eswa.2008.02.030
[36] Hajira Jabeen, Abdul Rauf Baig. Review of classification using genetic programming, International Journal of Engineering Science and Technology, 2, pp. 94-103, 2010.
[37] Beyza Gorkemli. Study of Artificial Bee Colony Programming (ABCP) to Symbolic Regression Problems, PhD Thesis, Erciyes University, Engineering Faculty, Computer Engineering Department, 2015.
[38] Vladimir Cherkassky, Don Gehring, Filip Mulier. Comparison of adaptive methods for function estimation from samples, IEEE Transactions on Neural Networks, 7(4), pp. 969-984, 1996. https://doi.org/10.1109/72.508939
[39] Dominic P. Searson. GPTIPS 2: an open-source software platform for symbolic data mining, Chapter 22 in Handbook of Genetic Programming Applications, A. H. Gandomi et al. (Eds.), Springer, New York, NY, 2015. https://sites.google.com/site/gptips4matlab/file-cabinet, https://doi.org/10.1007/978-3-319-20883-1_22
[40] UCI Machine Learning Repository, Concrete Compressive Strength Data Set. https://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength, access date: 15.10.2016.
[41] Sibel Arslan, Celal Ozturk. Multi Hive Artificial Bee Colony Programming for high dimensional symbolic regression with feature selection, Applied Soft Computing, 78, pp. 515-527, 2019. https://doi.org/10.1016/j.asoc.2019.03.014
[42] Sibel Arslan, Celal Ozturk. Artificial Bee Colony Programming descriptor for multi-class texture classification, Applied Sciences, 9(9), 2019. https://doi.org/10.3390/app9091930