Informatica 41 (2017) 349–362

An Output Instruction Based PLC Source Code Transformation Approach For Program Logic Simplification

Arup Ghosh and Shiming Qin
Unified Digital Manufacturing Laboratory, Department of Industrial Engineering, Ajou University, Suwon 443-749, Republic of Korea
E-mail: arupghosh22.3.89@gmail.com, taihejiang11@ajou.ac.kr

Jooyeoun Lee and Gi-Nam Wang
Department of Industrial Engineering, Ajou University, Suwon 443-749, Republic of Korea
E-mail: jooyeoun325@ajou.ac.kr, gnwang@ajou.ac.kr

Keywords: programmable logic controller, reengineering, ladder logic diagram, instruction list, XML

Received: August 26, 2016

Due to the growing size and complexity of the PLC (Programmable Logic Controller) programs used for controlling industrial processes, there is an increasing need for an approach that helps users understand the control logic of PLC programs easily and assists them in analyzing programming errors effectively. In this paper, we propose an approach that takes the source code file of a PLC program as the input and transforms it into a hierarchically structured XML (Extensible Markup Language) file. The XML file format is based on the PLC output instructions and their corresponding conditions. It helps the users to identify the actual cause of a programming error quickly. In addition, a novel technique is applied that decomposes the PLC program into several smaller, modular sub-logic blocks. This makes the control logic simpler and easier to follow. An additional software application has also been developed for state-based graphical visualization of the XML file.

Povzetek (Summary): The paper describes a method for simplifying PLC programs for industrial processes.

1 Introduction

PLCs are special-purpose computers used for the automation of industrial processes. A PLC controls an industrial process according to the control program embedded in its controller.
In each execution of the PLC program, it takes the sensor signals as inputs and produces a set of output control signals for the actuators. So, the program outputs and their corresponding conditions (which must be satisfied in order to produce that particular output) are the basis of a PLC program. The PLCs can be programmed by using several programming languages under the international standard IEC 61131-3, such as Ladder Logic Diagram (LLD), Function Block Diagram (FBD), Structured Text (ST), Instruction List (IL) etc. [1]. Among these languages, the LLD is the most popular PLC programming language in industry, and the IL is the most commonly used PLC programming language in Europe [1]–[3]. On various occasions, the programmers use a combination of these languages to write a PLC program.

With the growing size and complexity of the PLC programs, it becomes more difficult to understand the program logic because of the low-level PLC programming languages. Moreover, if an error is detected in any PLC output, then the programmers have to analyze the complete program manually to find the conditions that can cause such an error. It is a very complicated and time-consuming job for a programmer to determine all the conditions that can affect a particular output. The situation becomes more critical when many programmers work together to develop the project. Moreover, if the routines of the PLC program are written in different languages, then understanding the control logic and/or determining the conditions associated with a program output becomes even harder. Our main aim is to transform the PLC program source code into a programming language and vendor independent XML file format that helps the users to understand the program logic easily and assists the programmers in analyzing the programming errors quickly.
In this paper, we present a PLC program source code reengineering approach, called the Program Output based Source-code Transformation (POST) approach, that takes the source code file of the PLC program saved in the IL language as the input and produces a hierarchically structured XML file as the output. In that XML file, the program logic is interpreted in terms of the program output instructions and their corresponding conditions, so the programmers can analyze the programming errors easily. In addition, POST applies a novel technique that subdivides the program logic blocks into several smaller, modular sub-logic blocks in order to make the program logic simpler, clearer and better organized. POST is applicable to all the programming languages and PLC software vendors for which the program source code can be saved in the IL language. For example, in the case of Siemens PLC software [4], the programs written in the LLD, IL and FBD languages can automatically be saved in the IL language format. We have implemented and tested POST for Siemens and Allen-Bradley PLC software [5]. It can easily be extended to other types of PLC software as well.

2 Problem Description

In Figure 1, an example rung of a LLD program (written using Siemens Simatic Step 7 software [4]) is given. Each rung of a LLD program characterizes a specific rule or a set of rules. As can be seen in Figure 1, the rung has two output instructions, and those outputs depend on three input instructions (or conditions), i.e., two Normally Open (NO) contacts and one Normally Closed (NC) contact. The NO and NC contacts represent the AND and AND-NOT boolean logic operations, respectively. These conditions are evaluated at the time of program execution in order to determine the data values of the output addresses.
In practice, a LLD program can have thousands of such rungs partitioned into several program blocks, such as the Organization Blocks (OBs), Functions (FCs), Function Blocks (FBs) etc. It is a very time-consuming and laborious task for the programmers to identify the real cause of a programming error. This is because, if an error is found in any PLC output signal, the programmers have to examine the complete program (i.e., each rung of every program block) manually to find the exact conditions that can affect the value of the corresponding output address. In that candidate condition set, if an erroneous data value is found in the address field of an input instruction which is not a direct sensor input, then the programmers have to search again for the conditions that can affect the value of that address. This process continues until the root causes of the error (in other words, the faulty sensor inputs and/or the flaws in the program logic) are identified. In order to overcome this kind of difficulty, we attempt to transform the source code file of the PLC program into a well-organized and well-structured XML file, so that all the conditions attached to an output address can be determined automatically. This can help the users to fix the programming errors very quickly.

The PLC programs are often written in a combination of different languages. The input-output instructions of a particular programming language also vary depending on the PLC software vendor. So, it is necessary to transform the PLC code into a vendor and language independent format, so that the users can understand the program instructions quickly and easily.

An automated approach for program logic simplification is another important requirement for industry. PLC programs are often written in a very low-level, non-graphical language such as the IL language. An IL language code equivalent to the LLD rung of Figure 1 is given in Figure 2.
As can be seen, it is very hard to understand the rung logic from this kind of non-graphical PLC program. Even for the programs written in graphical languages such as the LLD, FBD etc., it becomes difficult to understand the program logic with the growing size and complexity of the rung diagram (especially if the rung has several outputs, parallel branches and sub-branches). An example of such a complex LLD rung is presented in Figure 3. Clearly, identifying the conditions or understanding the program logic behind a particular output is very difficult for this kind of ladder rung. Therefore, an automated, systematic approach is required that can simplify the program logic of such complex ladder rungs in an efficient way, so that the users can understand the program logic behind a particular output easily. In this work, we address those needs.

Figure 1: A rung of a LLD program.

Figure 2: The IL language representation of the LLD rung of Figure 1.

The rest of this paper is organized as follows: an overview of the existing work in this field is presented in Section 3. In Section 4, we discuss the POST approach in detail. Section 5 contains our concluding remarks, followed by a list of relevant references.

3 Background Study

Several approaches have been proposed in the literature for reengineering the PLC programs. They can broadly be classified into the following three categories:

• Approaches focused on source-to-source translation: approaches of this type transform the PLC programs written in a particular programming language into another programming language. For example, the approaches proposed in [6]–[9] transform the LLD program into the IL language code. The main objective of these research works is to convert the PLC program into the IL language code so that it can be executed directly by the PLCs.
In [10], a different type of approach was proposed that converts the LLD program into the ST language code. The main aim of this research work is to promote a particular type of technology and hardware.

• Approaches focused on vendor interoperability: these research works propose approaches for transferring the program source code among the PLC programming tools of different vendors. In these works, the interoperability between the PLC programming tools is achieved by means of a middleware. In most cases, XML technologies have been used for developing the interoperability middleware. Examples of such works include [11]–[15].

• Approaches focused on alternative visualization: approaches of this type transform the PLC program source code file into another file format for more efficient graphical visualization. For example, in earlier works [3] and [16], an approach was proposed that transforms the PLC program into a vendor and platform independent XML file format. In [17] and [18], an approach was proposed that transforms the PLC programs into Finite State Machines (FSMs). In another article [19], UML (Unified Modelling Language) state diagrams are used in place of FSMs for more efficient graphical visualization of the rung logic.

Figure 3: An example complex ladder rung.

Unfortunately, all the mentioned approaches are focused on an efficient graphical representation of the PLC program and/or the vendor and platform interoperability. None of these approaches fulfils all the requirements stated in Section 1 and Section 2, and hence a completely different type of approach is needed. Our proposed approach, POST, addresses all those needs.

4 POST approach

This section is divided into five subsections.
In Subsection 4.1, the overall structure of the output XML file (produced by the POST approach) is given, and in Subsection 4.2, the program logic simplification procedure of POST is discussed. The program error analysis procedure is presented in Subsection 4.3. In Subsection 4.4, the implementation details of the POST approach are discussed, and in Subsection 4.5, the output XML file format for a special instruction, i.e., the block call instruction, is given. In this paper, we discuss the POST approach using the ladder rung diagrams of Siemens PLC programs (for exemplification purposes only).

4.1 The Overall XML file structure

POST takes the source code file of a PLC program saved in the IL language as the input and produces a well-structured and well-organized XML file. It gives the users an efficient tree-based representation of the program logic. The XML file structure output by POST is based on the output instructions and their corresponding conditions in each rung of the PLC program. The overall structure of the output XML file is given in Figure 4 (some XML nodes are not expanded in order to maintain the clarity of the image). As we can see, under the root node, i.e., the Program node, the Routine nodes are defined. A routine refers to a block of the PLC program. The Type attribute of the Routine nodes specifies the type of that routine, i.e., OB, FB, FC etc. In a LLD program, the ladder rungs are always declared inside a routine, and hence the Rung nodes are defined under the Routine node. The Number attribute represents the corresponding rung number in the routine. As can be seen in Figure 4, the Output nodes are defined under the Rung node. Each Output node represents a separate output of the corresponding rung. The Type attribute of the Output nodes refers to the type of that output instruction, such as the Output Coil, Convert BCD to Integer (CBI), Move, Set or Reset Coil instruction etc. (see for instance [20] and [21]).
The Move and the CBI type instructions have the following two additional attributes: i) the Source_Address or Source_Value attribute, which represents the address or the value specified at the IN input; and ii) the Target_Address attribute, which represents the address specified at the OUT output (see Figure 3 and Figure 4). Similarly, the Output Coil type instructions have one additional attribute, i.e., the Address attribute, which characterizes the output address of the corresponding instruction (the same is also true for the Set and Reset Coil instructions). The additional attributes associated with an instruction (or an Output node) represent the addresses and the data values associated with that instruction. As can be seen in Figure 4, POST determines the number of additional attributes and their names (or formats) based on the particular type of instruction (also see [20] and [21]).

In our original PLC program, the ladder rung of Figure 3 is the second rung of the function block FB 421. In Figure 4, we can find the Output nodes corresponding to the ladder rung of Figure 3 under the rung number 2 node. As can be seen in Figure 3, the rung consists of ten output instructions, and hence ten Output nodes are created under the rung number 2 node in the XML file of Figure 4. In the output XML file, a rung diagram is characterized on the basis of its output instructions, and hence under each Output node we can find its corresponding conditions. For example, as can be seen in Figure 3, the first Move type output instruction (see the first branch) has only one corresponding condition (the condition type AND with address argument M1.5). As can be seen from Figure 4, this condition is correctly placed right below the corresponding Output node in the XML file.

Figure 4: The output XML file format.
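The node hierarchy described above can be illustrated with a minimal sketch. This is not the authors' implementation (which is written in C++). The node and attribute names Program, Routine, Rung, Output, Condition, Type, Number, Address, Source_Address and Target_Address follow the text; the Name attribute of the Routine node, the Address attribute of the Condition node, and all addresses other than M1.5 and Q215.0 are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET


def add_output(rung, out_type, attributes, conditions):
    """Append one Output node, with its Condition children, to a Rung node."""
    output = ET.SubElement(rung, "Output", {"Type": out_type, **attributes})
    for cond_type, address in conditions:
        # "Address" on Condition nodes is an assumed attribute name.
        ET.SubElement(output, "Condition", {"Type": cond_type, "Address": address})
    return output


def build_example():
    """Build a tiny Program/Routine/Rung/Output/Condition tree."""
    program = ET.Element("Program")
    # "Name" is an assumed attribute name for the routine identifier.
    routine = ET.SubElement(program, "Routine", {"Name": "FB421", "Type": "FB"})
    rung = ET.SubElement(routine, "Rung", {"Number": "2"})
    # Move output with a single AND condition on M1.5 (MW10/MW20 are hypothetical).
    add_output(rung, "Move",
               {"Source_Address": "MW10", "Target_Address": "MW20"},
               [("AND", "M1.5")])
    # Output Coil on Q215.0 with hypothetical conditions.
    add_output(rung, "Output Coil", {"Address": "Q215.0"},
               [("AND", "I0.0"), ("AND-NOT", "I0.1")])
    return program


print(ET.tostring(build_example(), encoding="unicode"))
```

Because each Output node carries its own copy of the conditions, a rung with several outputs is already split per output in this representation.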
It is easy to see from the format of the output XML file that it follows exactly the same logic structure as the original PLC program. However, if any rung has more than one output instruction, then the rung logic is split based on the corresponding output instructions. This is the first logic simplification measure taken by POST (it also simplifies the programming error analysis task, as discussed later). In addition, as can be seen in Figure 4, the instructions and their corresponding properties are described by using simple descriptive language. This makes the program logic easy to understand, and independent of the programming language and platform. Please note that POST can also produce the output XML file based on the symbolic names (see Figure 1).

4.2 Program Logic Simplification by utilizing the Local Memory Definition

The program output based source code transformation method simplifies the rung structure to a great extent. However, further logic simplification measures are needed, particularly for the rungs with a large number of parallel and high depth sub-branches. The parallel branches and sub-branches of a LLD rung represent the OR boolean logic operations. This means that the Result of Logic Operation (RLO) of the parallel branches (respectively, sub-branches) is true if the RLO of any of those branches (respectively, sub-branches) is true [20, 21]. POST simplifies the logic of a rung diagram with parallel branches and sub-branches by using a bottom-up hierarchical decomposition procedure. More specifically, it characterizes a certain portion of the complete rung diagram (a sub-logic block) by utilizing the Local Memory Definition (LMD) [a local memory can be thought of as a virtual memory location where the RLO of its corresponding sub-logic block is stored]. The LMDs are then used successively (reused in a modular fashion) to define the complete rung logic.
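The bottom-up decomposition idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the POST implementation: the nested list/tuple representation, the function names, the post-order numbering of the local memories and all addresses are hypothetical, and the user-defined limit on the number of parallel branches per LMD is not modelled. A series of conditions (ANDed) is a list, a group of parallel branches (ORed) is a tuple of series, and each parallel group is replaced by a Local-Memory reference whose definition is a list of options.

```python
from itertools import product


def is_parallel(item):
    """A parallel group is a non-empty tuple whose elements are series (lists)."""
    return isinstance(item, tuple) and len(item) > 0 and isinstance(item[0], list)


def decompose(series, lmds):
    """Replace every parallel group by a Local-Memory reference.

    Deeper groups are processed first, so they receive the lower LM numbers.
    Definitions are stored in `lmds` as name -> list of options (ORed).
    """
    flat = []
    for item in series:
        if is_parallel(item):
            options = [decompose(branch, lmds) for branch in item]
            name = "LM%d" % len(lmds)
            lmds[name] = options
            flat.append(("Local-Memory", name))
        else:
            flat.append(item)
    return flat


def evaluate(series, values, lmds):
    """RLO of a series: AND over its items; groups and LMDs are ORs of options."""
    result = True
    for item in series:
        if is_parallel(item):
            rlo = any(evaluate(branch, values, lmds) for branch in item)
        elif item[0] == "Local-Memory":
            rlo = any(evaluate(option, values, lmds) for option in lmds[item[1]])
        elif item[0] == "AND":
            rlo = values[item[1]]
        elif item[0] == "AND-NOT":
            rlo = not values[item[1]]
        else:
            raise ValueError("unknown condition type: %r" % (item[0],))
        result = result and rlo
    return result


# Hypothetical rung: I0.0 AND ((I0.1 AND (I0.2 OR NOT I0.3)) OR I0.4) AND I0.5
RUNG = [
    ("AND", "I0.0"),
    (
        [("AND", "I0.1"), ([("AND", "I0.2")], [("AND-NOT", "I0.3")])],
        [("AND", "I0.4")],
    ),
    ("AND", "I0.5"),
]

LMDS = {}
FLAT = decompose(RUNG, LMDS)
print(FLAT)
print(LMDS)
```

In this sketch the flattened rung refers only to LM1, and the definition of LM1 in turn refers to LM0, which mirrors the modular reuse of definitions described above. Evaluating the flattened form together with the definitions gives the same RLO as evaluating the original nested rung.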
In Figure 5, the relevant part of the rung diagram of Figure 3 that depicts only the conditions associated with the first Output Coil instruction (address Q215.0) is given. As can be seen from Figure 5, even after the above mentioned logic simplification step (as stated in Subsection 4.1), the rung diagram has several parallel branches and sub-branches. For this reason, the program logic behind the output is difficult to follow and hence needs to be simplified further.

Figure 5: The rung diagram (or rung logic) associated with the first Output Coil instruction (address Q215.0).

From our practical experience, the conventional procedure to understand this kind of complicated rung structure (as in Figure 5) is to analyze the rung diagram starting from its highest depth sub-branches, and then to proceed step by step towards the main branch and its parallel branches (in a bottom-up fashion). The LMD based logic simplification procedure of POST follows exactly this natural bottom-up modular decomposition approach. As we can see from Figure 5, the (relatively straightforward) rung logic corresponding to the highest depth sub-branches (branches inside the red colour box) is defined by using the local memory LM0. Similarly, the rung logic corresponding to the next highest depth sub-branches (branches inside the green colour box) is characterized by using the local memory LM1. The definition of local memory LM0 can then be reused in the definition of local memory LM1. As can be seen from Figure 5, this LMD formulation procedure is repeated in a bottom-up fashion until all the parallel branches and sub-branches are characterized by using the LMDs. In Figure 5, the boxes and their associated local memory names show how the rung structure can be simplified further by using the LMDs [The LMDs are restricted to a maximum of three parallel branches.
As an example, see the branches inside the blue colour box. We will discuss this further later.]. For simplicity, we can suppose that the RLO of the parallel sub-branches of a branch (respectively, the parallel branches of the main branch) is stored in a virtual memory location of the type local memory, and is then used to evaluate the RLO of that branch (respectively, the main branch) by applying an AND boolean logic operation.

The output XML file shown in Figure 6 depicts the condition set (or the rung logic) corresponding to the first Output Coil instruction of Figure 3 (also see the simplified rung diagram of Figure 5). As can be seen in Figure 6 (a), the rung logic or the rung diagram associated with the first Output Coil instruction is characterized based on the definition of local memory LM4. The definitions of local memories LM0, LM1, LM2, LM3 and LM4 are shown separately in Figure 6 (b), Figure 6 (c), Figure 6 (d), Figure 6 (e) and Figure 6 (f), respectively. As can be seen in those figures, under the Local-Memory node, the definition (or the rung logic) of the corresponding local memory is given. The Address attribute of the Local-Memory node represents the virtual address (or name) of the local memory. It is easy to see from Figure 6 that the definitions of the local memories characterize the rung logic of exactly the same branches (or the sub-logic blocks) as depicted in Figure 5.

Figure 6: XML file format for the rung diagrams with parallel branches and sub-branches. (a) The condition set corresponding to the first Output Coil instruction. (b) Local memory LM0 definition. (c) Local memory LM1 definition. (d) Local memory LM2 definition. (e) Local memory LM3 definition. (f) Local memory LM4 definition.
For example, the definition of local memory LM0 (presented in Figure 6 (b)) covers the rung logic of the highest depth parallel sub-branches of the rung diagram of Figure 5 (see the red colour box). As can be seen in Figure 6 (b), the condition set of each parallel branch is presented under a separate Option node. The Option nodes represent the OR boolean logic operations. So, if the condition set under any Option node associated with a particular local memory is true, then the RLO value stored in that local memory is also true. In the same way, the local memory LM1 is defined (shown in Figure 6 (c)). Please note that the definition of local memory LM1 is characterized by using the definition of local memory LM0 (in a modular fashion). The RLO value saved in the local memory LM1 can be determined by performing an AND boolean logic operation between the RLO value saved in the local memory LM0 and the RLO value of the other conditions (for simplicity, we can assume that a Local-Memory type condition represents the AND boolean logic operation). This bottom-up hierarchical logic decomposition procedure gives the users an easy, systematic, step-by-step interpretation of the rung logic. As can be seen in Figure 6 (a), the overall rung logic corresponding to the first Output Coil instruction is characterized by using only a few conditions (the same is also true for the LMDs; see Figure 5 and Figure 6). This makes the program logic behind an output easier to follow.

Figure 7: The state-based graphical representation of the rung logic corresponding to the first Output Coil instruction.

The rung diagram connected with a particular output can have many same-depth parallel branches and sub-branches.
If a large number of parallel branches or sub-branches are characterized by using a single local memory, the result can be a very complex local memory definition. In order to avoid such issues, POST allows the users to explicitly bound the complexity of the LMDs by restricting (or setting) the number of parallel branches that a definition can cover. For example, as can be seen from Figure 5 and Figure 6, the LMDs are always restricted to a maximum of three parallel branches.

We have also developed a software interface module (with the help of the Graphviz software [22]) that provides an efficient graphical representation of the rung logic. As an example, in Figure 7, the graphical representation of the rung logic associated with the first Output Coil instruction is shown. It is the state-based graphical representation of the XML file presented in Figure 6. A state or a node in the graph of Figure 7 represents a particular condition (in other words, it indicates a program instruction and the associated addresses or data values). Please note that the RLO values corresponding to the state nodes of a particular path need to be ANDed in order to get the resultant RLO value of that path, and the RLO values of the different paths need to be ORed in order to determine the resultant RLO value of the corresponding sub-logic block (see Figure 5 and Figure 7).

4.3 Program Error Analysis Procedure

The output XML file format, based on the program output instructions and their corresponding conditions, not only provides an efficient program logic interpretation, but also makes it possible for a software module to collect all the conditions corresponding to an output automatically. If an incorrect data value is found in any output address, then the user has to pass that address (or any other output address) as the query input value to the condition search engine of POST.
The condition search engine of POST analyzes the output address attribute values of all the Output nodes of the above stated XML file (as shown in Figure 4 and Figure 6), and generates a query output XML file that contains all the conditions (i.e., the program instructions and the associated addresses or data values) that can directly affect the value stored at that particular query input address. Recall that the output address attribute of the Output nodes refers to the attribute that denotes the output address of the corresponding program instruction. For example, the Target_Address attribute is the output address attribute of the Move type instructions (see Figure 3 and Figure 4).

For the convenience of the readers, an example query output XML file is presented in Figure 8. The condition search engine of POST produced that XML file for the query input address MD3650 (see Figure 3 and Figure 4). As we can see, under the root node, i.e., the Query-Output node, the Routine and the Rung nodes are defined. They help the users to identify the routine and the rung in which the output instruction is declared. The Output nodes and their corresponding Local-Memory and Condition nodes help the users to explicitly determine the output instructions and the conditions that can affect the value stored in that output address. As can be seen in Figure 8, two Move type output instructions (and their corresponding condition sets) can directly alter the value stored in the query input address MD3650. The first Move instruction is located in the second rung of the function block FB421 (shown in Figure 3; see the second branch of the rung diagram), and the second Move instruction is located in the tenth rung of the function block FB423 (not illustrated for space reasons).

Figure 8: The format of the query output XML file (retrieved as a result of the query input: address MD3650).
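A condition search of this kind can be sketched in a few lines. This is an illustrative approximation, not the authors' search engine: the function name, the Name attribute on Routine nodes, the Address attribute on Condition nodes, and every address except MD3650 and Q215.0 are assumptions. The sketch treats Address and Target_Address as the output address attributes and copies every matching Output node, together with its children, under a Query-Output root.

```python
import copy
import xml.etree.ElementTree as ET

# Attributes assumed to hold the output address of an instruction.
OUTPUT_ADDRESS_ATTRS = ("Address", "Target_Address")

SAMPLE = """
<Program>
  <Routine Name="FB421" Type="FB">
    <Rung Number="2">
      <Output Type="Move" Source_Address="MD10" Target_Address="MD3650">
        <Condition Type="AND" Address="M2.0"/>
      </Output>
      <Output Type="Output Coil" Address="Q215.0">
        <Condition Type="AND" Address="M3.0"/>
      </Output>
    </Rung>
  </Routine>
  <Routine Name="FB423" Type="FB">
    <Rung Number="10">
      <Output Type="Move" Source_Address="MD20" Target_Address="MD3650">
        <Condition Type="AND-NOT" Address="M4.0"/>
      </Output>
    </Rung>
  </Routine>
</Program>
"""


def search_conditions(program, query_address):
    """Collect all Output nodes that write to `query_address`."""
    result = ET.Element("Query-Output")
    for routine in program.findall("Routine"):
        routine_out = None
        for rung in routine.findall("Rung"):
            hits = [out for out in rung.findall("Output")
                    if any(out.get(attr) == query_address
                           for attr in OUTPUT_ADDRESS_ATTRS)]
            if not hits:
                continue
            if routine_out is None:
                # Create the Routine node only once per matching routine.
                routine_out = ET.SubElement(result, "Routine", dict(routine.attrib))
            rung_out = ET.SubElement(routine_out, "Rung", dict(rung.attrib))
            for out in hits:
                rung_out.append(copy.deepcopy(out))  # keeps the condition children
    return result


query_result = search_conditions(ET.fromstring(SAMPLE.strip()), "MD3650")
print(ET.tostring(query_result, encoding="unicode"))
```

For the hypothetical sample above, the query MD3650 returns the two Move outputs, located in rung 2 of FB421 and rung 10 of FB423. Running the same search again on an address found in the returned condition sets corresponds to the repeated search described in the text.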
The users then have to thoroughly inspect these condition sets to find out: i) whether any sensor has failed or is transmitting an inaccurate reading; and ii) whether there is any logical or conceptual flaw in the rung diagram (i.e., the output instructions and their corresponding conditions). If not, then the users have to search again for the conditions corresponding to the output address (or addresses) from which an erroneous value is obtained as the input (please note that this address must be the output address of an instruction that belongs to the above stated condition sets). This process continues until the actual cause of the error is detected (as mentioned in Section 2). POST makes this entire condition search process automatic and easy to oversee, and hence the program error analysis process becomes simple and fast. Please note that this is possible only because the source code transformation approach of POST is based on the program output instructions and their corresponding conditions.

4.4 Implementation Details of POST Approach

We have implemented the POST approach in the C++ language and tested it on the program source codes of Siemens and Allen-Bradley PLC software. POST takes the PLC program source code file saved in the IL language format as the input, and converts it into the above stated XML file format (as discussed in Subsection 4.1 and Subsection 4.2). Whenever the starting tag of a routine (respectively, rung) is encountered in the program source code file, POST enters the name and type (respectively, number) information of that routine (respectively, rung) into the output XML file, following the same format as shown in Figure 4. Similarly, when the ending tag is detected, it closes the corresponding XML file node.

Figure 9: A simple example ladder rung diagram.

Figure 10: Step 1 – Identifying the program output instructions and their corresponding condition sets.
The rung logic defined between the starting and ending tags of a rung is copied into the computer memory for further processing. The rung logic (or IL code) to XML file conversion is a three-phase procedure. We discuss this three-phase procedure with the help of the simple ladder rung given in Figure 9.

The first phase of the three-phase source code transformation process is illustrated in Figure 10. As we can see, a string array, named the Instruction Set array, is used to store the source code of the ladder rung presented in Figure 9. In the first phase, as can be seen in Figure 10, the output instructions and their corresponding condition sets are determined. In order to accomplish this, the program instructions stored in the Instruction Set array are first converted into the specific instruction name format of POST (as discussed in Subsection 4.1 and Subsection 4.2). As can be seen in Figure 10, the IL language instructions are converted into the descriptive language instructions and are stored in a string array, called the Modified Instruction Set array. In the source code of Siemens PLC software, the complex (or compound) instructions are decomposed into the core or basic instructions. For example, a Move type instruction is decomposed into the L, T, (optional) JNB instructions etc. [20, 21] (also see Figure 2). For this reason, POST has to inspect whether a set of core instructions in the Instruction Set array is equivalent to any such complex program instruction. If so, then POST replaces that instruction set with its corresponding descriptive language instruction and stores it in the Modified Instruction Set array (see Figure 10).

Figure 11: Step 2 – Formulating the local memory definitions.

Figure 12: Step 3 – Transforming the results into the output XML file format.
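The conversion into descriptive instructions can be sketched as follows. This is a deliberately simplified illustration, not the POST implementation: the function and table names are hypothetical, only a handful of Siemens IL mnemonics are mapped (A, AN, O, ON, =, S, R), a Move is recognised only as an L instruction immediately followed by a T instruction, the optional JNB instruction and labels are not handled, and no distinction is made between a source address and a source value. The addresses in the example are hypothetical.

```python
# Mapping from Siemens IL mnemonics to the descriptive names used in the text.
MNEMONICS = {
    "A": "AND",
    "AN": "AND-NOT",
    "O": "OR",
    "ON": "OR-NOT",
    "=": "Output Coil",
    "S": "Set Coil",
    "R": "Reset Coil",
}


def split_instruction(line):
    """Split an IL line into (mnemonic, operand); the operand may be empty."""
    parts = line.split(None, 1)
    if not parts:
        return "", ""
    return parts[0], parts[1].strip() if len(parts) > 1 else ""


def to_descriptive(instruction_set):
    """Build a simplified Modified Instruction Set from IL lines."""
    modified = []
    i = 0
    while i < len(instruction_set):
        op, arg = split_instruction(instruction_set[i])
        if not op:  # skip blank lines
            i += 1
            continue
        if op == "L" and i + 1 < len(instruction_set):
            next_op, next_arg = split_instruction(instruction_set[i + 1])
            if next_op == "T":
                # An L/T pair is collapsed into one Move (source, target).
                modified.append(("Move", arg, next_arg))
                i += 2
                continue
        if op in MNEMONICS:
            modified.append((MNEMONICS[op], arg))
        else:
            # Anything not recognised (e.g. parentheses) is passed through as-is.
            modified.append((instruction_set[i].strip(),))
        i += 1
    return modified


EXAMPLE = ["A I0.0", "AN I0.1", "L MW10", "T MW20", "= Q4.0"]
print(to_descriptive(EXAMPLE))
```

Unrecognised lines are passed through unchanged so that later processing steps can still see tokens such as parentheses.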
The same measure is also taken for the instructions that have multiple inputs and outputs (such as the block call instructions, discussed later). This sub-phase is skipped for the source code of Allen-Bradley PLC software, because in that case the complex instructions are not broken down into the core or basic instructions.

In the next sub-phase of the first phase, the outputs and their corresponding condition sets are formulated from the Modified Instruction Set array. As we can see in Figure 10, POST first identifies all the output instructions in the Modified Instruction Set array, and then copies them into another array, named the Output Array. The condition set corresponding to each output is stored in a separate array (by using a pointer), named the Condition Array. As can be seen in Figure 10, all the instructions (except the output instructions) prior to an output instruction form the condition set corresponding to that output instruction, and are stored in the corresponding Condition Array. However, this rule does not hold for the output instructions that are declared in a parallel branch or a sub-branch. For example, as can be seen in Figure 9, the Move output instruction is declared in a parallel branch of the main branch. For this reason, the conditions declared in its previous branches, which appear prior to it in the Modified Instruction Set array, cannot be considered as its corresponding conditions. As can be seen from Figure 10, in this case the numbers of open and close parentheses are unequal (three open parentheses and one close parenthesis). So, all the conditions up to the second open parenthesis need to be eliminated in order to equalize the number of open and close parentheses (in other words, in order to obtain the actual condition set).
As can be seen in Figure 10, after performing this elimination operation, the condition set corresponding to an output is stored in its corresponding Condition Array for further processing.

In the second phase, POST further simplifies the program logic stored in each Condition Array by formulating the LMDs, following the same procedure as discussed in Subsection 4.2. This phase is illustrated in detail in Figure 11 (in the interest of space, the LMD formulation procedure is shown only for the Output Coil instruction). If the Condition Array does not hold any OR instruction, then this phase is skipped. As can be seen in Figure 11, POST first identifies the condition set associated with the highest depth parallel branches present in the Condition Array (the branch depth can easily be calculated from the number of open and close parentheses), and then replaces it with a new local memory definition. Recall from Subsection 4.2 that the RLO value stored in the local memory address associated with an LMD needs to be ANDed with the RLO value of the other conditions present in the Condition Array in order to get the resultant RLO value of the Condition Array. The exact same principle is followed in Figure 11.

As can be seen in Figure 11, the local memory addresses are saved in a separate array, called the Local Memory Array, and in the second dimension of that array a pointer to another array, called the LMD Array, is stored. The LMD Array stores the definition (or the condition set) of its corresponding local memory. This process is repeated until all the parallel branches of the main branch are defined by using the local memories (in other words, until all the OR instructions are eliminated from the Condition Array; see Subsection 4.2). Recall that in POST, the LMDs can be restricted to a limited number of parallel branches.
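The repeated replacement of the deepest parallel branches can be sketched as follows. The sketch assumes that every parallel branch group is enclosed in a balanced pair of parentheses and that a bare `"OR"` token separates the branches; it does not implement the optional limit on the number of parallel branches per LMD. A plain dictionary stands in for the Local Memory Array and the LMD Arrays, and the token format and function name are hypothetical.

```python
def formulate_lmds(conditions):
    """Replace parenthesised groups, deepest first, with local memories."""
    conditions = list(conditions)
    lmds = {}  # local memory name -> its definition (condition set)
    if "OR" not in conditions:
        return conditions, lmds  # the phase is skipped
    while "(" in conditions:
        depth = deepest = 0
        start = -1
        for index, token in enumerate(conditions):
            if token == "(":
                depth += 1
                if depth > deepest:
                    deepest, start = depth, index
            elif token == ")":
                depth -= 1
        # No parenthesis is nested inside the deepest group, so the first
        # ')' after its '(' is the matching one.
        end = conditions.index(")", start)
        name = f"LM{len(lmds)}"
        lmds[name] = conditions[start + 1:end]
        # The local memory is ANDed with the remaining conditions.
        conditions[start:end + 1] = [f"AND {name}"]
    return conditions, lmds


# Made-up Condition Array with one nested parallel branch.
array = ["AND a", "(", "AND b", "OR", "(", "AND c", "OR", "AND d", ")", ")"]
simplified, definitions = formulate_lmds(array)
```

For this input the inner group becomes `LM0`, the outer group becomes `LM1` (whose definition refers to `LM0`), and the Condition Array is reduced to `["AND a", "AND LM1"]`.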
If we set that number to two (which implies that at most one OR instruction can be present in a particular LMD Array), then only the conditions inside the red colour arrows (see Figure 11) are characterized by using the local memory LM0 (and so on).

In the third phase of the source code transformation process, the outputs and their corresponding condition sets are mapped into the output XML file. This phase is depicted in Figure 12. As can be seen, the condition sets associated with the local memories and the outputs are respectively written into the XML file following the same format as discussed throughout Subsection 4.1 and Subsection 4.2. In the output XML file, the OR instructions in the LMD Arrays are represented by using the Option nodes (as stated earlier). However, as can be seen in Figure 12, an OR (respectively, OR-NOT) instruction with an address argument is characterized by using a Condition node that has the Type attribute value AND (respectively, AND-NOT) and is defined under a separate Option node (this type of mapping is shown by using the red colour arrows). This is because, in the program source code of Siemens PLC software, an OR (respectively, OR-NOT) instruction with an address argument actually represents a parallel branch where only one AND (respectively, AND-NOT) instruction is declared (see Figure 9 and Figure 12).

In the output XML file, POST always keeps the exact same logic structure as in the original PLC program (as stated earlier). Also note that in the output XML file, a Local-Memory type condition basically indicates an AND boolean logic operation; hence, the corresponding RLO value needs to be ANDed with the RLO value of the other conditions in order to determine the resultant RLO value of the corresponding sub-logic block. In the above, we have discussed the implementation details of POST based on Siemens and Allen-Bradley PLC software.
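The mapping of one LMD Array to Option and Condition nodes can be sketched with Python's standard `xml.etree.ElementTree` module. The node names Option and Condition and the Type attribute come from the text above; the element name `Local_Memory` and the attribute names `Name` and `Address` are assumptions made for this illustration, as is the token format.

```python
import xml.etree.ElementTree as ET


def lmd_to_xml(name, definition):
    """Write one LMD as Option nodes, one per parallel branch."""
    node = ET.Element("Local_Memory", Name=name)  # hypothetical element name
    option = None
    for token in definition:
        operation, _, address = token.partition(" ")
        starts_branch = operation in ("OR", "OR-NOT")
        if option is None or starts_branch:
            option = ET.SubElement(node, "Option")
        if not address:
            continue  # a bare OR only opens the next branch
        if starts_branch:
            # OR <address> is a branch holding a single AND condition.
            operation = "AND" if operation == "OR" else "AND-NOT"
        ET.SubElement(option, "Condition", Type=operation, Address=address)
    return node


# Made-up LMD: one two-condition branch in parallel with a single contact.
lm0 = lmd_to_xml("LM0", ["AND I0.0", "AND-NOT I0.1", "OR I0.2"])
print(ET.tostring(lm0, encoding="unicode"))
```

The sketch yields two Option nodes for this input, and the `OR I0.2` instruction appears in the second one as a Condition node whose Type is AND.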
However, the proposed three-phase procedure is logically applicable to the source code of other PLC software as well (a little modification may be needed based on that particular software).

4.5 Dealing with the Block Call Instructions

In this subsection, we discuss the output XML file format for a special instruction, i.e., the block call instruction. The block call instructions are used to call (or invoke) the program blocks, such as the FCs, FBs, System FCs (SFCs), System FBs (SFBs) etc. [20, 21, 23]. Unlike other instructions, the block call instructions have two types of parameters, namely the formal and the actual parameters. In addition, a program block can have multiple inputs and outputs. An example of such a program block call is presented in Figure 13 (a). The SFC20 block is used to copy the contents of a memory area given at input SRCBLK to another memory area given at output DSTBLK; if an error occurs, the returned error code is stored at the address given at output RET_VAL. As can be seen in Figure 13 (a), the SFC20 block has only one input and two outputs. However, some program blocks can have dozens of inputs and outputs. If POST characterized all the inputs and outputs associated with a program block inside the Output node, it could generate a very complex XML node structure. In addition, POST has to define both the formal and actual parameters inside the Output node. In order to overcome these issues, an additional XML file node, called the Parameters node, is created under the Output node, and all the information related to the parameters of a program block is put together inside it.

Figure 13: The output XML file format for the block call instructions. (a) The calling or invocation of the System Function SFC20. (b) The output XML file format for SFC20 block.

Figure 14: An example of a block call instruction with an input parameter that is defined based on a set of conditions. (a) The calling or invocation of the System Function Block SFB4. (b) The output XML file format for SFB4 block.

In Figure 13 (b), the output XML file format for the SFC20 block is shown. As we can see, only the name and type information of the block call instruction is specified inside the Output node. The corresponding condition set is defined under the Output node in the same way as before. As can be seen in Figure 13 (b), all the parameters of the SFC20 block are characterized under the Parameters node. The Block_Input and the Block_Output nodes are created to define the inputs and the outputs of the program block. The Formal and the Actual attributes represent the formal and the actual parameters, respectively (the Actual attribute of the Block_Output node is the output address attribute of the block call instruction; also see Subsection 4.3). Please note that inside the Block_Input and the Block_Output nodes, an additional attribute, i.e., the Comment attribute, is incorporated in order to describe the purpose of each parameter. However, the Comment attribute is optional and is generated only for the system library blocks (since the purposes of their parameters are known in advance).

Another example of a program block call (an SFB4 block call) and its corresponding XML file format are shown in Figure 14 (a) and Figure 14 (b), respectively. As can be seen in Figure 14 (b), the output XML file follows exactly the same structure as discussed above. In the Output node, an additional attribute, i.e., the Instance_Data_Block_Address attribute, is included so that the address of the corresponding instance data block can be incorporated (for more information, see [21] and [23]).
However, this change is instruction specific (recall that POST determines the format of the XML node according to the functional specification of the corresponding instruction). A distinctive feature of this SFB4 block is that one of its inputs, the input IN, is defined on the basis of a set of conditions (see Figure 14 (a)). At the time of program execution, the RLO value of the condition set is passed as the input IN value to the SFB4 block. If all these conditions were defined inside the corresponding Block_Input node, a very complex XML node structure would result. In order to avoid this sort of problem, POST utilizes the concept of the local memory definitions. The local memory definitions are used to characterize the inputs that are defined on the basis of multiple conditions (in the same way as discussed in Subsection 4.2). As can be seen from Figure 14 (a) and Figure 14 (b), the complete condition set corresponding to the actual parameter of input IN is defined by using the local memory LM1. Please note that the parallel branches of the main branch are characterized by using the local memory LM0, following exactly the same procedure as described in Subsection 4.2.
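The node layout for a block call can be sketched with `xml.etree.ElementTree`. The node and attribute names Output, Parameters, Block_Input, Block_Output, Formal, Actual and Instance_Data_Block_Address are taken from the description above. The attribute names `Name` and `Type` on the Output node, the type value `Block_Call`, the address `DB1`, and the function name are assumptions made for this illustration, and the condition set under the Output node is omitted.

```python
import xml.etree.ElementTree as ET


def block_call_to_xml(name, block_type, inputs, outputs, instance_db=None):
    """Build an Output node whose parameters sit under a Parameters node."""
    output = ET.Element("Output", Name=name, Type=block_type)
    if instance_db is not None:
        # Only instruction types that use an instance data block carry this.
        output.set("Instance_Data_Block_Address", instance_db)
    parameters = ET.SubElement(output, "Parameters")
    for tag, pairs in (("Block_Input", inputs), ("Block_Output", outputs)):
        for formal, actual in pairs:
            ET.SubElement(parameters, tag, Formal=formal, Actual=actual)
    return output


# The input IN is fed by a condition set, so its actual parameter is the
# local memory LM1 rather than a plain address. Other parameters are left out.
call = block_call_to_xml(
    "SFB4", "Block_Call", inputs=[("IN", "LM1")], outputs=[], instance_db="DB1"
)
print(ET.tostring(call, encoding="unicode"))
```

Keeping the condition set in a separate local memory definition means the Block_Input node stays a single flat element regardless of how many conditions drive the input.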
The following points are easy to perceive from the above discussion:

• the output XML file format is carefully designed so that the condition search engine of POST can accumulate all the conditions associated with a program output automatically and in a straightforward way

• the rung logic associated with a program output is simplified further whenever it gets complicated (in other words, whenever the number of parallel branches and sub-branches exceeds a certain limit)

• each type of node in the output XML file is designed keeping in mind the objective and the functional specification of the corresponding instruction

These features of the output XML file make the programming error analysis task simple, fast and less prone to oversights (because there is no need to inspect each rung of every program block manually). In addition, the XML file format discussed above provides an easy, systematic and step-by-step interpretation of the program logic to the users, which makes the error analysis task even simpler.

5 Conclusion

This work is motivated by the need for an approach that can help the users to understand the PLC programs easily, and can assist them to analyze the programming errors in an efficient manner. In this paper, we have proposed a new approach, called POST, that addresses these needs. POST takes the PLC program source code file as the input, and converts it into a well-structured XML file based on the program output instructions and their corresponding conditions. In the XML file, the rung logic corresponding to an output is further simplified by using a novel local memory based technique, and is presented in a programming language and platform independent format. The proposed XML file format provides a systematic and step-by-step interpretation (in a bottom-up fashion) of the program logic to the users.
In addition, the XML file format is designed in such a way that the condition search engine of POST can automatically accumulate all the conditions that can affect the value stored at a given output address. These features of POST help the users to identify the actual cause of a programming error quickly and reliably. A software interface module has also been developed in order to provide an efficient state-based graphical representation of the rung logic to the users.

Acknowledgement

The authors wish to thank UDMTEK Co., Ltd. for supplying all the required software, tools and PLC programs used in this research work. This work was supported in part by the Ministry of Trade, Industry and Energy (MOTIE), Republic of Korea, under Grant 10051146 and 10065737; in part by the Small and Medium Business Administration, Republic of Korea, under Grant S2408982; and in part by the MOTIE and the Korea Institute for Advancement of Technology, Republic of Korea, under Grant N0001083.

References

[1] Liu, J. & Darabi, H. (2002). Ladder Logic Implementation of Ramadge-Wonham Supervisory Controller. Proceedings of the 6th International Workshop on Discrete Event Systems (WODES’02), Zaragoza, Spain, pp. 383–389.
[2] Du, D., Liu, Y., Guo, X., Yamazaki, K., & Fujishima, M. (2009). Study on LD-VHDL conversion for FPGA-based PLC implementation. The International Journal of Advanced Manufacturing Technology, vol. 40, no. 11-12, pp. 1181–1190.
[3] Bani Younis, M. & Frey, G. (2004). Visualization of PLC Programs Using XML. Proceedings of the American Control Conference (ACC’04), Boston, USA, pp. 3082–3087.
[4] Siemens Simatic Step S7 Software. Website: http://w3.siemens.com/mcms/simatic-controller-software/en/step7/pages/default.aspx, last retrieved on 6th July, 2017.
[5] Allen-Bradley RSLogix Software. Website: http://www.rockwellautomation.com/rockwellsoftware/products/rslogix.page, last retrieved on 6th July, 2017.
[6] Fen, G. & Ning, W. (2006). A Transformation Algorithm of Ladder Diagram into Instruction List Based on AOV Digraph and Binary Tree. Proceedings of the IEEE Region 10 International Conference (TENCON’06), Hong Kong, China, pp. 1–4.
[7] Hu, F., Fu, L., Liu, L., & Zhang, G. (2008). An Algorithm about Transforming PLC Ladder Diagram to Instruction List Based on Series-Parallel Merging Method. Proceedings of the Pacific-Asia Workshop on Computational Intelligence and Industrial Application (PACIIA’08), Wuhan, China, pp. 812–816.
[8] Tan, A. & Ju, C. (2011). The Application of Maze algorithm in Translating Ladder Diagram into Instruction Lists of Programmable Logical Controller. Procedia Engineering, vol. 15, no. 1, pp. 264–268.
[9] Yan, Y. & Zhang, H. (2010). Compiling ladder diagram into instruction list to comply with IEC 61131-3. Computers in Industry, vol. 61, no. 5, pp. 448–462.
[10] Huang, L., Liu, W., & Liu, Z. (2009). Algorithm of transformation from PLC ladder diagram to structured text. Proceedings of the 9th International Conference on Electronic Measurement & Instruments (ICEMI’09), Beijing, China, pp. 4-778–4-782.
[11] Estevez, E., Marcos, M., Iriondo, N., & Orive, D. (2007). Graphical Modelling of PLC-based Industrial Control Applications. Proceedings of the 26th American Control Conference (ACC’07), New York City, USA, pp. 220–225.
[12] Estevez, E., Marcos, M., Orive, D., Irisarri, E., & Lopez, F. (2007). XML based Visualization of the IEC 61131-3 Graphical Languages. Proceedings of the 5th IEEE International Conference on Industrial Informatics (INDIN’07), Vienna, Austria, pp. 279–284.
[13] Estevez, E., Marcos, M., Irisarri, E., Lopez, F., Sarachaga, I., & Burgos, A. (2008). A novel Approach to attain the true reusability of the code between different PLC programming Tools. Proceedings of the 7th IEEE International Workshop on Factory Communication Systems (WFCS’08), Dresden, Germany, pp. 315–322.
[14] Estevez, E., Marcos, M., Orive, D., Lopez, F., Irisarri, E., & Perez, F. (2008). Middleware based on XML technologies for achieving true interoperability between PLC programming tools. Proceedings of the 17th World Congress of the International Federation of Automatic Control (IFAC’08), Seoul, Republic of Korea, pp. 8461–8466.
[15] Marcos, M., Estevez, E., Perez, F., & Der Wal, E. (2009). XML exchange of control programs. IEEE Industrial Electronics Magazine, vol. 3, no. 4, pp. 32–35.
[16] Lopez, F., Irisarri, E., Estevez, E., & Marcos, M. (2008). Graphical representation of factory automation Markup Languages. Proceedings of the 13th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA’08), Hamburg, Germany, pp. 29–32.
[17] Frey, G. & Bani Younis, M. (2004). A Re-Engineering Approach for PLC Programs using Finite Automata and UML. Proceedings of the IEEE International Conference on Information Reuse and Integration (IRI’04), Las Vegas, USA, pp. 24–29.
[18] Bani Younis, M. & Frey, G. (2005). Formalization and Visualization of Non-binary PLC Programs. Proceedings of the 44th IEEE Conference on Decision and Control and European Control Conference (CDC-ECC’05), Seville, Spain, pp. 8367–8372.
[19] Bani Younis, M. & Frey, G. (2006). UML-based approach for the re-engineering of PLC programs. Proceedings of the 32nd Annual Conference on IEEE Industrial Electronics (IECON’06), Paris, France, pp. 3691–3696.
[20] Siemens Simatic Ladder Logic (LAD) for S7-300 and S7-400 Programming (Reference Manual). URL: https://cache.industry.siemens.com/dl/files/822/45523822/att_82001/v1/s7kop__b.pdf, last retrieved on 6th July, 2017.
[21] Siemens Simatic Statement List (STL) for S7-300 and S7-400 Programming (Reference Manual). URL: https://cache.industry.siemens.com/dl/files/446/45523446/att_79269/v1/s7awl__b.pdf, last retrieved on 6th July, 2017.
[22] Graphviz Software. Website: http://www.graphviz.org, last retrieved on 6th July, 2017.
[23] Siemens Simatic System Software for S7-300/400 System and Standard Functions (Volume 1/2, Reference Manual). URL: https://cache.industry.siemens.com/dl/files/574/1214574/att_44504/v1/SFC_e.pdf, last retrieved on 6th July, 2017.