Metodološki zvezki, Vol. 4, No. 1, 2007, 71-82

Use of Trellis Graphics in the Analysis of Results from Field Experiments in Agriculture

Katarina Čobanović, Emilija Nikolić-Đorić, and Beba Mutavdžić¹

Abstract

Trellis graphics (Becker, Cleveland, and Shyu, 1996) are a very effective method for visualizing multidimensional data sets. The basic idea behind trellis graphics is to display any of a large variety of 1-D, 2-D or 3-D statistical plot types in a trellis layout of panels, where each panel displays a subset of the data for different values of one or more additional discrete or continuous conditioning variables. The data that we use to illustrate different applications of trellis graphics are the results of a field experiment conducted at the Institute for Field and Vegetable Crops in Novi Sad in the period 1994-1998 (Čobanović et al., 2001) with three fertilizers (nitrogen, phosphorus and potassium) in three repetitions with nine varieties of wheat. In the experiment, four quantities of each fertilizer were applied (0, 50, 100, 150 kg/ha) on plots of the same size in 20 of the 64 possible combinations, and the yield of wheat (t/ha) was the measured outcome.

1 Introduction

Modern computer technology has changed the way statistical analysis and summary are done. In particular, graphical methods of analysis and summary now play a more important role and deserve increased emphasis in scientific reports (Cleveland, 1993). Statistical graphics generally have two major functions: analysis and presentation, the latter traditionally being their primary function. With the development of computer technology and the intensive use of computer software, the analysis function of statistical graphics has assumed increasing importance (Schmid, 1983). Statistical graphics have recently become a very attractive means of visual communication and analysis.
¹ Faculty of Agriculture, University of Novi Sad, 21000 Novi Sad, Trg Dositeja Obradovića 8, Serbia; katcob@polj.ns.ac.yu

Graphical methods have a central role in Exploratory Data Analysis (EDA), the approach to data analysis introduced by John Tukey (1977). As the human brain is very powerful in processing visual information, sometimes a single glance at a graph is enough to identify a very complex data structure and the relations between the variables. Cleveland and his associates at Bell Laboratories did a lot of research on the theory of graphical perception. They focused on the psychophysical aspects of human graphical processing. Appropriately drawn graphs may help in understanding the data and in presenting the results to others. On the other hand, graphs that are not properly drawn can lead to wrong conclusions (Tufte, 2001).

Statistical graphics are used both in teaching statistics and in research (Nikolić-Đorić et al., 2006). Data analysis based on graphics is the first step in a great many statistical investigations.

Trellis graphics were introduced by Becker, Cleveland, and Shyu (1996). The basic idea behind trellis graphics is to display any of a large variety of 1-D, 2-D or 3-D statistical plot types in a trellis layout of panels, where each panel displays a subset of the data for different values of one or more additional discrete or continuous conditioning variables. The name comes from the arrangement of the plots, which looks like a trellis (grid, lattice); in gardening, a trellis is an open structure used as a support for vines.

One important application of a trellis display is uncovering the structure of multivariate data and the relations between the variables in multivariable data sets (Becker, Cleveland, and Shyu, 1996). A trellis display can lead to important discoveries not found in the initial analysis.
By comparing the conditioned panels on the same scale it is possible not just to explore whether a relationship between the variables exists, but also whether it holds for all levels of the conditioning variable.

Although trellis display was initially developed in the context of large data sets, it is also useful for modelling data from designed experiments, even small ones, and it is a very powerful tool for revealing the structure of interactions in studies of how a response depends on explanatory variables (Cleveland and Fuentes, 1997).

One of the first applications of the trellis diagram was to present the yield data of ten varieties of barley in an experiment arranged in randomized blocks, carried out at six locations in the State of Minnesota in the years 1930 and 1931 (Fisher, 1966). The trellis display led to the conclusion that the data are in error.

2 Software for Trellis

A trellis display was implemented in the S/S-Plus system (Becker and Cleveland, 1996). The special package "lattice" for producing trellis graphs was developed for the R language. Trellis display was also implemented in the statistical packages GENSTAT 8 (VSN International, 2005) and STATISTICA (StatSoft Inc.), where it was named a categorized graph, a term first used in 1990. The S-Plus Clinical Pack, developed by Insightful Corporation, enables the easy integration of S-Plus graphics within the SAS environment.

2.1 Software for Trellis

A trellis display consists of panels laid out in a three-way rectangular array of columns, rows and pages. For small data sets one page is usually enough. For larger data sets, multi-page layouts are necessary to present the whole data set. By default, the panels of a trellis display are ordered left-to-right and bottom-to-top, but the user may change this ordering. Table ordering, for example, is left-to-right and top-to-bottom.
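The difference between the two orderings can be made concrete with a small sketch. The Python function below is purely illustrative: the name `panel_position` and its arguments are assumptions introduced here and are not part of S-Plus or R. It maps a 1-based panel number to a grid cell under either ordering.

```python
def panel_position(panel, n_cols, n_rows, as_table=False):
    """Return (row_from_top, column), both 0-based, for a 1-based panel number.

    Hypothetical helper: as_table=False mimics the trellis default
    (left-to-right, bottom-to-top); as_table=True mimics table ordering
    (left-to-right, top-to-bottom).
    """
    if not 1 <= panel <= n_cols * n_rows:
        raise ValueError("panel number outside the layout")
    row_filled, column = divmod(panel - 1, n_cols)  # rows counted in fill order
    if as_table:
        return row_filled, column
    return n_rows - 1 - row_filled, column


# A page with 5 columns and 4 rows, holding 20 panels.
print(panel_position(1, 5, 4))                 # first panel, trellis default
print(panel_position(1, 5, 4, as_table=True))  # first panel, table ordering
print(panel_position(20, 5, 4))                # last panel, trellis default
```

Under the default ordering the first panel lands in the bottom-left cell, whereas table ordering places it in the top-left cell.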
The strip labels written at the tops of the panels indicate the conditioning (slicing) variable and its ranges, values or levels, depending on whether it is a numerical (continuous, discrete) or categorical variable.

A trellis display may be created by a single command line. It is based on repeating the same graphical specification for each element in a Cartesian product of the levels of one or more factors. The formulas in the S-Plus and R trellis libraries have the structure:

Y ~ X | a*b

where Y is a continuous variable, X is a continuous variable or factor, and a and b are levels of factors, variables or functions of the fitted model.

A scatter-plot matrix also consists of panels that are defined by a Cartesian product of variables. Each panel is a plot of a different pair of variables, but it is based on the entire set of observations and not on a subset, as in the case of a trellis display. In S-Plus and R it is possible to present a scatter-plot matrix conditioned on the values of a relevant variable, i.e. in trellis form.

The very flexible object-oriented S-Plus and R languages make it possible to control the display so that it presents the maximum information in the data. It is also possible to define the aspect ratio, multi-panel layout, plotting symbols, lines, colours and character sizes.

3 Experimental data

The data that we use to illustrate different applications of trellis graphics are the results of a field experiment conducted at the Institute for Field and Vegetable Crops in Novi Sad in the period 1994-1998 (Čobanović et al., 2001) with three fertilizers (nitrogen, phosphorus and potassium) in three repetitions with nine varieties of wheat. In the experiment, four quantities of each fertilizer were applied (0, 50, 100, 150 kg/ha) on plots of the same size in 20 of the 64 possible combinations (Table 1), and the yield of wheat (t/ha) was the measured outcome.
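The figure of 64 possible combinations follows from crossing four levels of each of the three fertilizers (4 × 4 × 4). The Python sketch below is an illustration, not part of the original analysis: it enumerates the full factorial, checks that the 20 variants listed in Table 1 are distinct members of it, and groups them by nitrogen level in the way a conditioning variable would split them into panels. All variable names are assumptions introduced here.

```python
from collections import defaultdict
from itertools import product

LEVELS = (0, 50, 100, 150)  # kg/ha, the four quantities of each fertilizer

# (N, P, K) for variants 1-20, transcribed from Table 1.
VARIANTS = [
    (0, 0, 0), (100, 0, 0), (0, 100, 0), (0, 0, 100), (100, 100, 0),
    (100, 0, 100), (0, 100, 100), (50, 50, 50), (50, 100, 50), (50, 100, 100),
    (100, 50, 50), (100, 100, 50), (100, 100, 100), (100, 150, 100),
    (100, 150, 150), (150, 50, 50), (150, 100, 50), (150, 100, 100),
    (150, 150, 100), (150, 150, 150),
]

full_factorial = set(product(LEVELS, repeat=3))  # every possible (N, P, K)
used = set(VARIANTS)

# Conditioning on N: each nitrogen level collects its own subset of variants.
by_nitrogen = defaultdict(list)
for number, (n, p, k) in enumerate(VARIANTS, start=1):
    by_nitrogen[n].append(number)

print(len(full_factorial), len(used), used <= full_factorial)
for level in LEVELS:
    print(level, by_nitrogen[level])
```

Grouping by a second factor as well (for example N and P together) would correspond to conditioning on a Cartesian product of levels, with one panel per occupied cell.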
Table 1: Combinations of fertilizer applied in the experiment (kg/ha).

Variant     N     P     K
 1          0     0     0
 2        100     0     0
 3          0   100     0
 4          0     0   100
 5        100   100     0
 6        100     0   100
 7          0   100   100
 8         50    50    50
 9         50   100    50
10         50   100   100
11        100    50    50
12        100   100    50
13        100   100   100
14        100   150   100
15        100   150   150
16        150    50    50
17        150   100    50
18        150   100   100
19        150   150   100
20        150   150   150

4 Discussion

In the analysis of the results from the field experiment, several univariate, bivariate and high-dimensional trellis displays were applied. For the same type of trellis display, various partitionings of the data were also made in order to focus on different features of the data.

In order to explore the characteristics of the data distribution, one-dimensional trellis displays were applied: trellis dot-plot (Figure 1), box-plots (Figure 2), histograms with probability density plot (Figure 3), and quantile plots (Figure 4). With these diagrams the subpopulation structure, the distribution shape and the presence of outliers may be quickly revealed.

Figure 1 illustrates that the highest wheat yield was in 1994/95 and the lowest in 1996/97, regardless of the variety and combination of fertilizers.

The trellis notched box-plots (Figure 2) show that Pobeda had the highest median wheat yield, while the yield of the variety Lasta had the highest variability. The notches drawn about the median form an approximate 95% confidence interval for the median and extend to $\pm 1.58 \cdot \mathrm{IQR}/\sqrt{n}$, where IQR is the interquartile range and n is the number of observations. On this basis it is possible to make multiple comparisons of the median yields: notches that do not overlap indicate a significant difference between the medians.
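The notch rule can be sketched in a few lines. The Python code below is a minimal illustration under stated assumptions, not the routine used by S-Plus or R: the function names are hypothetical, the quartiles are computed with the "inclusive" method of the standard library, and the input numbers are made up rather than taken from the experiment.

```python
import math
import statistics


def notch_interval(values):
    """Approximate 95% interval for the median: median +/- 1.58 * IQR / sqrt(n).

    The quartile rule (statistics.quantiles, method="inclusive") is an
    assumption; other software may compute the IQR slightly differently.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(len(values))
    centre = statistics.median(values)
    return centre - half_width, centre + half_width


def notches_overlap(a, b):
    """True when the notch intervals of two samples overlap."""
    lo_a, hi_a = notch_interval(a)
    lo_b, hi_b = notch_interval(b)
    return lo_a <= hi_b and lo_b <= hi_a


# Made-up numbers for illustration only, not yields from the experiment.
sample_a = [1, 2, 3, 4, 5]
sample_b = [11, 12, 13, 14, 15]
print(notch_interval(sample_a))
print(notches_overlap(sample_a, sample_b))
```

Comparing every pair of varieties with `notches_overlap` would mirror the visual multiple comparison that the notched box-plots allow.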
[Figure 1 (trellis dot-plot): panels numbered 1-20 (fertilizer variants); rows labelled by variety: POBEDA, NOVRANA, LASTA, ITALIJA, EVROPA, BALKAN, ZVEZDA, RANANIS, PROTEINK; legend: seasons 1994/5, 1995/6, 1996/7, 1997/8; horizontal axis ticks 2-10.]