Lusa L et al.: Automated Preparation of the Book of Abstracts using R and LaTeX

Research Paper

Automated Preparation of the Book of Abstracts for Scientific Conferences using R and LaTeX

Lara Lusa, Andrej Blejec

Abstract. The organization of a scientific conference can be a very demanding and time-consuming duty. Two challenging tasks are the preparation of the detailed program and of the book of abstracts. To make these tasks easier to handle, we developed the generbook package, which includes functions written in the R language and a LaTeX template for the book of abstracts. This paper describes the package and how to use it; it also shows how it was used for the organization of an international statistical conference.

Slovenian title: Automatic Preparation of the Book of Abstracts for Scientific Conferences using R and LaTeX

Slovenian abstract. Organizing a scientific conference is a demanding undertaking. Two particularly demanding tasks are the preparation of the conference program and of the book of abstracts. To make them easier to carry out, we created the generbook package in the R environment, which includes several functions in the R language and a template for the book of abstracts for the LaTeX typesetting system. The paper gives detailed instructions for using the package and includes an example of its successful use in the organization of an international statistical conference.

Infor Med Slov: 2009; 14(1-2): 10-18

Authors' institutions: Institute for Biostatistics and Medical Informatics, Faculty of Medicine, University of Ljubljana, Slovenia (LL), National Institute of Biology, Ljubljana, Slovenia (AB).

Contact person: Lara Lusa, Institute for Biostatistics and Medical Informatics, Faculty of Medicine, University of Ljubljana, Vrazov trg 2, Ljubljana, Slovenia. email: lara.lusa@mf.uni-lj.si.

Informatica Medica Slovenica 2009; 14(1-2)

Introduction

The organization of a scientific conference can be a very demanding and time-consuming task.
The organizing committee of a small academic conference is typically a small group of people from the scientific staff of the organizing institution, with limited administrative support and funding. The committee therefore has to handle many different tasks: advertising the conference, preparing the abstract submission form and setting up a method to manage the submissions efficiently, communicating with the conference participants, selecting the contributions that will be presented at the conference, preparing the documents requested by the participants, preparing the detailed scientific program and the book of abstracts, and managing last-minute cancellations and special requests from the participants.

The preparation of the detailed program of the conference and of the book of abstracts can be perceived as tasks that are performed only once during the conference preparation. In practice, the program, and consequently the book of abstracts, changes many times, as participants communicate last-minute cancellations, submit additional contributions, ask to change the schedule of their talks, or ask for corrections to the submitted abstracts. These last-minute modifications are particularly tedious if the program and the book of abstracts are prepared with a word processor that requires the modifications to be made manually. This is the case, for example, with Microsoft Office Word, at least without complex automation using macros. Each small modification to the program or the book of abstracts requires some manual editing, which is not only time-consuming but can also easily introduce errors into the final document.

The book of abstracts usually includes a program overview of the conference, a detailed program, all the abstracts accepted for presentation, and an alphabetical index of the authors. Hereafter, we will refer to all these parts simply as "the book of abstracts".
Three components are needed to produce the book of abstracts automatically:

1. A database that contains the information about each submitted abstract: the names of the authors, the affiliation of each author, the title of the contribution, the abstract text, etc. (contributions database);
2. A file that indicates which contributions will be included in the final program and their schedule (program file);
3. A computer program that automatically produces the final book of abstracts by reading the program from the program file, and the detailed information about the contributions from the contributions database.

If these three components are available, it is easy to produce the book of abstracts, and it is straightforward to update it if needed.

This paper focuses on the automated preparation of the book of abstracts. We describe some functions written for this purpose in the R language¹ and a LaTeX² template for the book of abstracts. These R functions and the LaTeX template were used for the preparation of the annual Applied Statistics International Conference held in September 2009 in Ribno, Slovenia (AS2009 Conference). Hereafter, we will refer to the R functions and the LaTeX template as the generbook package.

The rest of the paper is organized as follows: in the Methods section, we describe the abstract submission management, the R functions for the generation of the book of abstracts and the LaTeX template for the book of abstracts; in Results and Discussion, we show how the developed functions and the template were used for the AS2009 Conference and discuss the issues in the application of such methods for conference management; in the Conclusions, we discuss the advantages of using our approach or similar automated approaches for preparing a book of abstracts.
Methods

Here, we describe how to prepare the files needed to automatically produce the book of abstracts with generbook, i.e., the set of R functions that we wrote for this purpose and the LaTeX template for the book of abstracts. The files are freely available at http://sites.google.com/site/lara3107/Home/software/generbook. To use generbook, users must install R and a program for compiling LaTeX documents. Such programs are available on most Linux distributions, while Windows and Mac OS X users can download freely available ones: MiKTeX for Windows (http://www.miktex.org/) or MacTeX for Mac OS X (http://www.tug.org/mactex/).

Contributions database

The most straightforward way to obtain the contributions database is to handle the submission of the contributions with a form posted on the World Wide Web. Many websites offer freely available tools for developing and posting web forms; some examples are the forms that can be created with Google Docs (http://docs.google.com, using the forms available in the Docs spreadsheet), LimeSurvey (http://www.limesurvey.org) or 1ka (http://www.1ka.com). The database can also be created when more traditional submission methods are used, such as electronic or regular mail. However, this approach is impractical, since the information about each submission has to be retrieved and then added manually to the database.

The generbook package works with a contributions database in which each contribution is one record. The records contain information about the contribution (title and abstract) and about the authors (their names, academic affiliations, contact information, and an indication of which author is going to present the paper at the conference).
To keep the layout of the abstracts in the book of abstracts highly flexible, some information must be split into separate fields. For example, the name of each author is split into three fields: first, middle and last name; similarly, the affiliation of each author is split into institution, city and country. Table 1 reports the fields that must appear in the contributions database for using generbook.

Two fields can be added to the database after the submission is complete:

1. A unique identification number (AbstractID) assigned to each contribution and used in the program file to identify it;
2. A variable indicating whether the contribution was accepted for presentation at the conference (AbstractOK, with value TRUE if the contribution was accepted, FALSE otherwise).

To use the generbook package, the database has to be a text file with the fields delimited by tabs (tab-delimited file). Most databases can easily be exported into this format. The user can choose any name for the fields in the original database, as they do not need to have pre-specified names.

Table 1 Fields included in the contributions database.

Field                   Example
Contribution
  Title                 The Voyages of the Starship USS Enterprise
  Abstract              In this paper we present the voyages of the starship USS Enterprise and its five-year mission: to explore strange new worlds; to seek out new life and new civilizations; to boldly go where no man has gone before.
  Topic 1               Exploration of the Outer Space
  Topic 2               Starships
Author 1
  First Name            James
  Middle Name           T.
  Last Name             Kirk
  Institution           Starfleet
  City                  San Francisco
  Country               U.S.A. - Earth
  e-mail address        JTKirk@starfleet.org
  Presenting (Yes/No)   Yes
Author 2
  First Name            Spock
  Middle Name           (empty)
  Last Name             (empty)
  Institution           Vulcan Academy of Science
  City                  (empty)
  Country               Vulcan
  e-mail address        spock@vulcan.org
AbstractID              101
AbstractOK              TRUE

Note: If more than one author is selected as presenting author, the first one selected is used as the presenting author. After the submission, a unique identification number is assigned to each contribution (AbstractID = 101 in this case). If all the authors have the same affiliation, it is reported only once in the final abstract.

The program file

To use generbook to generate the detailed program of the conference and the book of abstracts, the program of the conference has to be specified in a tab-delimited file in which each record (row) refers to a session. The main pieces of information required are: the name of the session, when and where it will be held, who is going to chair it, and which abstracts are scheduled in the session. Some fields can be left empty if they do not apply. For example, if a record refers to a break, the fields for the chair of the session and for the abstracts remain empty. Table 2 reports the variables that the user can specify for each session.

Table 2 Variables included in the program file.

Variable    Example
Name        Exploration of the Outer Space
Day         1
DayLong     SUNDAY, September 20, 2009
DayShort    Sunday, September 20
DayTable    Sunday
Room        Hall 1
TimeBegin   10.30
TimeEnd     12.30
Abstract1   101
Abstract2   23
Abstract3   21
Abstract4   81
Abstract5   (empty)

Note: The variables refer to a single session of the conference. The day of the session is reported in four different ways: as a number (Day, indicating the order of the conference days: 1 for the first day, etc.), and with a long (DayLong), short (DayShort), or very short (DayTable) denomination. The reason is that different parts of the book of abstracts need different levels of detail.
For example, when preparing the table with the outline of the program we use just the day of the week, while for the detailed program the longer denomination is used.

Additional variables can be added by the user of generbook to the program file or to the contributions database.

Functions written in R

We wrote some functions in R to automate the preparation of parts of the book of abstracts. These functions perform tasks that would be time-consuming if performed manually. Hence, they allow the user to automatically obtain a new book of abstracts each time the program is modified (i.e., changes are made to the program file) or any correction is made to the contributions database.

R function: generate.abstracts()

This R function retrieves pieces of information from the contributions database and generates a separate text file (abstract file) for each contribution. The abstract file contains the information about the contribution that will be included in the book of abstracts: the title, the names of the authors, their affiliations and e-mail addresses, and the text of the abstract.

To use the generate.abstracts() function, the user needs to specify: where the contributions database is located, in which directory to save the abstract files, and which columns of the contributions database contain the information relevant for the generation of the abstract files (i.e., which columns contain the identification number of the abstract, the title of the contribution, the text of the abstract, the names of the authors, their affiliations and e-mails, etc.). These pieces of information are the arguments of the generate.abstracts() function.
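The step described above can be illustrated with a minimal sketch in R. This is not the generbook implementation: the helper `make_abstract_tex()`, the column names and the single-author record are assumptions made for illustration only. It shows the general idea of reading one record and writing one `.tex` file named after the AbstractID, with the fields wrapped in a five-argument `\A` command.

```r
# Hypothetical helper (not part of generbook): format one record of the
# contributions database as a call to a five-argument \A command.
make_abstract_tex <- function(rec) {
  # Join first, middle and last name, skipping empty fields
  name_parts <- c(rec$FirstName, rec$MiddleName, rec$LastName)
  full_name  <- paste(name_parts[nzchar(name_parts)], collapse = " ")
  # Join institution, city and country in the same way
  affil_parts <- c(rec$Institution, rec$City, rec$Country)
  affiliation <- paste(affil_parts[nzchar(affil_parts)], collapse = ", ")
  paste0(
    "\\A\n",
    "{", rec$Title, "}\n",
    "{\\Presenting{", full_name, "}}\n",
    "{\\Affiliation{", affiliation, "}; \\Email{", rec$Email, "}}\n",
    "{Abstract ID: ", rec$AbstractID, "}\n",
    "{", rec$Abstract, "}\n"
  )
}

# One-record stand-in for the tab-delimited contributions database; a real
# file could be read with read.delim(path, stringsAsFactors = FALSE).
db <- data.frame(
  AbstractID = 101,
  Title = "The Voyages of the Starship USS Enterprise",
  Abstract = "In this paper we present the voyages of the starship.",
  FirstName = "James", MiddleName = "T.", LastName = "Kirk",
  Institution = "Starfleet", City = "San Francisco", Country = "Earth",
  Email = "JTKirk@starfleet.org",
  stringsAsFactors = FALSE
)

# Write one abstract file per record, named <AbstractID>.tex
out_dir <- tempdir()
for (i in seq_len(nrow(db))) {
  rec <- db[i, ]
  out_file <- file.path(out_dir, paste0(rec$AbstractID, ".tex"))
  writeLines(make_abstract_tex(rec), out_file)
}
```

Keeping the formatting inside LaTeX commands rather than in the R code means that a sketch like this only decides which field goes into which argument; the visual style stays in the template.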
As an example, we show the content of the abstract file that would be generated for the contribution described in Table 1 (file 101.tex), and in Figure 1 (upper panel) the resulting abstract as it would appear in the book of abstracts.

    \A
    {The Voyages of the Starship USS Enterprise}
    {\Presenting{James T. Kirk}$^1$\index{Kirk, JT} and Spock$^2$\index{Spock}}
    {\Affiliation{$^1$Starfleet, San Francisco, Earth}; \Email{JTKirk@starfleet.org}
     \Affiliation{$^2$Vulcan Academy of Science, Vulcan}; \Email{spock@vulcan.org}}
    {Topic1: Exploration of the Outer Space, Topic2: Starships. Abstract ID: 101}
    {In this paper we present the voyages of the starship USS Enterprise and its five-year mission: to explore strange new worlds; to seek out new life and new civilizations; to boldly go where no man has gone before.}

In this abstract file, we use some newly defined LaTeX commands (\A, \Presenting, \Affiliation and \Email). The purpose was to keep the style of the abstracts highly flexible. The new commands are defined in the preamble of the LaTeX template for the book of abstracts (see below the description of Book.tex) and can be easily modified by the users of generbook. The main new command is \A, which specifies the formatting style of the abstracts, taking as arguments: the title (#1), the names of the authors (#2), the affiliations and e-mails of the authors (#3), the keywords (#4) and the text of the abstract (#5). \A is defined as follows:

    \newcommand{\A}[5]{
    \begin{minipage}{\textwidth}
    \Title{#1}
    \Author{#2}
    \AffiliationAndEmail{#3}
    \Keyword{#4}
    \Abstract{#5}
    \end{minipage}
    }

The formatting of each of the arguments is further specified by five newly defined LaTeX commands (\Title, \Author, \AffiliationAndEmail, \Keyword, and \Abstract). The definitions of these commands can be found in the LaTeX template for the book of abstracts included in the generbook package (Book.tex).
A simple example of the flexibility of this approach can be seen by comparing the upper and lower panels of Figure 1. To obtain the abstract reported in the lower panel, we modified the \Title command, using italic fonts instead of bold fonts; the \Author command, removing the centering and the italic fonts; and the \Presenting command, indicating the presenting author with an asterisk instead of underlining their name.

Figure 1 The contribution described in Table 1 as it would appear in the book of abstracts, using the default settings of generbook (upper panel) and redefining some of the LaTeX commands (lower panel, obtained by modifying the \Author, \Title and \Presenting commands in the LaTeX template file).

These modifications required changing only a few lines of code in the preamble of the LaTeX template of the book of abstracts (Book.tex). Had we not used this approach, a user interested in changing the style of the abstracts would have had to change the code of the R function, which would have been more complicated. Even more work would have been required to introduce these changes in a manually edited book of abstracts.

R function: generate.programOverview()

This R function uses the program file and produces the LaTeX input file (programOverview.tex) that contains the program overview of the conference. The program overview is a table that reports the names and times of the sessions, but not the list of the abstracts that are presented in each session. An example of a program overview table produced with generbook is shown in Figure 2.

Figure 2 Example of a program overview table generated with generbook.
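The idea behind such a function can be sketched in R as follows. This is an illustrative simplification, not the generbook code: the function `overview_rows()`, the single-column table layout and the sample sessions are assumptions, while the column names follow Table 2. Times are stored as character strings so that a value such as "11.00" is not reduced to the number 11; when reading a real program file with `read.delim()`, this could be requested with `colClasses = "character"`.

```r
# Stand-in for a program file with three sessions on the first day
program <- data.frame(
  Name = c("Opening of the Conference",
           "Exploration of the Outer Space",
           "Lunch"),
  Day = c(1, 1, 1),
  DayTable = "Sunday",
  Room = c("Hall 1", "Hall 1", ""),
  TimeBegin = c("11.00", "12.20", "13.40"),
  TimeEnd = c("11.10", "13.40", "15.00"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one LaTeX table row ("time & session \\") per session
overview_rows <- function(program, day) {
  sessions <- program[program$Day == day, ]
  paste0(sessions$TimeBegin, "--", sessions$TimeEnd, " & ",
         sessions$Name, " \\\\")
}

# Assemble a two-column tabular headed by the very short day name
overview <- c(
  "\\begin{tabular}{ll}",
  paste0("\\multicolumn{2}{l}{\\textbf{", program$DayTable[1], "}} \\\\"),
  overview_rows(program, day = 1),
  "\\end{tabular}"
)
writeLines(overview, file.path(tempdir(), "programOverview.tex"))
```

A table with one column per room would need the sessions to be grouped by time slot first; that step is omitted here to keep the sketch short.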
R function: generate.program()

This R function uses the contributions database and the program file, and produces two files that are necessary to generate the book of abstracts:

• program.tex: the LaTeX input file with the detailed program of the conference, i.e., the list, for each session, of the titles of the contributions and the names of the authors;
• abstracts.tex: the LaTeX input file for the part of the book of abstracts that contains the abstracts. In our current implementation of generbook, the abstracts appear in the same order as in the detailed program and each session is separated from the others; the names of the sessions and the dates appear in the header of the document (see the header in Figure 1 for an example). This is obtained by redefining the header commands in the document for each new session.

LaTeX template for the book of abstracts

The LaTeX template file for generating the book of abstracts is called Book.tex and can be downloaded from the web site of the generbook project. The template file can be used as the basis for any book of abstracts, and it requires few manual modifications to adapt its style to the user's preferences.

As mentioned in the description of the R functions, we defined some new LaTeX commands in the preamble of the Book.tex file. These commands define the style that is used for the abstracts (see the description of \A in the "Functions written in R" section), and the page style and layout of the book of abstracts.

After the preamble, the Book.tex document is very simple, as it uses the LaTeX \input command to read and process files that were either already generated by the generbook package (programOverview.tex, program.tex and abstracts.tex) or previously prepared by the user.
The user needs to prepare the LaTeX document containing the front matter of the book of abstracts (cover.tex file), i.e., the title page, the edition notice, the page with the names of the members of the scientific and organizing committee, etc. An example of the cover.tex file is included in the generbook package and can be adapted. We prepared some additional LaTeX files that can be useful for completing the book of abstracts: the notes.tex file that specifies a page of the book of abstracts devoted to notes, the empty.tex file containing an empty page, and the sponsor.tex file that contains the information about the sponsors of the conference.

A simplified version of the Book.tex document contains the following commands (comments to the code are preceded by the % symbol):

    % includes the front matter of the book
    \input{cover.tex}
    % includes the program overview
    \input{programOverview.tex}
    % includes the detailed program
    \input{program.tex}
    % includes all the abstracts to be presented
    % at the conference
    \input{abstracts.tex}
    % makes the index of authors
    \printindex
    % includes the pages for taking notes
    \input{notes.tex}
    % includes the information about the sponsoring
    % of the conference
    \input{sponsors.tex}

Figure 3 summarizes the use of the generbook package. It shows the files that the user needs to prepare for using generbook (the files represented in solid boxes), the files that can be prepared using the R functions of generbook (in dashed boxes), and which files are included in other LaTeX files (indicated with dashed arrows). The final document in PDF form is obtained by compiling the LaTeX source file Book.tex (e.g., using the pdflatex program, which is included in the Windows distribution of MiKTeX).

Results and Discussion

We used the generbook package to generate the book of abstracts of the AS2009 Conference. The book of abstracts was printed and published on-line.³ We handled the submission of the contributions using the forms available in Google Documents.
The preparation and publication of the form is straightforward, and it does not require any programming knowledge. Our form allowed at most six authors per contribution. The submission form that was used can be seen at http://spreadsheets.google.com/viewform?hl=en&formkey=cmhvV0FPYk5CNk9oNGlwbzF6TlIzdlE6MA (note that submissions are closed).

Figure 3 Use of the generbook package. The files represented in boxes with solid lines are those that must be prepared by the user, while those in boxes with dashed lines are produced by using the generbook package (as indicated by solid arrows, listing the function or program used to generate the file). Dashed arrows represent the connections between a LaTeX file and the input files that are included in it.

The spreadsheet with the contributions was exported into a tab-delimited text file and a unique identification number was assigned to each contributed abstract (AbstractID); this number was communicated to the authors and used for all further communications with them. The abstract files were generated using the generate.abstracts() function. A few manual corrections were needed. The most common problem was the use of symbols that were misinterpreted when included in a LaTeX document (for example, the symbol "%" marks the beginning of a comment in LaTeX, so it had to be replaced with "\%").

We prepared a temporary version of the book of abstracts that included all the contributions, and we used this book for selecting the contributions for the conference. This temporary book was very helpful in the selection process, as it provided a clear and organized display of the abstracts. The decision about each contribution was recorded in the contributions database by defining a new variable (AbstractOK).
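The manual replacement of symbols such as "%" could in principle be automated before the abstract files are written. The following R sketch shows one way to do it; the helper `escape_latex()` is hypothetical and is not part of generbook, and the list of characters is deliberately partial. The dollar sign is left untouched on the assumption that authors may use it intentionally for mathematics, and backslashes are not handled.

```r
# Hypothetical helper (not part of generbook): prefix characters that LaTeX
# treats specially with a backslash so that they are printed literally.
escape_latex <- function(x) {
  for (ch in c("%", "&", "#", "_")) {
    # fixed = TRUE makes both pattern and replacement literal strings
    x <- gsub(ch, paste0("\\", ch), x, fixed = TRUE)
  }
  x
}

# Example: escape the abstract text of every record before writing the files
abstracts <- c("Accuracy improved by 5% in R&D", "No special symbols here")
escaped <- escape_latex(abstracts)
```

Applying such a function to every text field would not remove the need for a visual check of the compiled book, since submitted abstracts can also contain characters in encodings that LaTeX does not accept.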
The program was specified by preparing a program file as described above, and the abstracts that were selected for each session were identified by their AbstractID number. We used the R functions of generbook to create the LaTeX documents containing the program overview, the final detailed program and the complete abstracts, in the same order as they appeared in the program.

The style of the book was defined in the preamble of the LaTeX template and it was easy to modify. Most importantly, last-minute changes to the program were handled easily: the only required manual changes were those to the program file. Everything else was produced automatically: the new LaTeX input files were generated using the R functions, and the LaTeX file of the book of abstracts was recompiled. In this way, the book of abstracts was updated in all its parts. The final book of abstracts can be viewed on-line.³

Manually propagating changes of the program to the book of abstracts can be very cumbersome and can easily introduce errors into the final document. For example, a simple change like switching two sessions would, if handled manually, require modifying the following parts of the book of abstracts: the content of the program overview table, the detailed program, the order in which the complete abstracts appear in the book, and the page references in the index of authors.

Conclusions

In this paper, we presented the generbook package, a freely available set of R functions and LaTeX templates that can be used to generate the book of abstracts of scientific conferences. The package also provides some simple tools for managing the conference program. In our experience, generbook proved to be a valuable tool in the organization of the AS2009 Conference, reducing the tediousness of manually updating the files and the probability of making errors.
The system has proven to be sufficiently robust yet versatile. The package is relatively easy to use for anyone with basic knowledge of R and LaTeX. Although some commercial alternatives to our system exist, the presented package has the advantage of being freely available (open source), and features flexibility rarely found in other systems. Our package can be seen in the framework of reproducible computing, as it provides a reproducible solution for the preparation of the book of abstracts.

References

1. R Development Core Team: R: A Language and Environment for Statistical Computing. Vienna, Austria 2009: R Foundation for Statistical Computing. http://www.R-project.org
2. Lamport L: LaTeX: A document preparation system: User's guide and reference (2nd ed.). Reading 1994: Addison-Wesley.
3. Stare J, Lusa L (eds.): International Conference Applied Statistics: Program and abstracts (electronic ed.). Ljubljana 2009: Statistical Society of Slovenia. http://conferences.nib.si/AS2009/AS2009-Abstracts.pdf