UNIVERSITY OF LJUBLJANA
BIOTECHNICAL FACULTY

Leon DEUTSCH

BIOINFORMATICS INTEGRATION OF MICROBIOME AND METABOLOMICS DATA IN THE TRANSLATIONAL CONTEXT

DOCTORAL DISSERTATION

Ljubljana, 2022

“It is better to fail aiming high than to succeed aiming low.” – Bill Nicholson

Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022

Based on the Statute of the University of Ljubljana and the decision of the Biotechnical Faculty senate, as well as the decision of the Commission for Doctoral Studies of the University of Ljubljana adopted on 15.02.2021, it has been confirmed that the candidate meets the requirements for pursuing a PhD in the interdisciplinary doctoral programme in Biosciences, Scientific Field Bioinformatics. Prof. Dr. Blaž Stres was appointed as supervisor.

Na podlagi Statuta Univerze v Ljubljani ter po sklepu Senata Biotehniške fakultete in sklepu Komisije za doktorski študij Univerze v Ljubljani z dne 15.2.2021 je bilo potrjeno, da kandidat izpolnjuje pogoje za opravljanje doktorata znanosti na Interdisciplinarnem doktorskem študijskem programu Bioznanosti znanstveno področje bioinformatika. Za mentorja je bil imenovan prof. dr. Blaž Stres.

Supervisor (mentor): Prof. Blaž STRES, PhD, University of Ljubljana, Biotechnical Faculty, Faculty of Civil and Geodetic Engineering, Jožef Stefan Institute, Slovenia

Committee for the evaluation and the defense (Komisija za oceno in zagovor):
Chair (predsednik): Prof. Andrej BLEJEC, PhD, University of Ljubljana, Biotechnical Faculty, Department of Biology, Slovenia
Member (član): Prof.
Gregor ANDERLUH, PhD, National Institute of Chemistry, Department for Molecular Biology and Nanobiotechnology, Slovenia
Member (član): Assoc. Prof. David GOMEZ CABRERO, PhD, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia; Translational Bioinformatics Unit at Navarra biomed, University of Valencia, Valencia, Spain

Date of the defense (datum zagovora): 21.11.2022

Leon DEUTSCH

KEY WORDS DOCUMENTATION

ND Dd
DC UDC 579:004(043.3)
CX bioinformatics, metabolomics, metagenomics, physical inactivity, data integration, systems biology, microbiome, urine, nuclear magnetic resonance
AU DEUTSCH, Leon
AA STRES, Blaž (supervisor)
PP SI-1000 Ljubljana, Jamnikarjeva 101
PB University of Ljubljana, Biotechnical Faculty, Interdisciplinary Doctoral Programme in Biosciences, Scientific field Bioinformatics
PY 2022
TY BIOINFORMATICS INTEGRATION OF MICROBIOME AND METABOLOMICS DATA IN THE TRANSLATIONAL CONTEXT
DT Doctoral Dissertation
NO XI, 205 p., 2 tab., 18 fig., 323 ref.
LA EN
AL en/sl
AB Human metabolism was studied in three different projects focusing on different levels of inactivity. Urine, liquor and serum metabolomics were used to assess the impact of nusinersen treatment in patients with spinal muscular atrophy. Urine samples were contrasted with samples from a matched healthy cohort. In the PreTerm project, metabolomics (fecal and urine) and fecal microbial metagenomics were used to assess the differences between preterm and full-term born adults. In the X-Adapt project, urinary metabolomics was used to evaluate the 10-day training regime and the differences between trained and untrained individuals.
In all projects, classification models based on different data sets were developed as a proof of principle and to foster their use in future studies or possibly in medical diagnostics. In addition, two workflows (the GUMPP and MAGO tools) and a method to study physicochemical parameters (minimum pressure of piercing strength) were developed to study the microbiome and its environment, respectively. The final work resulted in the creation of the first Slovenian urine 1H-NMR database, which consists of 1200 urine metabolomes from different projects (PlanHab, Spinal Muscular Atrophy, X-Adapt, PreTerm, Healthy males and females) measured by 1H-NMR, providing a baseline for future extensions. The entire database can be used to build machine-learning models for classification between different diseases or levels of physical activity at a national level.

KLJUČNA DOKUMENTACIJSKA INFORMACIJA

ŠD Dd
DK UDK 579:004(043.4)
KG bioinformatika, metabolomika, metagenomika, fizikalna inaktivnost, integracija podatkov, sistemska biologija, mikrobiom, urin, jedrska magnetna resonanca
AV DEUTSCH, Leon
SA STRES, Blaž (mentor)
KZ SI-1000 Ljubljana, Jamnikarjeva 101
ZA Univerza v Ljubljani, Biotehniška fakulteta, Interdisciplinarni doktorski študij Bioznanosti, Znanstveno področje Bioinformatika
LI 2022
IN BIOINFORMACIJSKA INTEGRACIJA MIKROBIOMSKIH IN METABOLOMSKIH PODATKOV V TRANSLACIJSKEM KONTEKSTU
TD Doktorska disertacija
OP XI, 205 str., 2 pregl., 18 sl., 323 vir.
IJ sl
JI sl/en
AI V okviru večih projektov smo preučevali metabolome preiskovancev z različnimi stopnjami neaktivnosti. Za oceno učinka zdravljenja z zdravilom nusinersen pri bolnikih s spinalno mišično atrofijo smo uporabili metabolomiko urina, likvorja in seruma. Dodatne vzorce urina smo primerjali s tistimi iz zdrave kontrolne skupine.
V projektu PreTerm smo s kombinacijo metabolomike fecesa in urina ter z metagenomiko fekalnega mikrobioma raziskovali razlike med predčasno in pravočasno rojenimi odraslimi. V projektu X-Adapt smo z metabolomiko urina ocenili učinke 10-dnevnega režima treninga in razlik med treniranimi in netreniranimi posamezniki. V vseh projektih smo na zbranih podatkih razvili modele za razvrščanje skupin z namenom razvoja analitskih poti ter prikaza možnosti uporabe teh modelov v prihodnjih študijah ali morda v medicinski diagnostiki. Poleg tega sta bila razvita dva cevovoda (orodje GUMPP in MAGO) ter metoda za preučevanje fizikalno-kemijskih parametrov (minimalna prebodna sila) za preučevanje mikrobioma in njegovega okolja. Končni rezultat analize je bila izdelava prve slovenske metabolomske baze podatkov 1H-NMR urina, ki jo sestavlja 1200 metabolomov urina iz različnih projektov (PlanHab, Spinalna mišična atrofija, X-Adapt, PreTerm, Zdravi moški in ženske), merjenih z 1H-NMR, ki predstavljajo osnovo za prihodnje razširitve iz novih projektov. Celotno bazo podatkov je mogoče uporabiti za gradnjo modelov strojnega učenja za razvrščanje med različnimi boleznimi ali stopnjami telesne aktivnosti na nacionalni ravni.

TABLE OF CONTENTS

KEY WORDS DOCUMENTATION
KLJUČNA DOKUMENTACIJSKA INFORMACIJA
TABLE OF CONTENTS
TABLE OF CONTENTS OF SCIENTIFIC WORKS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS AND SYMBOLS
1 INTRODUCTION
1.1 HUMAN SYSTEMS BIOLOGY AND HEALTH
1.1.1 Metabolomics
1.1.2 Microbial metagenomics
1.1.3 Analysis of data
1.2 INACTIVITY
1.3 PURPOSE OF THE RESEARCH
1.4 HYPOTHESES
1.4.1 PreTerm related
1.4.2 SMA related
1.4.3 Hypotheses of merged dataset
2 SCIENTIFIC WORKS
2.1 PUBLISHED SCIENTIFIC WORKS
2.1.1 Computational framework for high-quality production and large-scale evolutionary analysis of metagenome assembled genomes
2.1.2 General unified microbiome profiling pipeline (GUMPP) for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data to predicted microbial metagenomes, enzymatic reactions and metabolic pathways
2.1.3 Spinal muscular atrophy after nusinersen therapy: improved physiology in pediatric patients with no significant change in urine, serum, and liquor 1H-NMR metabolomes in comparison to an age-matched, healthy cohort
2.1.4 The importance of objective stool classification in fecal 1H-NMR metabolomics: exponential increase in stool crosslinking is mirrored in systemic inflammation and associated to fecal acetate and methionine
2.1.5 Systems view of deconditioning during spaceflight simulation in the PlanHab project: the departure of urine 1H-NMR metabolomes from healthy state in young males subjected to bedrest inactivity and hypoxia
2.1.6 Exercise and interorgan communication: short-term exercise training blunts differences in consecutive daily urine 1H-NMR metabolomic signatures between physically active and inactive individuals
2.1.7 Urine and fecal 1H-NMR metabolomes differ significantly between pre-term and full-term born physically fit healthy adult males
2.2 ADDITIONAL SCIENTIFIC WORK
2.2.1 Metagenomes assembled genomes from the PreTerm project
2.2.1.1 Introduction
2.2.1.2 Materials and methods
2.2.1.3 Results
2.2.1.4 Discussion
2.2.2 Data integration
2.2.2.1 Introduction
2.2.2.2 Materials and methods
2.2.2.3 Results
2.2.2.4 Discussion
3 DISCUSSION AND CONCLUSIONS
3.1 DISCUSSION
3.1.1 Developed tools for data integration
3.1.2 Physicochemical characteristics of microbial world in the gut
3.1.3 Metabolomics in the PlanHab study
3.1.4 Spinal muscular atrophy
3.1.5 X-Adapt project – the influence of short-term training on inactive individuals
3.1.6 Metabolomes and microbial metagenomes can distinguish preterm and full-term born adults
3.1.7 Data integration
3.1.8 What about the future?
3.2 CONCLUSIONS
4 SUMMARY (POVZETEK)
4.1 SUMMARY
4.2 POVZETEK
5 REFERENCES
ACKNOWLEDGEMENTS

TABLE OF CONTENTS OF SCIENTIFIC WORKS

Murovec B., Deutsch L., Stres B. 2019. Computational framework for high-quality production and large-scale evolutionary analysis of metagenome assembled genomes.
Molecular Biology and Evolution, 37, 2: 593-598

Murovec B., Deutsch L., Stres B. 2021. General unified microbiome profiling pipeline (GUMPP) for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data to predicted microbial metagenomes, enzymatic reactions and metabolic pathways. Metabolites, 11, 6: 336, doi: https://doi.org/10.3390/metabo11060336, 14 p.

Deutsch L., Osredkar D., Plavec J., Stres B. 2021. Spinal muscular atrophy after nusinersen therapy: improved physiology in pediatric patients with no significant change in urine, serum, and liquor 1H-NMR metabolomes in comparison to an age-matched, healthy cohort. Metabolites, 11, 4: 206, doi: https://doi.org/10.3390/metabo11040206, 15 p.

Deutsch L., Stres B. 2021. The importance of objective stool classification in fecal 1H-NMR metabolomics: exponential increase in stool crosslinking is mirrored in systemic inflammation and associated to fecal acetate and methionine. Metabolites, 11, 3: 172, doi: https://doi.org/10.3390/metabo11030172, 16 p.

Šket R., Deutsch L., Prevoršek Z., Mekjavić I.B., Plavec J., Rittweger J., Debevec T., Eiken O., Stres B. 2020. Systems view of deconditioning during spaceflight simulation in the PlanHab Project: The departure of urine 1H-NMR metabolomes from healthy state in young males subjected to bedrest inactivity and hypoxia. Frontiers in Physiology, 11: 532271, doi: https://doi.org/10.3389/fphys.2020.532271, 15 p.

Deutsch L., Sotiridis A., Murovec B., Plavec J., Mekjavić I., Debevec T., Stres B. 2022. Exercise and interorgan communication: short-term exercise training blunts differences in consecutive daily urine 1H-NMR metabolomic signatures between physically active and inactive individuals. Metabolites, 12, 6: 473, doi: https://doi.org/10.3390/metabo12060473, 18 p.

Deutsch L., Debevec T., Millet G.P., Osredkar D., Opara S., Šket R., Murovec B., Mramor M., Plavec J., Stres B. 2022. Urine and fecal 1H-NMR metabolomes differ significantly between preterm and full-term born physically fit healthy adult males. Metabolites, 12, 6: 536, doi: https://doi.org/10.3390/metabo12060536, 23 p.

For my personal contributions as a doctoral student and author of this thesis, please refer to Table 2 (page 142).

LIST OF FIGURES

Figure 1: Interactions between biological systems (Kronegger and Stres, 2019; Hasin et al., 2017).
Figure 2: Representative spectra obtained in the X-Adapt study.
Figure 3: Differences between “microbiota” and “microbiome” terms (Berg et al., 2020).
Figure 4: Increased importance of microbiome research (Wilkinson et al., 2021).
Figure 5: Representation of data analysis in multi-omics research (Tebani et al., 2016).
Figure 6: Data analysis methods used in the current work.
Figure 7: Lifestyle of modern humans (Dunstan et al., 2021).
Figure 8: Staircase approach for increased activity level (Dunstan et al., 2021).
Figure 9: Graphical presentation of collected samples and ‘omics layers.
Figure 10: Relationship between completeness and contamination of MAGs in control and preterm group.
Figure 11: Number of MAGs per both groups and their quality.
Figure 12: PLSDA of all metabolomes stratified by activity.
Figure 13: The success of classification with PLSDA.
Figure 14: Representation of studies involved in this work.
Figure 15: Model representing results of inactivity.
Figure 16: Change between trained (T) and untrained (UT) participants of X-Adapt study.
Figure 17: A summary of observed changes in PreTerm study.
Figure 18: The continuation of the projects, described in this work.

LIST OF TABLES

Table 1: Representation of differences between nuclear magnetic resonance (NMR) and mass spectrometry (MS) (Wishart, 2019).
Table 2: My contributions to published and unpublished work and postulated hypothesis in the frame of this PhD.

ABBREVIATIONS AND SYMBOLS

16S rRNA    Subunit of ribosomal ribonucleic acid
1H-NMR    Proton nuclear magnetic resonance
ANI    Average nucleotide identity
ASV    Amplicon sequence variants
AUC    The area under the curve
AutoML    Automatic machine learning
BMI    Body mass index
BSS    Bristol Stool Scale
CHOP INTEND    The Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders
DSS    Sodium trimethylsilylpropanesulfonate
EMA    The European Medicines Agency
FDA    The Food and Drug Administration
FiO2    Inspired O2 fraction
GenomesDB    The reference genome database
GUMPP    General Unified Microbiome Profiling Pipeline
HAmb    Hypoxic ambulation
HBR    Hypoxic bedrest
HCA    Hierarchical cluster analysis
HMDB    Human metabolome database
HMFS    Hammersmith Functional Motor Scale
HMFSE    Expanded Hammersmith Functional Motor Scale
HPC    High performance computing
Humann    The HMP Unified Metabolic Analysis Network
ID    Identity
JADBio    Just add bio data
JSpeciesWS    JSpecies Web Server
LPS    Lipopolysaccharide
MAG    Metagenome assembled genome
MAGO    Metagenome-Assembled Genomes Orchestra
MetaPhlAn    Metagenomic Phylogenetic Analysis
MFM    Motor Function Measurement
MIMAG    Minimum information about a metagenome-assembled genome
ML    Machine learning
MP    Minimal pressure
MS    Mass spectrometry
NBR    Normoxic bedrest
NPMANOVA    Non-parametric multivariate analysis of variance
OTU    Operational taxonomic units
PCA    Principal component analysis
PCR    Polymerase chain reaction
Picrust2    Phylogenetic Investigation of Communities by Reconstruction of Unobserved States
PiO2    Partial pressure of inspired O2
PlanHab    The Planetary Habitat simulation project
PLSDA    Partial least squares regression discriminant analysis
PPAR-α    Peroxisome proliferator-activated receptor alpha
ppm    Parts per million
PreTerm    The physiological responses at adulthood as a result of preterm delivery
QC    Quality check
ROC    Receiver operating characteristic curve
ROS    Reactive oxygen species
SMA    Spinal muscular atrophy
SMN    The survival motor neuron
TCA    The tricarboxylic acid
TMAO    Trimethylamine N-oxide
TSP    Trimethylsilylpropanoic acid
VIP    The Variable importance in projection
VO2max    Maximal oxygen uptake
WHO    World Health Organization
Wpeak    Maximal pedaling power output
X-Adapt    Cross-adaptation between heat and hypoxia - novel strategy for performance and work-ability enhancement in various environments

1 INTRODUCTION

This study built upon our past work in the field of metagenomics and metabolomics and significantly extended the analytical approaches developed for the Planetary Habitat simulation project (PlanHab) (Debevec et al., 2014; Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018), which was used to study the short-term and reversible effects of human host physical inactivity. Short-term inactivity resulted in maladjustments in physiology, intestinal microbiota, and metabolomic profiles, giving rise to increased inflammation, depression, and insulin resistance, resembling the symptoms of metabolic syndrome and type 2 diabetes.
In contrast, the effects of long-term physical inactivity, the lack of oxygenation (e.g., cardiovascular fitness) and of signals from large body muscles (e.g., lower limbs) are not well understood, despite their direct and widespread biomedical relevance for people born preterm and/or living with genetic disorders such as spinal muscular atrophy (SMA), as well as for obesity, cardiovascular deconditioning, chronic obstructive pulmonary disease, and many other noncommunicable diseases. To extend our understanding of human physiology in relation to the human gut microbiome, a diverse range of samples was collected within the following three major projects: i) the physiological responses at adulthood as a result of preterm delivery (PreTerm project; ARRS J3-7536; EU project https://recap-preterm.eu/); ii) the Spinal Muscular Atrophy project (University Clinical Centre Ljubljana) as an extreme case of physical inactivity; and iii) cross-adaptation between heat and hypoxia: a novel strategy for performance and work-ability enhancement in various environments (X-Adapt; research project ARRS J5-9350). The SMA and PreTerm projects dealt with the lifelong exposure to systemic effects of reduced physical activity, which can be summarized as follows: i) intermittent episodes of systemic hypoxia at rest/sleep (PreTerm), and ii) continuous systemic hypoxia due to reduced physical activity of the host and the alleviation of hypoxia after therapy (SMA). The X-Adapt project dealt with the influence of a standard 10-day training regime on the physiology of healthy trained and untrained individuals. The biochemical characterization of bodily fluids collected within the three projects was used to explore their biochemical makeup (metabolites) and interactions (metabolic pathways), as well as the differences between the studied groups.
The PreTerm and X-Adapt projects provided healthy baseline data for the SMA project, to determine which metabolic pathways differ between the healthy and affected groups, as well as before and after SMA genetic treatment. Additionally, samples from healthy individuals and their children (fathers and sons, mothers and daughters) were collected to match those of the SMA group and to provide a baseline healthy cohort for the metabolomic database. As a result, a national Slovenian urine nuclear magnetic resonance (NMR) database was established with the intent to enable, in the future, the distinction of various “diseased” groups of participants from the “healthy” group of participants based on urine metabolites. The extended inclusion of novel samples is planned.

In addition, little is known about differences in the human-gut microbiome relationship caused by the lifelong exposure to hypoxic episodes in preterm compared to full-term born adolescents (the PreTerm project). Such differences could affect the functionalities and metabolism of microbiomes within these hosts and be linked to the various physiological differences observed globally between the two groups in previous research (Martin et al., 2018). In short, many wet-lab measurements covering a large number of parameters were conducted across three major projects in the field of biomedical science, using metagenomics, metabolomics, bioinformatics, and data integration approaches. The data generated within each ‘omics technology was analysed and finally integrated to gain a better understanding of humans as (holobiont) systems.
1.1 HUMAN SYSTEMS BIOLOGY AND HEALTH

Systems medicine is a relatively new term (even as of 2022) for the application of systems biology concepts, methods, and analytical tools to scientific research and medical practice. The main goal of systems medicine is to integrate data from different levels of research into biomedical models that can predict the behaviour of a system, enhance our understanding of it, and ultimately be used in the prevention, cure, or treatment of disease. These approaches are used to study the daunting complexity of chronic (noncommunicable) and acute diseases, be they in humans, animals, or plants. Noncommunicable diseases (Alzheimer’s disease, type 2 diabetes, metabolic syndrome, insulin resistance, psychological disorders, etc.) were shown to develop slowly over prolonged periods of time (years to decades). Multiple factors were shown to contribute to the development of particular disease types, making them even more complex to study and understand. These factors range from host gene variants and epigenetic regulation of expression to the microbiome and its metabolic activities, all in response to detrimental environmental factors (e.g., sedentary lifestyle, diet, stress, hydration, circadian rhythm, etc.) (Craig, 2008; Bousquet et al., 2011; Mizeranschi et al., 2016). Depending on the scale of the study domain, the term “system” has thus far been used to describe the behaviour of selected chemical compounds at the molecular level, extending to their reactions (e.g., an enzyme bound to a ligand), to a single microbe, or to a complex microbiome at the level of a single human gut or the global population. Consequently, a system can be observed at different time and size scales (a few milliseconds and a few micrometres compared to the entire human body and 70 years) (Noble, 2002; Hunter and Nielsen, 2005).
At the same time, the surrounding short- and long-term environment, with its physical and chemical parameters, exerts significant multivariate effects on the entire system of observation (diet, level of activity, use of medications, society, etc.). From this point of view, the microbiome is only one of the many subsystems present in the human body. It interacts in many directions over various ‘omic layers, thus generating a complex network of interfering signals acting differently over time and space (Figure 1; Stres and Kronegger, 2019). The first step in the systems medicine approach is to identify, out of all measured variables, the key structuring variables important for the functioning of the system in relation to the nature of the disease. In the design of ‘omics studies, each layer of ‘omics data provides a list of differences associated with the disease state relative to a previous time point or the healthy state. Analysing a single type of ‘omics data in the absence of other datasets has the potential to generate oversimplified conclusions; therefore, researchers should integrate various types of ‘omics data from large cohorts (Hasin et al., 2017).

Figure 1: Interactions between biological systems (Kronegger and Stres, 2019; Hasin et al., 2017). Bidirectional interaction between different biological systems (e.g., microbiome and human), between ‘omics layers ((meta)genomics, transcriptomics, proteomics, metabolomics) and within individual ‘omic layers.
Slika 1: Interakcija med biološkimi sistemi (Kronegger in Stres, 2019; Hasin in sod., 2017). Obojestranska interakcija med različnimi sistemi (npr. mikrobiomom in človekom), med ‘omskimi nivoji ((meta)genomiko, transkriptomiko, proteomiko, metabolomiko) in znotraj posameznih ‘omskih nivojev.
The rise of ‘omics technologies has enabled researchers to measure thousands of data points; this ability is now at the heart of systems biology and medicine. These high-throughput technologies (genomics, transcriptomics, metabolomics, proteomics) enabled the discovery of complex sets of biomarkers describing healthy and disease states that can be objectively measured and evaluated. These variables can be used as indicators of a biological process (healthy versus diseased, active versus inactive, pre-treatment versus post-treatment) in data-driven, top-down research coupled to multivariate statistics and machine learning/artificial intelligence. Using high-throughput methods of analysis that capture the properties of systemic homeostasis and dysregulation, we can examine a large number of ‘omic markers (also called “biochemical entities”) simultaneously (Biomarkers working group, 2001; Holmes et al., 2008b; Fanos, 2016; Tebani et al., 2016; Apweiler et al., 2018; Gallo Cantafio et al., 2018). In this work, metabolomics and microbial metagenomics were used as the ‘omics methods of choice.

1.1.1 Metabolomics

Metabolites are small molecules (< 1 kDa) in body fluids such as urine and serum. All metabolites detected in a sample are part of the metabolome, which is a quantitative description of the low molecular weight molecules in a biological sample that are above the detection threshold of the analytical approach. The metabolome is controlled partially by the host genome (primary metabolome), but it also depends on the microbiome metabolic activity (co-metabolome) (Holmes et al., 2008a; Vignoli et al., 2019) in response to changing local and outer environments.
Using different analytical methods (nuclear magnetic resonance (NMR) spectroscopy, mass spectrometry (MS)), metabolic profiles can be analysed at a precise time point. This provides a top-down view of the biochemical processes that occur due to physiological status or environmental exposure (Barr, 2018). The genome is (as currently accepted) unchanged throughout the life span of an individual, in contrast to the high responsiveness and fluidity of the metabolome. The latter is heavily influenced by factors such as gender, age, diet, physical activity, health status, and microbiome, to name the most relevant. Metabolome-wide association studies are improving the understanding of the relationship between metabolic profiles and disease risk factors in the general population (Holmes et al., 2008b; Elliott et al., 2015; Vignoli et al., 2019). Metabolomics complements functional metagenomics by mapping the complex metabolic interactions between the host and microbiota via metabolic profiles, compound identity and quantity, characterization of unknown small molecules produced by microbes, and definition of the biochemical pathways of metabolites and biochemical reactions (Peisl et al., 2018). In the previous two decades, NMR has become one of the most important methods for measuring metabolites in different samples (liquid or solid) (Emwas et al., 2019; Wishart, 2019). It is based on the quantum mechanical property (spin) of each nucleus in the molecule. When a nucleus is excited in a magnetic field, a frequency domain spectrum with a peak corresponding to the frequency of the nucleus can be recorded. The frequency, or chemical shift, is reported in parts per million (ppm), and the amplitude of the peak corresponds to the number of nuclei present in the sample (Figure 2). Both can be used to determine the concentration of a molecule in the sample (Maguire, 2014; Keun and Athersuch, 2022). 1H-NMR spectroscopy is used in the majority of NMR-based studies.
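As a brief illustration of the ppm scale mentioned above, the chemical shift is conventionally expressed relative to a reference compound. The symbols used here (the resonance frequency of the observed nucleus, of the reference, and the operating frequency of the spectrometer) are notation introduced for this sketch and are not taken from the cited sources.

```latex
\delta\,[\mathrm{ppm}] \;=\; \frac{\nu_{\mathrm{sample}} - \nu_{\mathrm{ref}}}{\nu_{0}} \times 10^{6}
```

For example, on a spectrometer operating at 600 MHz for 1H, a signal that resonates 600 Hz away from the reference appears at 1 ppm. Expressing positions in ppm makes spectra recorded at different field strengths directly comparable.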
Protons (1H) are present in every metabolite and exhibit the greatest NMR signal sensitivity (Emwas et al., 2019). NMR spectra are usually recorded in water and therefore require solvent suppression (Zheng and Price, 2010; Giraudeau et al., 2015). Compared to MS, the NMR method is robust and highly reproducible, requires minimal sample preparation and no chemical derivatization, and is rapid and non-destructive; all types of metabolites can be measured simultaneously and automatically (Table 1). However, the analytical sensitivity is low (10 to 100 times lower than that of MS), the spectra are complex and computationally intensive to deconvolute, and the NMR spectrometer requires a significant amount of physical space compared to MS. NMR detects molecules at concentrations greater than 1 µM, while MS can detect molecules at concentrations greater than 10 nM (Emwas et al., 2019; Wishart, 2019).

Figure 2: Representative spectra obtained in the X-Adapt study.

Table 1: Representation of differences between nuclear magnetic resonance (NMR) and mass spectrometry (MS) (Wishart, 2019).

Thus far, 1H-NMR has been used to investigate the modulation of metabolites under cellular stress (Lindon et al., 2003), breast cancer markers (Bro et al., 2015), acute pancreatitis (Dumas et al., 2014), the influence of the metabolome on health and disease (Marin et al., 2015), biomarkers for Crohn’s disease and ulcerative colitis (Bjerrum et al., 2015), obesity (Zhang et al., 2015), and coronary heart disease and stroke (Holmes et al., 2008a; Murovec et al., 2018; Sket et al., 2018; Vignoli et al., 2019). However, it is very difficult to distinguish between microbial and human metabolites. The metabolism of all parts of the holobiont (human cells and microbial cells) is highly dynamic and variable. For this reason, some authors have used the term “dark matter” of metabolomics, which (in short) means that some metabolites have already been described, while the number of metabolites that remain unknown is orders of magnitude higher. With each study, the amount of data increases, which will help to illuminate the metabolomic dark matter. Modern statistical approaches and data integration, combined with ongoing ‘omics research and methods development, will accelerate the reduction of dark matter and reveal new insights, which will lead to the use of metabolomics methods in general diagnostics (da Silva et al., 2015; Peisl et al., 2018).

1.1.2 Microbial metagenomics

Classical microbiological approaches such as cultivation methods are notoriously incomplete, tedious, irreproducible, and ineffective, as only 1% of microorganisms (out of 1500 species) can be easily cultured. For the study of the entire gut microbiome system (microbiome, host, and environment), top-down approaches have taken a leap forward with the introduction of ‘omics technologies (Stres and Kronegger, 2019; Lin et al., 2021). For decades, amplicon sequencing was the most commonly used method in microbiome research.
The gold standard for amplicon sequencing was sequencing of the 16S rRNA gene. Variable regions of the 16S rRNA gene are used to determine taxonomic profiles of the microbiota. The major weakness of this method is that we can only determine which taxa are present in the sample, and this can be effectively accomplished only down to the genus level, while species or strain resolution cannot be achieved. In addition, the functional potential of such a community remains obscured, as the functional genes and metabolic pathways of the community of interest can only be predicted, using tools such as PICRUSt2 (Langille et al., 2013b; D’Amore et al., 2016; Sinha et al., 2017; Fricker et al., 2019; Douglas et al., 2020). Recently, pipelines for automated analysis of amplicon sequences have been developed for more standardized and efficient analysis on high-performance computing (HPC) clusters (Murovec et al., 2020). The amplicon-sequencing approach is fast, simple, and requires only low-cost sample preparation and analysis. However, it is not possible to distinguish between living, dead and active microbes. The amplification step can introduce biases (selection of primers for the PCR reaction), negative controls are required, and functional information is limited (Knight et al., 2018).

Metagenomics, in contrast, uses whole genome shotgun sequencing to fragment and sequence the entire DNA pool of the microbiome in the sample, rather than just one gene (e.g., the 16S rRNA gene) as in amplicon sequencing. The data, quality control, and the information derived from this method are orders of magnitude more comprehensive and enable the recovery of information about phages, viruses, bacteria, archaea, fungi, protozoa, and human DNA.
With this approach there is no need for gene prediction based on 16S rRNA, as functional genes are determined by comparison to complex gene-family databases with concomitant contamination recognition and removal. With the development of novel quality control tools (KneadData), microbiome taxonomy can be deciphered at the species level with MetaPhlAn3, alongside the functional genes recovered from the sample (HUMAnN3) (Brown et al., 2013; Nayfach and Pollard, 2016; Garud et al., 2019; Beghini et al., 2021). Because tens of thousands up to millions of variables are obtained, the analysis of such data matrices becomes computationally intensive and requires the use of HPC clusters. The metagenomics approach also allows us to use the latest method in microbial genomics: de novo metagenome assembly. Metagenomics can reveal microbial taxonomic and phylogenetic identity, requires no PCR amplification, and enables the identification of previously known and new species (as metagenome-assembled genomes, MAGs) as well as new gene families. However, the metagenomics wet-lab and HPC operations are currently still very costly (Knight et al., 2018). Microbiome analyses are currently focusing on the use of metagenomics due to its wealth of data and reproducible analyses. The microbiome taxonomic description includes the representatives of the community (microbiota - bacteria, archaea, protists, fungi), while also providing information on the so-called “theatre of activity” (Figure 3). Therefore, the information provided includes not only taxonomic descriptions but also the molecules produced by these taxonomic units (Whipps et al., 1988; Berg et al., 2020). This approach is becoming increasingly important, as the estimates of the number of unique microbial genes per single unique human gene have been rising over time, ranging from 50 (Qin et al., 2010) to more than 500 (as of 2022). For this reason, a holistic approach to the study of this system is required.
Taking into account this considerable complexity, it becomes increasingly evident that the disruption of the human microbiome and its activities is significantly associated with the development of various diseases, which in turn depend on the environment and lifestyle of the host (e.g., human). Various environmental factors can affect the gut microbiota to a variable extent over time and space: diet, medications, cultural habits, physical activity, transit time, gender, local environment, etc. (Schmidt et al., 2018; Deutsch and Stres, 2021).

Figure 3: Differences between “microbiota” and “microbiome” terms (Berg et al., 2020).

Metagenomics has become a powerful tool for understanding host-microbiome relationships and enables linking biomarkers (genera, species, functional genes) to noncommunicable diseases such as inflammatory bowel disease (Frank et al., 2007), liver cirrhosis (Qin et al., 2014), diabetes (Giongo et al., 2011; Qin et al., 2012), cardiovascular disease (Wang et al., 2011b), Parkinson’s disease (Scheperjans et al., 2015), colorectal cancer (Kostic et al., 2012), rheumatoid arthritis (Scher et al., 2013), obesity, metabolic syndrome and others. Therefore, metagenomics has become the most important current approach to studying the genetic potential of microbial populations in the intestinal tract. Figure 4 shows the importance of the influence of the microbiome on our future health span (Wilkinson et al., 2021).

Figure 4: Increasing importance of microbiome research (Wilkinson et al., 2021).
Microbiome research is becoming increasingly important from the perspective of improved wellness, timely interventions and the search for biomarkers.

1.1.3 Analysis of data

Both methods (metagenomics and metabolomics) require various steps of quality control and data processing, from the sequences or spectra obtained to the final conclusions. After quality checking the sequences with programmes such as FastQC, fastp (Chen et al., 2018) or KneadData, the next step is to obtain actionable sequence data. We can determine which taxa are present in the sample (MetaPhlAn), identify strains (StrainPhlAn), and determine which functional genes are present in the sample (HUMAnN3) or even predict metabolites (MelonnPan) (Segata et al., 2012; Beghini et al., 2021; Mallick et al., 2019). To facilitate the use of such pipelines, workflows were developed by various groups, such as bioBakery or MetaBakery (in preparation by our group). These workflows simplify the use of the programs for natural scientists, as the programs do not need to be installed separately and are already prepared as pipelines that run on HPC clusters as Singularity images (Kurtzer et al., 2017) or Docker containers. All of these tools generate matrices of variables describing the samples, which serve as input for visualisations, statistics, modelling, and machine-learning approaches (Costea et al., 2017; Quince et al., 2017; Knight et al., 2018; Moreno-Indias et al., 2021). De novo metagenome assembly represents a second option for metagenomics data analysis and results in the assembly of novel draft genomes that may represent new and not yet described species (Yang et al., 2021). Slightly different steps are required.
After quality control, the reads have to be assembled. There are different assemblers, including metaSPAdes (Nurk et al., 2017), megahit (Li et al., 2015) or IDBA-UD (Peng et al., 2012). Assembled sequences are binned in the next steps using binning tools, such as BinSanity (Graham et al., 2017), CONCOCT (Alneberg et al., 2014), MetaBat, MaxBin, and DAStool (Wu et al., 2016; Sieber et al., 2018). The resulting metagenome-assembled genomes (MAGs) can be scored for quality (% completeness and % contamination) using CheckM (Parks et al., 2015) according to the MIMAG standard (> 90% complete and < 5% contamination) (Bowers et al., 2017). All MAGs obtained can be annotated with Prokka (Seemann, 2014), with output in GenBank format, or analysed with Roary (Page et al., 2015) as pan- and core genomes. ezTree (Wu, 2018) can be used to extract protein-coding single-copy orthologous marker genes with functional annotation and to build maximum likelihood trees from amino acid sequences. High-throughput analysis of the average nucleotide identity (ANI) of MAGs is performed with FastANI (Jain et al., 2018). All the above programs are available as a single pipeline for MAG production in MAGO, prepared for HPC computing as a Singularity image or Docker container (Murovec et al., 2020). The JSpeciesWS taxonomic threshold web service measures the probability of whether genomes belong to the same species or not based on their complete or tentative nucleotide sequence (Richter et al., 2016). For more in-depth analyses, the recently developed Genome Taxonomy Database can be utilized.

Metabolome profiling is usually performed using either targeted or untargeted methods.
Targeted metabolomics studies (metabolic profiling) focus on the accurate identification and quantification of a defined group of metabolites in biological samples. Untargeted studies (metabolic fingerprinting) focus on measuring and comparing as many signals as possible in a sample set, followed by the assignment of these signals to metabolite identities using metabolomics databases. NMR measurements generate spectra that must be processed (Bingol, 2018; Klein, 2021). The untargeted approach does not require prior knowledge of the metabolites in the sample, so its analysis can be more complex and difficult (Klein, 2021). NMR spectra can be referenced with an internal chemical shift standard, such as DSS or TSP, which are the most commonly used standards in the NMR community (Emwas, 2015; Dona et al., 2016; Emwas et al., 2016; Emwas et al., 2018). In the pre-processing step, spectra must be phased and baseline corrected. With phasing, the absorptive character and symmetry of all NMR peaks are maximized (Wishart, 2008). Baseline correction is a processing step that removes artefacts caused by electronic distortion or incomplete digital sampling, ultimately resulting in a completely flat baseline in the signal-free regions of the spectra (Emwas et al., 2018). Some regions of the spectra need to be removed, as they represent artefacts originating from protons in water (4.5-4.9 ppm) and urea (5.5-6.1 ppm). Software for targeted approaches, such as Chenomx NMR Suite, Amix, and AssureNMR, matches the obtained spectra with reference spectra (in the Human Metabolome Database (HMDB) (Wishart et al., 2007a; Wishart et al., 2013; Wishart et al., 2018; Wishart et al., 2022)) to calculate the concentrations of identified metabolites in the sample (Klein, 2021). Untargeted approaches can be divided into two groups of spectra processing. The peak-picking approach requires clearly visible peaks and generates a feature list for the spectral positions of the successfully detected peaks.
This approach is not able to identify low-intensity signals or signals with distorted line shapes. AlpsNMR (Madrid-Gambin et al., 2020), rDolphin (Cañueto et al., 2018), or speaq 2 (Beirnaert et al., 2018) are tools that use the peak-picking approach. The other approach is spectral binning, which can be used to identify signals that are missed by the peak-picking approach. The data from binning contain a large number of features from spectral regions. However, they also contain signals from spectral noise, which can reduce the statistical power of the data analysis in the next step (Klein, 2021). Data from untargeted approaches often contain negative values. These negative values must be replaced using an affine transformation, which is implemented in the R package mrbin (Klein, 2021). The resulting bins or matrices of concentrations and metabolites are processed for statistical analysis (Ebbels et al., 2013; Barnes et al., 2016). Metabolomics and metagenomics generate different data matrices with a large number of variables (taxa, functional genes, enzymatic reactions, metabolic pathways, metabolites) that are variably associated with additional data matrices describing environmental factors, such as diet or patient metadata (health status, body mass index (BMI), age, etc.). These data matrices require more modern statistical approaches that use multivariate statistics (Figure 5).
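The spectral binning and region removal described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure implemented in the tools cited: the function name `bin_spectrum`, the 0-10 ppm range, and the bin widths are hypothetical choices, and the excluded regions simply reuse the water (4.5-4.9 ppm) and urea (5.5-6.1 ppm) ranges given in the text. No correction of negative values is attempted here.

```python
import numpy as np


def bin_spectrum(ppm, intensity, ppm_min=0.0, ppm_max=10.0, bin_width=0.04,
                 excluded=((4.5, 4.9), (5.5, 6.1))):
    """Sum intensities into equal-width ppm bins and drop excluded regions.

    All parameter defaults are illustrative assumptions.
    """
    ppm = np.asarray(ppm, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    # Equal-width bin edges across the chosen spectral range.
    n_bins = int(round((ppm_max - ppm_min) / bin_width))
    edges = np.linspace(ppm_min, ppm_max, n_bins + 1)

    # Weighted histogram = sum of intensities of the points in each bin.
    binned, _ = np.histogram(ppm, bins=edges, weights=intensity)
    centres = (edges[:-1] + edges[1:]) / 2

    # Remove bins whose centre lies inside an excluded region.
    keep = np.ones(n_bins, dtype=bool)
    for low, high in excluded:
        keep &= ~((centres >= low) & (centres <= high))
    return centres[keep], binned[keep]


# Synthetic spectrum with a single peak near 2.2 ppm (made-up data).
ppm_axis = np.linspace(0.0, 10.0, 1001)
signal = np.zeros_like(ppm_axis)
signal[220] = 5.0
centres, bins = bin_spectrum(ppm_axis, signal, bin_width=0.5)
print(centres[np.argmax(bins)], bins.sum())
```

Each row of a feature matrix for downstream statistics would then be the binned vector of one sample; narrower bins retain more detail but also more noise, which is the trade-off noted in the text.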
The high dimensionality of ‘omics data can range from 300+ metabolites in NMR metabolomics to several thousand or even millions of variables from microbiomes (taxa, functional genes, enzymatic reactions, metabolic pathways, predicted metabolites) and requires data reduction methods (Argmann et al., 2016; Barnes et al., 2016) and nonparametric statistical methods (NPMANOVA) (Legendre and Legendre, 2012; Anderson and Walsh, 2013). Normalization must be used to remove unwanted variation between samples and make them comparable to each other (Emwas et al., 2018). To find the best normalization approach, the web tool NOREVA was developed to compare 20 different normalization methods (Yang et al., 2020). Scaling and transformation should also be applied to reduce the dominant influence of features that are present in larger quantities than others; this also helps to bring the data closer to a normal distribution (Ebbels et al., 2013; Emwas et al., 2018). There are two different approaches to data analysis. First, unsupervised methods, such as principal component analysis (PCA) or hierarchical cluster analysis (HCA), do not require prior knowledge of group membership; the loadings plot created within PCA can be used to see which features discriminate the groups of interest (Barnes et al., 2016). Second, supervised methods assume that a known structure of patterns exists and use rules to predict new data. Supervised methods include partial least squares discriminant analysis (PLS-DA) (Wold et al., 2001; Trygg and Wold, 2002), regression, and classification. The Variable Importance in Projection (VIP) score can be used to see which features contributed the most to discrimination (Barnes et al., 2016). Supervised methods are very powerful but require validation to confirm that the relationship found between groups is real (Ebbels et al., 2011).
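The normalization, scaling and PCA steps described above can be illustrated with a short sketch. The specific choices below (total-area normalization, Pareto scaling, PCA via singular value decomposition) are common options selected here as assumptions for illustration; the text does not prescribe them, and the function names and the random data are hypothetical.

```python
import numpy as np


def normalise_total_area(X):
    """Divide each sample (row) by its total signal so that rows sum to 1."""
    X = np.asarray(X, dtype=float)
    return X / X.sum(axis=1, keepdims=True)


def pareto_scale(X):
    """Mean-centre each feature and divide by the square root of its SD."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    return centred / np.sqrt(X.std(axis=0, ddof=1))


def pca(X, n_components=2):
    """PCA via SVD; returns scores, loadings and explained variance ratios."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    loadings = Vt[:n_components].T          # one column per component
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, loadings, explained[:n_components]


# Made-up matrix: 12 samples (rows) x 6 features (columns).
rng = np.random.default_rng(0)
raw = rng.uniform(1.0, 10.0, size=(12, 6))

normalised = normalise_total_area(raw)
scaled = pareto_scale(normalised)
scores, loadings, explained = pca(scaled, n_components=2)
print(scores.shape, loadings.shape, explained)
```

The rows of `loadings` with the largest absolute values indicate which features drive each component, which is the information a loadings plot conveys.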
Unsupervised methods may miss an interesting correlation, while supervised methods are more likely to produce false positives (Maguire, 2014). Web servers were developed to facilitate the use of these methods, such as MicrobiomeAnalyst (Dhariwal et al., 2017; Chong et al., 2020), MetaboAnalyst (Chong et al., 2018; Chong et al., 2019; Pang et al., 2021) or OmicsAnalyst (Zhou et al., 2021).

Figure 5: Representation of data analysis in multi-omics research (Tebani et al., 2016).

Multivariate statistical methods identify features that can successfully distinguish between different groups. The next step in modern data science is the machine learning (ML) approach, which creates models that can be used in the future to diagnose, treat, and predict the health status of individuals. ML methods rely on algorithms that describe the relationship between variables (Sidey-Gibbons and Sidey-Gibbons, 2019). ML models, such as Support Vector Machines, K Nearest Neighbours, Naïve Bayes, Random Forest, and others, can be used for this purpose (Cristianini and Shawe-Taylor, 2000; Shen et al., 2003; Susnow and Dixon, 2003; Bender et al., 2007; Deo, 2015; Ekins et al., 2019). There is no clear boundary between statistical and ML methods. In short, the main goal of the statistical approach is to draw conclusions and inferences about populations based on measured data. The primary goal of ML methods, in contrast, is to make predictions.
The main steps for ML are (i) importing and preparing the data set, (ii) training the ML model, (iii) testing the ML model (validating the model), (iv) evaluating the sensitivity, specificity, and accuracy of the model, (v) plotting the receiver operating characteristic (ROC) curve and calculating the area under the curve (AUC), and (vi) applying new data to the trained model (Sidey-Gibbons and Sidey-Gibbons, 2019). Regularization techniques should be used to limit overfitting of the model. The regularization or penalty parameter controls the complexity of the model (it controls the number of features included in the prediction). Each model must be subjected to cross-validation, which means that the data must be split into a training set (for training the model) and a validation set (for validating the model). Model validation checks whether the predictive performance of the selected model on the validation set is similar to its performance on the training set (Teschendorff, 2019). In recent years, AutoML platforms have been created for building ML models without human intervention (JADBIO (Tsamardinos et al., 2020; Tsamardinos et al., 2022), AutoWEKA (Thornton et al., 2013; Kotthoff et al., 2017), AutoSklearn (Feurer et al., 2021), GoogleAutoML, RapidMiner) (Mustafa and Rahimi Azghadi, 2021). AutoML has already been applied in various fields of human healthcare, such as diabetes diagnosis, Alzheimer’s disease, electronic medical record analysis, and medical imaging (Borkowski et al., 2019; Karaglani et al., 2020; Tsamardinos et al., 2020; Waring et al., 2020; Mustafa and Rahimi Azghadi, 2021).
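Two of the steps listed above, splitting the data for cross-validation and evaluating sensitivity, specificity and accuracy, can be sketched as follows. This is a minimal illustration with hypothetical helper names (`confusion_metrics`, `k_fold_indices`) and made-up labels; it assumes binary labels coded as 1 (positive) and 0 (negative) with both classes present, and it does not train any of the models named in the text.

```python
import numpy as np


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = positive)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / len(y_true),
    }


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index arrays for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, folds[i]


# Made-up true labels and predictions for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
metrics = confusion_metrics(y_true, y_pred)
print(metrics)

# Each sample is used for validation exactly once across the five folds.
folds = list(k_fold_indices(n_samples=10, k=5))
print([len(val) for _, val in folds])
```

In practice a model would be fitted on each training split and the metrics computed on the corresponding validation split, so that reported performance reflects samples the model has not seen.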
AutoML automates the main processes of ML, from data preparation to feature extraction and selection, algorithm selection, hyperparameter optimization and evaluation (Feurer et al., 2015; Kotthoff et al., 2017; Hutter et al., 2019; Mustafa and Rahimi Azghadi, 2021). However, experienced data scientists are still required to professionally evaluate the results obtained with AutoML (Mustafa and Rahimi Azghadi, 2021). For proper interpretation of the obtained results, the right data integration process should be used. Identifying a combination of distinguishing characteristics satisfies biological assumptions that cannot be satisfied by univariate methods. Therefore, the combination of different statistical methods (univariate, multivariate, machine learning) provides the key to answering complex biological questions. The mixOmics R package (Rohart et al., 2017; Singh et al., 2019) is dedicated to the multivariate analysis of biological datasets, with a particular focus on data exploration, dimensionality reduction and visualisation. It thus supports a systems biology approach by offering a wide range of methods that statistically integrate multiple datasets simultaneously, exploring relationships between heterogeneous ‘omics datasets in order to identify molecular signatures. mixOmics supports the inclusion of different types of biological data and their analysis beyond the scope of ‘omics, as long as they are expressed as continuous values. Another important issue is batch effects, which are common in the natural sciences. Differences in sample processing, experiments and methods can lead to spurious findings and obscure the true signals. Biological studies depend on many different factors, which can give rise to unavoidable confounding factors that come from biological, technical, and computational sources (Ma et al., 2019; Wang and Lê Cao, 2020). Batch effects are an obstacle to comparing the results of different studies.
Traditional meta-analysis techniques for combining p-values from independent studies, such as Fisher’s method, are effective but statistically conservative. If batch effects can be corrected, statistical tests can be performed on data pooled across studies, increasing the sensitivity for detecting differences between treatment groups. Removing or accounting for batch effects requires computational and analytical multivariate methods (Wang et al., 2019), such as ConQuR (Ling et al., 2021) or ComBat (Gibbons et al., 2018). Most of the above-mentioned methods were used in the different projects of this work (Figure 6).

Figure 6: Data analysis methods used in the current work.

1.2 INACTIVITY

Physical inactivity associated with the modern sedentary lifestyle (Figure 7) is becoming a global problem and is ranked as the fourth largest behavioural risk factor for mortality worldwide (Kohl et al., 2012; Kelly et al., 2020a). Every adult should engage in at least 75 minutes of vigorous physical activity per week or 150 minutes of moderate physical activity per week (Sallis et al., 2016), engage in muscle training twice per week, and try to spend as little time as possible in a sedentary position (Kelly et al., 2020a). Regular physical activity reduces the risk of obesity, some cancers, diabetes, coronary heart disease, stroke, dementia, etc. (Booth et al., 2012).
Several observational, short-term and long-term intervention studies have used different metabolomics methods to monitor physiological changes caused by inactivity and following different exercise regimes (Kelly et al., 2020a). Varying levels of physical activity are associated with quantifiable changes in the metabolic profile of individuals. Possible changes may be observed in the metabolism of fatty acids, cholesterol and carnitine, lipolysis, the tricarboxylic acid (TCA) cycle, glycolysis, and insulin sensitivity (Kelly et al., 2020a). Metabolic syndrome comprises a number of risk factors associated with the development of type 2 diabetes mellitus and atherosclerotic cardiovascular disease. Biogenic amines, such as trimethylamine N-oxide (TMAO), choline and L-carnitine (all found in red meat), and branched-chain amino acids may increase the likelihood of metabolic syndrome. In contrast, histidine and lysine correlate with a lower likelihood of metabolic syndrome. Moreover, there is a plethora of other molecules, and more research is needed to understand their role in the development of metabolic syndrome (Lent-Schochet et al., 2019). Inactivity also causes hypoxic conditions leading to redox imbalance, which may also be observed at the metabolic level. Altered levels of creatine, hypoxanthine, acetylcarnitine, and taurine have been reported under hypoxic conditions (Crass and Lombardini, 1977; Franconi et al., 1985; Malcangio et al., 1989; Aureli et al., 1994; Michalk et al., 1997; Amano et al., 2003; Chen et al., 2009; Scafidi et al., 2010; Powers et al., 2011; Chen et al., 2013; Turner et al., 2015; Scheer et al., 2016; Lee et al., 2017; Sibomana et al., 2021; Wilken et al., 2022). Inactivity also leads to muscle loading (alteration in muscle protein synthesis) and heart failure (Rittweger et al., 2016), both leading to systemic hypoxemia and elevated levels of reactive oxygen species (ROS).
For the majority of people, life can be improved with moderate activity. Physical exercise is one of the main stimuli for restoring the prooxidant to antioxidant balance in chronic disease patients (Vincent et al., 2007). Sitting less and moving more (low-to-moderate exercise), or a staircase approach with a gradual increase in activity, can prevent the development of metabolic syndrome (Debevec et al., 2017; Dunstan et al., 2021). However, this approach must be considered a never-ending story, which means that physical activity must continue even after health has improved (Figure 8). There are also people who face various health problems from birth (prematurely born infants, patients with spinal muscular atrophy) and whose possibilities of being physically active are limited; these groups will be discussed later in this work.

Figure 7: Lifestyle of modern humans (Dunstan et al., 2021).

Figure 8: Staircase approach for increased activity level (Dunstan et al., 2021). The staircase approach should be used to increase the level of activity in small steps, towards a reduced probability of non-communicable disease.

1.3 PURPOSE OF THE RESEARCH

As stated above, three biomedically relevant reduced-exercise models form the backbone of our current work (Figure 9). The overall aim of our research was to determine the physiological responses at the metabolic level to different levels of physical (in)activity and health status (PreTerm, SMA, X-Adapt) in relation to the personal characteristics of participants.
To achieve this goal, it was necessary to prepare the entire infrastructure of bioinformatics analytical pathways for pre-processing of molecular data and the correct statistical, modelling, and integration approaches for data processing and interpretation.

Within each project, additional explorations were made based on the ‘omics layers available in our measurements (Figure 9):
i) PreTerm (metagenomics (taxonomy (Bacteria, Archaea, Fungi, Protozoa, Viruses) + functional genes + metabolic pathways + predicted metabolomes + MAG assembly) coupled to 1H-NMR metabolomics (metabolites + metabolic pathways));
ii) SMA (1H-NMR metabolomics (metabolites + metabolic pathways));
iii) X-Adapt (1H-NMR metabolomics (metabolites + metabolic pathways));
iv) Healthy baseline (1H-NMR metabolomics).

[Figure 9 graphic: sample types (feces, urine, serum, liquor), ‘omics technologies (metagenomics, metabolomics) and ‘omics layers (taxonomy (B, A, F, P, V), MAGs, functional genes, metabolic pathways, predicted metabolites, 1H-NMR metabolites, 1H-NMR metabolic pathways) for the PreTerm, SMA, X-Adapt and Healthy cohorts.]

Figure 9: Graphical presentation of collected samples and ‘omics layers in the PreTerm, SMA and X-Adapt projects. In addition, urine samples from healthy individuals were collected; they represent the healthy baseline for the Slovenian NMR urinary database.
1.4 HYPOTHESES

1.4.1 PreTerm related

The PreTerm-related hypotheses are discussed in the published paper presented in chapter 2.1.7 and additionally in chapters 2.2.1 and 3.1.6.

H0: No significant difference exists between preterm and term groups of participants at the levels of faecal or urine metabolomes or faecal metagenomes.
H1: There are significant differences between preterm and term groups of participants in faecal and urine metabolomes that can be linked to their physical performance in experiments and physiological data at exercise and rest.
H2: There are significant differences in the metagenomic makeup of both groups, enabling identification of specific metabolic pathways that differ between the two groups and their gut environment characteristics.
H3: Term and preterm gut samples contain specific MAGs associated with differences in gut environmental conditions between the two groups.

1.4.2 SMA related

The SMA-related hypotheses are discussed in the published paper presented in chapter 2.1.3 and additionally in chapter 3.1.4.

H0: There are no significant differences in metabolomes before and after treatment.
H1: There are significant differences in urine (systemic) and liquor (local) metabolomes before and after treatment with gene therapy, enabling identification of characteristic metabolic pathways discerning the two groups.

1.4.3 Hypotheses of merged dataset

Hypotheses of the merged dataset are discussed in chapters 2.2.2 and 3.1.7.

H0: There is no significant difference between the metabolomes of the prematurely born, born on time, before SMA treatment, and after SMA treatment groups.
H1: There are significant differences in urine metabolomes that enable identification of biomarker pools and metabolic pathways delineating the various groups under investigation.
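The H0/H1 pairs above all contrast "no difference between group metabolomes" with "a difference exists". The following is a minimal sketch of how such a null hypothesis could be examined with a label-permutation test. The test statistic (Euclidean distance between group centroids), the toy concentration values, and the names `centroid`, `centroid_distance` and `permutation_test` are illustrative assumptions. They are not the statistical procedures used in the published papers.

```python
import random
from math import dist


def centroid(rows):
    """Column-wise mean of a list of equally long feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def centroid_distance(group_a, group_b):
    """Euclidean distance between the centroids of two groups."""
    return dist(centroid(group_a), centroid(group_b))


def permutation_test(group_a, group_b, n_perm=999, seed=0):
    """Return (observed statistic, permutation p-value) under H0: no group difference."""
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    observed = centroid_distance(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)            # reassign group labels at random
        if centroid_distance(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # +1 in numerator and denominator counts the observed labelling itself
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: rows are participants, columns are two metabolite concentrations.
preterm = [[1.0, 5.0], [1.2, 5.1], [0.9, 4.8], [1.1, 5.3], [1.0, 4.9]]
term = [[3.0, 2.0], [3.2, 2.1], [2.9, 1.8], [3.1, 2.2], [3.0, 1.9]]

observed, p_value = permutation_test(preterm, term)
print(f"centroid distance = {observed:.2f}, permutation p = {p_value:.3f}")
```

A small p-value would argue against H0 for these toy groups. With real metabolomes, the choice of distance measure, normalization and number of permutations would need to follow the methods described in the respective papers.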
2 SCIENTIFIC WORKS

2.1 PUBLISHED SCIENTIFIC WORKS

2.1.1 Computational framework for high-quality production and large-scale evolutionary analysis of metagenome assembled genomes

Murovec B., Deutsch L., Stres B. 2019. Computational framework for high-quality production and large-scale evolutionary analysis of metagenome assembled genomes. Molecular Biology and Evolution, 37, 2: 593-598

Abstract

Microbial species play important roles in different environments and the production of high-quality genomes from metagenome data sets represents a major obstacle to understanding their ecological and evolutionary dynamics. Metagenome-Assembled Genomes Orchestra (MAGO) is a computational framework that integrates and simplifies metagenome assembly, binning, bin improvement, bin quality (completeness and contamination), bin annotation, and evolutionary placement of bins via detailed maximum-likelihood phylogeny based on multiple marker genes using different amino acid substitution models, next to average nucleotide identity analysis of genomes for delineation of species boundaries and operational taxonomic units. MAGO offers streamlined execution of the entire metagenomics pipeline, error checking, computational resource distribution and compatibility of data formats, governed by user-tailored pipeline processing. MAGO is an open-source software package released in three different ways: as a Singularity image and a Docker container for HPC purposes as well as for running MAGO on commodity hardware, and as a virtual machine for gaining full access to the MAGO underlying structure and source code.
MAGO is open to suggestions for extensions and is amenable for use in both research and teaching of genomics and molecular evolution of genomes assembled from small single-cell projects or large-scale and complex environmental metagenomes.

This work was published as an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (CC-BY-NC 4.0).

For my personal contributions as a doctoral student and author of this thesis, please refer to Table 2 (page 142).
2.1.2 General unified microbiome profiling pipeline (GUMPP) for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data to predicted microbial metagenomes, enzymatic reactions and metabolic pathways

Murovec B., Deutsch L., Stres B. 2021. General unified microbiome profiling pipeline (GUMPP) for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data to predicted microbial metagenomes, enzymatic reactions and metabolic pathways. Metabolites, 11, 6: 336, doi: https://doi.org/10.3390/metabo11060336, 14 p.

Abstract

General Unified Microbiome Profiling Pipeline (GUMPP) was developed for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data and prediction of microbial metagenomes, enzymatic reactions and metabolic pathways from amplicon data. The GUMPP workflow introduces reproducible data analyses at each of the three levels of resolution (genus; operational taxonomic units (OTUs); amplicon sequence variants (ASVs)). The ability to support reproducible analyses enables production of datasets that ultimately identify the biochemical pathways characteristic of disease pathology. These datasets, coupled to biostatistics and mathematical approaches of machine learning, can play a significant role in the extraction of truly significant and meaningful information from a wide set of 16S rRNA datasets. The adoption of GUMPP in gut-microbiota related research enables focusing on the generation of novel biomarkers that can lead to the development of mechanistic hypotheses applicable to the development of novel therapies in personalized medicine.

This work was published as an Open Access article distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0).

For my personal contributions as a doctoral student and author of this thesis, please refer to Table 2 (page 142).
2.1.3 Spinal muscular atrophy after nusinersen therapy: improved physiology in pediatric patients with no significant change in urine, serum, and liquor 1H-NMR metabolomes in comparison to an age-matched, healthy cohort

Deutsch L., Osredkar D., Plavec J., Stres B. 2021. Spinal muscular atrophy after nusinersen therapy: improved physiology in pediatric patients with no significant change in urine, serum, and liquor 1H-NMR metabolomes in comparison to an age-matched, healthy cohort. Metabolites, 11, 4: 206, doi: https://doi.org/10.3390/metabo11040206, 15 p.

Abstract

Spinal muscular atrophy (SMA) is a genetically heterogeneous group of rare neuromuscular diseases and was until recently the most common genetic cause of death in children. The effects of 2-month nusinersen therapy on urine, serum, and liquor 1H-NMR metabolomes in SMA males and females have not yet been explored, especially not in comparison to the urine 1H-NMR metabolomes of matching male and female cohorts.
In this prospective, single-centered study, urine, serum, and liquor samples were collected from 25 male and female pediatric patients with SMA before and after 2 months of nusinersen therapy and urine samples from a matching healthy cohort (n = 125). Nusinersen intrathecal application was the first therapy approved for the treatment of SMA by the Food and Drug Administration (FDA) and the European Medicines Agency (EMA). Metabolomes were analyzed using targeted metabolomics utilizing 600 MHz 1H-NMR, parametric and nonparametric multivariate statistical analyses, machine learning, and modeling. Medical assessment before and after nusinersen therapy showed significant improvements of movement, posture, and strength according to various medical tests. No significant differences were found in metabolomes before and after nusinersen therapy in urine, serum, and liquor samples using an ensemble of statistical and machine-learning approaches. In comparison to a healthy cohort, 1H-NMR metabolomes of SMA patients contained a reduced number and concentration of urine metabolites and differed significantly between males and females as well. Significantly larger data scatter was observed for SMA patients in comparison to matched healthy controls. Machine learning confirmed urinary creatinine as the most significant metabolite distinguishing SMA patients from the healthy cohort. The positive effects of nusinersen therapy clearly preceded or took place devoid of significant rearrangements in the 1H-NMR metabolomic makeup of serum, urine, and liquor. Urine creatinine was successful at distinguishing SMA patients from the matched healthy cohort, which is a simple systemic novelty linking creatinine and SMA to the physiology of inactivity and diabetes, and it facilitates the monitoring of SMA disease in pediatric patients through non-invasive urine collection.

This work was published as an Open Access article distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0).
For my personal contributions as a doctoral student and author of this thesis, please refer to Table 2 (page 142). The hypotheses from section 1.4.2 of this work are discussed in this paper.
2.1.4 The importance of objective stool classification in fecal 1H-NMR metabolomics: exponential increase in stool crosslinking is mirrored in systemic inflammation and associated to fecal acetate and methionine

Deutsch L., Stres B. 2021. The importance of objective stool classification in fecal 1H-NMR metabolomics: exponential increase in stool crosslinking is mirrored in systemic inflammation and associated to fecal acetate and methionine. Metabolites, 11, 3: 172, doi: https://doi.org/10.3390/metabo11030172, 16 p.

Abstract

Past studies strongly connected stool consistency, as measured by the Bristol Stool Scale (BSS), with microbial gene richness and intestinal inflammation, colonic transit time and metabolome characteristics that are of clinical relevance in numerous gastrointestinal conditions. While retention time, defecation rate, BSS but not water activity have been shown to account for BSS-associated inflammatory effects, the potential correlation with the strength of a gel in the context of intestinal forces, abrasion, mucus imprinting, and fecal pore clogging remains unexplored as a shaping factor for intestinal inflammation and has yet to be determined. Our study introduced a minimal pressure (MP) approach by probe indentation as a measure of stool material crosslinking in fecal samples. Results reported here were obtained from 170 samples collected in two independent projects, including males and females, covering a wide span of moisture contents and BSS. MP values increased exponentially with increasing consistency (i.e., lower BSS) and enabled stratification of samples exhibiting mixed BSS classes. A trade-off between lowest MP and highest dry matter content delineated the span of intermediate healthy density of gel crosslinks. The cross-sectional transects identified fecal surface layers with exceptionally high MP and of <5 mm thickness followed by internal structures with an order of magnitude lower MP, characteristic of healthy stool consistency. The MP and BSS values reported in this study were coupled to reanalysis of the PlanHab data and fecal 1H-NMR metabolomes reported before. The exponential association between stool consistency and MP determined in this study was mirrored in the elevated intestinal and also systemic inflammation and other detrimental physiological deconditioning effects observed in the PlanHab participants reported before.
The MP approach described in this study can be used to better understand fecal hardness and its relationships to human health as it provides a simple, fine scale and objective stool classification approach for the characterization of the exact sampling locations in future microbiome and metabolome studies.

This work was published as an Open Access article distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0).

For my personal contributions as a doctoral student and author of this thesis, please refer to Table 2 (page 142).
2.1.5 Systems view of deconditioning during spaceflight simulation in the PlanHab project: the departure of urine 1H-NMR metabolomes from healthy state in young males subjected to bedrest inactivity and hypoxia

Šket R., Deutsch L., Prevoršek Z., Mekjavić I.B., Plavec J., Rittweger J., Debevec T., Eiken O., Stres B. 2020. Systems view of deconditioning during spaceflight simulation in the PlanHab Project: The departure of urine 1H-NMR metabolomes from healthy state in young males subjected to bedrest inactivity and hypoxia. Frontiers in Physiology, 11: 532271, doi: https://doi.org/10.3389/fphys.2020.532271, 15 p.

Abstract

We explored the metabolic makeup of urine in prescreened healthy male participants within the PlanHab experiment. The run-in (5 day) and the following three 21-day interventions [normoxic bedrest (NBR), hypoxic bedrest (HBR), and hypoxic ambulation (HAmb)] were executed in a crossover manner within a controlled laboratory setup (medical oversight, fluid and dietary intakes, microbial bioburden, circadian rhythm, and oxygen level). The inspired O2 (FiO2) fraction next to inspired O2 (PiO2) partial pressure were 0.209 and 133.1 ± 0.3 mmHg for the NBR variant in contrast to 0.141 ± 0.004 and 90.0 ± 0.4 mmHg (approx. 4,000 m of simulated altitude) for HBR and HAmb interventions, respectively. 1H-NMR metabolomes were processed using standard quantitative approaches. A consensus of an ensemble of multivariate analyses showed that the metabolic makeup at the start of the experiment and at the HAmb endpoint differed significantly from the NBR and HBR endpoints. Inactivity alone or combined with hypoxia resulted in a significant reduction of metabolic diversity and an increasing number of affected metabolic pathways. Sliding window analysis (3 + 1) revealed that metabolic changes in the NBR lagged behind those observed in the HBR.
These results show that the negative effects of cessation of activity on systemic metabolism are further aggravated by additional hypoxia. The PlanHab HAmb variant that enabled ambulation, maintained vertical posture, and controlled but limited activity levels apparently prevented the development of negative physiological symptoms such as insulin resistance, low-level systemic inflammation, constipation, and depression. This indicates that exercise apparently prevented the negative spiral between the host’s metabolism, intestinal environment, microbiome physiology, and proinflammatory immune activities in the host.

This work was published as an Open Access article distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0).

For my personal contributions as a doctoral student and author of this thesis, please refer to Table 2 (page 142).
2.1.6 Exercise and interorgan communication: short-term exercise training blunts differences in consecutive daily urine 1H-NMR metabolomic signatures between physically active and inactive individuals

Deutsch L., Sotiridis A., Murovec B., Plavec J., Mekjavić I., Debevec T., Stres B. 2022. Exercise and interorgan communication: short-term exercise training blunts differences in consecutive daily urine 1H-NMR metabolomic signatures between physically active and inactive individuals. Metabolites, 12, 6: 473, doi: https://doi.org/10.3390/metabo12060473, 18 p.

Abstract

Physical inactivity is a worldwide health problem, an important risk factor for global mortality, and is associated with chronic noncommunicable diseases. The aim of this study was to explore the differences in systemic urine 1H-NMR metabolomes between physically active and inactive healthy young males enrolled in the X-Adapt project in response to controlled exercise (before and after the 3-day exercise testing and 10-day training protocol) in normoxic (21% O2), normobaric (~1000 hPa) and normal-temperature (23 °C) conditions at 1 h of 50% maximal pedaling power output (Wpeak) per day. Interrogation of the exercise database established from past X-Adapt results showed that significant multivariate differences existed in physiological traits between trained and untrained groups before and after training sessions and were mirrored in significant differences in urine pH, salinity, total dissolved solids and conductivity. Cholate, tartrate, cadaverine, lysine and N6-acetyllysine were the most important metabolites distinguishing trained and untrained groups.
The relatively little effort of 1 h 50% Wpeak per day invested by the untrained effectively modified their resting urine metabolome into one indistinguishable from the trained group, which hence provides a good basis for the planning of future recommendations for health maintenance in adults, irrespective of the starting fitness value. Finally, the 3-day sessions of morning urine samples represent a good candidate biological matrix for future delineations of active and inactive lifestyles detecting differences unobservable by single-day sampling due to day-to-day variability.

This work was published as an Open Access article distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0).

For my personal contributions as a doctoral student and author of this thesis, please refer to Table 2 (page 142).
Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 98 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 99 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 100 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 101 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 102 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 103 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 104 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 105 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 106 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 107 Deutsch L. 
2.1.7 Urine and fecal 1H-NMR metabolomes differ significantly between pre-term and full-term born physically fit healthy adult males

Deutsch L., Debevec T., Millet G.P., Osredkar D., Opara S., Šket R., Murovec B., Mramor M., Plavec J., Stres B. 2022. Urine and fecal 1H-NMR metabolomes differ significantly between pre-term and full-term born physically fit healthy adult males. Metabolites, 12: X, doi: https://doi.org/10.3390/metabo12060536, 23 p.

Abstract

Preterm birth (before 37 weeks gestation) accounts for ~10% of births worldwide and remains one of the leading causes of death in children under 5 years of age. Preterm born adults have been consistently shown to be at an increased risk for chronic disorders including cardiovascular, endocrine/metabolic, respiratory, renal, neurologic, and psychiatric disorders that result in increased death risk. Oxidative stress was shown to be an important risk factor for hypertension, metabolic syndrome and lung disease (reduced pulmonary function, long-term obstructive pulmonary disease, respiratory infections, and sleep disturbances).
The aim of this study was to explore the differences between preterm and full-term male participants’ levels of urine and fecal proton nuclear magnetic resonance (1H-NMR) metabolomes, during rest and exercise in normoxia and hypoxia, and to assess general differences in human gut microbiomes through metagenomics at the level of taxonomy, diversity, functional genes, enzymatic reactions, metabolic pathways and predicted gut metabolites. Significant differences existed between the two groups based on the analysis of 1H-NMR urine and fecal metabolomes and their respective metabolic pathways, enabling the elucidation of a complex set of microbiome-related metabolic biomarkers, supporting the idea of distinct host-microbiome interactions between the two groups and enabling the efficient classification of samples; however, this could not be directed to specific taxonomic characteristics.

This work was published as an Open Access article distributed under the terms of the Creative Commons Attribution License (CC-BY 4.0). For my personal contributions as a doctoral student and author of this thesis, please refer to Table 2 (page 142). The hypotheses from section 1.4.1 of this work were discussed in this paper.
2.2 ADDITIONAL SCIENTIFIC WORK

2.2.1 Metagenome-assembled genomes from the PreTerm project

2.2.1.1 Introduction

In the context of taxonomic and functional analysis of the microbiome community, the third option is to assemble short-read sequences obtained with modern sequencing technologies into fully recovered genomes from the microbiome using available tools. This process is used to assemble new metagenome-assembled genomes (MAGs). There are many genome assemblers specifically designed for metagenomic data, but none of them is perfect. A whole range of specialised tools has been developed to solve the problems of metagenomic assembly caused by the properties of the collected data. Depending on the length of the generated reads, assemblers are based on different approaches, from overlap-layout-consensus tools based on overlap strategies to those using de Bruijn graphs. It is important to note that the popularity of an assembler is influenced not only by its efficiency and the quality of its output, but also by its ease of use, the existence of a simple, detailed and easy-to-understand manual, the continuous development of the tool, and the speed and quality of feedback from the tool’s support team (Lapidus and Korobeynikov, 2021). For this reason, we have combined the multitude of different tools needed for metagenome assembly into the MAGO pipeline (Section 2.1.1). This approach can lead to the discovery of new species that cannot be cultured and that have become increasingly important in recent years (Fricker et al., 2019; Nayfach et al., 2019; Murovec et al., 2020; Lapidus and Korobeynikov, 2021).

Sequence assembly can be divided into two necessary steps, both of which are already included in the MAGO pipeline:
1. metagenomic assembly (assembly of short-read sequences (250 base pairs) into longer contigs);
2. binning (grouping of contigs with similar sequences into bins corresponding to the same taxon ID (e.g., closely related organisms)).

This process can also produce artefacts in de novo assembled sequences, such as “bulges” or “tips”, which are often caused by sequencing errors (Zerbino and Birney, 2008). For this reason, MAGs need to be validated. For this purpose, the CheckM tool (Parks et al., 2015) is used to check the completeness and contamination of the assembled genomes. MAGs can be divided into high- and medium-quality groups according to the standards for minimum information about a metagenome-assembled genome (MIMAG). MAGs in the high-quality group contain < 5% contamination and are > 90% complete. Medium-quality MAGs contain < 10% contamination and are > 50% complete (Parks et al., 2015; Bowers et al., 2017).

The hypotheses from section 1.4.1 were partly discussed in this chapter (Table 2).

2.2.1.2 Materials and methods

For this work, we used sequences from the PreTerm project (Deutsch et al., 2022b) to compile characteristic MAGs for the preterm and full-term groups of participants. The main purpose was to identify characteristic species belonging to the participants of the preterm group. Sequences from the preterm and control groups were assembled separately using the MAGO Singularity image on the Leo4 HPC cluster (University of Innsbruck, Austria). Fastp (Chen et al., 2018) and FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) were used for quality control and preprocessing. Three different assemblers were used for assembly: metaSPAdes (Nurk et al., 2017), MEGAHIT (Li et al., 2015) and IDBA-UD (Peng et al., 2012).
Contigs were binned and bins were improved using the tools BinSanity (Graham, Heidelberg and Tully, 2017), CONCOCT (Alneberg et al., 2014), MetaBAT (Kang et al., 2015) and MaxBin (Wu et al., 2016). Finally, DAS Tool (Sieber et al., 2018) was used to refine and dereplicate the resulting bins to obtain near-complete MAGs, which were then checked for completeness and contamination using the CheckM tool (Parks et al., 2015). High-quality MAGs from both groups were used for average amino acid identity calculation with ezTree (Wu, 2018), genome annotation with Prokka (Seemann, 2014), pan- and core-genome analysis with Roary (Page et al., 2015), and high-throughput average nucleotide identity calculation with FastANI (Jain et al., 2018), all of which are integrated into the MAGO tool (Murovec et al., 2020). The JSpeciesWS online service (Richter et al., 2016) was used to determine taxonomic thresholds with a tetra-correlation search (Teeling et al., 2004) by comparing our high-quality MAGs with the reference genome database (GenomesDB). A mosaic plot was generated using the Past software (Hammer et al., 2001).

2.2.1.3 Results

The total number of sequencing reads was lower in the preterm group (491 million total reads compared to 531 million reads in the control group). After filtering with fastp, 494.6 million reads were obtained in the control group and 486 million reads in the preterm group. The other reads were removed because they were of poor quality or contained too many Ns (bases that could not be called). The remaining sequences were used for metagenome assembly; 320 MAGs were assembled in the preterm group, 27 of which belonged to the high-quality group (average completeness was 93.93±2.9% and contamination was 2.9±1.43%). In the control group, 124 MAGs were assembled, 24 of which belonged to the high-quality group (average completeness was 95.4±2.8% and contamination was 2.5±1.4%).
MAGs from the preterm group were approximately 1 Mb larger and contained almost twice as many contigs. Preterm MAGs also had a higher percentage of GC base pairs (5% higher on average). All high-quality MAGs were submitted to the online service JSpeciesWS for a tetra-correlation search with the genome reference database GenomesDB, which contains more than 55,000 genomes. No significant differences were observed between the preterm and control groups (Figure 10, Figure 11). Approximately the same number of high-, medium-, and low-quality MAGs were assembled in both groups.

[Figure 10: scatter plot; x-axis: Completeness (%), y-axis: Contamination (%); series: control, preterm]

Figure 10: Relationship between completeness and contamination of MAGs in control and preterm group.
Slika 10: Odnos med popolnostjo in kontaminacijami na novo sestavljenih metagenomov v kontrolni in preterm skupini.

Figure 11: Number of MAGs in both groups and their quality. Number of high (completeness>95%, contamination<5%), medium (completeness>75%, contamination<10%) and low (completeness>50%, contamination<25%) quality MAGs in the preterm and control groups.
Slika 11: Število na novo sestavljenih metagenomov med skupinami in njihova kvaliteta. Število na novo sestavljenih metagenomov visoke (popolnost>95 %, kontaminacija<5 %), srednje (popolnost>75 %, kontaminacija<10 %) in nizke (popolnost>50 %, kontaminacija<25 %) v preterm in kontrolni skupini.
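The quality grouping described above can be illustrated with a minimal Python sketch. It is not part of the MAGO pipeline; the names `QualityTier`, `classify_mag` and `count_by_tier` are hypothetical, and the sketch uses only the two CheckM-style values (completeness and contamination), ignoring any further criteria a standard may require. Two threshold sets are encoded because the Introduction and the caption of Figure 11 state different cutoffs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityTier:
    name: str
    min_completeness: float   # percent, exclusive lower bound
    max_contamination: float  # percent, exclusive upper bound


# Cutoffs as stated in section 2.2.1.1 (strictest tier first).
MIMAG_TIERS = (
    QualityTier("high", 90.0, 5.0),
    QualityTier("medium", 50.0, 10.0),
)

# Cutoffs as stated in the caption of Figure 11 (strictest tier first).
FIGURE_11_TIERS = (
    QualityTier("high", 95.0, 5.0),
    QualityTier("medium", 75.0, 10.0),
    QualityTier("low", 50.0, 25.0),
)


def classify_mag(completeness, contamination, tiers=MIMAG_TIERS):
    """Return the name of the first (strictest) tier whose cutoffs are met."""
    for tier in tiers:
        if completeness > tier.min_completeness and contamination < tier.max_contamination:
            return tier.name
    return "unclassified"


def count_by_tier(mags, tiers=MIMAG_TIERS):
    """Count MAGs per tier; `mags` is an iterable of (completeness, contamination)."""
    counts = {}
    for completeness, contamination in mags:
        label = classify_mag(completeness, contamination, tiers)
        counts[label] = counts.get(label, 0) + 1
    return counts


if __name__ == "__main__":
    # Group averages of the high-quality preterm MAGs reported in the Results,
    # used here only as an illustrative input.
    print(classify_mag(93.93, 2.9))                   # text cutoffs
    print(classify_mag(93.93, 2.9, FIGURE_11_TIERS))  # figure-caption cutoffs
```

Keeping the cutoffs as data rather than hard-coding them makes the difference between the two threshold sets explicit: the same MAG can fall into different tiers depending on which set is applied.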
2.2.1.4 Discussion

Sequences from the preterm and full-term (control) groups were assembled separately in order to search for group-specific MAGs that could lead to the discovery of taxonomic differences that were not observed in the previously published metaBakery analysis. Although a greater number of high-, medium-, and low-quality MAGs were assembled in the preterm group according to the MIMAG standard (Bowers et al., 2017), we did not observe MAGs specific to the preterm group. The quality of the sequences was comparable and not significantly different between the groups. The higher number of MAGs is consistent with the higher diversity indices in the preterm group, as previously observed (Deutsch et al., 2022). One of the most important aspects of de novo MAG assembly is the ability to detect the “uncultured majority”, which is also what we hoped to detect, especially in the preterm group. Based on these results, we can conclude that preterm and full-term born adults do not differ in terms of microbial taxonomy, although this may reflect the unequal variance within the groups. In contrast, we have shown that the functionality of the microbial worlds differs between the adult preterm and adult full-term groups in terms of enzymatic reactions, metabolic pathways, and predicted metabolites (Deutsch et al., 2022b). This once again shows the higher relevance and importance of microbial functionality relative to microbial taxonomic composition for the inference of relationships with human phenotypic characteristics through the production of various metabolites.
2.2.2 Data integration

2.2.2.1 Introduction

To properly understand the complexity of biological systems, well-being, and diseases, various ‘omics high-throughput technologies (e.g., sequencing, various types of spectrometry, etc.) have been used and are becoming more affordable for scientists (Zitnik et al., 2019). It soon became very clear that we cannot capture the whole understanding of the system based on only one level of datasets. The “top-down” approach is a term used in the context of systems biology research. In general, this means that we measure a set of parameters at the system level and then make inferences about the overall functionality of the system (Kohl et al., 2010; Price et al., 2017). Without ‘omics methods, all domains relied strictly on single types of data that could not explain the entire system. ‘Omics methods enabled the development of modern statistical approaches (data reduction methods) and data integration. With these approaches, it became easier to draw conclusions based on the thousands of parameters that ‘omics methods can measure (Zitnik et al., 2019). These approaches, along with machine learning, are converging into precision medicine, which is characterised by four terms (also referred to as P4 for short): predictive, preventive, personalized, and participatory. The combination of all four helps us maintain our health longer and prevent noncommunicable diseases (Hood and Friend, 2011; Hood and Flores, 2012; Price et al., 2017). With the combination of ‘omics methods, developed models, and evaluation of these methods in practical medicine, future health policies will also change, and the chances of detecting diseases as early as possible, before it is too late for effective treatment, will increase.
However, there is also a need for caution in introducing this approach into daily use, especially with regard to data protection and the need for better and more secure computing infrastructure (Thapa and Camtepe, 2021).

The hypotheses from section 1.4.3 were assessed in this chapter (Table 2).

2.2.2.2 Materials and methods

Total urinary NMR metabolomes collected in five different projects (PlanHab (Debevec et al., 2014; Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Šket et al., 2020), X-Adapt (Deutsch et al., 2022a), healthy women and men, SMA (Deutsch et al., 2020), PreTerm (Deutsch et al., 2022b)) were integrated with the aim of building the Slovenian NMR database (manuscript in preparation). We used the DIABLO (Singh et al., 2019) and PLSDA (Wang and Lê Cao, 2020) methods, which are integrated into the mixOmics R package (Rohart et al., 2017).

2.2.2.3 Results

All urinary metabolomes collected in five different projects were utilized: PlanHab (522 samples), PreTerm (183 samples), Spinal Muscular Atrophy (48 samples), X-Adapt (239 samples), Healthy Women and their daughters (94 samples), and Healthy Men and their sons (133 samples); 185 samples were included in the low physical activity group (bedrest part of the PlanHab study and spinal muscular atrophy participants), 919 samples were included in the medium physical activity group (healthy women and men, start of the PlanHab study and Hamb end of the PlanHab study, preterm and full-term born participants from the PreTerm project, untrained participants of the X-Adapt study), and 115 samples were included in the high physical activity group (trained X-Adapt study participants) (Figure 12).
The largest area under the curve was observed when comparing the low activity group with the others (AUC=0.91) and the lowest when comparing the medium activity group with the others (AUC=0.75) (Figure 13).

Figure 12: PLSDA of all metabolomes stratified by activity. The sample plot representing PLSDA centroids of all 1200 metabolomes obtained in five different datasets and corresponding to their level of physical activity.
Slika 12: Rezultati analize PLSDA vse metabolomov glede na aktivnost. Graf prikazuje centroide PLSDA vseh 1200 zbranih metabolomov v petih različnih študijah in razdeljenih glede na nivo njihove fizične aktivnosti preiskovancev.

Figure 13: The success of classification with PLSDA. ROC curves with the accompanying AUC values representing the success of classification of metabolomes between three different levels of activity.
Slika 13: Uspeh klasifikacije z metodo PLSDA. Krivulje ROC s pripadajočimi vrednostmi AUC, ki prikazujejo uspešnost klasifikacije metabolom glede na nivo fizične aktivnosti.

2.2.2.4 Discussion

We combined more than 1200 collected samples of urine 1H-NMR metabolomes into the Slovenian urine NMR database. Information from all our previous projects (PlanHab, spinal muscular atrophy, X-Adapt, PreTerm, healthy women and men) was integrated. All measured spectra were analysed with the same spectral deconvolution procedure to obtain metabolites in all projects. We have shown that we can distinguish between the different levels of physical activity based on the metabolites in urine. Future integration of additional data on various diseases with medical diagnoses could provide a basis for the development of a pre-screening tool suitable for routine information gathering in a clinical setting.
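The AUC values in Figure 13 compare each activity group with all others. As a minimal sketch of how such a one-vs-rest AUC can be computed from per-sample classifier scores, the pure-Python function below uses the rank interpretation of the AUC (the probability that a randomly chosen sample of the group of interest scores higher than a randomly chosen sample of the other groups). The function name and the example scores are hypothetical; the analysis in this work was performed with mixOmics in R, not with this code.

```python
def auc_one_vs_rest(scores, labels, positive):
    """One-vs-rest AUC via pairwise comparison; ties count as one half.

    scores   -- classifier scores for the `positive` class, one per sample
    labels   -- true group label of each sample
    positive -- the group compared against all other groups
    """
    pos = [s for s, lab in zip(scores, labels) if lab == positive]
    neg = [s for s, lab in zip(scores, labels) if lab != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for membership in the "low" activity group.
    labels = ["low", "low", "medium", "medium", "high"]
    scores = [0.9, 0.6, 0.7, 0.2, 0.1]
    print(auc_one_vs_rest(scores, labels, "low"))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for an illustration; a rank-based formulation would scale better for larger datasets.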
Such large integrations of metabolomics data into a single database are also susceptible to several sources of systematic error that can lead to a lack of reproducibility and poor data quality. To minimize this, all samples were processed in the same way using our in-house processing pipeline (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Šket et al., 2020; Deutsch et al., 2021a; Deutsch et al., 2021b; Deutsch et al., 2022a; Deutsch et al., 2022b), alongside commercially available software for targeted spectral deconvolution analysis utilizing the same version of the Human Metabolome Database 4.0. Our pipeline is therefore generic and accessible to other interested researchers, making repeated exploration of the same data a reality. In addition, significant extensions with novel data can be made every year with reasonable effort. This should lead to database updates, as the Human Metabolome Database has grown from a few thousand metabolites in the first edition (Wishart et al., 2007) to 217,000 metabolites in the latest edition, published in 2021 (Wishart et al., 2021). The data recorded in the past can be effectively reanalysed for novel insight and an increased percentage of explained spectral information. Furthermore, standardized analytical protocols in our laboratory allowed us to minimize the systematic errors that normally occur due to batch effects. However, there is still room for improvement. Batch effects need to be eliminated in the integration and construction of databases (Ding et al., 2022). There are already approaches to eliminate batch effects, usually developed in other ‘omics domains.
These approaches include Dirichlet-multinomial regression (Dai et al., 2019), percentile-normalization methods (Gibbons et al., 2018), quantile regression methods (Ling et al., 2021), the ComBat Bayesian approach (Johnson et al., 2007), Norm ISWSVR (Ding et al., 2022), and sPLSDA (Wang and Lê Cao, 2020), which is implemented in mixOmics. Batch effects can occur when comparing different studies for biological reasons (uniqueness of each biological system due to health status, diet, or lifestyle in general), technical reasons (different batches of the same buffers, different vendors, protocols, NMR devices), or computational reasons (use of different parameters and different software) (Wang and Lê Cao, 2020). Another question is which normalization method is best for the data being analysed. In the field of metabolomics, the NOREVA software was developed to address this challenge. Its limitation is that it was developed for MS metabolomics and is not suitable for NMR metabolomics (Yang et al., 2020). In our case, Box-Cox normalization and the sPLSDA approach were used to integrate all metabolomes. This method showed competitive performance in removing batch effects while still preserving variation due to lifestyle or other biological metadata categories (Wang and Lê Cao, 2020). We have shown that urinary metabolic fingerprinting has the potential to reveal an individual’s metabolic status and provide a snapshot of health and disease (Azad and Shulaev, 2019; Mussap et al., 2021). Metabolomics in general involves the systematic identification of metabolites in the human body. To increase its use in daily medical practice, all levels of metabolomics research should be standardised (sampling, wet lab analysis, and also analytical approaches at the level of algorithms) (Ashrafian et al., 2021).
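The Box-Cox normalization mentioned above is a power transformation of strictly positive values: for a parameter λ, each value x is mapped to (x^λ − 1)/λ when λ ≠ 0 and to ln(x) when λ = 0. The following is a minimal sketch of the transformation only; the function name, the example concentrations and the value of λ are assumptions for illustration, since the text does not state which λ was used or how it was estimated.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox power transformation with parameter `lam`.

    Values must be strictly positive; lam == 0 gives the natural logarithm.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


if __name__ == "__main__":
    # Hypothetical metabolite concentrations (arbitrary units).
    concentrations = [1.0, 4.0, 9.0]
    print(box_cox(concentrations, 0.5))
    print(box_cox(concentrations, 0.0))
```

In practice λ is usually estimated from the data (for example by maximum likelihood), and zero concentrations need an explicit handling rule before transformation; both choices are outside the scope of this sketch.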
Building a national database will improve the understanding of the Slovenian metabolome and the identification of metabolites specific to a particular disease or physical condition. This approach was demonstrated in the Netherlands based on 26,000 collected blood metabolomes in the Dutch Biobanking and BioMolecular Resources and Research Infrastructure (Bizzarri et al., 2022). They showed that 1H-NMR metabolomics can capture a wide range of conventional clinical variables in epidemiological studies and that it is possible to generate machine-learning-based predictors for discriminating between different diseases such as diabetes, metabolic syndrome, insulin resistance, and inflammation (Crohn disease, ulcerative colitis). Top-down interpretation of metabolomic datasets consisting of different studies is impossible using simple approaches due to the enormous amount of data (Lakrisenko and Weindl, 2021). In addition, and in line with the above, metabolomics accounted for the majority of funding and potential research among all ‘omics fields. However, it was also noted that the main problem is the lack of standardisation for integrating different metabolomics datasets and that this could be important in the future to increase the confidence of metabolite identification in large datasets, but also to address the variability within and between different ‘omics fields (Yu et al., 2022). For this reason, newly developed methods that are tailored to address these problems in a statistically sound way should be used.
Due to the complexity of the data linked to metadata of patients and/or participants, computational models are needed to understand these data in different ways, such as machine-learning methods (Bizzarri et al., 2022), metabolic networks (Töpfer et al., 2015), and constraint-based and kinetic models (Volkova et al., 2020; Lakrisenko and Weindl, 2021). Our database already provides one implementation of the above considerations as a sound and effective approach, transforming the 1H-NMR urine data into a form suitable for building machine-learning models in the very near future for use in medical diagnostics. Unknown urine samples could then be classified as members of either healthy or various disease groups. With this work, we aim to stimulate the interest of other researchers in the field of biomedicine to include NMR metabolomics in their research process and to complement our newly established database with concise descriptions of medical conditions, in order to reach some 10,000 samples at the national scale. This is of relevance due to the central European geographic location of the Republic of Slovenia and its local genetic characteristics coupled to lifestyle habits, dietary characteristics, and environmental conditions. To summarize, the assembly and modelling of these data to create ML models is a viable approach that can be used in medical practice to distinguish between various disease phenotypes and healthy groups. Taking this approach brings us one step closer to precision data-driven medicine that would improve health care on a national scale. A Slovenian urine NMR database paper is currently in preparation.
3 DISCUSSION AND CONCLUSIONS

3.1 DISCUSSION

In this chapter, we summarise the developments presented within this doctoral thesis in a more comprehensive, interrelated manner. First, we focus on “3.1 Developed tools for data integration”, then on “3.2 Physico-chemical characteristics of microbial world in the gut”, and continue with the most important review of the data and findings produced within the four projects: “3.3 Metabolomics in the PlanHab study”, “3.4 Spinal muscular atrophy”, “3.5 X-Adapt project – the influence of short term training on inactive individuals” and “3.6 metabolomes and microbial metagenomes can distinguish pre-term and full-term born adults”. Finally, we focus on the most informative part, “3.7 data integration”, with the concluding remarks “3.8 What about the future?” and extensions of the presented work. Table 2 lists my personal contributions to each paper published within the four years of this PhD.

Table 2: My contributions to published and unpublished work and postulated hypotheses in the frame of this PhD.
Preglednica 2: Moj doprinos k objavljenim člankom in postavljene hipoteze v okviru doktorata.

Columns: Postulated hypothesis in PhD proposal | Published or additional work | Leon Deutsch contributions

Published or additional work: Murovec B., Deutsch L., Stres B. 2019. Computational framework for high-quality production and large-scale evolutionary analysis of metagenome assembled genomes. Molecular Biology and Evolution, 37, 2: 593-598
Leon Deutsch contributions: Conceptualization of analysis, B.S.; methodology, B.M., L.D., B.S.; formal analysis, L.D., B.S., B.M.; data curation, L.D., B.S., B.M.; writing (original draft preparation), B.S., L.D.; visualization, L.D., B.S.; project administration, B.S.; funding acquisition, B.S., B.M.

Published or additional work: Murovec B., Deutsch L., Stres B. 2021. General unified microbiome profiling pipeline (GUMPP) for large scale, streamlined and reproducible analysis of bacterial 16S rRNA data to predicted microbial metagenomes, enzymatic reactions and metabolic pathways. Metabolites, 11, 6: 336, doi: https://doi.org/10.3390/metabo11060336, 14 p.
Leon Deutsch contributions: Conceptualization of analysis, B.S.; methodology, B.M., L.D., B.S.; formal analysis, L.D., B.S., B.M.; data curation, L.D., B.S., B.M.; writing (original draft preparation), B.S., L.D.; visualization, L.D., B.S.; project administration, B.S.; funding acquisition, B.S., B.M.

Postulated hypothesis in PhD proposal: H0: There are no significant differences in metabolomes before and after treatment. H1: There are significant differences in urine (systemic) and liquor (local) metabolomes before and after treatment with gene therapy, enabling identification of characteristic metabolic pathways discerning the two groups.
Published or additional work: Deutsch L., Osredkar D., Plavec J., Stres B. 2021. Spinal muscular atrophy after nusinersen therapy: improved physiology in pediatric patients with no significant change in urine, serum, and liquor 1H-NMR metabolomes in comparison to an age-matched, healthy cohort. Metabolites, 11, 4: 206, doi: https://doi.org/10.3390/metabo11040206, 15 p.
Leon Deutsch contributions: Conceptualization for metabolomic analysis, B.S.; samples collection, L.D., B.S. and D.O.; metabolome analysis, L.D. and B.S.; methodology, D.O., B.S. and J.P.; formal analysis, L.D., B.S., D.O. and J.P.; data curation, L.D. and B.S.; writing (original draft preparation), L.D., B.S., D.O. and J.P.; visualization, L.D. and B.S.; project administration, D.O. and B.S.; funding acquisition, D.O. and B.S.

Postulated hypothesis in PhD proposal: A
Published or additional work: Deutsch L., Stres B. 2021. The importance of objective stool classification in fecal 1H-NMR metabolomics: exponential increase in stool crosslinking is mirrored in systemic inflammation and associated to fecal acetate and methionine. Metabolites, 11, 3: 172, doi: https://doi.org/10.3390/metabo11030172, 16 p.
Leon Deutsch contributions: Conception and design of the study (B.S.), data collection (L.D., B.S.), data preparation and analysis (L.D., B.S.), writing and critical revision of the manuscript (L.D., B.S.).

Postulated hypothesis in PhD proposal: A
Published or additional work: Šket R., Deutsch L., Prevoršek Z., Mekjavić I.B., Plavec J., Rittweger J., Debevec T., Eiken O., Stres B. 2020. Systems view of deconditioning during spaceflight simulation in the PlanHab Project: The departure of urine 1H-NMR metabolomes from healthy state in young males subjected to bedrest inactivity and hypoxia. Frontiers in Physiology, 11: 532271, doi: https://doi.org/10.3389/fphys.2020.532271, 15 p.
Leon Deutsch contributions: BS provided the concept for metabolome analysis and drafted the manuscript. TD and JR collected the samples. BS, RŠ, and JP designed the metabolome analyses. RŠ, BS, ZP, LD, OE, and IM conducted the research. RŠ, BS, and LD analyzed the data. RŠ and BS provided necessary code to streamline 1H-NMR spectra analyses and provided statistical analyses.

Published or additional work: Deutsch L., Sotiridis A., Murovec B., Plavec J., Mekjavić I., Debevec T., Stres B. 2022. Exercise and interorgan communication: short-term exercise training blunts differences in consecutive daily urine 1H-NMR metabolomic signatures between physically active and inactive individuals. Metabolites, 12, 6: 473, doi: https://doi.org/10.3390/metabo12060473, 18 p.
Leon Deutsch contributions: Conceptualization, T.D. and B.S.; methodology, L.D. and B.S.; conceptualization for metabolomic analysis, J.P. and B.S.; formal analysis, L.D. and B.S.; data curation, B.M., L.D. and B.S.; exercise database, T.D., A.S., I.M.; writing (original draft preparation), L.D. and B.S.; visualization, L.D. and B.S.; supervision,
B.S.; project administration, A.S., T.D., I.M., B.S.; funding acquisition, T.D., I.M., B.S. Continued on next page 143 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 Table 2 (continued) Postulated hypothesis in PhD Published or additional work Leon Deutsch contributions proposal H0: No significant difference exists between preterm and term groups of participants at the levels of faecal or urine metabolomes or faecal metagenomes. H1: There are significant Conceptualization, T.D., D.O., differences between preterm G.P.M. and B.S.; data and term groups of collection, M.M., S.O., R.Š., participants in faecal and L.D. and B.S.; methodology, urine metabolomes that can Deutsch L., Debevec T., Millet G.P.., L.D., R.Š. and B.S.; be linked to their physical Osredkar D., Opara S., Šket R., Murovec conceptualization for performance in experiments B., Mramor M., Plavec J. Stres B. 2022. metabolomic analysis, J.P. and and physiological data at Urine and fecal 1H-NMR metabolomes B.S., formal analysis, L.D. and exercise and rest. differ significantly between pre-term and B.S.; data curation, B.M.; H2: There are significant full-term born physically fit healthy adult writing—original draft differences at the level of males. Metabolites, 12: 6, doi. preparation, L.D. and B.S.; metagenomics makeup of https://doi.org/10.3390/metabo12060536, visualization, L.D. and B.S.; both groups, giving rise to 23 p. supervision, B.S.; project identification of specific administration, T.D., G.P.M., metabolic pathways differing D.O. and B.S.; funding between the two groups and acquisition, T.D., D.O. and B.S. their gut environment characteristics. H3: Term and preterm gut samples contain specific MAGs associated with differences in gut environmental conditions between the two groups. 
Data collection, formal analysis, MAGs assembly visualisation, writing H0: There is no significant difference between metabolomes of prematurely born, born on time, before SMA treatment and post SMA treatment groups. Data collection, formal analysis, Data integration H1: There are significant visualization, writing differences in urine metabolomes that enable identification of biomarker pools and metabolic pathways delineating various groups under investigation. 144 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 3.1.1 Developed tools for data integration Microbial species play important roles in diverse environments characterised by a wide range of organismal complexity (Murovec et al., 2020). Microbes living in the gut are in constant not only bidirectional interactions with the host but also multidirectional interaction with their microbial counterparts through the production of various molecules that can improve the health status of the host or, in contrast, lead to the development of a noncommunicable disease or its progression (Murovec et al., 2020). Disease progression can manifest as mild gastrointestinal symptoms or as serious diseases such as inflammatory bowel disease, colon cancer, or liver cancer. It has to be kept in mind that specific proteins and peptides next to metabolites from metabolic reactions mediate the crosstalk between gut, brain, and other peripheral metabolic organs in order to maintain energy homeostasis. The multidirectional interactions between metabolic organs and the central nervous system have evolved in parallel with the multicellularity of organisms to maintain whole-body energy homeostasis and ensure the organism’s adaptation to external environmental parameters. 
These interactions become severely affected in pathological conditions of noncommunicable diseases, such as obesity, insulin resistance, metabolic syndrome, or type 2 diabetes. Bioactive peptides and proteins, next to hormones and cytokines produced by both peripheral organs and the central nervous system, plus molecules from muscle wear and tear, including metabolites from the microbiome and from energy production and consumption, are key messengers in this interorgan communication (Castillo-Armengol et al., 2019). A number of diseases have been linked to metabolic imbalances that are partially or completely related to the gut microbiome, from metabolic syndrome and obesity to autoimmune diseases, infections, and mental disorders (Murovec et al., 2021). The development of sequencing technologies enabled the study of microbes that cannot be cultured. It quickly became clear that most microbes (i.e., 99%) cannot be cultured in the laboratory, but we can sequence their genetic material and see which microbes are present in a sample. Based on amplicon sequencing (e.g., 16S rRNA) or whole metagenome sequencing, we can determine which microbes are present in the samples (microbiota) and either infer their genetic potential (based on 16S rRNA coupled to the nearest genome sequences) or analyse all the genes present in the sample (based on the whole metagenome). Based on this genetic potential, we can infer the microbial functionality of the sample (what these microbes most likely can do), the enzymatic reactions they support, the metabolic pathways that result from these enzymatic reactions, and the metabolites that are most likely the result of all these transformations (Berg et al., 2020). A number of different methods have been developed for the analysis of sequences in the context of microbiome research.
Based on 16S rRNA, Mothur (Schloss et al., 2009) can be used to analyse amplicon sequence material at three different levels: (i) genus (Rühlemann et al., 2021), (ii) operational taxonomic units at 97% 16S rRNA identity (Mysara et al., 2017), or (iii) amplicon sequence variants (Callahan et al., 2017; Schloss, 2021). In addition, another set of tools was developed for predicting microbial functionality based on amplicon sequences: PICRUSt (Langille et al., 2013a), PICRUSt2 (Douglas et al., 2020), Tax4Fun (Aßhauer et al., 2015), Tax4Fun2 (Wemheuer et al., 2020), and Piphillin (Narayan et al., 2020). These tools link 16S rRNA sequence information to reference genome sequences and predict microbial potential based on metagenomic functional gene content (Sun et al., 2020). We have developed GUMPP for large-scale, streamlined, and reproducible analysis of bacterial amplicon data and prediction of their functional potential (Murovec et al., 2021). It combines Mothur (Schloss et al., 2009), PICRUSt2 (Douglas et al., 2020), and Piphillin (Narayan et al., 2020). Thus far, more than 600 samples from 32 studies, amounting to 120 million reads, have been analysed in a meta-analysis project (Klammsteiner, University of Innsbruck, in preparation). A more objective analysis of microbial functionality, however, requires sequencing the entire metagenome directly. Whole metagenome sequencing involves the untargeted sequencing of a random subset of all sequences to a certain read depth, unlike targeted (amplicon) sequencing, in which only a small portion of a specific gene is sequenced.
bioBakery (McIver et al., 2018; Beghini et al., 2021) is a workflow for whole metagenome sequence analysis that combines different tools for quality analysis, taxonomic analysis (MetaPhlAn), and the analysis of functional genes, enzymatic reactions, and metabolic pathways of interest in the microbial community (HUMAnN 3). In addition, an extension of this method (MelonnPan), trained on actual metagenomes coupled to lipid-soluble and water-soluble metabolomes determined by mass spectrometry, allows the prediction of microbial metabolites from metagenome information alone and hence a description of the metabolomes that might be produced in the community. Another positive aspect of whole metagenome sequencing is that information on genetic material can be obtained from different taxonomic groups (archaea, bacteria, protozoa, fungi, DNA viruses, and also human DNA), which can improve the understanding of the complexity and interactions between different taxonomic layers. We are in the process of publishing the metaBakery workflow that we developed (manuscript in preparation). It is a re-implementation of the bioBakery workflow with additional sequence QC steps, extended with the diversity calculators implemented in Mothur, guided by our in-house skeleton application, and packaged as a Singularity container for large-scale, streamlined, and reproducible analyses in an HPC setting. The next step in whole metagenome sequencing is the possibility of de novo metagenome assembly. This is a process in which reads are screened for quality, assembled, and binned to yield metagenome-assembled genomes (MAGs). This process can lead to the discovery of entirely new species. However, care must be taken regarding the completeness and contamination of the newly assembled genomes. According to the MIMAG standard (Bowers et al., 2019), we should all strive to assemble the most complete (> 95%) and least contaminated (< 5%) MAGs.
These will enable the next stage of evolutionary analysis and hopefully provide new ideas on how microbes interact with human beings as their host. We have developed the Metagenome-Assembled Genomes Orchestra (MAGO; Murovec et al., 2020) from highly successful tools for quality analysis (FastQC, fastp (Chen et al., 2018)), assembly (IDBA-UD (Peng et al., 2012), metaSPAdes (Nurk et al., 2017), and MEGAHIT (Li et al., 2015)), and binning (MaxBin (Wu et al., 2016), MetaBAT (Kang et al., 2015), CONCOCT (Alneberg et al., 2014), BinSanity (Graham et al., 2017), and DAS Tool (Sieber et al., 2018)). In conjunction, the CheckM tool (Parks et al., 2015) is used to identify which MAGs are of high quality according to the MIMAG standard (Bowers et al., 2019). MAGO also allows the user to analyse the evolution of the obtained MAGs using ezTree (Wu, 2018), while average amino acid identity (AAI) gives insight into species cut-off values, Prokka (Seemann, 2014) provides genome annotation, Roary (Page et al., 2015) provides pan- and core-genome analysis, and FastANI provides nucleotide identity analysis of genomes. The resulting bins are then selected based on their completeness and contamination according to the MIMAG standard and subsequently analysed with other tools (Castro et al., 2018; Rodriguez-R et al., 2018; Ruiz-Perez et al., 2021). The three tools for large-scale data analysis presented in our work (MAGO, GUMPP, and metaBakery (manuscript in preparation)) were built on a skeleton framework of more than 10,000 lines of Python code, which orchestrates the execution of each part, runs the individual programs, and creates their command lines (Murovec et al., 2020; Murovec et al., 2021).
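The selection of bins by completeness and contamination described above can be illustrated with a minimal sketch. It assumes that completeness and contamination estimates (in percent) are already available for each bin, for example parsed from a CheckM report. The `MagQuality` record, the function name, and the example values are hypothetical and are not part of MAGO. The default thresholds are the ones quoted in the text (> 95% completeness, < 5% contamination).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MagQuality:
    """Quality estimates for one bin (hypothetical record, values in percent)."""
    name: str
    completeness: float
    contamination: float


def select_high_quality(bins, min_completeness=95.0, max_contamination=5.0):
    """Keep bins strictly above the completeness threshold and strictly
    below the contamination threshold (thresholds as quoted in the text)."""
    return [
        b for b in bins
        if b.completeness > min_completeness and b.contamination < max_contamination
    ]


# Invented example values, for illustration only.
bins = [
    MagQuality("bin_001", completeness=98.2, contamination=1.1),
    MagQuality("bin_002", completeness=96.0, contamination=7.4),
    MagQuality("bin_003", completeness=71.5, contamination=0.3),
]

selected = select_high_quality(bins)
print([b.name for b in selected])
```

Keeping the thresholds as function parameters makes it easy to apply a different quality tier without changing the selection logic.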
The parameters for the execution of the workflow are entirely in the hands of the user. All tools were developed as Singularity images (Kurtzer et al., 2017) prepared for straightforward deployment on HPC systems, both for the large-scale analysis of 10,000 samples and for educational purposes. Both the metaBakery and MAGO tools were used for metagenomic sequence analysis in the PreTerm project (Deutsch et al., 2022b). Both tools are released under the CC-BY 4.0 open-source license and are open to extensions, thus providing the opportunity to develop further and become standardized workflows for microbial analysis on a global scale. metaBakery (in preparation) will be used in a future project that is part of the Million Microbiomes from Human Project (MMHP; Fang et al., 2018; Han et al., 2018; Patterson et al., 2019) and will provide insight into the Slovenian gut microbiome. Currently, 5000 deep-sequenced samples (10 million reads per sample) encompassing 13 gastrointestinal diseases, including depression, next to the healthy state (14 conditions) from 22 states have been processed using 1.2 million CPU hours on the VEGA supercomputer (in preparation), thus providing another well-represented dataset amenable to ML exploration.

3.1.2 Physicochemical characteristics of microbial world in the gut

The peristaltic waves that create the contractile patterns of the small intestine create an environment that is constantly changing. The constant mixing of faecal material results in changes in environmental conditions for the microbes living in the gut, such as pH, which can affect microbial growth (Ehrlein and Schemann, 2005; Johnson et al., 2012; Cremer et al., 2016; Glover et al., 2016; Cremer et al., 2017; Sket et al., 2017a).
A number of studies have linked stool consistency, which characterises the microbial living environment, to the richness of the gut microbiota, its composition, enterotypes, elevated inflammatory levels, lipopolysaccharides, and bacterial growth rates (Tigchelaar et al., 2016; Vandeputte et al., 2016). Stool consistency has mostly been assessed with the BSS method (Heaton et al., 1992; Lewis and Heaton, 1997). Lower BSS scores were associated with longer colonic transit time, higher microbial richness, and protein catabolism (Roager et al., 2016). Alteration of the microbiota and the occurrence of local inflammation were previously found to be correlated with BSS. However, high intra- and inter-rater variance was observed in the assessment of the BSS (Derrien et al., 2010; Chumpitazi et al., 2016). The assessment of the BSS is based on self-assessment, which may be biased. Therefore, only a well-trained expert can draw medically important conclusions based on the BSS alone (Matsuda et al., 2021). In terms of material physics, faecal materials are semisolid materials (i.e., pastes) (Grillet et al., 2012), which places them between viscoelastic materials (semipermanent deformation in response to external forces) and plastic materials (permanent deformation). This way of thinking led us to evaluate the minimal pressure (MP) approach for a less biased and high-throughput evaluation of the consistency of faecal material (Deutsch and Stres, 2021). Minimal pressure, expressed as force per unit area, is the pressure required to cause permanent deformation of faecal material. We have shown that MP increases exponentially with decreasing BSS values, regardless of the sex of the individuals (Deutsch and Stres, 2021).
We demonstrated that there is a nonlinear (asymptotic) and complex relationship between dry matter and MP. Longitudinal mapping of the surface MP over the entire length of a single stool sample revealed various fine-grained, local internal differences. In addition, despite uniformly low BSS scores, our analysis showed that a more resistant stool surface layer was followed by softer internal structures, resulting in lower MP values associated with approximately healthy stool consistency (Deutsch and Stres, 2021). We found a boundary that may distinguish between the healthy state (MP < 75) and constipation (MP > 75) (Blake et al., 2016; Sket et al., 2017b; Sket et al., 2018). MP < 30 corresponded to aqueous stool samples. The MP approach introduces a continuous, measurable scale that overcomes the assessment errors around BSS 3 and 4, which are difficult to tell apart by visual inspection despite training and visual support in classification (Deutsch and Stres, 2021). MP was measured on the samples collected within the PlanHab (Šket et al., 2020) and PreTerm (Deutsch et al., 2022b) studies. Notably, past studies demonstrated that blockage of faecal surface pores and mucus retention were associated with selective pressure on the gut microbiome, its gene expression, and metabolic activity, leading to local inflammation (Vandeputte et al., 2016; Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Aron-Wisnewsky et al., 2019). Thus, we showed that the MP approach can accurately describe the clinical significance of stool consistency (Deutsch and Stres, 2021). In addition, the MP approach does not require pre-treatment of samples and allows easy measurement without expensive equipment, as well as reproducible measurements across different samples (fresh vs. frozen; male vs. female), with a simple correction for the temperature of measurement.
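The MP thresholds quoted above can be expressed as a simple decision rule. The following sketch is illustrative only: the function names are hypothetical, the units of force and area are assumed to match those in which the thresholds of 30 and 75 were derived, and assigning the boundary values 30 and 75 themselves to the middle class is an assumption, since the text does not specify them.

```python
def minimal_pressure(force, area):
    """Minimal pressure as force per unit area (units assumed consistent
    with the thresholds used in classify_stool)."""
    if area <= 0:
        raise ValueError("area must be positive")
    return force / area


def classify_stool(mp):
    """Map an MP value to the three consistency groups named in the text.
    Boundary handling (exactly 30 or 75) is an assumption of this sketch."""
    if mp < 30:
        return "aqueous"
    if mp > 75:
        return "constipation"
    return "healthy"


# Invented example values, for illustration only.
for mp in (12.0, 48.5, 110.0):
    print(mp, classify_stool(mp))
```

Because MP is continuous, the same measurement can also be used directly as a covariate rather than being reduced to three classes.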
We also found that MP correlates with faecal methionine and acetate based on 1H-NMR measurements. Based on these two metabolites, we can distinguish three different groups of faecal consistency (MP < 30, 30 < MP < 75, MP > 75). Methionine was previously associated with oxidative stress and was elevated in inactive individuals, while acetate correlated negatively with insulin sensitivity, indicating that different stool consistencies may have an impact on the biological system of the host. The observed differences in methionine and acetate associated with MP were thus apparently a consequence of inactivity coupled with a Western diet, based on the samples collected within the PlanHab project. The MP approach enabled us to show that, by combining the measurement of a few physicochemical parameters with 'omics methods, a completely new level of understanding of complex biological systems can be explored (Deutsch and Stres, 2021).

3.1.3 Metabolomics in the PlanHab study

The PlanHab study was the first study by our group to examine the metabolomics of human urine (Šket et al., 2020). A run-in period and three subsequent 21-day interventions (NBR, HBR, and HAmb) were performed in a crossover manner. Morning urine samples were collected throughout the experiment (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018). The unique crossover design allowed us to consider the responses of the same participants to all three experimental variants under controlled dietary, environmental, and experimental conditions. A total of 523 urine samples were collected and prepared for 1H-NMR measurements. Participants in the bed rest groups (NBR and HBR) had specific metabolic compositions compared with the HAmb group.
We concluded that the decision of the host to minimize physical activity under hypoxic conditions can be detected within a few days at the level of the urine metabolome measured by NMR. Under normoxic bed rest conditions, these metabolic changes became detectable within the first ten days. The metabolites identified in this study were associated with a number of different diseases: (i) chronic obstructive pulmonary disease (Adamko et al., 2015; Ząbek et al., 2015) and (ii) cardiovascular disease associated with tissue hypoxia, which can also lead to type 2 diabetes, depression, and osteoporosis (Wang et al., 2011a; Senn et al., 2012; Adamko et al., 2015; Ząbek et al., 2015). The PlanHab study of urine 1H-NMR metabolomes led us to conclude that there is no simple metabolic biomarker that could distinguish between different states (healthy vs. sick; active vs. inactive; active vs. sedentary). For instance, a metabolite could be up- or down-regulated depending on the metabolic pathway. Complex multivariate descriptions of metabolism were needed to capture commonalities in human physiology, interpersonal variability, and temporal variability. This concept was applied in all subsequent studies. Overall, inactivity alone or in combination with hypoxia resulted in decreased systemic metabolic diversity, an increased number of affected metabolic pathways, and more rapid metabolic deconditioning, leading to the development of negative physiological symptoms such as insulin resistance, low-level systemic inflammation, constipation, depression, and metabolic syndrome (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018).
The results of the PlanHab study encouraged us to continue our research using samples from other studies involving different levels of inactivity, such as X-Adapt (differences between trained and untrained individuals), spinal muscular atrophy, and the PreTerm project (Figure 14), which together cover different levels of physical activity, hypoxia exposure, and duration of exposure to these conditions (Šket et al., 2020).

Figure 14: Representation of studies involved in this work. Representation of the studies involved in this work in relation to physical activity and hypoxia exposure.
Slika 14: Prikaz študij udeleženih v tem delu. Prikaz študij udeleženih v tem delu glede na stopnjo fizične aktivnosti na eni strani in izpostavitve hipoksiji na drugi.

3.1.4 Spinal muscular atrophy

Spinal muscular atrophy (SMA) is a neuromuscular disease that manifests as progressive atrophy and weakening of skeletal muscle due to progressive loss of motor neurons and also affects a number of other organ systems (Melki, 2017; Yeo and Darras, 2020). With an incidence of 1 per 11,000 births, it is still considered the most common genetic cause of child deaths (Sugarman et al., 2012). In SMA patients, mutations in the centromeric SMN2 gene lead to the formation of unstable proteins and, at the same time, the expression of the telomeric SMN1 gene is impaired due to deletion (Lefebvre et al., 1995; Lorson and Androphy, 2000; Lunn and Wang, 2008; Smeriglio et al., 2020). In recent years, new therapies have been developed for the treatment of SMA. These therapies alter the natural course of the disease by changing the expression of, or replacing, the mutated genes involved in the development of SMA (Chiriboga et al., 2016).
Nusinersen was the first drug approved for SMA by the Food and Drug Administration in the United States and by the European Medicines Agency. Nusinersen is an antisense oligonucleotide that modifies the splicing of SMN2 mRNA, resulting in a functional SMN protein and thus better SMA outcomes (Chiriboga et al., 2016; Corey, 2017; Ramdas and Servais, 2020). It must be administered intrathecally because it cannot cross the blood-brain barrier (Faber et al., 2007; Rigo et al., 2012). Urine, liquor, and serum samples from SMA patients were collected before treatment and after the 4th application of nusinersen. Medical examination at the 4th application showed improvement in mobility. The application of nusinersen resulted in better movement, easier writing, sitting and standing, and an increase in strength, as measured by the Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders (CHOP INTEND; Glanzman et al., 2010), the Hammersmith Functional Motor Scale (HFMS; Pera et al., 2017), the Expanded Hammersmith Functional Motor Scale (HFMSE; Pera et al., 2017), or the Motor Function Measure (MFM; Bérard et al., 2005) tests. Patients showed improvement in wheelchair control, ambulation, fatigue, hygiene, speech, and sleep after the 4th application of nusinersen (Osredkar et al., 2021). In contrast to the physical examinations, the npMANOVA test did not reveal significant differences before and after treatment in any of the metabolic matrices (urine, liquor, serum), regardless of gender or data transformation (Deutsch et al., 2021a). In this context, we could not reject the null hypothesis from section 1.3.2 and Table 2, which states that there are no significant differences before and after treatment.
Perhaps these differences could be confirmed after 10 applications of nusinersen, but collecting those samples would have taken too long to complete within the timeframe of this doctoral thesis. These results show that the efficacy of nusinersen can be seen in the medical examinations and the assessment tests. Perhaps the use of other metabolomics methods such as mass spectrometry, which is more sensitive than NMR and can detect nanomolar concentrations, would lead to the detection of biomarkers for monitoring nusinersen treatment (Emwas et al., 2019). In addition, a series of urine samples were collected from a matched healthy cohort to compare the metabolomes of SMA patients with those of healthy individuals. This comparison revealed a significant metabolic difference between females and males (p = 0.0001), as well as between the healthy cohort and the SMA patients. The npMANOVA showed both gender (F = 54.9; p = 0.0001) and SMA status before and after treatment (F = 20.7; p = 0.0001) to be significant. Both methods, PLS-DA and Random Forest, showed significant differences between female and male metabolomes, and we also detected different metabolic diversity when comparing SMA patients to the comparable healthy cohort. A significant reduction in the cumulative concentration of metabolites was observed in SMA patients (p < 0.05). A reduction in the number of metabolites was also observed in healthy females compared to healthy males. To our knowledge, this was the first report describing the existence of such differences between males and females. Because of these differences, future studies of this kind should include a larger number of females to determine the important differences between female and male metabolic makeups and pathways.
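The npMANOVA results reported above come from a permutation test on a distance matrix. The sketch below shows the general logic for a one-factor design: a pseudo-F statistic computed from pairwise distances, with a p-value obtained by shuffling group labels. It is not the software used in the cited study. The Euclidean distance, the number of permutations, the function names, and the toy data are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pseudo_f(dist, labels):
    """One-factor pseudo-F computed from squared pairwise distances."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    if a < 2 or n <= a:
        raise ValueError("need at least two groups and more samples than groups")
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(samples, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    n = len(samples)
    dist = [[euclidean(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    f_obs = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)          # break the link between samples and groups
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Toy data: two invented groups of four samples with two features each.
samples = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.1), (1.1, 2.2),
           (5.0, 6.0), (5.1, 6.2), (4.9, 5.8), (5.2, 6.1)]
labels = ["a"] * 4 + ["b"] * 4

f_obs, p_value = permanova(samples, labels)
print(f"pseudo-F = {f_obs:.1f}, p = {p_value:.4f}")
```

With 9999 permutations the smallest attainable p-value is 1/10000 = 0.0001, which is one way a reported value of exactly p = 0.0001 can arise.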
There are some preliminary parallels with exercise studies, which show that metabolite counts may increase after exercise (Nieman et al., 2013; Schranner et al., 2020), and with bed rest studies (e.g., PlanHab), which showed a 30% reduction in metabolite counts after three weeks of bed rest (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018). Symptoms such as insulin resistance, bone and muscle loss, and changes in lipid metabolism were all detected in bed rest studies, and all of these symptoms are also on the list of conditions associated with SMA. We used urine metabolomes from SMA patients and healthy individuals to create a classification model to distinguish between these two conditions. For this purpose, the JADBIO machine learning platform was used (Tsamardinos et al., 2022), and logistic ridge regression, with an AUC of 0.958, was selected as the best model to distinguish SMA patients from healthy controls. Creatinine was the key metabolite separating healthy from SMA-affected participants. This was also reported a few months before our publication in another study, which monitored the progression of denervation in SMA and found elevated levels of creatinine in more severe forms of the disease (Alves et al., 2020). Creatinine concentrations did not change significantly in SMA patients between the samples taken before and after the 4th application of nusinersen. Increased creatinine levels were also observed in urine samples from our bed rest studies (PlanHab; Šket et al., 2020). The reintroduction of exercise completely reversed the adverse effects in these studies (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Šket et al., 2020).
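An AUC such as the 0.958 reported above can be read as the probability that a randomly chosen SMA sample receives a higher model score than a randomly chosen healthy sample. The sketch below computes this rank-based quantity directly from pairwise comparisons. The function name, scores, and labels are invented for illustration and are not taken from JADBIO or from the study data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly; ties count as half a correct pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented classifier scores: label 1 = patient, label 0 = healthy control.
scores = [0.90, 0.80, 0.35, 0.60, 0.20, 0.10]
labels = [1, 1, 1, 0, 0, 0]

print(round(roc_auc(scores, labels), 3))
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohort sizes typical of such studies; sorting-based formulations give the same value more efficiently.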
Immobilized patients receiving vibration therapies benefited compared with controls; because balancing induces involuntary muscle contractions, such therapies may in the future represent a potential step in the physical activation of SMA patients after nusinersen therapy (Hoff et al., 2015) (Figure 15).

Figure 15: Model representing results of inactivity. Model representing the general results of long-term inactivity due to illness or bed-rest studies. All levels of inactivity can result in systemic symptoms leading to noncommunicable diseases. Resuming or reintroducing physical activity can reduce these symptoms and lead to better treatment and health outcomes.
Slika 15: Model predstavlja rezultate neaktivnosti. Model predstavlja rezultate dolgotrajne neaktivnosti nastale zaradi bolezni ali študij ležanja. Ne glede na razlog, vse vrste neaktivnosti vodijo v pojav sistemskih simptomov, ki se kažejo kot kronične bolezni. Povečana fizična aktivnost lahko izboljša zdravje ali zdravljenje takih bolezni.

3.1.5 X-Adapt project – the influence of short-term training on inactive individuals

We investigated complete inactivity within the context of the SMA project. However, in the 21st century, it is becoming increasingly clear that physical inactivity, the consequence of a sedentary lifestyle and physically less challenging working conditions, is also a global problem that poses a risk for the development of chronic noncommunicable diseases and increased global mortality (Kelly et al., 2020b). It has been shown that minimizing sedentary time can reduce the risk of chronic diseases such as coronary heart disease, type 2 diabetes, and metabolic syndrome (Sallis et al., 2016).
The goal of the X-Adapt project was to examine the differences between physically active (trained) and inactive (untrained) individuals (Sotiridis et al., 2018; Sotiridis, 2019b; Sotiridis et al., 2019; Sotiridis et al., 2020). After pre-screening, the project enrolled 10 active and 10 matched inactive male participants in a 10-day training protocol, which consisted of daily training on a cycle ergometer at 50% of maximal pedalling power under normoxic and normobaric (~1000 hPa) conditions at an ambient temperature of 24 °C. Before the 10 days of training, all participants (active and inactive) underwent three days of testing under thermoneutral normoxic, thermoneutral hypoxic, and hot normoxic conditions. Study participants were classified as trained or untrained based on their maximal oxygen uptake (untrained: VO2max < 45 mL·kg⁻¹·min⁻¹; trained: VO2max > 55 mL·kg⁻¹·min⁻¹) (Jay et al., 2011; Montero and Lundby, 2017). The trained participants practiced their activities (running, swimming, cycling) several times per week, whereas the untrained participants were asked not to participate in organized sports but were allowed to be active through commuting (e.g., cycling to work). Urine was collected from all participants before the start of the study, at pre-testing, after 10 days of training, and after the study (Armstrong and Barker, 2011; Sotiridis et al., 2018; Sotiridis, 2019a; Sotiridis et al., 2019; Sotiridis et al., 2020; Deutsch et al., 2022a).
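The VO2max-based grouping described above can be sketched as a simple threshold rule. The two cut-offs come from the text; the function name, the handling of values between the cut-offs (returned as `None`, i.e. belonging to neither group), and the example values are assumptions made for illustration.

```python
from typing import Optional

# Cut-offs in mL·kg⁻¹·min⁻¹, as stated in the text.
UNTRAINED_BELOW = 45.0
TRAINED_ABOVE = 55.0


def classify_fitness(vo2max: float) -> Optional[str]:
    """Return 'untrained', 'trained', or None for the intermediate range.

    Treating 45-55 as 'neither group' is an assumption of this sketch.
    """
    if vo2max < UNTRAINED_BELOW:
        return "untrained"
    if vo2max > TRAINED_ABOVE:
        return "trained"
    return None


# Hypothetical screening values, not measurements from X-Adapt.
for value in (38.5, 50.0, 61.2):
    print(value, classify_fitness(value))
```

Leaving a gap between the two thresholds keeps the groups clearly separated, which is presumably why candidates in the intermediate range would not fit either definition.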
Physiological measurements showed some statistically significant and some nearly significant differences between trained and untrained subjects at pre-testing. When pre- and post-training values were compared, the differences approached but did not reach significance after only 10 days of training, suggesting that some characteristics may be observed in subjects leading an active lifestyle. The differences between the states before and after training were larger in the untrained group, and, based on VO2max before training and its change during the 10 days of training, the rate of adaptation to training was greater in untrained individuals. Based on physiological measurements, we observed that the untrained and trained groups converged in terms of the measured training parameters (Sotiridis et al., 2018; Sotiridis, 2019a; Sotiridis et al., 2019; Sotiridis et al., 2020; Deutsch et al., 2022a).

Based on urine metabolomics, no significant difference could be detected between urine samples collected before and after 10 days of training. However, differences were observed between the urine 1H-NMR metabolomes of trained and untrained participants. In addition, urine physicochemical properties (pH, total dissolved solids, salinity, and conductivity) also differed significantly between these two groups. For example, pH was decreased in untrained individuals, a condition previously associated with metabolic syndrome and chronic heart failure (Maalouf et al., 2007; Otaki et al., 2013; Kraut and Madias, 2016; Shimodaira et al., 2017). Using multivariate statistics and machine learning, cholate, tartrate, cadaverine, lysine, N6-acetyllysine, methanol, N-acetylglucosamine, butanone, and caprate were identified as the metabolites responsible for the differentiation between the trained and untrained groups.
All metabolites were previously observed in studies related to muscle damage, hormone receptor levels, recovery after resistance training, lower cardiovascular risk (tartrate) (Abramowicz and Galloway, 2005; Spiering et al., 2008), or atrophic state in myotubes and obesity (cholate) (Li et al., 2020; Abrigo et al., 2021; Alamoudi et al., 2021; Mercer et al., 2021; Pushpass et al., 2021; Zheng et al., 2021). Cholate, a primary bile acid previously associated with the development of cancer, was enriched in the untrained group. Incidentally, increased concentrations of primary bile acids in the bloodstream were observed in less fit women, and a single training run may decrease the amount of these compounds (Danese et al., 2017; Maurer et al., 2020). Lysine and its decarboxylation product, the polyamine cadaverine, were previously associated with metabolic syndrome and colon or liver cancer (cadaverine was elevated in the untrained group). Lysine is involved in aminoacyl-tRNA biosynthesis, a metabolic pathway enriched in the trained group and previously correlated with higher physical activity, which may be due to changes in protein synthesis in active subjects (Robinson et al., 2017; Castro et al., 2019; Tabone et al., 2021; Tian et al., 2021). 2-Hydroxy-3-methylvalerate was increased in untrained participants, which may affect energy metabolism via PPAR-α, as previously shown in older, functionally impaired adults (Coen et al., 2013; Lustgarten et al., 2014). Using this approach, we showed that the entire system in active subjects was significantly different from that of inactive subjects (p=0.003). After 10 days of training, at the end of the campaign, this difference was no longer significant (p=0.226) (Figure 16).
It became clear that minor metabolomic differences existed between the metabolomes of trained and untrained subjects, who remained physiologically completely different with respect to their physical capabilities. Therefore, lifelong training would be required to maintain a healthy metabolome phenotype. Our study showed that an exercise load 5 times higher than the 75–150 minutes per week recommended by WHO is effective (Sallis et al., 2016; Kelly et al., 2020b). In addition, this experiment showed that morning urine samples collected on 3 consecutive days provide a good biological matrix for discriminating active from inactive individuals, a distinction that cannot be observed with 1-day sampling because of diurnal variability. Systemic homeostasis depends on a number of different parameters and involves communication between different organs, through which metabolic pathways affected by a metabolite in one organ can affect other metabolic pathways in another organ. A sedentary lifestyle can disrupt this communication between organs, leading to the manifestation of various diseases. Higher levels of exercise can shift interorgan communication in physically inactive individuals towards that of healthy and active individuals (Di Liegro et al., 2019; Deutsch et al., 2022a).

Figure 16: Change between trained (T) and untrained (UT) participants of X-Adapt study. The entire system in active subjects was significantly different from that of inactive subjects (p=0.003) before the X-Adapt study. After 10 days of training, at the end of the campaign, this difference was no longer significant (p=0.226).

3.1.6 Metabolomes and microbial metagenomes can distinguish preterm and full-term born adults

Preterm birth is defined as birth before 37 weeks of gestation; approximately 10% of births worldwide are preterm, and preterm birth is still one of the leading causes of death in children under 5 years of age. Preterm birth increases the risk of developing various chronic diseases such as cardiovascular, endocrine/metabolic, renal, neurological, and psychiatric disorders. One of the main causes of these disorders is increased oxidative stress in the first weeks of life (Moutquin, 2003; Magalhães et al., 2004; Pialoux et al., 2009; Blencowe et al., 2012; Lushchak, 2014; Liu et al., 2015; Manley et al., 2015; Debevec et al., 2017; Crump, 2020; Tingleff et al., 2021). There is a high probability that some clinical parameters such as body fat mass, arterial blood pressure, fasting glucose, and cholesterol may be elevated (Kerkhof et al., 2012; Markopoulou et al., 2019; Crump, 2020). Various studies showed that all of these characteristics differ between preterm and full-term born adults, that the differences are particularly related to the production of reactive oxygen species, and that they can be observed in association with different levels of exercise or physical activity (Magalhães et al., 2004; Powers et al., 2011; Filippone et al., 2012; Debevec et al., 2017; Martin et al., 2018). The aim of the PreTerm project was to investigate whether differences in a blunted hypoxic ventilatory response (HVR) exist in physically fit young men (born preterm and full-term) under hypoxic and normoxic environmental conditions at rest and during physical activity (Debevec et al., 2019; Debevec et al., 2022).
In addition, a high-throughput analytical approach consisting of urine and faecal metabolomics and faecal metagenomics was used to describe the complexity of the human body and gut microbiome in its response to increased oxidative stress levels at rest and during exercise in normoxia and hypoxia (Martin et al., 2020). A total of 37 men were enrolled in this study (15 born full-term and 22 born preterm). Incremental cycling in normoxia and hypoxia was shown to increase levels of oxidative proteins, catalase, superoxide dismutase, and nitrosative markers in both groups immediately after exercise (Martin et al., 2020). Participants in the preterm group showed lower exercise capacity in normoxia compared with the full-term group and had lower HVR, whereas no such difference was observed in hypoxia (Vrijlandt et al., 2006; Lovering et al., 2013; Svedenkrans et al., 2013; Bates et al., 2014; Clemm et al., 2014; Farrell et al., 2015; Debevec et al., 2019). These results indicate that preterm-born adults may have increased oxidative stress during acute exercise in normoxia, whereas such a response was not observed in hypoxia (Martin et al., 2020).

We measured 25 physicochemical variables in the stool samples (including the MP approach described above), and no significant differences were found between the preterm and full-term groups. These results indicate that there were no differences in gut environment parameters between preterm and full-term participants, regardless of environment (hypoxia vs. normoxia). Faecal and urine samples were collected three days before and three days after the hypoxic and normoxic tests. Multivariate statistics based on 1- and 2-way PERMANOVA showed that there were significant differences between preterm and full-term participants based on the faecal and urine metabolomes, but not between pre-test and post-test in normoxia and hypoxia (Deutsch et al., 2022b).
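As an illustration of the kind of test referred to above, the following is a minimal sketch of a one-way PERMANOVA (pseudo-F statistic with a label-permutation p-value). It is not the analysis pipeline used in the study: the Euclidean distance, the number of permutations, the function names, and the toy two-dimensional "metabolite profiles" are all assumptions, and the two-way design is not covered.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pseudo_f(dist, labels):
    """Pseudo-F computed from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for k, i in enumerate(idx)
                         for j in idx[k + 1:]) / len(idx)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(samples, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    dist = [[euclidean(s, t) for t in samples] for s in samples]
    f_obs = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between profile and group
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Toy profiles (invented, not PreTerm data): two clearly separated groups.
samples = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1), (1.3, 2.0),
           (3.0, 0.5), (3.2, 0.4), (2.9, 0.7), (3.1, 0.6), (2.8, 0.5)]
labels = ["preterm"] * 5 + ["fullterm"] * 5

f_obs, p_value = permanova(samples, labels)
print(f_obs, p_value)
```

Because the p-value comes from shuffling group labels rather than from a distributional assumption, the approach suits metabolomic profiles, which are rarely multivariate normal.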
According to the MetaboAnalyst results, acetone, tartrate, and trans-aconitate were decreased in the preterm group. These metabolites are associated with exercise, fasting, or diabetes mellitus (Paradis et al., 2015; Crump et al., 2019; Perrone et al., 2021). Based on the urinary metabolome, the most interesting enriched metabolic pathway (D-arginine and D-ornithine metabolism) was described previously in association with systemic or tissue hypoxia (Qiu et al., 2017; Haraldsdottir et al., 2019). The differences appear to be due to impaired autonomic function because heart rate recovers more slowly in preterm adults, which could lead to anoxia and increase their cardiovascular risk, as previously suggested (Sonntag et al., 2007; Ten, 2017).

Faecal metabolomes also differed between preterm and full-term participants. Lactate, serotonin, and tyrosine were the major metabolites that accounted for the difference between the preterm and full-term groups. The first two were increased in the preterm group, which, together with the enriched metabolic pathway (Warburg effect), shows that some metabolic changes can be observed in preterm-born adults. The Warburg effect was described previously in preterm infants and associated with mitochondrial dysfunction (McIver et al., 2018). These findings may represent the first evidence that systemic differences due to lifelong exposure to oxidative stress do indeed exist. They raise the question of whether these differences are associated with minute differences generated on the part of the preterm host, on the part of the microbiome responding to these environmental signals, or with their mutual interaction in the form of a complex biochemical network in the steady state (Deutsch et al., 2022b).
The collected faecal samples were used for shotgun sequencing to investigate whether the observed differences in faecal metabolomes exist at the microbial level. No significant differences were observed at the taxonomic level, but the relative abundances of archaea and viruses were higher in the preterm group and deserve further, more detailed inspection using larger sample collections. The calculation of Shannon and other diversity indices showed that microbial diversity was higher in the preterm group. In the previous decade, it became clear that the more important question in the study of the microbiome is what the microbes in our gut are doing. For this reason, HUMAnN3 and MetaPhlAn3 were used to determine which gene families, enzymatic reactions, metabolic pathways, and predicted metabolites can be used to distinguish between preterm and full-term born adults. Machine learning with JADBio was used to build classification models for this purpose (Deutsch et al., 2022b). No significant differences were detected based on gene families, but we did detect some differences based on enzymatic reactions, metabolic pathways, and predicted metabolites.

The previously described RXN-15378 enzymatic reaction of succinate dehydrogenase was increased in the preterm group. Succinate itself is a microbial metabolite and can accumulate in the intestinal tract during inflammation or microbial imbalances. It has tissue-specific but also pro-inflammatory properties and is also a source of propionate production by Bacteroides spp. and Prevotella spp. Succinate was shown to accumulate in cells under low-oxygen conditions and represents the metabolic signature of hypoxia.
Excessive uptake of microbially produced succinate was shown to lead to higher levels of intracellular succinate, which slowed down prolyl hydroxylase activity through product inhibition and led to additional activation and stabilization of HIF-1α beyond the response to hypoxia itself, which significantly enhanced LPS-induced expression of proinflammatory cytokines in human cells (Rubic et al., 2008; Ariza et al., 2012; Tannahill et al., 2013; Akram, 2014; Littlewood-Evans et al., 2016; Connors et al., 2018).

PWY-7456 (β-(1,4)-mannan degradation), PWY-7323 (superpathway of GDP-mannose-derived O-antigen building blocks biosynthesis), GLYCOLYSIS-TCA-GLYOX-BYPASS (superpathway of glycolysis, pyruvate dehydrogenase, TCA, and glyoxylate bypass), P221-PWY (octane oxidation), and PWY-5173 (unclassified) were the pathways that were increased in the preterm group. Some of them may be beneficial and support mucosal integrity and host nutrition (β-(1,4)-mannan degradation) or significantly increase energy production, which would be important in the case of oxidative stress as in preterm individuals (superpathway of glycolysis, pyruvate dehydrogenase, TCA, and glyoxylate bypass). Acetyl-CoA biosynthesis may also lead to increased production of butyrate. In contrast, some pathways that were also increased in the preterm group have a more negative effect. These pathways were shown to be involved in lipopolysaccharide (LPS) production (GDP-mannose-derived O-antigen building blocks biosynthesis), which is associated with Gram-negative bacteria and is a causative agent of different degrees of inflammation (Samuel and Reeves, 2003; Wolfs et al., 2010; Shah et al., 2015; Kim et al., 2016; La Rosa et al., 2019; Lindstad et al., 2021).
Octane oxidation, previously described in the context of westernization of the human gut and associated with liver disease, was also observed (Deutsch et al., 2022b). All these differences can be associated with the physiologically significant deficits observed between both groups (Martin et al., 2018; Martin et al., 2020).

Seventeen predicted metabolites were also detected by the 1H-NMR approach, none of which were considered important for differentiation in the machine learning. Significant differences were detected in the urine and faecal metabolomes in addition to predicted metabolites, suggesting that systemic differences between the two groups exist. Elevated metabolites were previously associated with cardiovascular disease (carnitine), increased intestinal permeability, elevated levels of inflammatory cytokines, metabolic syndrome, or cancer growth (putrescine and diacetylspermine). In contrast, some predicted metabolites were decreased in the preterm group. Deoxycholate is a secondary bile acid and a known promoter of colon cancer. Its decreased levels may be explained by the increased urinary excretion of cholate observed in urine metabolomics. Hydrocinnamic acid was observed to a lesser extent in the preterm group. Given the physiological differences between the two groups examined in this study, it seems plausible that there were also differences in the extent of polyphenol utilization in the preterm group. The lower content of reducing sugars (fructose, glucose, and galactose) in the preterm group corresponded with a greater capacity to form short-chain fatty acids (Fukiya et al., 2009; Wang et al., 2011b; Koeth et al., 2013; Tang et al., 2013; Ussher et al., 2013; Staley et al., 2017; Heinken et al., 2019; Wirbel et al., 2019). In addition, de novo MAGs were assembled from the same sequences using our MAGO tool (see above).
No significant differences were found at the level of MAGs, which corresponds to the result at the level of taxonomic data obtained with MetaPhlAn. This is consistent with our observation that there are no significant taxonomic differences between the microbiota of the preterm and the control groups. By introducing a controlled diet, a controlled water intake, and a controlled circadian rhythm as previously described (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Šket et al., 2020), it should be possible in future experiments with sufficient sample size to first establish the existence of significant differences in the microbiome (or the lack thereof) and then focus on the assembly of MAGs. The metabolic responses and predicted metabolites indicated that the microbiome of the preterm group has greater metabolic flux compared with the full-term group, suggesting the existence of minor, as yet unmeasured, but apparently significant environmental differences in the preterm gut relative to controls.

With the results described above (Figure 17), we can confirm two alternative hypotheses from section 1.4.1 and table 2. The first confirmed hypothesis states that there are significant differences between the preterm and full-term groups of participants in faecal and urine metabolomes that can be linked to their physical performance in experiments and physiological data at exercise and rest. The second hypothesis states that there are significant differences at the level of the metagenomic makeup of both groups, giving rise to the identification of specific metabolic pathways differing between the two groups and their gut environment characteristics.
At the level of taxonomic descriptions, we could not reject the null hypothesis of no difference between the groups, as no significant differences were observed.

Figure 17: A summary of observed changes in PreTerm study. A summary of observed changes at various information levels showing that significant differences exist between the preterm and full-term adult urine metabolomes, faecal metabolomes, and microbial metabolic reactions and pathways. Taken together, these results show that the host and its microbiome behave measurably differently in healthy, physically fit preterm-born young males in comparison to matched full-term controls.

3.1.7 Data integration

We summarized more than 1200 collected samples in the creation of the Slovenian urine 1H-NMR database. Metabolomics data from all projects (PlanHab, spinal muscular atrophy, X-Adapt, PreTerm, healthy women and men) were integrated. All measured spectra were analysed with the same procedure to obtain the same metabolites in all projects. We showed that, at this level of physiological data characteristics, distinction between the different activity levels is possible based on the metabolites in urine. All samples were processed in the same way and can be reprocessed with future database updates using our in-house processing pipeline (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Šket et al., 2020; Deutsch et al.,
2021a; Deutsch et al., 2021b; Deutsch et al., 2022a; Deutsch et al., 2022b) alongside commercially available software for targeted 1H-NMR spectral deconvolution. For instance, the same spectra can be rerun with future updates of the Human Metabolome Database (HMDB), which grew from a few thousand metabolites in the first edition (Wishart et al., 2007) to 217,000 metabolites in the latest edition in 2021 (Wishart et al., 2021). The standardized analytical protocols established in our laboratory minimized the systematic errors that usually occur due to batch effects or contributions by various NMR experts. The Box-Cox normalization and the sPLS-DA approach used to integrate all metabolomes in our study showed competitive performance in removing batch effects while still preserving variation due to lifestyle or other biological reasons (Wang and Lê Cao, 2020). This approach also allowed us to partly confirm the alternative hypothesis from section 1.4.3 and table 2, namely that significant differences in urinary metabolomes allow the identification of biomarker pools and metabolic pathways that delineate the different groups under study. The identification of biomarker pools should be confirmed on a larger dataset. We showed that urinary metabolic fingerprinting has the potential to provide a snapshot of metabolic status relevant and related to health and activity status (Azad and Shulaev, 2019; Mussap et al., 2021). In general, metabolomics involves the systematic identification of metabolites in the human body (Ashrafian et al., 2021). The development of a national database should improve the understanding of the Slovenian metabolome in comparison to studies from other European countries and the identification of metabolites specific to various diseases or physical conditions. With an enlarged database, we avoid the problems with small sample sizes observed in the individual studies described above.
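For readers unfamiliar with the Box-Cox transformation mentioned above, the following minimal sketch shows the standard one-parameter form: (x^λ − 1)/λ for λ ≠ 0 and ln(x) for λ = 0. The function name, the choice of λ, and the example concentrations are assumptions for illustration; the text does not state which λ or which software implementation was used in the integration.

```python
import math


def box_cox(values, lam):
    """One-parameter Box-Cox transform of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        # Limit of (x**lam - 1) / lam as lam approaches 0.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Hypothetical right-skewed metabolite concentrations (not thesis data).
concentrations = [1.0, 4.0, 9.0, 100.0]

print(box_cox(concentrations, 0.5))  # square-root-like compression
print(box_cox(concentrations, 0))    # natural-log transform
```

Smaller values of λ compress large concentrations more strongly, which is why the transform is commonly applied before multivariate methods that are sensitive to skewed variables.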
We would need cohorts at least two orders of magnitude larger to confirm the final results of these studies. 1H-NMR metabolomics has the potential to capture a wide range of conventional clinical variables in epidemiological studies, including missing variables for patient metadata, and makes it possible to generate predictors of discrimination between different diseases based on machine learning. Top-down interpretation of metabolomic datasets, particularly from urine, which can be collected noninvasively, can provide sufficient data to draw conclusions about how samples should be classified into different groups. We hope to generate interest among other researchers in incorporating NMR metabolomics into their research, so that our established database can be expanded to approximately 10,000 samples on a national scale. Modelling such a data collection represents a unique avenue for creating ML models that can be used in medical practice, at least tentatively, to distinguish between healthy and unhealthy metabolic states as well as between different diseases. Thus, this approach represents a step closer to data-driven precision medicine that has the potential to inform health on a national scale. The publication of the Slovenian urine NMR database is in preparation.

3.1.8 What about the future?

The beauty of ‘omics research is that it generates thousands of different variables in different data matrices. All of the datasets obtained in this work were analysed in different ways, but there is always room for improvement in the methods used, whether on the wet lab side or on the computational side. For instance, efforts are being directed towards the inclusion of ongoing projects focusing on urine metabolomics in the Slovenian urine 1H-NMR database.
In line with these efforts, (i) a total of 320 samples from the PreAlti project (an extension of the PreTerm project) were collected and measured, (ii) sample collection for the extension of the SMA project is ongoing, (iii) samples are also being collected from two clinical cohorts from the University Clinical Centre of Ljubljana, including the Children’s Hospital (tics, anorexia), and (iv) clinical cohorts associated with the Million Microbiomes from Humans Project are aiming to collect more than 1000 faecal and urine samples for metagenomics and metabolomics analyses. All these projects are on the way to generating thousands of gigabytes of molecular data accompanied by participants’ metadata, in accordance with GDPR and ethical considerations as governed by the Ethics Commission of the Republic of Slovenia, in ongoing efforts to improve the understanding of the Slovenian microbiome, metabolome, and physiology by creating better and more appropriate models and networks that will be characteristic of different diseases and/or physical conditions.

Maintaining systemic homeostasis and responding to nutritional and environmental challenges requires the coordination of a variety of organs and tissues. To respond to diverse metabolic demands, the human body integrates a system of interorgan communication through which one tissue can influence metabolic pathways in a distant tissue. Dysregulation of these communication pathways through lack of exercise (sedentary lifestyle) and high-energy diets contributes to diseases such as obesity, diabetes, liver disease, and atherosclerosis. For timely interventions, we should think about using body fluids (such as urine) that allow for non-invasive sampling but are sensitive enough to differentiate between a range of biomarkers (Figure 18).
The procedures established in this study (quality control of incoming datasets, pre-processing of raw sequencing or metabolomics data files into organized data matrices, handling of missing values, standardization and normalization, and batch correction), coupled with data integration approaches, will make it possible to synchronize metagenomics, metabolomics, and metadata for the same participants in the future, integrating the information about different states in the complexity of the human body. This enables a better understanding of inter-organ communication, which acts as a gatekeeper for metabolic health, as multidirectional interactions between metabolic organs and the central nervous system mediate crosstalk between the gut, brain, and other peripheral metabolic organs to maintain energy homeostasis. It also enables the search for new therapeutic strategies and promotes a healthy lifestyle to counteract metabolic disorders and other diseases.

Figure 18: The continuation of the projects described in this work.

3.9 CONCLUSIONS

• The GUMPP and MAGO tools were developed, and the metaBakery tool is under development for high-throughput amplicon and shotgun sequencing analysis. All tools are available as Singularity image containers prepared for deployment on the HPC cluster and can be further developed by all users. All tools were developed under the open-source license CC-BY 4.0.
• By utilizing the newly developed MP approach on faecal samples, we showed that measuring some physicochemical parameters and using ‘omics methods can lead to a completely new understanding of complex biological systems. The MP approach was a less-biased and fine-scale approach to measure faecal hardness compared to the previously used BSS approach, including the interior of samples.
• Participants of the bed rest groups (NBR and HBR) from the PlanHab study had a specific metabolic composition compared with the HAmb group. We concluded that the host decision to minimize physical activity under hypoxic conditions can be detected within a few days at the level of the urine metabolome measured by NMR.
• When urine, serum, and liquor samples from SMA patients were compared before and after the 4th application of the drug nusinersen, no differences were observed. However, urine creatinine was observed as a possible biomarker to distinguish healthy individuals from SMA patients.
• The SMA study allowed us to observe some differences between healthy male and female urine metabolomes, which shows the importance of including women in biomedical and physiological studies.
• Urinary metabolomes of untrained individuals differed from metabolomes collected from trained participants in the X-Adapt study. After 10 days of training, these differences disappeared, demonstrating the importance of physical activity for humans.
• It was shown that urine collection on 3 consecutive days can enable a better understanding of morning metabolomes representing a systemic description of the state of the human body.
• Urinary and faecal metabolomes of preterm and full-term born individuals of the PreTerm study were different. Microbial functionality observed by shotgun sequencing of stool samples was also different in the two groups. However, no significant taxonomic differences could be observed due to unequal variance at this information level.
• The integration of urine metabolomes from five different projects enabled the creation of a Slovenian NMR database that can in future be extended with more samples from different specimens and used to build classification models that discriminate between different diseases or activity levels.

4 SUMMARY (POVZETEK)

4.1 SUMMARY

Modern humans are increasingly threatened by a daily sedentary lifestyle and by serious diseases. Short-term inactivity leads to maladaptations in body physiology, gut microbiota, and metabolic profiles, resulting in increased inflammation, depression, insulin resistance similar to that of metabolic syndrome, and symptoms of type 2 diabetes. However, the effects of long-term physical inactivity and of a lack of oxygenation and large-muscle signalling are not well understood, although they have direct and widespread biomedical significance for preterm birth and/or genetic disorders, such as SMA, obesity, cardiovascular deconditioning, and chronic obstructive pulmonary disease.

To address these issues, three projects analysed a variety of samples: i) physiological responses in adulthood as a consequence of preterm birth (PreTerm project; ARRS J3-7536; EU project https://recap-preterm.eu/); ii) spinal muscular atrophy (project within the University Clinical Centre of Ljubljana) as an extreme case of physical inactivity; and iii) cross-adaptation between heat and hypoxia: a novel strategy for performance and work-ability enhancement in various environments (X-Adapt; research project ARRS J5-9350).
The SMA and PreTerm projects addressed lifelong exposure to the systemic effects of reduced physical activity: i) intermittent episodes of systemic hypoxia at rest or during sleep (PreTerm) and ii) continuous systemic hypoxia due to reduced host physical activity, and relief of hypoxia after therapy (SMA). The X-Adapt project addressed the impact of a regular 10-day training regime on the physiology of healthy trained and untrained individuals. In addition, little is known about differences in the human-gut microbiome relationship caused by lifelong exposure to hypoxic episodes in preterm versus full-term born adolescents (the PreTerm project), which could affect the functionality and metabolism of the microbiome in these hosts.

To gain a better understanding, especially of the microbiome, our group developed tools for high-throughput analysis of large datasets. The GUMPP workflow was developed for amplicon sequence analysis at three levels: (i) genus, (ii) OTU, or (iii) ASV. It combines the most commonly cited tools for amplicon sequence analysis (Mothur) and prediction of microbial functionality (PICRUSt2 and Piphillin). The metaBakery workflow is designed for shotgun sequence analysis and builds on the BioBakery tools: MetaPhlAn (taxonomic analysis), HUMAnN 3 (analysis of functional genes, enzymatic reactions, and metabolic pathways) and MelonnPan (prediction of microbial metabolites). The metaBakery manuscript is currently in preparation. The third tool, MAGO, uses advanced methods for microbiome analysis and combines quality control tools (FastQC, fastp), assemblers (IDBA-UD, metaSPAdes, MEGAHIT) and binners (MaxBin, MetaBAT, CONCOCT, BinSanity and DAS Tool). CheckM was integrated throughout the pipeline to select assembled MAGs based on completeness and contamination according to the MIMAG standard.
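The completeness and contamination screening described above can be illustrated with a short sketch. The cut-offs below are the draft-quality tiers of the MIMAG standard as commonly quoted (high: > 90% completeness and < 5% contamination; medium: ≥ 50% and < 10%; low: < 50% and < 10%). The additional rRNA and tRNA requirements for high-quality drafts are left out, and the names MagStats, mimag_tier and select_mags are hypothetical. Whether MAGO applies exactly these thresholds is not stated here, so this is an assumption rather than a description of the tool.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class MagStats:
    """Hypothetical container for the per-bin values a CheckM-style report provides."""
    name: str
    completeness: float   # percent
    contamination: float  # percent


def mimag_tier(mag: MagStats) -> str:
    """Assign a MIMAG-style draft tier from completeness and contamination only.

    Assumption: the rRNA/tRNA criteria of the high-quality tier are ignored.
    """
    if mag.completeness > 90.0 and mag.contamination < 5.0:
        return "high"
    if mag.completeness >= 50.0 and mag.contamination < 10.0:
        return "medium"
    if mag.contamination < 10.0:
        return "low"
    # Contamination of 10 % or more falls outside all three tiers.
    return "fail"


def select_mags(mags: Iterable[MagStats],
                keep: Tuple[str, ...] = ("high", "medium")) -> List[MagStats]:
    """Return the bins whose tier is in `keep`, preserving input order."""
    return [m for m in mags if mimag_tier(m) in keep]
```

In practice the completeness and contamination values would be read from the quality report of each binning run; any bin names and numbers passed to select_mags for a trial run are illustrative only.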
All tools were prepared as a skeleton framework consisting of more than 10,000 lines of code written in Python and packaged as Singularity images ready for use on HPC clusters. All tools were developed under the CC-BY 4.0 license and are released for further development by other researchers.

In the microbial world of the human gut, physicochemical parameters are important for both microbial and human homeostasis. The BSS was previously used to assess gut health by visual inspection, which is prone to personal bias. In this work, we developed a new method for high-throughput assessment of faecal consistency, minimal pressure (MP), expressed as the force per unit area required to cause permanent deformation of faeces. MP correlated with BSS but provides an assessment on a continuous scale. MP also correlated with faecal methionine and acetate, whose levels differed across MP values. Both metabolites were previously associated with a Western diet and with inactivity, such as a sedentary lifestyle. MP thus introduces a new approach to measuring physicochemical parameters which, together with ‘omics methods, provides another level of understanding of the microbial world in the human gut.

The PlanHab study was the first study by our group to investigate the problems of inactivity and hypoxia from the perspective of 1H-NMR metabolomics. It was a crossover study with three different 21-day experiments: (i) hypoxic bed rest, (ii) normoxic bed rest, and (iii) hypoxic ambulation. In both bed rest experiments, metabolic changes were detectable in morning urine.
The identified metabolites were previously associated with various chronic diseases (chronic obstructive pulmonary disease, cardiovascular disease, etc.). Overall, inactivity alone or in combination with hypoxia decreased systemic metabolic diversity, increased the number of affected metabolic pathways, and accelerated metabolic deconditioning, leading to the development of negative physiological symptoms associated with these chronic diseases.

The results of the PlanHab project allowed us to join the spinal muscular atrophy project. In this project, we analysed the metabolomes of SMA patients in three sample types (serum, cerebrospinal fluid (liquor), and urine) before treatment and after the fourth application of nusinersen, the first drug approved by the EMA and FDA for the treatment of SMA. We found no significant differences between the metabolomes. In parallel, we collected urine samples from healthy Slovenian individuals matched to the SMA patients in age and sex. Using machine-learning methods, we identified urine creatinine as a potential biomarker for the diagnosis of SMA.

The SMA project studied complete disease-related inactivity. The X-Adapt project allowed us to understand the impact of a 10-day exercise regimen on the metabolome of trained and untrained participants. It has been shown previously that even minimal activity can reduce the likelihood of metabolic syndrome caused by a sedentary lifestyle. Participants were tested before and after the 10 days of training. Urine samples were collected at four time points, each over three days to reduce day-to-day variation. Briefly, some metabolites were found to be important in discriminating between trained and untrained subjects, but the significant differences disappeared after 10 days of training, when trained and untrained subjects became more metabolically synchronised.
In general, we showed that there is little difference between the two groups and that a lifelong active lifestyle is necessary to maintain a healthy metabolome.

In the PreTerm project, adults born preterm and adults born full-term took part so that differences in the metabolome (faecal and urine) and the microbial metagenome could be observed under hypoxic or normoxic conditions (cycling on an ergometer). No significant differences were observed in the 25 physicochemical parameters measured in faeces (including MP). In contrast, some metabolic differences were observed in faecal and urine samples, some of which were previously associated with the development of noncommunicable diseases, particularly in preterm born adults. Shotgun sequencing of the faecal samples was also performed. Based on analysis of sequences and de novo MAGs, the taxonomic composition of the preterm and full-term groups did not differ significantly, but microbial functions did, once again demonstrating the importance of studying microbial functionality. Metabolic responses and predicted metabolites indicated that the microbiome of the preterm group had greater metabolic flux than that of the full-term group, suggesting minor, previously unmeasured, but apparently significant environmental differences in the preterm gut compared with controls.

The final step was data integration. More than 1200 metabolomes from all projects (PlanHab, X-Adapt, SMA, PreTerm and the healthy comparison group) were integrated with the mixOmics package. We showed that urine NMR metabolomes could in future be used to differentiate between groups (diseased vs healthy, active vs inactive).
Top-down interpretation of metabolomic datasets, especially from urine, which can be collected noninvasively, may provide sufficient data to draw conclusions about how samples should be classified into different groups. We hope to stimulate the interest of other researchers in incorporating NMR metabolomics into their research, in order to expand our established database to approximately 10,000 samples on a national scale. The manuscript describing the Slovenian NMR database is currently in preparation. In addition, the expansion of our NMR database continues: 320 samples from the Prealti project (continuation of the PreTerm project) have already been collected and measured, the SMA project was extended and sample collection continues, two additional clinical cohorts are being collected (tics, anorexia), and more than 1000 faecal and urine samples will be collected as part of the Million Microbiomes from Humans project. All of these projects are on track to generate thousands of gigabytes of molecular data accompanied by participant metadata. This is being done in compliance with the General Data Protection Regulation (GDPR) and the ethical considerations defined by the Ethics Committee of the Republic of Slovenia, to improve the understanding of the Slovenian microbiome, metabolome, and human physiological states.

To respond to diverse metabolic demands, the human body integrates a system of interorgan communication through which one tissue can influence metabolic pathways in a distant tissue. Dysregulation of these communication pathways through lack of exercise (sedentary lifestyle) and high-energy diets contributes to human diseases such as obesity, diabetes, liver disease, and atherosclerosis. For timely interventions, body fluids such as urine are a logical choice: they allow non-invasive sampling yet are sensitive enough to resolve a range of biomarkers.

4.2 POVZETEK (EXTENDED SUMMARY)

The human digestive tract is home to 10^13 microbial cells that produce, transform and consume thousands of chemical compounds, which affect microbial composition and human health. Sequencing technologies (metagenomics) and other ‘omics methods (metabolomics, proteomics, lipidomics), together with modern biostatistical and machine-learning methods, have given us a deeper understanding of, and new insights into, the complexity and causality between the microbiota and its host in studies of diseased and healthy cohorts over time. An important link between the two is the state of the metabolic environment, which reflects the interplay between host physiology and the microbiome (Schmidt, 2021).

In previous studies within the PlanHab project we investigated the consequences of reduced physical activity and reduced exercise in the (human) host (Debevec et al., 2014; Sket et al., 2017a; Sket et al., 2017b; Sket, 2018; Sket et al., 2018). Short-term inactivity caused maladaptations in body physiology, the gut microbiota and metabolomic profiles, resulting in increased systemic inflammation, depression and insulin resistance, phenomena resembling the onset of metabolic syndrome and type 2 diabetes. By contrast, the effects of long-term physical inactivity and of a lack of oxygen and large-muscle signalling, as seen in the consequences of preterm birth and/or genetic disorders such as spinal muscular atrophy (SMA), obesity, heart failure and chronic obstructive pulmonary disease, are not well understood despite their direct and considerable biomedical importance.
To investigate this problem, we collected a diverse range of samples within three controlled and carefully managed projects: i) physiological responses in adulthood as a consequence of preterm birth (PreTerm project; ARRS J3-7536; EU project https://recap-preterm.eu/); ii) spinal muscular atrophy (SMA, University Clinical Centre Ljubljana); and iii) cross-adaptation between heat and hypoxia: a novel strategy for performance and work-ability enhancement in various environments (X-Adapt; ARRS project J5-9350). All projects address lifelong exposure to the systemic effects of reduced physical activity: i) intermittent episodes of systemic hypoxia at rest or during sleep (PreTerm); ii) continuous systemic hypoxia due to reduced host physical activity caused by a genetic defect, and relief of hypoxia after genetic therapy; or iii) a comparison of trained and untrained healthy young men. We biochemically characterised the body fluids collected in all projects and used them to investigate biochemical composition (metabolites) and interactions (metabolic pathways).

Microbial species play an important role in diverse environments characterised by a wide spectrum of organismal complexity (Murovec et al., 2019). Microbes living in the gut interact constantly with the host, and multidirectionally with their microbial relatives, by producing various molecules that can either improve the host's health or lead to the development or progression of non-communicable (chronic) disease (Murovec et al., 2020). Disease progression can range from mild gastrointestinal symptoms to serious conditions such as inflammatory bowel disease, colorectal cancer or liver cancer. Many diseases have been linked to metabolic imbalances that are partly or wholly associated with the gut microbiome, from metabolic syndrome and obesity to autoimmune diseases, infections and mental disorders (Murovec et al., 2021).

As sequencing technologies developed and improved, it quickly became clear that most microbes (e.g. 99%) cannot be cultivated in the laboratory and that separating strains on the basis of biologically relevant traits is not straightforward; nevertheless, their genetic material reveals which microbes are present in a sample. The same genetic material also allows the microbial functionality of a sample to be inferred, that is, what these microbes can do (functional genes, enzymatic reactions, or metabolic pathways; Murovec et al., 2020).

We developed three tools for reliable and reproducible analysis of large microbial datasets. The first is the General Unified Microbiome Profiling Pipeline (GUMPP), intended for large-scale, streamlined and reproducible analysis of bacterial amplicon data (at the level of genus, operational taxonomic units and amplicon sequence variants) and prediction of their functional potential (Murovec et al., 2021); it comprises Mothur (Schloss et al., 2009), PICRUSt2 (Douglas et al., 2020) and Piphillin (Narayan et al., 2020). Whole-genome (shotgun) sequencing is untargeted: a random subset of all sequences is sequenced to a given depth, unlike targeted (amplicon) sequencing, in which only a small part of a specific gene is sequenced. BioBakery is a suite for whole-metagenome sequence analysis that combines tools for quality analysis, taxonomic analysis (MetaPhlAn), and analysis of the functional genes, enzymatic reactions and metabolic pathways present in a microbial community (HUMAnN 3; Beghini et al., 2021).
BioBakery also enables microbial metabolites to be predicted from metagenomic information alone, giving insight into the potential composition of the microbial metabolomes that may be present in the community (MelonnPan; Mallick et al., 2019). A further advantage of whole-genome sequencing is that it yields genetic information from different taxonomic groups (archaea, bacteria, protozoa, fungi, viruses, and also human DNA), which can improve understanding of the complexity of, and interactions between, different taxonomic levels. This brings us to the second tool developed by our group (metaBakery, in preparation), a reimplementation of BioBakery with additions that enable qualitative analyses and extended with algorithms for calculating microbial diversity.

The next step in whole-metagenome analysis is de novo assembly of metagenome-assembled genomes (MAGs). In this process, sequence reads are quality-checked, assembled and binned to obtain assembled genomes. It can lead to the discovery of entirely new microbial species, since 99% of microbial species cannot be cultivated under laboratory conditions. For large-scale, streamlined and reproducible analyses we developed the MAGO tool (Murovec et al., 2020). It consists of well-established tools for quality analysis (FastQC; fastp (Chen et al., 2018)), assembly (IDBA-UD (Peng et al., 2012), metaSPAdes (Nurk et al., 2017) and MEGAHIT (Li et al., 2015)) and binning (MaxBin (Wu et al., 2016), MetaBAT (Kang et al., 2015), CONCOCT (Alneberg et al., 2016), BinSanity (Graham et al., 2017) and DAS Tool (Sieber et al., 2018)). CheckM (Parks et al., 2015) is then used to filter for MAGs that are high quality according to the MIMAG standards (Bowers et al., 2019), based on completeness and contamination. MAGO also lets the user perform evolutionary analysis with ezTree (Wu, 2018), Prokka (Seemann, 2014), Roary (Page et al., 2015) and FastANI (Jain et al., 2018).

All the tools were written in Python and comprise more than 10,000 lines of code. The analysis parameters are entirely under the user's control. All tools were built as Singularity images (Kurtzer et al., 2017), ready for straightforward use on high-performance computing (HPC) clusters, both for large-scale yet simple analyses of 10,000 samples and for educational purposes (Murovec et al., 2020; Murovec et al., 2021). Both tools are released under the open-source CC-BY 4.0 license and are open to any extension, offering an opportunity for further development and for becoming standards for microbial analysis worldwide.

Peristaltic waves generate the contractile patterns of the small intestine and thereby create a constantly changing environment. Constant mixing of faecal material changes environmental conditions spatially and chemically over time for the microbes living in the gut, which can affect their activity, gene expression, growth and the abundance of individual microbial groups (Ehrlein and Schemann, 2005; Johnson et al., 2012; Cremer et al., 2016; Glover et al., 2016; Cremer et al., 2017; Sket et al., 2017a). Numerous earlier studies have linked stool consistency to gut microbiota richness, composition, enterotypes, elevated levels of inflammation, lipopolysaccharides and bacterial growth rate (Tigchelaar et al., 2016; Vandeputte et al., 2016). Stool consistency has previously been assessed with the Bristol Stool Scale (BSS) (Heaton et al., 1992; Lewis and Heaton, 1997).
One shortcoming of the BSS method is high inter-rater variability, owing to bias and its reliance on visual assessment (Derrien et al., 2010; Chumpitazi et al., 2016). Consequently, only a well-trained expert can draw medically meaningful conclusions from a BSS score (Matsuda et al., 2021). In terms of materials physics, faecal materials are semi-solids (i.e. pastes) (Grillet et al., 2012), placed between viscoelastic materials (semi-permanent deformation in response to external forces) and plastic materials (permanent deformation). This reasoning led us to minimal pressure (MP) as a method for less biased, high-throughput assessment of faecal consistency (Deutsch and Stres, 2021). Minimal pressure, expressed as force per unit area, is the pressure required to cause permanent deformation of faecal material.

We showed that MP increases exponentially as BSS values decrease linearly, irrespective of sex (Deutsch and Stres, 2021). We also showed a non-linear (asymptotic) and complex relationship between dry matter and MP. Longitudinal mapping of surface MP along the full length of individual stool samples revealed fine-scale internal, local differences. Moreover, despite uniform BSS scoring at lower BSS values, our analysis showed that more resistant surface layers of stool were followed by softer internal structures, giving lower MP values associated with near-healthy stool consistency (Deutsch and Stres, 2021). These properties cannot be evaluated with BSS. We determined a threshold that distinguishes a healthy state (MP < 75) from constipation (MP > 75) (Blake et al., 2016; Sket et al., 2017b; Sket et al., 2018). MP < 30 corresponded to liquid stool samples (diarrhoea).

MP was measured on samples collected in the PlanHab (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018) and PreTerm (Deutsch et al., 2022b) studies. The former in particular showed that blockage of pores on the faecal surface and mucus retention were associated with selective pressure on the gut microbiome, its gene expression and metabolic activity, which can lead to local inflammation (Vandeputte et al., 2016; Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Aron-Wisnewsky et al., 2019). We thus showed that the MP approach can accurately describe the clinical significance of stool consistency (Deutsch and Stres, 2021). In addition, the MP approach requires no sample pre-treatment, allows simple measurement without expensive equipment, and gives reproducible measurements across different samples (fresh versus frozen; male versus female).

We also found that MP correlates with faecal methionine and acetate as measured by 1H-NMR. Based on these two metabolites, three groups of faecal consistency can be distinguished (MP < 30, 30 < MP < 75, MP > 75). Methionine has previously been linked to oxidative stress and was elevated in inactive individuals, whereas acetate correlated negatively with insulin sensitivity (Martínez et al., 2017; Müller et al., 2019), indicating that differences in stool consistency can affect the host's biological system. The observed MP-related differences in methionine and acetate were thus apparently a consequence of inactivity within the PlanHab project combined with a Western diet. By combining measurement of selected physicochemical parameters with ‘omics methods, the MP approach allowed us to open up and explore an entirely new level of understanding of complex biological systems (Deutsch and Stres, 2021).

The PlanHab study was the first study by our group to include human urine metabolomics.
Morning urine samples were collected throughout the experiment, which had a cross-over design. All participants went through all three variants of the experiment (21 days of bed rest in hypoxia, bed rest in normoxia, or ambulation in hypoxia) (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Šket et al., 2020). This unique design let us examine the responses of the same participants in all three experimental variants under controlled dietary, environmental and experimental conditions. We collected 523 urine samples and prepared them for 1H-NMR measurements. Participants on bed rest (NBR and HBR) had specific metabolic characteristics compared with the HAmb group. The host's decision to reduce physical activity under hypoxic conditions proved detectable within a few days at the level of the urinary 1H-NMR metabolome. Under normoxic bed rest, these metabolic changes were detected only within the first ten days. The metabolites observed in this study have been linked to a number of diseases: (i) chronic obstructive pulmonary disease (Adamko, 2015; Zabek, 2015) and (ii) cardiovascular disease associated with tissue hypoxia, which can also lead to type 2 diabetes, depression and osteoporosis (Jones, 2014; Wang et al., 2011; Senn et al., 2012). Using urinary 1H-NMR metabolomes, the PlanHab study led us to conclude that no simple metabolic biomarker can distinguish between different states (healthy versus diseased; active versus inactive; active versus sedentary). Complex multivariate descriptions of the metabolome were needed to capture the shared features of human physiology together with inter-individual and temporal variability. This concept was applied in all subsequent studies.
Overall, inactivity alone or combined with hypoxia reduced systemic metabolic diversity and increased the number of affected metabolic pathways, leading to negative physiological symptoms such as insulin resistance, low-grade systemic inflammation, constipation, depression and metabolic syndrome (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018; Šket et al., 2020). The PlanHab results encouraged us to continue in other studies involving different levels of inactivity: X-Adapt (differences between trained and untrained individuals), spinal muscular atrophy, and the PreTerm project, which compares different durations of exposure to hypoxia, physical activity and time of exposure to different conditions. In addition, we collected more than 200 samples from healthy men and women and their sons and daughters (Schmidt, 2021).

Spinal muscular atrophy is a neuromuscular disease that manifests as progressive atrophy and weakening of skeletal muscles due to progressive loss of motor neurons, and it affects many other organ systems (Melki, 2017; Yeo and Darras, 2020). With an incidence of 1 in 11,000 births, it is still considered the most common genetic cause of death in children (Sugarman et al., 2012). In patients with SMA, mutations in the centromeric SMN2 gene lead to the formation of unstable proteins, while expression of the telomeric SMN1 gene is also disrupted by deletion (Lefebvre et al., 1995; Lorson and Androphy, 2000; Lunn and Wang, 2008; Smeriglio et al., 2020). New therapies for SMA have emerged in recent years. They alter the natural course of the disease by changing the expression of, or replacing, the mutated genes involved in the development of SMA (Chiriboga et al., 2016).
Nusinersen was the first drug for the treatment of SMA to be approved by the US Food and Drug Administration and the European Medicines Agency. Nusinersen is an antisense oligonucleotide that affects mRNA splicing, resulting in active SMN2 protein and thus better SMA outcomes (Chiriboga et al., 2016; Corey, 2017; Ramdas and Servais, 2020). It requires intrathecal administration because it cannot cross the blood-brain barrier (Faber et al., 2007; Rigo et al., 2012).

Urine, cerebrospinal fluid and serum samples from SMA patients were collected before treatment and after the fourth application of nusinersen. Medical examination at the fourth application showed improved mobility. After the fourth application, patients showed improvements in wheelchair control, movement, fatigue, hygiene, speech and sleep (Deutsch et al., 2021a; Osredkar et al., 2021). In contrast to the physical examinations, we could not confirm differences in the urine, cerebrospinal fluid and serum metabolomes before and after drug application. In this context we cannot reject the null hypothesis from section 1.4.2, which states that there are no significant differences before and after treatment. Such differences might be confirmed after 10 applications of nusinersen, but collecting those samples would have taken too long to complete within the time frame of this doctorate. These results indicate that the efficacy of nusinersen can be established by medical examinations and mobility tests. Other metabolomic methods such as mass spectrometry, which is more sensitive (nM) than NMR (mM), might reveal biomarkers for monitoring nusinersen treatment. Alternatively, signals of improved metabolism resulting from greater physical activity may still be too faint and appear only at later applications (Deutsch et al., 2020a).

In addition to the SMA project samples, we collected urine samples from a matched healthy cohort to compare the metabolomes of SMA patients with those of healthy individuals. This comparison revealed significant metabolic differences between women and men (p = 0.0001) and between the healthy cohort and SMA patients. The effects of sex and of disease status were both statistically significant. Both PLS-DA and Random Forest showed significant differences between female and male metabolomes. In SMA patients we observed a significant reduction in the cumulative concentration of metabolites (p < 0.05). A reduction in the number of metabolites was also observed in healthy women compared with healthy men. Given these differences, it is important that future studies such as this one include more women in order to establish significant differences between female and male metabolites and their biochemical pathways. We noted some parallels with earlier exercise studies, which indicate that the number of metabolites can increase after exercise (Nieman et al., 2013; Schranner et al., 2020), and with bed rest studies (e.g. PlanHab), which showed a 30% reduction in the number of metabolites after 3 weeks of bed rest (Sket et al., 2017a; Sket et al., 2017b; Sket et al., 2018). Symptoms such as insulin resistance, bone and muscle loss and changes in lipid metabolism were detected in bed rest studies, and all of them also appear on the list of SMA-associated conditions (Osredkar et al., 2021).

To build classification models that distinguish the diseased from the healthy, we used the urine metabolomes of SMA patients and healthy individuals. Using automated machine learning we created a model that successfully separates the two groups (AUC 0.958).
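For readers unfamiliar with the AUC reported above: it is the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. The function name roc_auc and the toy labels and scores are hypothetical and are not study data, and the automated machine-learning workflow itself is not reproduced.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (rank statistic).

    labels: 1 for the positive class (e.g. patient), 0 for the negative class.
    scores: classifier outputs, higher meaning "more likely positive".
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example with made-up values (not taken from the study).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of samples, which is acceptable for cohorts of this size; sorting-based implementations give the same value for larger datasets.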
Creatinine was the key metabolite separating healthy individuals from SMA patients, in line with another study, published a few months before ours, that used serum creatinine levels to follow progressive denervation across SMA severities (Alves et al., 2020). Creatinine concentrations in SMA patients did not change significantly between before and after the 4th administration of nusinersen. Altered creatinine levels were also observed in urine samples from our previous studies (PlanHab; Šket et al., 2020). In those studies the reintroduction of exercise completely reversed the adverse effects. Immobilised patients who had previously received vibration therapy for other diseases benefited compared with controls, so vibration therapy may represent a potential step in the physical activation of SMA patients after nusinersen therapy (Deutsch et al., 2021a).

Within the SMA project we investigated a state of complete inactivity. In the 21st century, however, it is increasingly clear that physical inactivity resulting from a sedentary lifestyle is also a global problem that poses a risk for the development of chronic non-communicable diseases and increased global mortality (Kelly et al., 2020b). It has already been shown that minimising sitting time can reduce the risk of chronic diseases such as coronary heart disease, type 2 diabetes and metabolic syndrome (Sallis et al., 2016). The aim of the X-Adapt project was to study the differences between physically active (trained) and inactive (untrained) individuals (Sotiridis et al., 2018; Sotiridis, 2019b; Sotiridis et al., 2019; Sotiridis et al., 2020). The project included 10 trained and 10 untrained men in a 10-day exercise protocol consisting of daily exercise on a cycle ergometer at 50% of peak pedalling power in normoxic and normobaric (~1000 hPa) conditions at 24 °C. Before participation and after the 10 days of exercise, all participants (active and inactive) completed three days of testing in thermoneutral normoxic and hypoxic conditions and in hot normoxic conditions. Study participants were classified as trained or untrained according to their maximal aerobic capacity (untrained: VO2max < 45 mL·kg⁻¹·min⁻¹; trained: VO2max > 55 mL·kg⁻¹·min⁻¹) (Jay et al., 2011; Montero and Lundby, 2017). Measurements focused on human physiology showed some significant differences between trained and untrained participants. The differences between the pre- and post-training states were larger in the untrained group. Based on the pre-training VO2max measurements and their change over the 10 days of training, the rate of adaptation to training was greatest in untrained individuals (Sotiridis et al., 2018; Sotiridis, 2019b; Sotiridis et al., 2019; Sotiridis et al., 2020).

Based on the urine metabolomes, no significant differences could be detected between before and after the 10 days of training. However, differences were observed when the urinary metabolomes of trained and untrained participants were compared. In addition, the physicochemical properties of urine (pH, total dissolved solids, salinity and conductivity) also differed significantly between these two groups. For example, pH was lower in untrained individuals, which has previously been associated with metabolic syndrome and chronic heart failure (Maalouf et al., 2007; Otaki et al., 2013; Kraut and Madias, 2016; Shimodaira et al., 2017). Using multivariate statistics and machine learning, the metabolites cholate, tartrate, cadaverine, lysine, N6-acetyllysine, methanol, N-acetylglucosamine, butanone and caprate were identified as responsible for the distinction between the trained and untrained groups.
All of these metabolites have previously been observed in studies related to muscle damage, hormone receptor levels, recovery after resistance training and lower cardiovascular risk (tartrate) (Abramowicz and Galloway, 2005; Spiering et al., 2008), or to atrophic states, obesity, cancer development and metabolic syndrome (cholate) (Li et al., 2020; Abrigo et al., 2021; Alamoudi et al., 2021; Mercer et al., 2021; Pushpass et al., 2021; Zheng et al., 2021). With this approach we showed that the system as a whole differs significantly between active and inactive persons (p = 0.003). After 10 days of training, the overall differences between the trained and untrained groups decreased (p = 0.226). Our study indicated that the 75-150 minutes of exercise per week recommended by the World Health Organization is not effective enough and that about 5 times more exercise would be needed. In addition, this experiment showed that 3-day morning urine samples provide a good biological matrix for distinguishing active from inactive individuals, a distinction that cannot be observed with single-day sampling because of the day-to-day variability within an individual. Systemic homeostasis depends on many different parameters and involves communication between organs, through which metabolic pathways influenced by metabolites in one organ can affect metabolic pathways in another organ. A sedentary lifestyle, in which signals from large muscles, systemic oxygenation and nutrient consumption are absent, can disrupt this inter-organ communication and lead to the manifestation of various diseases. Higher levels of exercise may restore inter-organ communication to the level seen in healthy, physically active individuals (Deutsch et al., 2022a).

Preterm birth is defined as birth before the 37th week of gestation.
Worldwide, about 10% of births are preterm, and preterm birth remains one of the leading causes of death in children under 5 years of age. Preterm birth increases the risk of developing various chronic diseases, such as cardiovascular, endocrine/metabolic, renal, neurological and psychiatric disorders. One of the main causes of these disorders is increased oxidative stress in the first weeks of life (Moutquin, 2003; Magalhães et al., 2004; Pialoux et al., 2009; Blencowe et al., 2012; Lushchak, 2014; Liu et al., 2015; Manley et al., 2015; Debevec et al., 2017; Crump, 2020; Tingleff et al., 2021). Some clinical parameters, such as body mass, arterial blood pressure, fasting glucose and cholesterol, are likely to be elevated in adults born preterm (Kerkhof et al., 2012; Markopoulou et al., 2019; Crump, 2020). Various studies have shown that all these characteristics differ between adults born preterm and those born at term, that the differences are associated in particular with the production of reactive oxygen species, and that they can be observed in connection with different levels of exercise or physical activity (Magalhães et al., 2004; Powers et al., 2011; Filippone et al., 2012; Debevec et al., 2017; Martin et al., 2018). The aim of the PreTerm project was to investigate whether young men born preterm and at term differ in the hypoxic ventilatory response (HVR) during physical activity or rest in hypoxic and normoxic environmental conditions (Debevec et al., 2019; Debevec et al., 2022). In addition, an analytical approach consisting of urine and fecal metabolomics and fecal metagenomics was used to describe the complexity of the human body and the gut microbiome in their response to increased levels of oxidative stress at rest and during exercise in normoxia and hypoxia (Deutsch et al., 2022b). Cycling in normoxia and hypoxia has been shown to raise oxidative stress levels in both groups immediately after exercise (Martin et al., 2020).
Participants in the preterm group showed lower exercise capacity in normoxia than the control group and had a lower HVR, whereas no such difference was observed in hypoxia (Vrijlandt et al., 2006; Lovering et al., 2013; Svedenkrans et al., 2013; Bates et al., 2014; Clemm et al., 2014; Farrell et al., 2015; Debevec et al., 2019). These results suggest that individuals born preterm may experience increased oxidative stress during acute exercise in normoxia, whereas no such response was observed in hypoxia (Martin et al., 2020). In the stool samples we measured 25 physicochemical variables (including the MP approach described above) and found no significant differences between the two groups. Stool and urine samples were collected for three days before and three days after the hypoxic and normoxic tests (Deutsch et al., 2022b).

Acetone, tartrate and trans-aconitate were the urinary metabolites that, according to the MetaboAnalyst results, were decreased in the preterm group; they correlate with exercise, fasting or diabetes mellitus (Paradis et al., 2015; Crump et al., 2019; Perrone et al., 2021). The differences appear to result from impaired autonomic function, since heart rate recovers more slowly in adults born preterm, which could cause anoxia and increase cardiovascular risk, as previously reported (Qiu et al., 2017; Haraldsdottir et al., 2019). Lactate, serotonin and tyrosine were the main fecal metabolites accounting for the difference between the preterm and term-born groups. The first two metabolites were increased in the preterm group, which, together with the enriched metabolic pathway (Warburg effect), suggests metabolic changes in this group that can be linked to mitochondrial dysfunction (Sonntag et al., 2007; Ten, 2017).
These findings may represent the first evidence that systemic differences due to lifelong exposure to oxidative stress do exist. They also raise the question of whether these differences are related to small differences arising on the side of the preterm-born host or on the side of the microbiome, which responds to these environmental signals differently than in term-born individuals (Deutsch et al., 2022b).

The collected fecal samples were sequenced to investigate whether the observed differences in the fecal metabolomes correlate with differences at the microbial level. We observed no significant differences at the taxonomic level, although the relative abundance of archaea and viruses was higher in the preterm group. Over the last decade it has become clear that, in microbiome research, the more important question is what the microbes in our gut are doing. We therefore analysed the functionality of the studied microbiome with our tool metaBakery. Machine learning was used to build classification models and to identify potential biomarkers (Deutsch et al., 2022b). No significant differences were found on the basis of gene families, but we detected some differences on the basis of enzymatic reactions, metabolic pathways and predicted metabolites. The previously described enzymatic reaction of succinate dehydrogenase (RXN-15378) was increased in the preterm group. Succinate is itself a microbial metabolite and can accumulate in the intestinal tract during inflammation or microbial imbalance. Its properties are tissue-specific and include anti-inflammatory effects, and it is also a source for propionate production by Bacteroides spp. and Prevotella sp. Succinate has been shown to accumulate in cells under low-oxygen conditions and represents a metabolic signature of hypoxia.
Excessive uptake of microbially produced succinate has been shown to lead to higher levels of intracellular succinate, which ultimately enhances the response to hypoxia itself and at the same time increases the LPS-induced expression of pro-inflammatory cytokines in human cells (Rubic et al., 2008; Ariza et al., 2012; Tannahill et al., 2013; Akram, 2014; Littlewood-Evans et al., 2016; Connors et al., 2018; Deutsch et al., 2022b).

PWY-7456 (β-(1,4)-mannan degradation), PWY-7323 (superpathway of GDP-mannose-derived O-antigen building blocks biosynthesis), GLYCOLYSIS-TCA-GLYOX-BYPASS (superpathway of glycolysis, pyruvate dehydrogenase, TCA and glyoxylate bypass), P221-PWY (octane oxidation) and PWY-5173 (unclassified) were the pathways that were increased in the preterm group. Some of them may be beneficial, either supporting mucosal integrity and host nutrition (β-(1,4)-mannan degradation) or substantially increasing energy production, which would be important under oxidative stress such as that in individuals born preterm (superpathway of glycolysis, pyruvate dehydrogenase, TCA and glyoxylate bypass). Acetyl-CoA biosynthesis may also increase butyrate production by supplying acetyl-CoA. On the other hand, some pathways with a more negative effect were also increased in the preterm group. These pathways have been shown to be involved in the production of lipopolysaccharides (LPS) (biosynthesis of O-antigen building blocks from GDP-mannose), which are associated with Gram-negative bacteria and cause various degrees of inflammation (Samuel and Reeves, 2003; Wolfs et al., 2010; Shah et al., 2015; Kim et al., 2016; La Rosa et al., 2019; Lindstad et al., 2021). Octane oxidation was also observed; it has previously been described in the context of a Western diet and linked to liver disease.
All these differences can be linked to the physiologically relevant deficits observed between the two groups (Martin et al., 2018; Martin et al., 2020; Schmidt, 2021). Using the microbial metabolite prediction approach together with machine learning, seventeen metabolites that separate the two groups were also identified, but none of them was detected by fecal metabolomics (Deutsch et al., 2022b). In addition to the metabolic pathways derived from the metagenomically predicted metabolites, significant differences were found in the urine and stool metabolites, indicating that there are systemic differences between the two groups. The elevated metabolites have previously been associated with cardiovascular disease (carnitine), increased gut permeability, elevated levels of inflammatory cytokines, metabolic syndrome or cancer development (putrescine and diacetylspermine). On the other hand, some predicted metabolites were decreased in the preterm group. Deoxycholate is a secondary bile acid and a known promoter of colon cancer. The decreased levels of this molecule are consistent with the increased urinary excretion of cholate observed in the urine metabolomics. The lower content of reducing sugars (fructose, glucose and galactose) in the preterm group corresponded to a greater capacity to form short-chain fatty acids (Fukiya et al., 2009; Wang et al., 2011b; Koeth et al., 2013; Tang et al., 2013; Ussher et al., 2013; Staley et al., 2017; Heinken et al., 2019; Wirbel et al., 2019). At the level of de novo assembled metagenomes we found no differences, which agrees with the results at the level of the taxonomic data obtained with MetaPhlAn. This is consistent with our observation that there are no significant taxonomic differences between the microbiota of preterm and term-born individuals (Deutsch et al., 2022b).

With the results described above, we can confirm two alternative hypotheses from Section 1.4.1.
The first confirmed hypothesis states that there are significant differences between the preterm and term-born groups of participants in fecal and urine metabolites, which can be linked to their physical performance in the experiments and to physiological data during exercise and rest. The second hypothesis states that there are significant differences at the level of the metagenomic composition of the two groups, which make it possible to identify specific metabolic pathways that differ between the groups, as well as characteristics of their gut environment. We did not observe differences between the de novo assembled metagenomes of the two groups, so in this case we cannot reject the null hypothesis, which states that there is no difference between the preterm and term-born groups.

More than 1200 collected samples were combined to build the Slovenian 1H-NMR urine database. All urine samples collected in 5 projects (PlanHab, spinal muscular atrophy, X-Adapt, PreTerm, healthy women and men) were integrated. All measured spectra were analysed with the same procedure so that the same set of metabolites was obtained in all groups. We showed that, at this level of physiological data, different levels of activity can be distinguished on the basis of urinary metabolites. All samples were processed in the same way and can be reprocessed in the future with further updates of the Human Metabolome Database (Wishart et al., 2007; Wishart et al., 2022), using our own tools for processing metabolomic data (Šket et al., 2020; Murovec et al., 2018; Deutsch et al., 2021a; Deutsch et al., 2021b; Deutsch et al., 2022a; Deutsch et al., 2022b) together with commercially available software for targeted 1H-NMR analysis.
For example, the same spectra can be re-analysed with future updates of the Human Metabolome Database (HMDB), which has grown from a few thousand metabolites in its first release (Wishart et al., 2007) to 217,000 metabolites in the latest release in 2021 (Wishart et al., 2021). The standardised analytical protocols established in our laboratory allowed us to reduce systematic errors. Box-Cox normalisation and the sPLSDA approach, which were used to integrate all metabolomes in our study, successfully removed the effects of different sample batches while preserving the differences due to lifestyle or other biological reasons (Wang and La Cao, 2020). This approach also allowed us to confirm the alternative hypothesis from Section 1.4.3, that there are significant differences in urinary metabolomes that enable the identification of sets of biomarkers and metabolic pathways delineating the different groups under study (Schmidt, 2021). We showed that the metabolic fingerprint in urine can provide a snapshot of the metabolic status of the whole body system, which can be associated with health or disease (Azad and Shulaev, 2019; Mussap et al., 2021).

Metabolomics in general involves the systematic identification of metabolites in the human body (Ashrafian et al., 2021). The development of a national database is expected to improve the understanding of the Slovenian metabolome in parallel with studies from other European countries, and the identification of metabolites specific to different diseases or physical states. 1H-NMR metabolomics has the potential to capture a wide range of common clinical variables in epidemiological studies, including variables missing from patient metadata, and enables the creation of biomarkers for distinguishing between different diseases on the basis of machine learning.
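To make the Box-Cox normalisation mentioned above more concrete, the transform maps a strictly positive value x to (x^λ - 1)/λ for λ ≠ 0 and to ln x for λ = 0, with λ usually chosen by maximum likelihood. The following is a minimal Python sketch under stated assumptions: it is not the software used in this work, the helper names (`box_cox`, `log_likelihood`, `best_lambda`) and the coarse λ grid are hypothetical, and the concentrations are invented for illustration.

```python
import math


def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def log_likelihood(values, lam):
    """Profile log-likelihood of lambda, assuming normal transformed data."""
    y = box_cox(values, lam)
    n = len(y)
    mean = sum(y) / n
    var = sum((yi - mean) ** 2 for yi in y) / n
    return (lam - 1.0) * sum(math.log(v) for v in values) - 0.5 * n * math.log(var)


def best_lambda(values, grid=None):
    """Choose lambda by maximum likelihood over a coarse grid (assumed: -2 to 2, step 0.1)."""
    if grid is None:
        grid = [i / 10.0 for i in range(-20, 21)]
    return max(grid, key=lambda lam: log_likelihood(values, lam))


# Hypothetical right-skewed metabolite concentrations (assumption, not study data)
conc = [0.5, 0.8, 1.1, 1.9, 3.2, 7.5, 20.0]
lam = best_lambda(conc)
transformed = box_cox(conc, lam)
```

Because the transform is monotonic, the ordering of samples is preserved while the long right tail typical of concentration data is compressed, which is the property that makes it useful before multivariate methods such as sPLSDA.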
A holistic interpretation of metabolomic data sets, especially of urine, which can be collected non-invasively, can provide enough data to infer how samples should be classified into different groups. We hope to stimulate the interest of other researchers in including NMR metabolomics in their research, so as to expand our established database to approximately 10,000 samples at the national level. Modelling such a data collection represents a unique route to machine learning models that can be used, at least tentatively, in medical practice to distinguish between healthy and unhealthy metabolomic states in addition to different diseases. This approach thus represents a step closer to data-driven personalised medicine and has the potential to inform about health at the national level. A research article on the Slovenian NMR database is in preparation.

Work on the Slovenian NMR database continues. Accordingly, (i) a total of 320 samples from the PreAlti project (an extension of the PreTerm project) have been collected and measured, (ii) the SMA extension is currently in the sample collection phase, (iii) samples are also being collected from two clinical cohorts at the University Medical Centre Ljubljana in cooperation with the Pediatric Clinic (tics, anorexia), and (iv) within a clinical cohort associated with the Million Microbiomes from Humans Project we intend to collect more than 1000 stool and urine samples for metagenomic and metabolomic analyses. With these projects we are on track to generate thousands of gigabytes of molecular data that will in the future also be useful in everyday diagnostics and are comparable to the largest European studies.

Maintaining systemic homeostasis and responding to nutritional and environmental challenges requires the coordination of different organs and tissues.
To meet different metabolic demands, the human body integrates a system of inter-organ communication through which one tissue can influence metabolic pathways in a distant tissue. Disruption of these communication pathways by lack of exercise (a sedentary lifestyle) or by a high-calorie diet contributes to human diseases such as obesity, diabetes, liver disease and atherosclerosis. For timely interventions, we should consider using body fluids (such as urine) that allow non-invasive sampling and are at the same time sensitive enough to distinguish between types of biomarkers (Schmidt, 2021). This opens up room for a better understanding of inter-organ communication as a gatekeeper for metabolic health, since multidirectional interactions exist between the organs and the central nervous system that serve to maintain energy homeostasis. Such understanding could enable new therapeutic strategies and the promotion of healthy living to prevent metabolic disorders and other diseases.

5 REFERENCES

Abramowicz W.N., Galloway S.D. 2005. Effects of acute versus chronic L-carnitine L-tartrate supplementation on metabolic responses to steady state exercise in males and females. International Journal of Sport Nutrition and Exercise Metabolism, 15, 4: 386-400
Abrigo J., Gonzalez F., Aguirre F., Tacchi F., Gonzalez A., Meza M.P., Simon F., Cabrera D., Arrese M., Karpen S., Cabello-Verrugio C. 2021. Cholic acid and deoxycholic acid induce skeletal muscle atrophy through a mechanism dependent on TGR5 receptor. Journal of Cellular Physiology, 236, 1: 260-272
Adamko D.J., Nair P., Mayers I., Tsuyuki R.T., Regush S., Rowe B.H. 2015. Metabolomic profiling of asthma and chronic obstructive pulmonary disease: A pilot study differentiating diseases. The Journal of Allergy and Clinical Immunology, 136, 3: 571-580
Akram M. 2014. Citric acid cycle and role of its intermediates in metabolism. Cell Biochemistry and Biophysics, 68, 3: 475-478
Alamoudi J.A., Li W., Gautam N., Olivera M., Meza J., Mukherjee S., Alnouti Y. 2021. Bile acid indices as biomarkers for liver diseases I: diagnostic markers. World Journal of Hepatology, 13, 4: 433-455
Alneberg J., Bjarnason B.S., De Bruijn I., Schirmer M., Quick J., Ijaz U.Z., Lahti L., Loman N.J., Andersson A.F., Quince C. 2014. Binning metagenomic contigs by coverage and composition. Nature Methods, 11, 4: 1144-1146
Alves C.R.R., Zhang R., Johnstone A.J., Garner R., Nwe P.H., Siranosian J.J., Swoboda K.J. 2020. Serum creatinine is a biomarker of progressive denervation in spinal muscular atrophy. Neurology, 94, 9: e921-e931
Amano H., Maruyama K., Naka M., Tanaka T. 2003. Target validation in hypoxia-induced vascular remodeling using transcriptome/metabolome analysis. The Pharmacogenomics Journal, 3, 3: 183-188
Anderson M.J., Walsh D.C.I. 2013. PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing? Ecological Monographs, 83, 4: 557-574
Apweiler R., Beissbarth T., Berthold M.R., Blüthgen N., Burmeister Y., Dammann O., Deutsch A., Feuerhake F., Franke A., Hasenauer J., Hoffmann S., Höfer T., Jansen P.L., Kaderali L., Klingmüller U., Koch I., Kohlbacher O., Kuepfer L., Lammert F., Maier D., Pfeifer N., Radde N., Rehm M., Roeder I., Saez-Rodriguez J., Sax U., Schmeck B., Schuppert A., Seilheimer B., Theis F.J., Vera J., Wolkenhauer O. 2018. Whither systems medicine? Experimental & Molecular Medicine, 50, 3: e453, doi: 10.1038/emm.2017.290, 6 p.
Argmann C.A., Houten S.M., Zhu J., Schadt E.E. 2016. A next generation multiscale view of inborn errors of metabolism. Cell Metabolism, 23, 1: 13-26
Ariza A.C., Deen P.M., Robben J.H. 2012. The succinate receptor as a novel therapeutic target for oxidative and metabolic stress-related conditions. Frontiers in Endocrinology, 3: 22, doi: 10.3389/fendo.2012.00022, 8 p.
Armstrong N., Barker A.R. 2011. Endurance training and elite young athletes. Medicine and Sport Science, 56: 59-83
Aron-Wisnewsky J., Prifti E., Belda E., Ichou F., Kayser B.D., Dao M.C., Verger E.O., Hedjazi L., Bouillot J.L., Chevallier J.M., Pons N., Le Chatelier E., Levenez F., Ehrlich S.D., Dore J., Zucker J.D., Clement K. 2019. Major microbiota dysbiosis in severe obesity: fate after bariatric surgery. Gut, 68, 1: 70-82
Aßhauer K.P., Wemheuer B., Daniel R., Meinicke P. 2015. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data. Bioinformatics (Oxford, England), 31, 17: 2882-2884
Ashrafian H., Sounderajah V., Glen R., Ebbels T., Blaise B.J., Kalra D., Kultima K., Spjuth O., Tenori L., Salek R.M., Kale N., Haug K., Schober D., Rocca-Serra P., O'donovan C., Steinbeck C., Cano I., De Atauri P., Cascante M. 2021. Metabolomics: the stethoscope for the twenty-first century. Medical Principles and Practice: International Journal of the Kuwait University, Health Science Centre, 30, 4: 301-310
Aureli T., Miccheli A., Di Cocco M.E., Ghirardi O., Giuliani A., Ramacci M.T., Conti F. 1994. Effect of acetyl-L-carnitine on recovery of brain phosphorus metabolites and lactic acid level during reperfusion after cerebral ischemia in the rat--study by 13P- and 1H-NMR spectroscopy. Brain Research, 643, 1-2: 92-99
Azad R.K., Shulaev V. 2019. Metabolomics technology and bioinformatics for precision medicine. Briefings in Bioinformatics, 20, 6: 1957-1971
Barnes S., Benton H.P., Casazza K., Cooper S.J., Cui X., Du X., Engler J., Kabarowski J.H., Li S., Pathmasiri W., Prasain J.K., Renfrow M.B., Tiwari H.K. 2016. Training in metabolomics research. II. Processing and statistical analysis of metabolomics data, metabolite identification, pathway analysis, applications of metabolomics and its future. Journal of Mass Spectrometry: JMS, 51, 8: 535-548
Barr A.J. 2018. The biochemical basis of disease. Essays in Biochemistry, 62, 5: 619-642
Bates M.L., Farrell E.T., Eldridge M.W. 2014. Abnormal ventilatory responses in adults born prematurely. The New England Journal of Medicine, 370, 6: 584-585
Beghini F., Mciver L.J., Blanco-Míguez A., Dubois L., Asnicar F., Maharjan S., Mailyan A., Manghi P., Scholz M., Thomas A.M., Valles-Colomer M., Weingart G., Zhang Y., Zolfo M., Huttenhower C., Franzosa E.A., Segata N. 2021. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. ELife, 10: e65088, doi: 10.7554/eLife.65088, 42 p.
Beirnaert C., Meysman P., Vu T.N., Hermans N., Apers S., Pieters L., Covaci A., Laukens K. 2018. speaq 2.0: A complete workflow for high-throughput 1D NMR spectra processing and quantification. PLoS Computational Biology, 14, 3: e1006018, doi: 10.1371/journal.pcbi.1006018, 25 p.
Bender A., Scheiber J., Glick M., Davies J.W., Azzaoui K., Hamon J., Urban L., Whitebread S., Jenkins J.L. 2007. Analysis of pharmacology data and the prediction of adverse drug reactions and off-target effects from chemical structure. ChemMedChem, 2, 6: 861-873
Bérard C., Payan C., Hodgkinson I., Fermanian J. 2005. A motor function measure for neuromuscular diseases. Construction and validation study. Neuromuscular Disorders: NMD, 15: 463-470
Berg G., Rybakova D., Fischer D., Cernava T., Vergès M.C., Charles T., Chen X., Cocolin L., Eversole K., Corral G.H., Kazou M., Kinkel L., Lange L., Lima N., Loy A., Macklin J.A., Maguin E., Mauchline T., Mcclure R., Mitter B., Ryan M., Sarand I., Smidt H., Schelkle B., Roume H., Kiran G.S., Selvin J., Souza R.S.C., Van Overbeek L., Singh B.K., Wagner M., Walsh A., Sessitsch A., Schloter M. 2020. Microbiome definition re-visited: old concepts and new challenges. Microbiome, 8, 1: 103, doi: 10.1186/s40168-020-00875-0, 22 p.
Bingol K. 2018. Recent advances in targeted and untargeted metabolomics by NMR and MS/NMR methods. High-Throughput, 7, 2: 9, doi: 10.3390/ht7020009, 11 p.
Biomarkers Definitions Working Group. 2001. Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clinical Pharmacology and Therapeutics, 69, 3: 89-95
Bizzarri D., Reinders M.J.T., Beekman M., Slagboom P.E., Bbmri N.L., Van Den Akker E.B. 2022. 1H-NMR metabolomics-based surrogates to impute common clinical risk factors and endpoints. EBioMedicine, 75: 103764, doi: 10.1016/j.ebiom.2021.103764, 15 p.
Bjerrum J.T., Wang Y., Hao F., Coskun M., Ludwig C., Gunther U., Nielsen O.H. 2015. Metabonomics of human fecal extracts characterize ulcerative colitis, Crohn's disease and healthy individuals. Metabolomics, 11: 122-133
Blake M.R., Raker J.M., Whelan K. 2016. Validity and reliability of the Bristol stool form scale in healthy adults and patients with diarrhoea-predominant irritable bowel syndrome. Alimentary Pharmacology & Therapeutics, 44, 7: 693-703
Blencowe H., Cousens S., Oestergaard M.Z., Chou D., Moller A.B., Narwal R., Adler A., Vera Garcia C., Rohde S., Say L., Lawn J.E. 2012. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: a systematic analysis and implications. Lancet (London, England), 379, 9832: 2162-2172
Booth F.W., Roberts C.K., Laye M.J. 2012. Lack of exercise is a major cause of chronic diseases. Comprehensive Physiology, 2, 2: 1143-1211
Borkowski A.A., Wilson C.P., Borkowski S.A., Thomas L.B., Deland L.A., Grewe S.J., Mastorides S.M. 2019. Google Auto ML versus Apple Create ML for histopathologic cancer diagnosis; Which algorithms are better? Comprehensive Physiology, 2, 2: 1143-1211
Bousquet J., Anto J.M., Sterk P.J., Adcock I.M., Chung K.F., Roca J., Agusti A., Brightling C., Cambon-Thomsen A., Cesario A., Abdelhak S., Antonarakis S.E., Avignon A., Ballabio A., Baraldi E., Baranov A., Bieber T., Bockaert J., Brahmachari S., Brambilla C., Bringer J., Dauzat M., Ernberg I., Fabbri L., Froguel P., Galas D., Gojobori T., Hunter P., Jorgensen C., Kauffmann F., Kourilsky P., Kowalski M.L., Lancet D., Le Pen C., Mallet J., Mayosi B., Mercier J., Metspalu A., Nadeau J.H., Ninot G., Noble D., Ozturk M., Palkonen S., Prefaut C., Rabe K., Renard E., Roberts R.G., Samolinski B., Schunemann H.J., Simon H.U., Soares M.B., Superti-Furga G., Tegner J., Verjovski-Almeida S., Wellstead P., Wolkenhauer O., Wouters E., Balling R., Brookes A.J., Charron D., Pison C., Chen Z., Hood L., Auffray C. 2011. Systems medicine and integrated care to combat chronic noncommunicable diseases. Genome Medicine, 3: 43, doi: 10.1186/gm259, 12 p.
Bowers R.M., Kyrpides N.C., Stepanauskas R., Harmon-Smith M., Doud D., Reddy T.B.K., Schulz F., Jarett J., Rivers A.R., Eloe-Fadrosh E.A., Tringe S.G., Ivanova N.N., Copeland A., Clum A., Becraft E.D., Malmstrom R.R., Birren B., Podar M., Bork P., Weinstock G.M., Garrity G.M., Dodsworth J.A., Yooseph S., Sutton G., Glöckner F.O., Gilbert J.A., Nelson W.C., Hallam S.J., Jungbluth S.P., Ettema T.J.G., Tighe S., Konstantinidis K.T., Liu W.T., Baker B.J., Rattei T., Eisen J.A., Hedlund B., Mcmahon K.D., Fierer N., Knight R., Finn R., Cochrane G., Karsch-Mizrachi I., Tyson G.W., Rinke C., Consortium G.S., Lapidus A., Meyer F., Yilmaz P., Parks D.H., Eren A.M., Schriml L., Banfield J.F., Hugenholtz P., Woyke T. 2017. Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea. Nature Biotechnology, 35, 8: 725-731
Bro R., Kamstrup-Nielsen M.H., Engelsen S.B., Savorani F., Rasmussen M.A., Hansen L., Olsen A., Tjonneland A., Dragsted L.O. 2015. Forecasting individual breast cancer risk using plasma metabolomics and biocontours. Metabolomics, 11, 5: 1376-1380
Brown C.T., Sharon I., Thomas B.C., Castelle C.J., Morowitz M.J., Banfield J.F. 2013. Genome resolved analysis of a premature infant gut microbial community reveals a Varibaculum cambriense genome and a shift towards fermentation-based metabolism during the third week of life. Microbiome, 1: 30, doi: 10.1186/2049-2618-1-30, 19 p.
Callahan B.J., Mcmurdie P.J., Holmes S.P. 2017. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. The ISME Journal, 11, 12: 2639-2643
Cañueto D., Gómez J., Salek R.M., Correig X., Cañellas N. 2018. rDolphin: a GUI R package for proficient automatic profiling of 1D 1H-NMR spectra of study datasets. Metabolomics: Official Journal of the Metabolomic Society, 14, 3: 24, doi: 10.1007/s11306-018-1319-y, 5 p.
Castillo-Armengol J., Fajas L., Lopez-Mejia I.C. 2019. Inter-organ communication: a gatekeeper for metabolic health. EMBO Reports, 20, 9: e47903, doi: 10.15252/embr.201947903, 16 p.
Castro A., Duft R.G., Ferreira M.L.V., Andrade A.L.L., Gáspari A.F., Silva L.M., Oliveira-Nunes S.G., Cavaglieri C.R., Ghosh S., Bouchard C., Chacon-Mikahil M.P.T. 2019. Association of skeletal muscle and serum metabolites with maximum power output gains in response to continuous endurance or high-intensity interval training programs: The TIMES study - a randomized controlled trial. PloS One, 14, 2: e0212115, doi: 10.1371/journal.pone.0212115, 32 p.
Castro J.C., Rodriguez-R L.M., Harvey W.T., Weigand M.R., Hatt J.K., Carter M.Q., Konstantinidis K.T. 2018. imGLAD: accurate detection and quantification of target organisms in metagenomes. PeerJ, 6: e5882, doi: 10.7717/peerj.5882, 23 p.
Chen K., Zhang Q., Wang J., Liu F., Mi M., Xu H., Chen F., Zeng K. 2009. Taurine protects transformed rat retinal ganglion cells from hypoxia-induced apoptosis by preventing mitochondrial dysfunction. Brain Research, 1279: 131-138
Chen P.C., Pan C., Gharibani P.M., Prentice H., Wu J.Y. 2013. Taurine exerts robust protection against hypoxia and oxygen/glucose deprivation in human neuroblastoma cell culture. Advances in Experimental Medicine and Biology, 775: 167-175
Chen S., Zhou Y., Chen Y., Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics (Oxford, England), 34, 17: i884-i890
Chiriboga C.A., Swoboda K.J., Darras B.T., Iannaccone S.T., Montes J., De Vivo D.C., Norris D.A., Bennett C.F., Bishop K.M. 2016. Results from a phase 1 study of nusinersen (ISIS-SMN(Rx)) in children with spinal muscular atrophy. Neurology, 86, 10: 890-897
Chong J., Liu P., Zhou G., Xia J. 2020. Using MicrobiomeAnalyst for comprehensive statistical, functional, and meta-analysis of microbiome data. Nature Protocols, 15, 3: 799-821
Chong J., Soufan O., Li C., Caraus I., Li S., Bourque G., Wishart D.S., Xia J. 2018. MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis. Nucleic Acids Research, 46, W1: W486-W494
Chong J., Wishart D.S., Xia J. 2019. Using MetaboAnalyst 4.0 for comprehensive and integrative metabolomics data analysis. Current Protocols in Bioinformatics, 68, 1: e86, doi: 10.1002/cpbi.86, 128 p.
Chumpitazi B.P., Self M.M., Czyzewski D.I., Cejka S., Swank P.R., Shulman R.J. 2016. Bristol stool form scale reliability and agreement decreases when determining Rome III stool form designations. Neurogastroenterology and Motility: The Official Journal of the European Gastrointestinal Motility Society, 28, 3: 443-448
Clemm H.H., Vollsæter M., Røksund O.D., Eide G.E., Markestad T., Halvorsen T. 2014. Exercise capacity after extremely preterm birth. Development from adolescence to adulthood. Annals of the American Thoracic Society, 11, 4: 537-545
Coen P.M., Jubrias S.A., Distefano G., Amati F., Mackey D.C., Glynn N.W., Manini T.M., Wohlgemuth S.E., Leeuwenburgh C., Cummings S.R., Newman A.B., Ferrucci L., Toledo F.G., Shankland E., Conley K.E., Goodpaster B.H. 2013. Skeletal muscle mitochondrial energetics are associated with maximal aerobic capacity and walking speed in older adults. The Journals of Gerontology. Series A, Biological Sciences and Medical Sciences, 68, 4: 447-455
Connors J., Dawe N., Van Limbergen J. 2018. The role of succinate in the regulation of intestinal inflammation. Nutrients, 11, 1: 25, doi: 10.3390/nu11010025, 12 p.
Corey D.R. 2017. Nusinersen, an antisense oligonucleotide drug for spinal muscular atrophy. Nature Neuroscience, 20, 4: 497-499
Costea P.I., Zeller G., Sunagawa S., Pelletier E., Alberti A., Levenez F., Tramontano M., Driessen M., Hercog R., Jung F.E., Kultima J.R., Hayward M.R., Coelho L.P., Allen-Vercoe E., Bertrand L., Blaut M., Brown J.R.M., Carton T., Cools-Portier S., Daigneault M., Derrien M., Druesne A., De Vos W.M., Finlay B.B., Flint H.J., Guarner F., Hattori M., Heilig H., Luna R.A., Van Hylckama Vlieg J., Junick J., Klymiuk I., Langella P., Le Chatelier E., Mai V., Manichanh C., Martin J.C., Mery C., Morita H., O'toole P.W., Orvain C., Patil K.R., Penders J., Persson S., Pons N., Popova M., Salonen A., Saulnier D., Scott K.P., Singh B., Slezak K., Veiga P., Versalovic J., Zhao L., Zoetendal E.G., Ehrlich S.D., Dore J., Bork P. 2017. Towards standards for human fecal sample processing in metagenomic studies. Nature Biotechnology, 35, 11: 1069-1076
Craig J. 2008. Complex diseases: research and applications. Nature Education, 1: 184
Crass M.F., Lombardini J.B. 1977. Loss of cardiac muscle taurine after acute left ventricular ischemia. Life Sciences, 21, 7: 951-958
Cremer J., Segota I., Yang C.Y., Arnoldini M., Sauls J.T., Zhang Z., Gutierrez E., Groisman A., Hwa T. 2016. Effect of flow and peristaltic mixing on bacterial growth in a gut-like channel. Proceedings of the National Academy of Sciences of the United States of America, 113, 41: 11414-11419
Cremer J., Arnoldini M., Hwa T. 2017. Effect of water flow and chemical environment on microbiota growth and composition in the human colon. Proceedings of the National Academy of Sciences of the United States of America, 114, 25: 10069-10240
Cristianini N., Shawe-Taylor J. 2000. An introduction to support vector machines and other kernel-based learning methods. Cambridge, Cambridge University Press: 189 p.
Crump C., Sundquist J., Winkleby M.A., Sundquist K. 2019. Gestational age at birth and mortality from infancy into mid-adulthood: a national cohort study. The Lancet. Child & Adolescent Health, 3, 6: 408-417
Crump C. 2020. An overview of adult health outcomes after preterm birth. Early Human Development, 150: 105187, doi: 10.1016/j.earlhumdev.2020.105187, 8 p.
D'amore R., Ijaz U.Z., Schirmer M., Kenny J.G., Gregory R., Darby A.C., Shakya M., Podar M., Quince C., Hall N. 2016. A comprehensive benchmarking study of protocols and sequencing platforms for 16S rRNA community profiling. BMC Genomics, 17: 55, doi: 10.1186/s12864-015-2194-9, 20 p.
Da Silva R.R., Dorrestein P.C., Quinn R.A. 2015. Illuminating the dark matter in metabolomics. Proceedings of the National Academy of Sciences of the United States of America, 112, 41: 12549-12550
Dai Z., Wong S.H., Yu J., Wei Y. 2019. Batch effects correction for microbiome data with Dirichlet-multinomial regression. Bioinformatics (Oxford, England), 35, 5: 807-814
Danese E., Salvagno G.L., Tarperi C., Negrini D., Montagnana M., Festa L., Sanchis-Gomar F., Schena F., Lippi G. 2017. Middle-distance running acutely influences the concentration and composition of serum bile acids: Potential implications for cancer risk? Oncotarget, 8, 32: 52775-52782
Debevec T., Bali T.C., Simpson E.J., Macdonald I.A., Eiken O., Mekjavic I.B. 2014. Separate and combined effects of 21-day bed rest and hypoxic confinement on body composition. European Journal of Applied Physiology, 114, 11: 2411-2425
Debevec T., Millet G.P., Pialoux V. 2017. Hypoxia-induced oxidative stress modulation with physical activity. Frontiers in Physiology, 8: 84, doi: 10.3389/fphys.2017.00084, 9 p.
Debevec T., Pialoux V., Millet G.P., Martin A., Mramor M., Osredkar D. 2019. Exercise overrides blunted hypoxic ventilatory response in prematurely born men. Frontiers in Physiology, 10: 437, doi: 10.3389/fphys.2019.00437, 10 p.
Debevec T., Poussel M., Osredkar D., Willis S.J., Sartori C., Millet G.P. 2022. Post-exercise accumulation of interstitial lung water is greater in hypobaric than normobaric hypoxia in adults born prematurely. Respiratory Physiology & Neurobiology, 297: 103828, doi: 10.1016/j.resp.2021.103828, 4 p.
Deo R.C. 2015. Machine learning in medicine. Circulation, 132, 20: 1920-1930
Derrien M., Van Passel M.W., Van De Bovenkamp J.H., Schipper R.G., De Vos W.M., Dekker J. 2010. Mucin-bacterial interactions in the human oral cavity and digestive tract. Gut Microbes, 1, 4: 254-268
Deutsch L., Osredkar D., Plavec J., Stres B. 2021. Spinal muscular atrophy after nusinersen therapy: improved physiology in pediatric patients with no significant change in urine, serum, and liquor 1H-NMR metabolomes in comparison to an age-matched, healthy cohort. Metabolites, 11, 4: 206, doi: 10.3390/metabo11040206, 15 p.
Deutsch L., Stres B. 2021. The importance of objective stool classification in fecal 1H-NMR metabolomics: exponential increase in stool crosslinking is mirrored in systemic inflammation and associated to fecal acetate and methionine. Metabolites, 11, 3: 172, doi: 10.3390/metabo11030172, 16 p.
Deutsch L., Sotiridis A., Murovec B., Plavec J., Mekjavic I., Debevec T., Stres B. 2022a. Exercise and interorgan communication: short-term exercise training blunts differences in consecutive daily urine 1H-NMR metabolomic signatures between physically active and inactive individuals. Metabolites, 12, 6: 473, doi: 10.3390/metabo12060473, 16 p.
Deutsch L., Debevec T., Millet G.P., Osredkar D., Opara S., Šket R., Murovec B., Mramor M., Plavec J., Stres B. 2022b. Urine and fecal 1H-NMR metabolomes differ significantly between pre-term and full-term born physically fit healthy adults. Metabolites, 12, 6: 536, doi: 10.3390/metabo12060536, 23 p.
Dhariwal A., Chong J., Habib S., King I.L., Agellon L.B., Xia J. 2017. MicrobiomeAnalyst: a web-based tool for comprehensive statistical, visual and meta-analysis of microbiome data. Nucleic Acids Research, 45, W1: W180-W188
Di Liegro C.M., Schiera G., Proia P., Di Liegro I. 2019. Physical activity and brain health. Genes, 10, 9: 720, doi: 10.3390/genes10090720, 40 p.
Ding X., Yang F., Chen Y., Xu J., He J., Zhang R., Abliz Z. 2022. Norm ISWSVR: A data integration and normalization approach for large-scale metabolomics. Analytical Chemistry, 78, 13: 4281-4290
Dona A.C., Kyriakides M., Scott F., Shephard E.A., Varshavi D., Veselkov K., Everett J.R. 2016. A guide to the identification of metabolites in NMR-based metabonomics/metabolomics experiments. Computational and Structural Biotechnology Journal, 14: 135-153
Douglas G.M., Maffei V.J., Zaneveld J.R., Yurgel S.N., Brown J.R., Taylor C.M., Huttenhower C., Langille M.G.I. 2020. PICRUSt2 for prediction of metagenome functions. Nature Biotechnology, 38, 6: 685-688
Dumas M.E., Kinross J., Nicholson J.K. 2014. Metabolic phenotyping and systems biology approaches to understanding metabolic syndrome and fatty liver disease. Gastroenterology, 146, 1: 46-62
Dunstan D.W., Dogra S., Carter S.E., Owen N. 2021. Sit less and move more for cardiovascular health: emerging insights and opportunities. Nature Reviews. Cardiology, 18, 9: 637-648
Ebbels T.M.D., Lindon J.C., Coen M. 2013. Processing and modelling of nuclear magnetic resonance (NMR). In: Metabolic profiling. Metz T.O. (ed). London, Humana Press: 365-388
Ehrlein H.J., Schemann M. 2005. Gastrointestinal motility. Technische Universität München, 26 p.
Ekins S., Puhl A.C., Zorn K.M., Lane T.R., Russo D.P., Klein J.J., Hickey A.J., Clark A.M. 2019. Exploiting machine learning for end-to-end drug discovery and development. Nature Materials, 18, 5: 435-441
Elliott P., Posma J.M., Chan Q., Garcia-Perez I., Wijeyesekera A., Bictash M., Ebbels T.M., Ueshima H., Zhao L., Van Horn L., Daviglus M., Stamler J., Holmes E., Nicholson J.K. 2015. Urinary metabolic signatures of human adiposity. Science Translational Medicine, 7, 285: 285ra62, doi: 10.1126/scitranslmed.aaa5680, 16 p.
Emwas A.H. 2015. The strengths and weaknesses of NMR spectroscopy and mass spectrometry with particular focus on metabolomics research. Methods in Molecular Biology (Clifton, N.J.), 1277: 161-193
Emwas A.H., Roy R., Mckay R.T., Ryan D., Brennan L., Tenori L., Luchinat C., Gao X., Zeri A.C., Gowda G.A., Raftery D., Steinbeck C., Salek R.M., Wishart D.S. 2016. Recommendations and standardization of biomarker quantification using NMR-based metabolomics with particular focus on urinary analysis. Journal of Proteome Research, 15, 2: 360-373
Emwas A.H., Saccenti E., Gao X., Mckay R.T., Dos Santos V.A.P.M., Roy R., Wishart D.S. 2018. Recommended strategies for spectral processing and post-processing of 1D 1H-NMR data of biofluids with a particular focus on urine. Metabolomics: Official Journal of the Metabolomic Society, 14, 3: 31, doi: 10.1007/s11306-018-1321-4, 23 p.
Emwas A.H., Roy R., Mckay R.T., Tenori L., Saccenti E., Gowda G.A.N., Raftery D., Alahmari F., Jaremko L., Jaremko M., Wishart D.S. 2019. NMR spectroscopy for metabolomics research. Metabolites, 9, 7: 123, doi: 10.3390/metabo9070123, 39 p.
Faber J.E., Szymeczek C.L., Cotecchia S., Thomas S.A., Tanoue A., Tsujimoto G., Zhang H. 2007. Alpha1-adrenoceptor-dependent vascular hypertrophy and remodeling in murine hypoxic pulmonary hypertension. American Journal of Physiology.
Heart and Circulatory Physiology, 292, 5: H2316-H2323
Fang C., Zhong H., Lin Y., Chen B., Han M., Ren H., Lu H., Luber J.M., Xia M., Li W., Stein S., Xu X., Zhang W., Drmanac R., Wang J., Yang H., Hammarström L., Kostic A.D., Kristiansen K., Li J. 2018. Assessment of the cPAS-based BGISEQ-500 platform for metagenomic sequencing. GigaScience, 7, 3: gix133, doi: 10.1093/gigascience/gix133, 8 p.
Fanos V. 2016. Metabolomics and microbiomics: personalized medicine from the fetus to the adult. London, Academic Press: 144 p.
Farrell E.T., Bates M.L., Pegelow D.F., Palta M., Eickhoff J.C., O'brien M.J., Eldridge M.W. 2015. Pulmonary gas exchange and exercise capacity in adults born preterm. Annals of the American Thoracic Society, 12, 8: 1130-1137
Feurer M., Eggensperger K., Falkner S., Lindauer M., Hutter F. 2021. Auto-Sklearn 2.0: hands-free AutoML via meta-Learning. ArXiv, doi: 10.48550/arXiv.2007.04074, 56 p.
Filippone M., Bonetto G., Corradi M., Frigo A.C., Baraldi E. 2012. Evidence of unexpected oxidative stress in airways of adolescents born very pre-term. The European Respiratory Journal, 40, 5: 1253-1259
Franconi F., Stendardi I., Failli P., Matucci R., Baccaro C., Montorsi L., Bandinelli R., Giotti A. 1985. The protective effects of taurine on hypoxia (performed in the absence of glucose) and on reoxygenation (in the presence of glucose) in guinea-pig heart. Biochemical Pharmacology, 34, 15: 2611-2615
Frank D.N., St Amand A.L., Feldman R.A., Boedeker E.C., Harpaz N., Pace N.R. 2007. Molecular-phylogenetic characterization of microbial community imbalances in human inflammatory bowel diseases. Proceedings of the National Academy of Sciences of the United States of America, 104, 34: 13780-13785
Fricker A.M., Podlesny D., Fricke W.F. 2019. What is new and relevant for sequencing-based microbiome research? A mini-review. Journal of Advanced Research, 19: 105-112
Fukiya S., Arata M., Kawashima H., Yoshida D., Kaneko M., Minamida K., Watanabe J., Ogura Y., Uchida K., Itoh K., Wada M., Ito S., Yokota A. 2009. Conversion of cholic acid and chenodeoxycholic acid into their 7-oxo derivatives by Bacteroides intestinalis AM-1 isolated from human feces. FEMS Microbiology Letters, 293, 2: 263-270
Gallo Cantafio M.E., Grillone K., Caracciolo D., Scionti F., Arbitrio M., Barbieri V., Pensabene L., Guzzi P.H., Di Martino M.T. 2018. From single level analysis to multi-omics integrative approaches: a powerful strategy towards the precision oncology. High-throughput, 7, 4: 33, doi: 10.3390/ht7040033, 20 p.
Garud N.R., Good B.H., Hallatschek O., Pollard K.S. 2019. Evolutionary dynamics of bacteria in the gut microbiome within and across hosts. PLoS Biology, 17, 1: e3000102, doi: 10.1371/journal.pbio.3000102, 29 p.
Gibbons S.M., Duvallet C., Alm E.J. 2018. Correcting for batch effects in case-control microbiome studies. PLoS Computational Biology, 14, 4: e1006102, doi: 10.1371/journal.pcbi.1006102, 17 p.
Giongo A., Gano K.A., Crabb D.B., Mukherjee N., Novelo L.L., Casella G., Drew J.C., Ilonen J., Knip M., Hyoty H., Veijola R., Simell T., Simell O., Neu J., Wasserfall C.H., Schatz D., Atkinson M.A., Triplett E.W. 2011. Toward defining the autoimmune microbiome for type 1 diabetes. The ISME Journal, 5, 1: 82-91
Giraudeau P., Silvestre V., Akoka S. 2015. Optimizing water suppression for quantitative NMR-based metabolomics: a tutorial review. Metabolomics, 11, 5: 1041-1055
Glanzman A.M., Mazzone E., Main M., Pelliccioni M., Wood J., Swoboda K.J., Scott C., Pane M., Messina S., Bertini E., Mercuri E., Finkel R.S. 2010. The Children's Hospital of Philadelphia infant test of neuromuscular disorders (CHOP INTEND): test development and reliability. Neuromuscular Disorders: NMD, 20, 3: 155-161
Glover L.E., Lee J.S., Colgan S.P. 2016. Oxygen metabolism and barrier regulation in the intestinal mucosa. The Journal of Clinical Investigation, 126, 10: 3680-3688
Graham E.D., Heidelberg J.F., Tully B.J. 2017. BinSanity: unsupervised clustering of environmental microbial assemblies using coverage and affinity propagation. PeerJ, 5: e3035, doi: 10.7717/peerj.3035, 19 p.
Grillet A.M., Wyatt N.B., Gloe L.M. 2012. Polymer gel rheology and adhesion. In: Rheology. De Vicente J. (ed). London, IntechOpen: 59-80
Hammer O., Harper D.A.T., Ryan P.D. 2001. PAST: Paleontological statistics software package for education and data analysis. Palaeontologia Electronica, 1, 9: 4, doi: http://palaeo-electronica.org/2001_1/past/issue1_01.htm, 9 p.
Han M., Hao L., Lin Y., Li F., Wang J., Yang H., Xiao L., Kristiansen K., Jia H., Li J. 2018. A novel affordable reagent for room temperature storage and transport of fecal samples for metagenomic analyses. Microbiome, 6, 1: 43, doi: 10.1186/s40168-018-0429-0, 7 p.
Haraldsdottir K., Watson A.M., Beshish A.G., Pegelow D.F., Palta M., Tetri L.H., Brix M.D., Centanni R.M., Goss K.N., Eldridge M.W. 2019. Heart rate recovery after maximal exercise is impaired in healthy young adults born preterm. European Journal of Applied Physiology, 119, 4: 857-866
Hasin Y., Seldin M., Lusis A. 2017. Multi-omics approaches to disease. Genome Biology, 18, 1: 83, doi: 10.1186/s13059-017-1215-1, 15 p.
Heaton K.W., Radvan J., Cripps H., Mountford R.A., Braddon F.E.M., Hughes A.O. 1992. Defecation frequency and timing, and stool form in the general population: a prospective study. Gut, 33, 6: 818-824
Heinken A., Ravcheev D.A., Baldini F., Heirendt L., Fleming R.M.T., Thiele I. 2019.
Systematic assessment of secondary bile acid metabolism in gut microbes reveals distinct metabolic capabilities in inflammatory bowel disease. Microbiome, 7, 1: 75, doi: 10.1186/s40168-019-0689-3, 18 p.
Hoff P., Belavý D.L., Huscher D., Lang A., Hahne M., Kuhlmey A.K., Maschmeyer P., Armbrecht G., Fitzner R., Perschel F.H., Gaber T., Burmester G.R., Straub R.H., Felsenberg D., Buttgereit F. 2015. Effects of 60-day bed rest with and without exercise on cellular and humoral immunological parameters. Cellular & Molecular Immunology, 12, 4: 483-492
Holmes E., Loo R.L., Stamler J., Bictash M., Yap I.K., Chan Q., Ebbels T., De Iorio M., Brown I.J., Veselkov K.A., Daviglus M.L., Kesteloot H., Ueshima H., Zhao L., Nicholson J.K., Elliott P. 2008a. Human metabolic phenotype diversity and its association with diet and blood pressure. Nature, 453, 7193: 396-400
Holmes E., Wilson I.D., Nicholson J.K. 2008b. Metabolic phenotyping in health and disease. Cell, 134, 5: 714-717
Hood L., Friend S.H. 2011. Predictive, personalized, preventive, participatory (P4) cancer medicine. Nature Reviews. Clinical Oncology, 8, 3: 184-187
Hood L., Flores M. 2012. A personal view on systems medicine and the emergence of proactive P4 medicine: predictive, preventive, personalized and participatory. New Biotechnology, 29, 6: 613-624
Hunter P., Nielsen P. 2005. A strategy for integrative computational physiology. Physiology (Bethesda, Md.), 20: 316-325
Hutter F., Kotthoff L., Vanschoren J. 2019. Automated machine learning. Cham, Switzerland, Springer: 219 p.
Jain C., Rodriguez-R L.M., Phillippy A.M., Konstantinidis K.T., Aluru S. 2018. High throughput ANI analysis of 90K prokaryotic genomes reveals clear species boundaries. Nature Communications, 9, 1: 5114, doi: 10.1038/s41467-018-07641-9, 8 p.
Jay O., Bain A.R., Deren T.M., Sacheli M., Cramer M.N. 2011. Large differences in peak oxygen uptake do not independently alter changes in core temperature and sweating during exercise. American Journal of Physiology. Regulatory, Integrative and Comparative Physiology, 301, 3: R832-R841
Johnson L.R., Ghishan F.K., Kaunitz J.D., Merchant J.L., Said H.M., Wood J.D. 2012. Physiology of the gastrointestinal tract. London, Academic Press: 2308 p.
Johnson W.E., Li C., Rabinovic A. 2007. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics (Oxford, England), 8, 1: 118-127
Karaglani M., Gourlia K., Tsamardinos I., Chatzaki E. 2020. Accurate blood-based diagnostic biosignatures for Alzheimer's disease via automated machine learning. Journal of Clinical Medicine, 9, 9: 3016, doi: 10.3390/jcm9093016, 14 p.
Kelly R.S., Kelly M.P., Kelly P. 2020a. Metabolomics, physical activity, exercise and health: A review of the current evidence. Biochimica et Biophysica Acta (BBA) Molecular Basis of Disease, 1866, 12: 165936, doi: 10.1016/j.bbadis.2020.165936, 17 p.
Kerkhof G.F., Breukhoven P.E., Leunissen R.W., Willemsen R.H., Hokken-Koelega A.C. 2012. Does preterm birth influence cardiovascular risk in early adulthood? The Journal of Pediatrics, 161, 3: 390-396
Keun H.C., Athersuch T.J. 2022. Nuclear magnetic resonance (NMR) - based metabolomics. In: Metabolic Profiling. Metz T.O. (ed). London, Humana Press: 321-334
Kim K.A., Jeong J.J., Yoo S.Y., Kim D.H. 2016. Gut microbiota lipopolysaccharide accelerates inflamm-aging in mice. BMC Microbiology, 16: 9, doi: 10.1186/s12866-016-0625-7, 9 p.
Klein M.S. 2021. Affine transformation of negative values for NMR metabolomics using the mrbin R package.
Journal of Proteome Research, 20, 2: 1397-1404
Knight R., Vrbanac A., Taylor B.C., Aksenov A., Callewaert C., Debelius J., Gonzalez A., Kosciolek T., Mccall L.I., Mcdonald D., Melnik A.V., Morton J.T., Navas J., Quinn R.A., Sanders J.G., Swafford A.D., Thompson L.R., Tripathi A., Xu Z.Z., Zaneveld J.R., Zhu Q., Caporaso J.G., Dorrestein P.C. 2018. Best practices for analysing microbiomes. Nature Reviews. Microbiology, 16, 7: 410-422
Koeth R.A., Wang Z., Levison B.S., Buffa J.A., Org E., Sheehy B.T., Britt E.B., Fu X., Wu Y., Li L., Smith J.D., Didonato J.A., Chen J., Li H., Wu G.D., Lewis J.D., Warrier M., Brown J.M., Krauss R.M., Tang W.H., Bushman F.D., Lusis A.J., Hazen S.L. 2013. Intestinal microbiota metabolism of L-carnitine, a nutrient in red meat, promotes atherosclerosis. Nature Medicine, 19, 5: 576-585
Kohl H.W., Craig C.L., Lambert E.V., Inoue S., Alkandari J.R., Leetongin G., Kahlmeier S., Group L.P.A.S.W. 2012. The pandemic of physical inactivity: global action for public health. Lancet (London, England), 380, 9838: 294-305
Kohl P., Crampin E.J., Quinn T.A., Noble D. 2010. Systems biology: an approach. Clinical Pharmacology and Therapeutics, 88, 1: 25-33
Kostic A.D., Gevers D., Pedamallu C.S., Michaud M., Duke F., Earl A.M., Ojesina A.I., Jung J., Bass A.J., Tabernero J., Baselga J., Liu C., Shivdasani R.A., Ogino S., Birren B.W., Huttenhower C., Garrett W.S., Meyerson M. 2012. Genomic analysis identifies association of Fusobacterium with colorectal carcinoma. Genome Research, 22, 2: 292-298
Kotthoff L., Thornton C., Hoos H.H., Hutter F., Leyton-Brown K. 2017. Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA. Journal of Machine Learning Research, 18, 25: 1-5
Kraut J.A., Madias N.E. 2016. Metabolic acidosis of CKD: an update. American Journal of Kidney Diseases: The Official Journal of the National Kidney Foundation, 67, 2: 307-317
Kurtzer G.M., Sochat V., Bauer M.W. 2017. Singularity: scientific containers for mobility of compute. PloS One, 12, 5: e0177459, doi: 10.1371/journal.pone.0177459, 20 p.
La Rosa S.L., Leth M.L., Michalak L., Hansen M.E., Pudlo N.A., Glowacki R., Pereira G., Workman C.T., Arntzen M.Ø., Pope P.B., Martens E.C., Hachem M.A., Westereng B. 2019. The human gut Firmicute Roseburia intestinalis is a primary degrader of dietary β-mannans. Nature Communications, 10, 1: 905, doi: 10.1038/s41467-019-08812-y, 14 p.
Lakrisenko P., Weindl D. 2021. Dynamic models for metabolomics data integration. Current Opinion in Systems Biology, 28: 100358, doi: 10.1016/j.coisb.2021.100358, 7 p.
Langille M.G., Zaneveld J., Caporaso J.G., Mcdonald D., Knights D., Reyes J.A., Clemente J.C., Burkepile D.E., Vega Thurber R.L., Knight R., Beiko R.G., Huttenhower C. 2013. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature Biotechnology, 31, 9: 814-821
Lapidus A.L., Korobeynikov A.I. 2021. Metagenomic data assembly - the way of decoding unknown microorganisms. Frontiers in Microbiology, 12: 613791, doi: 10.3389/fmicb.2021.613791, 16 p.
Lee E.C., Fragala M.S., Kavouras S.A., Queen R.M., Pryor J.L., Casa D.J. 2017. Biomarkers in sports and exercise: tracking health, performance, and recovery in athletes. Journal of Strength and Conditioning Research, 31, 10: 2920-2937
Lefebvre S., Bürglen L., Reboullet S., Clermont O., Burlet P., Viollet L., Benichou B., Cruaud C., Millasseau P., Zeviani M. 1995. Identification and characterization of a spinal muscular atrophy-determining gene. Cell, 80, 1: 155-165
Legendre P., Legendre L.F.J. 2012. Numerical Ecology. Amsterdam, Elsevier: 1006 p.
Lent-Schochet D., Mclaughlin M., Ramakrishnan N., Jialal I. 2019. Exploratory metabolomics of metabolic syndrome: A status report. World Journal of Diabetes, 10, 1: 23-36
Lewis S.J., Heaton K.W. 1997. Stool form scale as a useful guide to intestinal transit time. Scandinavian Journal of Gastroenterology, 32, 9: 920-924
Li D., Liu C.M., Luo R., Sadakane K., Lam T.W. 2015. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics (Oxford, England), 31, 10: 1674-1676
Li S., Ung T.T., Nguyen T.T., Sah D.K., Park S.Y., Jung Y.D. 2020. Cholic acid stimulates MMP-9 in human colon cancer cells via activation of MAPK, AP-1, and NF-κB activity. International Journal of Molecular Sciences, 21, 10: 3420, doi: 10.3390/ijms21103420, 16 p.
Lin W., Djukovic A., Mathur D., Xavier J.B. 2021. Listening in on the conversation between the human gut microbiome and its host. Current Opinion in Microbiology, 63: 150-157
Lindon J.C., Holmes E., Nicholson J.K. 2003. So what's the deal with metabonomics? Analytical Chemistry, 75, 17: 384a-391a
Lindstad L.J., Lo G., Leivers S., Lu Z., Michalak L., Pereira G.V., Røhr Å.K., Martens E.C., Mckee L.S., Louis P., Duncan S.H., Westereng B., Pope P.B., La Rosa S.L. 2021. Human gut Faecalibacterium prausnitzii deploys a highly efficient conserved system to cross-feed on β-mannan-derived oligosaccharides. MBio, 12, 3: e0362820, doi: 10.1128/mBio.03628-20, 18 p.
Ling W., Zhao N., Lulla A., Plantinga A.M., Fu W., Zhang A., Liu H., Li Z., Chen J., Randolph T., Koay W.L.A., White J.R., Launer L.J., Fodor A.A., Meyer K.A., Wu M.C. 2021. Batch effects removal for microbiome data via conditional quantile regression (ConQuR). BioRxiv, doi: 10.1101/2021.09.23.461592, 29 p.
Littlewood-Evans A., Sarret S., Apfel V., Loesle P., Dawson J., Zhang J., Muller A., Tigani B., Kneuer R., Patel S., Valeaux S., Gommermann N., Rubic-Schneider T., Junt T., Carballido J.M. 2016. GPR91 senses extracellular succinate released from inflammatory macrophages and exacerbates rheumatoid arthritis. The Journal of Experimental Medicine, 213, 9: 1655-1662 Liu L., Oza S., Hogan D., Perin J., Rudan I., Lawn J.E., Cousens S., Mathers C., Black R.E. 2015. Global, regional, and national causes of child mortality in 2000-13, with projections to inform post-2015 priorities: an updated systematic analysis. Lancet (London, England), 385, 9966: 430-440 Lorson C.L., Androphy E.J. 2000. An exonic enhancer is required for inclusion of an essential exon in the SMA-determining gene SMN. Human Molecular Genetics, 9, 2: 259-265 Lovering A.T., Laurie S.S., Elliott J.E., Beasley K.M., Yang X., Gust C.E., Mangum T.S., Goodman R.D., Hawn J.A., Gladstone I.M. 2013. Normal pulmonary gas exchange efficiency and absence of exercise-induced arterial hypoxemia in adults with bronchopulmonary dysplasia. Journal of Applied Physiology (Bethesda, Md.: 1985), 115, 7: 1050-1056 Lunn M.R., Wang C.H. 2008. Spinal muscular atrophy. Lancet (London, England), 371, 9630: 2120-2133 Lushchak V.I. 2014. Free radicals, reactive oxygen species, oxidative stress and its classification. Chemico-Biological Interactions, 224: 164-175 Lustgarten M.S., Price L.L., Chalé A., Fielding R.A. 2014. Metabolites related to gut bacterial metabolism, peroxisome proliferator-activated receptor-alpha activation, and insulin sensitivity are associated with physical function in functionally-limited older adults. Aging Cell, 13, 5: 918-925 Ma S., Tong M., Yuan S., Liu H. 2019. Responses of the microbial community structure in Fe (II)- bearing sediments to oxygenation: the role of reactive oxygen species. ACS Earth and Space Chemistry, 3, 5: 738-747 Maalouf N.M., Cameron M.A., Moe O.W., Adams-Huet B., Sakhaee K. 2007. 
Low urine pH: a novel feature of the metabolic syndrome. Clinical Journal of the American Society of Nephrology: CJASN, 2, 5: 883-888 Madrid-Gambin F., Oller-Moreno S., Fernandez L., Bartova S., Giner M.P., Joyce C., Ferraro F., Montoliu I., Moco S., Marco S. 2020. AlpsNMR: An R package for signal processing of fully untargeted NMR-based metabolomics. Bioinformatics (Oxford, England), 36, 9: 2943-2945 Magalhães J., Ascensão A., Viscor G., Soares J., Oliveira J., Marques F., Duarte J. 2004. Oxidative stress in humans during and after 4 hours of hypoxia at a simulated altitude of 5500 m. Aviation, Space, and Environmental Medicine, 75, 1: 16-22 Maguire M.L. 2014. An introduction to metabolomics and systems biology. In: Metabolomics and systems biology in human health and medicine. Oliver J. A. H. (ed.). Oxfordshire, United Kingdom, Cabi: 1-19 Malcangio M., Bartolini A., Ghelardini C., Bennardini F., Malmberg-Aiello P., Franconi F., Giotti A. 1989. Effect of ICV taurine on the impairment of learning, convulsions and death caused by hypoxia. Psychopharmacology, 98, 3: 316-320 Mallick H., Franzosa E.A., Mciver L.J., Banerjee S., Sirota-Madi A., Kostic A.D., Clish C.B., Vlamakis H., Xavier R.J., Huttenhower C. 2019. Predictive metabolomic profiling of microbial communities using amplicon or metagenomic sequences. Nature Communications, 10: 3136, doi: 10.1038/s41467-019-10927-1, 11 p. Manley B.J., Doyle L.W., Davies M.W., Davis P.G. 2015. Fifty years in neonatology. Journal of Paediatrics and Child Health, 51, 1: 118-121 Marin L., Miguelez E.M., Villar C.J., Lombo F. 2015. Bioavailability of dietary polyphenols and gut microbiota metabolism: antimicrobial properties. BioMed Research International: 905215, doi: 10.1155/2015/905215, 18 p. 
Markopoulou P., Papanikolaou E., Analytis A., Zoumakis E., Siahanidou T. 2019. Preterm birth as a risk factor for metabolic syndrome and cardiovascular disease in adult life: a systematic review and meta-analysis. The Journal of Pediatrics, 210: 69-80 Martin A., Faes C., Debevec T., Rytz C., Millet G., Pialoux V. 2018. Preterm birth and oxidative stress: Effects of acute physical exercise and hypoxia physiological responses. Redox Biology, 17: 315-322 Martin A., Millet G., Osredkar D., Mramor M., Faes C., Gouraud E., Debevec T., Pialoux V. 2020. Effect of pre-term birth on oxidative stress responses to normoxic and hypoxic exercise. Redox Biology, 32: 101497, doi: 10.1016/j.redox.2020.101497, 7 p. Martínez Y., Li X., Liu G., Bin P., Yan W., Más D., Valdivié M., Hu C.A., Ren W., Yin Y. 2017. The role of methionine on metabolism, oxidative stress, and diseases. Amino Acids, 49, 12: 2091-2098 Matsuda K., Akiyama T., Tsujibe S., Oki K., Gawad A., Fujimoto J. 2021. Direct measurement of stool consistency by texture analyzer and calculation of reference value in Belgian general population. Scientific Reports, 11, 1: 2400, doi: 10.1038/s41598-021-81783-7, 7 p. Maurer A., Ward J.L., Dean K., Billinger S.A., Lin H., Mercer K.E., Adams S.H., Thyfault J.P. 2020. Divergence in aerobic capacity impacts bile acid metabolism in young women. Journal of Applied Physiology (Bethesda, Md.: 1985), 129, 4: 768-778 Mciver L.J., Abu-Ali G., Franzosa E.A., Schwager R., Morgan X.C., Waldron L., Segata N., Huttenhower C. 2018. bioBakery: a meta'omic analysis environment. Bioinformatics, 34, 7: 1235-1237 Melki J. 2017. Advances in Spinal Muscular Atrophy Research. In: Spinal muscular atrophy - Disease mechanisms and therapy. Sumner C.J., Paushkin S., Ko C. (eds). 
London, Academic Press: xxiii - xxiv Mercer K.E., Maurer A., Pack L.M., Ono-Moore K., Spray B.J., Campbell C., Chandler C.J., Burnett D., Souza E., Casazza G., Keim N., Newman J., Hunter G., Fernandez J., Garvey W.T., Harper M.E., Hoppel C., Adams S.H., Thyfault J. 2021. Exercise training and diet-induced weight loss increase markers of hepatic bile acid (BA) synthesis and reduce serum total BA concentrations in obese women. American Journal of Physiology. Endocrinology and Metabolism, 320, 5: E864-E873 Michalk D.V., Wingenfeld P., Licht C. 1997. Protection against cell damage due to hypoxia and reoxygenation: the role of taurine and the involved mechanisms. Amino Acids, 13, 3-4: 337-346 Mizeranschi A., Groen D., Borgdorff J., Hoekstra A.G., Chopard B., Dubitzky W. 2016. Anatomy and physiology of multiscale modeling and simulation in systems medicine. In: Systems medicine. Methods in molecular biology. Schmitz U. (ed). New York, USA, Humana Press: 375-405 Montero D., Lundby C. 2017. Refuting the myth of non-response to exercise training: 'non-responders' do respond to higher dose of training. The Journal of Physiology, 595, 11: 3377-3387 Moreno-Indias I., Lahti L., Nedyalkova M., Elbere I., Roshchupkin G., Adilovic M., Aydemir O., Bakir-Gungor B., Santa Pau E.C., D'elia D., Desai M.S., Falquet L., Gundogdu A., Hron K., Klammsteiner T., Lopes M.B., Marcos-Zambrano L.J., Marques C., Mason M., May P., Pašić L., Pio G., Pongor S., Promponas V.J., Przymus P., Saez-Rodriguez J., Sampri A., Shigdel R., Stres B., Suharoschi R., Truu J., Truică C.O., Vilne B., Vlachakis D., Yilmaz E., Zeller G., Zomer A.L., Gómez-Cabrero D., Claesson M.J. 2021. Statistical and machine learning techniques in human microbiome studies: contemporary challenges and solutions. 
Frontiers in Microbiology, 12: 635781, doi: 10.3389/fmicb.2021.635781, 9 p. Moutquin J.M. 2003. Classification and heterogeneity of preterm birth. BJOG: An International Journal of Obstetrics and Gynaecology, 110, 20: 30-33 Murovec B., Deutsch L., Stres B. 2020. Computational framework for high-quality production and large-scale evolutionary analysis of metagenome assembled genomes. Molecular Biology and Evolution, 37, 2: 593-598 Murovec B., Makuc D., Kolbl Repinc S., Prevorsek Z., Zavec D., Sket R., Pecnik K., Plavec J., Stres B. 2018. (1)H NMR metabolomics of microbial metabolites in the four MW agricultural biogas plant reactors: A case study of inhibition mirroring the acute rumen acidosis symptoms. Journal of Environmental Management, 222: 428-435 Mussap M., Noto A., Piras C., Atzori L., Fanos V. 2021. Slotting metabolomics into routine precision medicine. Expert Review of Precision Medicine and Drug Development, 6, 3: 173-187 Mustafa A., Rahimi Azghadi M. 2021. Automated machine learning for healthcare and clinical notes analysis. Computers, 10, 2: 24, doi: 10.3390/computers10020024, 31 p. Mysara M., Vandamme P., Props R., Kerckhof F.M., Leys N., Boon N., Raes J., Monsieurs P. 2017. Reconciliation between operational taxonomic units and species boundaries. FEMS Microbiology Ecology, 93, 4: fix029, doi: 10.1093/femsec/fix029, 12 p. Müller M., Hernández M.A.G., Goossens G.H., Reijnders D., Holst J.J., Jocken J.W.E., Van Eijk H., Canfora E.E., Blaak E.E. 2019. Circulating but not faecal short-chain fatty acids are related to insulin sensitivity, lipolysis and GLP-1 concentrations in humans. Scientific Reports, 9: 12515, doi: 10.1038/s41598-019-48775-0, 9 p. Narayan N.R., Weinmaier T., Laserna-Mendieta E.J., Claesson M.J., Shanahan F., Dabbagh K., Iwai S., Desantis T.Z. 2020. Piphillin predicts metagenomic composition and dynamics from DADA2-corrected 16S rDNA sequences. BMC Genomics, 21, 1: 56, doi: 10.1186/s12864-019-6427-1, 12 p. 
Nayfach S., Pollard K.S. 2016. Toward accurate and quantitative comparative metagenomics. Cell, 166, 5: 1103-1116 Nayfach S., Shi Z.J., Seshadri R., Pollard K.S., Kyrpides N.C. 2019. New insights from uncultivated genomes of the global human gut microbiome. Nature, 568, 7753: 505-510 Nieman D.C., Shanely R.A., Gillitt N.D., Pappan K.L., Lila M.A. 2013. Serum metabolic signatures induced by a three-day intensified exercise period persist after 14 h of recovery in runners. Journal of Proteome Research, 12, 10: 4577-4584 Noble D. 2002. Modeling the heart--from genes to cells to the whole organ. Science (New York, N.Y.), 295, 5560: 1678-1682 Nurk S., Meleshko D., Korobeynikov A., Pevzner P.A. 2017. metaSPAdes: a new versatile metagenomic assembler. Genome Research, 27, 5: 824-834 Osredkar D., Jílková M., Butenko T., Loboda T., Golli T., Fuchsová P., Rohlenová M., Haberlova J. 2021. Children and young adults with spinal muscular atrophy treated with nusinersen. European Journal of Paediatric Neurology, 30: 1-8 Otaki Y., Watanabe T., Takahashi H., Hasegawa H., Honda S., Funayama A., Netsu S., Ishino M., Arimoto T., Shishido T., Miyashita T., Miyamoto T., Konta T., Kubota I. 2013. Acidic urine is associated with poor prognosis in patients with chronic heart failure. Heart and Vessels, 28, 6: 735-741 Page A.J., Cummins C.A., Hunt M., Wong V.K., Reuter S., Holden M.T., Fookes M., Falush D., Keane J.A., Parkhill J. 2015. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics (Oxford, England), 31, 22: 3691-3693 Pang Z., Chong J., Zhou G., De Lima Morais D.A., Chang L., Barrette M., Gauthier C., Jacques P.É., Li S., Xia J. 2021. MetaboAnalyst 5.0: narrowing the gap between raw spectra and functional insights. 
Nucleic Acids Research, 49, W1: W388-W396 Paradis A.N., Gay M.S., Wilson C.G., Zhang L. 2015. Newborn hypoxia/anoxia inhibits cardiomyocyte proliferation and decreases cardiomyocyte endowment in the developing heart: role of endothelin-1. PloS One, 10, 2: e0116600, doi: 10.1371/journal.pone.0116600, 21 p. Parks D.H., Imelfort M., Skennerton C.T., Hugenholtz P., Tyson G.W. 2015. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Research, 25, 7: 1043-1055 Patterson J., Carpenter E.J., Zhu Z., An D., Liang X., Geng C., Drmanac R., Wong G.K. 2019. Impact of sequencing depth and technology on de novo RNA-Seq assembly. BMC Genomics, 20, 1: 604, doi: 10.1186/s12864-019-5965-x, 14 p. Peisl B.Y.L., Schymanski E.L., Wilmes P. 2018. Dark matter in host-microbiome metabolomics: Tackling the unknowns-a review. Analytica Chimica Acta, 1037: 13-27 Peng Y., Leung H.C., Yiu S.M., Chin F.Y. 2012. IDBA-UD: a de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics (Oxford, England), 28, 11: 1420-1428 Pera M.C., Coratti G., Forcina N., Mazzone E.S., Scoto M., Montes J., Pasternak A., Mayhew A., Messina S., Sframeli M., Main M., Lofra R.M., Duong T., Ramsey D., Dunaway S., Salazar R., Fanelli L., Civitello M., De Sanctis R., Antonaci L., Lapenta L., Lucibello S., Pane M., Day J., Darras B.T., De Vivo D.C., Muntoni F., Finkel R., Mercuri E. 2017. Content validity and clinical meaningfulness of the HFMSE in spinal muscular atrophy. BMC Neurology, 17, 1: 39, doi: 10.1186/s12883-017-0790-9, 10 p. Perrone S., Negro S., Laschi E., Calderisi M., Giordano M., De Bernardo G., Parigi G., Toni A.L., Esposito S., Buonocore G. 2021. Metabolomic profile of young adults born preterm. 
Metabolites, 11, 10: 697, doi: 10.3390/metabo11100697, 11 p. Pialoux V., Mounier R., Rock E., Mazur A., Schmitt L., Richalet J.P., Robach P., Coudert J., Fellmann N. 2009. Effects of acute hypoxic exposure on prooxidant/antioxidant balance in elite endurance athletes. International Journal of Sports Medicine, 30, 2: 87-93 Powers S.K., Nelson W.B., Hudson M.B. 2011. Exercise-induced oxidative stress in humans: cause and consequences. Free Radical Biology & Medicine, 51, 5: 942-950 Price N.D., Magis A.T., Earls J.C., Glusman G., Levy R., Lausted C., Mcdonald D.T., Kusebauch U., Moss C.L., Zhou Y., Qin S., Moritz R.L., Brogaard K., Omenn G.S., Lovejoy J.C., Hood L. 2017. A wellness study of 108 individuals using personal, dense, dynamic data clouds. Nature Biotechnology, 35, 8: 747-756 Pushpass R.G., Alzoufairi S., Jackson K.G., Lovegrove J.A. 2021. Circulating bile acids as a link between the gut microbiota and cardiovascular health: impact of prebiotics, probiotics and polyphenol-rich foods. Nutrition Research Reviews: 1-20 Qin J., Li R., Raes J., Arumugam M., Burgdorf K.S., Manichanh C., Nielsen T., Pons N., Levenez F., Yamada T., Mende D.R., Li J., Xu J., Li S., Li D., Cao J., Wang B., Liang H., Zheng H., Xie Y., Tap J., Lepage P., Bertalan M., Batto J.M., Hansen T., Le Paslier D., Linneberg A., Nielsen H.B., Pelletier E., Renault P., Sicheritz-Ponten T., Turner K., Zhu H., Yu C., Li S., Jian M., Zhou Y., Li Y., Zhang X., Li S., Qin N., Yang H., Wang J., Brunak S., Doré J., Guarner F., Kristiansen K., Pedersen O., Parkhill J., Weissenbach J., Consortium M., Bork P., Ehrlich S.D., Wang, J. 2010. A human gut microbial gene catalogue established by metagenomic sequencing. 
Nature, 464: 59-65 Qin J., Li Y., Cai Z., Li S., Zhu J., Zhang F., Liang S., Zhang W., Guan Y., Shen D., Peng Y., Zhang D., Jie Z., Wu W., Qin Y., Xue W., Li J., Han L., Lu D., Wu P., Dai Y., Sun X., Li Z., Tang A., Zhong S., Li X., Chen W., Xu R., Wang M., Feng Q., Gong M., Yu J., Zhang Y., Zhang M., Hansen T., Sanchez G., Raes J., Falony G., Okuda S., Almeida M., Lechatelier E., Renault P., Pons N., Batto J.M., Zhang Z., Chen H., Yang R., Zheng W., Yang H., Wang J., Ehrlich S.D., Nielsen R., Pedersen O., Kristiansen K. 2012. A metagenome-wide association study of gut microbiota in type 2 diabetes. Nature, 490: 55-60 Qin N., Yang F., Li A., Prifti E., Chen Y., Shao L., Guo J., Le Chatelier E., Yao J., Wu L., Zhou J., Ni S., Liu L., Pons N., Batto J.M., Kennedy S.P., Leonard P., Yuan C., Ding W., Hu X., Zheng B., Qian G., Xu W., Ehrlich S.D., Zheng S., Li L. 2014. Alterations of the human gut microbiome in liver cirrhosis. Nature, 513: 59-64 Qiu S., Cai X., Sun Z., Li L., Zuegel M., Steinacker J.M., Schumann U. 2017. Heart rate recovery and risk of cardiovascular events and all-cause mortality: A meta-analysis of prospective cohort studies. Journal of the American Heart Association, 6, 5: e005505, doi: 10.1161/JAHA.117.005505, 16 p. Quince C., Walker A.W., Simpson J.T., Loman N.J., Segata N. 2017. Shotgun metagenomics, from sampling to analysis. Nature Biotechnology, 35, 12: 1211-1211 Ramdas S., Servais L. 2020. New treatments in spinal muscular atrophy: an overview of currently available data. Expert Opinion on Pharmacotherapy, 21, 3: 307-315 Richter M., Rosselló-Móra R., Oliver Glöckner F., Peplies J. 2016. JSpeciesWS: a web server for prokaryotic species circumscription based on pairwise genome comparison. 
Bioinformatics (Oxford, England), 32, 6: 929-931 Rigo F., Hua Y., Krainer A.R., Bennett C.F. 2012. Antisense-based therapy for the treatment of spinal muscular atrophy. The Journal of Cell Biology, 199, 1: 21-25 Rittweger J., Debevec T., Frings-Meuthen P., Lau P., Mittag U., Ganse B., Ferstl P.G., Simpson E.J., Macdonald I.A., Eiken O., Mekjavic I.B. 2016. On the combined effects of normobaric hypoxia and bed rest upon bone and mineral metabolism: Results from the PlanHab study. Bone, 91: 130-138 Roager H.M., Hansen L.B.S., Bahl M.I., Frandsen H.L., Carvalho V., Gobel R.J., Dalgaard M.D., Plichta D.R., Sparholt M.H., Vestergaard H., Hansen T., Sicheritz-Ponten T., Nielsen H.B., Pedersen O., Lauritzen L., Kristensen M., Gupta R., Licht T.R. 2016. Colonic transit time is related to bacterial metabolism and mucosal turnover in the gut. Nature Microbiology, 1, 9: 16093, doi: 10.1038/nmicrobiol.2016.93, 9 p. Robinson M.M., Dasari S., Konopka A.R., Johnson M.L., Manjunatha S., Esponda R.R., Carter R.E., Lanza I.R., Nair K.S. 2017. Enhanced protein translation underlies improved metabolic and physical adaptations to different exercise training modes in young and old humans. Cell Metabolism, 25, 3: 581-592 Rodriguez-R L.M., Gunturu S., Harvey W.T., Rosselló-Mora R., Tiedje J.M., Cole J.R., Konstantinidis K.T. 2018. The microbial genomes atlas (MiGA) webserver: taxonomic and gene diversity analysis of Archaea and Bacteria at the whole genome level. Nucleic Acids Research, 46, W1: W282-W288 Rohart F., Gautier B., Singh A., Lê Cao K.A. 2017. mixOmics: An R package for ‘omics feature selection and multiple data integration. PLoS Computational Biology, 13, 11: e1005752, doi: 10.1371/journal.pcbi.1005752, 19 p. Rubic T., Lametschwandtner G., Jost S., Hinteregger S., Kund J., Carballido-Perrig N., Schwärzler C., Junt T., Voshol H., Meingassner J.G., Mao X., Werner G., Rot A., Carballido J.M. 2008. Triggering the succinate receptor GPR91 on dendritic cells enhances immunity. 
Nature Immunology, 9, 11: 1261-1269 Ruiz-Perez C.A., Conrad R.E., Konstantinidis K.T. 2021. MicrobeAnnotator: a user-friendly, comprehensive functional annotation pipeline for microbial genomes. BMC Bioinformatics, 22, 1: 11, doi: 10.1186/s12859-020-03940-5, 16 p. Rühlemann M.C., Hermes B.M., Bang C., Doms S., Moitinho-Silva L., Thingholm L.B., Frost F., Degenhardt F., Wittig M., Kässens J., Weiss F.U., Peters A., Neuhaus K., Völker U., Völzke H., Homuth G., Weiss S., Grallert H., Laudes M., Lieb W., Haller D., Lerch M.M., Baines J.F., Franke A. 2021. Genome-wide association study in 8,956 German individuals identifies influence of ABO histo-blood groups on gut microbiome. Nature Genetics, 53, 2: 147-155 Sallis J.F., Bull F., Guthold R., Heath G.W., Inoue S., Kelly P., Oyeyemi A.L., Perez L.G., Richards J., Hallal P.C., Committee L.P.A.S.E. 2016. Progress in physical activity over the olympic quadrennium. Lancet (London, England), 388, 10051: 1325-1336 Samuel G., Reeves P. 2003. Biosynthesis of O-antigens: genes and pathways involved in nucleotide sugar precursor synthesis and O-antigen assembly. Carbohydrate Research, 338, 23: 2503-2519 Scafidi S., Fiskum G., Lindauer S.L., Bamford P., Shi D., Hopkins I., Mckenna M.C. 2010. Metabolism of acetyl-L-carnitine for energy and neurotransmitter synthesis in the immature rat brain. Journal of Neurochemistry, 114, 3: 820-831 Scheer M., Bischoff A.M., Kruzliak P., Opatrilova R., Bovell D., Büsselberg D. 2016. Creatine and creatine pyruvate reduce hypoxia-induced effects on phrenic nerve activity in the juvenile mouse respiratory system. 
Experimental and Molecular Pathology, 101, 1: 157-162 Scheperjans F., Aho V., Pereira P.A., Koskinen K., Paulin L., Pekkonen E., Haapaniemi E., Kaakkola S., Eerola-Rautio J., Pohja M., Kinnunen E., Murros K., Auvinen P. 2015. Gut microbiota are related to Parkinson's disease and clinical phenotype. Movement Disorders: Official Journal of the Movement Disorder Society, 30, 3: 350-358 Scher J.U., Sczesnak A., Longman R.S., Segata N., Ubeda C., Bielski C., Rostron T., Cerundolo V., Pamer E.G., Abramson S.B., Huttenhower C., Littman D.R. 2013. Expansion of intestinal Prevotella copri correlates with enhanced susceptibility to arthritis. Elife, 2: e01202, doi: 10.7554/eLife.01202, 20 p. Schloss P.D. 2021. Amplicon sequence variants artificially split bacterial genomes into separate clusters. MSphere, 6, 4: e0019121, doi: 10.1101/2021.02.26.433139, 6 p. Schloss P.D., Westcott S.L., Ryabin T., Hall J.R., Hartmann M., Hollister E.B., Lesniewski R.A., Oakley B.B., Parks D.H., Robinson C.J., Sahl J.W., Stres B., Thallinger G.G., Van Horn D.J., Weber C.F. 2009. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and Environmental Microbiology, 75, 23: 7537-7541 Schmidt H.H.H.W. 2021. The end of medicine as we know it - and why your health has a future. Springer Cham: 291 p. Schmidt T.S.B., Raes J., Bork P. 2018. The human gut microbiome: from association to modulation. Cell, 172, 6: 1198-1215 Schranner D., Kastenmuller G., Schonfelder M., Romisch-Margl W., Wackerhage H. 2020. Metabolite concentration changes in humans after a bout of exercise: a systematic review of exercise metabolomics studies. Sports Medicine - Open, 6, 1: 11, doi: 10.1186/s40798-020-0238-4, 17 p. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics (Oxford, England), 30, 14: 2068-2069 
Segata N., Waldron L., Ballarini A., Narasimhan V., Jousson O., Huttenhower C. 2012. Metagenomic microbial community profiling using unique clade-specific marker genes. Nature Methods, 9, 8: 811-814 Senn T., Hazen S.L., Tang W.H. 2012. Translating metabolomics to cardiovascular biomarkers. Progress in Cardiovascular Diseases, 55, 1: 70-76 Shah J., Jefferies A.L., Yoon E.W., Lee S.K., Shah P.S., Network C.N. 2015. Risk factors and outcomes of late-onset bacterial sepsis in preterm neonates born at. American Journal of Perinatology, 32, 7: 675-682 Shen M., Xiao Y., Golbraikh A., Gombar V.K., Tropsha A. 2003. Development and validation of k-nearest-neighbor QSPR models of metabolic stability of drug candidates. Journal of Medicinal Chemistry, 46, 14: 3013-3020 Shimodaira M., Okaniwa S., Nakayama T. 2017. Fasting single-spot urine pH is associated with metabolic syndrome in the Japanese population. Medical Principles and Practice: International Journal of the Kuwait University, Health Science Centre, 26, 433-437 Sibomana I., Foose D.P., Raymer M.L., Reo N.V., Karl J.P., Berryman C.E., Young A.J., Pasiakos S.M., Mauzy C.A. 2021. Urinary metabolites as predictors of acute mountain sickness severity. Frontiers in Physiology, 12: 709804, doi: 10.3389/fphys.2021.709804, 11 p. Sidey-Gibbons J.A.M., Sidey-Gibbons C.J. 2019. Machine learning in medicine: a practical introduction. BMC Medical Research Methodology, 19, 1: 64, doi: 10.1186/s12874-019-0681-4, 18 p. Sieber C.M.K., Probst A.J., Sharrar A., Thomas B.C., Hess M., Tringe S.G., Banfield J.F. 2018. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nature Microbiology, 3, 7: 836-843 Singh A., Shannon C.P., Gautier B., Rohart F., Vacher M., Tebbutt S.J., Lê Cao K.A. 2019. DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays. 
Bioinformatics (Oxford, England), 35, 17: 3055-3062 Sinha R., Abu-Ali G., Vogtmann E., Fodor A.A., Ren B., Amir A., Schwager E., Crabtree J., Ma S., Consortium M.Q.C.P., Abnet C.C., Knight R., White O., Huttenhower C. 2017. Assessment of variation in microbial community amplicon sequencing by the microbiome quality control (MBQC) project consortium. Nature Biotechnology, 35, 11: 1077-1086 Sket R., Debevec T., Kublik S., Schloter M., Schoeller A., Murovec B., Mikus K.V., Makuc D., Pecnik K., Plavec J., Mekjavic I.B., Eiken O., Prevorsek Z., Stres B. 2018. Intestinal metagenomes and metabolomes in healthy young males: inactivity and hypoxia generated negative physiological symptoms precede microbial dysbiosis. Frontiers in Physiology, 9: 198, doi: 10.3389/fphys.2018.00198, 16 p. Sket R., Treichel N., Debevec T., Eiken O., Mekjavic I., Schloter M., Vital M., Chandler J., Tiedje J.M., Murovec B., Prevorsek Z., Stres B. 2017a. Hypoxia and inactivity related physiological changes (constipation, inflammation) are not reflected at the level of gut metabolites and butyrate producing microbial community: The PlanHab study. Frontiers in Physiology, 8: 250, doi: 10.3389/fphys.2017.00250, 16 p. Sket R., Treichel N., Kublik S., Debevec T., Eiken O., Mekjavic I., Schloter M., Vital M., Chandler J., Tiedje J.M., Murovec B., Prevorsek Z., Likar M., Stres B. 2017b. Hypoxia and inactivity related physiological changes precede or take place in absence of significant rearrangements in bacterial community structure: The PlanHab randomized trial pilot study. Plos One, 12, 12: e0188556, 26 p. Smeriglio P., Langard P., Querin G., Biferi M.G. 2020. The identification of novel biomarkers is required to improve adult SMA patient stratification, diagnosis and treatment. 
Journal of Personalized Medicine, 10, 3: 75, doi:10.3390/jpm10030075, 23 p. Sonntag B., Stolze B., Heinecke A., Luegering A., Heidemann J., Lebiedz P., Rijcken E., Kiesel L., Domschke W., Kucharzik T., Maaser C. 2007. Preterm birth but not mode of delivery is associated with an increased risk of developing inflammatory bowel disease later in life. Inflammatory bowel diseases, 13, 11: 1385-1390 Sotiridis A. 2019. Independent and combined effects of heat and hypoxic acclimation on exercise performance in humans: with particular reference to cross-adaption. Doctoral Dissertation. Ljubljana, Institute Jozef Stefan: 190 p. Sotiridis A., Debevec T., Ciuha U., Eiken O., Mekjavic I.B. 2019. Heat acclimation does not affect maximal aerobic power in thermoneutral normoxic or hypoxic conditions. Experimental Physiology, 104, 3: EP087268, doi: 10.1113/EP087268, 14 p. Sotiridis A., Debevec T., Ciuha U., Mcdonnell A.C., Mlinar T., Royal J.T., Mekjavic I.B. 2020. Aerobic but not thermoregulatory gains following a 10-day moderate-intensity training protocol are fitness level dependent: A cross-adaptation perspective. Physiological Reports, 8, 3: e14355, doi: 10.14814/phy2.14355, 17 p. Sotiridis A., Debevec T., Mcdonnell A.C., Ciuha U., Eiken O., Mekjavic I.B. 2018. Exercise cardiorespiratory and thermoregulatory responses in normoxic, hypoxic and hot environment following 10-day continuous hypoxic exposure. Journal of applied physiology (Bethesda, Md.: 1985), 125: 1284–1295 Spiering B.A., Kraemer W.J., Hatfield D.L., Vingren J.L., Fragala M.S., Ho J.Y., Thomas G.A., Häkkinen K., Volek J.S. 2008. Effects of L-carnitine L-tartrate supplementation on muscle oxygenation responses to resistance exercise. Journal of Strength and Conditioning Research, 22, 4: 1130-1135 Staley C., Weingarden A.R., Khoruts A., Sadowsky M.J. 2017. Interaction of gut microbiota with bile acid metabolism and its influence on disease states. 
Applied Microbiology and Biotechnology, 101, 1: 47-64 Stres B., Kronegger L. 2019. Shift in the paradigm towards next-generation microbiology. FEMS Microbiology Letters, 366, 15: fnz159, doi: 10.1093/femsle/fnz159, 9 p. Sugarman E.A., Nagan N., Zhu H., Akmaev V.R., Zhou Z., Rohlfs E.M., Flynn K., Hendrickson B.C., Scholl T., Sirko-Osadsa D.A., Allitto B.A. 2012. Pan-ethnic carrier screening and prenatal diagnosis for spinal muscular atrophy: clinical laboratory analysis of >72,400 specimens. European Journal of Human Genetics: EJHG, 20, 1: 27-32 Sun S., Jones R.B., Fodor A.A. 2020. Inference-based accuracy of metagenome prediction tools varies across sample types and functional categories. Microbiome, 8: 46, doi: 10.1186/s40168-020-00815-y, 9 p. Susnow R.G., Dixon S.L. 2003. Use of robust classification techniques for the prediction of human cytochrome P450 2D6 inhibition. Journal of Chemical Information and Computer Sciences, 43, 4: 1308-1315 Svedenkrans J., Henckel E., Kowalski J., Norman M., Bohlin K. 2013. Long-term impact of preterm birth on exercise capacity in healthy young men: a national population-based cohort study. PloS One, 8, 12: e80869, doi: 10.1371/journal.pone.0080869, 10 p. Šket R., Deutsch L., Prevoršek Z., Mekjavić I.B., Plavec J., Rittweger J., Debevec T., Eiken O., Stres B. 2020. Systems view of deconditioning during spaceflight simulation in the PlanHab project: the departure of urine 1H-NMR metabolomes from healthy state in young males subjected to bedrest inactivity and hypoxia. Frontiers in Physiology, 11: 532271, doi: 10.3389/fphys.2020.532271, 15 p. Tabone M., Bressa C., García-Merino J.A., Moreno-Pérez D., Van E.C., Castelli F.A., Fenaille F., Larrosa M. 2021. 
The effect of acute moderate-intensity exercise on the serum and fecal metabolomes and the gut microbiota of cross-country endurance athletes. Scientific Reports, 11, 1: 3558, doi: 10.1038/s41598-021-82947-1, 12 p. Tang W.H., Wang Z., Levison B.S., Koeth R.A., Britt E.B., Fu X., Wu Y., Hazen S.L. 2013. Intestinal microbial metabolism of phosphatidylcholine and cardiovascular risk. The New England Journal of Medicine, 368, 17: 1575-1584 Tannahill G.M., Curtis A.M., Adamik J., Palsson-Mcdermott E.M., Mcgettrick A.F., Goel G., Frezza C., Bernard N.J., Kelly B., Foley N.H., Zheng L., Gardet A., Tong Z., Jany S.S., Corr S.C., Haneklaus M., Caffrey B.E., Pierce K., Walmsley S., Beasley F.C., Cummins E., Nizet V., Whyte M., Taylor C.T., Lin H., Masters S.L., Gottlieb E., Kelly V.P., Clish C., Auron P.E., Xavier R.J., O'neill L.A. 2013. Succinate is an inflammatory signal that induces IL-1β through HIF-1α. Nature, 496: 238-242 Tebani A., Afonso C., Marret S., Bekri S. 2016. Omics-based strategies in precision medicine: toward a paradigm shift in inborn errors of metabolism investigations. International Journal of Molecular Sciences, 17, 9: 1555, doi: 10.3390/ijms17091555, 27 p. Teeling H., Meyerdierks A., Bauer M., Amann R., Glöckner F.O. 2004. Application of tetranucleotide frequencies for the assignment of genomic fragments. Environmental Microbiology, 6, 9: 938-947 Ten V.S. 2017. Mitochondrial dysfunction in alveolar and white matter developmental failure in premature infants. Pediatric Research, 81, 2: 286-292 Teschendorff A.E. 2019. Avoiding common pitfalls in machine learning omic data science. Nature Materials, 18, 5: 422-427 Thapa C., Camtepe S. 2021. Precision health data: Requirements, challenges and existing techniques for data security and privacy. Computers in Biology and Medicine, 129: 104130, doi: 10.1016/j.compbiomed.2020.104130, 19 p. 
Thornton C., Hutter F., Hoos H.H., Leyton-Brown K. 2013. Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms: 1208.3719v2, doi: 10.48550/arXiv.1208.3719, 9 p. Tian Q., Corkum A.E., Moaddel R., Ferrucci L. 2021. Metabolomic profiles of being physically active and less sedentary: a critical review. Metabolomics: Official Journal of the Metabolomic Society, 17, 7: 68, doi: 10.1007/s11306-021-01818-y, 16 p. Tigchelaar E.F., Bonder M.J., Jankipersadsing A., Fu J., Wijmenga C., Zhernakova A. 2016. Gut microbiota composition associated with stool consistency. Gut, 65, 3: 540-542 Tingleff T., Vikanes Å., Räisänen S., Sandvik L., Murzakanova G., Laine K. 2021. Risk of preterm birth in relation to history of preterm birth: a population-based registry study of 213 335 women in Norway. BJOG: An International Journal of Obstetrics and Gynaecology, 129: 900-907, doi: 10.1111/1471-0528.17013, 8 p. Trygg J., Wold S. 2002. Orthogonal projections to latent structures (O-PLS). Journal of Chemometrics, 16, 3: 119-128 Tsamardinos I., Charonyktakis P., Lakiotaki K., Borboudakis G., Zenklusen J.C., Juhl H., Chatzaki E., Lagani V. 2020. Just add data: Automated predictive modeling and biosignature discovery. bioRxiv: doi: 10.1101/2020.05.04.075747, 46 p. Tsamardinos I., Charonyktakis P., Papoutsoglou G., Borboudakis G., Lakiotaki K., Zenklusen J.C., Juhl H., Chatzaki E., Lagani V. 2022. Just add data: automated and predictive modeling for knowledge discovery and feature selection. Npj Precision Oncology, 6: 38, doi: 10.1038/s41698-022-00274-8, 17 p. Turner C.E., Byblow W.D., Gant N. 2015. Creatine supplementation enhances corticomotor excitability and cognitive performance during oxygen deprivation. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 35, 4: 1773-1780 Töpfer N., Kleessen S., Nikoloski Z. 2015. 
Integration of metabolomics data into metabolic networks. Frontiers in Plant Science, 6: 49, doi: https://doi:org/10.3389/fpls.2015.00049, 13 p. Ussher J.R., Lopaschuk G.D., Arduini A. 2013. Gut microbiota metabolism of L-carnitine and cardiovascular risk. Atherosclerosis, 231, 2: 456-461 Vandeputte D., Falony G., Vieira-Silva S., Tito R.Y., Joossens M., Raes J. 2016. Stool consistency is strongly associated with gut microbiota richness and composition, enterotypes and bacterial growth rates. Gut, 65, 1: 57-62 Vignoli A., Ghini V., Meoni G., Licari C., Takis P.G., Tenori L., Turano P., Luchinat C. 2019. High-throughput metabolomics by 1D NMR. Angewandte Chemie (International Ed. in English), 58, 4: 968-994 Volkova S., Matos M.R.A., Mattanovich M., Marín De Mas I. 2020. Metabolic modelling as a framework for metabolomics data integration and analysis. Metabolites, 10, 8:303, doi: 10.3390/metabo10080303, 27 p. Vrijlandt E.J., Gerritsen J., Boezen H.M., Grevink R.G., Duiverman E.J. 2006. Lung function and exercise capacity in young adults born prematurely. American Journal of Respiratory and Critical Care Medicine, 173, 8: 890-896 202 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 Wang Q., Wang K., Wu W., Giannoulatou E., Ho J.W.K., Li L. 2019. Host and microbiome multi-omics integration: applications and methodologies. Biophysical Reviews, 11, 1: 55-65 Wang T.J., Larson M.G., Vasan R.S., Cheng S., Rhee E.P., Mccabe E., Lewis G.D., Fox C.S., Jacques P.F., Fernandez C., O'donnell C.J., Carr S.A., Mootha V.K., Florez J.C., Souza A., Melander O., Clish C.B., Gerszten R.E. 2011a. Metabolite profiles and the risk of developing diabetes. Nature Medicine 17, 4: 448-453 Wang Y., Lê Cao K.A. 2020. Managing batch effects in microbiome data. 
Briefings in Bioinformatics, 21, 6: 1954-1970 Wang Z., Klipfell E., Bennett B.J., Koeth R., Levison B.S., Dugar B., Feldstein A.E., Britt E.B., Fu X., Chung Y.M., Wu Y., Schauer P., Smith J.D., Allayee H., Tang W.H., Didonato J.A., Lusis A.J., Hazen S.L. 2011b. Gut flora metabolism of phosphatidylcholine promotes cardiovascular disease. Nature, 472, 7341: 57-63 Waring J., Lindvall C., Umeton R. 2020. Automated machine learning: Review of the state-of-the-art and opportunities for healthcare. Artificial Intelligence in Medicine, 104: 101822, doi: 10.1016/j.artmed.2020.101822, 12 p. Wemheuer F., Taylor J.A., Daniel R., Johnston E., Meinicke P., Thomas T., Wemheuer B. 2020. Tax4Fun2: prediction of habitat-specific functional profiles and functional redundancy based on 16S rRNA gene sequences. Environmental Microbiome, 15, 1: 11, doi: 10.1186/s40793-020-00358-7, 12 p. Whipps J.M., Lewis K., Cooke R.C. 1988. Mycoparasitism and plant disease control. In: Fungi in Biological Control Systems. Burge N.M. (ed). Manchester, Manchester University Press: 161-187 Wilken B., Ramirez J.M., Richter D.W., Hanefeld F. 2022. The response to hypoxia is affected by creatine in the central respiratory network of mammals 251. Pediatric Research, 40, 3: 557-557 Wilkinson J.E., Franzosa E.A., Everett C., Li C., Trainees H.R.A., Investigators H., Hu F.B., Wirth D.F., Song M., Chan A.T., Rimm E., Garrett W.S., Huttenhower C. 2021. A framework for microbiome science in public health. Nature Medicine, 27, 5: 766-774 Wirbel J., Pyl P.T., Kartal E., Zych K., Kashani A., Milanese A., Fleck J.S., Voigt A.Y., Palleja A., Ponnudurai R., Sunagawa S., Coelho L.P., Schrotz-King P., Vogtmann E., Habermann N., Niméus E., Thomas A.M., Manghi P., Gandini S., Serrano D., Mizutani S., Shiroma H., Shiba S., Shibata T., Yachida S., Yamada T., Waldron L., Naccarati A., Segata N., Sinha R., Ulrich C.M., Brenner H., Arumugam M., Bork P., Zeller G. 2019. 
Meta-analysis of fecal metagenomes reveals global microbial signatures that are specific for colorectal cancer. Nature Medicine, 25, 4: 679-689 Wishart D.S. 2008. Quantitative metabolomics using NMR. Trac-Trends in Analytical Chemistry, 27, 3: 228-237 Wishart D.S. 2019. NMR metabolomics: A look ahead. Journal of Magnetic Resonance (San Diego, Calif.: 1997), 306: 155-161 Wishart, D.S., Feunang, Y.D., Marcu, A., Guo, A.C., Liang, K., Vazquez-Fresno, R., Sajed, T., Johnson, D., Li, C., Karu, N., Sayeeda, Z., Lo, E., Assempour, N., Berjanskii, M., Singhal, S., Arndt, D., Liang, Y., Badran, H., Grant, J., Serra-Cayuela, A., Liu, Y., Mandal, R., Neveu, V., 203 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 Pon, A., Knox, C., Wilson, M., Manach, C., and Scalbert, A. 2018. HMDB 4.0: the human metabolome database for 2018. Nucleic Acids Research 46: D608-D617 Wishart D.S., Guo A., Oler E., Wang F., Anjum A., Peters H., Dizon R., Sayeeda Z., Tian S., Lee B.L., Berjanskii M., Mah R., Yamamoto M., Jovel J., Torres-Calzada C., Hiebert-Giesbrecht M., Lui V.W., Varshavi D., Varshavi D., Allen D., Arndt D., Khetarpal N., Sivakumaran A., Harford K., Sanford S., Yee K., Cao X., Budinski Z., Liigand J., Zhang L., Zheng J., Mandal R., Karu N., Dambrova M., Schiöth H.B., Greiner R., Gautam V. 2022. HMDB 5.0: the human metabolome database for 2022. Nucleic Acids Research, 50: D622-D631 Wishart D.S., Jewison T., Guo A.C., Wilson M., Knox C., Liu Y., Djoumbou Y., Mandal R., Aziat F., Dong E., Bouatra S., Sinelnikov I., Arndt D., Xia J., Liu P., Yallou F., Bjorndahl T., Perez-Pineiro R., Eisner R., Allen F., Neveu V., Greiner R., Scalbert A. 2013. HMDB 3.0--the human metabolome database in 2013. 
Nucleic Acids Research, 41: D801-D807 Wishart D.S., Tzur D., Knox C., Eisner R., Guo A.C., Young N., Cheng D., Jewell K., Arndt D., Sawhney S., Fung C., Nikolai L., Lewis M., Coutouly M.A., Forsythe I., Tang P., Shrivastava S., Jeroncic K., Stothard P., Amegbey G., Block D., Hau D.D., Wagner J., Miniaci J., Clements M., Gebremedhin M., Guo N., Zhang Y., Duggan G.E., Macinnis G.D., Weljie A.M., Dowlatabadi R., Bamforth F., Clive D., Greiner R., Li L., Marrie T., Sykes B.D., Vogel H.J., Querengesser, L. 2007. HMDB: the human metabolome database. Nucleic Acids Research, 35: D521-D526 Wold S., Sjostrom M., Eriksson L. 2001. PLS-regression: a basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 58, 2: 109-130 Wolfs T.G., Derikx J.P., Hodin C.M., Vanderlocht J., Driessen A., De Bruïne A.P., Bevins C.L., Lasitschka F., Gassler N., Van Gemert W.G., Buurman W.A. 2010. Localization of the lipopolysaccharide recognition complex in the human healthy and inflamed premature and adult gut. Inflammatory Bowel Diseases, 16, 1: 68-75 Wu Y.W. 2018. ezTree: an automated pipeline for identifying phylogenetic marker genes and inferring evolutionary relationships among uncultivated prokaryotic draft genomes. BMC Genomics, 19: 921, doi: 10.1186/s12864-017-4327-9, 10 p. Wu Y.W., Simmons B.A., Singer, S.W. 2016. MaxBin 2.0: an automated binning algorithm to recover genomes from multiple metagenomic datasets. Bioinformatics (Oxford, England), 32, 4: 605-607 Yang C., Chowdhury D., Zhang Z., Cheung W.K., Lu A., Bian Z., Zhang L. 2021. A review of computational tools for generating metagenome-assembled genomes from metagenomic sequencing data. Computational and Structural Biotechnology Journal, 19: 6301-6314 Yang Q., Wang Y., Zhang Y., Li F., Xia W., Zhou Y., Qiu Y., Li H., Zhu F. 2020. NOREVA: enhanced normalization and evaluation of time-course and multi-class metabolomic data. Nucleic Acids Research, 48: W436-W448 Yeo C.J.J., Darras B.T. 2020. 
Overturning the paradigm of spinal muscular atrophy as sust a motor neuron disease. Pediatric Neurology, 109: 12-19 Yu C.T., Chao B.N., Barajas R., Haznadar M., Maruvada P., Nicastro H.L., Ross S.A., Verma M., Rogers S., Zanetti K.A. 2022. An evaluation of the National Institutes of Health grants portfolio: identifying opportunities and challenges for multi-omics research that leverage metabolomics 204 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 data. Metabolomics: Official Journal of the Metabolomic Society, 18, 5: 29, doi: 10.1007/s11306-022-01878-8, 12 p. Zerbino D.R., Birney E. 2008. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Research, 18, 5: 821-829 Zhang C., Yin A., Li H., Wang R., Wu G., Shen J., Zhang M., Wang L., Hou Y., Ouyang H., Zhang Y., Zheng Y., Wang J., Lv X., Wang Y., Zhang F., Zeng B., Li W., Yan F., Zhao Y., Pang X., Zhang X., Fu H., Chen F., Zhao N., Hamaker B.R., Bridgewater L.C., Weinkove D., Clement K., Dore J., Holmes E., Xiao H., Zhao G., Yang S., Bork P., Nicholson J.K., Wei H., Tang H., Zhao L. 2015. Dietary Modulation of gut microbiota contributes to alleviation of both genetic and simple obesity in children. EBioMedicine, 2, 8: 968-984 Zheng G., Price W.S. 2010. Solvent signal suppression in NMR. Progress in Nuclear Magnetic Resonance Spectroscopy, 56, 3: 267-288 Zheng X., Chen T., Zhao A., Ning Z., Kuang J., Wang S., You Y., Bao Y., Ma X., Yu H., Zhou J., Jiang M., Li M., Wang J., Ma X., Zhou S., Li Y., Ge K., Rajani C., Xie G., Hu C., Guo Y., Lu A., Jia W., Jia W. 2021. Hyocholic acid species as novel biomarkers for metabolic disorders. Nature Communications, 12, 1: 1487, doi: 10.1038/s41467-021-21744-w, 11 p. Zhou G., Ewald J., Xia, J. 2021. OmicsAnalyst: a comprehensive web-based platform for visual analytics of multi-omics data. 
Nucleic Acids Research, 49: W476-W482 Zitnik M., Nguyen F., Wang B., Leskovec J., Goldenberg A., Hoffman M.M. 2019. Machine learning for integrating data in biology and medicine: principles, practice, and opportunities. An International Journal on Information Fusion, 50: 71-91 Ząbek A., Stanimirova I., Deja S., Barg W., Kowal A., Korzeniewska A., Orczyk-Pawiłowicz M., Baranowski D., Gdaniec Z., Jankowska R., Młynarz P. 2015. Fusion of the 1H NMR data of serum, urine and exhaled breath condensate in order to discriminate chronic obstructive pulmonary disease and obstructive sleep apnea syndrome. Metabolomics: Official Journal of the Metabolomic Society, 11, 6: 1563-1574 205 Deutsch L. Bioinformatics integration of microbiome and metabolomics data in the translational context. Doct. dissertation. Ljubljana, University of Ljubljana, Biotechnical Faculty, 2022 ACKNOWLEDGEMENTS First. I would like to thank you, my supervisor Prof. Blaž Stres, for your support, encouragement and guidance on professional, as well on personal level. This work could not be done without you. Second, I would like to thank all professors, colleagues, friends involved in my PhD journey: PhD thesis committee members Prof. Andrej Blejec, Prof. Gregor Anderluh and Prof. David Gomez Cabrero; Colleagues from University of Ljubljana Prof. Boštjan Murovec (Faculty of Electrical Engineering), Prof. Damjan Osredkar (Faculty of Medicine, University Children’s hospital), Prof. Tadej Debevec (Faculty of Sport), Prof. Sabina Kolbl Repinc (Faculty of Civil and Geodetic Engineering), Prof. Andrej Lavrenčič, Fani Oven and Ana Jakopič (Biotechnical Faculty); Colleagues from Jožef Stefan Institute Prof. Igor Mekjavić and Dr. Alexandros Sotiridis (National and Kapodistrian University of Athens). Colleagues from Slovenian NMR Centre (National Institute of Chemistry, Slovenia) Prof. Janez Plavec, Dr. Damjan Makuc, Uroš Javornik and Klemen Pečnik. Colleague from University Children’s hospital Dr. 
Robert Šket; Colleague from BioSistemika d.o.o. Dr. Zala Prevoršek; Colleague from Labena Dr. Tine Pokorn; Coworkers from the Department of Animal Science, Biotechnical Faculty, University of Ljubljana. Thank you all.

Third, thanks to the organizations and institutions that supported the growth of my scientific network: the Slovenian Research Agency for the Young Research Fellowship (SRA #51867; MR+ call awarded to Prof. Blaž Stres) and the European Cooperation in Science and Technology COST action CA18131 (Statistical and machine learning techniques in human microbiome studies (ML4Microbiome)).

Fourth, thanks to the high-performance computing clusters SLING (Slovenian national supercomputing network) and the HPC infrastructure of the University of Innsbruck.

Fifth, thanks to all volunteers for providing their samples.

Last but not least, my family and friends, thank you for your priceless support.