TO A GENERAL APPLICABLE MONOSACCHARIDE IDENTIFICATION USING TOCSY MATCHING Yannick Dandois Student number: 01109010 Supervisor(s): Prof. Dr. José Martins, Prof. Dr. Peter Dawyndt A dissertation submitted to Ghent University in partial fulfilment of the requirements for the degree of Master of Science in Chemistry Academic year: 2016 - 2017

  • I

    Acknowledgements

    The road of a student is long and certainly not without hills. But thanks to a few people,
    being a student still remains enjoyable. The chance to work on a project that contains not only
    pure chemistry but also other aspects of science, from programming to the mathematics that
    always hides behind it, is a chance I received only thanks to my two exceptional supervisors.
    They not only gave me the opportunity to work on a project that both they and I could fully
    support; they also made time every week to guide me personally. For this I thank my supervisors
    José Martins and Peter Dawyndt.

    I also thank Niels Geudens, who always offered an answer to the questions I had. Working
    without a daily supervisor gives you freedom, but now and then leaves you with countless
    questions. Niels was always there to give me an answer, but the other people of the NMRSTR
    group deserve thanks as well.

    I also thank Sofie Van Damme for the excellent help during the conformational analyses.

    People outside UGent deserve thanks too: the close group of friends who have stood by one
    another for years; Chiel, Anthony, Matthias, Rutger and co.

    Finally, I thank P. d. W.; thanks to you I went on to study chemistry.

  • II

    Table of Contents

    Chapter 1: Introduction .................................................................................................................. 1

    1.1 Context of this project ...................................................................................................... 1

    1.2 Sugars, saccharides and carbohydrates: not only a source of energy ............................. 2

    1.3 Oligo- and polysaccharides .............................................................................................. 3

    1.4 Monosaccharides structure and conformation – a refresher .......................................... 5

    1.4.1 Different forms of monosaccharides ........................................................................ 6

    1.4.2 Variations in monosaccharides: the α vs β and D vs L .............................................. 7

    1.4.3 Variations in conformations ..................................................................................... 8

    1.5 Introduction towards TOCSY ............................................................................................ 8

    1.6 Content of this project ................................................................................................... 11

    Chapter 2: Current analysis methods ........................................................................................... 12

    2.1 Mass Spectroscopy methods ......................................................................................... 12

    2.2 HPLC methods ................................................................................................................ 13

    2.3 NMR standard methods ................................................................................................. 13

    2.4 The TOCSY-Matching approach (Gheysen et al.) ........................................................... 14

    2.5 Comparison of the techniques ....................................................................................... 16

    Chapter 3: Designing a new experiment ...................................................................................... 17

    3.1 TOCSY pulse sequence ................................................................................................... 17

    3.2 A novel experimental setup ........................................................................................... 17

    3.3 Achieving selectivity ....................................................................................................... 18

    3.4 p2D-sel TOCSY ................................................................................................................ 20

    3.5 p3D-Bsel TOCSY .............................................................................................................. 23

    3.6 Conclusion: p2D-sel TOCSY versus p3D-Bsel TOCSY ...................................................... 24

  • III

    Chapter 4: Data processing ........................................................................................................... 26

    4.1 Chunkification ................................................................................................................ 27

    4.1.1 Chunkification of the p2D-sel TOCSY ............................................................................ 28

    4.1.2 Chunkification of the p3D-Bsel TOCSY ......................................................................... 28

    4.2 Integration of the spectra .............................................................................................. 30

    4.2.1 Peak determination and noise size ......................................................................... 31

    4.2.2 Determination of the integral filter and integration of the spectra ....................... 31

    4.3 TQD filter ....................................................................................................................... 33

    Chapter 5: Saccharide comparison and clustering ....................................................................... 36

    5.1 Introduction.................................................................................................................... 36

    5.2 Towards a monosaccharide database ............................................................................ 36

    5.3 Monosaccharide comparison ......................................................................................... 37

    5.3.1 Curve comparison: the integration method ........................................................... 38

    5.3.2 Curve comparison: the Fréchet method ................................................................. 39

    5.4 Clustering of the monosaccharides ................................................................................ 40

    5.4.1 The first cluster: β-galactopyranose and α-L-arabinopyranose ............................. 42

    5.4.2 The second and third cluster: manno-, L-rhamno- and β-lyxopyranose ................ 43

    5.4.3 The fourth cluster: α-xylopyranose and α-glucopyranose ..................................... 45

    5.5 The usability of the technique on furanoses.................................................................. 46

    5.5.1 A computational analysis of arabinofuranose ........................................................ 46

    5.5.2 A spectral analysis of ribofuranose ......................................................................... 47

    5.6 Significance level of the technique ................................................................................ 48

    5.6.1 The repeated measurement approach ................................................................... 48

    5.6.2 The cluster approach .............................................................................................. 49

  • IV

    5.6.3 The significance level: conclusion ........................................................................... 50

    Chapter 6: Proof of concept & operation ..................................................................................... 51

    6.1 Sucrose (p2D) ................................................................................................................. 51

    6.2 A mixture of monosaccharides ...................................................................................... 52

    6.2.1 The John Doe sample (p2D and p3D) ..................................................................... 52

    6.2.2 The honey sample (p2D) ......................................................................................... 53

    6.3 Capsular polysaccharides ............................................................................................... 54

    Chapter 7: Conclusion and further research ................................................................................ 57

    7.1 General conclusion ......................................................................................................... 57

    7.2 Conclusion: the analysis of a furanose ........................................................................... 58

    7.3 Usability of the current processing tools ....................................................................... 59

    7.4 Further research and required software updates ......................................................... 59

    Chapter 8: Appendix ..................................................................................................................... 60

    8.1 GitHub manual ............................................................................................................... 60

    8.2 Cluster analysis data for the integration method .......................................................... 65

    8.3 Cluster analysis data for the Fréchet method ................................................................ 66

    8.4 Ten starting conformations of beta L-arabinofuranose ................................................. 67

    Chapter 9: References................................................................................................................... 68

    Chapter 10: Dutch summary – Nederlandstalige samenvatting .................................................. 70

    10.1 A new experimental approach ................................................................................ 70

    10.2 Automated processing of the spectra .............................................................. 71

    10.3 Conclusion ............................................................................................................................. 71

    Scientific article: To A General Applicable Monosaccharide Identification Using TOCSY Matching

  • V

    List of abbreviations and technical terms

    Abbreviated monosaccharides and chemical compounds are not included in this list.

    1D-sel TOCSY      One-dimensional selective TOCSY
    CCM               Curve Comparison Method
    Chunk             Part of the spectrum coming from one monosaccharide
    Curve             The curve resulting from the integration of a multiplet; sometimes called integration or mixing time curve
    ESI-ITMS          Electrospray ionization ion trap mass spectrometry
    fqlist            Parameter used by the pulse program containing all frequencies (p2D-sel TOCSY only)
    GAG               Glycosaminoglycans
    GC-MS             Gas Chromatography Mass Spectroscopy
    HMBC              Heteronuclear multiple-bond correlation spectroscopy
    HPLC              High Performance Liquid Chromatography
    HSQC              Heteronuclear single quantum coherence spectroscopy
    JD #1             John Doe sample #1; sample containing three different monosaccharides
    Max Curve         Projection curve coming from one Chunk
    m%                Mass percentage notation
    MS                Mass Spectroscopy
    NMR               Nuclear Magnetic Resonance
    p2D               Refers to the p2D-sel TOCSY experiment
    p3D               Refers to the p3D-Bsel TOCSY experiment
    p2D-sel TOCSY     1D-sel TOCSY with a pseudo dimension in the mixing time
    p3D-Bsel TOCSY    2D band selective TOCSY with a pseudo dimension in the mixing time
    Resx              Resolution in the x-dimension; number of points in the array
    rga               Receiver Gain Acquisition (automatic determination)
    S/N-ratio         Signal over Noise ratio, also abbreviated as ‘sino’
    SOx/SFOx          Spectral (Field) Offset; Nyquist frequency in the x-dimension; the center of the acquisition window
    SWx               Width of the spectral window in the x-dimension
    TOCSY             TOtal Correlation Spectroscopy
    vclist            Variable counter list; used for the different mixing times

  • VI

    List of figures

    Figure 1.1 Total sugar consumption worldwide (statista.com) ...................................................... 2

    Figure 1.2 Starch; a branched polysaccharide ................................................................................ 4

    Figure 1.3 Step by step identification ............................................................................................. 5

    Figure 1.4 The pyranose form of hexose saccharides (left) and the pentose saccharides (right) . 5

    Figure 1.5 Cyclization of the acyclic aldehyde to both the furanose and pyranose ....................... 6

    Figure 1.6 The cyclization of D-Glucose .......................................................................................... 6

    Figure 1.7 The alpha (left) and beta (right) D-Glucose form .......................................................... 7

    Figure 1.8 D (left) and L (right) glucose........................................................................................... 7

    Figure 1.9 The Haworth projection and both chair conformations of β-D-glucopyranose ............ 8

    Figure 1.10 Comparison of a 1D-sel TOCSY of the non-anomeric region of α-glucose with

    different mixing times (blue:43ms; green: 95ms) .......................................................................... 9

    Figure 1.11 α-D-Mannopyranose; note the angle between the H1 and H2 will cause a bottleneck

    ....................................................................................................................................................... 10

    Figure 1.12 The Karplus-relation ................................................................................................... 10

    Figure 1.13 Dihedral angle between four atoms .......................................................................... 10

    Figure 2.1 1D HNMR spectrum of Galactose ................................................................................ 13

    Figure 2.2 NMR flowchart for monosaccharide determination (Touckach, 2013) ...................... 14

    Figure 2.3 Gheysen determination table (60ms) .......................................................................... 14

    Figure 2.4 A 2D-TOCSY spectrum of sucrose (700MHz; 60ms). The signal corresponding to the

    anomeric hydrogen and its cross peaks have been indicated. ..................................................... 15

    Figure 3.1 Default pulse program for a 2D TOCSY ........................................................................ 17

    Figure 3.2 An overlay of a regular 1D-HNMR (blue) and a 1D-HNMR using a 90° selective pulse (red)

    centered on the anomeric signal (yet partly exciting water). To improve the result of the 90°

    selective pulse, the pulse can be lengthened to increase selectivity and further reduce the

    intensity of the water signal. Lastly, a 1D-HNMR using a selective spin-echo (green) that

    excites the entire anomeric region. ............................................................................................... 19

    Figure 3.3 2D-TOCSY spectrum of JD #1 (100ms – 700MHz) with a superposed band selective

    indication. The anomeric hydrogen region on the diagonal has been indicated. ........................ 19

  • VII

    Figure 3.4 1D proton NMR of sucrose (green); 1D-sel TOCSY of sucrose (100ms) with selection of

    the anomeric hydrogen, using both the DIPSI-2 (red) and mlev-17 (blue) sequences at 700MHz.

    Only the non-anomeric region is shown. The effect of the z-filter in the DIPSI-2 sequence is

    clearly visible around 3.45ppm upon comparison of the DIPSI-2 and mlev-17 sequences. ........ 21

    Figure 3.5 A 3D (left) and 2D (right) representation of a p2D-sel TOCSY of β-glucopyranose (the

    time domain axes are scaled incorrectly) ..................................................................................... 22

    Figure 3.6 Data resulting from a p3D experiment (JD sample). Data identical to the p2D

    experiment has been indicated. ................................................................................................... 24

    Figure 4.1 Flowchart of the processing. ........................................................................................ 26

    Figure 4.2 p2D-sel TOCSY taken on alpha (bottom) and beta (top) glucose directly from Topspin

    ....................................................................................................................................................... 27

    Figure 4.3 The data structure of the p3D-Bsel TOCSY as a list of 2D-TOCSY after the dimension

    flip (the commas indicate that there are multiple elements in the array) .................................. 28

    Figure 4.4 The diagonal extracted automatically from JD#1 as previously shown in Figure 3.2 .. 29

    Figure 4.5 Projection of one chunk to achieve optimal signal to noise ratio for each individual

    peak ............................................................................................................................................... 30

    Figure 4.6 Peak determination on the diagonal of the p3D of JD#1 ............................................ 31

    Figure 4.7 Peak region determination for the middle peak of a triplet, only depicted for the right

    side. (The noise limit is exaggerated and is only for demonstration purposes) .......................... 32

    Figure 4.8 The integral filter obtained from the spectrum (α-glucose) ....................................... 33

    Figure 4.9 The application of the TQD-filter on Sucrose .............................................................. 34

    Figure 4.10 Chunk Plot of α-Glucose ............................................................................................ 35

    Figure 5.1 Integration method applied on two curves ................................................................. 39

    Figure 5.2 The Fréchet distance calculation (left) and two different curves with high Fréchet

    distance (right), yet still a high probability of similarity using the integration method ............... 39

    Figure 5.3 Frequency (left) and scatter (top right) plots, for both methods, of the probability that

    two monosaccharides of the database are identical. ........................................................................ 41

    Figure 5.4 HCA on the Integration Method Matrix, with the significance level of 0.75, the

    determined clusters are indicated in blue. ................................................................................... 42

  • VIII

    Figure 5.5 β-galactopyranose (left) and α-L-arabinopyranose (right) .......................................... 43

    Figure 5.6 Plot of the TOCSY cross correlation intensities vs mixing time of the first cluster ..... 43

    Figure 5.7 β-manno-, β-L-rhamno- and β-lyxopyranose respectively .......................................... 43

    Figure 5.8 α-manno-, α-L-rhamno- and α-lyxopyranose respectively (top) and both chair

    conformations of α-lyxose (bottom) ............................................................................................ 44

    Figure 5.9 Chunk plot of the monosaccharides in the second cluster ......................................... 44

    Figure 5.10 Chunk plot of α-glucose and α-xylose of the non-anomeric region. ......................... 45

    Figure 5.11 Polar Cremer-Pople plot of the initial (orange dots) conformations, with an arrow to

    their eventual conformation (blue globe). The lower the energy, the bigger the globe. The final

    conformation of 2 and 7 has been drawn at the top left. ........................................................... 47

    Figure 5.12 A chunk plot of both ribofuranoses. .......................................................................... 48

    Figure 5.13 Frequency plot of both methods of the similarity matrix ......................................... 49

    Figure 6.1 Results for sucrose using the integration method with the minima criterion, showing all

    curves but the anomeric signal. .................................................................................................... 51

    Figure 6.2 Two chunks of the p2D spectra of the JD sample showing overlap. ........................... 52

    Figure 6.3 Chunk plot of β-glucose (red), β-galactose (green) and β-xylose (blue) ..................... 53

    Figure 6.4 1D proton NMR of the 19F (blue) and 22F (red) serotype. The 22F serotype has

    been shifted 0.1 ppm to the right. ................................................................................................ 55

    Figure 7.1 1D proton NMR (red) and the result of a selective pulse as used by the p3D

    experiment (selgpse), with identical number of scans and scaling on the 19F sample. .............. 57

    Figure 7.2 p2D spectra of ribofuranose taken on a 700MHz spectrometer. ............................... 58

  • IX

    List of Tables

    Table 1 All monosaccharides in the database .............................................................................. 37

    Table 2 Standard deviation of the techniques on sucrose. .......................................................... 49

    Table 3 Processing hits for both the p2D and p3D experiment using the integration method and

    the minima criterion. ................................................................................................................... 52

    Table 4 Experimental settings for the CPS for both the p2D and p3D experiment. ............. 55

  • 1

    Chapter 1: Introduction

    1.1 Context of this project

    Monosaccharide identification is used not only to identify and quantify the monosaccharide
    composition of a sample; it is also a critical step in the analysis of large carbohydrates. The
    analysis of these carbohydrates is of high importance not only in the food industry but also in
    medicinal chemistry. In medicinal chemistry, interactions between a saccharide compound and
    proteins are often studied, yet the monosaccharide types and their sequence are not always
    completely determined, owing to the high complexity of the compound and the cost of analysis.

    This work: Towards a general applicable monosaccharide identification using TOCSY matching

    is a collaboration between two research groups at the UGent, the NMRSTR Research group

    (department of Molecular and Macromolecular Chemistry) and the Computational Biology Lab

    (Department of Applied Mathematics, Computer Science and Statistics) with the aim to propose

    a fast, automatic identification and processing method for monosaccharides and polysaccharide

    composition. This work is not connected to an ongoing doctoral thesis; however, the foundations

    have been laid by K. Gheysen (Gheysen, 2011), who wrote part of her doctorate on this

    subject. The TOCSY-matching approach, the concept introduced by Gheysen et al., is revisited in

    this work and extended by developing an automated analysis. The TOCSY matching approach

    allows a fast monosaccharide identification without a full analysis of the spectrum. In its current

    implementation, analysis typically proceeds using a single 2D TOCSY at a specific mixing time.

    This analysis occurs in a manual, user-defined mode and is therefore not only prone to user bias,

    but also limited in the number of different monosaccharides that can be examined. This is because

    only a small number of monosaccharides was investigated, which could cause

    misinterpretation of the spectra. In this work, a quantitative algorithm driven by the TOCSY

    matching is developed in order to avoid these drawbacks. In addition to having a larger pyranose

    database, the method has been extended to include a furanose sugar in order to assess the

    strengths and limitations of the technique itself. In order to achieve this, a novel experimental

    approach was developed, giving enhanced spectra without necessarily extending the measuring

    time upon increased complexity.

  • 2

    1.2 Sugars, saccharides and carbohydrates: not only a source of energy

    A world without sugars is unthinkable in the 21st century. Every year the world population

    consumes about 170 million metric tons of sugar (statista.com, n.d.). Sugars are in nearly every

    food source, often added in order to further sweeten food (these are often referred to as added

    sugars).

    These added sugars are in almost every food product, from yoghurt to soda drinks. A can

    of cola, for example, easily contains 10 m% carbohydrates and, depending on the soda, the sugar

    composition will be different. Normal Coca-Cola® contains high fructose corn syrup. This is a

    combination of different corn syrups that have been processed using enzymes. The coke in the

    ‘green bottle’, also known as ‘Mexican Coke®’, is sweetened with sucrose (a word

    of French origin: sucre + ose). Sucrose is a disaccharide extracted from Nature (from sugar

    cane, for example) and often referred to as table sugar. Because these sugars are added to food

    artificially, their carbohydrate composition is well known.

    When sugars are naturally present in the food substance, knowing their carbohydrate

    composition i.e. the type and sequence of the monosaccharides is no longer straightforward.

    Natural variation occurs caused by different origins or even weather conditions. A complex

    example is honey, which is directly harvested from Nature and is almost entirely made from

    sugars (mostly mono- and disaccharides). The foraging bees collect sucrose rich nectar, of which

    they partially digest the sucrose into glucose and fructose (using digestive enzymes) in a process

    which is called regurgitation. While this process is well known, sugars not originating from

    Figure 1.1 Total sugar consumption worldwide (statista.com)

  • 3

    sucrose (whose digestion products are glucose and fructose) are found in honey as well. Maltose, for instance, a disaccharide

    of glucose, is a common example. These variations in monosaccharide units in these

    higher sugars (di- and trisaccharides) result in a partially unknown composition.

    But sugars are also found and used outside the food industry, although in that context they are

    usually referred to as saccharides or carbohydrates as their taste is no longer of importance. They

    are often the subject of scientific research in the life sciences, as they interact with proteins and

    antibodies. Upon checking Web of Science, a ‘saccharide’ search resulted in 2500 hits for

    biochemistry and molecular biology alone (for the years 1955-2017). A common example of the

    use of saccharides is cancer treatment in the form of chemotherapy (Calvaresi, 2013). The

    cancer cells can be targeted due to the fact that aggressive tumors demonstrate high glycolytic

    rates as they are rapidly dividing cells. By tagging saccharides with toxic molecules, they are used

    as a ‘Trojan horse’ to bring the toxic molecules inside the cancer cell.

    In medicinal chemistry, one can find countless other applications of the usage of sugars as they

    are (in general) benign for the human body. Typical fillers for drugs are starch and sugars like

    lactose. These are added to give a pill a bigger size for user comfort as the active compound is

    usually present in extremely small doses. The science of these saccharides, named

    ‘glycobiology’ in the late 1980s, combines the disciplines of chemistry and biochemistry

    in the study of saccharides. Unfortunately, these saccharides all have a highly similar chemical

    structure with most of their differences resulting from differences in stereochemistry.

    1.3 Oligo- and polysaccharides

    Oligosaccharides are built up from monosaccharides and contain between two and eight

    monosaccharides linked to each other by glycosidic bonds. Due to their limited length, their

    branching possibilities are limited compared to polysaccharides; however, they often occur in a

    branched state. They are found in food, where they serve as a prebiotic in the colon, but are not

    used as a source of energy by our body directly.

    Polysaccharides are abundant in Nature. They are polymeric carbohydrates,

    containing long chains of monosaccharides linked together by glycosidic bonds. These

    polysaccharides are divided into three main groups: food storage polysaccharides (such as starch

  • 4

    and glycogen), structural polysaccharides (chitin and cellulose) and mucopolysaccharides (also

    known as glycosaminoglycans or GAGs).

    Starch can be found in corn, potatoes, grains, etc., while glycogen is only found in animals

    (and is often referred to as animal starch). Both starch and glycogen are essential for nutrition

    (as they are our main source of energy) and are homopolysaccharides since only one type of

    monosaccharide is present in the polysaccharide. For both starch and glycogen this

    monosaccharide is glucose. Due to their simple monosaccharide composition, they are of

    minor importance to this project, as these polysaccharides only have one type of

    monosaccharide, which is well known.

    Structural polysaccharides such as chitin and cellulose are found in both plants and animals. Cellulose

    is responsible for the rigidity of plants, while in animals chitin is found in the shells of crustaceans

    (shrimps and lobsters, for example). Cellulose is a homopolysaccharide of glucose, linked linearly

    through β(1→4) bonds, which allows long, straight chains to form. This straight chain conformation

    results in strong fibers, giving plants their structural strength. Chitin is similar to cellulose, but

    has a derivative of glucose as its monosaccharide, namely N-acetyl-glucosamine.

    Pneumococcal disease, a potentially lethal infection caused by the bacterium Streptococcus pneumoniae,

    shows the importance of the analysis of polysaccharides. This bacterium causes different infections;

    however, vaccines do exist (ATCC, 2015). These vaccines are based on the serotype of the

    bacterium, which depends on its capsular polysaccharide (CPS). Knowledge of

    the monosaccharide composition and the glycosidic bonds of the polysaccharide

    capsule is crucial in order to formulate vaccines. While these CPS have been extensively studied,

    some still remain unknown (Geno & Gilbert, 2015).

    Figure 1.2 Starch; a branched polysaccharide (retrieved from nutrientsreview.com)

  • 5

    To identify an oligo- or polysaccharide, two kinds of information are required. First,

    the monosaccharide composition must be determined, as it is the building block of the oligo- or

    polysaccharide. Second, the way the monosaccharides are interconnected with the glycosidic

    bond and their order must be determined as well. In this thesis, only the identification of the

    monosaccharides themselves is tackled. No attempt is made to determine in which

    order they are connected and which hydroxyl functions are used in this connection. The latter

    typically occurs in a later stage of the analysis. The identification of monosaccharides is referred

    to as step I (Figure 1.3).

    1.4 Monosaccharides structure and conformation – a refresher

    Monosaccharides have the chemical formula $(\mathrm{CH_2O})_n$, from which the origin of the group name can be

    seen (carbohydrates: carbon and water). They are classified into four main groups according to

    the number of carbon atoms in the chain: triose, tetrose, pentose and hexose

    monosaccharides, of which only the latter two are the focus of this project because

    they are the ones that occur most in Nature.

    Within each group, various saccharide structures exist as a result of differences in cyclization and

    stereochemistry. In addition, different conformations may need to be taken into account when

    considering cyclic pentose and hexose structures. As the masses of the monosaccharides are (per

    Figure 1.3 Step by step identification

    Figure 1.4 The pyranose form of hexose saccharides (left) and the pentose saccharides (right)

  • 6

    group) identical, most standard chemical methods are only able to differentiate between a

    pentose and hexose but not between the different pentoses (or hexoses) themselves as

    differences are only found in the stereochemistry.

    1.4.1 Different forms of monosaccharides

    Monosaccharides can occur in both cyclic and acyclic forms in a chemical equilibrium. This

    equilibrium is possible thanks to the aldehyde function only present in the acyclic form. Through

    intramolecular addition of one of the hydroxyl functions to the aldehyde, a hemiacetal is

    formed, leading to a cyclic structure. An animated representation of the cyclization can be

    found on Wikimedia.1

    As multiple hydroxyl functions are present on each molecule, this reversible reaction can result

    in different ring sizes. For pyranose-type monosaccharides the ring has six atoms; for furanoses

    the ring has five atoms. This means the geometrical structure of the molecule is fundamentally

    different, which is exploited in this project.

    _________________________ 1 https://upload.wikimedia.org/wikipedia/commons/a/af/Glucose_Fisher_to_Haworth.gif

    Figure 1.5 Cyclization of the acyclic aldehyde to both the furanose and pyranose

    Figure 1.6 The cyclization of D-Glucose


  • 7

    Generally speaking, pyranose is the dominant form due to the presence of added ring strain for

    the furanose cycle, giving the furanose forms typical mass percentages between 2 and 15%.

    1.4.2 Variations in monosaccharides: the α vs β and D vs L

    Due to the hemiacetal formation, a new stereocenter is created, leading to the formation of two

    diastereoisomers referred to as alpha and beta. The newly formed hydroxyl function can be

    either cis or trans relative to the C6 carbon (Figure 1.7). A mixture of

    both will be formed with the ratio depending on their relative free energies of formation. The α

    and β denomination is defined based on the mutual relationship: trans and cis respectively for

    the alcoholic functions on the C1 and the C5 carbon. In the Haworth-projection, for all D-type

    sugars, the α-type monosaccharides have the anomeric hydrogen in the upward position. For the

    L-type monosaccharides it will be in the downward position.

    The D and L notation refers to the Fischer representation and the orientation of the hydroxyl

    function indicated in blue (Figure 1.8). The D-monosaccharides are generally the most

    abundant in Nature. There are only a handful of exceptions (such as rhamnose). The L-type sugar

    is the mirror image of the D type; both are depicted in Figure 1.8 for glucose. As it is not possible

    to differentiate between two mirror images using NMR, the D-type monosaccharide will (unless

    otherwise mentioned) always be investigated.

    Figure 1.7 The alpha (left) and beta (right) D-Glucose form

    Figure 1.8 D (left) and L (right) glucose

  • 8

    1.4.3 Variations in conformations

    Throughout this project, two different representations will be used. The Haworth projection will

    often be used to compare the structure and stereochemistry of different monosaccharides. However,

    multiple conformations still remain possible for a single Haworth projection. Pyranoses may

    occur in two chair conformations2, and when these are important, both will be shown.

    The most stable chair conformation will be the one where a majority of hydroxyl functions are

    positioned in the equatorial plane and are not axial on the ring. This is due to the fact that axial

    groups have steric interactions with each other, destabilizing the molecule. Therefore the

    chair conformation with the smallest axial groups (often a hydrogen atom) is expected to be the

    more stable conformation.3 Figure 1.9 illustrates both chair conformations for β-D-glucose; the

    left chair conformation is expected to be more stable as it has four hydroxyl functions and C6 in

    the equatorial plane while the right conformation only has hydrogens in the equatorial plane.

    1.5 Introduction towards TOCSY

    TOCSY, or TOtal Correlation SpectroscopY, is an NMR technique where magnetization associated

    with a particular spin can be passed along the entire spin system of which it is part in a molecule.

    Transfer also occurs between protons that are not directly scalar coupled, as long as there are scalar couplings

    between intervening protons in the spin system. In polysaccharides, every monosaccharide unit

    defines such a separate spin system as the glycosidic bond involves too many bonds to allow for

    significant scalar coupling to occur between protons of neighboring monosaccharides.

    _________________________ 2 Other conformations exist as well, yet they have higher relative energies (boat conformation, twist-boat conformation).

    3 One must also take the anomeric effect into account. For certain monosaccharides it might be less apparent which chair conformation shows a lower energy. Multiple other effects might be required to be taken into account.

    Figure 1.9 The Haworth projection and both chair conformations of β-D-glucopyranose

  • 9

    Considering the chemical structure of monosaccharides, the scalar coupling networks mostly

    involve 3JHH scalar couplings between vicinal hydrogens in the spin system, resulting in a linear

    network (Eq. 1). The transfer rate of magnetization within such a network is dependent on two

    main factors: the mixing time (the time that is given to pass on the magnetization) and the

    coupling constant values between each of the vicinal hydrogens (3JHH). The coupling constant will

    determine the magnetization transfer speed. Small couplings will reduce the speed of

    magnetization transfer, while a large 3JHH coupling constant will lead to rapid transfer.

    Due to the cyclic nature of monosaccharides, a monosaccharide diastereomer will be in one of

    the two chair conformations (for the pyranose form), meaning the dihedral angles between the

    vicinal hydrogens are fixed and not prone to averaging over conformations. As the dihedral angles

    determine the size of the coupling, the set of coupling constants along the network is therefore

    also conformation specific. This has the result that each monosaccharide is expected to have a

    specific set of 3JHH-coupling constants between each of the vicinal hydrogens. This leads to a

    characteristic set of couplings at which the magnetization can be passed along from the H1 to H2,

    to H3 and so on. For a hexose monosaccharide, the following couplings would be obtained

    starting from the anomeric hydrogen:

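    $$H_1 \xrightarrow{^3J_{12}} H_2 \xrightarrow{^3J_{23}} H_3 \xrightarrow{^3J_{34}} H_4 \xrightarrow{^3J_{45}} H_5 \begin{cases} \xrightarrow{^3J_{56'}} H_{6'} \\ \xrightarrow{^3J_{56}} H_6 \end{cases}$$

    Eq. 1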

    This means the appearance of a signal or TOCSY correlation for H2 through TOCSY transfer from

    H1 is to a good approximation only dependent on the mixing time and the $^3J_{12}$ coupling. The

    signal of H3 is dependent on both the $^3J_{12}$ and the $^3J_{23}$ coupling and so on. Differences in scalar

    coupling values along the transfer path from H1 to H6 will lead to a different pattern and intensity

    Figure 1.10 Comparison of a 1D-sel TOCSY of the non-anomeric region of α-glucose with different mixing times (blue:43ms; green: 95ms)

  • 10

    of TOCSY correlations depending on the particular monosaccharide. When a particular 3JHH is

    small, this will lead to a transfer bottleneck, preventing significant transfer of magnetization to

    the remainder of the spin system. This is for instance the case in α-D-mannopyranose (Figure

    1.11), where the $^3J_{12}$ scalar coupling is very small (approximately 1 Hz), causing a bottleneck for the

    entire spin system. Unfortunately, this bottleneck effect will make it hard to analyze certain

    monosaccharides that have an identical conformation before the bottleneck in the spin system

    (see 5.4.2).
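
    The bottleneck effect described above can be illustrated with a minimal sketch. It assumes an idealized, isolated two-spin system under ideal isotropic mixing, for which the transferred fraction is sin^2(pi * J * tau). Relaxation and all further coupling partners in the monosaccharide spin system are ignored, so the numbers are indicative only. The function name and the 8 Hz comparison value are illustrative choices, not taken from this work.

```python
import math


def two_spin_transfer(j_hz, mixing_time_s):
    """Fraction of magnetization transferred between two coupled spins.

    Idealized isotropic mixing for an isolated two-spin system:
    transfer = sin^2(pi * J * tau). Relaxation and additional
    coupling partners are deliberately ignored (assumption).
    """
    return math.sin(math.pi * j_hz * mixing_time_s) ** 2


if __name__ == "__main__":
    # Hypothetical couplings in Hz: ~1 Hz mimics the small 3J12 of
    # alpha-D-mannopyranose mentioned in the text, 8 Hz is an
    # illustrative large coupling.
    for label, j_hz in [("small coupling, 1 Hz", 1.0), ("large coupling, 8 Hz", 8.0)]:
        for tau_ms in (20, 40, 60):
            fraction = two_spin_transfer(j_hz, tau_ms / 1000.0)
            print(f"{label}, mixing time {tau_ms} ms: transferred fraction {fraction:.3f}")
```

    In this simplified picture a 1 Hz coupling passes on less than 10% of the magnetization even at a mixing time of 100 ms, whereas an 8 Hz coupling approaches complete transfer around 60 ms.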

    The size of all the individual couplings therefore has a big influence on how the TOCSY pattern

    changes upon varying the mixing time. As previously mentioned, the size of the individual

    coupling is dependent on the dihedral angle between the vicinal hydrogens. The Karplus relation

    describes how the coupling between vicinal hydrogens (3JHH) depends on this angle:

    $$^3J_{HH} = A\cos^2\theta + B\cos\theta + C$$ Eq. 2

    The parameters A, B and C are 7.76, -1.1 and 1.4 Hz respectively for unsubstituted monosaccharides (Haasnoot,

    Deleeuw, & Altona, 1980). As the A, B and C are known, the only unknown variable of the 3JHH-

    coupling is the dihedral angle (the concept of a dihedral angle is shown in Figure 1.13).
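
    As a quick check of Eq. 2, the sketch below evaluates the Karplus relation with the parameters quoted above. The dihedral angles of 60° and 180° are assumed here as idealized values for gauche and trans-diaxial vicinal hydrogens in a chair; real rings deviate from these values, and the function name is an illustrative choice.

```python
import math

# Karplus parameters quoted in the text (Haasnoot et al., 1980), in Hz.
A, B, C = 7.76, -1.1, 1.4


def karplus_3jhh(theta_deg, a=A, b=B, c=C):
    """Return 3J_HH in Hz for a dihedral angle given in degrees (Eq. 2)."""
    theta = math.radians(theta_deg)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


if __name__ == "__main__":
    # Idealized chair angles (assumption): 60 deg for gauche hydrogens,
    # 180 deg for trans-diaxial hydrogens; 90 deg gives the minimum region.
    for theta_deg in (60, 90, 180):
        print(f"theta = {theta_deg:3d} deg -> 3JHH = {karplus_3jhh(theta_deg):.2f} Hz")
```

    Under these assumptions a trans-diaxial pair gives a coupling of roughly 10 Hz and a gauche pair roughly 3 Hz, which is the contrast that makes the set of couplings conformation specific.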

    Using the TOCSY experiment, the rate at which magnetization propagates to the other hydrogen

    atoms in the spin system can be mapped. Different correlations will be visible depending on the

    mixing time due to the bottlenecks. In order to compensate for this and to achieve a high

    Figure 1.12 The Karplus-relation

    Figure 1.13 Dihedral angle between four atoms

    Figure 1.11 α-D-Mannopyranose; note the angle between the H1 and H2 will cause a bottleneck

  • 11

    signal to noise ratio for all signals, high mixing times (100ms) are most often used. However, due

    to technical limitations of the hardware (mostly the probe); a safety limit on the mixing time has

    been set to 130ms.

    1.6 Content of this project

    Chapter two contains a short description of other techniques used to analyze monosaccharides

    and their (dis)advantages, including NMR and the current TOCSY-matching approach. In chapter

    three, a new experimental setup is proposed in order to reduce the setup time of the NMR

    experiments and automate the setup further, using two different experimental methods

    (pseudo 2D and pseudo 3D). In order to automate the processing and provide a quantitative

    comparison between monosaccharides, a fully automatic processing script was written in Python

    (chapter four). It must be noted that chapter four is very technical, as it describes the

    processing in detail. Next, chapter five compares the measured monosaccharides against each

    other in a reference database. The thesis is concluded with a few case studies in chapter six

    and a conclusion in chapter seven.

  • 12

    Chapter 2: Current analysis methods

    There are many simple methods to quantify carbohydrates in substances such as food. However,

    methods having identification of chemical structure and composition as a goal are few and more

    involved. Depending on the chain length (polysaccharides or oligosaccharides), techniques

    involve partial or total hydrolysis in order to split oligosaccharides into the individual

    monosaccharides and require an extended knowledge of organic chemistry. When total

    hydrolysis is used, information on the glycosylation pattern and overall structure is lost. Some

    techniques also require modification of the saccharides, which adds another step to the analysis

    process.

    2.1 Mass Spectroscopy methods

    Using mass spectroscopy, it does not seem trivial to differentiate between diastereoisomers

    as these have an identical molar mass. However, it has been proven to be possible to differentiate

    between three different hexoses (glucose, galactose and mannose) using ESI-ITMS in positive ion

    mode. It was however not possible to differentiate between α and β monosaccharides (Zhu &

    Sato, 2007). In this research only the monosaccharide composition was investigated, no attempt

    was made to determine the saccharide sequence and the nature of the glycosidic linkages.

    However it has been proven possible to do so using MS-MS. The data given by Zhu shows that

    the technique requires extended use of wet chemistry and also lacks information on which

    monosaccharides can be differentiated, as the number of monosaccharides investigated was

    fairly limited.

    Using GC-MS, it is possible to do a full but destructive analysis of monosaccharide composition.

    In order to do step I of the analysis (section 1.3), a full hydrolysis is however required. This can

    be done with hydrochloric acid for example. After the hydrolysis, many derivatization methods

    can be applied, such as silylation and fluoroacylation (Sassaki & Souza, 2013) to enable GC-MS

    analysis. The identity is typically confirmed by using suitable monosaccharide reference

    standards.

  • 13

    2.2 HPLC methods

    This approach shows similarities to GC-MS as it also requires hydrolysis in the case of oligo- or

    polysaccharides. Once a raw mixture of monosaccharides is obtained; the retention time of each

    monosaccharide is determined using chromatography. The retention time is monosaccharide

    specific giving a method for monosaccharide identification (Saddic & Ebert, n.d.). The exact

    recipes for this analysis can be found in the paper of Saddic as this method uses ‘wet chemistry’.

    Similar to MS methods, this technique is destructive for the sample yet it is able to differentiate

    between multiple monosaccharides.

    2.3 NMR standard methods

    Upon the analysis of a single monosaccharide sample, a 1D proton NMR will be sufficient. But

    once the sample consists of either multiple monosaccharides or oligo-/polysaccharides, a 1D

    proton NMR spectrum will generally no longer suffice as there will be a significant amount of

    spectral overlap between the signals of the different units. Unfortunately, chemical shifts are not

    completely compound specific, as they are dependent on the chemical environment as well. As

    such their usage is not advised for the identification, except for the anomeric proton.4 The

    anomeric proton has a high chemical shift (4.5-6 ppm) due to the vicinity of the alcohol and ether

    oxygen atom. The rest of the hydrogens on the monosaccharides are all in the same area (3-4.2

    ppm) of the spectrum and as small deviations can occur, it is not possible to use this for a correct

    annotation of the spectra.
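
    The two chemical shift regions mentioned above can be expressed as a simple lookup, sketched below. The region limits are the ones quoted in the text; the function name, the "unassigned" label for shifts outside both regions and the example shift values are illustrative assumptions, not measured data.

```python
# Chemical shift regions quoted in the text, in ppm.
ANOMERIC_RANGE_PPM = (4.5, 6.0)
NON_ANOMERIC_RANGE_PPM = (3.0, 4.2)


def classify_shift(delta_ppm):
    """Assign a 1H chemical shift to the anomeric or non-anomeric region."""
    low, high = ANOMERIC_RANGE_PPM
    if low <= delta_ppm <= high:
        return "anomeric"
    low, high = NON_ANOMERIC_RANGE_PPM
    if low <= delta_ppm <= high:
        return "non-anomeric"
    # Shifts outside both quoted regions are left undecided (assumption).
    return "unassigned"


if __name__ == "__main__":
    # Hypothetical peak positions, for illustration only.
    for delta_ppm in (5.2, 4.6, 3.8, 3.4, 4.3):
        print(f"{delta_ppm:.1f} ppm -> {classify_shift(delta_ppm)}")
```

    Such a lookup only separates the anomeric signals from the crowded bulk region; it cannot annotate the individual non-anomeric hydrogens, which is the limitation the text points out.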

    _________________________ 4 The anomeric proton is responsible for the alpha or beta notation; it is indicated with H1.

    Figure 2.1 1D HNMR spectrum of Galactose

  • 14

    For these more complex samples, two dimensional experiments are required. Using the classical

    homonuclear (only one type of atom is measured in two dimensions: COSY, TOCSY,

    NOESY/ROESY) and heteronuclear experiments (HSQC, HMBC involving 1H/13C) it is possible to

    identify monosaccharides. This process requires first a full annotation of the spectrum, after

    which the conformational analysis will determine which monosaccharide has been measured. As

    the usage of chemical shift is not possible for non-anomeric hydrogens, the usage of scalar

    coupling constants between the different hydrogens and heteronuclear experiments such as the

    HSQC and HMBC are required in order to complete the structural analysis. To show the

    complexity of this process, a possible workflow is shown in Figure 2.2. This process

    for monosaccharide and polysaccharide determination has been described extensively by Guus

    & Gotfredsen and will not be repeated in this project.

    2.4 The TOCSY-Matching approach (Gheysen et al.)

    As the current NMR-technique to identify different carbohydrates is

    very extensive, time consuming and requires manual examination of

    the data, Gheysen et al. have proposed a new approach called TOCSY-

    matching. While TOCSY-matching does not allow a full annotation of

    the spectra, it offers a way to quickly identify the monosaccharide

    content, considerably facilitating a full annotation in the

    subsequent analysis. The TOCSY-matching approach consists of

    Figure 2.2 NMR flowchart for monosaccharide determination (Touckach, 2013)

    Figure 2.3 Gheysen determination table (60ms)

  • 15

    taking one 2D-TOCSY at a chosen mixing time (often 100ms) of an oligosaccharide (or a mixture

    of monosaccharides). The TOCSY trace from the anomeric signals is analyzed either along the F1

    or the F2 dimension (the 2D TOCSY is symmetrical by design and both directions should bring an

    identical result although there is a resolution difference).

    Using this it is possible to compare the peak intensity per monosaccharide with the diagrams

    provided by Gheysen (the 60ms chart is shown in Figure 2.3). These were made using multiple

    2D TOCSY spectra for each monosaccharide (at 30, 60 and 100ms mixing time) in the matching

    charts. The monosaccharide in Figure 2.4 shows one intense peak (the diagonal peak) and four

    medium intensity peaks. Comparison with the diagram shows that this

    monosaccharide in the sucrose sample is one of the two glucose anomers. However, it is not possible to

    determine the stereochemistry of the anomeric hydrogen (α or β) in this case. To

    differentiate between α- and β-glucose, a second 2D-TOCSY

    must be recorded at a mixing time of 30ms. Only then will we be able to correctly identify α-D-

    glucose. While taking extra 2D TOCSY spectra at different mixing times works in the case of

    glucose, none of the mixing times used in the matching charts enable us to differentiate between

    the α and β form of galactose for instance.

    There are three main concerns about this implementation of the TOCSY-matching approach.

    First, it will only become apparent during the analysis and the processing of the spectra that an

    Figure 2.4 A 2D-TOCSY spectrum of sucrose (700MHz; 60ms). The signal corresponding to the anomeric hydrogen and its cross peaks have been indicated.

  • 16

    extra 2D TOCSY is required. This can cause delays in the analysis of the entire sample5. Even after

    recording the extra 2D TOCSY spectrum, it may still be impossible to differentiate between

    the monosaccharides. The second and main concern is the limited number of monosaccharides

    investigated previously. As a result, it is unknown if monosaccharides not present in the charts

    might show identical patterns. Xylose for instance, a pentose, might show identical peaks in the

    charts and be misidentified as glucose. Finally, a third item is that the method only provides a

    qualitative determination and does not provide a quantitative analysis. The analysis is therefore

    prone to user bias.

    2.5 Comparison of the techniques

    While multiple techniques do exist, all have their own advantages and disadvantages. Both mass

    spectroscopy and HPLC are able to analyze a mixture of monosaccharides and the chemical

    composition of polysaccharides (step I). It is not possible, however, to analyze the glycosidic bonds

    interconnecting the monosaccharides in an oligo- or polysaccharide.6 The main advantage of

    both these techniques is the low sample amount required. NMR is able to identify every

    monosaccharide and the correct glycosidic bond, yet it requires a much higher amount of sample.

    A second disadvantage of NMR is the high cost of an NMR apparatus.

    Another important factor that must be taken into account is that it is unknown which

    monosaccharides can all be differentiated from each other. Different mass spectroscopy

    methods and chromatography methods often include three or four monosaccharides in order to

    see if the technique is able to differentiate between these and are a perfect proof of concept. A

    universal identification method in order to identify any monosaccharide (independent of whether

    or not it is in a polysaccharide) is not always possible or does not exist, apart from a full analysis

    using NMR, without the TOCSY-matching approach. It is our goal to reinvestigate the TOCSY-

    matching approach in order to: expand the coverage of different mixing times, expand the

    number of monosaccharides investigated, automate the processing and explore the option for a

    quantitative monosaccharide analysis.

    _________________________ 5 In the case that extra measurement time has to be requested.

    6 It is possible to determine the order of monosaccharides, yet not by which hydroxyl function they are interconnected. For this, NMR is required.

  • 17

    Chapter 3: Designing a new experiment

    3.1 TOCSY pulse sequence

    The default pulse program for a TOCSY experiment can be seen in Figure 3.1.7 The fid gives us a

    frequency dimension after Fourier transformation and is often referred to as the direct

    dimension. In order to obtain the extra information generated by the TOCSY sequence, two main

    variants exist. In the 2D experiment the t1 time period is systematically increased, yielding a

    series of 1D measurements from which the indirect dimension may be constructed and

    subsequently Fourier transformed. The second option is to change the 90° universal pulse to a

    selective pulse while keeping the t1 fixed resulting in a 1D selective TOCSY. This selective

    excitation region can be placed anywhere in the 1H spectrum, resulting in the TOCSY cross

    correlation pattern of the selected hydrogen.

    3.2 A novel experimental setup

    For the new experimental setup, the transfer of magnetization from the anomeric signal through

    the entire spin system must be sampled for each saccharide. This would require a setup of

    multiple 2D-TOCSY experiments, covering a mixing time of 0 to approximately 100ms and

    therefore a high amount of measurement time. Indeed, depending on the instrument resolution

    and sensitivity, a single 2D spectrum requires between 2 and 12 hours to record for a single

    TOCSY mixing time. It should be noted however, that most of the time spent is used to record

    250 to 500 1D experiments to sample the indirect time domain t1 of the 2D spectrum, so as to

    achieve the required resolution for analysis of the TOCSY traces along F1. Repeating this for about 10

    different mixing times would be prohibitively long. This can in principle be alleviated by using 1D-

    selective TOCSY experiments. Here an individual anomeric resonance is excited and its

    magnetization is then channeled in the TOCSY sequence. This results in direct generation of the

    _________________________ 7 d1: relaxation delay; t1: time period, corresponds to the F1 axis after Fourier transform; fid: free induction decay.

    Figure 3.1 Default pulse program for a 2D TOCSY


    TOCSY trace in a 1D spectrum for that particular anomeric signal. The measuring time is now

    reduced to that of a single 1D experiment repeated for as many mixing times as one wishes to

    sample to record the TOCSY transfer. Thus only about ten 1D spectra are now required. However, this

    needs to be repeated for each individual monosaccharide. This notwithstanding, using multiple

    1D-selective TOCSY experiments with changing mixing time per monosaccharide reduces the

    measurement time compared to the 2D approach. To facilitate analysis the associated pulse

    program was set up so that all 1D selective TOCSY measurements for a particular monosaccharide

    are recorded in a single experiment, generating a 2D-like presentation with one frequency and

    one mixing time axis. However, as this needs to be repeated for every anomeric signal, a loop is

    added so that each anomeric signal is targeted one by one, generating a series of 2D data sets in a single

    file. This will be referred to as the p2D sequence.

    Unfortunately, a minimum separation between individual anomeric resonances needs to occur

    throughout, a condition which is generally not satisfied. In this case, the TOCSY traces of two or

    more signals will overlap, compromising analysis. In order to address this issue, a band selective

    version of the 2D TOCSY was developed, which combines the best of both worlds at little to

    no extra time cost. The resulting p3D sequence thus generates a 3D spectrum consisting of two

    frequency axes and a mixing time axis. Information on how to set up either the p2D or

    the p3D experiment can be found online on GitHub,⁸ an online code repository.

    Pulse programs, vclists and the processing software have been made fully available for download

    in this environment.

    3.3 Achieving selectivity

    As previously mentioned, the TOCSY experiment consists of a 90° pulse followed by a spin-lock

    sequence. To limit the range of signals along the F1 dimension to the anomeric signals and

    achieve a suppression of all other peaks as shown in Figure 3.3, it is sufficient to avoid exciting

    the signals outside the region of interest prior to the t1 evolution period of the TOCSY spectrum.

    To achieve this, two options present themselves:

    _________________________ 8 https://github.ugent.be/ydandois/Thesis-Source-Code or https://github.com/FramedYannick/Thesis-Source-Code



    It is possible to give a highly selective 90° pulse, as is done in the standard selective TOCSY

    experiment. However, using this technique for both the p2D and p3D introduces off-resonance effects

    that cause phasing problems and non-uniform excitation, especially for the p3D

    experiment where the anomeric hydrogen excitation region is quite wide. A second problem is

    that these selective pulses often have a Gaussian (or similar) shape, so that

    they will still excite non-anomeric hydrogens (causing aliasing). Highly selective 90° pulses can

    be useful for the p2D experiment but not to excite an entire band from the 2D TOCSY plane as is

    done in the p3D experiment.

    A second possibility is to give a hard 90° pulse followed by a short delay, after which a

    selective 180° refocusing pulse is applied to the anomeric region only, followed by a delay of

    equal length. This will result in all magnetization evolving in the xy-plane, yet only the

    magnetization of the inverted anomeric hydrogens is refocused by the selective inversion pulse.

    By bracketing this pulse between two pulsed field gradient pulses, only the inverted signals will

    Figure 3.3 2D-TOCSY spectrum of JD #1 (100ms – 700MHz) with a superposed band-selective indication. The anomeric hydrogen region on the diagonal has been indicated.

    Figure 3.2 An overlay of a regular 1D 1H NMR spectrum (blue), a 1D 1H NMR spectrum using a selective 90° pulse centered on the anomeric signal (red; water is still partly excited) and a 1D 1H NMR spectrum using the selective spin-echo to excite the entire anomeric region (green). To improve the result of the selective 90° pulse, the pulse can be lengthened to increase selectivity and further reduce the intensity of the water signal.


    yield a coherent signal after the spin-lock sequence, with refocusing of chemical shift evolution.

    The non-anomeric hydrogen signals remain in the xy-plane; however, they are not refocused and

    the gradients will reduce their signal contribution to almost zero. This technique is referred to as

    gradient enhanced selective spin-echo.

    Both excitation methods are compared in Figure 3.2 using a standard 1D 1H NMR (i.e. without

    TOCSY). The 27 ms selective 90° pulse (red spectrum) was applied to the anomeric

    hydrogen of β-galactose; the α-galactose anomeric hydrogen is not excited. Yet the solvent

    peak (water) is still present and out of phase. Making the 90° selective pulse longer will increase

    the selectivity but also the phase distortion of the selectively excited signal. For band-selective

    excitation, a shorter selective 90° pulse is required, but this causes non-uniform excitation and

    the need for large phase corrections. Therefore, the 180° refocusing pulse approach is preferred.

    Using a 15 ms 180° BURP pulse, both the α- and β-anomeric hydrogens can be seen with the expected relative

    intensity and without phasing problems. Both techniques show no signals of the non-anomeric

    hydrogens, and as such folding and aliasing will be prevented in our band-selective TOCSY

    experiment. As it is our goal to have a general experiment with limited setup, the spin-echo

    technique is used throughout. It offers us the possibility to create a generally applicable method

    for assigning all monosaccharide units from their anomeric resonance, even when the chemical

    shifts of these anomeric hydrogen resonances change due to the precise chemical circumstances

    of the sample studied. Because the spin-echo sequence can be used with much longer

    pulses (and wider refocusing regions) without any phasing problems (as seen in Figure 3.3), the

    spin-echo has the clear advantage for the p3D experiment.

    3.4 p2D-sel TOCSY

    In principle, the 1D-sel TOCSY is recorded by selectively applying a 90° pulse to one anomeric hydrogen

    and executing a spin-lock, which prevents resonances from evolving under chemical shift in the xy-plane,

    using either the MLEV or DIPSI pulse sequence, while keeping the t1 parameter fixed. This will

    cause the magnetization to be passed on through the network of vicinal hydrogens. An example of

    a 1D-sel TOCSY using the spin-echo technique mentioned previously can be found in Figure 3.4,

    together with a 1D proton spectrum.


    With the goal of combining multiple selective TOCSY experiments with varying mixing times on the same

    anomeric resonance into one single experiment, a pseudo 2D selective TOCSY pulse program was

    set up as a 1D selective TOCSY with one extra loop over the mixing time. The spectrometer

    only requires one setup per experiment, so setup time is reduced in comparison to the

    previous workflow. For this, two items must be combined:

    The 1D experiments must loop over all required mixing times. Gheysen et al. showed that

    the DIPSI-2 and MLEV sequences are equally effective for the TOCSY transfer. However, because

    the DIPSI-2 sequence can be combined with zero-quantum filtering and as such provides cleaner

    spectra for an automated analysis (Figure 3.4), the DIPSI-2 sequence was used throughout this

    project.⁹

    The mixing times used were fixed to the same series of values (in milliseconds) for each

    experiment, facilitating an automatic comparison later on:¹⁰

    8.63, 17.26, 25.90, 34.53, 43.16, 51.80, 60.43, 69.06, 77.70, 86.33, 94.96, 103.60, 112.23

    These values are not chosen arbitrarily: the length of the 90° pulse (p6) used in the DIPSI-2 sequence was

    set to 25 µs and the DIPSI-2 cycle was executed three times per step, resulting in steps of 8.63ms.
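
    The origin of the 8.63ms step can be checked with a short calculation. The sketch below assumes that the spin-lock consists only of complete DIPSI-2 cycles (R, -R, -R, R, with the standard composite element R of 2590° total flip angle) and ignores any z-filter delays or trim pulses in the actual pulse program. The function name and its arguments are illustrative and are not taken from the thesis code.

```python
# Flip angles (degrees) of the standard DIPSI-2 composite element R.
DIPSI2_ELEMENT_DEG = (320, 410, 290, 285, 30, 245, 375, 265, 370)


def dipsi2_mixing_times_ms(p90_us=25.0, cycles_per_step=3, n_steps=13):
    """Return the spin-lock durations (ms) for 1..n_steps increments.

    p90_us          -- length of the 90 degree spin-lock pulse (p6) in microseconds
    cycles_per_step -- number of full DIPSI-2 cycles added per increment
    n_steps         -- number of mixing times in the series
    """
    # One full cycle is R, -R, -R, R: four elements of 2590 degrees each.
    cycle_deg = 4 * sum(DIPSI2_ELEMENT_DEG)
    # Convert the total flip angle to a duration via the 90 degree pulse length.
    cycle_ms = cycle_deg / 90.0 * p90_us / 1000.0
    step_ms = cycles_per_step * cycle_ms
    return [k * step_ms for k in range(1, n_steps + 1)]


if __name__ == "__main__":
    for t in dipsi2_mixing_times_ms():
        print(f"{t:.2f} ms")
```

    Under these assumptions one cycle lasts about 2.88ms, so three cycles give the 8.63ms increment and thirteen increments reach roughly 112ms. Small differences in the last digit with respect to the list above come from rounding.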

    _________________________ 9 This can clearly be seen in Figure 3.4 upon comparison of the triplet (3.3ppm) and doublet of doublets (3.45ppm) in the MLEV-17 (blue) and DIPSI-2 (red) spectra.
    10 Using curve interpolation, it is possible to make the processing uniform, as long as enough mixing times are used. This, however, complicates the processing and increases the experimental error on the data. Therefore these mixing time values were also used in the p3D experiment.

    Figure 3.4 1D proton NMR of sucrose (green) and 1D-sel TOCSY of sucrose (100ms) with selective excitation of the anomeric hydrogen, using both the DIPSI-2 (red) and MLEV-17 (blue) sequences at 700MHz. Only the non-anomeric region is shown. The effect of the z-filter in the DIPSI-2 sequence is clearly visible around 3.45ppm upon comparison of the DIPSI-2 and MLEV-17 sequences.


    Since the p2D experiment does the selective excitation per anomeric hydrogen (meaning per

    monosaccharide) present in the sample, it must loop over all hydrogen frequencies in the

    anomeric region of the spectrum. Due to this, the time required for the p2D version to be

    measured scales linearly with the number of monosaccharides in the sample (approximately 20 min per

    monosaccharide for default resolution and sixteen scans). To this end, an extra loop over all the

    frequencies to be excited is added. These frequencies are stored in the fqlist parameter and

    must be set by the operator. In principle, this step could also be automated by introducing

    a 1D proton scan and automatic peak determination. However, manual examination of the hydrogen spectrum

    for the setup remains highly advised as errors would have a high impact on the spectra in this

    stage.

    It would also be difficult to automatically predict the selective pulse length required for selective

    excitation of a certain band-width, as this depends on the distribution of signals in the anomeric

    region. Higher selectivity requires longer pulse lengths. This means that a manual setup is

    required for each signal anyway. This also allows the receiver gain, which amplifies the signal

    during detection, to be optimized for each individual anomeric signal (using ‘rga’). This is

    especially of interest when monosaccharide units appear with different S/N ratios in the spectrum.

    As the most intense signal defines the gain, the monosaccharide with the lowest signal

    contribution may have a less than optimal receiver gain resulting in a reduced signal to noise ratio

    in comparison to what can be achieved using the hardware with optimal settings. In the view of

    this and the subsequent development of the p3D sequence, it was decided not to automate this step

    of the acquisition and the user must manually enter the different frequencies.

    Figure 3.5 A 3D (left) and 2D (right) representation of a p2D-sel TOCSY of β-glucopyranose (the time domain axes are scaled incorrectly)


    Extensive documentation and details on how to set up an experiment can be found in the

    GitHub repository. The total measurement time of this experiment depends on the number of

    frequencies set in the fqlist, meaning the more monosaccharides in the mixture that should be

    analyzed, the longer the experiment will take (the relation is linear). This approach

    works well for samples where the number of monosaccharides is limited and the anomeric peaks

    are well separated. Unfortunately, upon overlap of the anomeric hydrogen peaks in the spectra,

    it will not be possible to be selective between different anomeric hydrogens and an automated

    analysis will no longer be possible. Selective excitation of an anomeric hydrogen peak will always

    excite a region around the anomeric hydrogen. If two signals overlap, the second anomeric

    hydrogen signal will contribute to the signal measured after selective excitation. As this will

    influence the profile of the signals over different mixing times, the processing software will not

    be able to identify the correct monosaccharide. In such cases, it is advised to use a different

    spectral approach, namely the p3D-Bsel TOCSY.

    3.5 p3D-Bsel TOCSY

    In Figure 3.3 a 2D TOCSY spectrum of JD#1 is depicted.¹¹ Upon recording this spectrum at multiple

    mixing times, the same experimental results will be achieved as in the p2D-sel TOCSY experiment.

    But as this would be an array of 2D experiments, it would consume a great deal of measurement time.

    As the blacked out regions in the regular 2D TOCSY spectrum are regions not required for the

    automated analysis of the spectra, the parameters are set so as to measure only the white part

    of the spectrum. The resolution in the F1 dimension depends on

    two items: the spectral width that must be covered and the number of sampled points. The

    former is typically 7 to 8 ppm, while the latter varies from 250 to 500, leading to minimal and

    maximal resolutions (before zero filling) of 31.25 points per ppm and 71.43 points per ppm

    respectively. Limiting the sampling along F1 to a band of 1ppm (shown as the white region

    in Figure 3.3) reduces the required number of points in the F1 dimension by a factor of 7-8 while

    keeping the same resolution. Also, as signals are more dispersed in the anomeric region (in

    comparison to the non-anomeric region) the resolution can be further reduced.
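
    The arithmetic behind these numbers can be written out in a few lines. The sketch below only restates the figures quoted above (250 to 500 increments over 7 to 8 ppm, band of 1ppm). The function names are illustrative and do not come from the thesis code.

```python
import math


def points_per_ppm(n_points, spectral_width_ppm):
    """Digital resolution along F1 before zero filling."""
    return n_points / spectral_width_ppm


def points_for_band(n_points, full_width_ppm, band_width_ppm=1.0):
    """Increments needed to keep the same points-per-ppm over a narrower band."""
    return math.ceil(points_per_ppm(n_points, full_width_ppm) * band_width_ppm)


if __name__ == "__main__":
    # Minimal case: 250 increments over 8 ppm; maximal case: 500 over 7 ppm.
    for n_points, width in ((250, 8.0), (500, 7.0)):
        density = points_per_ppm(n_points, width)
        needed = points_for_band(n_points, width)
        print(f"{n_points} points / {width} ppm = {density:.2f} points per ppm; "
              f"a 1 ppm band needs {needed} increments")
```

    Since the measurement time of a 2D plane scales with the number of t1 increments, the reduction in increments translates directly into a reduction in measurement time per mixing time.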

    _________________________ 11 JD#1 is a sample containing a mixture of 3 different saccharides, resulting in 6 different monosaccharides.


    3.6 Conclusion: p2D-sel TOCSY versus p3D-Bsel TOCSY

    In general, the goals for a new experimental analysis were to (1) combine all experiments into

    one experimental setup, making not only the setup but also the analysis easier, (2) reduce the

    required measurement time through a selective approach and (3) create a uniform mixing time

    dimension for all experiments in order to facilitate automatic processing.

    Comparing the measurement time of the p3D-Bsel TOCSY with that of the p2D-sel TOCSY, it

    becomes immediately apparent that the measurement time of the p3D experiment is much

    longer (about 7 hours for a default resolution with a spectral window of 1ppm in the F1

    dimension). It would seem the p2D experiment has the advantage on samples with a low

    monosaccharide content (up to approximately 15 monosaccharides).

    But the p2D-sel TOCSY does have its limitations. Because it uses selective 90° pulses on

    each anomeric hydrogen, interference could present itself when two anomeric hydrogens have

    similar chemical shift. Both will be excited and this will cause problems for the processing

    software, which is only designed to process each monosaccharide one by one and is not using

    deconvolution. This choice was made as the p3D experiment solves this issue.

    Figure 3.6 Data resulting from a p3D experiment (JD sample). Data identical to the p2D experiment has been indicated.


    In conclusion, the time factor is of only little importance when choosing between the p2D and

    the p3D experiment. As long as all anomeric hydrogen signals are well separated in terms of

    chemical shift, the operator is better off using the p2D-sel TOCSY. The p3D-Bsel TOCSY only

    gains the advantage when the anomeric hydrogen region is more complex. As long as

    the peaks are slightly separated, the p3D experiment will be able to differentiate between the

    different monosaccharides due to the extra dimension. In chapter six both experiments will be

    demonstrated on a mixture of multiple monosaccharides and on oligosaccharides to show this in

    further detail.


    Chapter 4: Data processing

    The goal is to obtain monosaccharide specific curves of the peak intensities versus the mixing

    time independent of the experimental setup and the experiment itself. As two different

    experimental approaches were used, the data has to be put in a uniform format.

    The data is first manually processed using the NMR spectrometer software Topspin® in order to

    ensure optimal spectra.¹² The data is read in using NMRGlue, an open source Python framework

    (Helmus & Jaroniec, 2013) designed to be compatible with multiple NMR formats. Thanks to

    NMRGlue, the data can almost immediately be processed using Python.

    The format of the data depends on the experiment

    that was used. Both the p2D and p3D experiments contain

    data coming from multiple monosaccharides. This data is

    split up and reorganized into a format that is independent of the

    experiment, in which each unit only contains data corresponding to one

    monosaccharide. Such a unit is called a ‘chunk’.¹³ These chunks are

    analyzed completely separately from each other. In each

    chunk, the peak of the anomeric hydrogen and the

    correlations of the remaining hydrogens must be found, and their

    signals must be integrated over the width of the peak in the

    chemical shift dimension. This results in curves of the

    intensity as a function of the mixing time for each correlation

    and the anomeric signal. These curves are normalized,¹⁴

    after which various filters are applied to these curves in

    order to remove multiplets (doublets, triplets,

    quadruplets etc.). Chemical shift filters and threshold filters

    are also used in order to filter out remaining noise and

    _________________________ 12 More information on the processing using Topspin® can be found in the GitHub manual. https://github.ugent.be/ydandois/Thesis-Source-Code or https://github.com/FramedYannick/Thesis-Source-Code
    13 The chunkification is the only step that is different for the p2D and the p3D experiment.
    14 They will all be normalized towards the anomeric hydrogen, as this signal always has the highest intensity.

    Figure 4.1 Flowchart of the processing.



    solvent peaks. The remaining curves are monosaccharide specific and can be used for

    monosaccharide comparison (chapter five).
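
    The integration and normalization steps described above can be illustrated with a minimal NumPy sketch. It assumes a chunk stored as a 2D array with one row per mixing time, a fixed integration width around known peak positions and normalization to the maximum of the anomeric curve. All three are assumptions made for illustration, and the function names are hypothetical rather than those of the thesis software.

```python
import numpy as np


def buildup_curve(chunk, ppm_axis, center_ppm, width_ppm):
    """Integrate one peak at every mixing time.

    chunk    -- 2D array of shape (n_mixing_times, n_points)
    ppm_axis -- 1D array with the chemical shift of every point
    """
    mask = np.abs(ppm_axis - center_ppm) <= width_ppm / 2.0
    # Sum the intensities inside the integration window, row by row.
    return chunk[:, mask].sum(axis=1)


def normalized_curves(chunk, ppm_axis, anomeric_ppm, peak_ppms, width_ppm=0.05):
    """Return the anomeric curve and the correlation curves, scaled to the anomeric maximum."""
    anomeric = buildup_curve(chunk, ppm_axis, anomeric_ppm, width_ppm)
    scale = anomeric.max()  # assumed reference: largest anomeric integral
    curves = {ppm: buildup_curve(chunk, ppm_axis, ppm, width_ppm) / scale
              for ppm in peak_ppms}
    return anomeric / scale, curves


# Synthetic chunk: an anomeric signal at 5.0 ppm that decays with mixing time
# and one correlation at 3.5 ppm that builds up.
ppm_axis = np.linspace(6.0, 3.0, 301)
steps = np.arange(1, 14)
chunk = np.zeros((13, 301))
chunk[:, 100] = 10.0 * np.exp(-steps / 10.0)
chunk[:, 250] = 3.0 * (1.0 - np.exp(-steps / 10.0))

anomeric_curve, correlation_curves = normalized_curves(chunk, ppm_axis, 5.0, [3.5])
```

    Each resulting curve has one value per mixing time, which is the representation used for the monosaccharide comparison.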

    4.1 Chunkification

    Both the p2D and p3D experiments contain data of multiple monosaccharides. First this must be

    ‘chunkified’. A chunk is the amount of data corresponding to one monosaccharide. This step is

    the only step where the p2D and its p3D counterpart have different processing methods as the

    chunks are designed to be in an identical format for both experiments. This has the effect that

    the rest of the data processing will happen in an identical manner, independent of the type of

    experiment.

    Due to the way the data is saved (in Bruker’s Topspin software), the data arrays in the F1, F2 and

    F3 dimensions will always contain 2^n elements.¹⁵ This must be taken into account for the

    processing in the mixing time dimension, as our experiments only contain 13 elements in the

    mixing time dimension. The end of the array is padded with zero rows until a power

    of two is reached. For the p2D-sel TOCSY the chunks are stored sequentially: if two

    different frequencies were measured, rows 0-25 will be data and 26-31 will be zero rows.

    _________________________ 15 With n being a positive integer, this is due to the usage of the Cooley & Tukey FFT.

    Figure 4.2 p2D-sel TOCSY taken on alpha (bottom) and beta (top) glucose directly from Topspin


    For the p3D-Bsel TOCSY, we will always have thirteen different mixing times, meaning planes 13-15

    will be filled with zeros. Removal of this zero padding is important, as it will reduce

    calculation times upon manipulation of the data.

    4.1.1 Chunkification of the p2D-sel TOCSY

    As mentioned above, the p2D-sel TOCSY experiment is set up so that the different chunks

    always correspond to sequential mixing time elements in the array.¹⁶ As we measure with

    thirteen different mixing times, the first chunk consists of elements 0-12, the second chunk of

    13-25 and so on. Each new chunk therefore starts at a row index that is a multiple of 13.

    The chunkification of the p2D spectra is straightforward, as it is just

    slicing up the data array. This can be seen in Figure 4.2 where two chunks are present (rows 26-31,

    which are empty rows that pad the data up to a power of 2, are already removed from the figure).
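
    The slicing can be sketched as follows. The example assumes that the processed p2D data is available as a 2D NumPy array with one row per recorded 1D spectrum and that padding rows are exactly zero. The function name is hypothetical and the detection of the padding is one possible approach, not necessarily the one used in the thesis software.

```python
import numpy as np

N_MIX = 13  # number of mixing times per chunk


def chunkify_p2d(data, n_mix=N_MIX):
    """Split a p2D data array into one (n_mix, n_points) array per frequency.

    data -- 2D array of shape (n_rows, n_points); trailing rows may be zero padding
    """
    # Locate the last row that contains any signal; everything after it is padding.
    nonzero_rows = np.flatnonzero(np.any(data != 0, axis=1))
    n_rows = int(nonzero_rows[-1]) + 1 if nonzero_rows.size else 0
    n_chunks = n_rows // n_mix
    # Chunk i occupies rows i*n_mix .. (i+1)*n_mix - 1.
    return [data[i * n_mix:(i + 1) * n_mix] for i in range(n_chunks)]


# Two frequencies measured: rows 0-25 hold data, rows 26-31 are zero padding.
example = np.zeros((32, 8))
example[:26] = 1.0
chunks = chunkify_p2d(example)
```

    When the number of entries in the fqlist is known, it can be passed along instead of being inferred from the padding, which avoids relying on the padding rows being exactly zero.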

    4.1.2 Chunkification of the p3D-Bsel TOCSY

    For the p3D-Bsel TOCSY chunkification is less direct. The experimental data consists of three

    dimensions: the F3 dimension or the direct dimension, the F2 dimension or the mixing time

    dimension and the F1 dimension or the indirect dimension corresponding to the band-selective

    region. As the order of dimensions in the data matrix is not identical to that of the p2D experiment,

    _________________________ 16 As previously mentioned, the mixing time dimension is a pseudo dimension.

    Figure 4.3 The data structure of the p3D-Bsel TOCSY as a list of 2D-TOCSY after the dimension flip (the commas indicate that there are multiple elements in the array)


    the F2 and F1 dimensions need to be switched to ensure compatibility. After the switch, the p3D

    data will have F3 as the direct dimension, F2 as the indirect dimension and F1 as the mixing time

    dimension, similar to the p2D experiment.¹⁷ The layout of the data matrix can be seen in Figure

    4.3. Res stands for resolution, SW for spectral window and index stands for the index of the array

    (this is only used in formulas).
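
    A possible way to perform this switch and to drop the zero planes in one step is shown below. The sketch assumes that the p3D data is a 3D NumPy array whose axes are ordered (indirect frequency, mixing time, direct frequency) before the switch. This axis order is an assumption for illustration and should be checked against the array that is actually read in. The function name is hypothetical.

```python
import numpy as np


def p3d_to_plane_list(data, n_mix=13):
    """Reorder a p3D array so that the first axis is the mixing time.

    data -- 3D array, assumed shape (n_indirect, n_mix_padded, n_direct)
    Returns an array of shape (n_mix, n_indirect, n_direct): one 2D TOCSY
    plane per mixing time, with the zero-padded planes removed.
    """
    swapped = np.swapaxes(data, 0, 1)  # mixing time becomes the first axis
    return swapped[:n_mix]             # keep only the measured mixing times


# 64 indirect points, 16 mixing-time planes (13 measured + 3 padding), 128 direct points.
example = np.zeros((64, 16, 128))
example[5, 2, 7] = 1.0
planes = p3d_to_plane_list(example)
```

    After this step, `planes[k]` is the band-selective 2D TOCSY plane recorded at the k-th mixing time.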

    In order to know which part of the spectra contains the diagonal peaks and therefore the

    anomeric hydrogen signals, the diagonal must be extracted. To extract the diagonal, the chemical

    shift is calculated for each data point in the F2 dimension and then the index of the corresponding

    data point in the F1 dimension (with the obtained chemical shift) is calculated. Combining

    these two operations results in the following formula:¹⁸

    $$x = \frac{SO_1 - \tfrac{1}{2} SW_1 - \left( \frac{Res_2 - y}{Res_2} \cdot SW_2 + SO_1 - \tfrac{1}{2} SW_2 \right)}{SW_1} \cdot Res_1 + Res_1 \qquad \text{(Eq. 3)}$$
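
    The two operations behind Eq. 3 can be written out as a short function. The sketch assumes that index 0 corresponds to the highest chemical shift in both dimensions, that both dimensions share the same center SO1 (as in Eq. 3) and that the chemical shift of index y in F2 is (Res2 - y)/Res2 · SW2 + SO1 - ½SW2. Rounding to the nearest index and discarding points outside the F1 band are additions made for the example, and the function name is hypothetical.

```python
def diagonal_coordinates(so1, sw1, res1, sw2, res2):
    """Return the (x, y) index pairs that lie on the diagonal.

    so1        -- center of the spectral windows (ppm)
    sw1, res1  -- spectral width (ppm) and number of points in F1
    sw2, res2  -- spectral width (ppm) and number of points in F2
    """
    coords = []
    for y in range(res2):
        # Step 1: chemical shift of data point y in the F2 dimension.
        shift = (res2 - y) / res2 * sw2 + so1 - 0.5 * sw2
        # Step 2: index in the F1 dimension that has the same chemical shift.
        x = (so1 - 0.5 * sw1 - shift) / sw1 * res1 + res1
        xi = int(round(x))
        if 0 <= xi < res1:  # keep only points inside the F1 window
            coords.append((xi, y))
    return coords


# A 1 ppm band (64 points) in F1 against an 8 ppm window (512 points) in F2.
band = diagonal_coordinates(so1=5.0, sw1=1.0, res1=64, sw2=8.0, res2=512)
```

    When both dimensions have the same width and resolution the expression reduces to x = y, which is a convenient sanity check.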

    Looping over every index of the F2 dimension (index y) results in a list of coordinates of the

    diagonal (list of points (x, y) located on the diagonal). This is executed on the first plane in the

    p3D, as the lowest