
Monitoring of a carbon anode paste manufacturing process using machine vision and latent variable methods

Thesis

Doctorate in Chemical Engineering

Philosophiæ doctor (Ph.D.)

© Julien Lauzon-Gauthier, 2015


Résumé

The Hall-Héroult electrolytic reduction process is used for the industrial production of primary aluminium. This process requires carbon anodes. Uniform anode quality is very important for ensuring the stability and optimal performance of the electrolysis cells.

Unfortunately, anode manufacturers currently face increasing raw material variability. This is due to the declining availability of good-quality, low-cost raw materials. To compensate, anode manufacturers must diversify their suppliers, which increases variability. However, plants are not prepared to respond to this situation while maintaining stable anode quality. This is due, among other things, to a lack of real-time quantitative measurements of anode quality.

Several examples of industrial machine vision applications have been reported in the literature. There is therefore an opportunity to develop such a system to obtain a non-destructive, real-time measurement of anode paste quality. The sensor was developed using paste and pressed anodes produced at laboratory scale. A set of image texture features computed from the discrete wavelet transform (DWT) and gray level co-occurrence matrices (GLCM) was selected. These features were sensitive to variations in the formulation and in the amount of pitch in the paste. The sensor is also able to detect the optimum pitch demand (OPD) for different cokes. The sensitivity and robustness of the sensor were then tested with industrial paste.

Finally, plants already collect many real-time process measurements. These data can be used in a statistical monitoring strategy to detect and investigate quality deviations. A new multivariate latent variable method, sequential multi-block PLS (SMB-PLS), was developed to improve the interpretation of industrial data compared with the usual multi-block PLS methods. This method was also used to discuss the relevance of adding paste image features to a statistical model for monitoring process variability.


Abstract

The Hall-Héroult electrolytic reduction process used for industrial aluminium smelting relies on the consumption of carbon anodes. The quality and consistency of these anodes are very important for the stability and performance of the reduction cells.

Unfortunately, anode manufacturers currently face an increase in raw material variability. This is due to the declining availability of high-quality, low-cost and consistent materials on the market, which forces anode manufacturers to diversify their suppliers. However, anode plants are not prepared to compensate for this increase in variability while still maintaining consistent anode quality. There is a lack of real-time quality monitoring and control of the baked anode properties and of the most important raw material and process parameters.

Machine vision has been applied successfully in many industrial settings. There is therefore an opportunity to develop such a system to obtain a non-destructive, online measurement of anode paste quality. This sensor could then be used in a feedback/feedforward control strategy to attenuate unmeasured raw material and process variations.

The sensor was developed using laboratory-scale paste and pressed anodes. A set of image texture features computed from the discrete wavelet transform (DWT) and gray level co-occurrence matrix (GLCM) methods was selected. These features could capture variations in formulation, in the pitch ratio of the paste and in pitch demand. The sensor was also found to be sensitive to the optimum pitch demand (OPD) of two different cokes. The sensitivity and robustness of the sensor were then tested using industrial paste.

Finally, anode plants already collect some real-time process measurements and off-line raw material and baked anode properties that can be used to monitor and troubleshoot process and quality deviations. A new sequential multi-block PLS (SMB-PLS) method was developed to improve the interpretation of complex industrial datasets compared with the multi-block PLS methods already available. This method was also used to discuss the relevance of adding real-time paste image features to a statistical model for monitoring process variability.


Contents

Résumé ............................................................................................................................. iii

Abstract .............................................................................................................................. v

Contents ........................................................................................................................... vii

Table list ............................................................................................................................ ix

Figure list .......................................................................................................................... xi

Acknowledgments ........................................................................................................... xix

Chapter 1 Introduction ................................................................................................... 1

1.1 Aluminium manufacturing ..................................................................................... 1

1.2 Anode manufacturing ........................................................................................... 2

1.3 Anode raw materials ............................................................................................ 4

1.4 Anode fabrication process .................................................................................... 5

1.5 Anode properties .................................................................................................. 8

1.6 Problems ............................................................................................................ 10

1.7 Objectives .......................................................................................................... 15

1.8 Thesis organization ............................................................................................ 17

Chapter 2 Latent variable methods .............................................................................. 19

2.1 Principal Component Analysis (PCA) ................................................................. 19

2.2 Projection to Latent Structures (PLS) ................................................................. 22

2.3 Data scaling ....................................................................................................... 24

2.4 Number of latent variables (A) ............................................................................ 24

2.5 Model interpretation tools ................................................................................... 27

Chapter 3 Image texture analysis ................................................................................ 29

3.1 Machine vision ................................................................................................... 29

3.2 Digital image ...................................................................................................... 30

3.3 Image texture analysis ....................................................................................... 31

3.3.1 Gray level co-occurrence matrix (GLCM) .................................................... 32

3.3.2 Wavelet texture analysis (WTA) .................................................................. 37

Chapter 4 Experimental ............................................................................................... 47

4.1 List of software ................................................................................................... 47

4.2 Laboratory anode fabrication .............................................................................. 47

4.2.1 Industrial raw material formulation ............................................................... 47

4.2.2 Laboratory raw material formulation ............................................................ 49

4.2.3 Laboratory anode fabrication ....................................................................... 51

4.2.4 Industrial paste sampling ............................................................................. 53

4.3 Image analysis methodology .............................................................................. 54

4.3.1 Description of the imaging set-up ................................................................ 54

4.3.2 Description of the image analysis methodology ........................................... 55

Chapter 5 A new Multi-block PLS algorithm including a sequential pathway ................ 61

5.1 Introduction ........................................................................................................ 61

5.2 Description of the multi-block methods ............................................................... 66

5.2.1 Multi-block PLS (MB-PLS) ........................................................................... 66

5.2.2 Sequential Orthogonal PLS (SO-PLS) ......................................................... 68

5.2.3 Proposed algorithm: the Sequential Multi-block PLS (SMB-PLS) ................ 69

5.3 Description of the dataset used for the case studies .......................................... 71

5.3.1 Simulated data from film blowing process ................................................... 71

5.3.1.1 First case – No correlation between raw materials and process data ... 72

5.3.1.2 Second case – Correlation between raw materials and process data ... 73

5.3.2 Industrial data from the anode manufacturing process ................................ 73


5.4 Results and discussion ....................................................................................... 75

5.4.1 Selecting the number of components .......................................................... 75

5.4.2 Results for the film blowing example ........................................................... 77

5.4.3 Industrial data from the anode manufacturing process ................................ 86

5.5 Conclusion ....................................................................................................... 100

Chapter 6 Paste image texture analysis ..................................................................... 103

6.1 Introduction ...................................................................................................... 103

6.2 Laboratory paste and anode experiments ........................................................ 107

6.2.1 Preliminary design on paste formulation .................................................... 107

6.2.2 Detailed design on paste formulation ........................................................ 108

6.2.3 Pitch optimization experiments .................................................................. 112

6.3 Selection of preprocessing operations and image textural features .................. 114

6.3.1 Dataset and criteria used for the comparative analysis .............................. 115

6.3.2 Choice of preprocessing ............................................................................ 116

6.3.3 Choice of wavelet ...................................................................................... 119

6.3.4 Selection of textural features ..................................................................... 120

6.4 Results ............................................................................................................. 123

6.4.1 Preliminary design on paste formulation .................................................... 123

6.4.2 Detailed design on paste formulation ........................................................ 127

6.4.3 Pitch optimization experiment anodes ....................................................... 134

6.5 Conclusion ....................................................................................................... 138

Chapter 7 Industrial paste imaging ............................................................................ 141

7.1 Introduction ...................................................................................................... 141

7.2 Sampling and data synchronization .................................................................. 142

7.3 Datasets and results......................................................................................... 143

7.3.1 Normal operation ....................................................................................... 144

7.3.2 Paste plant start-up ................................................................................... 149

7.3.3 Industrial pitch optimization experiments ................................................... 153

7.4 Joint modelling of image features and paste plant data using SMB-PLS .......... 165

7.5 Conclusions ..................................................................................................... 171

Chapter 8 Conclusions and recommendations ........................................................... 175

8.2 Development of the machine vision sensor ...................................................... 175

8.3 Sensitivity and robustness to industrial paste ................................................... 177

8.4 SMB-PLS algorithm .......................................................................................... 178

8.5 Recommendations ........................................................................................... 179

8.5.1 Multivariate monitoring and control ............................................................ 179

8.5.2 Real-time paste quality measurement ....................................................... 180

Bibliography ................................................................................................................... 183

Appendix A Update of the anode properties prediction model ........................................ 193


Table list

Table 1 – Typical dry aggregate particle size (Jones 1986) ................................................ 5

Table 2 – Anode properties typically measured from core samples .................................... 9

Table 3 – GLCM features of the images in Figure 16 ....................................................... 36

Table 4 – Properties of the industrial coke used for the laboratory paste manufacturing .. 48

Table 5 – Particle size distribution (measured at the plant) for each material fraction ...... 48

Table 6 – Properties of the industrial pitch supplied by ADQ for the laboratory anodes .... 49

Table 7 – Base mix formulation for the laboratory anode fabricated with the industrial raw

materials .......................................................................................................................... 49

Table 8 – Laboratory coke aggregate formulation ............................................................ 50

Table 9 – Laboratory coke properties ............................................................................... 50

Table 10 – Laboratory pitch properties .............................................................................. 51

Table 11 – Heat-up rate during the laboratory anode baking ............................................ 53

Table 12 – Choice of GLCM distance L and comparison to the particle size distribution .. 59

Table 13 – Band pass size in period (i.e. spatial dimensions) for each decomposition level

of the DWT ....................................................................................................................... 60

Table 14 – List of the Y variables used for the anode manufacturing dataset case study . 74

Table 15 – Formulations used in the first series of experiments aiming at varying the

amounts of coke fines and pitch in the paste. ................................................................. 108

Table 16 – Changes in the paste formulation tested in the second set of experiments ... 110

Table 17 – List of experiments for the laboratory pitch optimization ................................ 113

Table 18 – Impact of adding contrast enhancement on PLS model statistics.................. 117

Table 19 – Impact of wavelet type and filter length on PLS model statistics ................... 119

Table 20 – Impact of different combinations of textural features on PLS model statistics 121

Table 21 – PLS model statistics for changes in fines and pitch percentages in the paste

formulation ..................................................................................................................... 124

Table 22 – PLS model statistics for the detailed design on paste formulation ................... 128

Table 23 – PLS model statistics for the pitch optimization experiments .......................... 135

Table 24 – Correlation coefficients between the paste formulation variables for the normal

operation data ................................................................................................................ 145

Table 25 – Statistics of the PLS models built on normal operation data .......................... 146

Table 26 – Sample number and elapsed time since the first start-up sample ................... 150

Table 27 – Statistics of the PCA model built on the paste plant start-up data ................... 151

Table 28 – Changes implemented on pitch % set-point in the industrial pitch variations

dataset ........................................................................................................................... 155

Table 29 – Correlation coefficients between the paste formulation variables for the

experiments on pitch ratio .............................................................................................. 156

Table 30 – Coke and pitch properties for each pitch variation experiment ...................... 156

Table 31 – Statistics of the PLS model for the design of experiments on pitch ratio ....... 160

Table 32 – Statistics of the GAD SMB-PLS model.......................................................... 167

Table 33 – Performance statistics of the original dataset PLS model in cross-validation,

prediction of the validation set and prediction of new data .............................................. 194


Table 34 – Performance statistics of the new PLS model in cross-validation and prediction

of the validation set ........................................................................................................ 197


Figure list

Figure 1 – Cross section of a prebaked reduction cell technology (Courtesy of Alcoa) ....... 2

Figure 2 – Anode manufacturing process flowsheet (Fischer et al. 1995) ........................... 3

Figure 3 – New anode assembly (Courtesy of Alcoa) ......................................................... 4

Figure 4 – Illustration of the difference in pitch demand for two paste mixes ...................... 7

Figure 5 – Schematic of a baking furnace section (Grégoire et al. 2013) ............................ 8

Figure 6 – Illustration of the different behavior of GAD and BAD as a function of pitch % . 12

Figure 7 – Effect of constant operating conditions ............................................................ 12

Figure 8 – Illustration of the effect of different raw material and processing conditions on

the anode paste visual appearance: a) and b) 2 different industrial pastes and c) laboratory

paste ................................................................................................................................ 15

Figure 9 – Schematic of the machine vision methodology for anode paste ....................... 16

Figure 10 – Schematic representation of PCA .................................................................. 20

Figure 11 – NIPALS algorithm for PCA ............................................................................. 22

Figure 12 – Matrices of PLS ............................................................................................. 22

Figure 13 – NIPALS algorithm for PLS ............................................................................. 24

Figure 14 – Schematic of the machine vision approach (Liu 2005; Duchesne 2010) ........ 29

Figure 15 – Examples of GLCM matrices (Tessier et al. 2008) ......................................... 32

Figure 16 – Two stone surfaces with different texture used for the GLCM features example

(http://www.highresolutiontextures.com) ........................................................................... 35

Figure 17 – Frequency band divisions of the DWT of a vector (one-dimensional signal) .. 40

Figure 18 – 2-D DWT decomposition a) schematics of the filter bank used at the jth

decomposition and b) frequency distribution of the detail and approximation images (Liu &

MacGregor 2007) ............................................................................................................. 41

Figure 19 – Composite image of different textures (http://www.highresolutiontextures.com)

......................................................................................................................................... 42

Figure 20 – Approximation and detail coefficients of the composite texture image: a)

approximation at scale 3, b) comparison of the reconstructed detail at scale 1 for sub-

images 1 and 7, c) comparison of the reconstructed detail at scale 1 and 3 of sub-image 5

and d) comparison of the direction sensitivity for the reconstructed detail at scale 1 of sub-

image 2. ........................................................................................................................... 43

Figure 21 – Details of the mixer and oven for laboratory paste preparation ...................... 51

Figure 22 – Details of the press: a) cylindrical mold and die and b) the press with the oven

to control the pressing temperature (Azari Dorcheh 2013)................................................ 52

Figure 23 – Laboratory baking furnace and baking box .................................................... 53

Figure 24 – Imaging set-up installed at the ADQ industrial plant ...................................... 54

Figure 25 – Image acquisition set-up ................................................................................ 55

Figure 26 – Anode paste machine vision flowsheet .......................................................... 56

Figure 27 – Example of paste image: a) laboratory paste and b) industrial paste ............. 57

Figure 28 – Results of the image pre-processing: a) low-pass filtered grayscale image, b)

image after contrast enhancement and c) comparison of the intensity histogram for both

images ............................................................................................................................. 58

Figure 29 – Symlet 4 wavelet function .............................................................................. 59


Figure 30 – Illustration of the block order for an industrial process ................................... 61

Figure 31 – The MB-PLS algorithm for 2 regressor blocks (adapted from (Westerhuis et al.

1998)) ............................................................................................................................... 67

Figure 32 – The SO-PLS algorithm shown for 2 regressor blocks .................................... 68

Figure 33 – The SMB-PLS algorithm for two X blocks ...................................................... 70

Figure 34 – Simulated end section of a film blowing process (adapted from (Duchesne

2000)) ............................................................................................................................... 72

Figure 35 – Data blocks collected from the anode manufacturing process (Modified from

(Lauzon-Gauthier et al. 2012)) .......................................................................................... 74

Figure 36 – Q2Y and RMSEP statistics for selecting the number of components of the MB-

PLS algorithm for case 1 (Z and X are orthogonal) ........................................................... 77

Figure 37 – Q2Y and RMSEP statistics for selecting the number of components of the SO-PLS model for case 1 ....................................................................................................... 78

Figure 38 – Q2Y and RMSEP statistics used for selecting the number of components of the

SMB-PLS model for case 1 .............................................................................................. 79

Figure 39 – Q2Y and RMSEP statistics used for selecting the number of components for the

MB-PLS algorithm for case 2 ............................................................................................ 79

Figure 40 – Q2Y and RMSEP statistics used for selecting the number of components of the

SO-PLS model for case 2 ................................................................................................. 80


Figure 41 – Q2Y and RMSEP statistics used for selecting the number of components of the SMB-PLS model for case 2 .............................................................................................. 80

Figure 42 – Explained Y variance for the three multi-block methods built on the film

blowing datasets: a) case 1 and b) case 2. Z and X block variance explained and total (i.e.

concatenated regressor blocks) variance explained: c) case 1 and d) case 2 ................... 81

Figure 43 – Relative importance of each block by LV for: a) MB-PLS case 1, b) MB-PLS case 2, c) SMB-PLS case 1 and d) SMB-PLS case 2 ....................................................... 84

Figure 44 – Loadings of Z, X and Y blocks in the 3rd SMB-PLS component (Z-3) for case 2.

......................................................................................................................................... 85

Figure 45 – Loadings of Z, X and Y blocks in the 4th and 5th SMB-PLS component (X-1 and

X-2) for case 2. ................................................................................................................. 86

Figure 46 – Selection of the number of LVs for the MB-PLS model computed from the

anode manufacturing dataset: a) Q2Y and b) RMSEP for all Y variables .......................... 87

Figure 47 – Selection of the number of LVs for the SO-PLS model computed from the

anode manufacturing dataset ........................................................................................... 87

Figure 48 – Selection of the number of LVs for the SMB-PLS anode model ....................... 88

Figure 49 – Results obtained with the multi-block algorithms on the anode manufacturing

dataset: a) R2Y and Q2Y for all methods, b) overall R2X by block for all methods, relative

weights (bars) and block variance explained R2X (lines) by LV for c) MB-PLS and d) SMB-

PLS .................................................................................................................................. 90

Figure 50 – Bi-plot of the block weights and Y loadings for first two components (Z-1 and

Z-2) of the SMB-PLS model built on the anode manufacturing dataset ............................. 91

Figure 51 – Amount of pitch used in the formulation as a function of the amount of coke

fines particles for different raw material blends (combinations of coke and pitch suppliers)

......................................................................................................................................... 94


Figure 52 – Z and X1 block weights bi-plot for LV1 and LV2 of MB-PLS ........................... 95

Figure 53 – Bi-plots of X2 block weights and Y loadings: a) LV 5 of MB-PLS and b) LV6

(X2-1) of SMB-PLS........................................................................................................... 96

Figure 54 – Baking block (X3) scores and loadings bi-plot: a) MB-PLS block scores, b) MB-

PLS block weights for LV4-LV5, c) SMB-PLS block scores and d) SMB-PLS block weights

for LV7-LV8 (X3-1 and X3-2). The blue and red markers indicate the anodes baked in the

coldest and hottest positions in the furnace ...................................................................... 97

Figure 55 – Comparison of the information mixing in MB-PLS and SMB-PLS models: a)

super scores (LV1-LV2) of MB-PLS, b) super scores (LV1-LV2) of SMB-PLS, c) Z scores

(LV2-LV3) of MB-PLS and d) Z scores (LV2-LV3) of SMB-PLS ........................................ 99

Figure 56 – Anode paste image...................................................................................... 105

Figure 57 – Baked and green anode density (BAD and GAD) for the pitch optimization

anodes using cokes from two different sources (A and B) .............................................. 113

Figure 58 – ∆BAD of the lab formulation anodes ............................................................ 116

Figure 59 – Scores of the PLS models for the lab formulated anodes: a) no contrast

enhancement and b) with contrast enhancement ........................................................... 118

Figure 60 – Score plots for the first two PLS components (LVs 1-2) of four models from

Table 20: a) model 1, b) model 4, c) model 9 and d) model 7 ......................................... 122

Figure 61 – Final image texture analysis procedure ....................................................... 123

Figure 62 – Scores and loadings weights of the PLS model (replicates averaged) for the

case where fines and pitch variations were introduced in the paste formulation: a) LV1-LV2

scores, b) weights and loadings of LV1 and c) weights and loadings of LV 2 ................. 125

Figure 63 – Reproducibility of the imaging sensor in the case of the preliminary design on

formulation. The averaged LV1 and LV2 scores are shown for replicated samples along

with their one standard deviation error bars .................................................................... 127

Figure 64 – Butts size distribution span .......................................................................... 128

Figure 65 – Scores and loadings weights of the PLS model built on averaged replicated

samples data for the case of the detailed design on formulation: a) X scores on LV1 and

LV2, b) Y scores on LV1 and LV2, c) weights and loadings of LV1 and d) weights and

loadings of LV2 .............................................................................................................. 130

Figure 66 – Interpretation of the PLS model built using averaged replicated samples data

for the case of the detailed design on formulation. Variations in the scores and associated

contribution plots: a) and b) increase in the pitch ratio, c) and d) shot coke addition, e) and

f) decrease in the fines ratio and g) and h) change from a coarser to a finer formulation 131

Figure 67 – Reproducibility of the imaging sensor in the detailed design on formulation.

The averaged LV1 and LV2 score values are shown for a) image replicates and b) mix

replicates along with their one-standard deviation error bars .......................................... 134

Figure 68 – Comparison of the predicted and measured ∆BAD for the replicated averages

model ............................................................................................................................. 135

Figure 69 – Scores and loadings of the PLS model (averaged replicates) for the pitch

optimization experiments: a) LV1 scores , b) LV1 weights and loadings, c) LV2 scores and

d) LV2 weights and loadings .......................................................................................... 136

Figure 70 – Scores of the 3rd and 4th components of the PLS model built on the pitch

optimization dataset (averaged features) ........................................................................ 137


Figure 71 – Reproducibility of the imaging sensor in the pitch optimization experiments.

The scores of the first two components of the PLS model built on all samples are shown

along with one standard deviation error bars .................................................................. 138

Figure 72 – Formulation variables for the normal operation industrial dataset: a) dry

aggregate % and b) pitch ratio ........................................................................................ 144

Figure 73 – First component’s scores (a) and loadings (b) of the PLS model (averaged

replicate data) built on normal operation data of the ADQ paste plant ............................ 146

Figure 74 – Second component’s scores (a) and loadings (b) of the PLS model (averaged

replicate data) built on normal operation data of the ADQ paste plant


Figure 75 – Third component’s scores (a) and loadings (b) of the PLS model (averaged

replicate data) built on normal operation data of the ADQ paste plant


Figure 76 – Uncertainties in the scores of the PLS model built on normal operation data: a)

LV1, b) LV2 and c) LV3. One standard deviation error bars on the scores are shown. ... 149

Figure 77 – Time series of a 5h plant start-up period: a) GAD and b) scores of LV1, LV2

and LV3 .......................................................................................................................... 152

Figure 78 – Scores and loadings of the PCA model built on the industrial paste start-up

data (averaged image replicates): a) LV1 and LV3 score plot and b) LV1 loadings plot.. 153

Figure 79 – Changes in the formulation variables for the industrial dataset where pitch ratio

was varied. The five sampling campaigns are indicated by letters A-E. .......................... 154

Figure 80 – Baked anode core properties: a) BAD for experiments C and E, b) electrical

resistivity, c) compressive strength, d) CO2 reactivity residue (CRR) and e) Young’s

modulus ......................................................................................................................... 158

Figure 81 – Predicted versus measured pitch ratio obtained using the PLS model built on

data collected during the design of experiments on pitch ratio (averaged replicates) ...... 160

Figure 82 – Scores and loadings of the PLS model (averaged replicates) component 1 for

the designed experiments on pitch ratio: a) LV1 scores, b) LV1 weights and loadings, c)

scatter plots of LV1 scores and coarse % and d) scatter plots of LV1 scores and pitch %

....................................................................................................................................... 161


Figure 83 – Scores and loadings of the PLS model (averaged replicates) component 2 for

the designed experiments on pitch ratio: a) LV2 scores, b) LV2 weights and loadings and

c) scatter plots of LV2 scores and pitch % ...................................................................... 163


Figure 84 – Scores and loadings of the PLS model (averaged replicates) component 3 for

the designed experiments on pitch ratio: a) LV3 scores and b) LV3 weights and loadings

....................................................................................................................................... 164

Figure 85 – Uncertainties in the scores of the PLS model (all samples) for the data obtained

during the design of experiments on pitch ratio: a) LV1, b) LV2 and c) LV3. One standard

deviation error bars of the scores are shown in the figure. .............................................. 165

Figure 86 – Data blocks and variables used in the SMB-PLS model for predicting GAD 166

Figure 87 – Relative contribution (bars) of each regressor block in the SMB-PLS model.

The explained variance of each regressor block R2X (black lines) and of the Y block R2Y

(gray line) are also shown .............................................................................................. 168

Figure 88 – Loading weights of the raw material properties (Z) in component LV Z-1 ..... 169

Figure 89 – Block weights of LV Z-2: a) raw material (Z) and b) image features (X3) ..... 170

Figure 90 – Block weights for LV X1-2: a) formulation (X1) and b) image features (X3).. 171


Figure 91 – Model residuals: a) Hotelling’s T² and b) prediction residual ......................... 194

Figure 92 – Residual contribution: a) Observation A, b) Observation B and c) Observation

C of Figure 91 ................................................................................................................ 196


Dedicated to Christophe, Justin and Marilou


Acknowledgments

For the last four years I have spent a significant amount of time working on this Ph.D.

project. It was a very challenging but also rewarding period of my life. I am grateful for the

support of many people and organizations and I would like to take this opportunity to

express my gratitude to all of these important persons.

I am thankful for the financial support of Alcoa, the Fonds de recherche du Québec –

Nature et technologies (FRQNT), the Aluminum Research Centre – REGAL, NSERC,

Université Laval and Rio Tinto Alcan. With this support, I was able to focus my attention on

my project and my family.

I would like to express my profound gratitude and respect for my supervisor, Dr. Carl

Duchesne. You have given me all the tools, support and opportunities that I needed to

accomplish this project and much more. With your guidance, I have become confident in

my abilities and knowledge.

Special thanks to Dr. Jayson Tessier: your continued support of my project within Alcoa and

your thoughtful input on the problems and results were invaluable.

The Université Laval chemical engineering department has been my home for the last ten

years. I need to mention the contribution of the many people that support the students

every day. Thanks to Dr. Alain Garnier, Ann Bourassa, Nadia Dumontier, Pierrette

Vachon, Jean-Nicolas Ouellet and Yann Giroux.

I would like to thank the technicians and research assistants who helped me during my

experimental work: Guillaume Gauvin, Donald Picard, Hughes Ferland and Vicky Dodier. I

had a lot of fun working with you. I would also like to thank Jean-Phillip Giguère who

helped me as an intern to fabricate numerous laboratory anodes and for the BAD

measurements.

Many thanks to Kamran Azari Dorcheh and Francois Chevarin. This Ph.D. would have

been much more difficult without your previous laboratory work on anode fabrication and

your generous help in the lab. I also appreciated working with the MACE3 chair and RDC-

anode groups: Geoffroy, Ramzi, Behzad, François G., Dave, Pierre-Olivier, Stéphane and

the others.


Many folks within Alcoa have contributed to the success of this project. I have felt at home

every time I went to the plant and I felt that you supported the project. This was important

for me. Thanks to Francis-Joé, Isabelle, Réal, Romain, Christian, Marc, Katie, Don, John

and the many operators who helped me during my experiments.

To my colleagues in the office, much of the day to day life in the department was shared

with you and I enjoyed it very much. It was also a pleasure to share many activities with all

of you. Thanks to Amélie, Pierre-Marc, Wilinthon, Alexandre, Jean-Pascal, Thierry,

Mathias, Juliette, Karl, Simon, Corinne and Moez.

Massoud, thanks for your friendship and availability when we needed to discuss the latent

variable methods or the interpretation of the results.

Finally, many thanks to my family for the support during all these years, especially to my

wife Marilou. I could not have completed a Ph.D., raised two beautiful kids, started my

professional career and still had a normal life without you.

Merci!


Chapter 1 Introduction

The aluminium industry is a very important component of the Canadian economy. In 2010,

Canada was the third largest aluminium producer with 7% of the world production (source:

International Aluminium Institute http://www.world-aluminium.org). It sustains

approximately 10,000 direct jobs in the Province of Québec alone and injects 2.5 billion

dollars per year in its economy (source: Association de l'aluminium du Canada

http://ledialoguesurlaluminium.com).

There are some manufacturing challenges to address during the fabrication of the carbon

anodes used to produce aluminium. The most important aspect in terms of smelter

operation is the consistency of the anode quality. Unfortunately, anode manufacturers

have to cope with increasing raw material variability and they are not adequately prepared

to face this situation. To maintain consistent anode quality over time, carbon plants need

to adapt the formulation and the processing conditions in response to the raw materials

variations. However, the key raw material properties and process measurements are not

available in real-time to implement such adjustments. This thesis focuses on issues related

to real-time quality control of baked carbon anodes and the lack of fast and relevant

measurements to cope with raw materials variability. New data-driven methods and non-

destructive sensing techniques are proposed to improve process understanding,

monitoring and control of the anode manufacturing process.

Some sections of this chapter (i.e. 1.1 to 1.5) reproduce, with minor modifications

and additions, the most important parts of chapter 2 of the author’s M.Sc. thesis

(Lauzon-Gauthier 2011). They are reproduced here to give readers the necessary

background on the manufacturing of industrial carbon anodes.

1.1 Aluminium manufacturing

The industrial production of primary aluminum is performed using the so-called Hall-

Héroult process (Grjotheim & Kvande 1993). Basically, aluminum is obtained through the

electrolytic reduction of alumina taking place within a typically large number of

metallurgical reactors (reduction cells) electrically connected in series. The

electrochemical reaction (shown below) involves dissolved alumina (in a cryolitic bath) and

carbon as the reactants, and yields liquid aluminum and carbon dioxide (gaseous

emission).


$$2\,\mathrm{Al_2O_3\,(diss)} + 3\,\mathrm{C\,(s)} = 4\,\mathrm{Al\,(l)} + 3\,\mathrm{CO_2\,(g)} \tag{1.1}$$

Figure 1 presents a schematic diagram of a pre-baked anode reduction cell, also called a

pot in the industry. A high electrical current (i.e. from 100 kA to 600 kA for current

technologies (Tabereaux 2000; Charmier et al. 2015)) is passed through the cell, entering

from the conducting rods and the baked carbon anodes, and exiting by the cathode block

after passing through the cryolitic bath and the liquid aluminum pad. The anodes are

immersed into a bath made of cryolite, a chemical that dissolves alumina. The reduction

reaction takes place in the bath and liquid aluminium settles at the bottom of the pot. The

metal is tapped on a daily basis to ensure a constant height of liquids (i.e. liquid aluminium

and molten electrolytic bath) in the pot. Since the anodes are consumed by the alumina

reduction reaction (equation 1.1), they need to be periodically replaced. During pot

operation, the anodes are lowered continually as they are consumed to keep the anode to

cathode distance (ACD) constant. When the anodes reach approximately 1/3 of their

original size, they are replaced by new ones. The residual anodes, called butts, are

recycled to produce new anodes.
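As a rough worked example of what equation 1.1 implies for anode consumption, the short sketch below computes the stoichiometric carbon requirement per kilogram of aluminium. The function name and constants are illustrative and not taken from the thesis; the molar masses are standard atomic masses. The result is only a theoretical lower bound under the assumption that equation 1.1 is the sole carbon-consuming reaction.

```python
# Stoichiometric carbon consumption implied by equation 1.1:
#   2 Al2O3 + 3 C = 4 Al + 3 CO2
# Assumption: no other carbon-consuming reaction is considered.

M_C = 12.011   # g/mol, standard atomic mass of carbon
M_AL = 26.982  # g/mol, standard atomic mass of aluminium


def theoretical_carbon_per_kg_al(mol_c: float = 3.0, mol_al: float = 4.0) -> float:
    """Return kg of carbon consumed per kg of aluminium produced."""
    return (mol_c * M_C) / (mol_al * M_AL)


if __name__ == "__main__":
    ratio = theoretical_carbon_per_kg_al()
    print(f"{ratio:.3f} kg C per kg Al")  # prints "0.334 kg C per kg Al"
```

Under these assumptions, about one third of a kilogram of carbon is consumed per kilogram of metal, which is why the anodes must be replaced periodically.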

Figure 1 – Cross section of a prebaked reduction cell technology (Courtesy of Alcoa). Labelled components: new anode, alumina feeders, steel shell, conducting rod, spent anode, bath, molten aluminum, lining and cathode block.

1.2 Anode manufacturing

The anode manufacturing plant is a vital part of a smelter’s operation because it supplies one of the main raw materials for the aluminium reduction process (i.e. the baked carbon anodes). A typical process flowsheet is shown in Figure 2 (Fischer et al. 1995) and is briefly described in the following paragraphs. Interested readers are referred to (Hulse 2000; Lauzon-Gauthier 2011; Azari Dorcheh 2013) for more detailed descriptions of the effect of raw material properties and process operation on anode quality.

Figure 2 – Anode manufacturing process flowsheet (Fischer et al. 1995)

The anode raw materials consist of calcined petroleum coke, liquid coal tar pitch and recycled anode butts. The anode filler particles (e.g. coke and butts) are classified and ground into a desired particle size distribution. The mix of coke and butts is called the dry aggregate mix. The dry aggregate is then pre-heated before it is mixed with liquid pitch (i.e. the binder) to obtain the so-called anode paste. The paste is then formed into an anode block of specific dimensions using either a press or a vibrocompactor to obtain a “green” anode. Finally, the green anode is baked in a furnace. The baked anode block is then attached to a conducting rod, and the assembly is finally ready to be set in the pots.


Figure 3 – New anode assembly (Courtesy of Alcoa)

An anode assembly (i.e. baked anode and connecting rod) is presented in Figure 3. The

aluminium rod is used to connect the anode assembly to the pots. The tripod is fixed to the

anode by pouring cast iron into the stub hole gaps.

1.3 Anode raw materials

Calcined coke is manufactured from the residual heavy oil fractions of the petroleum

refining industry. It is a low-value by-product (i.e. waste), so refineries have no

incentive to control and/or improve its quality. As a result, the quantity, quality and price of

calcined cokes available on the market vary significantly over time. This implies that

carbon plants need to adapt to cokes having important differences in physical properties

and chemical impurities from shipment to shipment (McClung and Ross 2000).

The following steps are required to transform heavy oil into coke: a delayed coking

process yields the green coke and this process is followed by a calcining operation to

produce the calcined coke of interest for the aluminum industry. Calcined coke quality is

influenced by the calcining conditions and by green coke quality, which in turn depends on crude

oil quality, refining operation and delayed coking operation parameters (Fischer et al.

1995). Several papers describe the effects of oil quality and process operation on green

coke quality (e.g. Fischer & Perruchoud 1985; Vitchus et al. 2013).

Coal tar pitch (CTP) is the binder used for making the baked anodes for the aluminum industry. This pitch is produced from coal tar through a distillation process. Coal tar is a by-product of the metallurgical coke production from coal. The role of the pitch in the anode recipe is to bind the dry aggregate together so that the anode can be formed into a block of specific dimensions. The pitch also fills some of the coke particle porosity. To obtain good mechanical properties after forming, the anodes are baked to transform the amorphous pitch into semi-crystalline coke.

Anode butts consist of the unconsumed portion of the anodes left after they are removed

from the pots (typically about 1/3 of their original size). Anodes are not consumed

completely to avoid metal contamination from the steel stubs. However, the surfaces of the anode

butts are contaminated by sodium and other impurities from the anode cover

material and frozen bath. Thus, a cleaning step is required before the butts are stripped

from the stubs. After cleaning and stripping, the butts are crushed, screened to the desired

size distribution and stored in silos for use in the production of fresh anodes. This reduces

the amount of waste materials and the amount of fresh coke needed to formulate the

anodes. Butts constitute approximately 15-30% of the green anode formulation (Fischer

and Perruchoud 1991).

1.4 Anode fabrication process

In the first step of the process, the dry aggregate particles (coke and butts) are pre-processed

by screening and crushing. The finer coke particles are produced by milling

some of the material in a ball mill as well as by collecting the dust throughout the anode plant.

The coke is usually classified in three distinct fractions: coarse, intermediate and fines.

The butts, which are less porous than the coke, consist mainly of coarse material (Fischer

et al. 1995). The typical particle size for each fraction is given in Table 1 (Jones 1986).

Table 1 – Typical dry aggregate particle size (Jones 1986)

Dry aggregate          Fraction        Particle size (US mesh)    Particle size (µm)
Coke                   Coarse          -¼ in/+30                  -6.3 mm/+600
Coke                   Intermediate    -30/+100                   -600/+150
Coke                   Fine            -100                       -150
Butts & baked scrap                    max 1 in                   max 25 mm

The fineness of the fines size fraction is characterized by the Blaine number, which measures the particle surface area. Blaine number increases with decreasing particle size because the particle surface area increases for smaller sizes. Hence, it is used to characterize particles too small to be classified by sieve analysis. This parameter is usually closely monitored by the paste plant operators.
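The coke size classes of Table 1 can be written as a small lookup, which may help when reading sieve data against the table. This is only a sketch: the function name, the handling of sizes that fall exactly on a boundary, and the "oversize" label are assumptions made for illustration and do not come from the thesis or from Jones (1986).

```python
def classify_coke_particle(size_um: float) -> str:
    """Map a coke particle size in micrometres to a Table 1 size fraction.

    Boundary handling (which side 150 and 600 um fall on) is an assumption.
    """
    if size_um <= 0:
        raise ValueError("particle size must be positive")
    if size_um < 150:       # passes the 100 mesh screen
        return "fine"
    if size_um < 600:       # -30/+100 mesh
        return "intermediate"
    if size_um <= 6300:     # -1/4 in/+30 mesh
        return "coarse"
    return "oversize"       # hypothetical label, larger than the coke top size


if __name__ == "__main__":
    for size in (75, 300, 2000):
        print(size, classify_coke_particle(size))
```

Note that particles in the fine fraction are too small for sieve analysis, so in practice their fineness is tracked with the Blaine number rather than with a size threshold.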


The dry aggregate blend is formulated using weight belts and is discharged into a pre-heater. The dry aggregate temperature is raised to between 150 and 200°C (Hulse 2000) before pitch is added to the blend. Pitch is also pre-heated to a temperature ranging between 170 and 230°C (Hulse 2000) before it is incorporated in the dry aggregate mix to form the anode paste. The temperature difference between the pitch and the dry aggregate is closely monitored to avoid partial solidification of the pitch on the coke particles when these are put into contact, which would hinder proper pitch penetration in the filler matrix and lead to a more heterogeneous paste. The paste is then fed into a mixer in order to evenly distribute the pitch within the dry aggregate and to ensure that the internal pores of the coke particles are filled with the binder. Mixing temperature is usually set between 155°C and 180°C. Anode quality generally increases

with increased coke and pitch temperature up to the degassing temperature of pitch

volatiles. The paste viscosity decreases with an increase in the temperature and this will

improve the mixing, spreading and penetration of the binder matrix in the paste (Hulse

2000).
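The temperature windows quoted in this paragraph lend themselves to a simple range check. The sketch below is a hypothetical helper, not part of any plant system described in the thesis: the dictionary keys and the function name are invented for illustration, and only the numeric limits come from the text.

```python
# Temperature windows quoted in the text, in degrees Celsius.
# Keys are illustrative names, not plant tag names.
WINDOWS_C = {
    "dry_aggregate": (150.0, 200.0),
    "pitch": (170.0, 230.0),
    "mixing": (155.0, 180.0),
}


def out_of_window(readings_c: dict) -> list:
    """Return the names of readings that fall outside their quoted window."""
    flagged = []
    for name, value in readings_c.items():
        low, high = WINDOWS_C[name]
        if not low <= value <= high:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    sample = {"dry_aggregate": 175.0, "pitch": 240.0, "mixing": 160.0}
    print(out_of_window(sample))  # prints "['pitch']"
```

A check of this kind only flags readings outside the typical ranges; it says nothing about the pitch-to-aggregate temperature difference, which the text identifies as the quantity that is closely monitored.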

The paste’s pitch ratio is also of great importance. Under-pitched anodes have insufficient mechanical properties, leading to anode failure in the pots, and high electrical resistivity due to poor binding. Over-pitched anodes lead to slump formation (i.e. problems when forming the anodes), to high weight loss, swelling and crack formation during baking due to greater volatiles degassing, to packing material sticking to the anodes during baking, and finally to stub hole deformations (Mannweiler & Keller 1994; Hulse 2000). Pitch demand (i.e. the appropriate amount of pitch for a given dry aggregate) is a function of the coke fines fraction and filler particle properties, but also of mixing temperature and duration. There exists an optimum combination of particle size distribution, formulation, mixing duration and temperature, and pitch ratio (Belitskus 1978; Hulse 2000).

Optimum pitch demand (OPD) is defined by the amount of pitch needed to obtain optimum

anode properties for a given type of coke, formulation and processing parameters. This is

illustrated in Figure 4, where both the maximum baked anode apparent density (BAD), a key anode

property, and the amount of pitch needed to reach it differ between two paste

recipes. The figure shows that the BAD increases with pitch % up to the optimum.

Then, adding more pitch becomes detrimental to the anode quality.
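One simple way to locate such an optimum from a handful of (pitch %, BAD) measurements is to fit a parabola through the highest measured point and its two neighbours and take the vertex. The sketch below illustrates that idea only; it is not the pitch optimization method used in the thesis or in the cited references, the function name is hypothetical, and the numbers in the usage example are made up.

```python
def estimate_opd(pitch_pct: list, bad: list) -> float:
    """Estimate the pitch % at maximum BAD by three-point parabolic interpolation.

    Fits a parabola through the highest measured BAD and its two
    neighbours and returns the pitch % at the vertex.
    """
    if len(pitch_pct) != len(bad) or len(bad) < 3:
        raise ValueError("need at least three (pitch %, BAD) pairs")
    pairs = sorted(zip(pitch_pct, bad))
    i = max(range(len(pairs)), key=lambda k: pairs[k][1])
    if i == 0 or i == len(pairs) - 1:
        # Maximum sits at the edge of the tested range: optimum not bracketed.
        return pairs[i][0]
    (x0, y0), (x1, y1), (x2, y2) = pairs[i - 1], pairs[i], pairs[i + 1]
    num = (x1 - x0) ** 2 * (y1 - y2) - (x1 - x2) ** 2 * (y1 - y0)
    den = (x1 - x0) * (y1 - y2) - (x1 - x2) * (y1 - y0)
    if den == 0:
        return x1  # flat top: fall back to the best measured point
    return x1 - 0.5 * num / den


if __name__ == "__main__":
    # Made-up data for illustration only (BAD in kg/dm3).
    pitch = [13.0, 14.5, 16.0, 17.5]
    density = [1.56, 1.5975, 1.59, 1.5375]
    print(round(estimate_opd(pitch, density), 2))
```

The estimate is only meaningful when the tested pitch levels bracket the maximum, which is why the function returns the edge point unchanged when the highest BAD is found at either end of the range.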


Figure 4 – Illustration of the difference in pitch demand for two paste mixes

In this thesis, the optimum pitch demand will often be defined using the baked anode

apparent density (BAD). The BAD is used because it usually correlates well with the

optimum of other anode properties (Belitskus 1978; Belitskus 1981; Belitskus 1993;

Belitskus & Danka 1988; McHenry et al. 1998; Hulse 2000; Belitskus 2013). It is also

straightforward to measure on small anodes (i.e. lab scale or core samples). Other

properties such as electrical resistivity, or any group of properties, can also be used to

define the OPD (Hulse 2000).

Anode forming is performed either by pressing or vibro-compaction. The quality of pressed

anodes depends largely on raw material properties and recipe. Vibrated anode quality

depends also on raw material quality but is more sensitive to anode forming process

parameters (e.g. paste temperature during vibro-compaction) (Hulse 2000). If the

temperature is too high, the paste viscosity will be too low and the anode could collapse

when taken out of the mold. Conversely, a low temperature increases the viscosity and leads to

improper compaction. How evenly the paste is distributed within the mold also has an

impact on anode quality. Uneven distribution usually leads to anisotropic anode properties

within the block.

Anode baking is typically performed using an open ring baking furnace. Details of the

operation of this type of unit are provided in (Fischer et al. 1995; Keller & Sulger 2008). In

brief, a section of the furnace is made of several pits (generally 6 or 7) where the anodes

are stacked vertically (e.g. 6 anodes wide by 3 anodes high). The space between adjacent pits

(i.e. the flue wall) is a cavity where natural gas and pitch volatiles are burned in order to supply

heat to the anodes according to a pre-defined baking cycle. A schematic of a baking

furnace section is provided in Figure 5 (Grégoire et al. 2013).

[Figure 4 axes: BAD versus pitch %; curves: Paste 1 and Paste 2]

Figure 5 – Schematic of a baking furnace section (Grégoire et al. 2013)

Anode baking aims essentially at developing the mechanical properties of the anodes. The

pitch needs to be coked in order to increase the anode mechanical strength so that it can withstand

the pot’s operating temperature (about 960°C). The heat-up rate, the final temperature

and the soaking time (i.e. the amount of time the anodes are maintained at the final temperature) are

the most important baking parameters (Mannweiler & Keller 1994). Also, the

temperature gradient between the different positions within the furnace needs to be

kept small to minimise the variability in the anode properties at different positions in the

furnace (Fischer et al. 1993). This is accomplished by an appropriate design of the baffles

and flow path in the flue walls as well as adjusting the pressure and diameters of the

burners to obtain an optimum flame profile. Additionally, some process parameters can

also be adjusted. First, the under-pressure (i.e. to adjust the amount of oxygen in the flues)

can be manipulated. Also, temperature profiles can be adjusted in response to variations

in raw material (e.g. amount of volatiles in the pitch).

1.5 Anode properties

Anode quality is defined by a number of properties measured in the laboratory from core

samples collected from a certain number of baked anodes according to a pre-defined

sampling plan. These quality attributes (listed in Table 2) are grouped into four categories:

physical properties, mechanical properties, reactivity and chemical composition (e.g.

contaminants). Details on the laboratory analyses are available in (Fischer et al. 1995).

[Figure 5 labels: anodes, coke, refractories, flue wall]

Table 2 – Anode properties typically measured from core samples

Properties                                               Unit
Air permeability                                         nPm
Apparent density                                         kg/dm³
Thermal conductivity                                     W/mK
Electrical resistivity                                   mohm*cm
Flexural strength                                        MPa
Fracture energy                                          J/m²
Coefficient of thermal expansion                         K⁻¹
Compressive strength                                     MPa
Young's modulus                                          GPa
Real density                                             kg/dm³
Crystallite size (Lc)                                    nm
Ash content                                              %
CO2 reactivity           CO2 loss (CRL)                  %
                         CO2 dust (CRD)                  %
                         CO2 residue (CRR)               %
Air reactivity           Air loss (ARL)                  %
                         Air dust (ARD)                  %
                         Air residue (ARR)               %
Chemical impurities      Sulphur (S)                     %
                         Vanadium (V)                    ppm
                         Nickel (Ni)                     ppm
                         Silicon (Si)                    ppm
                         Iron (Fe)                       ppm
                         Aluminium (Al)                  ppm
                         Sodium (Na)                     ppm
                         Calcium (Ca)                    ppm

The anode properties measured in the laboratory are used as an indication of process stability and anode quality, but there are some issues with the use of core samples for quality control. Typically, less than 1% of the weekly anode production is sampled (i.e. core physically extracted from the anode). Moreover, the core samples (50 mm diameter and approximately 400 mm in length) are collected from a specific location on the anode and are not necessarily representative of the anode block (i.e. approximately 0.6 m³ and 930 kg), whose properties can be anisotropic. Furthermore, the cores are generally not long enough to measure all the properties on the same sample. Thus, the lab results might not be representative of the whole anode population (Sinclair & Sadler 2009). There is also a delay of a few weeks between the sampling and the availability of the lab results, so the anodes have often already been set in the pots and it is too late to take corrective actions on the anode manufacturing process. However, when used correctly (i.e. with good sampling strategies), these laboratory measurements can be used to assess the overall anode quality over a certain time window (aggregated measurements) or to compare the effect of different operating parameters. Because of the long processing time, the anode properties obtained from the quality control laboratory cannot be used for real-time feedforward/feedback control (Sinclair & Sadler 2006; Sinclair & Sadler 2009).
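To put the representativeness issue in numbers, the sketch below computes the fraction of an anode block's volume contained in one core, using only the dimensions quoted above (50 mm diameter, about 400 mm long, block of about 0.6 m³). The function name is illustrative, and the core is assumed to be a perfect cylinder.

```python
import math


def core_volume_fraction(diameter_m: float = 0.050,
                         length_m: float = 0.400,
                         anode_volume_m3: float = 0.6) -> float:
    """Fraction of the anode block volume contained in one cylindrical core."""
    core_volume = math.pi * (diameter_m / 2.0) ** 2 * length_m
    return core_volume / anode_volume_m3


if __name__ == "__main__":
    fraction = core_volume_fraction()
    print(f"{fraction:.3%}")  # prints "0.131%"
```

Under these assumptions a single core holds roughly 0.13% of the block volume. Combined with a sampling rate below 1% of the anodes, less than about 0.0013% of the production volume is actually tested.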

1.6 Problems

As introduced earlier, anode quality is critical to the optimum operation of the aluminium

smelters. However, the quality of baked anodes is becoming less consistent over time for

three main reasons. First, good quality anode raw materials (coke and pitch) are becoming less

available. Second, most smelters purchase coke materials from an increasing

number of suppliers to meet availability and quality targets (i.e. they blend cokes) but also

to reduce purchasing costs. This introduces supplier-to-supplier variations in addition to

lot-to-lot variability from any given supplier. Finally, some of the anode plants (especially

the older plants) do not have the flexibility to cope with such an increased variability.

Indeed, frequent adjustments to process operating parameters are required to attenuate

the impact of raw material variability, and most plants are not equipped to make those

corrections in a timely fashion. For example, there is a general lack of on-line sensors to

measure critical-to-quality attributes at the various stages of the anode manufacturing,

from raw materials characterization, to the different processing steps and final anode

quality assessment. Minimizing the impact of raw material variability is of great importance

for aluminium smelters because of the significant impact it has on the performance of the

reduction cells and the economic performance of the smelters (Fischer & Perruchoud

1991; Jentoftsen et al. 2009).

In the future, the suppliers of coke and pitch are expected to support the growing demand

for aluminum, but the major issues will be the increasing cost and the availability of high

quality raw material (Mannweiler et al. 2009; Baron et al. 2009; Edwards et al. 2012).

There are several reasons for the decrease in quality of the carbon materials. The most

important one is that both coke and pitch are by-products of their respective industries,

which have little or no economic incentive to improve or control their quality. The

second reason is that both raw materials depend on the source of crude oil or coal

and on the conditions of the coking or distillation process used to produce them. Coke

properties are highly dependent on the diversity of crude oil sources. As low sulfur, low

contaminant crudes become rarer, more contaminated crudes are being refined. This

leads to higher levels of contamination in the coke, especially for vanadium and sulphur.

It also leads to changes in the coke micro-texture from a sponge-like appearance to a

more isotropic texture, which can increase anode cracking in the cells (Edwards et al.

2009; Edwards et al. 2012).

Due to the rarity of high quality raw materials on the market, the higher cost of coke and

pitch drives the carbon plants to change suppliers more frequently, which in turn further increases

the variability of the raw materials coming into the carbon plants. Some

manufacturers are even considering using non-traditional, lower cost types of coke in the

anode paste (e.g. shot coke (Edwards et al. 2009)).

Unfortunately, the anode manufacturers are not well prepared to manage this increase in

raw material variability. There is a general lack of real-time measurements of key raw

material and green and baked anode properties. The coke and pitch properties are

characterized by laboratory analysis or by simply using the certificate of analysis (COA)

from the manufacturer, but these are often available after the manufacturing of the anodes.

Furthermore, the baked anode properties are measured by extracting cylindrical cores from

less than 1% of the production.

The issues with the representativeness of the anode core sample properties have been

discussed earlier, but the main problem is with the long delay to obtain those

measurements (i.e. a few weeks). The results are available too long after the anode has

been produced to be used for implementing corrective actions in the process (Sinclair &

Sadler 2006; Sinclair & Sadler 2009). Only long term deviations can be observed by this

monitoring strategy, and it is of limited use when raw material supply changes frequently.

Currently, the green anode quality is controlled by manipulating the amount of pitch in the

paste (i.e. pitch ratio in the formulation). Hulse (2000) presented a review of

empirical model-based pitch optimization techniques. At the plant, a combination of

operator experience, visual inspection of the newly formed green anode and the

green anode density (GAD) as a quantitative metric is used to estimate the required

amount of binder (i.e. pitch demand) in the anode. Pitch is also adjusted to ensure smooth


operation of the paste plant for any given formulation of paste or raw material blend.

Unfortunately, the GAD is not a good indicator of baked anode properties since it does not

show an optimum based on the pitch demand. The GAD increases even if the baked

anode properties are decreasing due to over-pitching (Hulse 2000). This is illustrated in

Figure 6, where the BAD, in contrast, shows an optimum. Also, since the choice of the

optimum pitch level depends on the operator’s experience, the quality of the anode can

change from one operator to the other.

Figure 6 – Illustration of the different behavior of GAD and BAD as a function of pitch %

Other process conditions (e.g. mixing temperature, paste formulation, etc.) are generally

kept at constant operating values. Almost no real-time changes are implemented in

response to the raw material variability due to the lack of on-line quantitative quality

measurements. This situation is illustrated in Figure 7, which shows that, given the increasing

variability of incoming raw materials, if the process conditions are kept constant rather than

adjusted, the variability propagates to the final baked anode properties. Real-time

adjustments of process conditions, through feedforward and feedback control actions,

are necessary to help produce anodes of consistent quality.

Figure 7 – Effect of constant operating conditions

In summary, the long delays in obtaining the raw material and baked anode properties

from the laboratory and the lack of online quantitative measurements in the paste plant

make it very difficult to face the increase in coke and pitch variability.

This situation can be improved by using the data that are already available at the smelters

and carbon plants. Tessier et al. (Tessier, Duchesne, Tarcy, et al. 2011) identified a set of

combined (i.e. multivariate) anode properties that could help ensure good pot performance

in the smelter. The availability of the information on anode quality, in real-time and for all

anodes, could prevent the introduction of faulty anodes in the reduction cells. The carbon

plant data coming from the raw material properties and the process operation conditions

can be used to predict the baked anode properties. Lauzon-Gauthier et al. (Lauzon-

Gauthier et al. 2012) have shown that, at the Deschambault smelter (ADQ), between 20% and

60% of the variance (i.e. model fit) in the anode properties (i.e. physical and mechanical

properties and gas reactivity) can be explained by using the routinely collected raw

material and process data. The multivariate statistical model proposed in that work only

used the data routinely measured in the plant and was shown to provide useful predictions

of the baked anode properties, available right after the baking process. This could allow for

early detection of faulty anodes and investigation of process deviations. However, the

model could only predict the properties for anodes baked at two specific positions within

the furnace due to the anode sampling strategy in place at ADQ. The above studies have

both shown that the data already routinely collected at the plant contain relevant and

useful information, but that new measurements are still necessary to explain a greater

percentage of the variations in baked anode properties which, in turn, will help improve

quality control. A combination of the currently available data and new measurement

techniques is therefore sought as a promising solution.

There are good opportunities in the anode manufacturing process to develop new tools

and sensors to improve the measurement of process variability and increase the ability of

the manufacturers to reduce the impact of the raw material variability. Machine vision

applications for process monitoring and control have become increasingly popular in

recent years. Duchesne et al. (Duchesne et al. 2012) reviewed several of these

applications including the detection of defects in lumber, monitoring of the mineral

froth in a flotation process and detection of steel slab surface defects. Since the variations

in raw material quality and operating conditions of the paste plant can influence the visual

appearance of the anode paste, there is a good opportunity to develop a machine vision

sensor capable of monitoring changes in the paste quality.

Several methods for characterizing coke or paste properties using images have been

reported in the literature. Most of the proposed methods use some automatic image

analysis scheme, but none of them can be applied in real time. Eilertsen et

al. (Eilertsen et al. 1996) have proposed a method for analysing the coke micro-texture

(i.e. coarseness and anisotropy) using a polarising light microscopy technique. Adams et

al. (Adams et al. 2002) developed a method to measure the thickness of the pitch layer on

coke particles by microscopy. Rorvik et al. (Rorvik et al. 2006) also proposed a method

using a microscope to measure the pitch layer thickness and the pore sizes. The main

disadvantages of these methods are that the sample size is small and that time-consuming sample

preparation is required for each measurement. These techniques are not rapid enough to

support on-line monitoring of the process.

Sadler (Sadler 2012) proposed a method to monitor the macroscopic visual appearance of

the baked anode surfaces using a microscope and found that visible structural changes in

the surface texture could be observed on anodes fabricated under different operating

conditions. It was used on baked anodes, but applicability to green anodes or paste would

enable carbon plants to react to process changes before the baking step. However, this

approach was not automated and also suffers from the same drawbacks as the other

microscopy-based methods described in the previous paragraph.

An internal report from Alcoa (Adams et al. 2007) describes a method based on images

used to measure the amount of pitch in the paste. This is the only known method of

automatic paste image analysis so far. Its major drawback was its lack of robustness to

changes occurring in industrial paste samples.

The fundamental hypothesis made and tested in this Ph.D. thesis is that the paste visual

textural appearance is influenced by the dry aggregate particle size distribution, the coke

particle porosity (i.e. pitch demand), the amount of pitch and the processing conditions of

the paste, and that a machine vision approach should allow quantifying the effect of these

parameters. To support this, a few anode paste images obtained under different

formulation and processing conditions are shown in Figure 8.

Figure 8 – Illustration of the effect of different raw material and processing conditions on

the anode paste visual appearance: a) and b) two different industrial pastes and c) laboratory

paste

1.7 Objectives

The general objective of this thesis is to address issues related to the lack of fast and

relevant measurements to help cope with raw materials and process variability and enable

real-time quality control of the green and baked anodes. It is a twofold approach where a

new non-destructive machine vision system is developed to add real-time information

about the green anode paste quality. This sensor could be used in a feedforward/feedback

control strategy to compensate for disturbances in raw material properties or formulation

variability that are difficult to measure with the usual monitoring approach. Also, a new

multi-block latent variable method is developed to improve the interpretation of the data

already available from the manufacturing process and to be able to include the additional

data coming from the machine vision sensor and other non-destructive real-time

measurements in the future in empirical models of the carbon plant.

The new sensor should be sensitive to changes in formulation and in the pitch demand of

the paste. The images are taken on the paste after mixing, but before compaction. An

illustration of the methodology is presented in Figure 9.

Figure 9 – Schematic of the machine vision methodology for anode paste

The first specific objective is to develop a machine vision algorithm (i.e. image

preprocessing, image analysis and feature selection) at the laboratory scale. This method

was developed with lab scale anodes in the laboratory at Université Laval. Paste samples

were prepared by varying the conditions of fabrication. These variations included the use

of different types of coke and pitch, variations of the dry aggregate particle size

distribution, of the fine particles fineness (i.e. the Blaine number), of the amount of pitch as

well as the mixing temperature of the paste. Each paste sample was imaged using a

camera in the visible spectrum (i.e. RGB). The image texture characteristics that enabled

the differentiation and classification of the different blends of paste were identified. The

image texture features were computed using two advanced image texture analysis methods:

the gray level co-occurrence matrix (GLCM) and wavelet texture analysis (WTA).

Multivariate latent variable statistical methods such as principal component analysis (PCA)

and projection to latent structures (PLS) were used to analyse the image features.

The second specific objective is to test the robustness of the machine vision sensor for

industrial scale anode paste. This was performed in the Alcoa Deschambault smelter’s

(ADQ) carbon plant. Off-line samples of paste were taken from the process after mixing

during several days of operation and the sensitivity of the sensor developed in the

laboratory to the various process conditions was tested.

Figure 9 diagram labels: formulation, pitch, particle size distribution, ...; forming and baking parameters; image texture features extraction (GLCM, WTA); classification (PCA) and regression (PLS); anode properties (BAD, resistivity, ...); feedforward controller; feedback controller.

The third specific objective is the development of a new sequential multi-block PLS

algorithm (SMB-PLS). Based on observations made in previous work by Tessier et al.

(Tessier, Duchesne & Tarcy 2011) and Lauzon-Gauthier et al. (Lauzon-Gauthier et al.

2012) it was found that there is a need to improve the visualization and interpretation of

PLS models for large and complex industrial datasets. It is also important to develop such an

algorithm as new real-time measurements (e.g. the paste machine vision sensor) become

available. This new algorithm will be useful in the future to integrate all the data related to

anode quality into one single empirical model. These data can be available from the raw

materials, the process operating conditions, some real-time non-destructive

measurements of the paste, green anodes and baked anodes, the baking furnace

operation data, etc. The algorithm was developed using a simulation dataset and the

anode manufacturing data from (Lauzon-Gauthier et al. 2012). The new method is

described and compared to the multi-block PLS (MB-PLS) and sequential orthogonal PLS

(SO-PLS). Also, the use of the machine vision data from the industrial paste is discussed.

The SMB-PLS algorithm was not developed to be specific to the anode manufacturing

process and could also be applied to other complex multi-block structured problems.

1.8 Thesis organization

This thesis is organized as follows. Chapter 2 and Chapter 3 provide background

information on statistical and image analysis methods, respectively. Chapter 4 discusses

the material properties and the experimental procedures. Chapter 5 presents a new

sequential multi-block PLS algorithm used to improve the interpretation of PLS models built

on industrial data using the anode manufacturing process data. Chapter 6 discusses the

texture features chosen for the anode paste machine vision methodology and the

results obtained with laboratory pastes and anodes. Chapter 7 focuses on describing the

industrial paste results obtained with the machine vision method and the use of these data in

an SMB-PLS model of the paste plant. Finally, some conclusions are drawn and future work

is discussed.

Chapter 2 Latent variable methods

This chapter presents the relevant statistical background information useful for the

understanding of the work presented in this thesis. It is a reproduction of chapter 3 of the

author’s M.Sc. thesis (Lauzon-Gauthier 2011) with modifications to sections 2.4 and 2.5.

The basic latent variable methods for multivariate statistical analysis are presented in this

chapter. These methods were developed in the field of chemometrics, defined by Svante

Wold as “How to get chemically relevant information out of measured chemical data, how

to represent and display this information, and how to get such information into data” (Wold

1995). The goal of these methods is to extract the most useful information from complex

and large datasets. They were also extended to chemical process analysis and monitoring

in the early 1990s (Wise & Gallagher 1996; MacGregor & Kourti 1995). Two of the

most used methods, Principal Component Analysis (PCA) and Projection to Latent

Structures (PLS), also referred to as Partial Least Squares, are presented in the following

sections together with a discussion on data scaling, the selection of the number of latent

variables to include in the models and various interpretation tools.

In this thesis, the following notation is used. Scalar quantities are identified using normal

lower case characters (scalar). Vectors are shown using bold lowercase characters

(lowercase), matrices are represented by bold capital characters (CAPITAL) and the

transpose operator is denoted by a superscript capital T (e.g. X^T or t^T).

2.1 Principal Component Analysis (PCA)

Principal Component Analysis is the basic multivariate data analysis approach. It is used

to model and investigate multivariate datasets. Detailed tutorials and examples can be

found in (Wold et al. 1987; Kourti 2005). Assume a data matrix X is available consisting of

I rows, commonly called observations or measurements, obtained from J different

variables (columns of X) as illustrated in Figure 10. Most industrial datasets are very large,

noisy, and the variables are typically highly collinear (i.e. X is not full rank). However,

measuring hundreds to thousands of variables on a given process does not necessarily

mean that hundreds of independent events occurred on this process. In fact, process

operation is usually driven by a much lower number of underlying independent events

called lurking or latent variables (LV) involving linear combinations of the original variables

(the p’s in Figure 10). These LVs cause the large number of process variables to vary

together in certain directions (i.e. in a correlated fashion). PCA is one of the basic methods

for extracting these few latent variables capturing most of the variance in a dataset. The

projection of the dataset onto the lower dimensional space of A dimensions spanned by

the latent variables can then be used to visualise and interpret the relationships between

the variables and between the observations.

Figure 10 – Schematic representation of PCA

The first principal component is the linear combination of the J columns (variables) of X,

defined by the orthonormal vector p1, explaining the greatest amount of variance in the

dataset. This is mathematically formulated as an eigenvector-eigenvalue problem with the

following objective function:

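$$\max_{\mathbf{p}_1}\left\{\mathbf{p}_1^{T}\mathbf{X}^{T}\mathbf{X}\mathbf{p}_1\right\}\quad\text{subject to}\quad\mathbf{p}_1^{T}\mathbf{p}_1 = 1 \tag{2.1}$$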

Where the term within brackets represents the variance of the first latent variable t1

defined as the projection of X in the direction of p1:

$$\mathbf{t}_1 = \mathbf{X}\mathbf{p}_1 \tag{2.2}$$

This latent variable explains the most variance in X and it is removed from the dataset

leaving the residual matrix E1:

$$\mathbf{E}_1 = \mathbf{X} - \mathbf{t}_1\mathbf{p}_1^{T} \tag{2.3}$$

If the first component is not sufficient for explaining the variations in X, a second PCA

component can be added to the model. It corresponds to the linear combination of the J

variables explaining the greatest amount of variance not captured by the first component,

(i.e. left in the residual matrix E1). The second component is the solution to the following

eigenvalue problem:

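$$\max_{\mathbf{p}_2}\left\{\mathbf{p}_2^{T}\mathbf{X}^{T}\mathbf{X}\mathbf{p}_2\right\}\quad\text{subject to}\quad\mathbf{p}_2^{T}\mathbf{p}_2 = 1\ \text{and}\ \mathbf{p}_1^{T}\mathbf{p}_2 = 0 \tag{2.4}$$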

The additional constraint for this second component ensures that the latent variables are

orthogonal to each other. Additional components can be added sequentially to the PCA

model using expression 2.4 until the desired number of latent variables (A) is computed.

The maximum number of LVs is J, but for industrial data A is usually smaller than J (A <<

J) due to the highly collinear structure of the data. The final model has the following

structure:

$$\mathbf{X} = \mathbf{T}\mathbf{P}^{T} + \mathbf{E} \tag{2.5}$$

Where the score and loading vectors are collected in the matrices T (I×A) and P (J×A) and

the residuals are stored in matrix E (I×J).

In summary, PCA performs an eigenvector decomposition of the cross-product matrices of X: the p vectors are the

eigenvectors of X^T X and the t vectors are the eigenvectors of X X^T.
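
As a minimal sketch of this summary (not the implementation used in this work), the following Python/NumPy example computes the PCA decomposition by SVD and checks numerically that the loadings are eigenvectors of X^T X, with eigenvalues equal to t^T t. The function name `pca_svd` and the synthetic rank-3 dataset are illustrative assumptions.

```python
import numpy as np

def pca_svd(X, n_components):
    """PCA of a mean-centred matrix X (I x J) by SVD, giving X = T P^T + E."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:n_components].T      # loadings (J x A), orthonormal columns
    T = X @ P                    # scores (I x A): t_a = X p_a
    E = X - T @ P.T              # residuals (I x J)
    return T, P, E

# Synthetic example (assumed data): 8 collinear variables driven by 3 latent events
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 8))
X = X - X.mean(axis=0)

T, P, E = pca_svd(X, n_components=3)
eigenvalues = np.sum(T**2, axis=0)   # t_a^T t_a for each component
loadings_are_eigenvectors = np.allclose(X.T @ X @ P, P * eigenvalues)
residuals_vanish = np.allclose(E, 0.0, atol=1e-8)
print(loadings_are_eigenvectors, residuals_vanish)
```

Because the synthetic data have only three underlying sources of variation, three components are expected to leave essentially nothing in the residual matrix E, which illustrates the case A << J.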

For the numerical computation of the p and t vectors, two alternative algorithms are

available. First, one can use Singular Value Decomposition (SVD) to compute all possible

components (a=1, 2, … J) simultaneously and then select the number of LVs to keep in

the model (A<J) using some selection criteria (discussed later in this chapter). The second

option is to use the Nonlinear Iterative Partial Least Squares (NIPALS) algorithm, which computes

the components one at a time followed by significance testing. A decision to add the next

component is again taken based on some selection criteria. The advantage of this

approach is computational economy in the sense that only those components that are

deemed significant are calculated. The NIPALS algorithm is used in most commercially

available software packages. PCA components are scaling dependent and this issue will be

discussed in section 2.3. Figure 11 shows the details of the NIPALS algorithm for PCA

(Geladi & Kowalski 1986).

Figure 11 – NIPALS algorithm for PCA

When J is very large, which is normally the case with industrial data, this method is

advantageous compared to SVD since it is often not necessary to compute

all latent variables (in this case, A<<J).
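
The NIPALS steps for PCA listed in Figure 11 can be sketched in Python/NumPy as follows. This is an illustrative sketch, not the code used in this work: the function name `nipals_pca`, the relative convergence tolerance, the iteration cap and the choice of the highest-variance column as starting vector are assumptions (the algorithm only requires that t be set to any column of X).

```python
import numpy as np

def nipals_pca(X, n_components, tol=1e-10, max_iter=500):
    """Sequential PCA by NIPALS; returns scores T, loadings P and residuals E."""
    E = np.asarray(X, dtype=float).copy()
    n_obs, n_vars = E.shape
    T = np.zeros((n_obs, n_components))
    P = np.zeros((n_vars, n_components))
    for a in range(n_components):
        # Step 1: initialise t with a column of the current matrix (assumed choice)
        t = E[:, np.argmax(E.var(axis=0))].copy()
        for _ in range(max_iter):                      # step 2: convergence loop
            p = E.T @ t / (t @ t)                      # step 2.1
            p = p / np.sqrt(p @ p)                     # step 2.2
            t_new = E @ p                              # step 2.3
            converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if converged:                              # step 2.4
                break
        E = E - np.outer(t, p)                         # step 3 (E replaces X, step 5)
        T[:, a], P[:, a] = t, p                        # step 4
    return T, P, E
```

Only the requested number of components is computed, which is the computational advantage over a full SVD mentioned above. The convergence rate depends on how well separated the successive eigenvalues are.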

2.2 Projection to Latent Structures (PLS)

Projection to Latent Structures is a multivariate regression method. Assume that a second

dataset Y is also available consisting of H variables and I observations (e.g. response

variables such as product quality attributes) as shown in Figure 12. The PLS method is

used to explore the relationships existing within and between both datasets, X and Y. It

can be seen as an extension of PCA, but for two sets of data.

Figure 12 – Matrices of PLS

The basic assumption behind PLS is that variations in X and Y are linked together by a few

underlying events described by a common set of A latent variables T and U, respectively

(I×A). These latent variables in the X and Y space are selected in such a way that the

covariance between the two datasets is maximized (i.e. that T is most predictive of U).

NIPALS algorithm for PCA (steps shown in Figure 11):

1. Set t to any column of X.
2. Start convergence loop.
   2.1. p = X^T t / (t^T t)
   2.2. p = p / (p^T p)^(1/2)
   2.3. t = X p
   2.4. Check for convergence of t and p. Continue to step 3 if converged.
3. E = X - t p^T
4. Store p and t as new columns in P and T.
5. Restart at step 1, replacing X by E.

Figure 12 diagram labels: X (I×J) with T, W^T and P^T; Y (I×H) with U and Q^T.

Additional details and tutorials can be found in (Geladi & Kowalski 1986; Höskuldsson

1988; Burnham et al. 1996; Burnham et al. 1999; Wold, Sjöström, et al. 2001; Martens

2001; Kourti 2005).

Mathematically, the latent variables are defined as a set of linear combinations of the X

variables expressed by the so-called weight vectors wi (i = 1, ..., A), whose elements are

computed so as to maximize the squared covariance between X and Y. The

solution to this problem is again formulated as an eigenvalue problem with the following

objective function:

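$$\max_{\mathbf{w}_i}\left\{\mathbf{w}_i^{T}\mathbf{X}^{T}\mathbf{Y}\mathbf{Y}^{T}\mathbf{X}\mathbf{w}_i\right\}\quad\text{subject to}\quad\mathbf{w}_i^{T}\mathbf{w}_i = 1\ \text{and to}\ \mathbf{w}_i^{T}\mathbf{w}_j = 0\ \text{for}\ i \neq j \tag{2.6}$$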

As for PCA, the set of constraints ensure that the weight vectors wi are orthonormal and

that latent variables are orthogonal to each other. The PLS model structure is described

below, and is also shown schematically in Figure 12.

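$$\mathbf{X} = \mathbf{T}\mathbf{P}^{T} + \mathbf{E} \tag{2.7}$$

$$\mathbf{Y} = \mathbf{T}\mathbf{Q}^{T} + \mathbf{F} \tag{2.8}$$

$$\mathbf{T} = \mathbf{X}\mathbf{W}^{*} \tag{2.9}$$

$$\mathbf{W}^{*} = \mathbf{W}\left(\mathbf{P}^{T}\mathbf{W}\right)^{-1} \tag{2.10}$$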

Where T (I×A) is the set of A latent variables defining the common latent variable space

capturing the relationships between X and Y. They correspond to those combinations of

the X variables that are the most highly correlated with the Y data. The weights of each

variable in each component are collected in the weight matrix W* (J×A). The P (J×A) and

Q (H×A) matrices contain the loading vectors defining the latent variable spaces of X and

Y, respectively. E (I×J) and F (I×H) are the PLS model residuals. It was shown by

(Höskuldsson 1988) that the vectors w, q, t and u are the eigenvectors of the following

matrices X^T Y Y^T X, Y^T X X^T Y, X X^T Y Y^T and Y Y^T X X^T, respectively.

The NIPALS algorithm was adapted for PLS regression by (Geladi & Kowalski 1986;

Höskuldsson 1988) in order to compute the PLS latent variables sequentially. Again, only

the desired number of LVs is calculated. The algorithm is shown in Figure 13. The PLS

vectors are also scaling dependent. Data scaling and the selection of the number

of latent variables are discussed in sections 2.3 and 2.4, respectively.

Figure 13 – NIPALS algorithm for PLS
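
The NIPALS steps for PLS (Figure 13), together with equation 2.10, can be sketched in Python/NumPy as follows. This is a hedged illustration rather than the implementation used in this work: the function name `nipals_pls`, the tolerance, the iteration cap, the starting column of Y and the fact that the Y loadings q are also stored are assumptions.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """Sequential PLS by NIPALS on (already scaled) X (I x J) and Y (I x H)."""
    E = np.asarray(X, dtype=float).copy()
    F = np.asarray(Y, dtype=float).copy()
    n_obs, n_x = E.shape
    n_y = F.shape[1]
    W = np.zeros((n_x, n_components))
    P = np.zeros((n_x, n_components))
    Q = np.zeros((n_y, n_components))
    T = np.zeros((n_obs, n_components))
    U = np.zeros((n_obs, n_components))
    for a in range(n_components):
        u = F[:, np.argmax(F.var(axis=0))].copy()   # step 1 (assumed choice of column)
        t_old = np.zeros(n_obs)
        for _ in range(max_iter):                   # step 2: convergence loop
            w = E.T @ u / (u @ u)                   # step 2.1
            w = w / np.sqrt(w @ w)                  # step 2.2
            t = E @ w                               # step 2.3
            q = F.T @ t / (t @ t)                   # step 2.4
            u = F @ q / (q @ q)                     # step 2.5
            if np.linalg.norm(t - t_old) <= tol * np.linalg.norm(t):
                break                               # convergence of t
            t_old = t
        p = E.T @ t / (t @ t)                       # step 3
        E = E - np.outer(t, p)                      # step 4: deflate X
        F = F - np.outer(t, q)                      #         and Y
        W[:, a], P[:, a], Q[:, a] = w, p, q         # step 5
        T[:, a], U[:, a] = t, u
    W_star = W @ np.linalg.inv(P.T @ W)             # equation 2.10
    return T, U, W, P, Q, W_star
```

With these outputs, the scores can be recovered directly from the original X matrix as T = X W* (equation 2.9), which is what makes W* convenient for interpreting the contribution of each original variable.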

2.3 Data scaling

Both PCA and PLS methods are sensitive to how the X and Y data matrices are scaled.

When no prior knowledge is available on the relative importance of the variables, the

common practice is to scale them to unit variance after applying mean-centering. This

scaling procedure is applied to each variable (i.e. columns) of the X and Y data matrices.

Consider a column vector (xj) in the X data matrix and its mean value (xj,mean) and standard

deviation (xj,std). The scaled values (xj*) are obtained using the following equation (element

by element division is assumed):

$$\mathbf{x}_j^{*} = \frac{\left(\mathbf{x}_j - x_{j,\mathrm{mean}}\right)}{x_{j,\mathrm{std}}} \tag{2.11}$$

This method is also called auto-scaling. Mean-centering allows the computation of the

variations of the variables around their mean, and scaling to unit variance gives equal

importance to all the variables in the models, as not all of them are measured in the same

engineering units (Geladi & Kowalski 1986).
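
A minimal Python/NumPy sketch of auto-scaling (equation 2.11) is given below. The function names and the use of the sample standard deviation (`ddof=1`) are assumptions, since the divisor is not specified here. The sketch also shows that new observations should be scaled with the mean and standard deviation of the training data.

```python
import numpy as np

def autoscale(X, ddof=1):
    """Mean-centre each column of X and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=ddof)     # a constant column would give std = 0
    return (X - mean) / std, mean, std

def apply_autoscale(X_new, mean, std):
    """Scale new observations with the training mean and standard deviation."""
    return (np.asarray(X_new, dtype=float) - mean) / std

# Hypothetical example: two variables measured in very different units
X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]])
X_scaled, x_mean, x_std = autoscale(X)
```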

2.4 Number of latent variables (A)

Industrial data are typically highly collinear and noisy. Collinearity implies that a small

number of latent variables is sufficient to capture and explain most of the variations in a

dataset (X and/or Y). The corruption of the data by noise means that care must be

taken to model only the systematic variation (i.e. structured variations) and guard against

overfitting the model with noise. When the correct number of latent variables (A) is

selected, the important information is stored in the loadings and weight matrices (P, Q and

W*) and the irrelevant variations are left in the residuals (E and F). The most commonly

used method for selecting the number of latent variables is cross-validation (Wold 1978),

but other methods also exist to determine the model order (Nomikos & MacGregor 1995;

Valle et al. 1999; Duchesne & MacGregor 2001).

NIPALS algorithm for PLS (steps shown in Figure 13):

1. Set u to any column of Y.
2. Start convergence loop.
   2.1. w = X^T u / (u^T u)
   2.2. w = w / (w^T w)^(1/2)
   2.3. t = X w
   2.4. q = Y^T t / (t^T t)
   2.5. u = Y q / (q^T q)
   2.6. Check for convergence of t or u. Continue to step 3 if converged.
3. p = X^T t / (t^T t)
4. E = X - t p^T and F = Y - t q^T
5. Store w, p, t and u as new columns in W, P, T and U.
6. Restart at step 1, replacing X by E and Y by F.

In the cross-validation (CV) method, latent variables are added to the model

until the latest component does not significantly improve predictions of X (PCA) or Y

(PLS). For the cross-validation procedure, the I observations in X and/or Y are divided into

g sub-groups of n observations (I=gn). Each sub-group is removed from the data once and

only once and the data in the remaining g-1 sub-groups are then used to build a PCA or

PLS model using a latent variables. Predictions are computed for the group left out of the

analysis and the prediction error sum of squares (PRESS) is computed for this sub-group.

PRESS(a) is the sum of the PRESS values for all g sub-groups for a model with a latent

variables (a = 1, 2, …, A). The model predictive ability is then evaluated with the predictive

multiple correlation coefficient (Q2CV) (Wold 1978):

$$Q^{2}_{CV}(a) = 1 - \frac{\mathrm{PRESS}(a)}{\mathrm{SS_R}(a-1)} \tag{2.12}$$

where

$$\mathrm{PRESS}(a) = \sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{i,j} - \hat{y}^{\,\mathrm{pred}}_{i,j}(a)\right)^{2} \tag{2.13}$$

and

$$\mathrm{SS_R}(a) = \sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{i,j} - \hat{y}^{\,\mathrm{training}}_{i,j}(a)\right)^{2} \tag{2.14}$$

In the above equations, I is the number of observations, J is the number of variables, a is

the number of model components (a = 1, 2, …, A), and SSR(a-1) is the residual sum of

squares in fit of the model with a-1 latent variables. Equations 2.13 and 2.14 have exactly

the same form. However, they are not computed on the same data: SSR is the residual in fit

while PRESS is the residual in prediction (i.e. data left out in cross-validation rounds).

Q2CV is computed sequentially for each new component. A value of Q2CV > 0 means that the

component improves the prediction ability of the model, whereas Q2CV < 0 means that it deteriorates it.

The number of components retained corresponds to the last component with Q2CV > 0.

Another definition was also proposed for Q2CV and is currently used in the ProMV™

software package (ProSensus Inc.) and was described in (Wold, Trygg, et al. 2001).

$$Q^{2}_{CV}(a) = 1 - \frac{\mathrm{PRESS}(a)}{\mathrm{SS_Y}} \tag{2.15}$$

Where SSY is the total sum of squares of Y. In this case, the Q2CV increases with

the number of components and starts decreasing when overfitting occurs. Usually, the

component chosen is the last one that increases the Q2CV by more than 1% (i.e. 0.01).

This definition of Q2 (equation 2.15) is the one reported in this thesis when the predictive

ability of the models is assessed.

The number of components can also be selected based on the smallest root mean

squared error of prediction in cross-validation (RMSEPCV). This metric is an estimate of

the standard deviation of the prediction errors obtained in the cross-validation procedure and is

computed for each variable. When more than one Y variable is present, a

compromise must be made since the minimum RMSEPCV might not be obtained for all

variables with the same number of components.

$$\mathrm{RMSEPCV}_{j}(a) = \sqrt{\frac{1}{N}\sum_{i=1}^{I}\left(y_{i,j} - \hat{y}^{\,\mathrm{pred}}_{i,j}(a)\right)^{2}} \tag{2.16}$$

In equation 2.16, j is the selected variable index, a is the

number of model components (a = 1, 2, …, A), and N = I is the number of observations.

Selecting too few latent variables leaves some structured information in the residuals.

However, selecting too many latent variables leads to overfitting and modeling of the noise

in the data.
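
The cross-validation procedure can be sketched in Python/NumPy as follows, using the Q2CV definition of equation 2.15 and the RMSEPCV of equation 2.16. This is an illustrative sketch under several assumptions: the function names are hypothetical, the data are only mean-centred (no unit-variance scaling), the sub-groups are formed by interleaving the observations, and the PLS weights are obtained from the dominant eigenvector of E^T F F^T E (computed by SVD) in place of the NIPALS inner loop.

```python
import numpy as np

def pls_coefficients(X, Y, n_components):
    """Regression matrix B (J x H) of a PLS model fitted on centred X and Y."""
    E, F = X.copy(), Y.copy()
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Dominant eigenvector of E^T F F^T E (swapped in for the NIPALS loop)
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w
        p = E.T @ t / (t @ t)
        q = F.T @ t / (t @ t)
        E = E - np.outer(t, p)
        F = F - np.outer(t, q)
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    return W @ np.linalg.inv(P.T @ W) @ Q.T          # B = W* Q^T

def cross_validate(X, Y, max_components, n_groups=5):
    """Q2CV(a) (eq. 2.15) and RMSEPCV_j(a) (eq. 2.16) for a = 1..max_components."""
    n_obs, n_y = Y.shape
    groups = np.arange(n_obs) % n_groups             # each observation left out once
    press = np.zeros((max_components, n_y))          # PRESS per component and Y variable
    for g in range(n_groups):
        left_out = groups == g
        x_mean = X[~left_out].mean(axis=0)
        y_mean = Y[~left_out].mean(axis=0)
        for a in range(1, max_components + 1):
            B = pls_coefficients(X[~left_out] - x_mean, Y[~left_out] - y_mean, a)
            Y_pred = (X[left_out] - x_mean) @ B + y_mean
            press[a - 1] += np.sum((Y[left_out] - Y_pred) ** 2, axis=0)
    ss_y = np.sum((Y - Y.mean(axis=0)) ** 2)
    q2 = 1.0 - press.sum(axis=1) / ss_y
    rmsep = np.sqrt(press / n_obs)
    return q2, rmsep

def select_n_components(q2, min_gain=0.01):
    """Count the leading components that each raise Q2CV by more than min_gain."""
    gains = np.diff(np.concatenate(([0.0], q2)))
    n_selected = 0
    for gain in gains:
        if gain <= min_gain:
            break
        n_selected += 1
    return n_selected
```

Refitting the model for every sub-group and every number of components is not efficient, but it keeps the link with the PRESS definition explicit. The 0.01 threshold in `select_n_components` corresponds to the 1% rule described for equation 2.15.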

Alternatively, one could use a separate validation dataset for computing predictive ability.

While adding one LV at a time, it is possible to compute the PRESS on the validation set

until the predictive ability starts to degrade due to overfitting. This approach with external

data is the best way to validate a model, but a high number of observations is required in

order to split a dataset into a training set and a validation set (typically 2/3 and 1/3).

2.5 Model interpretation tools

Aside from the model structure of PCA and PLS, which are powerful methods for process

modeling, a number of tools can be used to help interpret the models and learn from the

data. First, the score scatter plots (ti-tj) and loadings plots (pi-pj) are used to interpret the

relationships between the observations and the variables, respectively. A combination of

two or three latent variables can be simultaneously visualized through these tools using 2D

or 3D scatter plots. The use of these score plots will be illustrated later in the results

section.

The residual Q statistic is the squared perpendicular distance of an observation from the

latent variable space. It is useful for detecting outliers because it highlights observations

with a different correlation structure than that of the data used to build the PCA and PLS

models (i.e. outliers in the space orthogonal to the LV space).

$$Q_i(a) = \sum_{j=1}^{J} e_{ij}(a)^{2} \tag{2.17}$$

Qi(a) is computed from equation 2.17, where eij(a) is the residual of observation i and

variable j obtained with a model built using a latent variables.

$$e_{ij}(a) = x_{i,j} - \hat{x}_{i,j}(a) \tag{2.18}$$

Equations 2.17 and 2.18 define Q for the X space, but it is possible to compute this

statistic for the Y space by replacing e by the residuals of Y (i.e., elements of the F matrix)

and the x variables by y variables.

Hotelling’s T2 is the squared Mahalanobis distance of an observation from the center of the LV

space. It can also be used for detecting outliers in the LV space.

$$T_i^{2} = \sum_{a=1}^{A}\frac{t_{ia}^{2}}{s_{t_a}^{2}} \tag{2.19}$$
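
Both outlier detection statistics can be sketched in a few lines of Python/NumPy. The function names are hypothetical, and the score variances are assumed to be estimated from the training scores with the sample variance (`ddof=1`). No control limits are computed in this sketch.

```python
import numpy as np

def hotelling_t2(T, score_var):
    """Hotelling's T2 of each observation: sum over LVs of t_ia^2 / s_ta^2."""
    return np.sum(T**2 / score_var, axis=1)

def q_residual(X, T, P):
    """Q statistic of each observation: sum of squared residuals in the X space."""
    E = X - T @ P.T
    return np.sum(E**2, axis=1)

# Hypothetical example with a 2-component PCA model obtained by SVD
rng = np.random.default_rng(4)
X = rng.normal(size=(25, 5))
X = X - X.mean(axis=0)
P = np.linalg.svd(X, full_matrices=False)[2][:2].T   # loadings (J x A)
T = X @ P                                            # scores (I x A)
t2 = hotelling_t2(T, T.var(axis=0, ddof=1))
q = q_residual(X, T, P)
```

An observation with a large T2 is far from the center of the model within the LV space, whereas an observation with a large Q does not follow the correlation structure captured by the model.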

An additional tool which can help identify important variables in a PLS model is the

variable importance in projection (VIP) which is an indication of the importance of a

variable in predicting the Y variables (Eriksson et al. 2001):

$$\mathrm{VIP}_{j,A} = \sqrt{J \times \sum_{a=1}^{A} w_{j,a}^{2}\,\frac{\mathrm{SS_Y}(a)}{\mathrm{SS_Y}(A)}} \tag{2.20}$$

Where wja is the weight of the jth variable (from X) in the ath PLS latent variable, SSY(a) is

the sum of squares of Y explained by the ath LV of the PLS model and SSY(A) is the total

sum of squares of Y explained by the model. Those variables having a VIP value greater

than 1 are considered to be the most influential in the model (Eriksson et al. 2001).
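
A short Python/NumPy sketch of the VIP calculation is given below. It assumes the common square-root form of the VIP, unit-norm weight vectors, and that the sum of squares of Y explained by the ath latent variable is computed as (t_a^T t_a)(q_a^T q_a); the function name `vip` is hypothetical.

```python
import numpy as np

def vip(W, T, Q):
    """VIP of each X variable from PLS weights W (J x A), scores T (I x A)
    and Y loadings Q (H x A)."""
    n_vars = W.shape[0]
    ss_y = np.sum(T**2, axis=0) * np.sum(Q**2, axis=0)   # SS_Y explained by each LV
    w2 = W**2 / np.sum(W**2, axis=0)                     # squared unit-norm weights
    return np.sqrt(n_vars * (w2 @ ss_y) / ss_y.sum())

# Hypothetical model with J = 4 variables, A = 2 latent variables and H = 1 response
W = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
T = np.array([[np.sqrt(3.0), 0.0], [0.0, 1.0]])
Q = np.array([[1.0, 1.0]])
vip_values = vip(W, T, Q)
```

With this definition the squared VIP values sum to J, so their average is 1, which is consistent with using 1 as the threshold for influential variables.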

Finally, another useful interpretation tool is the contribution plot. It essentially consists of

the difference in the values of a particular variable between two observations (or averaged

over some clustered observations) weighted by the importance of that variable in the

model given by the PLS model weights (w*). It indicates which combination of variables

contributes the most to a deviation in the score space (T) of a latent variable model. It

does not generally reflect a cause and effect relationship, but it is a good indicator of

possible root causes. The calculation of the contributions is explained in (Westerhuis et al.

2000; Kourti 2005). The contribution of variable j to the shift between two observations (k1

and k2) is computed using the expression below.

$$C_j = \sum_{a=1}^{A}\left[\frac{\left(t_{a,k_2} - t_{a,k_1}\right)}{s_{t_a}^{2}} \times w^{*}_{ja}\right]\left(x_{jk_2} - x_{jk_1}\right) \tag{2.21}$$

Where xjk1 and xjk2 are the values of the jth variable at time (or observation number) k1 and

k2, w*ja is the weight associated with the jth variable of the ath latent variable and s2ta is the

variance of the ath score. Dividing by the score variance gives an equal importance of

deviations in each LV. For the contribution between two groups of observations, the

difference between the mean values of each variable in the two groups is used.

Chapter 3 Image texture analysis

3.1 Machine vision

The use of digital imaging sensors as data acquisition devices is now widespread in very

diverse areas such as for laboratory applications, for medical imaging or for industrial

process control. Machine vision sensors are now used to measure and collect data in the

same way as flowmeters, thermocouples and pH probes for instance.

Machine vision sensors are typically developed according to the general framework

presented in Figure 14 involving four successive stages: 1) image acquisition, 2) image

pre-treatment (when necessary), 3) extraction of image features (spectral, textural or both)

and 4) analysis of these features (Liu 2005; Duchesne 2010).

Figure 14 – Schematic of the machine vision approach (Liu 2005; Duchesne 2010)

The acquisition of a digital image is usually accomplished using a camera equipped with a CCD sensor (i.e. charge-coupled device) which converts the photon intensity of the captured light to a digital signal. However, any other type of system capturing a digital image can be used. Various pre-treatments can be applied to the raw image in order to filter noise or to remove irrelevant sources of variation such as non-uniform illumination or instrumental variations (e.g. pixel-to-pixel variations of a CCD sensor). Gonzalez and Woods (Gonzalez & Woods 2008) and Sonka et al. (Sonka et al. 2008) describe several traditional techniques used for image pre-treatment.

Multivariate image analysis techniques were recently reviewed by Prats-Montalbán et al.

(Prats-Montalbán et al. 2011) and Duchesne et al. (Duchesne et al. 2012). The methods

are essentially classified according to the nature of the features they are extracting from

images: spectral features only, textural features only, or a combination of both. First,

multivariate image analysis (MIA) is used for extracting spectral features from a

multivariate (i.e. multi-channel) image such as RGB color images or hyperspectral images.

On the other hand, texture analysis methods are used when the spatial relationship

between the pixels is important. A combination of both approaches (i.e. spectral and

texture) can be used if both types of information are present in the image (Liu &

MacGregor 2007).

In this thesis, the most useful information to be extracted from anode paste images is related to their texture. Thus, only image texture analysis methods are reviewed in this chapter. For a broader perspective on multivariate imaging techniques, the interested reader is referred to the following review papers (Prats-Montalbán et al. 2011; Duchesne et al. 2012). Among the several methods available for texture analysis, the following two were used in this work: wavelet texture analysis (WTA) and the gray level co-occurrence matrix (GLCM). Both are considered state-of-the-art methods in the machine vision field.

3.2 Digital image

Digital images are stored as data matrices or arrays depending on whether the image is

gray-level or multi-channel (e.g. RGB or hyperspectral). A gray-level image is a two-

dimensional function f(x,y) where x and y are the spatial coordinates within the image and f

is the light intensity recorded at each (x,y) position in the image. The (x,y) positions

correspond to a discretization of the image scene into so-called pixels. The number of

pixels in both spatial directions, I × J, defines the image size and its spatial resolution.

Usually, the gray levels of each pixel are represented by integer values (spectral

discretization) ranging between 0 and 255 for an 8-bit system (i.e. 2^8 = 256 possible

integers). A simple gray level image is a matrix of size I × J. A color

image is a three-dimensional array of size I × J × K, where each of the K slices corresponds

to the gray levels in a particular channel (color or wavelength). These colors are red, green

and blue in the case of a traditional color image (i.e. also called RGB images). The number

of channels K in multi- and hyperspectral images typically ranges from tens to hundreds

(Grahn & Geladi 2007).
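As a minimal sketch of these data structures, the arrays below represent an 8-bit gray-level image and an RGB image with NumPy. The image size and pixel values are arbitrary assumptions used only to show the shapes.

```python
import numpy as np

I, J = 4, 6                                   # rows and columns (hypothetical size)
gray = np.zeros((I, J), dtype=np.uint8)       # gray-level image: I x J matrix
gray[1, 2] = 255                              # brightest value of an 8-bit pixel
rgb = np.zeros((I, J, 3), dtype=np.uint8)     # colour image: I x J x K array with K = 3
rgb[..., 0] = 200                             # fill the first (red) channel
levels = np.iinfo(gray.dtype).max + 1         # number of gray levels of an 8-bit pixel
```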

3.3 Image texture analysis

There are several definitions for image texture. For example, the texture of an image is

defined as the spatial variations of pixel intensities (Bharati et al. 2004). It is also a

representation of the structure of an image that is a regular repetition of an element or a

pattern on a surface (Srinivasan & Shobha 2008). In any case, image texture involves

extracting the relationships between the pixel intensities within a certain neighborhood in

the image. The image texture methods developed in the past fall into four categories (Liu

2005; Duchesne et al. 2012) and differ mainly by the way they quantify the spatial

relationships between the pixels: 1) geometric, 2) model based, 3) statistical and 4)

transform based methods.

Geometric methods (or structural approaches) are described in detail in (Gonzalez &

Woods 2008). These methods are best suited to describe well defined geometric shapes

or regular textural patterns. These methods are not appropriate for anode paste images

because the textural patterns are not regular (they depend on the irregular shapes of the coke

particles and the amount of pitch in the formulation).

Model based methods make use of parametric models to extract textural information from

the images. For example, the Markov random field (MRF) and fractal models (Cross & Jain

1983; Chen et al. 1993) are the most widely used models. However, these methods require high

computing power (Materka et al. 1998). This is not an issue for off-line applications, but

could still be a problem for on-line real-time image analysis applications.

In statistical methods, the image texture is represented by stochastic characteristics that

are calculated from the distribution of gray levels in the image. Simple first order statistics

computed from the intensity distribution (i.e. the average, variance, skewness and kurtosis

of the intensities across the image) describe intensity variations within the image but do

not account for the spatial relationships between the pixels. Second order statistics are

more appropriate for extracting relationships between pixels because they use the joint

distributions of intensities of pairs of pixels located within a certain neighborhood. Hence,

they are more efficient for describing image textural patterns because they maintain the

spatial relationship between the pixels (Gonzalez & Woods 2008; Srinivasan & Shobha

2008). The most widely used statistical method is the gray level co-occurrence matrix

(GLCM) (Haralick et al. 1973). The GLCM is a matrix whose elements contain the

probability of occurrence of the gray levels of pairs of pixels at a given distance and angle from each

other. A number of scalar features (or textural descriptors) calculated from this matrix are

typically used to characterize the texture of an image. Similar methods such as the gray

level pixel-run matrix (GLPRM) (Galloway 1975) and the neighbouring gray level

dependence matrix (NGLDM) (Sun & Wee 1983) have been proposed but are less used in

the literature.
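The difference between first and second order statistics can be illustrated with two synthetic images, both assumptions made for this sketch: one split into a dark and a bright half, the other a checkerboard. They share the same histogram, so the mean and variance cannot tell them apart, whereas a statistic computed on pairs of neighbouring pixels can.

```python
import numpy as np

blocks = np.zeros((8, 8), dtype=int)
blocks[:, 4:] = 1                                  # left half dark, right half bright
checker = np.indices((8, 8)).sum(axis=0) % 2       # alternating 0/1 pixels

# First order statistics: identical for both images
same_mean = blocks.mean() == checker.mean()
same_var = blocks.var() == checker.var()

def horizontal_contrast(img):
    """Mean squared gray-level difference between horizontally adjacent pixels."""
    diff = img[:, 1:] - img[:, :-1]
    return float((diff ** 2).mean())

# Second order statistic: depends on the spatial arrangement of the pixels
c_blocks = horizontal_contrast(blocks)
c_checker = horizontal_contrast(checker)
```

The checkerboard changes gray level at every horizontal step, while the two-block image changes only once per row, so the pairwise statistic separates the two textures.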

The last class of texture methods comprises transform based approaches. These methods, often

referred to as multi-resolution texture analysis, perform a frequency or spatial-frequency

decomposition of the 2-D image signal. It has been shown that the power of the signal in

different frequency bands and in different areas within an image can be related to

textural patterns (Livens et al. 1997; Van de Wouwer et al. 1999; Bharati et al. 2004).

Several different methods using different types of transforms exist (e.g. Fourier and

Wavelet transform). The wavelet transform has been preferred to the Fourier transform for

image texture analysis because it retains the spatial and frequency information instead of

frequency only (i.e. Fourier). Indeed, the spatial information is lost when the Fourier

transform is used (Bharati et al. 2004; Liu & Han 2011).
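The loss of spatial information can be seen on a one-dimensional toy signal (an assumption made for this sketch). A short pulse and a shifted copy of it have the same Fourier magnitude spectrum, whereas a simple local difference, used here as a stand-in for a Haar-like wavelet detail, responds at the position of the pulse.

```python
import numpy as np

n = 64
pulse_early = np.zeros(n)
pulse_early[5:9] = 1.0                      # short pulse near the start of the signal
pulse_late = np.roll(pulse_early, 40)       # same pulse, shifted by 40 samples

# The magnitude spectra are identical: the position of the pulse is not visible
mag_early = np.abs(np.fft.fft(pulse_early))
mag_late = np.abs(np.fft.fft(pulse_late))
spectra_match = np.allclose(mag_early, mag_late)

# Local differences of neighbouring samples are non-zero only at the pulse edges
detail_early = pulse_early[1::2] - pulse_early[0::2]
detail_late = pulse_late[1::2] - pulse_late[0::2]
```

Strictly speaking, the position is still encoded in the Fourier phase. It is the magnitude (or power) spectrum, from which texture features are usually derived, that is insensitive to position.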

Both the GLCM (statistical) and the Wavelet Texture Analysis (transform-based)

approaches were used in this thesis for the analysis of anode paste images. Therefore,

these two methods are described in greater detail in the rest of this chapter.

3.3.1 Gray level co-occurrence matrix (GLCM)

The GLCM method was proposed by Haralick et al. (Haralick et al. 1973). The GLCM of an

image is an estimate of the joint probability distribution (P(x,y)) of the intensities of two

pixels (x,y) separated by a distance (L) and an angle (θ) (Bharati et al. 2004). P(x,y) is a

square matrix whose dimensions correspond to the number of gray levels in the image

(e.g. typically 256 intensity values for an 8-bit image) or to that used for the analysis

because the number of gray levels can be reduced by binning. Figure 15 shows an

example of four GLCMs computed for a simple image with four gray levels.

Figure 15 – Examples of GLCM matrices (Tessier et al. 2008)

In this example published by Tessier et al. (Tessier et al. 2008), the image I contains pixels

with gray level values ranging from 1 to 4. This means that the size of the GLCM matrices

M1, ..., M4 is 4×4. For all the pixels at a given distance L and angle θ from each other

across the image, the number of occurrences (i.e. the unnormalized probability) of each combination

of gray levels taken by the pairs of pixels is stored in the GLCM (M). For L = 1 and θ =

90° (i.e. pixels on the same row or horizontal) there are 4 occurrences of pixels with

intensities x=1 and y=1 (yellow rectangles), and so P(1,1) = 4, and 1 occurrence with

pixels of intensities x=3 and y=4 (green rectangle). Thus P(3,4) = 1. These calculations

can be repeated for different values of L and θ to obtain a set of GLCMs (i.e. M2, ..., M4),

which yields a multi-resolution description of the image texture. Small distances L

extract finer textural patterns whereas longer pixel-to-pixel distances focus on

coarser textures. The GLCMs obtained from different angles can be used to assess the

level of anisotropy of the textural patterns (i.e. preferential orientations).
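The counting procedure can be sketched as follows. The 4 × 4 test image is an assumption made for this sketch (it is not taken from Figure 15) and its gray levels are coded 0 to 3 rather than 1 to 4. The pixel offset is given directly as a (row, column) step to avoid committing to a particular angle convention, and counting each pair in both directions to obtain a symmetric matrix is a common convention that is also assumed here.

```python
import numpy as np

def glcm(image, d_row, d_col, n_levels):
    """Count co-occurrences of gray levels (x, y) for pixel pairs offset by (d_row, d_col).

    Gray levels are assumed to be integers in 0 .. n_levels - 1.
    """
    image = np.asarray(image)
    counts = np.zeros((n_levels, n_levels), dtype=int)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:   # ignore pairs leaving the image
                counts[image[r, c], image[r2, c2]] += 1
    return counts

# Hypothetical 4 x 4 image with four gray levels
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
m = glcm(img, 0, 1, 4)          # neighbour on the same row, distance L = 1
m_sym = m + m.T                 # count each pair in both directions
p = m_sym / m_sym.sum()         # normalise the counts into joint probabilities
```

Repeating the call with other offsets, for example `glcm(img, 1, 0, 4)` or `glcm(img, 0, 2, 4)`, gives the matrices for other directions and distances.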

To compare the texture of different images quantitatively, it is necessary to summarize the

information contained in the set of GLCMs into a row vector containing different scalar

features computed from the various matrices. Haralick et al. (Haralick et al. 1973)

proposed 14 different textural features: angular second moment, contrast, correlation,

variance, inverse difference moment, sum average, sum variance, sum entropy, entropy,

difference variance, difference entropy, two measures of correlation (i.e. f12 and f13) and

the maximal correlation coefficient (i.e. f14). However, only a subset of 4-5 features is

frequently used (Soh & Tsatsoulis 1999; Van de Wouwer et al. 1999; Clausi 2002;

Maillard 2003; Bharati et al. 2004). Maillard (Maillard 2003) also compared the choice of

the textural features for several articles reporting the use of GLCM. The five most used

features are angular second moment (ASM), entropy, contrast, correlation and

homogeneity.

The angular second moment (equation 3.1), also called energy, is the sum of squares of all the GLCM elements. Sometimes, the sum of GLCM elements is used instead of the squares. It is a measure of relative homogeneity when it is used to compare different images. A more homogeneous image contains fewer transitions of gray level intensities. As a result, the P(x,y) values around the diagonal of its GLCM are of high magnitude, with much lower values elsewhere (off-diagonal elements). The GLCM of a less homogeneous image is characterized by lower P(x,y) values spread across the various GLCM elements. Thus, the ASM of a homogeneous image is higher than that of a non-homogeneous image. In equations 3.1 to 3.9, ng is the number of gray levels used to compute the GLCM.

$$\mathrm{ASM} = \sum_{x=1}^{n_g} \sum_{y=1}^{n_g} \left\{ P(x,y) \right\}^{2} \tag{3.1}$$

Entropy (equation 3.2) is also a measure of homogeneity. It behaves in the opposite way to the ASM: when the homogeneity increases, the entropy decreases, since high values of P(x,y) have log{P(x,y)} values of small magnitude.

$$\mathrm{Ent} = -\sum_{x=1}^{n_g} \sum_{y=1}^{n_g} P(x,y) \log\left\{ P(x,y) \right\} \tag{3.2}$$

Contrast (equation 3.3) is a measure of the local variations in the image. In equation 3.3,

the probability values are weighted by the squared difference in pixel intensities. The value

for this feature will be higher for images containing sharp transitions (i.e. large difference in

intensity from pixel to pixel).

$$\mathrm{Cont} = \sum_{x=1}^{n_g} \sum_{y=1}^{n_g} (x - y)^{2} P(x,y) \tag{3.3}$$

Homogeneity (equation 3.4), or the inverse moment, is also a measure of the importance

of local variations in intensity. The elements of the GLCM that are far away from the

diagonal (i.e. large difference between x and y) are penalized by the weighting

denominator. Sometimes, the square of the difference between x and y is used instead of

the absolute value. This feature behaves in the opposite way to the contrast.

$$\mathrm{Hom} = \sum_{x=1}^{n_g} \sum_{y=1}^{n_g} \frac{P(x,y)}{1 + \left| x - y \right|} \tag{3.4}$$

Finally, the correlation (equation 3.5) is a measure of the structure within the image.

$$\mathrm{Corr} = \frac{\sum_{x=1}^{n_g} \sum_{y=1}^{n_g} (x \times y) P(x,y) - \mu_x \mu_y}{\sigma_x \sigma_y} \tag{3.5}$$

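$$\sigma_x^{2} = \sum_{x=1}^{n_g} (x - \mu_x)^{2} \sum_{y=1}^{n_g} P(x,y) \tag{3.6}$$

$$\sigma_y^{2} = \sum_{y=1}^{n_g} (y - \mu_y)^{2} \sum_{x=1}^{n_g} P(x,y) \tag{3.7}$$

$$\mu_x = \sum_{x=1}^{n_g} x \sum_{y=1}^{n_g} P(x,y) \tag{3.8}$$

$$\mu_y = \sum_{y=1}^{n_g} y \sum_{x=1}^{n_g} P(x,y) \tag{3.9}$$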
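The five features translate almost directly into array operations. The sketch below assumes a normalised GLCM (elements summing to 1), gray levels indexed from 1 to ng, the natural logarithm for the entropy (the base is not specified in the text) and σ taken as a standard deviation. The two 2 × 2 matrices are hypothetical examples, not results from this work.

```python
import numpy as np

def glcm_features(p):
    """ASM, entropy, contrast, homogeneity and correlation of a normalised GLCM p."""
    p = np.asarray(p, dtype=float)
    n_g = p.shape[0]
    x, y = np.meshgrid(np.arange(1, n_g + 1), np.arange(1, n_g + 1), indexing="ij")
    nz = p > 0                                   # skip empty cells: 0 * log(0) taken as 0
    mu_x, mu_y = (x * p).sum(), (y * p).sum()
    sd_x = np.sqrt((((x - mu_x) ** 2) * p).sum())
    sd_y = np.sqrt((((y - mu_y) ** 2) * p).sum())
    return {
        "asm": (p ** 2).sum(),
        "entropy": -(p[nz] * np.log(p[nz])).sum(),
        "contrast": (((x - y) ** 2) * p).sum(),
        "homogeneity": (p / (1 + np.abs(x - y))).sum(),
        "correlation": ((x * y * p).sum() - mu_x * mu_y) / (sd_x * sd_y),
    }

# Two hypothetical 2-level GLCMs: mass near the diagonal versus spread evenly
p_smooth = np.array([[0.45, 0.05], [0.05, 0.45]])
p_rough = np.full((2, 2), 0.25)
f_smooth, f_rough = glcm_features(p_smooth), glcm_features(p_rough)
```

The matrix concentrated on the diagonal gives the higher ASM, homogeneity and correlation, and the lower entropy and contrast, which is the behaviour described for each feature.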

As an example, two images (Figure 16) of two different stone surfaces

(http://www.highresolutiontextures.com) are used to illustrate the behavior and

interpretation of the five GLCM features presented in equations 3.1 to 3.5.

Figure 16 – Two stone surfaces (images A and B) with different textures used for the GLCM features example (http://www.highresolutiontextures.com)

Figure 16 contains two stone surface images. Image A has a more uniform surface with some gray tone differences which vary slowly from the left to the right of the image and some darker lines. Image B is less homogeneous, with more high frequency content (i.e. small details) and a structured variation of black and white dots.

A GLCM was computed for each image using a distance L = 1 and angle θ = 90°. The

scalar textural features calculated using the GLCM of each image are presented in Table

3. The number of gray level values used was ng=256.

Table 3 – GLCM features of the images in Figure 16

Image   ASM      Entropy   Homogeneity   Contrast   Correlation
A       0.0056   5.66      0.379         30.36      0.939
B       0.0008   7.55      0.317         44.93      0.980

The ASM of image A is higher than that of image B since image A is more homogeneous. The entropy has the opposite behaviour: it is lower for the more homogeneous image A. Homogeneity and contrast are both related to the magnitude of the gray level differences between adjacent pixels (L = 1 in this case). Since image B has smaller patterns, and therefore sharper transitions at low L values, its contrast is higher and its homogeneity lower than those of image A. Finally, image B contains regular patterns, hence its correlation is higher than that of image A.

To obtain a multi-resolution analysis of an image, multiple GLCMs can be computed for different distances L. The GLCM features then describe the texture at different scales in the image. Furthermore, if the orientation of the texture is important, the GLCM can be computed for different angles θ. Usually, however, the average of the GLCM features over all angles at each distance L is used to compare the images. This averaging greatly reduces the number of features and simplifies the classification of the images.

Van de Wouwer et al. (Van de Wouwer et al. 1999) and Clausi (Clausi 2002) both discussed the issue of redundancy among the features. They have shown that some features capture the same information in the GLCM; it is therefore not necessary to compute all the features, but only one from each set of redundant features (i.e. to choose a set of independent features). For example, in the set of five features described above, the ASM and entropy are redundant, as are the contrast and homogeneity, so only one of each pair is necessary to characterize the image texture.
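Averaging a feature over the directions at each distance L can be sketched as below, using the contrast as an example. The sketch relies on the fact that the contrast of a normalised GLCM equals the mean squared gray-level difference over all pixel pairs at the given offset, so the matrix itself does not need to be built. The four directions, the function names and the striped test image are assumptions made for illustration.

```python
import numpy as np

# (d_row, d_col) unit steps: horizontal, diagonal, vertical and anti-diagonal
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]

def contrast(image, d_row, d_col):
    """GLCM contrast for one offset: mean squared difference over all valid pixel pairs."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    r0, r1 = max(0, -d_row), min(rows, rows - d_row)
    c0, c1 = max(0, -d_col), min(cols, cols - d_col)
    first = image[r0:r1, c0:c1]
    second = image[r0 + d_row:r1 + d_row, c0 + d_col:c1 + d_col]
    return float(((first - second) ** 2).mean())

def mean_contrast(image, distance):
    """Average the contrast over the four directions at a given distance L."""
    return float(np.mean([contrast(image, distance * dr, distance * dc)
                          for dr, dc in DIRECTIONS]))

stripes = np.tile(np.array([0, 0, 1, 1]), (8, 2))     # vertical stripes, 2 pixels wide
features = [mean_contrast(stripes, L) for L in (1, 2)]
```

For this image the averaged contrast is larger at L = 2 than at L = 1 because a step of two pixels always crosses a stripe boundary horizontally. The averaging hides the fact that the vertical direction contributes nothing, which is the price paid for the reduced number of features.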

3.3.2 Wavelet texture analysis (WTA)

The other state-of-the-art method for image texture analysis is based on the wavelet

transform. It was originally developed for signal processing, denoising and compression

(Rioul & Vetterli 1991; Usevitch 2001), but was also used for feature extraction and image

classification (Mallat 1989; Livens et al. 1997; Liu & Han 2011). This section presents an

overview of the wavelet transform and its application to image texture analysis. However,

several tutorials and books covering the history, theoretical aspects, and a broader range

of applications of the wavelets are available for interested readers (Antonini et al. 1992;

Chui 1992; Prasad & Iyengar 1997; Sarkar et al. 1998; Stark 2005; Debnath & Shah

2015).

An image is a two dimensional signal representing the variations of light intensities within a

scene (2-D space). These signals typically contain multiple frequencies. Large objects or

coarse textural patterns in the image generate low frequency variations whereas smaller

objects or finer textural patterns appear as high frequency information. In image texture

analysis, it is often desired to extract textural features at different resolutions (i.e. sizes or

levels of scrutiny), which requires the image signals to be decomposed into their frequency

components. The Fourier transform can be used to obtain the signal frequency content but

the spatial resolution is lost because sine (or cosine) waves are infinite signals in

the spatial (or time) domain. Wavelet functions, however, have a finite length, which

enables a signal (e.g. an image) to be decomposed into both the frequency and spatial

domains. Applying the 2-dimensional wavelet decomposition to an image results in a

series of new images, each capturing the variations at a different scale or frequency band

contained within the original image.

How the wavelet transform decomposes the information content of an image (i.e. a 2-

dimensional discrete signal) will be explained progressively. The application of the

continuous wavelet transform to a 1-dimensional signal will be presented first. The discrete

representation of a wavelet and its relationship to the traditional linear filters used in signal

processing will then be introduced. Finally, the application of the discrete wavelet

transform to a 2-dimensional signal (i.e. an image) will be described.

Wavelets are waveforms of finite length. Several types of wavelets having different shapes

exist (e.g. Haar, Daubechies, Symlet and Mexican Hat). The wavelet is usually

chosen by selecting the shape that best matches the analyzed signal.

However, this choice is not unique and more than one type of wavelet can

give similar results. This will be discussed in section 6.3.3.

To perform the wavelet decomposition of a 1-D signal, a series of wavelet bases ψa,b(x)

are generated from the mathematical representation of a mother wavelet ψ(x) using

equation 3.10. In this equation, a and b are real-valued parameters (with a > 0) used to scale and shift the mother

wavelet. The scaling coefficient a compresses or stretches the wavelet to capture different

frequencies and the shift coefficient b moves the wavelet along the signal to capture the

time or space variations.

$$\psi_{a,b}(x) = \frac{1}{\sqrt{a}} \, \psi\!\left( \frac{x - b}{a} \right) \tag{3.10}$$

The continuous wavelet transform (CWT) is based on the convolution (equation 3.11) of the scaled and shifted wavelet bases ψa,b(x) with the signal f(x) to obtain a series of detail coefficients da,b. These detail coefficients measure the similarity between the signal and the wavelet base at a particular location in space or time, set by the shift b, and at the frequency specified by a. Computing da,b for multiple values of a and for each possible value of b for a given signal f(x) yields a complete space/time-frequency decomposition of the signal (Liu & MacGregor 2007; Duchesne 2010).

$$d_{a,b} = \int f(x) \, \psi_{a,b}(x) \, \mathrm{d}x \tag{3.11}$$
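A numerical sketch of the scaled and shifted wavelet and of the detail coefficient integral is given below. It assumes a Mexican hat mother wavelet (one of the shapes listed earlier, used here without normalisation), a Riemann sum in place of the integral, and a synthetic signal that is flat in its first half and oscillates in its second half. All names and parameter values are illustrative choices.

```python
import numpy as np

def mexican_hat(x):
    """Mexican hat mother wavelet (unnormalised second derivative of a Gaussian)."""
    return (1.0 - x ** 2) * np.exp(-0.5 * x ** 2)

def wavelet_base(x, a, b):
    """Scaled and shifted wavelet: psi_ab(x) = psi((x - b) / a) / sqrt(a)."""
    return mexican_hat((x - b) / a) / np.sqrt(a)

def detail_coefficient(f, x, a, b):
    """Approximate d_ab = integral of f(x) psi_ab(x) dx with a Riemann sum."""
    dx = x[1] - x[0]
    return float(np.sum(f * wavelet_base(x, a, b)) * dx)

x = np.linspace(0.0, 20.0, 4001)
f = np.where(x < 10.0, 0.0, np.sin(8.0 * (x - 10.0)))   # oscillation in the second half only

d_quiet = detail_coefficient(f, x, a=0.2, b=5.0)        # wavelet placed on the flat part
shifts = np.linspace(12.0, 18.0, 121)
d_active = max(abs(detail_coefficient(f, x, 0.2, b)) for b in shifts)
```

The coefficient is essentially zero where the signal is flat and clearly non-zero where the oscillation matches the scale of the wavelet, which is the joint space and frequency localisation described in the text.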

For a discrete signal, it is not necessary to compute the detail coefficients da,b for every possible value of the shift and scale coefficients. For an efficient computation of the wavelet transform, the discrete wavelet transform (DWT) can be used. The DWT is performed on a smaller number of discrete locations and frequencies based on a dyadic scale (i.e. powers of 2) without significantly degrading the accuracy of the wavelet decomposition. The scale and shift coefficients are computed using equation 3.12 where m and n are integers.

$$a = 2^{m} \quad \text{and} \quad b = 2^{m} n \tag{3.12}$$

The DWT detail coefficients are computed similarly to the CWT (equation 3.11) except that

the wavelet bases are computed only for the dyadic scale and shift coefficients.

The implementation of the DWT is performed using a set of low pass and high pass filters.

This concept was introduced by Mallat (Mallat 1989). The work of Mallat adapted the DWT

convolution integral to the signal processing field by developing a set of low pass and high

pass filters that could be applied to a signal to compute the wavelet coefficients. It has

been used in many industrial machine vision applications since then (Liu & MacGregor

2005; Liu et al. 2005; Liu & MacGregor 2006; Tessier et al. 2007; Tessier et al. 2008;

Prats-Montalbán et al. 2009; Reis & Bauer 2009; Facco et al. 2010).

In addition to the wavelet function ψ(x), Mallat introduced a second function called the scaling function φ(x). This function is orthogonal to the wavelet function and is used to capture the low frequency details, whereas the wavelet function is used to capture the high frequency details. The scaling and wavelet functions are given by equations 3.13 and 3.14. They correspond to the DWT decomposition for a dyadic scale a = 2^s and b = 2^s k. The scaling function is expressed in terms of a low pass filter h0 and the wavelet function in terms of a high pass filter h1. h0 and h1 are orthogonal filters and they are related to each other by equation 3.15.

$$\phi_{s,l}[k] = 2^{s/2} \, h_0\!\left( 2^{s} k - l \right) \tag{3.13}$$

$$\psi_{s,l}[k] = 2^{s/2} \, h_1\!\left( 2^{s} k - l \right) \tag{3.14}$$

$$h_1[k] = (-1)^{k} \, h_0[1 - k] \tag{3.15}$$

In these equations, s and l are the scale and shift indices and k is the discrete location in

the signal.
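The relation between the two filters can be checked numerically. For a filter of length N stored with indices 0 to N − 1, the sketch below uses the shifted form h1[k] = (−1)^k h0[N − 1 − k], which is an assumption made to keep the indices within the array and which coincides with h1[k] = (−1)^k h0[1 − k] for the two-tap Haar filter. The Haar and four-tap Daubechies low pass coefficients are standard values.

```python
import numpy as np

def high_pass_from_low_pass(h0):
    """Alternating-flip construction: h1[k] = (-1)**k * h0[N - 1 - k]."""
    h0 = np.asarray(h0, dtype=float)
    signs = (-1.0) ** np.arange(h0.size)
    return signs * h0[::-1]

haar_h0 = np.array([1.0, 1.0]) / np.sqrt(2.0)
s3 = np.sqrt(3.0)
db2_h0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))

haar_h1 = high_pass_from_low_pass(haar_h0)
db2_h1 = high_pass_from_low_pass(db2_h0)

# Orthogonality of the pair and zero mean of the high pass filter
dot_db2 = float(np.dot(db2_h0, db2_h1))
sum_db2_h1 = float(db2_h1.sum())
```

The zero sum of h1 shows that it rejects constant (zero frequency) signals, while h0 sums to √2 and passes them, consistent with their roles as high pass and low pass filters.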

The approximation (i.e. low frequency residual) and the detail coefficients (i.e. captured high frequency content at each scale) are computed as the inner products of the scaling and wavelet functions with the signal f using equations 3.16 and 3.17.

$$a_{(s)}[l] = \left\langle f[k], \, \phi_{s,l}[k] \right\rangle \tag{3.16}$$

$$d_{(s)}[l] = \left\langle f[k], \, \psi_{s,l}[k] \right\rangle \tag{3.17}$$

In the case of the DWT, the decomposition is not done by scaling the wavelet, but by sub-sampling the signal. The coefficients are computed sequentially at each decomposition level s. To capture the details at different frequencies, the 1D signal or image (2D signal) is sub-sampled dyadically, that is, each dimension is reduced by a factor of two after every decomposition level s.

When the wavelets are applied to a vector (e.g. time series of a signal or a one-dimensional spatial vector), the decomposition yields a series of detail coefficient vectors and an approximation vector. It captures the highest frequency detail first, then the next highest, etc., and the residual of the signal after S decomposition levels (i.e. low frequency) is left in the approximation. Using the DWT, the frequency band is cut in half at each decomposition level (Figure 17). In Figure 17 and Figure 18, fn is the normalized maximum frequency of the signal.
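The sequential filtering and sub-sampling can be sketched for a vector with the Haar filters, the simplest case. The choice of the Haar wavelet, the function names and the random test signal are assumptions made for illustration; other wavelet families require longer filters and a treatment of the signal borders.

```python
import numpy as np

def haar_step(signal):
    """One DWT level with Haar filters: filter, then keep one sample out of two."""
    signal = np.asarray(signal, dtype=float)
    approx = (signal[0::2] + signal[1::2]) / np.sqrt(2.0)   # low pass branch
    detail = (signal[0::2] - signal[1::2]) / np.sqrt(2.0)   # high pass branch
    return approx, detail

def haar_dwt(signal, levels):
    """Return the final approximation a_S and the details [d_1, ..., d_S]."""
    approx, details = np.asarray(signal, dtype=float), []
    for _ in range(levels):
        approx, detail = haar_step(approx)    # decompose the previous approximation
        details.append(detail)
    return approx, details

rng = np.random.default_rng(0)
f = rng.normal(size=64)
a3, (d1, d2, d3) = haar_dwt(f, 3)
energy_in = float(np.sum(f ** 2))
energy_out = float(np.sum(a3 ** 2) + sum(np.sum(d ** 2) for d in (d1, d2, d3)))
```

Each level halves the length of the vectors (32, 16 and 8 coefficients for a 64-sample signal), and the sum of squares of all coefficients equals that of the original signal.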

Figure 17 – Frequency band divisions of the DWT of a vector (one-dimensional signal)

[Figure 17 content: frequency axis with ticks at 0, fn/8, fn/4, fn/2 and fn; bands a3, d3, d2 and d1 from low to high frequency; decomposition levels 1 to 3]

In an image (two-dimensional signal), the DWT is simply applied in each direction. This yields a set of three detail images, each more sensitive to a specific direction (i.e. horizontal, vertical and diagonal directions respectively) and one approximation image at each scale. The process is illustrated in Figure 18 (Liu & MacGregor 2007).

Figure 18 a) shows the filtering and down sampling strategy used for the DWT based on Mallat’s filtering strategy. First, the low pass and high pass filters are applied horizontally, then the image dimensions are reduced by removing one column out of two. The same low pass and high pass filters are then applied vertically on the results of the previous step. The last operation is the deletion of half of the rows. Each image (i.e. details and approximation) at decomposition level j therefore contains a quarter of the pixels of the image at level j-1, since each dimension is halved by down sampling. The decomposition continues with the approximation image at level j.

Figure 18 – 2-D DWT decomposition a) schematics of the filter bank used at the jth

decomposition and b) frequency distribution of the detail and approximation images (Liu &

MacGregor 2007)
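One level of the two-dimensional filter bank can be sketched with the Haar filters. The sketch assumes that horizontal filtering is followed by keeping one column out of two and vertical filtering by keeping one row out of two, and it names the outputs following the convention of this chapter (the horizontal detail is high pass horizontally and low pass vertically). The striped test image is hypothetical.

```python
import numpy as np

def haar_rows(img):
    """Haar low/high pass along each row, keeping one column out of two."""
    low = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2.0)
    high = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2.0)
    return low, high

def haar_dwt2(img):
    """One 2-D level: approximation A and details (Dh, Dv, Dd)."""
    img = np.asarray(img, dtype=float)
    low, high = haar_rows(img)                      # horizontal filtering
    # Vertical filtering: reuse the row routine on the transposed images
    a, d_v = (m.T for m in haar_rows(low.T))        # low pass horizontally
    d_h, d_d = (m.T for m in haar_rows(high.T))     # high pass horizontally
    return a, d_h, d_v, d_d

# Horizontal lines: intensity changes only in the vertical direction
stripes = np.tile(np.arange(8).reshape(-1, 1) % 2, (1, 8)).astype(float)
a, d_h, d_v, d_d = haar_dwt2(stripes)
energy = [float(np.sum(m ** 2)) for m in (a, d_h, d_v, d_d)]
```

For this oriented texture all the detail energy falls in the vertical detail, none in the horizontal or diagonal details, and the four sub-images together conserve the energy of the original image.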

The DWT yields detail coefficient images and an approximation image whose sizes differ at each decomposition level. For visualization and computation of features, it is possible to reconstruct each individual sub-image to the original size using the appropriate reconstruction filters. For certain families of wavelets, the reconstruction is perfect and no information is lost in the transformation. Images reconstructed from a single detail sub-image may sometimes contain artifacts since perfect reconstruction is only guaranteed when all detail coefficients and approximation images are used for the reconstruction. The detail images need to be checked visually for artifacts. In this thesis, all the features were computed on the reconstructed detail coefficient images and no obvious artifacts were observed.

Figure 18 b) shows a representation of the band-pass frequency captured by the

decomposition. The horizontal detail contains the high pass information in the horizontal

direction and the low pass frequencies in the vertical directions and vice versa for the

vertical detail. The diagonal detail contains the high pass frequencies in both directions.

Finally, the approximation contains the low pass information in both directions. The

frequency spectrum is cut in half at each decomposition level in both directions.

Another interesting property of the DWT is that the detail coefficient images are orthogonal to each other. This means that there is no redundancy between the frequency bands captured by the detail images and the approximation.

Figure 19 is used to illustrate the band pass frequency decomposition of the DWT. It is a composite of 8 different images of surfaces exhibiting different visual textures. The Roman numerals correspond to a class of texture and the Arabic numerals to the individual sub-images. The texture i (sub-image 1) is a smooth surface with only low frequency variations. The texture ii (6-8) is finer (i.e. higher frequency). In iii (3), the texture is coarser than ii but still very homogeneous. The texture iv (4) is a mix of fine and coarse texture and it contains high contrast. The pavement in v (5) contains high frequency information in the tiles and low frequency information (i.e. the mortar); both features should be captured by different wavelet scales. Finally, the texture in vi (2) is highly oriented and contains low frequency details in the horizontal direction and high frequency information in the vertical direction.

Figure 19 – Composite image of different textures (http://www.highresolutiontextures.com)

Figure 20 presents the results of the DWT decomposition of the image in Figure 19.

Different parts (i.e. sub-images) of the original image at different decomposition levels are

used as examples of the properties of the DWT.

Figure 20 a) shows the reconstructed approximation (A3) image after 3 decomposition

levels. It is blurry since the high frequency details have been removed from the image.

Figure 20 b) contains the reconstructed detail images at scale 1 (D1) (i.e. high frequency)

for the sub-images 1 and 7. Sub-image 1 is very smooth and its detail coefficient contains

almost no information at that scale. However, the detail coefficient image of the sub-image

(7) (i.e. much finer texture) contains information in this frequency band.

Figure 20 c) shows the reconstructed detail coefficients images of the sub-image 5 at two

different scales. D1 is the high frequency band and captures the fine details inside the

pavement tiles. D3 captures the mortar lines between the tiles. The mortar around the tile

is a lower frequency detail in comparison with the fine texture inside the tiles and it is

captured by a lower decomposition level (i.e. coarser texture).

Figure 20 d) shows the reconstructed detail coefficient images separately for the horizontal and vertical directions for sub-image 2. That image contains oriented texture and this is captured by the vertical detail D1^v but not by the horizontal detail D1^h.

Figure 20 – Approximation and detail coefficients of the composite texture image: a)

approximation at scale 3, b) comparison of the reconstructed detail at scale 1 for sub-

images 1 and 7, c) comparison of the reconstructed detail at scale 1 and 3 of sub-image 5

and d) comparison of the direction sensitivity for the reconstructed detail at scale 1 of sub-

image 2.

To compare the texture of a set of images it is common to compute scalar textural descriptors (or features) from the detail and approximation sub-images, and then use multivariate clustering and classification techniques to compare the images in a quantitative manner. The features are usually computed from the detail sub-images at each scale and direction (D_s^k) but they can also be computed from the approximation sub-images at each scale (A_s^k) (Facco et al. 2010). In this case however, the approximations at each scale contain redundant information from the previous approximations since the approximation at scale s+1 contains part of the frequency distribution of scale s. Finally, the features can be computed either on the sub-sampled (i.e. not reconstructed) or on the reconstructed detail coefficients or approximation.

The most frequently used scalar feature is the energy (Scheunders et al. 1997; Liu &

MacGregor 2005; Liu et al. 2005; Tessier et al. 2007; Selvan & Ramakrishnan 2007; Prats-

Montalbán et al. 2009; Liu & Han 2011). Facco et al. (Facco et al. 2010) also proposed the

use of 4 additional features for texture analysis: the entropy, standard deviation, skewness

and kurtosis. The five scalar features are presented in equations 3.18-3.22. In image

texture analysis they are typically computed using the coefficients of the detail sub-images

(D_s^k) obtained at each scale and direction or those of the approximation sub-images.

Alternatively, the scalar textural descriptors can be calculated on the reconstructed

versions of the sub-images (R_s^k) at each scale and direction. One advantage of using the

reconstructed sub-images is that their sizes are the same as that of the original image. It

was also proposed by some authors to apply the GLCM on the detail or approximation

images, then compute the GLCM features presented in section 3.3.1 (Van de Wouwer et

al. 1999; Liu & Han 2011; Yousefian-Jazi et al. 2014) and use these to quantify image

texture.

The energy (equation 3.18) is the sum of squares of the detail coefficients in a given sub-image,

normalized by its number of pixels. Since the DWT decomposition conserves the energy of the original image (i.e. the

sum of the energy of all detail images and the approximation is equal to the energy of the

original image), the energy is a measure of how much information is contained at each

scale (i.e. frequency band) and direction. It is also a measure of homogeneity similar to the

energy of the GLCM (equation 3.1).

$$E_s^k = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} R_s^k(i,j)^{2}}{I \times J} \tag{3.18}$$

The entropy (equation 3.19) and the standard deviation (equation 3.20) are also measures of the homogeneity. The entropy is computed from the probability density function P_s^k (i.e. histogram) of the reconstructed detail coefficient sub-images R_s^k.


$$Ent_s^k = -\sum_{i=1}^{I}\sum_{j=1}^{J} P_s^k(l)\,\log\left(P_s^k(l)\right) \qquad (3.19)$$

$$\sigma_s^k = \sqrt{\frac{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(R_s^k(i,j)-\mu\right)^2}{I \times J}} \qquad (3.20)$$

In equation 3.20, $\mu$ is the mean of $R_s^k$.

The skewness (equation 3.21) is a measure of the lopsidedness of a distribution. The

skewness increases when values of the distribution of coefficients are skewed to one side

of the mean.

$$Skew_s^k = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(R_s^k(i,j)-\mu\right)^3}{I \times J \times \left(\sigma_s^k\right)^3} \qquad (3.21)$$

The last feature is the kurtosis (equation 3.22). It is a measure of the broadness of the

wavelet coefficients distributions.

$$Kurt_s^k = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(R_s^k(i,j)-\mu\right)^4}{I \times J \times \left(\sigma_s^k\right)^4} \qquad (3.22)$$

There is no discussion in the literature about the redundancy of these WTA features, but

the energy, standard deviation and entropy are correlated characteristics. Using the same

reasoning as with the GLCM features (Van de Wouwer et al. 1999; Clausi 2002), only one

of these three features together with the skewness and the kurtosis should be enough to

classify the image texture.
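The five scalar features of equations 3.18-3.22 can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the thesis implementation (which was written in Matlab): the function name `wta_features`, the number of histogram bins, the use of the natural logarithm and the summation of the entropy over histogram bins are all assumptions.

```python
import math

def wta_features(sub_image, n_bins=256):
    """Energy, entropy, standard deviation, skewness and kurtosis of one
    sub-image given as a list of rows (eqs. 3.18-3.22, assumed reading)."""
    values = [float(v) for row in sub_image for v in row]
    n = len(values)                                   # I x J
    energy = sum(v * v for v in values) / n
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)

    # Histogram estimate of the probability density (bin count is assumed)
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0                 # guard for a flat image
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    entropy = -sum((c / n) * math.log(c / n) for c in counts if c > 0)

    if sigma == 0.0:                                  # constant sub-image
        skew = kurt = float("nan")
    else:
        skew = sum((v - mu) ** 3 for v in values) / (n * sigma ** 3)
        kurt = sum((v - mu) ** 4 for v in values) / (n * sigma ** 4)
    return {"energy": energy, "entropy": entropy, "std": sigma,
            "skewness": skew, "kurtosis": kurt}

# Tiny illustrative sub-image (arbitrary values, not thesis data)
print(wta_features([[1, -1], [1, -1]]))
```

For a zero-mean sub-image the energy equals the square of the standard deviation, which illustrates why the energy and the standard deviation are expected to be correlated.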


Chapter 4 Experimental

This chapter introduces the experimental procedures, raw materials and imaging set-up used for the development of the machine vision algorithm. First, the software used to analyze the images is presented. It is followed by a description of the laboratory paste and anode manufacturing procedures and raw material properties. Then, the industrial sampling strategy is explained. Finally, the imaging system and image analysis methods are presented in detail.

4.1 List of software

Several software packages were used to obtain the results presented in this thesis. First, all the computations involved in image processing and texture analysis were performed in Matlab™ version 7.13 (2011b) from The MathWorks. The Wavelet and Image Processing toolboxes are required to perform the calculations. A third-party toolbox developed for Matlab, the PLS Toolbox™ version 7.3 from Eigenvector Research, was also used to compute most of the latent variable models presented in the thesis. The multi-block algorithm was also developed within the Matlab environment. Finally, the ProMV™ software version 13.08 r1685 from ProSensus was used to benchmark the PLS and MB-PLS algorithms programmed in Matlab as well as for exploratory analysis of the datasets because of its user-friendly interface.

4.2 Laboratory anode fabrication

Two series of experiments were conducted using different sets of raw materials to

fabricate laboratory anodes. An industrial raw material formulation was used for the first

set. This decision was made to obtain lab anode paste samples that were the most

representative of the real industrial paste. To achieve this, already classified raw materials

from the three industrial coke fractions and butts were collected from the Alcoa

Deschambault Quebec smelter (ADQ). The second set of experiments was conducted

using commercially available cokes (no butts). The crushing, grinding and sizing were

performed in the laboratory to obtain a typical aggregate size distribution for laboratory

fabricated anodes (Azari et al. 2013).

4.2.1 Industrial raw material formulation

At the Alcoa Deschambault smelter’s carbon plant, a mix of two cokes is typically used to formulate the anode paste. On the day the material was sampled, the ratio of the two cokes was 65% coke 1 and 35% coke 2. The raw material properties of the cokes are listed in Table 4. In this table, the vibrated bulk density (VBD) is presented for two particle size ranges. The VBD was measured using the ASTM D4292 standard test method. Finally, the impurities were measured using X-ray fluorescence.

Table 4 – Properties of the industrial coke used for the laboratory paste manufacturing

Property                          Coke 1    Coke 2
VBD (-8/+14 US mesh) (g/cm³)      0,80      0,91
VBD (-30/+50 US mesh) (g/cm³)     0,86      0,95
Real density (g/cm³)              2,065     2,071
Fe (ppm)                          0,033     0,014
Si (ppm)                          0,016     0,003
S (%)                             2,91      1,43
V (ppm)                           0,029     0,019
Na (ppm)                          0,01      0,006
Ca (ppm)                          0,01      0,009
Ni (ppm)                          0,01      0,012
+4 (US mesh) (%)                  22,1      27,4
-4/+14 (US mesh) (%)              58,5      58,2
-14/+30 (US mesh) (%)             77,8      74,5
-30/+50 (US mesh) (%)             92,2      86,9
-50/+100 (US mesh) (%)            96,9      93,5
-100/+200 (US mesh) (%)           98,9      97
-200 (US mesh) (%)                99,9      100

Several paste samples were fabricated using this material blend, but these paste samples were not pressed into anodes. The typical particle size distributions of the blended and classified coke and butts particles are listed in Table 5. The reference mix formulation is presented in Table 7.

Table 5 – Particle size distribution (measured at the plant) for each material fraction

Size (US mesh)        Coarse    Inter.    Fines        Butts
+3/8 (in) (%)         ---       ---       ---          16-19
-3/8 (in) /+4 (%)     17-20     ---       ---          23-26
-4/+8 (%)             24-27     ---       ---          20-23
-8/+14 (%)            21-23     <1        <1           13-16
-14/+30 (%)           26-29     14-17     <1           14-17
-30/+50 (%)           4-6       50-53     <1           4-7
-50/+100 (%)          <1        22-25     3-6          <1
-100/+200 (%)         <1        6-9       19-22        <1
-200 (%)              <1        1-3       74-77        <1
VBD (g/cm³)           ---       <1        ---          ---
Blaine (cm²/g)        ---       ---       5000-6000    ---


The proportion of fine particles (%) shown in Table 5 is the average of three measurements for each 12 h period of plant operation.

The size distribution data presented in Table 5 are given for bands of particle sizes that passed a given screen size (i.e. -) but were retained on the next screen (i.e. +). Also, the Blaine number (BN) available for the fine fraction is a measure of the specific surface area and is representative of the fineness of the fines. This measurement is obtained using a Malvern Mastersizer 2000™ laser diffraction particle size analyser.

The pitch properties used for laboratory anodes are listed in Table 6. Each property is the

weekly average of the pitch properties received that week based on the supplier’s

certificate of analysis (COA).

Table 6 – Properties of the industrial pitch supplied by ADQ for the laboratory anodes

Property                    ADQ pitch
Softening point (°C)        109,1
Toluene soluble (%)         71,9
B fraction (%)              14,5
Quinoline insoluble (%)     13,5
Coking value (%)            58,8
Viscosity 160°C (cP)        1890,8

The base formulation is presented in Table 7. It is typical of an average industrial formulation. The percentages were calculated based on the dry aggregate total mass. Thus, the proportion of pitch is the ratio of the amount of pitch to the amount of dry particles (i.e. weight ratio).

Table 7 – Base mix formulation for the laboratory anode fabricated with the industrial raw materials

Fraction    Coarse    Inter    Fines    Butts    Pitch
%           32,5      26,0     18,2     23,4     16,9

4.2.2 Laboratory raw material formulation

The second set of experiments was also conducted at the laboratory scale, but used a typical laboratory formulation. These paste samples were pressed into cylindrical anodes. The diameter of the mold was 68 mm. Due to the small anode dimensions, the size distribution of the particles was limited to a maximum size of approximately 4 mm. The formulation used for the dry aggregate is presented in Table 8.
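The weight-ratio convention used for the base formulation (Table 7), in which every percentage, including that of pitch, is relative to the dry aggregate mass, can be illustrated with a short calculation. The sketch below is only an example: the function names and the 1000 g batch size are hypothetical, and renormalizing the dry fractions (which sum to 100,1% as tabulated, presumably from rounding) is an assumption.

```python
# Percentages from Table 7, all relative to the dry aggregate mass
DRY_FRACTIONS_PCT = {"coarse": 32.5, "inter": 26.0, "fines": 18.2, "butts": 23.4}
PITCH_PCT_OF_DRY = 16.9

def batch_masses(dry_mass_g):
    """Mass of each constituent for a given dry aggregate mass (g)."""
    total_pct = sum(DRY_FRACTIONS_PCT.values())      # 100.1 as tabulated
    masses = {name: dry_mass_g * pct / total_pct     # renormalized (assumption)
              for name, pct in DRY_FRACTIONS_PCT.items()}
    masses["pitch"] = dry_mass_g * PITCH_PCT_OF_DRY / 100.0
    return masses

def pitch_fraction_of_paste_pct():
    """Pitch expressed as a percentage of the total paste mass."""
    return 100.0 * PITCH_PCT_OF_DRY / (100.0 + PITCH_PCT_OF_DRY)

print(batch_masses(1000.0))               # hypothetical 1000 g of dry aggregate
print(pitch_fraction_of_paste_pct())      # roughly 14.5 % of the paste mass
```

The second value shows why the convention matters: 16,9% pitch on a dry aggregate basis corresponds to a noticeably lower pitch content when expressed on a total paste basis.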

Table 8 – Laboratory coke aggregate formulation

Fraction (US mesh)    -4/+8    -8/+14    -14/+30    -30/+50    -50/+100    -100/+200    Fine (BN 4000)
%                     22,0     10,0      11,5       12,7       9,2         10,8         23,8

Two different commercially available calcined petroleum sponge cokes (A and B) were used to prepare the laboratory anode formulations. Table 9 shows the physical properties of these cokes. They were selected because they had the largest difference in density in the -30/+50 US mesh size fraction of all the cokes available at the University. The crushing and classification were performed in the laboratory. Both cokes were crushed with jaw and roll crushers. The particles were separated into six different size fractions (i.e. -4/+8, -8/+16, -16/+30, -30/+50, -50/+100 and -100/+200 US mesh) using a Sweco™ vibro-energy round separator. The fine particles were obtained by grinding the -8/+16 US mesh particles with a ball mill. This fraction is called ball mill fines. Details on the grinding method are available in (Azari Dorcheh 2013). The real density and impurity measurements were performed at the Deschambault smelter’s laboratory.

Table 9 – Laboratory coke properties

Property                          Coke A    Coke B
Real density (g/cm³)              2,073     2,057
VBD (-8/+14 US mesh) (g/cm³)      0,88      0,77
VBD (-30/+50 US mesh) (g/cm³)     0,94      0,86
Si (ppm)                          210       120
S (%)                             1,73      2,13
Ca (ppm)                          240       130
V (ppm)                           90        360
Fe (ppm)                          340       460
Ni (ppm)                          230       250

The pitch used for the laboratory formulated anodes was also supplied by ADQ. This pitch was used instead of the one used in the industrial formulations for easier comparison and troubleshooting of the laboratory anode manufacturing, since all the other research projects carried out at Université Laval involving laboratory anode manufacturing (i.e. Prof. Mario Fafard’s MACE3 industrial research Chair and Prof. Houshang Alamdari’s Collaborative Research and Development program on anode manufacturing) used this particular pitch. The laboratory anode pitch properties are listed in Table 10.

Table 10 – Laboratory pitch properties

Property                    Lab pitch
Softening point (°C)        109,1
Toluene soluble (%)         70,6
B fraction (%)              12,9
Quinoline insoluble (%)     16,5
Coking value (%)            58,4

4.2.3 Laboratory anode fabrication

The procedure for the production of the laboratory paste is the same for both the industrial and the laboratory formulations. First, the dry aggregate mix was prepared for each paste sample by weighing each coke fraction separately. For the laboratory formulations, the narrow size distribution of each coke fraction ensures that variations in the dry aggregate size distribution between samples are minimized.

For the mixing step, a dough mixer (Hobart N50) was fitted inside a laboratory oven (Precision Scientific model 28) to control the mixing temperature (Figure 21). This oven was also used for pre-heating the dry aggregate at the mixing temperature of 178°C for approximately 16 h (i.e. overnight). The paste was prepared by adding the solid pitch to the dry aggregate mix in the mixing bowl. The pitch was then pre-heated for 30 min. The materials were mixed for the desired mixing time and a sample of the paste was discharged into an aluminium container for image acquisition.

Figure 21 – Details of the mixer and oven for laboratory paste preparation (labels: bowl, mixing blade, motor inside vent duct)

The image acquisition was the last step of the sample preparation for the industrial

formulation paste samples. The laboratory formulated paste samples were subsequently

pressed into anodes.

The laboratory formulated paste samples were pressed using an MTS servo-hydraulic press. A 250 kN MTS load cell and a 150 mm position transducer (LVDT) were used to control the press. The pressing was done at a constant displacement rate of 10 mm/min up to a maximum force of 220 kN (60 MPa). The diameter of the mold was 68 mm. The temperature of the mold and the press was controlled by a three-zone split-tube furnace (LAB-TEMP Thermacraft) mounted on the press. Anodes were pressed at 150°C. Figure 22 presents details of the mold and the press used to produce the anodes.
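The correspondence between the maximum force and the quoted compaction pressure can be verified from the mold diameter. The short sketch below assumes that the pressure is simply the force divided by the circular cross-section of the 68 mm mold; the function name is hypothetical.

```python
import math

def compaction_pressure_mpa(force_kn, diameter_mm):
    """Pressure (MPa) applied by a force (kN) on a circular section (mm)."""
    area_mm2 = math.pi * (diameter_mm / 2.0) ** 2
    return force_kn * 1000.0 / area_mm2      # N/mm² is numerically MPa

print(compaction_pressure_mpa(220.0, 68.0))  # close to the quoted 60 MPa
```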

Figure 22 – Details of the press: a) cylindrical mold and die and b) the press with the oven

to control the pressing temperature (Azari Dorcheh 2013)

The green anodes obtained after pressing were baked in a Pyradia furnace. To protect the anodes from oxidation, they were placed in a metal box (i.e. made of Inconel) and covered with packing coke (Figure 23). The heat-up rates are listed in Table 11. After the anode samples were maintained at 1100 °C for 20 hours (i.e. soaking time), the furnace was turned off and the anodes were cooled in the closed furnace down to room temperature (i.e. approximately 30 hours).


Table 11 – Heat-up rate during the laboratory anode baking

Temperature range (°C)    Heat-up rate (°C/h)
30-150                    60
150-650                   20
560-1100                  50

Figure 23 – Laboratory baking furnace and baking box

The apparent density of the green and baked anodes was measured on each sample. The volume of the anodes was measured geometrically using the average of 4 length measurements around the sample and the average of 6 diameter measurements (i.e. 3 diameters at 2 different heights). The weight of each sample was measured using a Sartorius CPA160015 scale. The density (in g/cm³) was then computed by dividing the sample weight by its measured volume.

4.2.4 Industrial paste sampling

The machine vision algorithm was developed and tested with the laboratory paste, but it was also validated using industrial paste samples. These samples were collected at the Alcoa Deschambault smelter’s paste plant. The paste was collected manually on a conveyor belt after the mixing step, but before the compaction. At each sample time, three aluminium containers were filled with paste. Each sample had a volume of approximately 550 cm³ and weighed around 500 g. The imaging set-up used at the plant is shown in Figure 24.
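The geometric apparent density measurement described in section 4.2.3 (average of 4 lengths and 6 diameters on a cylindrical sample) can be sketched as follows. The function name and the sample values are hypothetical and only serve to illustrate the calculation.

```python
import math

def apparent_density(mass_g, lengths_mm, diameters_mm):
    """Apparent density (g/cm³) of a cylindrical anode from averaged
    length and diameter measurements given in mm."""
    mean_length_cm = sum(lengths_mm) / len(lengths_mm) / 10.0
    mean_diameter_cm = sum(diameters_mm) / len(diameters_mm) / 10.0
    volume_cm3 = math.pi * (mean_diameter_cm / 2.0) ** 2 * mean_length_cm
    return mass_g / volume_cm3

# Hypothetical sample: 4 lengths, 6 diameters (3 diameters at 2 heights)
density = apparent_density(580.0, [100.0] * 4, [68.0] * 6)
print(round(density, 3))
```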

Figure 24 – Imaging set-up installed at the ADQ industrial plant

Paste samples were collected during normal (i.e. steady-state) and plant start-up operations and also from controlled pitch optimization (PO) experiments. For each paste sample, the corresponding green anode density was available as well as the process operating conditions data. Details of the data synchronization are available in (Lauzon-Gauthier 2011).

These data are not sufficient to fully characterize the quality of the paste samples. Subsets of green anode cores were therefore baked in the laboratory furnace at Université Laval (Figure 23) to obtain the baked anode density. Some of these anodes were also characterized at the Deschambault laboratory to measure the electrical resistivity, Young’s modulus, compressive strength and CO₂ reactivity. These core samples were drilled from the test anodes before baking. The baking was performed in the laboratory furnace to obtain uniform baking profiles for all cores, which is impossible if the anodes are baked in the industrial furnace. The green cores have a diameter of 55 mm and a length of approximately 356 mm.

4.3 Image analysis methodology

4.3.1 Description of the imaging set-up

The imaging set-up developed in this project is shown in Figure 25. It consists of an Allied Vision Technologies Prosilica GX 6-megapixel color camera with a 50 mm Kowa lens, two 4.5 W LED light bulbs and Fresnel lenses to ensure uniformity of lighting. This set-up allows for a wide range of adjustments of the lighting angle and camera height.


Figure 25 – Image acquisition set-up

The camera is connected to a laptop computer using an Ethernet cable and the network card of the computer. The Allied Vision Technologies proprietary software AcquireControl™ version 4.0.2 is used for controlling the camera as well as for image acquisition. The images were saved in Tagged Image File Format (TIFF).

The paste samples were placed into an aluminium food container (109 × 127 × 40 mm) for image acquisition. Each sample was spread into the container and its surface was flattened using a metal spatula.

The working distance between the sample and the camera was adjusted to maximize the visible paste surface without imaging the shiny edge of the aluminium container. The distance between the lens and the paste surface was 461 mm. At this distance, the pixel resolution was 40.7 µm. The illumination angle (i.e. 45°) was chosen to obtain the most uniform lighting of the surface possible. The exposure time was selected to minimize pixel saturation in the images: 80 ms for the industrial samples and 70 ms for the lab samples. This difference is most probably due to the difference in surrounding light intensity between the paste plant and the laboratory. It was much darker inside the paste plant than in the well-lit laboratory.

4.3.2 Description of the image analysis methodology

The image texture analysis method consists of computing GLCMs and the DWT decomposition of each paste image. This is followed by feature extraction and analysis steps and finally by the interpretation of the information obtained through the texture analysis scheme. A flowsheet of the methodology is presented in Figure 26.

Figure 26 – Anode paste machine vision flowsheet

Figure 27 shows an example of laboratory (a) and industrial paste (b) samples. Their

appearance is similar, but the industrial paste is coarser (i.e. contains larger particles) than

the laboratory paste.

[Figure 26 content: original image → pre-processing (RGB to grayscale, ROI, low-pass filtering, contrast enhancement) → GLCM (features for each L and average of θ: ASM, entropy, contrast, correlation, homogeneity) and DWT (detail coefficients and approximation: energy, entropy, standard deviation, skewness, kurtosis; GLCM features for L(s) and average of θ: ASM, entropy, contrast, correlation, homogeneity) → feature analysis (PCA, PLS) → desired information → interpretation, use in monitoring or control scheme]

Figure 27 – Example of paste image: a) laboratory paste and b) industrial paste

Unlike most off-the-shelf commercial cameras, the camera used in this project has no optical low-pass filter installed on its sensor. It is not clearly visible in Figure 27, but these images contain some high-frequency noise due to the lack of a low-pass filter. To eliminate that noise, a 3×3 Gaussian low-pass filter was applied to the images in the pre-processing steps. This operation was performed just after the transformation of the RGB image into a grayscale image. The low-pass filtered grayscale image is presented in Figure 28 a).
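The first two pre-processing steps (grayscale conversion followed by a 3×3 low-pass filter) can be sketched in plain Python. Several details are assumptions because the text does not specify them: the luminance weights are those commonly used for RGB to grayscale conversion, the kernel is the binomial approximation of a 3×3 Gaussian (the actual standard deviation used in the thesis is not stated), and borders are handled by replicating edge pixels. The function names are hypothetical.

```python
# Binomial approximation of a 3x3 Gaussian kernel (weights sum to 16)
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]

def rgb_to_gray(rgb):
    """rgb: rows of (r, g, b) tuples in [0, 1]; assumed luminance weights."""
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b for (r, g, b) in row]
            for row in rgb]

def gaussian_3x3(gray):
    """3x3 low-pass filtering with replicated borders (assumption)."""
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), height - 1)   # clamp row index
                    jj = min(max(j + dj, 0), width - 1)    # clamp column index
                    acc += KERNEL[di + 1][dj + 1] * gray[ii][jj]
            out[i][j] = acc / 16.0
    return out

# A single bright pixel is spread over its neighbours by the filter
impulse = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
print(gaussian_3x3(impulse))
```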

Paste images are very dark and have a low contrast. The contrast of the images can be enhanced to obtain better results in image texture analysis. The imadjust function built into Matlab™ was used to pre-process the paste images. This function adjusts the distribution of the pixel intensities so that 1% of the pixels are saturated at minimum intensity (i.e. 0 or black) and 1% at maximum intensity (i.e. 1 or white). The image before contrast enhancement (Figure 28 a) is compared to the result after pre-processing (Figure 28 b). The pixel intensity histograms of both images are presented in Figure 28 c). The distribution of pixel intensity is more uniform after pre-processing, which results in more contrast in the image.
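The contrast enhancement described above amounts to a linear stretch between two intensity limits followed by clipping. The sketch below is a plain-Python approximation of that behaviour, not the Matlab routine itself: the limits are taken as simple order statistics of the sorted pixel values, which is an assumption, and the function name is hypothetical.

```python
def stretch_contrast(gray, saturation=0.01):
    """Linearly stretch intensities so that roughly `saturation` of the
    pixels clip at 0 and roughly `saturation` clip at 1."""
    values = sorted(v for row in gray for v in row)
    n = len(values)
    low = values[int(saturation * (n - 1))]            # lower limit
    high = values[int((1.0 - saturation) * (n - 1))]   # upper limit
    if high == low:                                    # flat image: nothing to stretch
        return [list(row) for row in gray]
    return [[min(max((v - low) / (high - low), 0.0), 1.0) for v in row]
            for row in gray]

# Dark, low-contrast ramp occupying only a narrow intensity range
dark = [[0.10 + 0.001 * k for k in range(200)]]
enhanced = stretch_contrast(dark)
print(min(enhanced[0]), max(enhanced[0]))
```

After the stretch the intensities span the full [0, 1] range, which corresponds to the wider histogram reported after pre-processing.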


Figure 28 – Results of the image pre-processing: a) low-pass filtered grayscale image, b)

image after contrast enhancement and c) comparison of the intensity histogram for both

images

The original image size was 2751×2199. In some images, parts of the aluminium container

or holes in the paste were observed at the edges. To avoid interference with texture

analysis the images were cropped to a smaller region of interest (ROI) of 2560×1882. Both

the GLCM and DWT were then applied to the pre-processed images.

Next, the GLCMs are computed at four different angles θ (i.e. 0°, 45°, 90° and 135°) for each distance L. The GLCM distances were chosen to match the coke particle size distribution classes as closely as possible, as shown in Table 12. The features listed in section 3.3.1 are computed as the average of the four angles θ at each distance L. The average is used in the case of paste images since the orientation of the particles is stochastic and no structured or oriented patterns were observable.
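The GLCM computation at four angles and the averaging of the features over θ can be sketched as follows. This is an illustrative plain-Python version under stated assumptions: the image is already quantized to integer gray levels, the matrix is not symmetrized, only two of the features (contrast and ASM) are shown, and all names are hypothetical.

```python
# (row, column) unit steps for each angle; image rows increase downwards
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(image, distance, angle, levels):
    """Normalized co-occurrence matrix of an integer-valued image."""
    d_row, d_col = (step * distance for step in OFFSETS[angle])
    height, width = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(height):
        for c in range(width):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < height and 0 <= c2 < width:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    # total is zero if the distance exceeds the image size (not handled here)
    return [[v / total for v in row] for row in counts]

def contrast(p):
    return sum((i - j) ** 2 * p[i][j] for i in range(len(p)) for j in range(len(p)))

def asm(p):
    return sum(v * v for row in p for v in row)

def averaged_features(image, distance, levels):
    """Average each feature over the four angles for one distance L."""
    matrices = [glcm(image, distance, angle, levels) for angle in OFFSETS]
    return {"contrast": sum(contrast(p) for p in matrices) / len(matrices),
            "asm": sum(asm(p) for p in matrices) / len(matrices)}

# Small two-level checkerboard used only as an illustration
board = [[(r + c) % 2 for c in range(4)] for r in range(4)]
print(averaged_features(board, distance=1, levels=2))
```

The checkerboard is deliberately directional (its contrast differs between the axial and diagonal angles), which shows what is lost by averaging; for paste images with randomly oriented particles the four angles are expected to carry similar information.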

[Figure 28 c) axes: normalized pixel intensity value (0 to 1) versus pixel count (×10⁴); curves: before and after contrast enhancement]

Table 12 – Choice of GLCM distance L and comparison to the particle size distribution

L (pixel)    Length (mm)    Length (US mesh)
1            0,04           -200
2            0,08           +200
4            0,16           +100
8            0,33           +50
16           0,65           +30
36           1,47           +14
60           2,44           +8
120          4,88           +4
240          9,77           +3/8 (in)

Prior to applying the discrete wavelet transform (DWT) to the paste images, one needs to determine the number of decomposition levels to use. It was chosen based on the smallest dimension of the pre-processed image (i.e. 1882 pixels). Each decomposition level decreases the size of each dimension by a factor of 2, and it is not recommended to reduce the image dimensions below 10 pixels. Seven decomposition levels were therefore selected for this application: after seven decomposition levels, the image has a residual size of 20×14.

The symlet wavelet was selected for the preliminary work (i.e. proof of concept and choice of pre-processing) since it is a simple wavelet with only one parameter to define, the length of the wavelet filter. It was decided to use the symlet 4 (sym4), whose shape is presented in Figure 29. Several types and lengths of wavelets have been tested and these results are presented in section 6.3.3 of this thesis.

Figure 29 – Symlet 4 wavelet function
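The choice of seven decomposition levels for the 2560×1882 region of interest can be reproduced with a short calculation. The sketch below assumes, as in the text, that each level simply halves both dimensions (rounding down) and that no dimension may fall below 10 pixels; an actual DWT with boundary extension can return slightly larger sub-images. The function name is hypothetical.

```python
def max_dwt_levels(rows, cols, min_size=10):
    """Largest number of dyadic levels keeping both dimensions >= min_size.
    Returns (levels, (cols, rows)) with the residual image size."""
    levels = 0
    while min(rows, cols) // 2 >= min_size:
        rows //= 2
        cols //= 2
        levels += 1
    return levels, (cols, rows)

print(max_dwt_levels(1882, 2560))   # levels and residual size of the ROI
```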

The DWT splits the original image in a series of detail coefficient sub-images capturing

defined frequency bands. Table 13 presents the size range (i.e. period of the discrete

signals) of the features in mm as well as the GLCM distance L used for the machine vision

algorithm. The feature size ranges were computed based on the pixel spatial resolution in

the image and the frequency range of each decomposition level. The GLCM distances

correspond in pixel number to the middle of the decomposition level size range. For

example, 3 pixels correspond to approximately 0.12 mm which is the middle of the level 1

size range. The feature size is also compared to the approximate (i.e. closest)

corresponding particle size range.
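The feature size range of each decomposition level and the corresponding GLCM distance can be derived from the pixel resolution. The sketch below assumes the usual dyadic reading, in which the detail coefficients at level s capture periods between 2^s and 2^(s+1) pixels, and uses the 40.7 µm pixel resolution given in section 4.3.1; the function name is hypothetical.

```python
PIXEL_MM = 0.0407   # pixel resolution from section 4.3.1 (40.7 µm)

def level_band(level, pixel_mm=PIXEL_MM):
    """Return (shortest period in mm, longest period in mm, mid-band in pixels)
    for one DWT decomposition level, assuming dyadic frequency bands."""
    low_px = 2 ** level
    high_px = 2 ** (level + 1)
    mid_px = (low_px + high_px) / 2.0
    return low_px * pixel_mm, high_px * pixel_mm, mid_px

for s in range(1, 8):
    low_mm, high_mm, mid_px = level_band(s)
    print(s, round(low_mm, 2), round(high_mm, 2), mid_px)
```

Under this assumption the mid-band distances are 3, 6, 12, 24, 48, 96 and 192 pixels; the values used for levels 4 to 7 in the thesis (25, 50, 100 and 200 pixels) are close to these midpoints.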

Table 13 – Band pass size in period (i.e. spatial dimensions) for each decomposition level of the DWT

Decomposition level (s)    Feature size (period in mm)    GLCM L on detail coeff. (pixel)    Approximate particle size (US mesh)
1                          +0,08/-0,16                    3                                  -100/+200
2                          +0,16/-0,33                    6                                  -50/+100
3                          +0,33/-0,65                    12                                 -30/+50
4                          +0,65/-1,30                    25                                 -14/+30
5                          +1,30/-2,60                    50                                 -8/+14
6                          +2,60/-5,21                    100                                -4/+8
7                          +5,21/-10,42                   200                                -3/8 (in) /+4

The WTA features listed in section 3.3.2 are computed on all the detail coefficient images and the last approximation image. GLCMs were also computed on the wavelet detail sub-images for all angles and one distance (i.e. the middle of the level’s band pass). The same features as for the image’s GLCMs are computed for these detail coefficients.

A vector of features was therefore obtained for each paste image after applying both the GLCM and DWT decomposition. To compare the images, either PCA or PLS was used as a data clustering or classification tool. PCA was used when no information other than the textural features was available on the paste (i.e. when a single data matrix was available). When additional data on the paste were available, such as green/baked anode properties or class assignments for each paste sample (i.e. supervised classification), a regression problem could be formulated. In those cases, PLS regression was used.
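For the case where only the matrix of textural features is available, the extraction of a first principal component can be sketched with the NIPALS iteration. This is a minimal plain-Python illustration and not the PLS Toolbox or ProMV implementation used in the thesis; the autoscaling step, the convergence settings and the function names are assumptions, and columns with zero variance are not handled.

```python
import math

def autoscale(X):
    """Mean-center and scale each column to unit variance."""
    n, m = len(X), len(X[0])
    columns = list(zip(*X))
    means = [sum(c) / n for c in columns]
    stds = [math.sqrt(sum((v - mu) ** 2 for v in c) / (n - 1))
            for c, mu in zip(columns, means)]
    return [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in X]

def first_pc(X, n_iter=500, tol=1e-12):
    """Scores t and unit-length loadings p of the first component (NIPALS)."""
    n, m = len(X), len(X[0])
    t = [row[0] for row in X]                       # start from the first column
    p = [0.0] * m
    for _ in range(n_iter):
        tt = sum(v * v for v in t)
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        t_new = [sum(X[i][j] * p[j] for j in range(m)) for i in range(n)]
        converged = sum((a - b) ** 2 for a, b in zip(t, t_new)) < tol
        t = t_new
        if converged:
            break
    return t, p

# Two perfectly correlated hypothetical features for four images
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
scores, loadings = first_pc(autoscale(features))
print(loadings)
```

Each image is then summarized by its score, and images with similar texture are expected to cluster together in the score space.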

Chapter 5 A new Multi-block PLS algorithm including a sequential pathway

5.1 Introduction

The most common latent variable methods (i.e. PCA and PLS) have been presented in

Chapter 2 along with their interpretation tools. However, graphical interpretation tools such

as loadings and contribution plots can become difficult to interpret when the number of

variables in a model becomes very large. To improve the interpretation, variables in a

dataset can be grouped into meaningful blocks. Usually, a priori knowledge is used to

decide which variables belong in each block. For process data, the blocks often represent

the sequence of steps or unit operations (i.e. pieces of equipment). A typical multi-block

data structure collected from an industrial process is shown in Figure 30. For example, Z

may contain the properties of raw materials at different times or for different lots, the

operating conditions of the different process units used to process each lot of raw

materials would be stored in the Xb blocks (b=1,2,…B), and the resulting properties of the

final product (i.e. quality attributes) could be collected in Y.

Figure 30 – Illustration of the block order for an industrial process

Several multi-block methods have been proposed in the literature and used in various

applications (Westerhuis et al. 1998; Smilde et al. 2003; Kohonen et al. 2008). Early

developments on the multi-block PLS methods were made by Wangen and Kowalski

(Wangen & Kowalski 1989), MacGregor et al. (MacGregor et al. 1994) and Westerhuis and

Coenegracht (Westerhuis & Coenegracht 1997). They proposed a hierarchical structure

allowing visualization and interpretation of the data at two levels: the super level where the

information from all the blocks is combined, and the local block level. These two levels of

scrutiny greatly help dealing with the large number of variables because each individual

block includes a smaller number of them while an overview of the information contained in

all blocks is maintained at the super level. This feature was found particularly useful for

process monitoring and fault detection applications (MacGregor et al. 1994; Kourti et al. 1995) and also for improved process understanding (Kohonen et al. 2008; Tessier, Duchesne, Tarcy, et al. 2011).

The multi-block PLS models (MB-PLS) are computed based on extensions of the non-

linear iterative partial least squares (NIPALS) algorithm (Geladi & Kowalski 1986;

Höskuldsson 1988; Wold, Trygg, et al. 2001) traditionally used to estimate PLS models.

The main difference between the first two algorithms proposed in the literature is in the way the regressor blocks (i.e. X’s) are deflated. The first one, proposed by Wangen and Kowalski (Wangen & Kowalski 1989), uses the block scores to deflate each X block.

Alternatively, Westerhuis and Coenegracht (Westerhuis & Coenegracht 1997) proposed to

use the super scores for deflating the X blocks. In a review article on multi-block PCA and

PLS methods, Westerhuis et al. (Westerhuis et al. 1998) proved that loss of information

occurs when block score deflation is used. This leads to poorer prediction performance of

the models. The super score deflation approach solved that issue and therefore the

Westerhuis and Coenegracht MB-PLS algorithm is considered as a benchmark in this

chapter. Another interesting feature of this algorithm is that it can be computed directly

from a standard PLS model after applying block scaling to each regressor block, and

combining them in a single X matrix (Westerhuis et al. 1998).
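The block scaling and concatenation step mentioned above can be sketched as follows. The exact scaling used in the cited work is not restated in this section, so the sketch assumes one common choice: each block, already autoscaled, is divided by the square root of its number of variables so that every block contributes the same total variance to the concatenated X matrix. The function name is hypothetical.

```python
import math

def block_scale_and_concatenate(blocks):
    """blocks: list of autoscaled blocks, each a list of rows with the same
    number of observations. Returns the single concatenated X matrix."""
    n_obs = len(blocks[0])
    X = [[] for _ in range(n_obs)]
    for block in blocks:
        weight = 1.0 / math.sqrt(len(block[0]))   # assumed block scaling
        for i, row in enumerate(block):
            X[i].extend(v * weight for v in row)
    return X

# Hypothetical blocks: 4 variables in the first block, 1 in the second
block_1 = [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]
block_2 = [[1.0], [-1.0]]
print(block_scale_and_concatenate([block_1, block_2]))
```

Without this weighting, a block with many variables would dominate a standard PLS model fitted on the concatenated matrix simply because of its size.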

Other alternative methods and hierarchical frameworks have also been proposed in the

literature to accommodate specific data structures (i.e. pathways or correlations) or to

introduce some prior knowledge into the statistical models and improve their performance.

Smilde et al. (Smilde et al. 2003), Höskuldsson and Svinning (Höskuldsson & Svinning

2006) and Höskuldsson (Höskuldsson 2008; Höskuldsson 2014) proposed different

frameworks for computing PCA and PLS-like models for many different types of pathways

from parallel to sequential blocks or a mix of both cases. All of these methods were

focused on modeling the X or Y data and not so much on the interpretability of the models.

Hanafi et al. (Hanafi et al. 2006) proposed the use of either Common Components and

Specific Weights Analysis (CCSWA) or the Multiple Co-inertia Analysis (MCoA) to combine

information from multiple measurement techniques applied to the same set of samples for a

food processing application. Finally, Hassani et al. (Hassani et al. 2012) discussed

methods to choose the number of components and assess the importance of each block in

MB-PLS models.


However, the Westerhuis and Coenegracht MB-PLS method also has some drawbacks.

First, Westerhuis and Smilde (Westerhuis & Smilde 2001) reported that the use of super

score deflation in the MB-PLS algorithm introduced information mixing between the blocks.

This is explained by the fact that the super scores carry information from all the blocks.

When they are regressed onto each X block as part of the deflation procedure, variations

belonging to all the blocks are introduced into blocks in which this information was not

present originally. Hence, the interpretation of the block scores could be misleading. They

proposed to modify the super score deflation to deflate the Y matrix only, based on

previous work published by Dayal and MacGregor (Dayal & MacGregor 1997) for the PLS

algorithm. This effectively removed the information mixing, but no further use of the

modified algorithm was reported in the literature.

The second issue is that all X blocks are constrained to be modelled by the same number of principal components. In a case where the effective ranks of the blocks (data matrices) are different, the model could be improved by selecting a different number of latent variables (LVs) for each of them. For example, a block of data may include process variables involved in an orthogonal design of experiments (DOE), to be combined with a second block containing highly correlated spectral data collected from raw materials using some spectroscopic techniques (Næs et al. 2011). In such a case, it is expected that as many LVs as the number of variables would be necessary for the DOE block since it is full rank, whereas the reduced-rank spectral data block may require only a few LVs (e.g. 1-2) to capture most of the information in that block. This is not possible with MB-PLS.

Finally, interpreting the relationships between the variables at the block level in presence

of between block correlations is not straightforward and may result in misleading

conclusions. The loading vectors are used to interpret the relationships between the

variables in latent variable models. At the super level, the loadings indicate the contribution

of each block in modelling the Y data. Information about the variables is captured by the

block loadings. When interpreting the variations within a given block, one cannot assume

that these variations were introduced by the corresponding process unit (or step or

instrument), but may have been caused by other blocks. Consider a simple two regressor

block example where Z and X represent raw material properties and process variables,

respectively, and Y contains the final product quality attributes. Variations in raw material

properties typically affect both the process operation and the final product quality.


Additional process variations introduced by disturbances (other than raw materials) and

changes in operation policies may also impact the final product. Hence, the variations in

the process block X are not only caused by the process itself but also by variations in raw

materials contained in the other block Z. When interpreting the loadings of the X block

individually, variations in process variables caused by raw materials cannot be identified

explicitly and distinguished from other sources of variations, and may be attributed wrongly

to some other process disturbances. However, when prior knowledge is available about

the existing pathway between the blocks, this information should be used to enhance the

model interpretation. For instance, the variance of X could be decomposed into that

correlated with Z (Xcorr) and that orthogonal to Z (Xortho). The effect of raw material

properties on certain process state variables as well as feedback/feedforward control

actions made to compensate for raw material variations would be identified in Xcorr,

whereas Xortho would contain information about other process disturbances and operation

decisions.
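The decomposition of X into Xcorr and Xortho described above can be illustrated with a minimal sketch. It assumes a plain least-squares projection of X onto the columns of Z, which is a simplification: the algorithms discussed in this chapter project onto score vectors rather than onto Z itself. The function name and the random data are hypothetical.

```python
import numpy as np

def split_by_projection(X, Z):
    """Split X into the part lying in the column space of Z and the remainder."""
    # Least-squares coefficients of the regression of X on Z
    # (lstsq also copes with a rank-deficient Z).
    coef, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_corr = Z @ coef        # variation in X linearly explained by Z
    X_ortho = X - X_corr     # variation in X orthogonal to Z
    return X_corr, X_ortho

# Hypothetical data: 50 lots, 3 raw material properties, 4 process variables.
rng = np.random.default_rng(0)
Z = rng.standard_normal((50, 3))
X = Z @ rng.standard_normal((3, 4)) + rng.standard_normal((50, 4))
X_corr, X_ortho = split_by_projection(X, Z)
```

By construction, the two parts add up to X and the columns of X_ortho are orthogonal to every column of Z.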

The sequential orthogonal PLS (SO-PLS) introduced by Næs et al. (Næs et al. 2011)

addresses the last two issues of MB-PLS to some extent. It follows the framework

proposed by Jørgensen et al. (Jørgensen et al. 2004; Jørgensen et al. 2007) in which

separate individual PLS models are estimated using each regressor block in a stepwise

fashion after the blocks are sequentially orthogonalized with respect to each other.

Basically, a sequential pathway is assumed for the regressor blocks and defines their

ordering. A PLS model is built between the first block and the response data Y.

Subsequent blocks are then orthogonalized with respect to the information used from the

first block by regressing them against the scores of the PLS model. The procedure is

repeated for subsequent regressor blocks using only their orthogonal information and the

Y-residuals of the PLS model obtained at the previous step. Hence, at each step, the

regressor block only contains new information to explain the remaining variance left in Y.

One of the motivations for developing the SO-PLS approach was to model the Y data by

combining regressor blocks having very different ranks, such as a DOE block and several

blocks of spectral data (non-full rank) (Jørgensen & Næs 2008; Menichelli et al. 2014). The

proposed method allows using a different number of latent variables per block. It also

helped in the interpretation because each PLS model uses a single regressor block and

only new information is modelled at each step due to the orthogonalization.


However, the main issue with SO-PLS is that it totally ignores the correlated information

between the blocks. Although this information is not useful for making predictions, it is

important for interpreting the relationships between the variables especially in industrial

process applications. Consider again the simple two regressor block example discussed

previously. Applying SO-PLS to that dataset would result in two PLS models, the first

between Z and Y, and the second between Xortho and F, where Xortho is X orthogonalized

with respect to Z and F consists of the Y-residuals of the first model. Hence, the correlated

information between Z and X (i.e. Xcorr) is completely removed from the analysis. As

mentioned earlier, Xcorr contains process variations introduced by raw materials as well as

any feedback/feedforward control actions made to attenuate variations in raw material

properties. The authors consider this information very important for process understanding

and improvement, and for quality control. For example, Xcorr is required for establishing

multivariate specification regions for raw material properties in presence of

feedback/feedforward control (Duchesne & MacGregor 2004).

The goal of this chapter is to propose a new sequential multi-block PLS algorithm (SMB-

PLS) that combines the advantages of both MB-PLS and SO-PLS methods. This includes

the integrated two level hierarchical structure (super and block levels) of the first, the

separation of correlated and orthogonal information and the use of different number of

latent variables per block from the second. This is achieved by incorporating a sequential

pathway structure and block orthogonalization within the MB-PLS algorithm. In process

applications, the pathway typically represents the flowsheet of process units, from raw

materials to final product. The key feature of the new algorithm is that correlated

information between a given block and subsequent ones is pooled in a common latent

variable space whereas orthogonal information is captured by other components. Hence,

both between block correlated and orthogonal information is considered in the model and

therefore, no information is lost.

This chapter is organized as follows. First, the different multi-block methods are introduced

in more technical detail. Second, two datasets are used to illustrate the properties of the

proposed SMB-PLS algorithm. The first was obtained from a simulated film blowing

polymer extrusion process, and the second from a real anode manufacturing process.

Modeling results and interpretation are then discussed, and conclusions are drawn.


5.2 Description of the multi-block methods

5.2.1 Multi-block PLS (MB-PLS)

There are two different implementations of the MB-PLS algorithm. The first was proposed

by Wangen and Kowalski (Wangen & Kowalski 1989) and uses the block score deflation,

whereas the second algorithm proposed by Westerhuis and Coenegracht (Westerhuis &

Coenegracht 1997) uses super score deflation. The computation of both algorithms is

based on the PLS NIPALS algorithm and is essentially the same except for the deflation

step. The MB-PLS based on block score deflation suffers from loss of information in the

deflation step (Westerhuis & Coenegracht 1997; Westerhuis et al. 1998) leading to inferior

prediction ability compared with the super score deflation method. Also, the super score

deflation method gives results equivalent to PLS when all blocks are concatenated into a

single regressor matrix and block scaling is applied (Westerhuis et al. 1998). For these

reasons, only the second MB-PLS algorithm (super score deflation) will be described in this

section. A detailed description of the algorithm and the Matlab code for computing the

model are available in Westerhuis et al. (Westerhuis et al. 1998).

The schematic of the method is presented in Figure 31. In this figure, tb corresponds to the

block score and tT to the super score. This figure shows the main steps in computing the

MB-PLS model. First, vector u is initialized and is then regressed on all blocks X1, X2, ...,

XB separately to obtain block weights w1T, w2T, ..., wBT. These block weights are used to

compute block scores t1, t2, ..., tB after normalization to length one. The block scores are

combined together column wise in a super block T. Then super weights wTT and super

scores tT are computed using T (i.e. concatenated block scores). Here the super weights

are normalized to length one. This computation cycle is repeated until the convergence of

the super score tT. The super score tT is used for the deflation of all Xb blocks.


Figure 31 – The MB-PLS algorithm for 2 regressor blocks

(adapted from (Westerhuis et al. 1998))

Since the number of variables in each block is different, it is important to apply block

scaling (equation 5.1) to give the same importance to each block. If this step is not

performed, the latent variable model will focus more on those blocks containing many

variables (i.e., carrying more variability) and may hide the variance contained in the

smaller blocks. To do so, each variable is divided by the square root of the number of

variables in that block. This operation sets the variance of each block to one as opposed to

each variable having a variance of one after the normal auto-scaling applied for normal

PLS models.

$$x_{j,bl} = \frac{x_j^{*}}{\sqrt{k_b}} \qquad (5.1)$$

Where x*j is the auto-scaled variable, j is the variable index and kb is the number of

variables in the block.
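As a minimal sketch of this scaling step, the snippet below auto-scales each block and then divides it by the square root of its number of variables, as in equation 5.1. The function names and the random blocks are hypothetical, and the sample standard deviation (ddof=1) is an assumption.

```python
import numpy as np

def autoscale(X):
    """Mean-center each column and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def block_scale(blocks):
    """Auto-scale each block, then divide it by sqrt(k_b) as in equation 5.1."""
    return [autoscale(Xb) / np.sqrt(Xb.shape[1]) for Xb in blocks]

# Hypothetical blocks with very different numbers of variables.
rng = np.random.default_rng(1)
Z = rng.standard_normal((50, 12))
X = rng.standard_normal((50, 3))
Zs, Xs = block_scale([Z, X])

# Sum of the column variances of each scaled block.
block_variances = [Xb.var(axis=0, ddof=1).sum() for Xb in (Zs, Xs)]
```

After scaling, each column has a variance of 1/k_b, so the total variance of every block equals one regardless of its size.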



1. Initialize u.
2. Start convergence loop.
   For b = 1:B
      2.1. w_b = X_b^T u / (u^T u)
      2.2. w_b = w_b / (w_b^T w_b)^(1/2)
      2.3. t_b = X_b w_b
   End
   2.4. T = [t_1 ... t_B]
   2.5. w_T = T^T u / (u^T u)
   2.6. w_T = w_T / (w_T^T w_T)^(1/2)
   2.7. t_T = T w_T / (w_T^T w_T)
   2.8. q = Y^T t_T / (t_T^T t_T)
   2.9. u = Y q / (q^T q)
   2.10. Check for convergence of t_T or u.
For b = 1:B
   3. p_b = X_b^T t_T / (t_T^T t_T)
   4. E_b = X_b - t_T p_b^T
End
5. F = Y - t_T q^T
6. Store w_b, p_b, t_b, t_T and u as new columns in W, P, T, T_T and U.
7. Restart at step 1, replacing X_b by E_b and Y by F.
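The steps listed in Figure 31 can be sketched for a single latent variable as follows. This is an illustrative NumPy transcription, not the Matlab code of Westerhuis et al.; the function name, the starting value of u (first column of Y), the convergence tolerance and the random data are all assumptions.

```python
import numpy as np

def mbpls_component(blocks, Y, tol=1e-10, max_iter=500):
    """One MB-PLS latent variable with super score deflation (sketch)."""
    u = Y[:, [0]].copy()                       # step 1 (assumed start: first Y column)
    t_old = np.zeros_like(u)
    for _ in range(max_iter):
        t_blocks = []
        for Xb in blocks:
            w = Xb.T @ u / (u.T @ u)           # 2.1 block weights
            w /= np.linalg.norm(w)             # 2.2 normalize to length one
            t_blocks.append(Xb @ w)            # 2.3 block score
        T = np.hstack(t_blocks)                # 2.4 super block
        w_super = T.T @ u / (u.T @ u)          # 2.5 super weights
        w_super /= np.linalg.norm(w_super)     # 2.6 normalize to length one
        t_super = T @ w_super                  # 2.7 (w_super already has unit length)
        q = Y.T @ t_super / (t_super.T @ t_super)   # 2.8
        u = Y @ q / (q.T @ q)                  # 2.9
        if np.linalg.norm(t_super - t_old) < tol * np.linalg.norm(t_super):
            break                              # 2.10 convergence on the super score
        t_old = t_super
    # Steps 3 to 5: deflate every block and Y with the super score.
    E = []
    for Xb in blocks:
        p = Xb.T @ t_super / (t_super.T @ t_super)
        E.append(Xb - t_super @ p.T)
    F = Y - t_super @ q.T
    return t_super, T, w_super, E, F

# Hypothetical data standing in for scaled blocks.
rng = np.random.default_rng(2)
Z = rng.standard_normal((30, 5))
X = rng.standard_normal((30, 3))
Y = rng.standard_normal((30, 2))
t_super, T, w_super, (E_Z, E_X), F = mbpls_component([Z, X], Y)
```

Further components would be obtained by calling the function again on the deflated blocks and on F, which corresponds to step 7.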


5.2.2 Sequential Orthogonal PLS (SO-PLS)

Sequential orthogonal PLS (SO-PLS) was proposed by Næs et al. (Næs et al. 2011).

They developed this method for the analysis of datasets combining design of experiment

data (i.e. a full rank data matrix) and very low rank spectral data blocks obtained from

analytical instruments such as spectroscopy or chromatography. In this case, selecting the

number of LVs for the traditional MB-PLS is a compromise between the full rank DOE data

and the low rank data. With SO-PLS, a different number of LVs can be computed for each

block, overcoming the problem. The schematic of the SO-PLS method is presented in

Figure 32.

Figure 32 – The SO-PLS algorithm shown for 2 regressor blocks

The SO-PLS method is a stepwise procedure by which a separate PLS model is estimated

for each regressor block Xb. The particularity of the method is that each subsequent block

is orthogonalized with respect to the scores (T) of the PLS models built at the previous

steps. The orthogonalization step ensures that only new information not modeled by the

previous blocks is left in the subsequent blocks. The framework is as follows (Figure 32).

First, the regressor blocks Xb are ordered according to a sequential pathway defined by

the user. A PLS model is built between the first block in the sequence X1 and Y. Then the

scores of the first block T1 are used to orthogonalize the data in the second block (and

subsequent blocks if B > 2) using multiple linear regression (MLR):

For b = 1, 2, …, B-1 and for k = 1, 2, …, B-b

$$X_{b+k}^{orth} = X_{b+k} - T_b \left( T_b^T T_b \right)^{-1} T_b^T X_{b+k} \qquad (5.2)$$


Finally, a PLS model is built between the orthogonalized second block (X2orth) and the Y-

residuals (F) of the first PLS model. This sequence is repeated for all subsequent blocks

and is not limited to datasets with only two X blocks. When b=B, then a simple PLS model

is built between the information left in XB after the sequential orthogonalization steps and

the Y-residual at this step.

For this method, the blocks do not need to be block scaled prior to the analysis. Moreover,

contrary to the MB-PLS method, the number of components selected for each block can

be different. This allows capturing information from datasets with very different ranks.
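The three steps of the SO-PLS procedure for two regressor blocks can be sketched as below. The small NIPALS routine, the function names, the tolerance and the random data are assumptions made for illustration; this is not the implementation of Næs et al.

```python
import numpy as np

def pls_scores(X, Y, n_comp, tol=1e-10, max_iter=500):
    """Plain NIPALS PLS; returns the X scores T and the Y residuals F."""
    E, F = X.copy(), Y.copy()
    scores = []
    for _ in range(n_comp):
        u = F[:, [0]]
        t_old = np.zeros_like(u)
        for _ in range(max_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)
            t = E @ w
            q = F.T @ t / (t.T @ t)
            u = F @ q / (q.T @ q)
            if np.linalg.norm(t - t_old) < tol * np.linalg.norm(t):
                break
            t_old = t
        p = E.T @ t / (t.T @ t)
        E = E - t @ p.T          # deflate X
        F = F - t @ q.T          # deflate Y
        scores.append(t)
    return np.hstack(scores), F

def so_pls_two_blocks(X1, X2, Y, a1, a2):
    """SO-PLS sketch for two blocks with a1 and a2 latent variables."""
    T1, F1 = pls_scores(X1, Y, a1)                    # step 1: PLS between X1 and Y
    # Step 2: equation 5.2, remove from X2 what the scores T1 explain.
    X2_orth = X2 - T1 @ np.linalg.solve(T1.T @ T1, T1.T @ X2)
    T2, F2 = pls_scores(X2_orth, F1, a2)              # step 3: PLS between X2_orth and F
    return T1, X2_orth, T2, F2

# Hypothetical data in which X2 is partly driven by X1.
rng = np.random.default_rng(3)
X1 = rng.standard_normal((40, 4))
X2 = X1 @ rng.standard_normal((4, 6)) + rng.standard_normal((40, 6))
Y = X1[:, :2] + X2[:, :2] + 0.1 * rng.standard_normal((40, 2))
T1, X2_orth, T2, F2 = so_pls_two_blocks(X1, X2, Y, a1=2, a2=2)
```

Because X2_orth is orthogonal to T1, the scores of the second model are also orthogonal to those of the first, and a1 and a2 can be chosen independently.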

5.2.3 Proposed algorithm: the Sequential Multi-block PLS (SMB-PLS)

The proposed multi-block method is called the Sequential Multi-block PLS (SMB-PLS).

The schematic of the method is presented in Figure 33. It is presented for two blocks only,

but the algorithm can be applied to any number of blocks.

An orthogonalization procedure according to a sequential pathway is introduced into the

MB-PLS framework, while keeping the super level and block level structure for ease of

interpretation. The first step of the algorithm is to compute the first block weight w1T by the

regression of an initial Y score u onto X1. Then, in order to differentiate the correlated

information from the orthogonal information, the subsequent blocks Xb are split using the

following equation:

For b = 1, 2, …, B-1 and for k = 1, 2, …, B-b

$$X_{b+k}^{corr} = T_b \left( T_b^T T_b \right)^{-1} T_b^T X_{b+k} \qquad (5.3)$$

The block scores for the subsequent blocks are computed by regressing u onto Xbcorr to

obtain the block weights wbcorr,T. Then the X block scores t1, ..., tB are combined in the super

level score matrix T as in the MB-PLS method. The last step is the computation of a PLS cycle

between u and T to compute the super level weights (wTT) and super scores tT. This

computation cycle is repeated until convergence on tT. Deflation using the super score is

then performed. Once all the information from X1 has been explained, the same

methodology is applied to the subsequent blocks. Since only the correlated information

with the previous block was removed by the deflation step, the components for the

subsequent block will only model new information not explained by the previous block

components.


Figure 33 – The SMB-PLS algorithm for two X blocks

The advantages of this method are that a different number of components can be used for

each block and it also enables the visualization of between blocks correlated information.

Different number of components can be computed for each block since only the correlated

information is removed after the deflation for the subsequent block. This leaves orthogonal

(i.e. new) information in the Xb blocks to further explain variations in Y by additional

components. Finally, for each LV, block scores and loadings are computed for each block

to enable interpretation of relationships between variables, outlier detection, visualization

of clustering patterns and so on. The super scores also give important information on the

correlation structure between the blocks.

For b = 1, ..., B-1
   1. Initialize u.
   2. Start convergence loop.
      2.1. w_b = X_b^T u / (u^T u)
      2.2. w_b = w_b / (w_b^T w_b)^(1/2)
      2.3. t_b = X_b w_b
      For k = 1, ..., B-b
         2.4. X_{b+k}^corr = t_b (t_b^T t_b)^-1 t_b^T X_{b+k}
         2.5. w_{b+k} = (X_{b+k}^corr)^T u / (u^T u)
         2.6. w_{b+k} = w_{b+k} / (w_{b+k}^T w_{b+k})^(1/2)
         2.7. t_{b+k} = X_{b+k}^corr w_{b+k}
      End
      2.8. T = [t_b ... t_B]
      2.9. w_T = T^T u / (u^T u)
      2.10. w_T = w_T / (w_T^T w_T)^(1/2)
      2.11. t_T = T w_T / (w_T^T w_T)
      2.12. q = Y^T t_T / (t_T^T t_T)
      2.13. u = Y q / (q^T q)
      2.14. Check for convergence of t_T or u.
   For k = b, ..., B
      3. p_k = X_k^T t_T / (t_T^T t_T)
      4. E_k = X_k - t_T p_k^T
   End
   5. F = Y - t_T q^T
   6. Store w_b, p_b, t_b, t_T and u as new columns in W, P, T, T_T and U.
   7. Restart at step 1, replacing X_b by E_b and Y by F.
   8. When the information in block X_b is depleted, increment b and start at step 1.
End
9. When b = B, compute a normal PLS between E_B and F.
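Steps 2.1 to 5 of the listing can be sketched for the first component of the first block in the two-block case (b = 1, B = 2). This is a minimal illustration under stated assumptions: the function name, the starting value of u, the tolerance and the random data are hypothetical, and steps 6 to 9 (storage, further components, final PLS on the last block) are left out.

```python
import numpy as np

def smbpls_first_component(X1, X2, Y, tol=1e-10, max_iter=500):
    """First SMB-PLS component of block 1 for two regressor blocks (sketch)."""
    u = Y[:, [0]].copy()                            # assumed start: first Y column
    t_old = np.zeros_like(u)
    for _ in range(max_iter):
        w1 = X1.T @ u / (u.T @ u)                   # 2.1
        w1 /= np.linalg.norm(w1)                    # 2.2
        t1 = X1 @ w1                                # 2.3
        X2_corr = t1 @ (t1.T @ X2) / (t1.T @ t1)    # 2.4 part of X2 correlated with t1
        w2 = X2_corr.T @ u / (u.T @ u)              # 2.5
        w2 /= np.linalg.norm(w2)                    # 2.6
        t2 = X2_corr @ w2                           # 2.7
        T = np.hstack([t1, t2])                     # 2.8
        w_super = T.T @ u / (u.T @ u)               # 2.9
        w_super /= np.linalg.norm(w_super)          # 2.10
        t_super = T @ w_super                       # 2.11 (unit-length super weights)
        q = Y.T @ t_super / (t_super.T @ t_super)   # 2.12
        u = Y @ q / (q.T @ q)                       # 2.13
        if np.linalg.norm(t_super - t_old) < tol * np.linalg.norm(t_super):
            break                                   # 2.14
        t_old = t_super
    # Steps 3 to 5: deflate both blocks and Y with the super score.
    E1 = X1 - t_super @ (X1.T @ t_super / (t_super.T @ t_super)).T
    E2 = X2 - t_super @ (X2.T @ t_super / (t_super.T @ t_super)).T
    F = Y - t_super @ q.T
    return t1, X2_corr, t_super, E1, E2, F

# Hypothetical data in which X2 is partly driven by X1.
rng = np.random.default_rng(5)
X1 = rng.standard_normal((40, 4))
X2 = X1 @ rng.standard_normal((4, 3)) + rng.standard_normal((40, 3))
Y = X1[:, :2] + X2[:, :2] + 0.1 * rng.standard_normal((40, 2))
t1, X2_corr, t_super, E1, E2, F = smbpls_first_component(X1, X2, Y)
```

In this two-block sketch the score t2 is collinear with t1, so deflating X2 with the super score removes exactly X2_corr and leaves the orthogonal part of X2 available for later components.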


This algorithm has other interesting properties. First, both the block scores and super

scores are orthogonal. In MB-PLS, the block scores are correlated. In fact, the information

captured by the block scores is the same as the super scores for each latent variable. Only

the numerical values are not exactly the same. It is only necessary to use either the super

scores or block scores for interpretation. Also, for the weights, the block weights are

important for the interpretation of what is going on inside each block. But the super

weights are equally important since they provide information about how the information

extracted from each block distributes in each component of the model.

5.3 Description of the datasets used for the case studies

This section describes the two datasets used to illustrate the properties of the new SMB-

PLS algorithm and compare them against the traditional MB-PLS and SO-PLS methods.

The first dataset was obtained from a simulated polymer film blowing process. The second

is an industrial dataset collected from an anode manufacturing process.

5.3.1 Simulated data from film blowing process

The data used in this part of the thesis were retrieved from a simulated polymer film

blowing process presented by Duchesne (Duchesne 2000) and Duchesne & MacGregor

(Duchesne & MacGregor 2004). Two case studies are used to illustrate the differences in

the multi-block methods. The first case represents the situation when there is no

correlation between the raw material variations and process variations. That is the

variability in the raw material does not have an effect on the process parameters and there

is no between block correlation. In the second case, correlation between the raw material

and the process variables is introduced through a feedforward control strategy. The final

section of the film blowing process showing the bubble inflation and cooling is presented in

Figure 34.

Polymer films are manufactured by an extrusion process. The material is melted and

mixed in a screw extruder and passed through a hollow circular die. As the polymer melt

reaches the die, air is blown inside the extruded film to maintain a given inflation pressure.

This creates a bubble-shape film of a desired diameter. The film is then cooled by blowing

air at a given temperature on its outer surface.


Figure 34 – Simulated end section of a film blowing process

(adapted from (Duchesne 2000))

The data is organized in three different blocks. The raw material block (Z) contains the raw

polymer properties consisting of ten temperature dependent viscosities (η), the heat

capacity (Cp) and the density (ρ). The second block X contains process variables. The

process manipulated variables are the polymer flow rate (Q) and cooling air flow rate. The

cooling air flow rate is however represented by the maximum local heat transfer coefficient

along the bubble (h0). The ambient air temperature (Ta), a measured process disturbance,

is also included in the X block. Two film quality variables (Y) are measured in this

simulation. The first is the film frost line height (FLH), which is the position along the film where the

cooling has no more effect on the film properties (affects film crystallinity). The last

measurement is the full stress in the machine direction (FMDS) taken beyond the FLH. It is

a measure of the mechanical strength of the film in the extrusion direction.

For both case studies, 50 lots of raw materials were simulated by introducing variations in

the polymer resin properties (viscosity curve, heat capacity and density). These were

processed in the simulator according to some operating policies. Details of the simulation

are available in Duchesne’s Ph.D. thesis (Duchesne 2000).

5.3.1.1 First case – No correlation between raw materials and process data

The first case study illustrates the properties of the different multi-block methods when

there is no correlation between the regressor blocks. Variations in both blocks affect final

product properties (Y), but no correlation exists between the two regressor blocks (i.e. Z

and X). In this case, random variations were added to raw material properties (Z) and to

process variables (X) Q, h0, and Ta to simulate the effect on the product quality.


There is, however, correlated information inside each block. In the Z block, the viscosities

(η) are correlated, but the variations in ρ and Cp are independent. The correlations in X are

due to the feedforward control actions implemented to attenuate variations in Ta by

adjusting Q and h0 (control scheme correcting for a process disturbance and not for

changes in resin properties).

5.3.1.2 Second case – Correlation between raw materials and process data

In this second case study, variations in raw material properties were implemented similarly

as for the first case, but this time process parameters are adjusted by a second

feedforward controller to attenuate variations caused by raw material properties. This

introduces correlations between the raw material and process blocks (Z and X). The

feedforward controller corrects for some of the variability in the polymer heat capacity Cp

by adjusting the flow rate Q. The existing feedforward control for Ta is modified to use only

h0 as the manipulated variable.

5.3.2 Industrial data from the anode manufacturing process

The anode manufacturing process has been described in detail in Chapter 1

of this thesis. Nevertheless, Figure 35 presents a non-exhaustive list of the variables

included in each data block.

The ordering of the blocks is chosen to represent the natural order of the process units.

The first block (Z) includes the properties of all three types of raw materials (coke, pitch

and anode butts). The first process block (X1) contains the variables associated with the

formulation of the paste. That is, the relative amount of each type of material as well as the

particle size distribution of the aggregates. The second process block (X2) represents the

operating conditions measured during the mixing and the vibro-forming of the anode paste.

The operating conditions of the anode baking furnace are stored in the third process block

(X3). Finally, the baked anode quality data obtained from the laboratory (i.e. testing of

anode core samples) are collected in the response block Y. These are listed in Table 14.


Figure 35 – Data blocks collected from the anode manufacturing process (Modified from

(Lauzon-Gauthier et al. 2012))

Table 14 – List of the Y variables used for the anode manufacturing dataset case study

This industrial dataset previously investigated by the author (Lauzon-Gauthier 2011;

Lauzon-Gauthier et al. 2012) was found to be a good candidate for testing the multi-block

algorithms for two main reasons. First, it contains repeated observations in the raw

material block allowing to assess between block information mixing (Westerhuis & Smilde

2001). But most importantly, some of the blocks are correlated (i.e. raw material Z and

formulation X1) whereas others are not (i.e. raw material Z and baking X3).

[Figure 35 content, by process step and data block:
1. Raw materials (Z): coke density and impurities; pitch physical properties; butts impurities
2. Classification of materials (X1): aggregate size distribution (shift based); paste formulation
3. Paste mixing & forming (X2): temperatures; mixing power, etc.; bellows pressure; anode height
4. Baking (X3): maximum flue wall temperature; 2 anodes temperature; cycle time, etc.
5. Core sampling (Y): physical properties]

Number   Variable
1        Green anode apparent density
2        Green anode weight
3        Baked anode weight (mean)
4        Thermal conductivity
5        Baked anode apparent density
6        Real dens
7        Compressive strength
8        LC
9        Young's modulus
10       Electrical resistivity


5.4 Results and discussion

All datasets were auto-scaled and block scaled prior to the analysis. The original anode

manufacturing process dataset contained missing data. These were imputed by values

estimated using the PLS Toolbox mdcheck function which uses a PCA model to replace

the missing data until convergence. This operation was performed on the whole

concatenated dataset (i.e. [Z, X1, X2, X3]).
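The general idea of PCA-based imputation repeated until convergence can be sketched as follows. This is only an illustration of the principle and is not the PLS Toolbox mdcheck implementation; the function name, the number of components, the tolerance and the synthetic data are assumptions.

```python
import numpy as np

def pca_impute(X, n_comp=2, tol=1e-8, max_iter=1000):
    """Replace NaN entries by PCA reconstructions until they stop changing."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from the column means of the observed values.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(max_iter):
        mean = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        recon = (U[:, :n_comp] * s[:n_comp]) @ Vt[:n_comp] + mean
        new = np.where(missing, recon, X)       # observed values are never modified
        change = np.linalg.norm(new - filled)
        filled = new
        if change < tol * np.linalg.norm(filled):
            break
    return filled

# Hypothetical low-rank data with about 10% of the entries removed.
rng = np.random.default_rng(4)
data = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 6))
data += 0.05 * rng.standard_normal(data.shape)
holes = data.copy()
holes[rng.random(data.shape) < 0.1] = np.nan
completed = pca_impute(holes, n_comp=2)
```

Only the missing cells are updated at each pass, so the measured values of the concatenated dataset are left untouched.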

5.4.1 Selecting the number of components

Selecting the number of components in any latent variable method is very important for the

interpretation but also for the application of the models. The commonly used cross-validation

technique was described in section 2.4 of this thesis.

To compare the prediction ability and the interpretation obtained with the three multi-block

methods, the same criterion was used to select the number of components. This is

especially important for sequential algorithms like SMB-PLS and SO-PLS since the number

of components captured after each block will have an effect on the orthogonalization and

thus the information left in the subsequent blocks.

The selected criterion is a modified version of the Q2 statistic as defined in the ProMV

software and in (Wold, Sjöström, et al. 2001). It is based on the prediction ability of a

model obtained on an external validation dataset instead of using the traditional cross-

validation procedure. Equation 5.4 describes the statistic. An increasing Q2 value (i.e.

predictive ability) from one LV to the next indicates the significance of the added

component.

$$Q^2Y_a = 1 - \frac{\mathrm{PRESS}_a}{\mathrm{SS}_Y} \qquad (5.4)$$

Where a corresponds to the number of latent variables and SSY is the total sum of squares

of the Y data. PRESSa is the prediction error sum of squares calculated based on the

external validation dataset and a model containing a latent variables:

$$\mathrm{PRESS}_a = \sum_{i=1}^{I} \sum_{h=1}^{H} \left( y_{ih} - \hat{y}_{ih(a)} \right)^2 \qquad (5.5)$$


In equation 5.5, I and H respectively correspond to the number of observations in the

external validation dataset and the number of variables in Y.
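Equations 5.4 and 5.5 can be illustrated with a short sketch. It assumes that SSY is computed on the validation Y data after scaling with the calibration mean and standard deviation, which the text does not state explicitly; the function names and the small arrays are hypothetical.

```python
import numpy as np

def press(Y_val, Y_hat):
    """Prediction error sum of squares over all observations and Y variables (eq. 5.5)."""
    return float(np.sum((Y_val - Y_hat) ** 2))

def q2y(Y_val, Y_hat):
    """Q2Y = 1 - PRESS / SSY (eq. 5.4); SSY taken on the scaled validation Y (assumption)."""
    ss_y = float(np.sum(Y_val ** 2))
    return 1.0 - press(Y_val, Y_hat) / ss_y

# Hypothetical scaled validation data (2 observations, 2 Y variables) and predictions.
Y_val = np.array([[1.0, -1.0], [-1.0, 1.0]])
Y_hat = np.array([[0.5, -1.0], [-1.0, 1.0]])
press_value = press(Y_val, Y_hat)   # 0.25
q2_value = q2y(Y_val, Y_hat)        # 1 - 0.25 / 4 = 0.9375
```

Evaluating q2y for models with 1, 2, ..., a latent variables gives the Q2Y curve used to judge whether an added component is significant.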

The root mean square error of prediction (RMSEP) defined in the following equation is also

used to choose the number of LVs. Usually the RMSEP decreases rapidly and then

stabilizes. The LV at which the RMSEP stops decreasing is an indication that no more

information in X can be used to improve predictions of Y.

$$\mathrm{RMSEP}_{h,a} = \sqrt{\frac{\sum_{i=1}^{I} \left( y_{i} - \hat{y}_{i,a} \right)^2}{n}} \qquad (5.6)$$

The number of latent variables of the multi-block PLS models was selected based on the

Q2Y and RMSEP statistics. It was set to the smallest number that meets one of the following two

criteria: the last LV (i.e. a-1) before the Q2Y increases by less than 0.01 (i.e. 1% of additional

prediction performance) or the first LV where the RMSEP of all Y variables stops decreasing.
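One possible reading of this selection rule is sketched below, together with the RMSEP of equation 5.6. The interpretation of the two stopping criteria, the function names and the example curves are assumptions made for illustration; the denominator n is taken as the number of validation observations.

```python
import numpy as np

def rmsep(y_val, y_hat):
    """Root mean square error of prediction for one Y variable (eq. 5.6)."""
    y_val, y_hat = np.asarray(y_val, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y_val - y_hat) ** 2)))

def select_n_lv(q2_by_lv, rmsep_by_lv, min_gain=0.01):
    """q2_by_lv[a-1]: Q2Y with a LVs; rmsep_by_lv[a-1]: RMSEP of each Y variable with a LVs."""
    q2 = np.asarray(q2_by_lv, float)
    rm = np.asarray(rmsep_by_lv, float)
    n = len(q2)
    # Criterion 1: keep a LVs if adding LV a+1 raises Q2Y by less than min_gain.
    by_q2 = n
    for a in range(1, n):
        if q2[a] - q2[a - 1] < min_gain:
            by_q2 = a
            break
    # Criterion 2: keep a LVs if no Y variable has a lower RMSEP with a+1 LVs.
    by_rmsep = n
    for a in range(1, n):
        if np.all(rm[a] >= rm[a - 1]):
            by_rmsep = a
            break
    return min(by_q2, by_rmsep)

# Hypothetical statistics for models with 1 to 5 LVs and two Y variables.
q2_curve = [0.40, 0.60, 0.70, 0.705, 0.71]
rmsep_curves = [[0.90, 0.80], [0.70, 0.60], [0.50, 0.55], [0.45, 0.56], [0.46, 0.57]]
n_lv = select_n_lv(q2_curve, rmsep_curves)   # Q2Y gain drops below 0.01 after 3 LVs
example_rmsep = rmsep([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

For the sequential methods, the same function would be applied block by block, using the statistics obtained for each regressor block in turn.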

To apply this procedure, both the simulated film blowing datasets and the anode

manufacturing data were split in a calibration and a validation set. The calibration set was

formed by selecting randomly two-thirds of the original dataset. The remaining data was

used as the external validation set. To make sure that both datasets spanned the same

range of variations, a PCA model was built on the calibration data (i.e. the concatenation

of Z, all X’s and Y) and the prediction set was then projected onto the model. The score

plots, residuals and T2 were checked to make sure that both datasets spanned the same

sub-space.
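The splitting and checking procedure can be sketched as follows: a random two-thirds of the rows form the calibration set, a PCA model is fitted on it, and the remaining rows are projected onto that model to obtain scores, Hotelling's T2 and squared residuals. The function name, the number of components, the seed and the random data are assumptions.

```python
import numpy as np

def split_and_check(D, n_comp=2, frac=2 / 3, seed=0):
    """Random calibration/validation split with PCA diagnostics for both sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(D.shape[0])
    n_cal = int(round(frac * D.shape[0]))
    cal, val = D[idx[:n_cal]], D[idx[n_cal:]]
    # Scale both sets with the calibration mean and standard deviation.
    mean, std = cal.mean(axis=0), cal.std(axis=0, ddof=1)
    cal_s, val_s = (cal - mean) / std, (val - mean) / std
    # PCA of the calibration set through the SVD.
    _, s, Vt = np.linalg.svd(cal_s, full_matrices=False)
    P = Vt[:n_comp].T
    score_var = s[:n_comp] ** 2 / (n_cal - 1)

    def diagnostics(Xs):
        T = Xs @ P                                    # scores
        t2 = np.sum(T ** 2 / score_var, axis=1)       # Hotelling's T2
        spe = np.sum((Xs - T @ P.T) ** 2, axis=1)     # squared prediction error
        return T, t2, spe

    return diagnostics(cal_s), diagnostics(val_s)

# Hypothetical concatenated dataset [Z, X, Y] with 60 observations.
rng = np.random.default_rng(6)
D = rng.standard_normal((60, 5))
(T_cal, t2_cal, spe_cal), (T_val, t2_val, spe_val) = split_and_check(D)
```

Comparing the validation scores, T2 and residuals with those of the calibration set indicates whether both sets span the same sub-space.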

Note that the SO-PLS and SMB-PLS algorithms allow for a different number of LVs to be

selected for each regressor block, as opposed to the traditional MB-PLS for which the

number of LVs is the same for all the blocks. Hence, the procedure discussed above for

selecting the number of LVs applies to each regressor block sequentially for SO-PLS and

SMB-PLS. The number of components could also be selected by computing the Q2Y and

RMSEP for all the possible combination of LVs for each block and selecting the optimum

as discussed in (Næs et al. 2011). In this study, the number of components was selected

sequentially for each block for the 3 methods.
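The alternative global search over all combinations of LVs can be sketched generically. The scoring function is supplied by the user (for example the Q2Y of a model fitted with the given number of LVs per block); the function names and the toy scoring function used here are hypothetical and carry no result from this study.

```python
from itertools import product

def global_lv_search(max_lvs, score):
    """Evaluate every combination of LVs per block and return the best one.

    max_lvs: maximum number of LVs for each block, e.g. (10, 3).
    score:   callable taking a tuple of LV counts and returning a value to maximize.
    """
    best_combo, best_score = None, float("-inf")
    for combo in product(*(range(1, m + 1) for m in max_lvs)):
        value = score(combo)
        if value > best_score:
            best_combo, best_score = combo, value
    return best_combo, best_score

def toy_score(combo):
    """Hypothetical score peaking at 4 LVs for block 1 and 2 LVs for block 2."""
    return 1.0 - 0.01 * (combo[0] - 4) ** 2 - 0.05 * (combo[1] - 2) ** 2

best_combo, best_score = global_lv_search((10, 3), toy_score)
```

The number of models to fit grows as the product of the maximum number of LVs per block, which is the main cost of this approach compared with the sequential selection.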


5.4.2 Results for the film blowing example

This section describes the results for the film blowing example. The number of LVs was

first selected for the six multivariate models obtained by estimating the 3 multi-block

models (i.e. MB-PLS, SO-PLS and SMB-PLS) on the datasets of the 2 case studies. The

prediction performances of the models for case 1 are shown in Figure 36 and for case 2 in

Figure 39. The models were then interpreted and the distribution of the information

captured from each block in each latent variable is discussed in order to illustrate the

properties of the algorithms.

Figure 36 shows both Q2Y statistics as well as the RMSEP for FLH and FMDS for the MB-

PLS model built on the case 1 data (no correlation between Z and X). A total of 6 LVs were

selected for this model because the increase in Q2Y is less than 0.01 and the RMSEP does

not decrease for either Y-variable after 6 components.

Figure 36 – Q2Y and RMSEP statistics for selecting the number of components of the MB-

PLS algorithm for case 1 (Z and X are orthogonal)

For both sequential methods, the selection of the number of components is performed

sequentially and the results are presented in separate figures for each block. Figure 37

shows the statistics for the SO-PLS method.

[Figure 36 plot: x-axis LV (1 to 10); y-axis Q2 and RMSEP (0 to 1); series Q²Y, RMSEP FLH and RMSEP FMDS]

Figure 37 – Q2Y and RMSEP statistics for selecting the number of components of the

SO-PLS model for case 1

Both statistics indicate that 4 components should be used for the raw material block (Z).

For the second block (X), 2 latent variables are selected. This choice was based on the

Q2Y statistics which stops increasing at the 2nd LV. The maximum numbers of components

for the Z and X blocks are 10 and 3, respectively, which corresponds to the number of

variables in each block.

Figure 38 presents the statistics for the SMB-PLS model for case 1. The number of LVs for

the Z block is also 4 just as the SO-PLS. The RMSEP for FMDS is low and the Q2Y stops

increasing after 4 components. The number of LVs used for the X block is 3 since it is the

number of variables in the block. In this case, there are at least two phenomena included

in the X block. The first is the variations in Q and the second are the variations of h0 based

on Ta. There are also some random variations due to the simulation included in the

dataset. It is reasonable to assume that this block needs to be modeled by 3 LVs.

[Figure 37 plot: two panels, "Z block" (LV 1 to 10) and "X block after 4 Z LV" (LV 1 to 3); x-axis LV; y-axis Q2 and RMSEP (0 to 1)]

Figure 38 – Q2Y and RMSEP statistics used for selecting the number of components of the

SMB-PLS model for case 1

The selection of the number of components for the second case study (correlated Z and X

blocks) is presented in Figure 39.

Figure 39 – Q2Y and RMSEP statistics used for selecting the number of components for the

MB-PLS algorithm for case 2

As shown in Figure 39, a total of 7 LVs are recommended because the minimum value for

RMSEP is reached for both Y-variables and this also corresponds to the maximum value of

Q2Y.

[Figure 38 plot: two panels, "Z block" (LV 1 to 10) and "X block after 4 Z LV" (LV 1 to 3); x-axis LV; y-axis Q2 and RMSEP (0 to 1)]

[Figure 39 plot: x-axis LV (1 to 10); y-axis Q2 and RMSEP (0 to 1.5)]


Figure 40 – Q2Y and RMSEP statistics used for selecting the number of components of the SO-PLS model for case 2

The statistics for the SO-PLS model are presented in Figure 40. For the Z block, the

number of LV was selected to be 3 mainly based on the Q2Y statistic. For the X block, 3

components were also selected.


Figure 41 – Q2Y and RMSEP statistics used for selecting the number of components of the SMB-PLS model for case 2

[Figure 40 plot: two panels, "Z block" (LV 1 to 10, y-axis 0 to 2.5) and a second panel (LV 1 to 3, y-axis 0 to 2, title not recovered); x-axis LV; y-axis Q2 and RMSEP]


[Figure 41 plot: two panels, "Z block" (LV 1 to 10, y-axis 0 to 2) and a second panel (LV 1 to 3, y-axis 0 to 1.5, title not recovered); x-axis LV; y-axis Q2 and RMSEP]

For the SMB-PLS model (Figure 41), all statistics suggest using 3 components for both

blocks as for SO-PLS.

The film FLH is not very well predicted since its RMSEP is above 1 for all LVs. In

this case, the prediction error of MB-PLS is lower than that of both sequential models. If a

global selection approach (i.e. testing all possible combinations of LVs for both blocks) is used, the

FLH RMSEP is minimal for 8 LVs for the Z block and 3 LVs for the X block, with a value of

1.07 instead of 1.19, compared to 0.47 for MB-PLS.

Figure 42 presents the total Y variance captured for both cases and all three algorithms.

This figure allows comparing the prediction ability of the different algorithms. Since each

model does not capture the same latent variable space due to the orthogonalization

process, the number of LVs is different for each method. The choice of the number of LV

can have an impact on the comparison of the R2 and Q2 between the methods. But the

same criteria were used to ensure a fair comparison.

Figure 42 – Explained Y variance for the three multi-block methods built on the film

blowing datasets: a) case 1 and b) case 2. Z and X block variance explained and total (i.e.

concatenated regressor blocks) variance explained: c) case 1 and d) case 2

[Figure 42 plot: panels a) and b) show Cal R2 Y and Val Q2 Y (y-axis R2 and Q2, 0 to 100) for MB, SMB and SO; panels c) and d) show R2 (0 to 100) for the Total, Z and X blocks, with series MB-PLS, SMB-PLS and SO-PLS]

Figure 42 (a) presents the explained variance for the three models built on the case 1

dataset. The MB-PLS has the highest R2 of all three algorithms. Based on the validation

Q2, the prediction performances are very similar for all the methods.

For the second case study (Figure 42 b), the MB-PLS model performs significantly better

(explained variance almost 30% higher in calibration and 13% in validation) compared with

the two other methods. This is most probably due to the way the numbers of latent

variables for the sequential methods were selected. If a global search is used for SMB-

PLS, the Q2Y is maximized by using 8 and 3 LVs for the Z and X blocks, respectively. In

this case, the Q2Y for SMB-PLS is 70.3% compared to 77.3% for MB-PLS.

Selecting the number of components of latent variable models having different structures

(i.e. MB-PLS vs sequential methods) is not straightforward. Although the same criteria

were applied to all three multi-block methods to ensure a fair comparison, it is clear that in

the simulated film blowing study, the stopping criteria based on Q2Y and RMSEP statistics do not find the optimal number of components for the sequential methods, whereas they do for MB-PLS. However, this situation should not be generalized because all multi-block methods perform equally well on the anode manufacturing dataset, as will be shown

later. The main goal of sequential methods is to improve interpretability of the models and

not prediction ability. A comparative analysis of model interpretability is performed next.

The variance of each regressor block and the total variance (i.e. that of the concatenated Z and X regressor blocks) explained by the multi-block algorithms in both case studies are shown in Figure 42 c) and d). For

the first case study (Figure 42 c) the explained variances by each algorithm are very

similar, as was the case of the Y-variance (Figure 42 a and b). In the case where there is

no correlation between the blocks there is no difference between the algorithms.

However, for case 2 presented in Figure 42 d), the correlation between the two blocks has

an impact on the X-variance explained by the models. In this case, MB-PLS captures a

greater percentage of variance compared with the two sequential methods. For SMB-PLS

and SO-PLS, the results are very similar except for the X block. This is due to the

orthogonalization of the X block using the scores of the Z block. In fact, the variance

removed in the second block (X) by the orthogonalization from the SO-PLS is 1.4% for the

first case and 7.9% for case 2. It is higher in case 2 due to the correlation between the

blocks. It is important to note that the variance removed in X (i.e. falling in the LV space of

Z) did not have a significant impact on the prediction of Y because this information is

redundant with that included in Z. However, SO-PLS does not consider between block

correlations and so the information in X correlated with Z is not available for interpretation.

This explains why MB-PLS and SMB-PLS capture more variance of X compared to SO-

PLS. Although SMB-PLS performs block orthogonalization, the information in X correlated

with Z is still available for interpretation in the block weights and scores of the previous block

(Z in this case). As argued earlier, between block correlated information may be important

for process data analysis and interpretation and should not be removed from the analysis.
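As a hedged illustration of the orthogonalization step discussed above, the following Python sketch removes from a block X the part lying in the space spanned by the Z block scores and computes the fraction of the X sum of squares that is removed. The function names, the synthetic data and the use of NumPy are assumptions made for illustration only; this is not the code used to obtain the 1.4% and 7.9% values reported above.

```python
import numpy as np

def orthogonalize(X, T):
    """Return the part of X that is orthogonal to the columns of the score matrix T."""
    # Least-squares projection of X onto span(T): T (T'T)^-1 T' X
    coef, *_ = np.linalg.lstsq(T, X, rcond=None)
    return X - T @ coef

def fraction_removed(X, T):
    """Fraction of the sum of squares of X removed by the orthogonalization."""
    X_orth = orthogonalize(X, T)
    return 1.0 - np.sum(X_orth ** 2) / np.sum(X ** 2)

# Synthetic example (made-up data): X is partly driven by the Z scores.
rng = np.random.default_rng(0)
T_z = rng.standard_normal((50, 2))             # hypothetical Z block scores
X = 0.5 * T_z @ rng.standard_normal((2, 4)) + rng.standard_normal((50, 4))
X = X - X.mean(axis=0)                          # mean-centre the block

X_orth = orthogonalize(X, T_z)
removed = fraction_removed(X, T_z)
```

The stronger the correlation between X and the Z scores, the larger the fraction removed, which is consistent with the larger value reported for case 2.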

How the information captured from each block is distributed in each latent variable of MB-

PLS and SMB-PLS models is shown in Figure 43 for both simulated case studies. The

relative contribution of each block is used to illustrate this point. It is calculated as the

square of the super weight of a given block in each LV. Since both MB-PLS and SMB-PLS

have a similar hierarchical structure (i.e. block and super levels) and block contributions

can be computed in the same way, only those two algorithms are compared in Figure 43.

For the SO-PLS algorithm, block contributions could be calculated based on the variance of Y explained by each model (i.e. block).
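The relative block contribution described above can be sketched in a few lines of Python. This is a minimal sketch, assuming that the super weights of each LV are scaled to unit length so that the squared weights sum to one; the function name and the numerical values are hypothetical.

```python
import numpy as np

def relative_block_contributions(super_weights):
    """Squared super weights, one row per LV and one column per block."""
    w = np.asarray(super_weights, dtype=float)
    # Assumption: each LV's super weight vector is scaled to unit length,
    # so that the squared weights of all blocks sum to 1 for that LV.
    w = w / np.linalg.norm(w, axis=1, keepdims=True)
    return w ** 2

# Made-up super weights for two LVs and two blocks (Z, X).
example = relative_block_contributions([[0.6, 0.8], [1.0, 0.0]])
```

With these made-up values, the first LV draws 36% of its weight from Z and 64% from X, whereas the second LV uses the Z block only.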

As shown in Figure 43 a) and b), MB-PLS extracts information from all the blocks in every

component and thus distributes the information among all latent variables no matter the

correlation structure between the blocks. Even when the blocks are orthogonal to each

other (Figure 43 a) the LVs capture information from both blocks. However, SMB-PLS

distributes the information differently depending on the between block correlation structure.

For the case where Z and X are uncorrelated (Figure 43 c) the first 4 LVs extract

information from the raw material block and almost nothing from the process block. The

latter is modeled by subsequent LVs. On the other hand, when Z and X are correlated to

some extent (Figure 43 d) the third component (Z-3) clearly captures correlated

information between the two blocks whereas the first two (Z-1 and Z-2) and the last two (X-1 and X-2) focus on the orthogonal information in Z and X, respectively.

Figure 43 – Relative importance of each block by LV for: a) MB-PLS case 1, b) MB-PLS case 2, c) SMB-PLS case 1 and d) SMB-PLS case 2

The loading plot presented in Figure 44 shows that component Z-3 essentially captures

the feedforward control adjustments made on the process to compensate for variations in

some raw material properties (i.e. the source of correlated variations between Z and X).

The loading values show that Cp is strongly negatively correlated with Q because when Cp

increases the production rate Q is reduced to mitigate the impact on FLH. Basically, when

the heat capacity increases, more heat needs to be removed from the molten polymer to

reach solidification temperature (i.e. FLH) at given heat transfer conditions (h0 and Ta).

Decreasing production rate reduces the heat load and therefore attenuates the effect of

Cp. The fact that FLH is positively correlated with Q is because the control adjustments are

not perfect (i.e. effect of Cp is not removed completely).

[Figure 43: bar charts of the relative weights (0 to 1) of the Z and X blocks by LV. Panels a) and b): LVs numbered 1 to 6 and 1 to 7. Panels c) and d): LVs labelled Z-1, Z-2, Z-3, Z-4, X-1, X-2 and Z-1, Z-2, Z-3, X-1, X-2, X-3. Legend: Z, X.]

Figure 44 – Loadings of Z, X and Y blocks in the 3rd SMB-PLS component (Z-3) for case 2.

The loadings of the X and Y blocks in the last two SMB-PLS model components (LV4 and

LV5 or X-1 and X-2) for case 2 are shown in Figure 45. Component LV4 captures the

feedforward adjustments made on convective heat transfer (h0) to attenuate the variations

in film properties introduced by environment temperature Ta (main process disturbance).

When Ta increases, heat transfer rate reduces. Thus h0 needs to be increased, for

instance by increasing cooling air flow rate, to attenuate the impact of the disturbance on

FLH. The last component (LV5 or X-2) models the impact of additional variations in Ta and

Q on film properties (FLH, FMDS). The information captured by the two X block components has nothing to do with variations in raw material properties and is therefore captured in a separate latent variable space. Note that the loadings of the Z block are all zero in these components because the raw material information has already been captured when modelling the Z block (LVs 1-3).

[Figure 44: bar chart of the LV 3 (Z-3) loadings by variable; variable labels: n96, n97, n102, n103, n137, n138, n146, n147, n194, n195, Cp, Rho, Q, Ta, h0, FLH, FMDS; legend: Z, X, Y.]

Figure 45 – Loadings of Z, X and Y blocks in the 4th and 5th SMB-PLS components (X-1 and X-2) for case 2.

This demonstrates the advantage of the method: it distinguishes the between-block correlated information, which is explained by the components of the first blocks, from the new information brought by the subsequent blocks. This is very useful for the interpretation of the models.

5.4.3 Industrial data from the anode manufacturing process

The proposed SMB-PLS algorithm is now applied to the industrial dataset collected from

the anode manufacturing process. The selection of the number of components is

presented first and the predictive ability of the three multi-block algorithms is compared.

Finally, how the information extracted from each block is distributed among the

components is discussed along with model interpretation.

Based on the Q2Y statistic, a total of 5 LVs were selected for the MB-PLS model (Figure 46) because Q2Y increases by less than 1% when a further component is added. According to the

RMSEP, model predictions for a few Y-variables improve slightly beyond 5 LVs but most of

them remain fairly constant. The list of variable names and numbers is given in Table 14.

Since it was decided to stop adding LVs when one of the 2 criteria is met, 5 components

were used for MB-PLS.
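For readers who wish to reproduce this kind of selection plot, the two validation statistics can be computed as in the sketch below. It uses the usual textbook definitions (Q2Y as one minus the ratio of the prediction error sum of squares to the total sum of squares, and RMSEP per Y variable); the function names and the centring convention are assumptions, and the definitions used for the figures may differ in detail.

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean squared error of prediction, one value per Y variable."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean(err ** 2, axis=0))

def q2y(y_true, y_pred):
    """Overall Q2Y = 1 - PRESS / SS_total, computed over all Y variables."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    press = np.sum((y_true - y_pred) ** 2)
    # Assumption: the total sum of squares is taken about the column means
    # of the validation set.
    ss_total = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - press / ss_total

# Made-up validation data with two Y variables.
y_val = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
perfect = q2y(y_val, y_val)                                   # predictions equal to the data
baseline = q2y(y_val, np.tile(y_val.mean(axis=0), (3, 1)))    # predicting the mean
```

Evaluating both functions after each added LV gives curves that can be inspected in the same way as those in Figure 46.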

[Figure 45: bi-plot of the LV 4 (X-1) versus LV 5 (X-2) loadings; variable labels: Q, Ta, h0, FLH, FMDS; legend: Z, X, Y.]

Figure 46 – Selection of the number of LVs for the MB-PLS model computed from the

anode manufacturing dataset: a) Q2Y and b) RMSEP for all Y variables

The statistics used for selecting the number of LVs for both sequential algorithms are

presented in Figure 47 and Figure 48. Since four regressor blocks are involved in this

application instead of two, it was decided to present only the global cumulative Q2Y statistics for all the blocks on the same plot rather than showing them for each individual block. This reduces the number of figures and simplifies the interpretation. Also, the RMSEP statistics are not presented since RMSEP was not the critical criterion in any of the models.

Figure 47 – Selection of the number of LVs for the SO-PLS model computed from the

anode manufacturing dataset

[Figure 46: a) Q2Y and b) RMSEP as functions of the number of LVs (1 to 10). Figure 47: cumulative Q2 as a function of the number of LVs (1 to 10); legend: Q2Y Z, Q2Y X1, Q2Y X2, Q2Y X3.]

The numbers of components selected for each block when building the SO-PLS model are

shown in Figure 47. Note that the curves in the plot should be interpreted sequentially,

using the same order as the one established for the blocks. Thus, the variance of Y

explained by Z (Q2Y_Z) as a function of the number of latent variables must be used first

to determine the number of Z components. The increase in Q2Y_Z when adding the 6th

component is less than 1%. Therefore 5 LVs were selected for the Z block. The explained

variance for the second block Q2Y_X1 is then used to determine the number of

components for the X1 residuals remaining after orthogonalization of X1 with respect to the

5 latent variables (scores) selected for Z. The Q2Y_X1 value after adding one component

for X1 is the cumulative Y-variance explained after using 5 LVs for Z and 1 LV for X1. Since this X1 component increases the cumulative Q2Y by less than 1%, no component was retained for X1. This means that, after orthogonalization, X1 does not contain new information to explain additional Y-variance.

The curves for Q2Y_X2 and Q2Y_X3 should then be interpreted sequentially in a similar

way as for the first two blocks. At the end of the procedure, the numbers of components

selected for the SO-PLS model are 5, 0, 1 and 2 for the Z, X1, X2 and X3 blocks,

respectively.
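The sequential reading of the curves described above can be summarized by the following Python sketch. It is a simplified illustration, assuming a single stopping rule (gain in cumulative Q2Y below 1%, with Q2Y expressed as a fraction); the function `select_sequential`, the callback `q2_curve` and the numerical curves are hypothetical and were chosen only so that the rule returns 5, 0, 1 and 2 components.

```python
def select_sequential(blocks, q2_curve, threshold=0.01):
    """Choose the number of LVs block by block from cumulative Q2Y curves.

    q2_curve(block, chosen) must return the cumulative Q2Y values obtained
    when adding 1, 2, ... LVs for `block`, given the LVs already `chosen`
    for the previous blocks.
    """
    chosen = {}
    q2_prev = 0.0
    for block in blocks:
        n_lv = 0
        for q2 in q2_curve(block, dict(chosen)):
            if q2 - q2_prev < threshold:   # gain below 1%: stop for this block
                break
            n_lv += 1
            q2_prev = q2
        chosen[block] = n_lv
    return chosen

# Made-up curves (not read from Figure 47).
toy_curves = {
    "Z":  [0.15, 0.22, 0.27, 0.30, 0.33, 0.335],
    "X1": [0.334, 0.336],
    "X2": [0.36, 0.365],
    "X3": [0.40, 0.43, 0.435],
}
result = select_sequential(["Z", "X1", "X2", "X3"],
                           lambda block, chosen: toy_curves[block])
```

In a real application the callback would refit the model for each candidate number of components, since the curve of a block depends on the components already retained for the previous blocks.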

Figure 48 – Selection of the number of LVs for the SMB-PLS anode model

[Figure 48: cumulative Q2 as a function of the number of LVs (1 to 10); legend: Q2Y Z, Q2Y X1, Q2Y X2, Q2Y X3.]

The SMB-PLS statistics (Figure 48) are very similar to those of SO-PLS shown in Figure 47. Hence, the numbers of LVs selected for the blocks are also 5, 0, 1 and 2. The variance of each regressor block and of Y explained by the algorithms is presented in Figure 49. The total variance of Y explained by each algorithm on the training and validation datasets is very similar (Figure 49 a). Hence, the sequential algorithms (SMB-PLS in particular) do not lead to any loss of information. The main differences are the extent to which the algorithms use each regressor block and how the information is distributed in the various latent variables.

The variance of each regressor block explained by the algorithms is compared in Figure

49 b). Both sequential algorithms make a slightly greater use of the Xb blocks to model Y

since the total variance explained by these algorithms is about 6% higher compared with

MB-PLS (total variance is the sum of the variance explained of each block). This is most

likely due to the fact that the number of latent variables is different for each block in

sequential algorithms. The Z block contributes the most in all three methods, which is

expected because the anode manufacturing process is strongly driven by raw material

variability. However, the explained variance of Z by MB-PLS is lower compared with

sequential methods because MB-PLS tends to capture information from all the blocks

more evenly due to the existing correlations between the blocks (i.e. Z, X1 and X2).

Sequential methods seem more selective because the effect of Z on subsequent blocks is

captured. This is consistent with observations made on the simulation dataset. The

greatest difference between the sequential methods is how they model the subsequent

blocks. The variance of X1, X2, and X3 explained by SMB-PLS is higher by 18%, 30% and

5%, respectively, mainly because SO-PLS ignores the between block correlated

information as opposed to SMB-PLS which keeps it in the model.

How the information contained in the regressor blocks is distributed in each latent variable

of MB-PLS and SMB-PLS models is shown in Figure 49 c) and d), respectively. Again,

each MB-PLS component captures information from all the blocks in different proportions,

even if X3 is almost uncorrelated with the previous blocks in the sequence (i.e. LVs 3-5

explain variance from all four blocks, including X3). However, the first 5 latent variables of

SMB-PLS (Z-1 to Z-5) concentrate on raw material variations and their impact on

subsequent blocks, X1 and X2 in particular. Anode paste formulation (X1) is typically

adjusted according to changes in raw material properties (i.e. control actions) and, in turn,

raw material properties and formulation affect the mixing and forming process units (X2).

The first 5 LVs capture the correlations between these blocks all originating from variations

in raw material properties. No additional component was found significant for the X1 block

after exhausting information from Z. Hence, all the X1 variations relevant to Y fall in the space of Z (correlated information) and the additional (orthogonal) information left in X1 did not contribute to explaining more variance of Y. The 6th latent variable (X2-1) captures

additional information in the mixing and forming block (orthogonal to raw materials) that

improved Y predictions. Finally, the last two components (X3-1 and X3-2) focus on the

baking block (X3) exclusively. SMB-PLS clearly shows that X3 is nearly orthogonal to the

other blocks because components 1-6 explain almost no variance of X3 and LVs 7-8

extract information from that block only. On the other hand, it would be difficult to draw

similar conclusions using the MB-PLS results since the information captured from all the

regressor blocks is distributed within most components.

Figure 49 – Results obtained with the multi-block algorithms on the anode manufacturing dataset: a) R2Y and Q2Y for all methods, b) overall R2X by block for all methods, relative weights (bars) and block variance explained R2X (lines) by LV for c) MB-PLS and d) SMB-PLS

[Figure 49: panel a) R2 and Q2 (0 to 50) by method (MB, SMB, SO); legend: Cal R2Y, Val Q2Y. Panel b) R2 (0 to 60) by block (Total, Z, X1, X2, X3); legend: MB, SMB, SO. Panels c) and d): relative weights and R2X (0 to 1) by LV (1 to 5 and 1 to 8); legend: Z, X1, X2, X3.]

The enhanced interpretation ability provided by SMB-PLS is now discussed in terms of the relationships identified between the variables included in different data blocks. Loading bi-plots will be used to illustrate the similarities and differences between SMB-PLS and MB-PLS. Comparing the interpretation ability of these two methods is not straightforward because, in general, there may not be a direct correspondence between the latent variables of the two approaches (e.g. LV-1 of MB-PLS and SMB-PLS do not necessarily extract the same information). Nevertheless, after careful analysis of the latent variables in

each model, it was possible to find five pairs of latent variables (i.e. one in each method)

explaining similar variance of the regressor blocks (Z and Xb’s) and Y. These comparisons

are presented in Figures 50-53. Note that SO-PLS is not included in the comparative analysis because it provides an interpretation similar to that of SMB-PLS for the orthogonal

information between the blocks, but no interpretation of the between block correlations is

possible because this information is completely removed from the model.

The loading bi-plot of the first two SMB-PLS components (i.e. Z-1 and Z-2) is shown in

Figure 50. These components were calculated using Z as the first block in the sequence,

which means that they capture the relationships between Z and Y and the variations in the

subsequent blocks (X1, X2 and X3) that are correlated with the block scores of Z. Hence,

the block loadings are calculated for each block in this modelling step (just as MB-PLS),

even for the X1 block (i.e. block with no component in the model) since the variance

contained in X1 that is relevant for Y falls in the latent variable space of Z. Note that some

variable names and the loadings of X3 are not shown in order to declutter the loading bi-

plot. The first two Z components do not explain much variance from the X3 block.

Figure 50 – Bi-plot of the block weights and Y loadings for first two components (Z-1 and

Z-2) of the SMB-PLS model built on the anode manufacturing dataset

[Figure 50: bi-plot of LV1 (Z-1) versus LV2 (Z-2); legend: Z, X1, X2, Y; groups of variables are marked A, B, C and D. Labelled variables include Pitch QI, Pitch %, Butts %, Coarse %, Fines %, the coke and aggregate size fractions, and GAD (green anode density).]

The loading bi-plot essentially reveals how the anode paste formulation (X1) was adjusted

in response to variations in raw material properties due to supplier changes (Z) in order to

achieve the desired green anode density or GAD (Y-variable). The GAD is mainly affected

by the properties of the coke aggregate mix (i.e. composition, size distribution and coke

properties) as well as the amount of pitch used to formulate the paste. The latter is

typically adjusted in so-called pitch optimization experiments which are performed

periodically and also every time the coke and/or pitch supplier changes. Hence, the coke

aggregate mix and pitch properties are the main disturbances affecting GAD whereas the amount of pitch is the manipulated variable used to correct for these disturbances.

The pitch quinoline insoluble or pitch QI (Z) is the main pitch property requiring the amount

of pitch (X1) to be adjusted. When QI increases, the pitch has a reduced wetting capacity

and so this is compensated by adding more pitch to the formulation in a proportional

manner. The relationship between pitch QI and pitch demand is well characterized in the

literature (Hulse 2000). The positive correlation between pitch % and pitch QI caused by

the feedforward control adjustment is clearly shown in Figure 50 (ellipses labelled A).

The aggregate mix composition (i.e. proportions of butts, coarse and fine particles in the

blend) varies significantly at this particular plant and simultaneously with coke supplier

changes due to plant design and operating policies. Since the dimensions of the green

anodes are fixed, fluctuations in coke density occurring when the supplier changes affect the

anode weight and, in turn, the amount of butts returning from the potroom after the anodes

are partially consumed by the reduction reaction. Since there is no inventory for crushed

butts particles at this particular plant, more butts are included in the formulation when the

coke has a higher density. Hence, the amount of butts in the recipe is correlated with coke

supplier changes and this explains why this relationship is extracted in this modeling step

(ellipses labelled B). The loading bi-plot (Figure 50) shows that the amount of butts is

negatively correlated with the amount of coarse and fine coke fractions. Butts particles are

generally coarser and so they replace coarse coke particles in the aggregate mix in order

to obtain the desired size distribution. The role of fine coke particles is to fill the pores of

the coarser coke particles and the voids in the aggregate mix to ensure high anode

density. Since butt particles are less porous than coke particles (Fischer & Perruchoud

1991), fewer fines are required when more butts are added to the mix.

The bi-plot shows that the amount of pitch is adjusted in a positively correlated fashion

with the amount of butts in the mix and in opposite direction with the fines ratio (rectangles

labelled C). These relationships are counterintuitive with respect to process knowledge

and the literature on this subject (Belitskus 1978, p.78; Belitskus 1993). At this point, the

reader is reminded that the model was built on routine production data and no designs of

experiments were performed. Therefore, the reader should not interpret the results based

on known cause and effect relationships between each regressor and response variable

as if the regressor variables were changed according to an orthogonal experimental

design. The amount of pitch used in the formulation is the result of the pitch optimization

experiments (i.e. some form of feedback control) compounding all the changes occurring

in the aggregate mix affecting pitch demand. It is therefore difficult to explain exactly why

the amount of pitch was changed in this way. However, the scatter plot of the amount of

pitch used in the formulation against the proportion of fines in the mix presented in Figure 51 may shed some light on this issue. It is clearly shown that the negative correlation

between these two variables is driven by the sources of raw materials (i.e. coke and pitch

suppliers) which suggests that unmeasured changes in some properties of fine coke

particles (perhaps Blaine number) may have modified the pitch demand. It is also

interesting to note the positive slope between pitch % and fines % for most of the

individual blends, which is consistent with process knowledge. That is, more fines in the formulation require more pitch for a given coke type because of the higher surface area of

the fines to be coated by the pitch. Finally, it is important to point out that this interpretation

could not be obtained from the SO-PLS model because the relationship between pitch %

and fines % (X1) is correlated with the raw materials (Z) and this information would be

removed by the block orthogonalization procedure.
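The difference between the overall correlation and the within-blend correlation discussed above can be illustrated with a small synthetic example. The Python sketch below uses made-up numbers (four hypothetical blends of three anodes each, not plant data) in which the blend-to-blend shifts create a negative overall correlation between pitch % and fines %, although the relationship within each blend is positive.

```python
import numpy as np

# Made-up data: four blends, three anodes each.
offsets = np.array([-0.2, 0.0, 0.2])
fines, pitch, blend = [], [], []
for k in range(4):
    fines.extend(k + offsets)            # fines level shifts with the blend
    pitch.extend(-k + 0.5 * offsets)     # pitch level shifts the opposite way
    blend.extend([k] * 3)
fines, pitch, blend = map(np.array, (fines, pitch, blend))

# Correlation over all anodes, ignoring the blend.
overall_r = np.corrcoef(fines, pitch)[0, 1]

# Correlation computed separately within each blend.
within_r = {
    int(k): np.corrcoef(fines[blend == k], pitch[blend == k])[0, 1]
    for k in np.unique(blend)
}
```

Here `overall_r` is negative while every entry of `within_r` is positive, which is the pattern described for the individual blends.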

Additional interesting information can be extracted from Figure 50. For example, the ellipses labelled D show the relationship between the size distribution of the raw coke and the size distribution of the classified particles. This means that a change in the raw material particle size has an effect on the efficiency of the screening and classification of the

particles in the three coke fractions. An increase in coke particle size increases the amount

of coarse in the coarse fractions (e.g. RT4 and RT8 particles) but lowers the amount of

intermediate size particles (i.e. RT50 and RT100) in the intermediate fractions. Another

example is with the thermal oil temperature in the first mixer (MX1) and the power

consumption of the second mixer (MX2), both variables included in the X2 block. These

variables are correlated with the raw materials block (Z) most likely because of changes in

the paste viscosity caused by fluctuations in raw materials properties and the required

formulation adjustments.

Figure 51 – Amount of pitch used in the formulation as a function of the amount of coke

fines particles for different raw material blends (combinations of coke and pitch suppliers)

The Z and X1 block weights of the first two components (LV1-LV2) of the MB-PLS model

are shown in Figure 52 for comparison. The interpretation of these components is very

similar to that made for the first two components of the SMB-PLS model (Figure 50). Both

models capture raw material and formulation variability and the pitch% and fines% are still

important in both components. That was expected because the process variability is

mainly driven by raw materials and supplier changes in particular. The advantage of SMB-PLS in this case is that it clearly shows that most of the variations in formulation (X1) are

associated with the raw materials (Z) because of the imposed pathway between the

blocks. The remainder of this section will focus on differences between MB-PLS and SMB-

PLS.

[Figure 51: scatter plot of Pitch % (vertical axis) versus Fines % (horizontal axis); legend: Blend 1 to Blend 10.]

Figure 52 – Z and X1 block weights bi-plot for LV1 and LV2 of MB-PLS

The paste plant process data block (X2) contains both correlated and orthogonal

information with respect to Z and X1 as extracted by SMB-PLS. As shown in Figure 49, the

5 LVs computed for the Z block (i.e. LVs Z-1 to Z-5) capture 30.4% of the variance of X2

whereas the 6th LV (i.e. X2-1) explains an additional 6.9% of its variance. It was found that

the information extracted from X2 in LV5 of the MB-PLS model closely matches that of the

6th LV (X2-1) of SMB-PLS. The loading bi-plots are shown in Figure 53 for MB-PLS (a) and

SMB-PLS (b). The SMB-PLS component X2-1 explains additional variance of X2 relevant

for predicting Y but orthogonal to the Z and X1 blocks (i.e. not related with raw material

variations and formulation adjustments). This component captures the relationship

between the green anode height and its density (green anode density or GAD). At the

plant, changes in anode density are compensated by changing the anode weight (i.e.

amount of paste fed in the mold) in order to maintain the anode height as constant as

possible, while the other physical dimensions of the anodes are fixed by the mold area.

This operation strategy aims at reducing variability of the anode dimensions to facilitate

downstream operations. The relationship between anode height and green anode density

(negative correlation) is clearly visible in the loadings of the SMB-PLS model (Figure 53 b).

It also appears in the MB-PLS loadings (Figure 53 a) but several other variables also have

strong loading values because of correlations with variables of other blocks. This makes the interpretation more difficult.

[Figure 52: bi-plot of the LV 1 versus LV 2 block weights for the raw material (coke, butts and pitch properties) and formulation variables; legend: Z, X1.]

Figure 53 – Bi-plots of X2 block weights and Y loadings: a) LV 5 of MB-PLS and b) LV6

(X2-1) of SMB-PLS

The next comparison between MB-PLS and SMB-PLS is based on the last block (X3)

which is almost orthogonal to Z, X1 and X2 as discussed previously. Indeed, the variance

of X3 explained by the first 6 SMB-PLS components (i.e. LVs Z-1 to Z-5 and X2-1) is only

4.6%, but nearly 50% of the variance in that block is captured by the last two components.

It was found that LV4-LV5 from MB-PLS explains similar information as LV7-LV8 from

SMB-PLS, and therefore these pairs of components were selected for the comparison.

The X3 block scores and weights are shown in Figure 54 for both methods. The blue and

red markers represent the two sampling positions in the baking furnace. Since there is a

distribution of final temperature and heat-up rate gradients in the furnace, these positions

were chosen to sample the coldest and hottest anodes in the furnace.

The first observation made from the block weights (Figure 54 b and d), is that the baking

position (pit position) is the most important variable for both models. The position in the

furnace is a very important variable affecting the anode properties (Lauzon-Gauthier 2011)

due to the distribution of heat-up rate and final anode temperature between the anodes.

This is not affected by raw material variability and paste plant operating conditions and it

explains why this block still contains new information (i.e. for the SMB-PLS) after all the

other blocks have been modeled.

The second observation is the improved separation of the 2 classes of anodes based on

the pit position by SMB-PLS. This is due to the sequential orthogonalization of the blocks.

As the correlated information is removed block by block at each new component, only new information not explained by the previous blocks is left to explain the variability in the Y data. This improves the interpretation of the subsequent blocks by removing redundant information. In the MB-PLS model, the LVs associated with most of the X3 block variance also contain information from other blocks that adds variability to the component and degrades the interpretation ability.

[Figure 53: X2 block weights and Y loadings by variable (1 to 30); panel a) LV 5 of MB-PLS and panel b) LV 6 (X2-1) of SMB-PLS; legend: X2, Y. Labelled variables include Agg pre-heater T, Paste T between MX, Paste T after MX2, PP mean ext T, Green anode height, VC bellows P, App dens and L c.]

Figure 54 – Baking block (X3) scores and loadings bi-plot: a) MB-PLS block scores, b) MB-

PLS block weights for LV4-LV5, c) SMB-PLS block scores and d) SMB-PLS block weights

for LV7-LV8 (X3-1 and X3-2). The blue and red markers indicate the anodes baked in the

coldest and hottest positions in the furnace

[Figure 54: panels a) and b): LV4 versus LV5 (MB-PLS); panels c) and d): LV7 (X3-1) versus LV8 (X3-2) (SMB-PLS). Legend of the score plots: Cold, Hot. Legend of the weight plots: X3, Y. Labelled variables include Pit position, the pit and flue temperatures, Real dens, L c, Comp strength, Young's mod and Elect resis.]

In terms of interpretation, both MB-PLS and SMB-PLS capture the correlation between the baking temperature (i.e. pit position) and both the Lc (i.e. crystallinity) and the real density of the anode. A higher final baking temperature increases the crystallinity of the anode micro-structure (Keller & Sulger 2008). However, the SMB-PLS interpretation (Figure 54 d) for the mechanical properties (i.e. compressive strength and Young’s modulus) and the electrical resistivity is much clearer than that of MB-PLS (Figure 54 b). It is known in the literature that the final baking temperature and heat-up gradient influence the mechanical and electrical properties of the anodes (Fischer et al. 1993). SMB-PLS shows a much stronger covariance between these properties than MB-PLS. This is likely because the information captured from X3 by MB-PLS is spread more evenly in all the components of the model, whereas the influence of the baking step is captured mainly by two components in the SMB-PLS model.

The last point of comparison between MB-PLS and SMB-PLS is based on information

mixing. This problem with MB-PLS was investigated by Westerhuis and Smilde

(Westerhuis & Smilde 2001). Basically, they have shown that when the Xb block is deflated

using the super-scores of the MB-PLS model, information from the other blocks is

introduced into the Xb block (hence the name information mixing), which affects the

interpretability of MB-PLS models. Westerhuis and Smilde proposed to deflate the Y data

only as a solution to this problem, but this modification is currently not implemented in

commercial multivariate data analysis software, such as ProMV™ (ProSensus Inc.).

Hence, the traditional MB-PLS super-score deflation approach will be used in the

comparative study.

To demonstrate that SMB-PLS does not suffer from information mixing, a strategy similar to

the one adopted by Westerhuis & Smilde (2001) is used. Basically, the raw material

properties matrix (Z) contains blocks of repeated data because the properties are only

available as weekly averages. Hence, all anodes produced in the same week have the

same raw material data. This also means that all these anodes should have the same Z

block score values (i.e. should overlap perfectly in a score plot) in absence of information

mixing, and will have different values if mixing occurs. The comparison will be performed

using both the super-scores and block scores of MB-PLS and SMB-PLS. First, the super

scores are used to demonstrate that MB-PLS captures information from all blocks at the

same time. Conversely, the block orthogonalization steps of the SMB-PLS algorithm force

the model to capture only the variability related to the block of interest. Second, the

block scores will be used to show the presence or absence of information mixing between

the blocks.
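The repeated-row argument can be checked numerically. The following Python sketch is an illustration only and not the MB-PLS or SMB-PLS implementation used in this work: the data are random, the super score `t_super` is a hypothetical stand-in obtained by simply summing the columns of both blocks, and `within_group_spread` is a helper introduced here. It shows that deflating a block with repeated rows by a score that also carries information from another block makes the rows of a given week differ.

```python
import numpy as np

def within_group_spread(M, groups):
    """Largest within-group standard deviation found in any column of M."""
    return max(float(M[groups == g].std(axis=0).max()) for g in np.unique(groups))

rng = np.random.default_rng(0)
n_weeks, anodes_per_week = 5, 4
weeks = np.repeat(np.arange(n_weeks), anodes_per_week)

# Z: weekly averages, so every anode of a given week shares the same row.
Z = rng.normal(size=(n_weeks, 3))[weeks]
# X: a second block that varies from anode to anode.
X = rng.normal(size=(weeks.size, 4))
Z = Z - Z.mean(axis=0)
X = X - X.mean(axis=0)

# Hypothetical super score mixing both blocks (a stand-in, not a fitted PLS score).
t_super = Z.sum(axis=1) + X.sum(axis=1)

# Deflate Z with the super score, as done in super-score deflation.
p = Z.T @ t_super / (t_super @ t_super)
Z_deflated = Z - np.outer(t_super, p)

spread_before = within_group_spread(Z, weeks)
spread_after = within_group_spread(Z_deflated, weeks)
print(f"within-week spread before deflation: {spread_before:.2e}")
print(f"within-week spread after deflation:  {spread_after:.2e}")
```

Before deflation the within-week spread is zero up to rounding. After deflation it is not, so any block score computed from the deflated Z can no longer overlap perfectly for anodes of the same week.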

Figure 55 – Comparison of the information mixing in MB-PLS and SMB-PLS models: a)

super scores (LV1-LV2) of MB-PLS, b) super scores (LV1-LV2) of SMB-PLS, c) Z scores

(LV2-LV3) of MB-PLS and d) Z scores (LV2-LV3) of SMB-PLS

The super-scores for the first two components of MB-PLS and SMB-PLS are presented in

Figure 55 a) and b). These components in both algorithms were shown to capture the

impact of raw material variations and associated process variations on anode properties

(Figures 48 and 50). Although the general trends in the super-scores are similar for both

algorithms, those of the SMB-PLS models (Figure 55 b) show less variability. This is

because LV1 and LV2 model the impact of Z on subsequent blocks and Y, and are not

corrupted by other types of information contained in the subsequent X blocks. The

observations having the same Z data overlap on top of each other (Figure 55 b), whereas

the super-score values of MB-PLS (Figure 55 a) are clearly different even if some of them

share the same Z data.

To illustrate the mixing of information during the deflation step, Figure 55 c) and d) show

the block scores of the second and third components of MB-PLS and SMB-PLS. These

components were selected since information mixing only appears after the first deflation

step. Once again, both models capture similar information with LV2 and LV3. The information mixing is clearly visible in the block scores of MB-PLS (Figure 55 c), consistent with the study by Westerhuis & Smilde (2001). No sign of information mixing appears with SMB-PLS (Figure 55 d). The small differences in block score values for a few observations are due to the missing data imputation procedure and not to information mixing. Indeed, the iterative PCA approach used to estimate the few missing data could compute different values for anodes produced in the same week.

5.5 Conclusion

This chapter describes an improved multi-block PLS algorithm for the analysis of complex

process data. The interpretation of PLS models (e.g. scores and loadings plots) of

industrial datasets can be difficult due to the large number of variables and the complex

correlation structure between the blocks (e.g. control actions). Fortunately, processes are

often a succession of smaller operations done in a meaningful and structured order and

the variables of the dataset can be blocked accordingly.

Multi-block PLS has several advantages over the normal PLS due to the ability to

scrutinise the model at a super level and a block level. Unfortunately, it does not help to

differentiate between correlated and orthogonal information contained in each component

of the model. Alternatively, the SO-PLS takes advantage of the sequential nature of

process data by removing correlated information in subsequent blocks. However, it does

not provide the possibility to interpret that information.

The objective of this chapter was to develop a new sequential multi-block PLS algorithm

called SMB-PLS to combine the advantages of the MB-PLS and SO-PLS to improve the

interpretation of complex industrial process data. The MB-PLS structure with the super and

block levels (i.e. two levels of scrutiny) was used as the basis of the algorithm. To avoid

misinterpretation of the results due to the correlation between the blocks, the

orthogonalization scheme used in the SO-PLS was incorporated into the MB-PLS

structure. This enables the interpretation of both the correlated and orthogonal information separately

without loss of information. Also, no significant differences were observed in the

computational load between the three multi-block methods.
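The orthogonalization idea can be illustrated in a few lines. The sketch below is a generic least-squares projection and is not the SMB-PLS algorithm itself: `orthogonalize` is a helper name introduced here, and `T` and `X` are random stand-ins for the scores of an earlier block and the data of a later block.

```python
import numpy as np

def orthogonalize(X, T):
    """Return the part of X that is orthogonal to the score columns in T."""
    # Least-squares regression of X on T, then subtraction of the fitted part.
    coef, *_ = np.linalg.lstsq(T, X, rcond=None)
    return X - T @ coef

rng = np.random.default_rng(1)
# Hypothetical scores of an earlier block (30 observations, 2 components).
T = rng.normal(size=(30, 2))
# Later block: a part correlated with T plus new, unrelated variability.
X = T @ rng.normal(size=(2, 5)) + rng.normal(size=(30, 5))

X_orth = orthogonalize(X, T)
# Largest cross-product between the scores and the orthogonalized block.
max_cross = float(np.abs(T.T @ X_orth).max())
print(f"largest |T' X_orth| entry: {max_cross:.2e}")
```

What remains in `X_orth` is, by construction, uncorrelated with the earlier scores. This is why components extracted from it can be read as new information only.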

The performance of the new SMB-PLS was illustrated using two datasets. The first was a

simulated polymer film blowing process and the second was a real industrial dataset from

the carbon anode manufacturing process. The prediction performances of the new

algorithm were found to be similar to those of the MB-PLS and SO-PLS algorithms for both datasets.

However, the SMB-PLS has some limitations. First, it does not improve the predictive

ability compared to a regular PLS. Second, the method to choose the number of latent

variables has not been explored much. A sequential approach has been used in this

thesis, but a global selection could also be possible. Finally, the sequential structure

imposed in SMB-PLS may not be suited for more complex and highly integrated processes

including recycle streams.

The simulated dataset contained two different case studies that were used to illustrate the

pathway orthogonalization properties of the SMB-PLS. The first had no correlation between

the raw material and process data blocks, while the second contained both correlated

and orthogonal variability. Using the SMB-PLS, the correlated variations due to the

feedback control actions were captured by a different set of latent variables than the

orthogonal variability. This was not the case with the MB-PLS, in which both the correlated

and orthogonal information was spread in all components.

The anode manufacturing dataset was used to validate the new algorithm on a real life

dataset. It was found that the information contained in each latent variable was different

than with the MB-PLS algorithm for which it is not possible to differentiate correlated and

orthogonal information. In the SMB-PLS, the subsequent blocks only contain new

information and it was shown that removing the correlated variability in a sequential order

led to better interpretation of models. The raw material block (Z) components showed the

effect of raw material variability on the process variables while subsequent blocks

contained variability that was not due to the raw materials. Finally, it was shown that

there was no information mixing between the blocks when using the SMB-PLS algorithm.

Chapter 6 Paste image texture analysis

This chapter presents the first part of the development of the anode paste machine vision

methodology. It introduces the laboratory work performed on anode paste. Testing of the

image sensor with industrial paste will be presented in Chapter 7.

6.1 Introduction

The need for real-time quantitative measurements of the anode quality has been

discussed in detail in the introduction chapter (section 1.6) of this thesis. The most

important issue is with the increased variability of the anode raw materials (i.e. coke and

pitch). Even if good quality materials are still currently available on the market, these have

a higher cost. In order to reach a compromise between cost and quality, the anode

manufacturers blend materials from different suppliers, including some lower cost/quality

raw materials. Producing anodes with consistent quality attributes therefore requires the

manufacturers to track the properties of incoming coke and pitch materials and those of

the anodes themselves, and adjust the anode formulation accordingly using feedforward and/or

feedback control. However, the key raw material and anode properties are currently

measured in the laboratory using a limited number of samples and the results are typically

available after long time delays. Developing new real-time sensors for tracking the

properties of the materials at different stages of the production chain is therefore important

to help manufacturers mitigate the impact of increasing raw material variability. It was

decided to focus first on developing a sensor for the quality of the anode paste because it

is the material used to form the anode. Using images as a high frequency measurement to

compensate for the lack of real-time measurements of raw material quality and some

operating parameters such as the particle size distribution could enable quick feedback

control actions on formulation with little dead-time.

Anode paste quality is generally defined in terms of the material that yields the desired

baked anode properties. In this work, a high quality paste is defined as one made from the

right combination of coke aggregate size distribution and amount of pitch for a given coke

source such that baked anode density (BAD) is maximized. Changes in aggregate size

and coke properties affect the so-called pitch demand and eventually the baked anode

density if not corrected for by adjusting the amount of pitch in the formulation.

The hypothesis tested in this research was that changes in the coke aggregate mix

affecting pitch demand and the amount of pitch in the formulation modify the paste visual

appearance. Hence, a machine vision sensor could be used in real-time and non-

intrusively to indicate whether the amount of pitch in the formulation should be adjusted.

In spite of an extensive literature review, no applications of machine vision to anode paste

images (nor to any other kind of paste material) were found. Therefore, the review was

broadened to include applications to the anode itself and other similar materials.

Image analysis has been used before to characterize anode quality both qualitatively and

quantitatively. These applications were based on optical microscopy images of carefully

prepared and polished anode samples. Adams et al. (Adams et al. 2002) found that the

optimum binder layer thickness correlated well with the optimum BAD and electrical

resistivity of the anodes. The binder thickness was measured using a combination of

thresholding and dilation techniques to segregate the binder and the coke particles in the

image. Rorvik et al. (Rorvik et al. 2006) used thresholding on images gathered by

polarizing light microscopy. This method could segment the pitch, the coke particles and

the pores. It was also used to quantify the pitch thickness distribution and correlate this

measurement to some anode properties. Finally, Sadler (Sadler 2012) qualitatively

analysed microscopy images of baked anode surfaces. He found that changing the

operating conditions in the paste plant had an effect on the visible micro-structure of the

anodes. This study provides supporting evidence that the visual appearance of the anode

changes with variations in processing conditions.

However, these methods cannot be used for real-time application on anode paste samples

for two reasons. First, these imaging microscopy techniques require sample preparation

and are limited to small sample size. Hence, they are time consuming and may lack

representativeness with respect to the throughput of industrial production lines. Second,

thresholding methods perform well with polarizing light microscopy images because the

contrast is generally large. This is not the case with the low contrast paste images

collected using industrial cameras (no magnification) as shown in Figure 56 where the

macro-texture formed by the various components is all dark.

Figure 56 – Anode paste image

Similarly to anodes, asphalt and concrete are two granular materials that also contain a

certain amount of binder. Thresholding and segmentation techniques have also been

applied to asphalt (Yue & Morin 1996; Bruno et al. 2012) and concrete mixes (Dequiedt et

al. 2001). Geometric measurements were made on the segmented particles to

characterize these granular materials. Dispersion was also measured from the concrete

images. Unfortunately, these image analysis applications are based on core samples of

cured material (i.e. sample preparation required) and not the paste material itself.

Internal research was conducted in the past at the Alcoa Technical Center (Pittsburgh,

USA) in order to investigate the possibility of estimating the amount of pitch in the paste

using images (Adams et al. 2007; Adams et al. 2009). The proposed approach using

statistics computed from the gray level intensity histogram of the images showed

promising results on paste samples prepared in the laboratory. Additional work was

needed to improve robustness with industrial paste and also to verify the sensitivity to

more parameters than the pitch ratio.

Image texture analysis methods seem more appropriate for tracking variations in the paste

related with coke aggregate properties (i.e. pitch demand and size) and the amount of

pitch. For example, changes in the aggregate size distribution should affect the degree of

fineness or coarseness of the paste. In addition, an under-pitched paste should look

rougher, but smoother when it is over-pitched. Texture methods should be sensitive to

these changes in the paste visual appearance because they extract information about the

spatial organization of the pixels within the image (i.e. relationships between the light

intensities of neighbouring pixels or patterns). Furthermore, multi-resolution textural

methods, such as wavelet texture analysis (WTA), have been shown to be more

robust to variations in lighting conditions (Bharati et al. 2004). Finally, image texture

analysis is increasingly used in the process industries for monitoring products for defects,

predicting overall product quality, and for feedback control of product quality as reviewed

by Duchesne et al. (2012). These methods have proven to be useful in several applications that are

relevant for the paste imaging problem studied in this work. Texture methods were applied

to the mineral processing field for the characterization and control of the froth flotation

process (Liu et al. 2005; Liu & MacGregor 2008) and for classification of ore materials

(Tessier et al. 2007). They have been used for the detection and classification of surface

defects in many industrial applications such as manufacturing of steel sheets (Bharati et al.

2004), polymer films (Liu & MacGregor 2005; Gosselin et al. 2009), paper sheets (Reis &

Bauer 2009; Reis & Bauer 2010), artificial stone countertops (Liu & MacGregor 2006),

glass substrates for TFT-LCD screens (Yousefian-Jazi et al. 2014), semi-conductors

(Facco et al. 2009), textile fabrics (Zhang et al. 2007) and pharmaceutical tablets (García-

Muñoz & Carmody 2010). They were also used to measure the degree of homogeneity of

a binary mixture of polymer powders (Gosselin et al. 2008) and for automatic

characterization of nanofibers using SEM (scanning electron microscopy) (Facco et al.

2010).
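To make the idea of multi-resolution texture features concrete, the sketch below computes the detail energy at each level of a discrete wavelet decomposition. It is a minimal illustration under stated assumptions and not the feature set developed in this chapter: a Haar wavelet is used only because it can be written in a few lines of NumPy, the function names `haar_step` and `wavelet_energies` are introduced here, and the two checkerboard test images are synthetic.

```python
import numpy as np

def haar_step(img):
    """One level of an orthonormal 2-D Haar transform (even image dimensions)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    details = ((a + b - c - d) / 2.0,   # variation between adjacent rows
               (a - b + c - d) / 2.0,   # variation between adjacent columns
               (a - b - c + d) / 2.0)   # diagonal variation
    return approx, details

def wavelet_energies(img, levels):
    """Sum of squared detail coefficients at each decomposition level."""
    approx = np.asarray(img, dtype=float)
    energies = []
    for _ in range(levels):
        approx, details = haar_step(approx)
        energies.append(sum(float((d ** 2).sum()) for d in details))
    return energies

# Synthetic textures (assumed for illustration): 1-pixel and 4-pixel checkerboards.
rows, cols = np.indices((16, 16))
fine = (rows + cols) % 2
coarse = (rows // 4 + cols // 4) % 2

fine_energy = wavelet_energies(fine, levels=3)
coarse_energy = wavelet_energies(coarse, levels=3)
print("fine texture:  ", fine_energy)
print("coarse texture:", coarse_energy)
```

The fine checkerboard concentrates its detail energy in the first level and the coarse one in the third. This kind of scale separation is what makes such features sensitive to the fineness or coarseness of a texture.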

The objective of this chapter is to develop a machine vision sensor for anode

paste characterization based on its surface texture (i.e. visual appearance), and to

demonstrate its performance on paste samples formulated in the laboratory. This sensor

should be sensitive to changes in paste formulation and pitch demand. The first is related

with the coke aggregate size distribution and pitch level and the second with the

processing conditions and raw material properties. The sensor should also provide some

indication of what the optimum amount of pitch is for a given coke source. Furthermore,

this study should contribute to a better understanding of the relationships between the

paste surface texture and its macro properties (i.e. formulation and pitch demand) in order

to set the stage for an application to industrial paste samples.

Two texture methods were selected for anode paste characterization, namely the gray

level co-occurrence matrix (GLCM) and the wavelet texture analysis (WTA). The former is

a statistical approach to texture analysis whereas the latter is a transform based textural

method. These two are recognized as state-of-the-art techniques (Bharati et al. 2004) and

were selected because textural patterns in paste images (Figure 56) are stochastic in

nature rather than highly repetitive and clearly identifiable structures. Note that some

preliminary results of using DWT were already published (Lauzon-Gauthier et al. 2014) to

show the potential of using texture analysis on anode paste images. In this publication, the

ability of the sensor to detect the optimum pitch demand was not covered. This chapter

goes well beyond that and completes this work by comparing different wavelet types, and

exploring different combinations of image pre-processing techniques and textural features.

The best combination of methods, wavelet and features for the paste images application

as well as the relationships between the textural features and the paste properties were

determined using three sets of paste samples produced in the laboratory under different

formulations and experimental processing conditions.
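As a concrete picture of the statistical approach, the sketch below builds a gray level co-occurrence matrix for one pixel offset and derives the classical contrast feature from it. It is a simplified illustration rather than the implementation used in this work: the helper names `glcm` and `contrast`, the single horizontal offset, the four gray levels and the two small test images are all assumptions made for the example.

```python
import numpy as np

def glcm(img, levels, dr=0, dc=1):
    """Normalised co-occurrence matrix for pixel pairs offset by (dr, dc), dr, dc >= 0."""
    img = np.asarray(img)
    n_rows, n_cols = img.shape
    ref = img[:n_rows - dr, :n_cols - dc]   # reference pixels
    nbr = img[dr:, dc:]                     # their neighbours at the given offset
    counts = np.zeros((levels, levels))
    # Accumulate one count per (reference level, neighbour level) pair.
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1)
    return counts / counts.sum()

def contrast(P):
    """GLCM contrast: mean squared gray level difference between paired pixels."""
    i, j = np.indices(P.shape)
    return float((P * (i - j) ** 2).sum())

# Synthetic 4-level images: slowly varying versus rapidly alternating intensities.
smooth = np.tile([0, 0, 1, 1], (4, 1))
rough = np.tile([0, 3, 0, 3], (4, 1))

contrast_smooth = contrast(glcm(smooth, levels=4))
contrast_rough = contrast(glcm(rough, levels=4))
print(f"contrast, smooth image: {contrast_smooth:.3f}")
print(f"contrast, rough image:  {contrast_rough:.3f}")
```

The rapidly alternating image yields a much larger contrast than the slowly varying one. A feature of this type would be expected to respond to the roughness of a paste surface in the same way.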

This chapter is organized as follows. First, the experimental details on the fabrication of

the paste samples are given in section 6.2. Then, section 6.3 describes the development

of the paste image texture analysis scheme including the choice of wavelets and textural

features. This is followed by the results section (6.4) where the interpretation of the

features is presented, after which conclusions are drawn.

6.2 Laboratory paste and anode experiments

Three different sets of experiments were performed in the laboratory. The first two were

made using a formulation very similar to those used in the industry, including coke in all

size fractions and butts in order to be as representative of the real paste as possible. The

third used a typical formulation for lab scale anode fabrication in which no butts and coarse

coke particles are used to be less sensitive to variability due to the large particles because of

the small size of the anode samples.

The goals of the laboratory experiments were to find the best combination of methods and

features to capture the visual information of the paste and to understand the relationships

between the features and the properties of the paste.

6.2.1 Preliminary design on paste formulation

This is the first experiment that was performed for the development of the machine vision

algorithm. The goal was to verify that the image texture analysis method was sensitive to

changes in the amount of coke fines in the paste. Since 75% of particles in the fines

fraction are smaller than the camera’s resolution (i.e. 1 pixel = 40.7 µm), they should not

influence the “visible” size distribution in the image. However, the fines have an impact on

the pitch demand of the paste which should influence its surface texture. It is important to

note that the optimum pitch demand of the paste was not determined in this test.

Paste samples were prepared by using five different amounts of fines (i.e. in grams) for

two different amounts of pitch. The other fractions were the coarse and intermediate coke

and the butts. The amount of these fractions was fixed at 125g, 100g and 90g respectively.

Table 15 presents the details of the 10 pastes formulations.
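The percentage columns of Table 15 can be cross-checked from the masses given above. The short Python sketch below does so under two assumptions: the pitch % is the pitch mass divided by the dry aggregate mass, and the fines % is the fines mass divided by the total paste mass (aggregate plus pitch). The second basis is inferred from the tabulated values rather than stated in the text, and the function name `paste_percentages` is introduced here for illustration.

```python
# Fixed masses of the coarse, intermediate and butts fractions (g), from the text.
COARSE_G, INTERMEDIATE_G, BUTTS_G = 125.0, 100.0, 90.0

def paste_percentages(pitch_g, fines_g):
    """Return (pitch %, fines %) for one formulation, rounded to two decimals."""
    dry_aggregate = COARSE_G + INTERMEDIATE_G + BUTTS_G + fines_g
    # Pitch on a dry aggregate basis, i.e. a ratio.
    pitch_pct = 100.0 * pitch_g / dry_aggregate
    # Fines on a total paste basis (assumption inferred from the table values).
    fines_pct = 100.0 * fines_g / (dry_aggregate + pitch_g)
    return round(pitch_pct, 2), round(fines_pct, 2)

for pitch_g in (55.0, 65.0):
    for fines_g in (50.0, 60.0, 70.0, 80.0, 90.0):
        pitch_pct, fines_pct = paste_percentages(pitch_g, fines_g)
        print(f"pitch {pitch_g:.1f} g, fines {fines_g:.1f} g -> "
              f"pitch {pitch_pct:.2f} %, fines {fines_pct:.2f} %")
```

Because the fines mass changes the total mass, the pitch % decreases as fines are added even though the pitch mass itself is held constant within each group.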

Table 15 – Formulations used in the first series of experiments aiming at varying the amounts of coke fines and pitch in the paste.

Experiment   Number of replicates   Pitch (g)   Pitch (%)   Fines (g)   Fines (%)
P-L_F-1      1                      55.0        15.07       50.0        11.90
P-L_F-2      1                      55.0        14.67       60.0        13.95
P-L_F-3      1                      55.0        14.29       70.0        15.91
P-L_F-4      1                      55.0        13.92       80.0        17.78
P-L_F-5      1                      55.0        13.58       90.0        19.57
P-H_F-1      3                      65.0        17.81       50.0        11.63
P-H_F-2      3                      65.0        17.33       60.0        13.64
P-H_F-3      3                      65.0        16.88       70.0        15.56
P-H_F-4      3                      65.0        16.46       80.0        17.39
P-H_F-5      3                      65.0        16.05       90.0        19.15

Since the total amount of paste was different for each paste formulation, the percentage (i.e. relative mass) of each fraction is different for each experiment. The pitch % is computed on a dry aggregate basis and is therefore a ratio.

6.2.2 Detailed design on paste formulation

The aim of the second set of experiments performed in the laboratory was again to assess the sensitivity of the paste image texture analysis but for a wider range of variations in the paste formulation. The ratios of the various constituents, the size distribution of the dry aggregate mix, the fineness of the coke fines fraction as well as some mixing parameters were modified in order to introduce changes in pitch demand, size of the aggregate mix, and ratio of pitch to dry aggregate. These variations should influence the visual appearance of the paste samples. More details on how each parameter is expected to change the paste appearance are provided later in this section. As for the first experiment, the optimum pitch demand for each set of properties was not determined experimentally.

The tested parameters are listed in Table 16. To preserve confidentiality, the formulation of

the base mix (i.e. nominal paste formulation serving as a reference) is not revealed.

Therefore, the various changes made in this series of experiments are reported as

percentage of deviation with respect to the base mix, except for the nominal mixing time

and temperature which were 10 min and 178°C, respectively. The percentages for the

coarse, intermediate, fines and butts fractions are reported on a dry aggregate basis, but

the pitch % is based on the paste weight. Finally, the weight of each paste sample was

kept fixed at 450 g. Hence, the weights of each fraction were adjusted accordingly.

In this second series of experiments, 23 different paste formulations were prepared and

some of them were replicated for a total of 32 batches. Two images were acquired for

each paste sample for a total of 64 images.

The overall visual appearance of the paste in this experiment is described as wet or dry.

Since the optimum pitch demand is not known for any of the prepared samples, it is

described in relative terms. For a fixed pitch/dry aggregate ratio, a change in a given

parameter that caused the paste to look dryer was considered to increase the pitch

demand and vice-versa for a wet paste. If more pitch is left on the surface of the particles

for the same amount of pitch in the paste (wetter paste), it means that the aggregate mix

requires less pitch (i.e. lower pitch demand). The expected effects of each of the

parameters included in the experimental design on the paste visual appearance are

described below.

Table 16 – Changes in the paste formulation tested in the second set of experiments

Description                                  Number of replicates   Changes from base mix
Base mix                                     4                      ---
Decreased butts ratio                        2                      -10 %
Increased butts ratio                        2                      +10 %
Different Blaine number (fines fraction)     2                      BN 2300 cm²/g
Decreased fines ratio in the aggregate mix   1                      -4 %
Decreased fines ratio in the aggregate mix   1                      -2 %
Increased fines ratio in the aggregate mix   1                      +2 %
Increased fines ratio in the aggregate mix   1                      +4 %
Decreased pitch ratio in the paste           2                      -1.4 %
Increased pitch ratio in the paste           2                      +1.6 %
Decreased coarse and intermediate frac.      1                      Coarse -12.5 %, Inter -6 %
Decreased intermediate frac.                 1                      Inter -11 %
Increased coarse and intermediate frac.      1                      Coarse +7.5 %, Inter +4 %
Increased intermediate frac.                 1                      Inter +9 %
Substitution of coarse frac. by shot coke    1                      20 %
Decreased mixing temperature                 1                      158 °C
Increased mixing temperature                 1                      188 °C
Decreased mixing time                        1                      -5 min
Increased mixing time                        1                      +5 min

Five different parameters were manipulated from the base mix to influence the pitch demand of the paste. First, a ±10% change was made to the butts ratio. The change in the dry aggregate weight was compensated by adding/removing coke from the coarse fraction (as in plant operating practice) since the butts fraction is mainly composed of large particles. Crushed anode butt particles are less porous than fresh coke. Thus, less pitch is required to wet them properly. A higher amount of butts fraction decreases the pitch demand and should lead to a wetter paste (for the same amount of pitch).

The Blaine number (BN) (i.e. fineness) of the fines was also varied. Different coke fines with specific BN were prepared in the laboratory and used in substitution of the industrial fines fraction. The BN of the samples was measured using a Malvern laser diffraction particle size analyzer. Ball mill fines with 2300, 4000 and 6000 BN were used. The BN of the industrial fines was within the range of the laboratory fines. The industrial fines also differ from the lab fines because they contain very fine dust particles collected by the fume treatment systems at the plant. Hence, the particle size distribution of the industrial fines is not entirely similar to the particle size distribution of ball mill fines even if they have the same Blaine number. Finer fine particles (i.e. higher BN) should require more pitch because of their higher specific surface area. The paste should therefore look dryer when using higher BN fines and wetter when using lower BN fines if the amount of pitch remains unchanged.

Changes made to the fines ratio were compensated by adding/removing coarse and

intermediate coke particles and butts particles in equal amounts, as in the first

series of experiments. The tested fines ratios were ±2 and ±4%. The fine coke particles

have a higher surface area than the coarser fractions and therefore require more pitch. A

finer formulation increases the pitch demand of the paste and the images should look

dryer.

In some formulations, a proportion of the coarse coke fraction was replaced with shot coke

(e.g. 20, 40 and 60% of the coarse fraction). Shot coke has inferior mechanical properties,

lower thermal shock resistance and a lower level of open porosity for pitch penetration. It is

therefore not used in industrial anode formulations (Edwards et al. 2009). However, it was

used in this study to generate additional variations in pitch demand. Because shot coke is

typically less porous, the pitch demand should decrease with the addition of shot coke and

the paste should look wetter for the same amount of pitch. This effect is similar to the

addition of butts to the formulation.

The mixing conditions do not introduce variations in pitch demand per se, but affect the

pitch penetration in the pores of the particles and are likely to alter the paste visual

appearance in a similar way as when a change in pitch demand occurs. However, too long

mixing times can be detrimental to the paste and increase the pitch demand (Hulse 2000).

Over-mixing can break some of the coke particles and create more surface area that needs

to be wetted by the pitch. Mixing time was changed by ±5 minutes. Mixing temperature

was varied from 158 to 188 °C. When it is too low, the pitch is more viscous and should

not penetrate as much in the pores of the particles, leading to a wetter paste. This may be

detrimental to all anode properties. More detailed explanations of the effects of the

formulation and processing conditions on anode properties are available in (Belitskus

1978; Belitskus 1981; Belitskus 1993; Belitskus 2013; Belitskus & Danka 1988; McHenry

et al. 1998; Hulse 2000).

In order to deliberately change the wetness of the paste, the amount of pitch was also

changed (i.e. -1.4% and +1.6%). For a constant dry aggregate formulation, there is a

direct relationship between the amount of pitch in the paste and its degree of wetness.

Finally, the dry aggregate size distribution was manipulated by either substituting the

coarse and intermediate fractions, or the intermediate fraction only, by fine coke particles

(e.g. +7.5% coarse, +4% inter and -11.5% fines, or -11% inter and +11% fines). Although

changes in the aggregate size distribution may also modify pitch demand due to changes

in overall porosity of the aggregate mix, it is expected that these variations will have a

different impact on paste visual appearance (i.e. its texture) compared with that of a

change in pitch demand. Hence, the two types of disturbances should be distinguishable

by the machine vision system.

6.2.3 Pitch optimization experiments

The last set of laboratory experiments was performed in order to assess the possibility of

quantitatively detecting the optimum pitch demand of the aggregate mix (i.e. optimal

amount of pitch to use in the paste formulation) using the machine vision approach. The

optimum pitch demand (OPD) is defined as the amount of pitch required to obtain the

maximum apparent baked anode density (BAD) on laboratory scale anodes.

To find the OPD, it is necessary to perform a pitch optimization experiment, a current

practice in the industry. It consists of changing the pitch ratio over a broad range of levels

for a given dry aggregate mix and fixed processing conditions, and measuring the BAD at

each pitch level. This requires collecting a sample from each paste formulation, forming it

as a laboratory scale anode, and baking it to obtain its BAD value. Due to the small

dimensions of the lab scale anodes, the size distribution of the particles was limited to a

maximum size of approximately 4 mm. Hence, the formulation of the aggregate mix is

different compared to the first two series of experiments (no butts and less of the coarse

coke fraction). The laboratory formulation presented in section 4.2.2 was used for this

experiment. Two different cokes were used (see section 4.2.2) to test the machine vision

system on materials having different properties. Coke A has a higher VBD than coke B.

Thus, the density of the anodes made with coke A will be higher than those made with

coke B. The range of pitch used for each coke and the number of replicated samples

produced are presented in Table 17.
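The pitch optimization procedure described above amounts to averaging the replicate BAD values at each pitch level and retaining the level with the highest mean. The Python sketch below illustrates that step only. The function name `optimum_pitch_demand` is introduced here, and the measurement values are hypothetical numbers chosen for the example, not data from this work.

```python
from collections import defaultdict
from statistics import mean

def optimum_pitch_demand(measurements):
    """Return (pitch level, mean BAD) for the level with the highest mean BAD."""
    by_level = defaultdict(list)
    for pitch_pct, bad in measurements:
        by_level[pitch_pct].append(bad)
    # Average the replicates at each pitch level before looking for the maximum.
    means = {level: mean(values) for level, values in by_level.items()}
    best = max(means, key=means.get)
    return best, means[best]

# Hypothetical replicate measurements (pitch %, BAD in g/cm3); not data from this work.
example = [(18.0, 1.540), (18.0, 1.546), (19.0, 1.560), (19.0, 1.556),
           (20.0, 1.570), (20.0, 1.574), (21.0, 1.565), (21.0, 1.561)]

opd, bad_max = optimum_pitch_demand(example)
print(f"OPD = {opd:.1f} % pitch, mean BAD = {bad_max:.3f} g/cm3")
```

Averaging before taking the maximum matters because the replicate scatter can be comparable to the BAD difference between neighbouring pitch levels.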

Table 17 – List of experiments for the laboratory pitch optimization

| Coke A: Pitch % | Coke A: Number of samples | Coke B: Pitch % | Coke B: Number of samples |
|---|---|---|---|
| 15.0 | 3 | 15.2 | 2 |
| 16.0 | 2 | 15.7 | 3 |
| 17.0 | 2 | 16.2 | 4 |
| 17.5 | 3 | 16.7 | 3 |
| 18.0 | 3 | 17.2 | 3 |
| 18.5 | 3 | 18.2 | 3 |
| 19.0 | 3 | 19.2 | 2 |
| 20.0 | 3 | 20.0 | 2 |
| 21.0 | 2 | 21.0 | 2 |
| 22.0 | 2 | 22.0 | 2 |
| 23.0 | 2 | 23.0 | 1 |
| 24.0 | 2 | 24.0 | 2 |
|  |  | 25.0 | 2 |
|  |  | 26.0 | 2 |

The results of the pitch optimization procedure are presented in Figure 57. In this figure, the solid lines represent the baked anode density or BAD (left axis) and the dashed lines correspond to the green anode density or GAD (right axis). The mean value of all the replicate samples at each pitch level is plotted in the figure rather than the individual values. The ± 1 standard deviation error bars are also plotted for the BAD.

Figure 57 – Baked and green anode density (BAD and GAD) for the pitch optimization anodes using cokes from two different sources (A and B)

[Figure 57 plot. Left axis: BAD; right axis: GAD; horizontal axis: Pitch % (ratio), from 14.0 to 26.0. Series: BAD coke A, BAD coke B, GAD coke A, GAD coke B.]

It is shown in Figure 57 that the OPD was found for both types of cokes. Coke A has an

optimum BAD of 1.578 g/cm3 at 20% of pitch ratio (16.67% in percentage) and coke B has

a maximum BAD of 1.551 g/cm3 at 21% of pitch ratio (17.36% in percentage). As

expected, coke A has a higher BAD and a lower optimum pitch demand than coke B due

to its higher VBD (Table 9). The average and maximum standard deviations of the

replicates for the BAD measurements were 0.005 g/cm3 and 0.012 g/cm3, respectively. This

is consistent with the variability of the laboratory anodes fabrication from previous work

performed on the same experimental set-up (Azari Dorcheh 2013). Finally, the GAD shows

no optimum. Adding more pitch will always increase the green density of the anode and

this is the reason why it is not a good indicator of the anode quality.
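The two ways of expressing the pitch content quoted above appear to be related as follows: the ratio is the mass of pitch per 100 parts of dry aggregate, whereas the percentage is the mass of pitch per 100 parts of paste. This relation is inferred from the two pairs of values given in the text rather than from a stated definition, and the function name is hypothetical.

```python
def pitch_ratio_to_percent(ratio_percent):
    """Convert pitch per 100 parts of dry aggregate into pitch per 100 parts of paste."""
    return 100.0 * ratio_percent / (100.0 + ratio_percent)


print(round(pitch_ratio_to_percent(20.0), 2))  # coke A optimum: 16.67
print(round(pitch_ratio_to_percent(21.0), 2))  # coke B optimum: 17.36
```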

The OPD of the formulations used in this series of experiments was reached at a higher

pitch ratio than what is typically observed in the industry for a pitch of 16.5% QI. However,

this is explained by the fact that the laboratory formulation has no butts and no very coarse

coke particles so it is expected to have a higher pitch demand.

6.3 Selection of preprocessing operations and image textural

features

The aim of this section is to determine the optimal combination of image

preprocessing operations to apply to paste images and textural features to extract from

them in order to maximise the performance of the machine vision sensor. The latter is

defined in terms of its capacity to detect changes in the paste (e.g. pitch demand and

aggregate size) that ultimately will affect the baked anode properties. A sequential

approach is adopted to arrive at the best results. The impact of applying contrast

enhancement to the images is tested first using the Discrete Wavelet Transform (DWT) as

the initial texture method and the energy computed on detail sub-images as textural

descriptors (features). This choice was motivated by the fact that the DWT-energy

combination is commonly used in texture analysis (Bharati et al. 2004; Liu et al. 2005;

Duchesne et al. 2012) and preliminary results published by the author using this approach

showed promise (Lauzon-Gauthier et al. 2014). In the second step, the effect of selecting

different types of mother wavelets on the imaging sensor performance is investigated.

Finally, additional textural features are added in different combinations to determine the

best set of descriptors to use for anode paste image analysis. Note that the Gray Level

Co-occurrence Matrix (GLCM) is also considered at this stage as an alternative texture

method to DWT.


The dataset and criteria used to perform the comparison are presented first, followed by

the assessment of image preprocessing, type of mother wavelet and textural feature

extraction.

6.3.1 Dataset and criteria used for the comparative analysis

The dataset collected during the pitch optimization experiments (section 6.2.3) is used to

compare the various alternatives. Indeed, the baked anode density (i.e. Y data) was

measured in this series of experiments which allows for the estimation of PLS regression

models between the image textural features and the paste quality attribute. The prediction

performances as well as the interpretation offered by these models enable a quantitative

comparison between the different ways of preprocessing and extracting the information

from the paste images.

Baked anode density is a complex function of raw material properties (e.g. coke density,

pitch demand, size distribution and particle shape), formulation and process operation

(mixing, forming and baking). At the formulation stage, the main source of variations to

deal with is that caused by the coke properties due to frequent supplier changes. The goal

of the formulator is to achieve the maximum BAD for a given coke source rather than

obtaining a fixed target value because the optimal BAD is expected to change with coke

supply as shown in Figure 57. Thus, to assist the operators, the imaging sensor should

ideally be able to indicate whether the actual anode paste should yield the optimum BAD

(if subsequent process operations are performed adequately) or if the current coke blend

is under or over pitched. This raises the following question: does the paste visual

appearance (i.e. surface texture) look similar when the optimal pitch content is reached for

different coke sources? If that is indeed the case, one could establish a simple multivariate

statistical process control scheme on paste textural features to ensure that these fall within

the desired (optimal) region and adjust the pitch ratio accordingly when the paste texture

drifts outside this region. In order to address this question while searching for the best

combination of image preprocessing and analysis techniques, the BAD values for each

anode were transformed into deviation from the optimum density (∆BAD) using equation

6.1. This transformation was applied separately to each of the two cokes and pitch level.

$$\Delta \mathrm{BAD}_{\mathrm{pitch\%}} = \mathrm{BAD}_{\mathrm{pitch\%}} - \mathrm{BAD}_{\max} \qquad (6.1)$$


In equation 6.1 BADmax is the maximum BAD obtained for the full range of pitch level for a

given coke and BAD is the actual BAD at a given pitch level for the same coke. Using the

∆BAD measurement shown in Figure 58, it is possible to quantify the distance to the

optimum pitch demand for each paste sample and each type of coke. Of course, this

means that the optimum BAD values for different cokes are known at the time of the

calibration of the imaging sensor, but not when used for on-line monitoring. In the latter

case, the PLS model would predict the deviation from optimal BAD (if desired), or one

could simply look at the PLS scores to verify that they fall within the optimal region.
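The transformation into a deviation from the optimum density can be sketched as follows. This is a minimal illustration, assuming the sign is taken so that ∆BAD is zero at the optimum and negative elsewhere, as the values plotted in Figure 58 suggest. The function name `delta_bad` is hypothetical. Only the two optimum densities (1.578 and 1.551 g/cm3) come from the text; all other values are invented for the example.

```python
def delta_bad(bad_by_coke):
    """Deviation from the optimum density for every coke and pitch level.

    bad_by_coke maps a coke name to {pitch level: mean BAD}. The result has the
    same layout and holds BAD - BAD_max, computed separately for each coke.
    """
    result = {}
    for coke, bad_by_pitch in bad_by_coke.items():
        bad_max = max(bad_by_pitch.values())
        result[coke] = {pitch: bad - bad_max for pitch, bad in bad_by_pitch.items()}
    return result


# Optimum values from the text; the other densities are hypothetical.
bad = {
    "A": {18.0: 1.560, 20.0: 1.578, 22.0: 1.570},
    "B": {19.2: 1.540, 21.0: 1.551, 23.0: 1.545},
}
dbad = delta_bad(bad)
print(dbad["A"])
```

Computing the maximum separately for each coke is what allows pastes made from different cokes to share a common target (∆BAD = 0) even though their optimum densities differ.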

Figure 58 – ∆BAD of the lab formulation anodes

The PLS regression problems are formulated as follows. The image features calculated for

each case are collected in the regressor matrix X (N×K) where N and K correspond to the

number of paste samples in the dataset and number of features extracted from the

images. The response matrix Y (N×1) contains the ∆BAD measured for each paste sample

after forming and baking for each pitch level and each coke. Note that the data collected

for replicated samples (features of the paste images and ∆BAD values) were averaged

and then stored in the X and Y matrices.
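The averaging of replicates before filling the X and Y matrices can be sketched as below. This is an illustrative helper only: the name `average_replicates`, the tuple layout and the numbers are assumptions made for the example.

```python
from collections import defaultdict


def average_replicates(samples):
    """Average feature vectors and responses over replicates.

    samples is a list of (key, features, response) tuples, where key identifies
    the formulation, e.g. ("A", 20.0). Returns (keys, X, Y) with one row per key.
    """
    groups = defaultdict(list)
    for key, features, response in samples:
        groups[key].append((features, response))

    keys, X, Y = [], [], []
    for key in sorted(groups):
        rows = groups[key]
        n = len(rows)
        n_features = len(rows[0][0])
        X.append([sum(f[k] for f, _ in rows) / n for k in range(n_features)])
        Y.append([sum(r for _, r in rows) / n])
        keys.append(key)
    return keys, X, Y


# Hypothetical features and ∆BAD values for two formulations.
samples = [
    (("A", 20.0), [1.0, 4.0], 0.000),
    (("A", 20.0), [3.0, 6.0], -0.002),
    (("B", 21.0), [2.0, 2.0], 0.000),
]
keys, X, Y = average_replicates(samples)
print(keys, X, Y)
```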

6.3.2 Choice of preprocessing

Three initial preprocessing steps were performed by default on each image. First the RGB

image is transformed into a grayscale image to obtain a univariate image. Then low pass

Gaussian filtering is applied to remove some camera CCD noise. Finally, the image is

cropped to exclude irrelevant data around the borders showing the aluminum container,

etc.

[Figure 58 plot: ∆BAD (from -0.07 to 0) versus pitch % (ratio) (from 14 to 26) for Coke A and Coke B.]

Applying contrast enhancement to the images is considered a preprocessing option to be tested in this section. Contrast stretching with saturation was chosen. It was performed using the imadjust function (MATLAB™), which rescales the image gray levels so that 1% of the pixels saturate at the minimum intensity and 1% at the maximum intensity of the image. When applied independently to each image, it can also compensate for image-to-image lighting variations, adding robustness to the machine vision sensor. A downside of using contrast enhancement is that the vision system may not be able to detect drifts (i.e. degradation) in the lighting system or the imaging device in long-term applications. This problem can be addressed by monitoring the adjustments made by the contrast adjustment algorithm over time in order to detect any drift. The usefulness of the contrast enhancement was verified by comparing the performance of PLS models of the features before and after adding this preprocessing step.
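The idea behind this contrast adjustment can be sketched in plain Python. This is not the MATLAB implementation: it is a simplified percentile-based stretch written for illustration, and the function name `stretch_contrast` and the test image are hypothetical. The function also returns the two limits it used, since tracking them over time is one way of monitoring lighting or camera drift as suggested above.

```python
def stretch_contrast(image, saturation=0.01):
    """Linearly rescale gray levels to [0, 1], saturating a fraction of pixels at each end.

    image is a list of rows of gray levels. Returns (stretched image, (low, high)),
    where low and high are the input levels mapped to 0 and 1.
    """
    flat = sorted(v for row in image for v in row)
    n = len(flat)
    low = flat[int(saturation * (n - 1))]
    high = flat[int((1.0 - saturation) * (n - 1))]
    if high == low:  # flat image: nothing to stretch
        return [[0.0 for _ in row] for row in image], (low, high)
    stretched = [
        [min(1.0, max(0.0, (v - low) / (high - low))) for v in row]
        for row in image
    ]
    return stretched, (low, high)


# Hypothetical dark 10 x 10 image using only gray levels 10 to 59.
image = [[10 + (r * 10 + c) % 50 for c in range(10)] for r in range(10)]
stretched, limits = stretch_contrast(image)
print(limits)
```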

Two PLS models were computed using the energy of the DWT detail coefficients in X

(26×7) and ∆BAD in Y (26×1), with and without applying contrast enhancement. Note that

7 wavelet decomposition levels were calculated and the detail sub-images in all three

directions were averaged prior to computing the energy. The samples were split in 7

consecutive subsets for the cross-validation procedure (section 2.4). The number of

components was selected based on the smallest root mean squared error in prediction by

cross-validation (RMSEPCV) of the ∆BAD using equation 2.16.
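The construction of the energy features can be illustrated with a small pure-Python sketch. Two simplifications are made and should be read as assumptions: the orthonormal Haar wavelet is used instead of the symlet 4 so that the example stays self-contained, and the energy is taken as the plain sum of squared coefficients (the definition actually used is given in section 3.3). All function names are hypothetical, and the image sides must be divisible by 2 at every level.

```python
import math


def haar_step_1d(values):
    """One orthonormal Haar step: returns the (approximation, detail) halves."""
    s = math.sqrt(2.0)
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def split_columns(matrix):
    """Apply one Haar step along every column; returns (low, high) matrices."""
    low_cols, high_cols = [], []
    for col in transpose(matrix):
        a, d = haar_step_1d(col)
        low_cols.append(a)
        high_cols.append(d)
    return transpose(low_cols), transpose(high_cols)


def haar_step_2d(image):
    """One 2-D level: returns the approximation and the three detail sub-images."""
    low_rows, high_rows = [], []
    for row in image:
        a, d = haar_step_1d(row)
        low_rows.append(a)
        high_rows.append(d)
    ll, lh = split_columns(low_rows)
    hl, hh = split_columns(high_rows)
    return ll, (lh, hl, hh)


def energy(matrix):
    return sum(v * v for row in matrix for v in row)


def detail_energies(image, levels):
    """Energy of the direction-averaged detail sub-image at each level."""
    features, approx = [], image
    for _ in range(levels):
        approx, (d1, d2, d3) = haar_step_2d(approx)
        averaged = [
            [(a + b + c) / 3.0 for a, b, c in zip(r1, r2, r3)]
            for r1, r2, r3 in zip(d1, d2, d3)
        ]
        features.append(energy(averaged))
    return features


image = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
print(detail_energies(image, 2))
```

With 7 decomposition levels this yields the 7 columns of X used for the two models compared here.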

For interpretation, the variance explained (R2) for X and Y as well as the cross-validated Q2 (equation 2.15) and RMSEPCV for Y are compared in Table 18. The RMSEPCV is an indication of the model prediction error and it can be compared with the standard deviation of the ∆BAD measurement error obtained from replicate samples, which is 0.019 g/cm3 in this case. The RMSEPCV values are close to the measurement error, which suggests that the models are adequate. Additionally, the score plots of the PLS models are shown in Figure 59 for interpretation purposes.
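The cross-validation statistics can be sketched as follows. Equations 2.15 and 2.16 are not reproduced in this section, so the usual definitions are assumed here: RMSEPCV is the square root of the mean squared prediction error over the held-out samples, and Q2 is one minus the ratio of the prediction error sum of squares to the total sum of squares of Y. A one-variable least-squares line stands in for the PLS model to keep the example self-contained, and the way the 26 samples are distributed over the 7 consecutive subsets is also an assumption.

```python
import math


def consecutive_folds(n_samples, n_folds):
    """Split sample indices 0..n-1 into consecutive, nearly equal subsets."""
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_line(x, y):
    """Least-squares slope and intercept (placeholder for the PLS model)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def cross_validate(x, y, n_folds):
    """Return (RMSEP_CV, Q2) from predictions made on held-out subsets."""
    predictions = [0.0] * len(y)
    for held_out in consecutive_folds(len(y), n_folds):
        held = set(held_out)
        train_x = [x[i] for i in range(len(y)) if i not in held]
        train_y = [y[i] for i in range(len(y)) if i not in held]
        slope, intercept = fit_line(train_x, train_y)
        for i in held_out:
            predictions[i] = slope * x[i] + intercept
    press = sum((a - b) ** 2 for a, b in zip(y, predictions))
    y_mean = sum(y) / len(y)
    ss_total = sum((a - y_mean) ** 2 for a in y)
    return math.sqrt(press / len(y)), 1.0 - press / ss_total


# Synthetic, perfectly linear data: the error should vanish and Q2 approach 1.
x = [float(i) for i in range(26)]
y = [2.0 * v + 1.0 for v in x]
rmsep_cv, q2 = cross_validate(x, y, 7)
print([len(f) for f in consecutive_folds(26, 7)])
```

In practice the procedure is repeated for an increasing number of latent variables and the number giving the smallest RMSEPCV is retained.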

Table 18 – Impact of adding contrast enhancement on PLS model statistics

| Preprocessing | Number of LV | R2X (%) | R2Y (%) | R2Y LV1-2 (%) | CV Q2Y (%) | ∆BAD RMSEPCV (g/cm3) |
|---|---|---|---|---|---|---|
| No preprocessing | 4 | 99.69 | 72.77 | 33.92 | 36.69 | 0.016 |
| Contrast enhancement | 4 | 98.94 | 84.19 | 77.29 | 72.42 | 0.010 |

Using contrast enhancement doubles the prediction ability of ∆BAD (in cross-validation).

The estimated error standard deviation is also reduced by 0.006 g/cm3. The explained

variance of Y in calibration (R2Y) for the first two LVs (LV1-2) is also presented in Table 18

since these components are used to compare the interpretation of the models. Since the fit

is almost doubled for these 2 LVs with contrast enhancement, the interpretation is also

expected to be much clearer (Figure 59). The improvement in performance is due to the

elimination of the change in lighting intensity from one batch of experiments to the other

since these anodes were produced over a few weeks in the laboratory. Based on these

results, contrast enhancement was added as a standard preprocessing option for the

machine vision algorithm.

Figure 59 – Scores of the PLS models for the lab formulated anodes: a) no contrast

enhancement and b) with contrast enhancement

Figure 59 compares the impact of adding contrast enhancement on the interpretation of

both PLS models. The first 2 LVs are used since they capture most of the information in X

and Y (for the second model). These are the two orthogonal linear combinations of the

image textural features that are the most predictive of ∆BAD. The percentages shown in

the axis labels are the Y variance captured (R2) by each LV. Also, the markers are colored

according to their ∆BAD values (scale on the right hand side of the plot). Paste images

clustering close to each other in the score plots (i.e. similar score values) have similar

textural characteristics.

[Figure 59 score plots. a) No contrast enhancement: t1 (10.84%) versus t2 (23.08%). b) With contrast enhancement: t1 (36.71%) versus t2 (43.81%). Markers are labelled by coke (A or B) and pitch %, and coloured by ∆BAD (scale from -0.06 to 0). Annotations: under pitched, OPD, over pitched.]

Contrast enhancement has a significant beneficial impact on the ability of the imaging sensor to detect the optimum BAD (hence pitch demand of the coke) as revealed in Figure 59. Without contrast enhancement, the paste samples corresponding to the optimum BAD (i.e. yellow markers) are spread across the LV space (Figure 59 a), but they cluster very clearly in the north-east quadrant of the score plot when using contrast enhancement (Figure 59 b). This preprocessing is therefore important for dark materials such as anode paste. It enhances the textural information related to changes in pitch demand and ensures robustness of the sensor to irrelevant sources of variations such as lighting intensity.

6.3.3 Choice of wavelet

Wavelets from three distinct families and different support lengths were tested. Orthogonal

and biorthogonal wavelets were selected because they can be applied using the DWT

algorithm for fast computation and they also allow perfect reconstruction of the original

images from the detail coefficients and approximation sub-images (Wavelet Toolbox

Documentation 2015). Also, they roughly matched the image 1D signal (i.e. one line or one

column of the image). Hence, the symlets (sym), Daubechies (db) and Biorthogonal (bior)

wavelets were selected. Also, for the sym and bior wavelets, different lengths of the filters

were used to verify if the shape of the wavelet had an impact on the performance. The

same dataset was used as in section 6.3.2 with contrast enhancement. The statistics of

each PLS model are presented in Table 19.

Table 19 – Impact of wavelet type and filter length on PLS model statistics

| Wavelet | Number of LV | R2X (%) | R2Y (%) | R2Y LV1-2 (%) | CV Q2Y (%) | ∆BAD RMSEPCV (g/cm3) |
|---|---|---|---|---|---|---|
| sym4 | 4 | 98.94 | 84.19 | 77.29 | 72.42 | 0.010 |
| sym14 | 4 | 98.47 | 84.60 | 78.35 | 74.45 | 0.010 |
| sym24 | 4 | 98.44 | 84.47 | 78.45 | 63.20 | 0.012 |
| db4 | 3 | 97.27 | 81.28 | 80.29 | 71.78 | 0.010 |
| bior2.2 | 4 | 99.13 | 84.08 | 76.83 | 72.31 | 0.010 |
| bior3.5 | 3 | 97.42 | 79.92 | 79.00 | 71.12 | 0.010 |
| bior4.4 | 4 | 98.89 | 84.44 | 78.80 | 72.98 | 0.010 |

Except for the symlet 24 (sym24) which has a lower performance in cross-validation (i.e. Q2Y and RMSEPCV), the performance obtained with all types of wavelet and support length is very similar. Hence, the original choice of the symlet 4 was kept for the continuation of the work on the machine vision approach.

6.3.4 Selection of textural features

The final step to determine the image texture analysis methodology was to select which

textural features to use. Both the DWT and GLCM texture methods are used separately

and in combination to extract textural descriptors that are sensitive to relevant changes in

paste images (i.e. coke size distribution and pitch demand) and robust to irrelevant

sources of variations. The computed features are described in detail in section 3.3 of this

thesis. For the DWT method, the tested features are the energy (E), entropy (Ent),

standard deviation (Std), Skewness (Skew) and kurtosis (Kurt). These were calculated

either on the DWT approximation sub-images (low-pass frequency information) obtained at

each scale or on the detail sub-images (high-pass information). The features computed

from GLCM are the angular second moment (ASM), entropy (Ent), contrast (Cont),

homogeneity (Hom) and correlation (Corr). These were calculated after applying GLCM

either directly on the preprocessed images or on the DWT detail sub-images. The

co-occurrence matrices were obtained at four angles (i.e. 0°, 45°, 90° and 135°) commonly

used in the literature (Haralick et al. 1973; Maillard 2003; Bharati et al. 2004) and multiple

distances (e.g. 1, 2, 3 and up to twelve in those references) to yield a multi-resolution

description of the paste texture.
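A compact sketch of the GLCM calculation is given below for a single offset. The standard Haralick-type definitions of ASM, contrast and correlation are assumed (the definitions used in this work are those of section 3.3), the matrix is made symmetric, and the mapping from angle to pixel offset follows a common image convention in which rows increase downwards. All function names are hypothetical.

```python
def angle_offsets(distance):
    """Pixel offsets (dx, dy) for the four usual angles at a given distance."""
    d = distance
    return {0: (d, 0), 45: (d, -d), 90: (0, -d), 135: (-d, -d)}


def glcm(image, dx, dy, levels):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy).

    image is a list of rows of integer gray levels in range(levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset is larger than the image")
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """ASM, contrast and correlation of a symmetric normalized GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    asm = sum(p[i][j] ** 2 for i, j in cells)
    contrast = sum((i - j) ** 2 * p[i][j] for i, j in cells)
    # For a symmetric matrix, row and column marginals share mean and variance.
    mean = sum(i * p[i][j] for i, j in cells)
    variance = sum((i - mean) ** 2 * p[i][j] for i, j in cells)
    if variance == 0:
        correlation = float("nan")  # undefined for a constant image
    else:
        correlation = sum((i - mean) * (j - mean) * p[i][j] for i, j in cells) / variance
    return {"ASM": asm, "Cont": contrast, "Corr": correlation}


checkerboard = [[0, 1], [1, 0]]
dx, dy = angle_offsets(1)[0]
features = glcm_features(glcm(checkerboard, dx, dy, levels=2))
print(features)
```

Repeating the calculation for several distances, and averaging over the four angles, gives one set of features per distance.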

No automatic feature selection methods were used in this work. Selecting only a subset of features is a tradeoff between model complexity and monitoring ability. Model interpretation is easier when fewer features are used, but the model loses generality for process monitoring applications (e.g., it may miss new variations appearing in the images that could have been captured by the features removed from the model). Whether or not to use feature selection methods depends upon the objective of the application. For these reasons, all the detail sub-images were used in all models even if image characteristics are often confined to certain frequency bands in most texture analysis applications.

A set of nine PLS models were built between a selected subset (i.e. different

combinations) of features (X) and the ∆BAD (Y). The features were all computed on the

preprocessed images with contrast enhancement and the DWT was performed using the

wavelet sym4. For each model, the number of components was selected based on the

lowest ∆BAD RMSEPCV. The performance statistics are provided in Table 20 and the

interpretation of the models using the first two PLS scores (LVs 1 and 2) is presented in

Figure 60. Again, both were used for the feature selection.

Table 20 – Impact of different combinations of textural features on PLS model statistics

| Features | # | Number of features | Number of LV | R2X (%) | R2Y (%) | R2Y LV1-2 (%) | CV Q2Y (%) | ∆BAD RMSEPCV (g/cm3) |
|---|---|---|---|---|---|---|---|---|
| All | 1 | 155 | 3 | 83.41 | 83.05 | 79.64 | 69.32 | 0.011 |
| DWT details and approximations (E, Ent, Std, Skew and Kurt) | 2 | 75 | 3 | 79.45 | 82.66 | 79.43 | 68.36 | 0.011 |
| DWT details only (E, Ent, Std, Skew and Kurt) | 3 | 35 | 5 | 92.73 | 90.39 | 78.85 | 65.26 | 0.012 |
| DWT details only (E, Skew and Kurt) | 4 | 21 | 6 | 96.04 | 92.19 | 77.36 | 67.06 | 0.012 |
| DWT approximations only (E, Ent, Std, Skew and Kurt) | 5 | 40 | 3 | 79.21 | 79.99 | 75.97 | 67.71 | 0.011 |
| DWT+GLCM details only (ASM, Cont and Corr) | 6 | 21 | 4 | 95.09 | 83.26 | 78.52 | 68.97 | 0.011 |
| DWT details only (E, Skew and Kurt) and DWT+GLCM details only (ASM, Cont and Corr) | 7 | 42 | 7 | 96.77 | 96.39 | 80.52 | 76.88 | 0.009 |
| GLCM on images (ASM, Ent, Cont, Hom and Corr) | 8 | 45 | 6 | 99.87 | 86.63 | 74.87 | 73.53 | 0.010 |
| GLCM on images (ASM, Cont and Corr) | 9 | 27 | 6 | 99.85 | 88.00 | 74.86 | 77.22 | 0.009 |

In terms of variance captured, all models perform well with R2X ranging from 79.2% to 99.9% and R2Y from 80.0% to 96.4%. The worst is model 5, with the DWT features computed on the approximations only. Each approximation sub-image contains the low frequency information from previous decomposition levels. It appears this frequency content degrades the prediction ability since it probably contains information unrelated to the ∆BAD variability (i.e. lighting variations and paste spreading patterns). The models based on features computed from the DWT details sub-images only performed better (i.e. models 1-4). This may be due to the fact that each decomposition level is orthogonal to the others and contains unique textural information. The models based on the GLCMs computed directly on the images (i.e. models 8 and 9) have a high fit on both the X and Y matrices and a high predictive ability in cross-validation, but the lowest fit in the first two LVs. These two components are important because they capture the optimum pitch demand. Finally, the best model seems to be number 7. All its features are based on the DWT detail coefficients, but only the energy, skewness and kurtosis and the GLCM ASM, contrast and correlation are used. This model has the lowest RMSEPCV (0.009 g/cm3, shared with model 9). Its first two PLS components also capture the greatest amount of variance of Y, which gives the best classification of the optimum pitch demand from paste images, as shown in Figure 60 d compared to the results obtained with models 1, 4 and 9 (Figure 60 a-c).

The redundancy in the features was discussed in section 3.3.1 and in (Van de Wouwer et al. 1999; Clausi 2002). For the DWT detail coefficient models (i.e. 3 and 4), removing the redundant features entropy and std does not change the performance of the models, but the interpretation of the loadings is much simpler with fewer variables. In the case of the GLCM features, model 9 with fewer features performs slightly better in prediction, with a 3.7% higher Q2Y. Once again the loadings are easier to interpret. It seems that, for this machine vision application, the performance of the models is not affected by feature redundancy, but since removing it makes the interpretation simpler, only non-redundant features were selected for the final application.

All the models in Figure 60 can detect the OPD based on the PLS model between the

image features and the ∆BAD. The black dash lines in all plots show the direction of the

pitch % from low pitch to high pitch paste. The color map again shows the distance to the

optimum BAD (i.e. ∆BAD). All models can capture the differences between the under and

over pitched anodes, but the model with the best clustering of the OPD is presented in

Figure 60 d. It is the model combining the DWT and GLCM features on the DWT detail

coefficients (i.e. model 7 in Table 20).

Figure 60 – Score plots for the first two PLS components (LVs 1-2) of four models from

Table 20: a) model 1, b) model 4, c) model 9 and d) model 7

[Figure 60 score plots, t1 versus t2. a) model 1: t1 (48.91%), t2 (30.74%). b) model 4: t1 (40.62%), t2 (36.74%). c) model 9: t1 (62.23%), t2 (12.63%). d) model 7: t1 (36.71%), t2 (43.81%). Markers are labelled by coke (A or B) and pitch %, and coloured by ∆BAD (scale from -0.06 to 0). Annotations: under pitched, OPD, over pitched.]

A schematic of the final choice of preprocessing, wavelet and features is presented in

Figure 61. After noise removal and contrast enhancement, 7 levels of DWT decomposition

are calculated using the symlets 4 wavelet. The energy, skewness and kurtosis features

are calculated from the DWT detail sub-images. Finally, the GLCM ASM, contrast and

correlation features are calculated from the DWT detail coefficients at each decomposition

level for the number of distances and angles mentioned previously.
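The layout of the resulting feature vector can be made explicit with a short sketch. The grouping by feature type and then by decomposition level mirrors the way the features are displayed in the loading plots of this chapter, but the actual column order used in the work is not stated here, so both the ordering and the naming scheme below are assumptions.

```python
LEVELS = range(1, 8)  # 7 DWT decomposition levels
DWT_FEATURES = ("E", "Skew", "Kurt")
GLCM_FEATURES = ("GLCM ASM", "GLCM Cont", "GLCM Corr")


def feature_names():
    """Column names of X, grouped by feature type and then by level."""
    return [
        f"{name} L{level}"
        for name in DWT_FEATURES + GLCM_FEATURES
        for level in LEVELS
    ]


names = feature_names()
print(len(names))  # 42
```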

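[Figure 61 schematic: Original image → Pre-processing (RGB to grayscale, ROI, low pass filtering, contrast enhancement) → DWT (sym4), detail coefficients at 7 levels, from which the energy, skewness and kurtosis are computed → GLCM (L(s) and average of θ), from which the ASM, contrast and correlation are computed.]

Figure 61 – Final image texture analysis procedure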

6.4 Results

The detailed results for the three laboratory datasets are presented in this section. The

PLS models statistics, most informative scores as well as interpretation of the loadings are

provided.

6.4.1 Preliminary design on paste formulation

In this model, only the pitch and the fines weights in the paste have been manipulated

while the coarse, intermediate and butts fractions weights remained constant. The amount

of pitch has a direct correlation to the wetness of the paste since adding more pitch will

make it look shinier. Adding fines should have the opposite effect, making the paste

drier by increasing the pitch demand due to the higher surface area of the fines.

Two PLS models were built on this dataset, one based on all individual paste samples (i.e. replicates considered as separate observations in the data matrices) and a second where the X and Y data for the replicated samples were averaged. The image features were stored in X whereas the changes made to the formulation (i.e. fines and pitch %) were used as Y data. The objective of building PLS regression models was not so much to assess the predictive ability for the fines % and pitch %; rather, the Y data were used for supervised clustering of the images to verify that variations in paste texture were indeed correlated with changes in the formulation. The PLS model statistics are presented in Table 21. The number of components was selected to minimize the RMSEPCV of both Y variables. The standard deviations of the fines % and pitch % were 2.75 % and 0.98 % respectively. The RMSEPCV of both variables are lower than the dataset variability (markedly so for the replicate-averaged model), which means that the model error is small.

Table 21 – PLS model statistics for changes in fines and pitch percentages in the paste formulation

| Model | Number of LV | R2X (%) | R2Y (%) | CV Q2Y (%) | Fines % RMSEPCV | Pitch % RMSEPCV |
|---|---|---|---|---|---|---|
| All samples | 4 | 90.65 | 87.17 | 64.74 | 2.32 | 0.38 |
| Replicate averages | 3 | 86.15 | 98.06 | 89.33 | 1.27 | 0.31 |

The model built after averaging the replicate data performs well at capturing the texture information (R2X) as well as the formulation variation (R2Y and Q2Y). The model built using all samples (no replicate averaging) will be used to verify the repeatability of the methodology and experimental procedure.

[Figure 62 plots. a) Scores t1 (58.16%) versus t2 (33.45%), with markers P-H_F-1 to P-H_F-5 (high pitch) and P-L_F-1 to P-L_F-5 (low pitch). b) and c) Bar plots of w*q for LV1 (58.16%) and LV2 (33.45%) against the variables: E, Skew, Kurt, GLCM ASM, GLCM Cont and GLCM Corr at DWT levels 1 to 7, followed by the Y variables Fines % and Pitch %. Annotations: wet paste, dry paste.]

Figure 62 – Scores and loadings weights of the PLS model (replicates averaged) for the

case where fines and pitch variations were introduced in the paste formulation: a) LV1-LV2

scores, b) weights and loadings of LV1 and c) weights and loadings of LV 2

The interpretation of the PLS model built on averaged replicates is presented in Figure 62.

The variations introduced in the paste formulation clearly drive the paste textural features

in two main directions captured by the scores (Figure 62 a). The marker labels in the score

plot indicate both the level of pitch (P-L for low and P-H for high pitch) and the amount of

fines used in the formulation (F-1 the smallest and F-5 the highest). The first LV mainly

captures the variations in the amount of pitch, with the higher pitched anodes falling in the

positive t1 region and the lowest pitched anodes in the negative t1 range. LV1 also

captures some of the variations in fines in each of the two groups (low and high pitch). The

second component is associated with the changes in fines %. The amount of fines is

positively correlated with the t2 scores in this case. The loadings bi-plot shown in Figure 62

b) and c) can be used to understand how the textural features are influenced by the

changes in formulation. The color of the bars corresponds to each set of X features and Y

variables, and the numbers to the DWT decomposition level.

In the case of the first component LV1 (Figure 62 b), all loading weights for the energy and the contrast (except in level 4) are positive. This is an indication that a positive correlation exists between E and Cont and the amount of pitch. This means that this component captures the shininess or reflectivity of the paste. A paste with more pitch on the surface is more reflective and the total energy of that image (i.e. sum of squares of all pixel intensity values) increases. Since the DWT conserves the total energy of the image, this increase in energy is visible in all the detail coefficients. The reflectivity of the paste increases the specular reflection and so it increases the contrast captured by the GLCM on the detail coefficients. Finally, the skewness and kurtosis decrease for levels 5-7 when the amount of pitch increases. Skewness is a measure of the asymmetry of a distribution and kurtosis is a measure of the weight of its tails relative to a normal distribution. When both features decrease at the same time, it is an indication that the values of the detail coefficients are more evenly distributed and have a lower spread (i.e. narrower distribution). This indicates that the paste texture is smoother at the lowest frequencies (i.e. larger details) when the pitch increases.
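To make the two shape descriptors concrete, the sketch below computes them from central moments. The usual moment-based definitions are assumed (the exact estimators used in this work are those of section 3.3), and the two small data sets are invented for illustration.

```python
def central_moment(values, order):
    """Central moment of the given order (population form, divided by n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Third standardized moment: 0 for a symmetric distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Fourth standardized moment: grows with the weight of the tails."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]  # evenly spread coefficients
tailed = [0.0, 0.0, 0.0, 0.0, 10.0]      # one large coefficient, e.g. a bright spot
print(skewness(symmetric), kurtosis(symmetric))
print(skewness(tailed), kurtosis(tailed))
```

A few large detail coefficients among many small ones raise both descriptors, which is consistent with reading a joint decrease as a more even distribution of the coefficients.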

For the second component LV 2 (Figure 62 c) the increase in fines also increases the

reflectivity of the paste because the energy features all have positive weights. However,


contrary to LV1, the skewness and kurtosis in the decomposition levels 1-4 decrease with

increasing fines content which suggests less specular reflection is obtained since the

paste is more homogeneous in the high frequency levels. Furthermore, the ASM

decreases when the amount of fines decreases. This indicates that a finer paste has a

smoother appearance even if the reflectivity is increased.

The third LV of the PLS model is not presented in this thesis since it did not improve the

understanding of the features.

In summary, an increase in either fines % or pitch % increases the paste reflectivity. The pitch

will also increase the specular reflection which makes the paste look rougher in the high

frequencies. However, the paste appearance of the finest samples will tend to appear

smoother due to less specular reflection in the high frequencies.

For this experiment, all the samples with the highest amount of pitch (P-H_F-#) were

prepared three times (replicated). Each mix was imaged only once for a total of three

images for each of the five fines fraction levels. No other quantitative measurements than

the paste images themselves were available to verify the repeatability of the paste

preparation method. It was postulated that the replicates should have similar paste textural

characteristics. The paste appearance is characterized by 42 image textural features and it

is not convenient to verify the repeatability of the fabrication and imaging procedure for

each feature separately. Since these are correlated, individual confidence limits on the 42

features can be misleading. However, the scores of the PLS model are orthogonal and

individual approximate uncertainty intervals can be computed on the scores of the paste

samples image features. To compute these uncertainty intervals, the PLS model built

using all the samples is used to obtain the score values for each sample, including the

replicated ones.

Figure 63 presents the one standard deviation intervals for the LV1 and LV2 scores values

calculated based on the replicated samples. In this figure, the markers represent the

average of the replicated score values and the error bars are set to one standard deviation

around the mean. All the samples, even those without replicates (i.e. low pitch samples)

were used to build the PLS model, but only the samples with replicates are shown in

Figure 63. Samples F-1, F-2, F-3 and F-4 can be discriminated completely while the error

bars of samples F-3 and F-5 slightly overlap along the second component. This indicates

that the machine vision sensor is sensitive to the variations in pitch and fines.
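The replicate check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the score values are hypothetical (not thesis data), the function names are invented for the example, and the sample standard deviation is assumed for the error bars.

```python
from statistics import mean, stdev

def replicate_intervals(scores_by_sample):
    """Return {sample: (mean, std)} for one latent variable.

    scores_by_sample maps a sample label to the list of score values
    obtained for its replicates (at least two values per sample).
    """
    return {label: (mean(vals), stdev(vals))
            for label, vals in scores_by_sample.items()}

def intervals_overlap(a, b):
    """True if the mean +/- one standard deviation intervals overlap."""
    (mean_a, std_a), (mean_b, std_b) = a, b
    return abs(mean_a - mean_b) <= std_a + std_b

# Hypothetical LV2 scores for three replicated mixes (not thesis data).
t2 = {
    "P-H_F-1": [4.8, 5.2, 5.0],
    "P-H_F-3": [0.6, 1.4, 1.0],
    "P-H_F-5": [1.2, 2.0, 1.6],
}
intervals = replicate_intervals(t2)
```

With these made-up numbers, `intervals_overlap(intervals["P-H_F-3"], intervals["P-H_F-5"])` is true while P-H_F-1 is separated from both, which mirrors the kind of reading made from the error bars in Figure 63.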


Figure 63 – Reproducibility of the imaging sensor in the case of the preliminary design on

formulation. The averaged LV1 and LV2 scores are shown for replicated samples along

with their one standard deviation error bars

6.4.2 Detailed design on paste formulation

In this series of experiments, the pitch demand of the paste was manipulated using more

parameters than just the amounts of pitch and fines. However, the greater number of

samples to prepare using the same lot of coke and butts aggregates forced the formulation

of small size paste samples (i.e. 450g). This was an important issue, particularly for the dry

aggregate fractions which were stored in large 20-25 kg buckets. Only 60g to 125g of each

constituent was needed for each paste sample. Thus, it was difficult to sample the

fractions from the bucket consistently and obtain a representative size distribution for each

dry aggregate fraction.

This problem was even more critical for the recycled butts fraction, which contains a very

wide distribution of particle sizes (i.e. from 2-3 cm to a few µm). To minimize sample-to-sample

variations, the full source sample was split into several smaller fractions of

approximately 100g using sample splitters. Even with careful manipulations, it was not

possible to obtain a constant size distribution in all split samples. This is illustrated in

Figure 64, where the size distributions for 5 split butts samples are shown. Variations in

the particle size distribution when preparing the paste samples were mainly due to the

coarser particle fractions. These inconsistencies in aggregate size distribution may affect

pitch demand unintentionally.

[Figure 63 plot: LV2 scores (t2) versus LV1 scores (t1) for the replicated samples P-H_F-1 to P-H_F-5]

Figure 64 – Butts size distribution span

As discussed in section 6.2.2, a certain number of paste samples were replicated. In

addition, two images were collected for each paste sample. This made it possible to assess the

reproducibility of the paste image itself separately from that of the entire experimental

procedure (i.e. sampling errors, etc.). Therefore, three PLS models were built using the

image features in X and changes to formulation in Y, but the data included in these

matrices depend on how the replicates were averaged. The first PLS model included

every replicate of the paste samples and images as a separate row in the data matrices (no

averaging at all). For the second model, the textural features of the two images collected

for each sample were averaged, and for the third model, these features were also

averaged for each replicated sample (i.e. replicated formulation). The five variables included

in Y are the paste formulation percentages of each particle fraction (coarse,

intermediate, fines and butts) and of the pitch.
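The three ways of assembling the X matrix can be illustrated with a short sketch. The record layout, the field names and the two-feature vectors are hypothetical and only stand in for the 42 textural features. The grouping keys are the point of the example.

```python
from collections import defaultdict

def average_rows(rows, key):
    """Average feature vectors of rows that share the same key.

    rows: list of dicts with 'mix', 'replicate', 'image' and 'features'.
    key:  function mapping a row to its grouping key.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row["features"])
    return {k: [sum(col) / len(col) for col in zip(*vecs)]
            for k, vecs in groups.items()}

# Hypothetical records: 2 features instead of 42, made-up values.
rows = [
    {"mix": "base", "replicate": 1, "image": 1, "features": [1.0, 10.0]},
    {"mix": "base", "replicate": 1, "image": 2, "features": [3.0, 12.0]},
    {"mix": "base", "replicate": 2, "image": 1, "features": [5.0, 14.0]},
    {"mix": "base", "replicate": 2, "image": 2, "features": [7.0, 16.0]},
    {"mix": "P_+1.6%", "replicate": 1, "image": 1, "features": [2.0, 20.0]},
    {"mix": "P_+1.6%", "replicate": 1, "image": 2, "features": [4.0, 22.0]},
]

# Model 1: every image is its own row (no averaging).
x_all = [r["features"] for r in rows]
# Model 2: average the two images of each prepared paste sample.
x_image_avg = average_rows(rows, key=lambda r: (r["mix"], r["replicate"]))
# Model 3: average over images and over replicated formulations.
x_mix_avg = average_rows(rows, key=lambda r: r["mix"])
```

In this toy case the three arrangements contain 6, 3 and 2 rows respectively, so each level of averaging trades observations for less replicate noise.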

[Figure 64 plot: percentage of total sample (%) versus sieve size (mesh: Rt4, Rt10, Rt18, Rt30, Rt50, Pt500) for split butts samples 1 to 5]

Table 22 – PLS model statistics for the detailed design on paste formulation

Model                            Number   R2X     R2Y     CVQ2Y   RMSEPCV
                                 of LV    (%)     (%)     (%)     Coarse %  Inter %  Fines %  Butts %  Pitch %
All samples                      4        86.84   38.73   19.72   4.76      1.91     4.21     3.78     0.51
Replicated image averages        2        76.67   36.67   15.82   6.63      2.59     7.22     3.24     0.53
Replicated formulation averages  2        73.23   29.96   22.08   5.03      2.34     4.52     3.75     0.53

The statistics of the models are presented in Table 22. In this case, the number of

components was chosen to maximize the cross-validation Q2Y instead of minimizing the RMSEPCV.

This is because the RMSEPCV values are relatively high compared to the standard deviations of the

coarse %, inter %, fines %, butts % and pitch % which are 4.56 %, 2.74 %, 4.59 %, 3.82 %

and 0.52 % respectively. The captured variance (R2X) for the feature space (X) is high

while the numbers of LV are low (i.e. 4 for the all samples model and 2 for the averages

models). This indicates that most of the changes performed in these experiments

(i.e. 8 types of variations) drive the paste image texture in only a few (i.e. 4 or 2)

directions. This means that they have a similar effect on the paste visual appearance. This

was expected since the types of variations were carefully selected to excite the paste’s

pitch demand and particle size distribution in different manners. The prediction ability

(CVQ2) is low in this case. This is probably due to the wide particle size distribution of the

butts fraction. The Y data are also corrupted by errors because the

proportions in the imaged samples may not be exactly those of the design conditions

stored in Y. However, the PLS models were not intended for predicting the

formulation variables, but for interpreting the image texture features based on the

change in paste characteristics (i.e. pitch demand and formulation). The supervised

clustering of the image textural features as a function of the change made to the

formulation helps to remove some of the unwanted image variability as opposed to

applying a PCA model on the features only.
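To make the two cross-validation statistics concrete, the sketch below computes them from held-out predictions and picks the number of components with the highest Q2. It assumes the usual definition Q2 = 1 - PRESS / total sum of squares, which may differ in detail from the one used in this thesis. The data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

def rmsep(y, y_cv):
    """Root mean squared error of the cross-validated predictions."""
    return sqrt(mean((a - b) ** 2 for a, b in zip(y, y_cv)))

def q2(y, y_cv):
    """Cross-validated Q2 = 1 - PRESS / total sum of squares."""
    y_bar = mean(y)
    press = sum((a - b) ** 2 for a, b in zip(y, y_cv))
    total = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - press / total

def best_n_components(y, cv_predictions):
    """Pick the number of LVs whose held-out predictions maximize Q2."""
    return max(cv_predictions, key=lambda n: q2(y, cv_predictions[n]))

# Hypothetical pitch % values and held-out predictions (not thesis data).
y = [14.0, 15.0, 16.0, 17.0]
cv_predictions = {
    1: [15.5, 15.5, 15.5, 15.5],   # no better than predicting the mean
    2: [14.5, 15.0, 16.0, 16.5],
    3: [15.0, 14.5, 16.5, 16.0],
}
n_best = best_n_components(y, cv_predictions)

# A model that only predicts the mean has an RMSEP equal to the
# (population) standard deviation of y, and a Q2 of zero.
print(rmsep(y, cv_predictions[1]), pstdev(y), q2(y, cv_predictions[1]))
```

This is why an RMSEPCV close to the standard deviation of a variable signals little predictive ability. In Table 22, for instance, the coarse % RMSEPCV of the all samples model (4.76) is slightly above the 4.56 % standard deviation quoted in the text.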

The scores and loadings bi-plot for the second model (i.e. the replicated image averages

model) are presented in Figure 65. The scores for the X and Y spaces are presented in

Figure 65 a) and b). The light blue arrow indicates the direction in the score space

dominated by the pitch demand variations (e.g. changes in amount of pitch, fines and shot

coke). The light gray arrow indicates the direction capturing changes in the size distribution

of the aggregate mix (i.e. formulation). However, the formulation variations contribute to

both directions (Figure 65 a) since variations in the aggregate size also affect the pitch

demand. The interpretation of the textural features based on these two main directions is

not straightforward using the loadings bi-plot (c and d) since the axes (i.e. LVs) are not

aligned with them. That is, both pitch demand and formulation have contributions in both

components.


Figure 65 – Scores and loadings weights of the PLS model built on averaged replicated

samples data for the case of the detailed design on formulation: a) X scores on LV1 and

LV2, b) Y scores on LV1 and LV2, c) weights and loadings of LV1 and d) weights and

loadings of LV2

To improve the interpretation, contribution plots are used instead to highlight the changes

in the image features that distinguish two groups of observations. Contribution plots show

differences between observations more specifically compared to the loading plots. Groups

of observations showing variations due to change in pitch ratio, amount of shot coke and

fines, and aggregate size are illustrated in the score plots shown in Figure 66 a), c), e) and

g). The corresponding contribution plots are presented next to each score plot (Figure 66

b, d, f and h). The contribution of each image textural feature is computed using equation

2.21. The contribution plots show the change in the features from group 1 (i.e. ellipse) to

group 2 (i.e. rectangle) in each score plot. The arrow also indicates the direction of the

change under study.
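Equation 2.21 is not reproduced in this section, so the sketch below should not be read as the exact formula used for Figure 66. It shows one common way of building a group-to-group contribution, assuming centered and scaled features and a w* matrix such that each score is a weighted sum of the features. All names and numbers are hypothetical.

```python
def group_mean(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]

def contributions(w_star, group1, group2):
    """Per-component contribution of each feature to a score shift.

    w_star: one row per feature, one column per latent variable, so
            that a score is t_a = sum_j x_j * w_star[j][a] for
            already centered and scaled features x.
    Returns c[j][a] = w_star[j][a] * (mean2[j] - mean1[j]).
    """
    delta = [m2 - m1 for m1, m2 in zip(group_mean(group1), group_mean(group2))]
    return [[w * d for w in w_row] for w_row, d in zip(w_star, delta)]

# Hypothetical weights for 3 features and 2 LVs (not thesis values).
w_star = [[0.5, 0.1],
          [-0.2, 0.7],
          [0.3, -0.4]]
group1 = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]   # e.g. points in the ellipse
group2 = [[3.0, 0.0, 1.0], [3.0, 2.0, 3.0]]   # e.g. points in the rectangle
c = contributions(w_star, group1, group2)

# The contributions on one LV add up to the shift of the group mean score.
shift_lv1 = sum(row[0] for row in c)
```

The useful property of this form is that the bars of one component sum to the displacement of the group mean along that component, so a large bar identifies a feature that moves the observations in the score plot.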

[Figure 65 plots: a) X scores t2 (5.10%) versus t1 (24.85%); b) Y scores u2 (5.10%) versus u1 (24.85%); c) w*q of LV1 (24.85%) and d) w*q of LV2 (5.10%) versus variables (image features at decomposition levels 1 to 7, followed by Coarse, Inter, Fines, Butts and Pitch %). Arrows mark the formulation and pitch demand directions. Sample type legend: BL, Base, Butts, Fines, Mix_Temp, Mix_t, Pitch, SD, Shot]

Figure 66 – Interpretation of the PLS model built using averaged replicated samples data

for the case of the detailed design on formulation. Variations in the scores and associated

contribution plots: a) and b) increase in the pitch ratio, c) and d) shot coke addition, e) and

f) decrease in the fines ratio and g) and h) change from a coarser to a finer formulation

[Figure 66 plots: score plots a), c), e) and g) of t2 (5.10%) versus t1 (24.85%) with the compared groups numbered 1 and 2; contribution plots b), d), f) and h) of the w* contributions for LV1-2 versus variables (features E, Skew, Kurt, GLCM ASM, GLCM Cont and GLCM Corr at decomposition levels 1 to 7). Sample type legend: Base, Fines, Pitch, SD, Shot]

The information extracted from the score plots shown in Figure 66 a), c) and e) focuses on

changes along the pitch demand direction, where the paste appearance evolves from a

drier to a wetter appearance. In Figure 66 a), the pitch content was increased from -1.4 to

+1.6% for the same dry aggregate, which drives the paste appearance from dry to wet. In

Figure 66 c), the addition of shot coke to the base mix also increases the wetness of the

paste. Both contribution plots (Figure 66 b and d) are very similar. The energy and contrast

features increase at the high and low frequencies (levels 1, 2, 6 and 7) while decreasing in

the middle frequencies. The accumulation of pitch on the surface of the particles increases

specular reflection and this appears as small details (i.e. high frequency) in the images. At

the same time, the skewness, kurtosis and correlation also increase in the high frequency

detail coefficients. The decrease in skewness and kurtosis in the large details corresponds

to a smoother texture at these frequencies. As the pitch demand decreases, more pitch

saturates the surface of the paste, which smooths the surface while creating high frequency

reflectivity. This is very similar to the behavior observed in the previous experiment

(section 6.4.1).
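The link between isolated specular highlights and the distribution statistics can be illustrated numerically. The sketch below is an assumption-laden toy: energy is taken as the sum of squared coefficients and kurtosis as the non-excess fourth standardized moment, which may not match the exact definitions used for the thesis features, and the coefficient lists are invented.

```python
def texture_stats(coeffs):
    """Energy, skewness and kurtosis of a list of detail coefficients."""
    n = len(coeffs)
    mu = sum(coeffs) / n
    m2 = sum((c - mu) ** 2 for c in coeffs) / n
    m3 = sum((c - mu) ** 3 for c in coeffs) / n
    m4 = sum((c - mu) ** 4 for c in coeffs) / n
    return {
        "energy": sum(c * c for c in coeffs),   # assumed: sum of squares
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,               # non-excess kurtosis
    }

# Evenly textured band: coefficients alternate regularly around zero.
even = [1.0, -1.0] * 8
# Same band with one isolated bright highlight (e.g. a specular spot).
glint = [1.0, -1.0] * 7 + [1.0, 9.0]

stats_even = texture_stats(even)
stats_glint = texture_stats(glint)
```

A single strong coefficient is enough to raise the energy, the skewness and the kurtosis of the band together, which is consistent with the joint increase of these features in the high frequency levels when pitch accumulates on the particle surfaces.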

The change in the fines fraction is presented in Figure 66 e) and f). The contributions are

shown for an increase in the fines %. The effect of this change on the texture features is

similar to what was observed previously (Figure 62 c) in the preliminary design on

paste formulation. The fines addition tends to smooth the surface texture and decrease the

specular reflection (i.e. high frequency content). This is shown by the decrease of the

energy and contrast in the high frequency levels 1 and 2. Also the detail images of the

higher fines pastes are more uniform since the kurtosis and skewness decrease compared

to the lower fines samples.

Finally, the variable contributions for the change from the coarser mixes (i.e. SD_+C+I and SD_+I) to the

paste with less intermediate size coke (i.e. SD_-I) are presented in Figure 66 g) and h). This

contribution plot captures the effect of the particle size changes on the paste textural

features. The experiments made to change the dry aggregate size distribution moved the

paste sample appearance along a direction almost orthogonal to that of pitch demand. In this

case, increasing the fineness of the aggregates by removing particles in the intermediate

fraction leads to a smoother paste (i.e. lower skewness and kurtosis). It also concentrates

the information (i.e. energy) in the low frequency band and increases the contrast of the

low level decomposition details. These effects are similar to the fines fraction changes

presented in Figure 66 f). The main difference, however, is in the correlation features. In this

case, only the intermediate fraction was replaced by fines instead of all other fractions,

as was done previously. This seems to have more effect on the correlation features

than previously observed in Figure 66 b), d) and f) where only the correlation of

decomposition level 2 has a high contribution. For this case, the removal of intermediates

only seems to make the image texture more regular and smooth since the correlation

increases more in the low frequency details.

Two additional observations can be made from this figure. In Figure 66 c), the spread of the

base mix is almost perpendicular to the pitch demand direction change. This is an

indication of the variability of the particle size distribution of the dry aggregate mix from

sample to sample. In addition, in Figure 66 g), the position of the finer formulation paste

(i.e. SD_-C-I) in the bottom left corner indicates that the high content of fines affected

both the pitch demand and the size distribution of the paste.

Two types of replicates were available in these experiments to assess the reproducibility of

the results. Each paste sample was imaged twice to verify the repeatability of the imaging

system itself. In addition, some of the mixes were repeated twice and the base mix four

times to check the repeatability of the paste mixing methodology. Reproducibility is

assessed in the same way as for the preliminary set of experiments. A PLS model was built on all

the individual images available (no averaging at all). Then, the average and standard

deviations of the scores were computed for the image replicates and the mix replicates for

both model components. The results are presented in Figure 67.

The approximate uncertainty intervals (±one standard deviation) for replicated images are

presented in Figure 67 a). The intervals for these replicates are small in comparison with

the range of score values. This indicates that the imaging system and feature extraction

produce consistent results for most of the paste samples.

The results for the mix replicates are shown in Figure 67 b). Only the score values for the

replicated paste mixes are shown in the plot. In this case, only the samples obtained by

changing the pitch ratio seem distinguishable from the other paste samples. This indicates

that there is a large variability in the sample preparation method. The principal cause of

this variation is the wide particle size distribution of the coke and butts fractions

and the difficulty of ensuring uniform and consistent sampling of the aggregates. However,

the model can still be used for the interpretation of the relationship between the changes in

the paste and the image texture variations.


Figure 67 – Reproducibility of the imaging sensor in the detailed design on formulation.

The averaged LV1 and LV2 score values are shown for a) image replicates and b) mix

replicates along with their one-standard deviation error bars

6.4.3 Pitch optimization experiment anodes

This dataset was used to select the preprocessing, wavelet type and features for the

machine vision algorithm. The experiment was designed to find the optimum pitch demand

of each coke based on the same formulation by finding the amount of pitch necessary to

obtain the maximum BAD. The particularity of this dataset is that each paste sample was

pressed and baked so the baked anode density can be used as a quantitative

measurement of the anode quality.

For this dataset, only one image was captured for each sample. However, each pitch level

for both cokes was repeated at least twice, and up to four times for certain samples.

The X matrix contains the image textural features and the ∆BAD is stored in Y. The PLS

model statistics and scores have been presented and discussed in section 6.3 and Figure

59 b), but the loading weights were not interpreted. The PLS model based on the

averaged features is used for interpretation whereas the model computed on all samples

enables a comparison of the repeated samples.
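As an illustration of how the Y variable of this dataset could be assembled, the sketch below assumes that ∆BAD is the difference between the BAD of a sample and the highest BAD obtained for the same coke, so that the optimum has ∆BAD = 0. This definition is an assumption inferred from the non-positive ∆BAD range of the figures, and the measurements are hypothetical.

```python
from collections import defaultdict

def delta_bad(samples):
    """Add 'delta_bad' (BAD minus the best BAD of the same coke)."""
    best = defaultdict(lambda: float("-inf"))
    for s in samples:
        best[s["coke"]] = max(best[s["coke"]], s["bad"])
    return [dict(s, delta_bad=s["bad"] - best[s["coke"]]) for s in samples]

def optimum_pitch(samples):
    """Pitch % giving the highest BAD for each coke."""
    opd = {}
    for s in samples:
        if s["coke"] not in opd or s["bad"] > opd[s["coke"]]["bad"]:
            opd[s["coke"]] = s
    return {coke: s["pitch"] for coke, s in opd.items()}

# Hypothetical averaged measurements in g/cm3 (not thesis data).
samples = [
    {"coke": "A", "pitch": 15.0, "bad": 1.520},
    {"coke": "A", "pitch": 18.0, "bad": 1.560},
    {"coke": "A", "pitch": 21.0, "bad": 1.540},
    {"coke": "B", "pitch": 16.2, "bad": 1.500},
    {"coke": "B", "pitch": 20.0, "bad": 1.545},
    {"coke": "B", "pitch": 24.0, "bad": 1.510},
]
y = [s["delta_bad"] for s in delta_bad(samples)]
opd = optimum_pitch(samples)
```

Referencing each coke to its own maximum puts both cokes on a common scale, which is what allows a single PLS model to relate the texture features to the distance from the optimum rather than to the absolute density.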

The statistics for both models are available in Table 23. In this case, the model

performances are very good. Both the X and Y captured variance (R2) are high. The

predictive ability (CVQ2) is very close to the R2Y with less than 4% difference for the

averaged replicates model. Also, the RMSEPCV of the models are low compared to the

standard deviation (0.019 g/cm3) of the measured ∆BAD.

[Figure 67 plots: LV2 scores (t2) versus LV1 scores (t1) for a) image replicates and b) mix replicates. Sample type legend: BL, Base, Butts, Fines, Mix_Temp, Mix_t, Pitch, SD, Shot]

Table 23 – PLS model statistics for the pitch optimization experiments

Model               Number   R2X     R2Y     R2Y LV1-2   CVQ2Y   ∆BAD RMSEPCV
                    of LV    (%)     (%)     (%)         (%)     (g/cm3)
Replicate averages  7        96.77   96.39   80.52       76.88   0.009
All samples         5        89.78   85.45   73.58       63.70   0.012

A comparison of the predicted and measured ∆BAD is presented in Figure 68. This figure

shows that all the predictions closely fit the measured values.

[Figure 68 plot: predicted ∆BAD versus measured ∆BAD, both from -0.06 to 0, for Coke A and Coke B]

Figure 68 – Comparison of the predicted and measured ∆BAD for the replicated averages

model

The scores and loadings bi-plot for the averaged replicates model are presented in Figure

69. The LV1 and LV2 scores are presented individually to improve the interpretation. For

the score plots in Figure 69 a) and c), the shape of the markers corresponds to the type of

coke, the color map to the ∆BAD values, and the labels indicate the combination of coke

type and pitch ratio. It is interesting to note that the pastes at the OPD for both cokes have a

similar visual appearance (i.e. texture features).

Figure 69 – Scores and loadings of the PLS model (averaged replicates) for the pitch

optimization experiments: a) LV1 scores, b) LV1 weights and loadings, c) LV2 scores and

d) LV2 weights and loadings

The LV1 scores and loading weights are shown in Figure 69 a) and b). This component

captures the variations introduced in the pitch ratio. The values of t1 clearly increase with

the pitch %. No optimum is captured by this component as the values of the scores do not

decrease when the paste is over pitched. The interpretation of this component is similar to

PLS models built in previous experiments when pitch ratio was varied. The energy

increases proportionally with the pitch% which corresponds to higher reflectivity. The

texture is rougher in the high frequencies compared to the low frequencies. Indeed, the

skewness and kurtosis of levels 1-3 increase with the pitch% while they decrease in the

lower frequency range (i.e. levels 5-7). Finally, the contrast features which are a measure

of heterogeneity also increase with the pitch %.

[Figure 69 plots: a) LV1 scores (36.71%) and c) LV2 scores (43.81%) versus samples, labelled by coke type (A or B) and pitch %, with a ∆BAD color scale from -0.06 to 0 and the OPD marked; b) LV1 and d) LV2 w*q versus variables (image features at decomposition levels 1 to 7, followed by delta BAD)]

The second component, however, does capture the optimum BAD based on the image

textural features. The t2 values are negative when the anodes are under-pitched and

over-pitched while optimum pastes have high positive t2 values. Except for the correlation

features, the decomposition level number 4 (i.e. resolution of +0.65mm/-1.30mm) has the

strongest weights in all other features. The optimum pitch demand has a concentration of

the energy in this frequency band. The contrast on the GLCM matrix is also the highest at

this level. Finally, the kurtosis measuring the spread of the distribution of the detail

coefficients is the smallest for the OPD pastes. It seems that the textural details of about

1mm in length (i.e. approximately 25 pixels) are the most sensitive to the OPD. The other

features are characterized by a contrast in the positive vs. negative weight values in the

different decomposition levels.
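The correspondence between decomposition level and physical detail size can be checked with a short calculation. The sketch assumes a dyadic transform in which level j responds to details between 2^j and 2^(j+1) pixels, and a pixel size of about 0.04 mm. Both are inferred from the level 4 band (+0.65mm/-1.30mm) and from the 25 pixels per millimetre quoted above, not taken from a stated calibration.

```python
PIXEL_MM = 0.65 / 16  # assumed pixel size, about 0.04 mm (25 pixels per mm)

def level_band_mm(level, pixel_mm=PIXEL_MM):
    """Approximate (min, max) detail size in mm for a DWT level.

    Assumes a dyadic transform in which level j responds to details
    between 2**j and 2**(j + 1) pixels wide.
    """
    return (2 ** level * pixel_mm, 2 ** (level + 1) * pixel_mm)

bands = {level: level_band_mm(level) for level in range(1, 8)}
for level, (low, high) in bands.items():
    print(f"level {level}: +{low:.2f} mm / -{high:.2f} mm")
```

Under these assumptions level 4 spans 16 to 32 pixels, i.e. 0.65 mm to 1.30 mm, which brackets the 1 mm (about 25 pixels) detail length identified as the most sensitive to the OPD.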

The scores of components 3 and 4 are shown in Figure 70. Although they do not show any

specific pattern related to the changes made in pitch percentage, the maximum density

anodes do cluster in the center of the score plot. Based on this observation, these LVs

may also be useful for detecting the optimum pitch demand. The loadings bi-plots are not

shown since their physical meaning is not straightforward. Components 5-7 are not shown

since they mainly capture variations that are not related to the pitch demand.

Figure 70 – Scores of the 3rd and 4th components of the PLS model built on the pitch

optimization dataset (averaged features)

The reproducibility of the results obtained in these experiments is now assessed using the

replicated samples. The PLS model trained using all individual samples was again used to

compute the average and standard deviations of the scores for replicated samples. The

results are presented in Figure 71 for the first two components (most highly correlated with

BAD). In this case, the uncertainties are small enough to allow the bell-shaped curve to be

clearly distinguished from experimental errors. The sensor therefore seems sensitive to

the variations in pitch demand of the paste.

[Figure 70 plot: LV4 scores (t4, 4.26%) versus LV3 scores (t3, 2.56%), labelled by coke type (A or B) and pitch %, with a ∆BAD color scale from -0.06 to 0 and the OPD marked]

Figure 71 – Reproducibility of the imaging sensor in the pitch optimization experiments.

The scores of the first two components of the PLS model built on all samples are shown

along with one standard deviation error bars

Good results were obtained for detecting the optimum pitch demand for two different coke

sources. It is highly probable that this will also be the case for other coke sources. The

variation in porosity of different cokes should have a similar effect as a change in pitch

demand as this will change the amount of pitch needed to fill the pores. The OPD is

correlated to the thickness of the pitch layer on the coke particles (Adams et al. 2002).

Since this thickness should be similar for different coke sources, the sensor should be

robust to these raw material variations.

6.5 Conclusion

There is a need in the aluminium industry to develop new non-destructive on-line

measurements for the characterization of the anode production process. Anode producers

face an increase in variability of the anode raw materials (i.e. coke and pitch). To find a

good compromise between raw material costs and anode quality, materials are blended

from different suppliers, including some lower cost/quality raw materials. Producing

anodes with a consistent quality is not straightforward using the current quality control

strategy. The key raw material and anode properties are currently measured in the

laboratory on a limited number of samples and the results are typically available after long

delays due to the production time and also sampling and analysis delays.

Anode paste quality is defined in terms of the resulting baked anode properties. These are

influenced by the raw material properties, the formulation such as the particle size

distribution and the pitch ratio and also the process operating conditions. The hypothesis


tested in this thesis was that changes in the particle size, pitch ratio and mixing parameters

affect the paste visual appearance (i.e. its texture). Therefore, it should be possible to

quantify the variations in paste quality using the right combination of image analysis

methods.

Image analysis methods are well suited for on-line measurement sensors of product

appearance. The objective of this chapter was to develop a machine vision sensor for the

anode paste characterization based on its image texture. The first aim was to demonstrate its

sensitivity to changes in paste formulation and pitch demand on paste samples formulated

in the laboratory. Second, the laboratory experiments were designed to gain an

understanding of the relationships between the paste surface texture (i.e. image features)

and its macro properties (i.e. formulation and pitch demand) to help the interpretation of

the sensor results on real industrial paste images.

The image analysis methodology used for the paste machine vision sensor is based on

contrast enhancement preprocessing, image texture analysis methods and features

extraction. It was found that contrast enhancement preprocessing improved the

performance of the image analysis method and helped to detect an optimum in the pitch

optimization experiments. Finally, a combination of DWT and GLCM features computed on

the image DWT detail coefficients of seven levels of decomposition was found best suited

to capture the texture information contained in the images.

To interpret the texture features, PLS models were used as a supervised classification

algorithm to force the latent variable space to explain the desired paste fabrication

variations by these features. It was found that the amount of pitch on the surface of the

particle influences the reflectivity of the paste (i.e. shininess) and that the specular

reflection was captured by the high frequency levels. Also, pastes with high pitch levels or

low pitch demand have a smooth texture in the low frequencies. Finally, finer formulation

mixes tend to smooth the surface of the paste in all frequencies compared to the coarser

paste.

The main difficulty in comparing the paste mixes together was to obtain quantitative

information about the paste quality. This was done in the final laboratory experiment by

performing a pitch demand optimization based on the maximum anode BAD.

This experiment showed that it is possible to capture the OPD using the machine vision

algorithm. Also, the texture features of the paste at the OPD for both types of coke were

similar even if the pitch ratio and BAD values were not the same for these cokes.


Chapter 7 Industrial paste imaging

7.1 Introduction

The experiments performed in the laboratory were useful to develop the image texture

analysis methodology for the anode paste images. They were also useful for understanding the

relationships between the variations in paste visual appearance captured by the textural

features and the changes introduced in coke source, formulation and processing

conditions. The next step of this project was aimed at testing the machine vision algorithm

on samples collected directly from an industrial paste plant.

Collecting data and samples and tracking anodes from an industrial carbon plant is not

straightforward due to the size and complexity of the manufacturing process. For example,

Deschambault’s carbon plant produces approximately 3200 anodes every week. The

paste plant processes 32 tons per hour of anode paste. Also, the 2 baking furnaces

contain 32 sections of 6 pits with 16 anodes per pit. This means that there are more than

5000 anodes loaded in the baking furnaces at any given time. Finally, the plant also stores

an inventory of green and baked anodes.

Three major difficulties were faced during these tests. First, obtaining quantitative

measurements of the paste and/or anode quality is not straightforward in comparison with

laboratory experiments. For example, measuring the BAD of a particular anode requires

tracking the green anode block in the paste plant and in the baking furnace, and then drilling

a core sample from the block after it is unloaded from the furnace. Tracking the anode

block requires human resources and logistics and is often difficult to implement with high

accuracy. Second, the heat-up rate and final baking temperature of the anodes depend

upon the location where they were baked in the furnace. These two parameters are known

to have a strong impact on the anode properties (Fischer et al. 1993). Therefore, it is

necessary to account for the effect of the baking process on BAD in addition to raw

material properties and paste formulation. This was not necessary in the laboratory

experiments because the green anode samples were baked in a smaller furnace where

baking conditions were controlled and homogeneous (i.e. core samples were all baked

under very similar conditions). Third, the range of the variations that can be safely applied

during the operation of the plant is limited to avoid the production of defective anodes or

breakdown of the plant. This limits the range at which it is possible to test the sensitivity of

the image texture analysis method.


The objective of this chapter is to test the robustness and the sensitivity of the machine

vision sensor developed in Chapter 6 on industrial paste samples using different datasets

obtained from sampling campaigns conducted at the Alcoa Deschambault (ADQ) smelter’s

carbon plant.

The robustness and sensitivity of the machine vision sensor are first studied under different

process operating conditions, including normal operation, plant start-up and pitch

optimization experiments (sections 7.3.1, 7.3.2, 7.3.3, respectively) by building PLS

regression models between the image textural features and the formulation variables,

similarly to what was done in the laboratory development phase. The aim was again to

perform supervised clustering of the paste image features based on the changes made to

the manipulated variables in the paste plant as part of the industrial sampling campaign.

In a second step (section 7.4), the textural features computed for each paste sample

were added to the larger database including raw material properties, formulation and

process conditions in order to establish relationships between the information extracted

from the images and the data collected in the different parts of the plant. This also

provides an assessment of the added value brought by using the machine vision sensor in

addition to the data already routinely available at the plant.

This chapter is organized as follows. First, the paste sampling, imaging procedure and

data synchronization are discussed in section 7.2. This is followed by the results section

(7.3), where the datasets, the interpretation of the image features and the robustness of the sensor in each

industrial experiment are discussed. The fusion of the sensor’s data with the plant datasets

using the new SMB-PLS algorithm is presented in section 7.4. Finally, some conclusions

are drawn.

7.2 Sampling and data synchronization

For all the experiments performed in the ADQ paste plant, the routine operating conditions

data were collected. This includes the raw material properties, the formulation ratios, the

particle size distribution of the dry aggregate mix and the processing conditions of the

mixing and anode forming units (e.g. temperatures, mixing energy, etc.). These data

needed to be synchronized to account for the residence time within each piece of

equipment and dead-times introduced by conveyor belts. The synchronization procedure

was described in previous work (Lauzon-Gauthier 2011). The only difference with the


procedure used in this thesis is that the basis for the synchronization is the paste sampling

time instead of the anode forming time (approximately 4 minutes difference between the

two events). The synchronization schedule was adjusted accordingly.

In order to preserve the confidentiality of the industrial data, all the process data presented

in this chapter were mean-centered. The pre-processing was done for each set of

experiments independently since the range of variation was wide due to the large time gap

between each experiment. The mean-centering hides the absolute values of each

variable but the variability is preserved, so the variations in the data can be interpreted in

real engineering units.
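The mean-centering described above can be sketched as follows. This is a minimal illustration, assuming one mean per dataset; the function name `mean_center_by_group`, the dataset labels and the pitch values are hypothetical and are not plant data.

```python
def mean_center_by_group(values, groups):
    """Subtract from each value the mean of the group (dataset) it belongs to."""
    totals, counts = {}, {}
    for value, group in zip(values, groups):
        totals[group] = totals.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    means = {group: totals[group] / counts[group] for group in totals}
    return [value - means[group] for value, group in zip(values, groups)]


# Hypothetical pitch percentages (illustrative only, not plant data).
pitch = [14.0, 14.2, 14.4, 15.0, 15.3]
dataset = ["normal", "normal", "normal", "start-up", "start-up"]
centered = mean_center_by_group(pitch, dataset)

# Absolute levels are hidden, but differences within a dataset keep their units.
print(centered)
```

Because only a constant is removed from each dataset, a difference of 0.4 % between two samples of the same dataset is still 0.4 % after centering.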

Finally, at each sampling time, three aluminium containers were filled with anode paste.

The three paste samples were assumed to be true replicates since they were collected

within a very short period of time. It was thus possible to capture three different images for

every sampling time (i.e. one from each container). The paste samples were grabbed

manually from the conveyor belt and it was not straightforward to sample the whole flow of

paste with the sampling bucket. Most of the results presented in this chapter were

obtained after averaging the features of the three images of replicate paste samples. This

decision was made to average out the differences due to the manual sampling variability. The

uncertainties due to sampling will also be quantified using the standard deviations of the

scores of the replicate samples.
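As a rough sketch of the replicate averaging, the feature vectors of the three images taken at one sampling time can be averaged feature by feature. The function name `average_replicates`, the sampling-time keys and the two-feature vectors below are hypothetical; the actual feature set contains many more variables.

```python
from statistics import mean


def average_replicates(features_by_time):
    """Average, feature by feature, the vectors of replicate images
    collected at the same sampling time."""
    return {
        time: [mean(column) for column in zip(*replicates)]
        for time, replicates in features_by_time.items()
    }


# Hypothetical two-feature vectors for three replicate images per sampling time.
features = {
    "t1": [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]],
    "t2": [[4.0, 20.0], [4.0, 22.0], [7.0, 24.0]],
}
averaged = average_replicates(features)
print(averaged)
```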

7.3 Datasets and results

This chapter is organized differently compared to the previous chapter (Chapter 6). The

datasets and the results are presented in the same section since the interpretation of the

variations in the formulation is important to the understanding of the image features.

However, similarly to the laboratory development stage (section 6.4), two PLS models are

built on each dataset consisting of the image features (X) and the formulation data (Y).

One model is computed from all the individual images to obtain an estimate of the

uncertainties based on the replicated samples and the second model is built after

averaging the X and Y data for the replicated samples. The PLS scores and loadings are

interpreted using the second model (averaged replicate data) for the sake of simplicity.
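To make the role of the PLS scores more concrete, the sketch below implements a simplified single-response PLS (PLS1) with the NIPALS deflation scheme. It is an illustration only: the models in this chapter relate the image features to several formulation variables at once, and the function name `pls1_fit` and the small data arrays are hypothetical.

```python
import math


def pls1_fit(X, y, n_components):
    """Simplified NIPALS PLS1: returns the score vectors and the fraction
    of the variance of y explained (R2Y) after n_components."""
    n, m = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(m)]
    X = [[row[j] - col_means[j] for j in range(m)] for row in X]
    y_mean = sum(y) / n
    y = [v - y_mean for v in y]
    ss_total = sum(v * v for v in y)
    scores = []
    for _ in range(n_components):
        # Weights: direction in X with maximum covariance with y.
        w = [sum(X[i][j] * y[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in X that covaries with y
            break
        w = [v / norm for v in w]
        t = [sum(X[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(y[i] * t[i] for i in range(n)) / tt
        # Deflation: remove what this component explains from X and y.
        X = [[X[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        y = [y[i] - q * t[i] for i in range(n)]
        scores.append(t)
    r2y = 1.0 - sum(v * v for v in y) / ss_total
    return scores, r2y


# Hypothetical data: y depends linearly on the two columns of X.
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0]]
y_demo = [3.0, 6.0, 4.0, 8.0, 7.0]
scores_1, r2y_1 = pls1_fit(X_demo, y_demo, 1)
scores_2, r2y_2 = pls1_fit(X_demo, y_demo, 2)
print(r2y_1, r2y_2)
```

Each new component is extracted from the deflated X, which is why successive score vectors are orthogonal and can be interpreted as separate combinations of events.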

For all the PLS models, the number of components was selected to minimize the

RMSEPCV (equation 2.16) of all formulation variables. For the PCA models, the CVQ2 of


the X dataset was used instead. The cross-validation procedure was implemented by

splitting the dataset into 7 consecutive blocks.
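The cross-validation scheme can be sketched as below, assuming that the 7 blocks are contiguous and of nearly equal size. The exact form of equation 2.16 is not reproduced here; the root mean squared error of the held-out predictions is used as a stand-in, and `fit_line` is a hypothetical one-variable model used only to keep the example short. In practice the same loop would be repeated for each candidate number of components, keeping the one with the lowest error.

```python
import math


def consecutive_blocks(n_samples, n_blocks=7):
    """Split indices 0..n_samples-1 into contiguous blocks of near-equal size."""
    base, extra = divmod(n_samples, n_blocks)
    blocks, start = [], 0
    for b in range(n_blocks):
        size = base + (1 if b < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks


def rmsep_cv(x, y, fit, n_blocks=7):
    """Root mean squared prediction error over the held-out blocks.
    `fit(x_train, y_train)` must return a function predicting y from one x."""
    squared_errors = []
    for block in consecutive_blocks(len(x), n_blocks):
        held_out = set(block)
        x_train = [x[i] for i in range(len(x)) if i not in held_out]
        y_train = [y[i] for i in range(len(y)) if i not in held_out]
        predict = fit(x_train, y_train)
        squared_errors += [(y[i] - predict(x[i])) ** 2 for i in block]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_line(x_train, y_train):
    """Hypothetical stand-in model: one-variable least-squares line."""
    n = len(x_train)
    mx, my = sum(x_train) / n, sum(y_train) / n
    sxx = sum((v - mx) ** 2 for v in x_train)
    slope = sum((a - mx) * (b - my) for a, b in zip(x_train, y_train)) / sxx
    return lambda v: my + slope * (v - mx)


block_sizes = [len(b) for b in consecutive_blocks(118)]
x_demo = [float(v) for v in range(20)]
y_demo = [2.0 * v + 1.0 for v in x_demo]
error = rmsep_cv(x_demo, y_demo, fit_line)
print(block_sizes, error)
```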

7.3.1 Normal operation

The first industrial dataset was collected during normal operation of the paste plant without

any designed experiment on the processing conditions. It was performed to test the

sampling procedure and imaging equipment at the site. In total, anode paste was collected

at 118 sampling times in triplicate and subsequently imaged (354 paste images). The

samples were gathered at regular time intervals during normal plant operation on six

different days (numbers in Figure 72) over the course of 2 weeks.

The formulation variables are presented in Figure 72. The fines percentage is not shown

for better clarity of the figure, but its variance is 0.078% around the average value. It was

not manipulated during these six days and was affected by common cause variations only.

Figure 72 – Formulation variables for the normal operation industrial dataset: a) dry

aggregate % and b) pitch ratio

[Figure not reproduced. Panel a): aggregate formulation (Coarse %, Inter. % and Butts %) and panel b): pitch (%) versus paste samples (0 to 120); the six sampling days are numbered 1 to 6 and ellipses #1 and #2 mark the variations discussed in the text.]

Due to operational constraints, the ratios of coarse and butts (Figure 72 a) must be

adjusted daily and this was the main source of variability contained in this dataset. The

butts ratio varies in the opposite direction to the coarse ratio because the butts particles are always

substituted by coarse coke particles in the dry aggregate formulation. Furthermore, the

pitch percentage in the paste (Figure 72 b) was adjusted according to the changes made

to the butts/coarse fractions since the pitch demand of the aggregate mix is lower when

the butts ratio increases (i.e. pitch ratio decreases with butts content).

The correlation coefficients between the different dry aggregate fractions and the pitch %

are presented in Table 24. This confirms the strong negative correlation (-0.93) between

the butts % and coarse % and the negative correlation (-0.75) between the butts % and

the pitch %.
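The origin of the strong negative correlation can be illustrated with a short sketch. It assumes a perfect one-for-one substitution of butts by coarse coke with all other fractions held constant, which yields a correlation of -1; in the plant data the other fractions also vary, which is consistent with a value close to but not equal to -1. The ratios below are hypothetical and are not plant data.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical daily ratios (%): butts and coarse always sum to the same total.
butts = [20.0, 18.5, 21.0, 17.0, 19.5]
coarse = [55.0 - b for b in butts]
r = pearson(butts, coarse)
print(r)
```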

Table 24 – Correlation coefficients between the paste formulation variables for the normal operation data

            Coarse %   Inter %   Fines %   Butts %   Pitch %
Coarse %      1.00      -0.49      0.02     -0.93      0.59
Inter %                  1.00     -0.19      0.26     -0.16
Fines %                            1.00      0.09     -0.19
Butts %                                      1.00     -0.75
Pitch %                                                1.00

This dataset contains day-to-day variations in the paste formulation, mostly caused by changes in the ratios of anode butts and coarse coke particles. The pitch ratio was also adjusted when changes in the dry aggregate mix modified the pitch demand. Hence, in this case, the pitch % is correlated with the dry aggregate formulation since it was not manipulated independently. No design of experiments was implemented in this sampling campaign.

The statistics of the PLS models are presented in Table 25. Again, the intention was not to predict the paste formulation variables but to cluster the image features based on changes made to the formulation. Thus, the prediction ability will not be discussed. The models use 82% and 78% of the variance contained in the image features dataset (X) to explain 40% and 28% of the variance in the formulation variables (Y) for the models built on averaged replicate data and on all images (no averaging), respectively. A lower explained variance of Y was expected in this case because of the lower signal-to-noise ratio (i.e. smaller range of variations and no design of experiments). Nevertheless, it will be shown that the image features do correlate with formulation data. For comparison, the standard deviations of the formulation variables were 1.95 %, 0.39 %, 0.27 %, 1.70 % and 0.25 % for the coarse %, intermediate %, fines %, butts % and pitch %, respectively. For the model built on all images, the estimated model errors (RMSEPCV) are lower for coarse, fines and butts % but larger or similar for intermediate % and pitch %.

Table 25 – Statistics of the PLS models built on normal operation data

                     Number   R2X     R2Y     CVQ2Y   RMSEPCV
Model                of LV    (%)     (%)     (%)     Coarse %   Inter %   Fines %   Butts %   Pitch %
Replicate averages   3        81.72   40.55   20.05   1.64       0.43      0.30      1.19      0.24
All samples          3        78.32   28.31   11.43   1.82       0.44      0.19      1.44      0.25

The scores and loadings of the PLS model built on averaged replicate data are shown in Figures 73 to 75. The main sources of variations captured by each LV are discussed first, followed by the model interpretation based on the image features loading weights. These are used to interpret the effect of changes in formulation on the paste visual appearance.

Figure 73 – First component’s scores (a) and loadings (b) of the PLS model (averaged replicate data) built on normal operation data of the ADQ paste plant

[Figure not reproduced. Panel a): scores t LV1 (20.03%) for each observation, with sampling days 1 to 6 and ellipse #1 marked; panel b): loading weights w*q LV1 (20.03%) for the image features (decomposition levels 1 to 7 for each feature type) and for Coarse %, Fines %, Inter. %, Butts % and Pitch %.]

The first PLS model component mainly captures the large changes made in pitch ratio between the different days of operation, as indicated by the ellipse labelled #1 in Figure 72 b) and Figure 73 a). The Y loadings (Figure 73 b) also indicate that this component captures the variations in coarse and butts %, which occurred simultaneously. As the pitch increases, the energy and contrast of the high frequency levels (1-4) increase while they decrease in the low frequency range (5-7). This is an indication that the paste becomes smoother for large details, but with more specular reflections in the fine details due to the higher pitch content. This is consistent with the observations made from the laboratory paste experiments.

Figure 74 – Second component’s scores (a) and loadings (b) of the PLS model (averaged replicate data) built on normal operation data of the ADQ paste plant

[Figure not reproduced. Panel a): scores t LV2 (16.24%) versus observations, with sampling days 1 to 6 marked; panel b): loading weights w*q LV2 (16.24%) for the image features (decomposition levels 1 to 7 for each feature type) and for Coarse %, Fines %, Inter. %, Butts % and Pitch %.]

The second component (Figure 74 a) seems to focus on the local variations within each experimental day. These local variations in pitch are indicated by the black arrows in Figure 72 b) and Figure 74 a). Based on the LV2 Y loadings (Figure 74 b), this component also captures information from the coarse/butts and pitch variations. However, since it is orthogonal to LV1, it explains a different combination of events. The energy features, which are positive in almost all levels, seem to explain the variations in pitch since they are positively correlated with this variable (i.e. pitch increases the reflectivity of the paste).

Figure 75 – Third component’s scores (a) and loadings (b) of the PLS model (averaged replicate data) built on normal operation data of the ADQ paste plant

[Figure not reproduced. Panel a): scores t LV3 (4.28%) versus observations (0 to 120), with sampling days 1 to 6 and ellipse #2 marked; panel b): loading weights w*q LV3 (4.28%) versus variables (0 to 45) for the image features (decomposition levels 1 to 7 for each feature type) and for Coarse %, Fines %, Inter. %, Butts % and Pitch %.]

Finally, the third component (Figure 75 a) mainly captures the changes in the intermediate coke fraction, as indicated by the ellipse labelled #2 in Figure 72 a) and Figure 75 a). The particle size for this coke fraction is +0.15 mm/-1.40 mm (Table 5). The loadings for this component (Figure 75 b) show that decomposition levels 4 and 5 have the highest weights in all the features. The intermediate particle size range approximately corresponds to the details captured by these decomposition levels, which are approximately +0.65 mm/-1.30 mm and +1.30 mm/-2.60 mm, respectively. Since the variations in the intermediate fraction had a very small correlation (Table 24) with the other formulation variables, these results suggest that the sensor is sensitive to changes in the size distribution of the dry aggregate mix used to manufacture the industrial paste.
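The size bands quoted for levels 4 and 5 follow the doubling of scale between successive levels of a dyadic wavelet decomposition. The sketch below extrapolates the bands of the other levels from the level 4 values given in the text; this extrapolation is an assumption made for illustration, and the function name `detail_band_mm` is hypothetical.

```python
def detail_band_mm(level, ref_level=4, ref_band=(0.65, 1.30)):
    """Approximate size band (mm) of the details captured at a decomposition
    level, assuming the scale doubles from one level to the next."""
    factor = 2.0 ** (level - ref_level)
    return ref_band[0] * factor, ref_band[1] * factor


# Bands for the seven decomposition levels, anchored on the level 4 values.
for level in range(1, 8):
    low, high = detail_band_mm(level)
    print(level, round(low, 3), round(high, 3))
```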

The uncertainties in the scores of the PLS models built using all samples (no averaging)

are presented in Figure 76. The standard deviations of the scores were calculated based

on the three images collected at each sampling time. Therefore, these represent the

uncertainties in the surface appearance (texture) of the paste at a given time. The trends

in the scores shown in Figure 76 are clearly significant. For example, the variations in LV1

within each day lie within the error bars, but the difference between the scores of the two days with the largest

pitch difference (days 3 and 4) is larger than the standard deviation (i.e. the error bars for

these days do not overlap). This is also the case for the variations captured by LV2 (within

day variability for pitch and coarse coke fraction) and LV3 (intermediate fraction

variations). The fact that the uncertainties seem more important in this case compared to

the laboratory development phase was expected since the images were collected from an

industrial process where the environment is not as controlled as in the laboratory.

Furthermore, the signal-to-noise ratio is lower due to normal operation of the plant (mostly

common cause variations).
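The comparison of error bars can be sketched as follows. The example assumes that the sample standard deviation (n - 1 denominator) of the three replicate scores is used, which is not stated in the text; the function names and the score values are hypothetical.

```python
from statistics import mean, stdev


def error_bar(replicate_scores):
    """Return the (lower, upper) limits of a one standard deviation
    error bar around the mean of the replicate scores."""
    centre = mean(replicate_scores)
    spread = stdev(replicate_scores)  # sample standard deviation (assumption)
    return centre - spread, centre + spread


def bars_overlap(bar_a, bar_b):
    """True when two (lower, upper) intervals share at least one point."""
    return bar_a[0] <= bar_b[1] and bar_b[0] <= bar_a[1]


# Hypothetical LV1 scores of three replicate images at three sampling times.
high_pitch = error_bar([4.0, 5.0, 6.0])
same_day = error_bar([4.5, 5.5, 6.5])
low_pitch = error_bar([-6.0, -5.0, -4.0])

print(bars_overlap(high_pitch, same_day))   # within-day variation
print(bars_overlap(high_pitch, low_pitch))  # between-day variation
```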


Figure 76 – Uncertainties in the scores of the PLS model built on normal operation data: a)

LV1, b) LV2 and c) LV3. One standard deviation error bars on the scores are shown.

7.3.2 Paste plant start-up

The second dataset was collected during the start-up of the paste plant. After the start-up

procedure is completed (lasts a few hours), the paste plant is run in a very stable steady-

state operation until the next shut-down. This test was performed to verify the sensitivity of

the machine vision sensor to changes in the paste during transient operation. In total,

paste samples were collected at 20 sampling times during start-up in triplicate and were

subsequently imaged (60 paste images).

The first paste sample was collected as soon as there was enough paste flowing through the mixers onto the conveyor where sampling was performed. The time elapsed since the first sample was collected is provided in Table 26 for each sample. The sampling period was set to 4-5 minutes for the first hour (sampling period #1), when the paste quality improves the most due to the equipment heating up and the improvement in mixing. This was the fastest manual sampling rate that could be implemented using the current set-up. After one hour, the sampling period was extended to 15 minutes for 1.5 hours (sampling period #2). At that point the paste quality (i.e. density) is not yet in steady-state, but the changes are slower. Finally, four additional samples were collected after a 1.5 hour delay (i.e. after the start-up period, sampling period #3).

Table 26 – Sample number and elapsed time since the first start-up sample

Sample #   Elapsed time since S-U (h)
 1         0.00
 2         0.07
 3         0.13
 4         0.22
 5         0.27
 6         0.35
 7         0.42
 8         0.62
 9         0.72
10         0.87
11         0.97
12         1.23
13         1.48
14         1.75
15         2.05
16         2.32
17         3.80
18         4.33
19         4.67
20         4.98

As an example of the transient paste quality, the GAD is presented in Figure 77 a). There is a 0.02 g/cm3 increase in the first 30 minutes of the start-up. The numbers 1 to 3 correspond to the sampling periods and are used in Figure 77 and Figure 78.

For the paste plant start-up dataset, only the image features data matrix (X) could be used to analyze the changes in the image texture as the start-up progressed towards normal operation. Indeed, it was not possible to apply the data synchronization procedure for the first few samples because the residence times within the process units and the dead-times between them were changing during the start-up, as opposed to steady-state operation (e.g. the load of paste in the mixers increases to its normal level during start-up). Therefore, no Y data was available in this case and PCA was applied to model the feature matrix X instead of PLS. The statistics of the PCA model are presented in Table 27. Three components were found significant by cross-validation. This test was not performed to validate the precision of the sensor but to verify whether it was sensitive to transient operation. For this reason, the model is presented only for the average features of the three images per sampling time. In this case, 3 components explain 73% (in cross-validation) of the variance contained in the 42 image feature variables.

Table 27 – Statistics of the PCA model built on the paste plant start-up data

Model                Number of LV   R2X (%)   CVQ2X (%)
Replicate averages   3              86.21     73.18

Figure 77 b) shows a time series of the three components’ scores. The variations in the score values can be jointly interpreted with the GAD from Figure 77 a). The transient changes in the paste appearance and quality are captured by the imaging sensor. As previously discussed, the paste quality changes rapidly in the first 30 minutes of the start-up: the GAD increases rapidly (Figure 77 a), which is also visible in the increase of the score values for LV1 and LV2 (Figure 77 b). Then, the GAD and all three LVs have a period of higher variability (i.e. period #2). Finally, as the process reaches its steady-state, the GAD stabilises at the same time as LV1 and LV3.

Figure 77 – Time series of a 5h plant start-up period: a) GAD and b) scores of LV1, LV2 and LV3

[Figure not reproduced. Panel a): GAD (g/cm3) versus sampling time (0 to 5 h); panel b): scores of LV1, LV2 and LV3 versus sampling time (0 to 5 h); the three sampling periods are marked i, ii and iii.]

Based on visual assessment of the score trends, the scores of LV1 and LV3 capture most of the variations related to the evolution of the paste image features during the start-up process. The score space of these components (Figure 78 a) shows that early in the start-up (samples 1-6), the image textural features project in the negative t1 region identified by ellipse #1 and the variance between the samples is high. Then, the paste texture features transition towards the positive t1 region (square #2), where the variance between the samples, especially along t3, is smaller. The LV1 loading plot presented in Figure 78 b) is used to interpret the changes in paste appearance as the start-up progressed (dominant component). Positive t1 values are associated with lower energy in the high frequency levels, higher energy in the low frequency levels (5, 6 and 7) and also higher GLCM correlation features. This is an indication that the paste appearance becomes smoother as the start-up progresses and the process becomes more stable (i.e. the mixing is more homogeneous). The last four samples (17-20) are characterized by positive t1 and t3 values. These samples were taken a few hours after the start-up was completed (normal operation). The loadings for LV3 are not presented since the features could not be easily interpreted.

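Figure 78 – Scores and loadings of the PCA model built on the industrial paste start-up

data (averaged image replicates): a) LV1 and LV3 score plot and b) LV1 loadings plot

[Figure not reproduced. Panel a): t LV3 (13.05%) versus t LV1 (48.08%) for samples 1 to 20, with regions i, ii and iii marked; panel b): loadings p LV1 (48.08%) for the image feature variables (decomposition levels 1 to 7 for each feature type).]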

The motivation for using the imaging sensor during paste plant start-up is to monitor the

transient process operation and eventually to assess if the paste textural features are in

agreement with those obtained in normal operation. This could provide a real-time

indication that the paste plant has reached steady-state and is ready for anode production.

To achieve this, a region in the score space corresponding to paste textural features

obtained in normal operation could be established. Then, it would suffice to verify whether

the current values of the scores fall within the region or not.
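One possible way to implement such a check is sketched below. It assumes an axis-aligned elliptical region centred on the mean of the normal-operation scores, with half-axes equal to a chosen multiple of their standard deviations; the shape of the region, the multiplier, the function names and the score values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def fit_region(normal_scores, n_std=3.0):
    """Build an axis-aligned elliptical region from (t_a, t_b) score pairs
    obtained in normal operation: centre and half-axis lengths."""
    columns = list(zip(*normal_scores))
    centre = [mean(column) for column in columns]
    half_axes = [n_std * stdev(column) for column in columns]
    return centre, half_axes


def in_region(score, centre, half_axes):
    """True when the score pair lies inside or on the ellipse."""
    distance = sum(((s - c) / h) ** 2 for s, c, h in zip(score, centre, half_axes))
    return distance <= 1.0


# Hypothetical scores of paste sampled in normal operation.
normal = [(4.0, 1.0), (5.0, 2.0), (6.0, 3.0), (5.0, 1.0), (5.0, 3.0)]
centre, half_axes = fit_region(normal)

print(in_region((5.5, 2.5), centre, half_axes))   # late start-up sample
print(in_region((-8.0, 0.0), centre, half_axes))  # early start-up sample
```

An axis-aligned ellipse ignores any correlation between the two scores within the normal-operation region; a region based on a full covariance matrix would be an alternative.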

7.3.3 Industrial pitch optimization experiments

The last dataset was collected during five different experiments where the pitch ratio was varied. These five experiments were performed on five different days during an eight-month period in 2013-2014 in order to validate the sensitivity of the machine vision sensor to changes in pitch and dry aggregate formulation. Indeed, changes in aggregate formulation occurred during the experiments as part of a standard plant operating policy consisting of substituting butts by coarse coke particles (and vice-versa) to meet constraints on butts inventory. Some fluctuations in the percentage of intermediate and fines coke fractions also occurred between the different sampling campaigns. The variations in aggregate formulation and pitch ratio that took place in this set of experiments are shown in Figure 79. The fines fraction is usually kept at a constant ratio in normal operation. However, a step change of -2% was implemented in the experiment labelled E (see ellipses #1). The fines were substituted by coarse coke particles to maximise the change in the particle size distribution of the dry aggregate. In total, paste samples were collected in triplicate at 184 sampling times and were subsequently imaged (552 paste images).

Figure 79 – Changes in the formulation variables for the industrial dataset where pitch ratio

was varied. The five sampling campaigns are indicated by letters A-E.

[Figure not reproduced. Panel a): aggregate formulation (Coarse %, Fines %, Inter. % and Butts %) and panel b): pitch (%) versus paste samples (0 to 180); the five sampling campaigns are labelled A to E and ellipses #1 mark the step change in fines.]

In each experiment (A-E), the pitch percentage was manipulated by small increments of

0.1 % to 0.4 % around the nominal set-point. The latter was selected by the operators

based on the standard plant pitch adjustment policy. The measured pitch % for each

sample is presented in Figure 79 b). The changes made to pitch % set-points for each

experiment are presented in Table 28.

Table 28 – Changes implemented on pitch % set-point in the industrial pitch variations dataset

Experiment       # of pitch levels   Pitch levels (%)
A                5                   -0.2/+0.2
B                4                   -0.1/+0.2
C                4                   -0.2/+0.2
D                5                   -0.3/+0.3
E (high fines)   5                   -0.4/+0.3
E (low fines)    3                   -0.2/+0.3

The overall range of variations in all the experiments was less than 1%. This was deemed the maximum range allowed for maintaining safe process operation. In experiment E, two designs of experiments were implemented on pitch ratio for two different fines/coarse ratios during the same day. This was done to force a change in the pitch demand of the paste for the same raw material and operating conditions, which is difficult to achieve when the samples are collected on different days.

Different average pitch levels were maintained during each day of the sampling campaign, either because the operators adjusted the pitch ratio to changes in raw materials and in the dry aggregate formulation, or because it was changed on purpose to meet the objectives of the experimental program. However, within each day of experimentation, only the pitch ratio was manipulated. Therefore, this dataset contains variations in pitch that are both correlated with and orthogonal to the dry aggregate formulation.

Table 29 presents the correlation between the paste formulation variables. The correlation between the coarse and butts fractions is again high (i.e. -0.93), but the correlation between the pitch and butts is weaker than in the normal operation dataset (-0.52 instead of -0.75). This is explained by the independent pitch variations implemented in each experiment. It was extremely difficult to achieve a high signal-to-noise ratio for the changes made on pitch % due, on one hand, to the limited range of pitch variations and, on the other hand, to fluctuations in processing conditions, raw material quality and dry aggregate formulation, which may all have an effect on the pitch demand of the aggregate and on the paste visual appearance. Nevertheless, this dataset contains more excitation (controlled or not) in comparison with the normal operation dataset.

Table 29 – Correlation coefficients between the paste formulation variables for the experiments on pitch ratio

            Coarse %   Inter %   Fines %   Butts %   Pitch %
Coarse %      1.00      -0.01      0.10     -0.93      0.37
Inter %                  1.00      0.64      0.24     -0.30
Fines %                            1.00     -0.06     -0.11
Butts %                                      1.00     -0.52
Pitch %                                                1.00

Some properties of the coke and pitch used during the sampling campaigns are presented in Table 30. The coke apparent density (VBD) and real density as well as the pitch QI and viscosities at 160°C and 180°C are provided in this table. These values consist of weighted averages of the properties of each coke used in the blend (coke from at least two sources are typically blended at the ADQ plant). The coke and pitch materials for experiments A, B, and C come from the same suppliers, but different lots (i.e. ships and/or trains). The pitch used in experiments D and E was from a different supplier than that used in experiments A, B and C. Finally, one of the cokes in the blends used during experiments D and E is different for each of these blends.

Table 30 – Coke and pitch properties for each pitch variation experiment

                                    Experiments
Properties                          A        B        C        D        E
Coke VBD (-30/+50 mesh) (g/cm³)     0.90     0.89     0.90     0.88     0.91
Coke real density (g/cm³)           2.06     2.05     2.05     2.08     2.07
Coke blend                          1        1        1        2        3
Pitch QI (%)                        7.4      6.4      6.4      17.2     13.5
Pitch viscosity 160°C (cP)          1890.0   1730.0   1730.0   2080.9   1470.0
Pitch viscosity 180°C (cP)          525.0    452.0    452.0    641.9    442.0
Pitch supplier                      1        1        1        2        2

For three experiments (i.e. C, E low fines and E high fines), four anode core samples were collected from green anodes produced at each pitch level. This was done in an effort to quantitatively measure the paste quality and to find the real optimum pitch demand based on the BAD. This is similar to what was performed with the laboratory anodes, with the exception that a much smaller range of pitch variations was used in the industrial experiments (i.e. less than 1% versus 11% for the lab experiments). These core samples were baked in the same small scale furnace used in the laboratory experiments, after which BAD was measured. The baked core samples from experiments E low fines (E_LF) and E high fines (E_HF) were also sent to the ADQ laboratory to measure the BAD, electrical resistivity, compressive strength, CO2 reactivity and Young’s modulus. Hence, two BAD measurements were available for experiments E_HF and E_LF. The top 10 cm of the anode core was kept at the University and a first BAD measurement (labelled Top) was performed. The remaining length of each core sample (approximately 25 cm) was sent to the ADQ lab. The second BAD measurements (labelled Lab) were made by ADQ on a 13 cm long sub-sample.

The BAD data are presented in Figure 80 a) whereas the other properties measured in the

ADQ laboratory are presented in Figure 80 b) to e). The markers are the average of the

measurements for all core samples for each pitch level and the error bars represent ±

1 standard deviation. All the results have been mean centered to protect the confidentiality

of the results. In this figure, LF and HF correspond to the low fines and high fines

experiments E and Top and Lab correspond to the two different measurements on the

same core sample.

A few observations can be made from the BAD measurements presented in Figure 80 a).

First, in all three experiments, the BAD is positively correlated with the pitch ratio. This

means that the paste was on the dry side of the pitch optimum curves (Figure 4) for each

experiment. Secondly, there is a difference between the BAD measured from the top (Top)

part of the core sample and that of the laboratory (Lab) sample. This may be explained by

a greater surface porosity on the top sample and from measurement variability. Finally, the

optimum BAD could not be measured for any of the three experiments. For both

experiments C and E_LF, the BAD increases with the pitch %, therefore the optimum pitch

demand was not reached. For the E_HF dataset, the BAD also increases with pitch ratio

except for the last pitch level (+0.3%). However, the BAD values obtained at +0.2% and

+0.3% of pitch fall within the 1 std error bars of each other. A few additional pitch levels

(i.e. +0.4% and +0.5%) would have been necessary to validate that BAD effectively started

decreasing from the +0.3% pitch level.


Figure 80 – Baked anode core properties: a) BAD for experiments C and E, b) electrical

resistivity, c) compressive strength, d) CO2 reactivity residue (CRR) and e) Young’s

modulus

The main hurdle encountered during these experiments was the impossibility of obtaining

the BAD of the anodes in real-time. Thus it was not possible to know exactly when to stop

the experiment to obtain a full pitch optimisation curve where both under pitched and over

pitched anodes are fabricated. This is one of the reasons why it is important to develop a

method for measuring the OPD in real-time.

Unfortunately, the OPD could not be determined due to the lack of over pitched anodes. However, the laboratory measurements of the four anode properties (Figure 80 b-e) on the E datasets can be used to validate that the BAD and the other properties correlate well with each other. For these anodes, the maximum CO2 reactivity residue (CRR) and the minimum electrical resistivity are both obtained for the anodes with the maximum BAD. The Young’s modulus is an indication of the thermal shock resistance. It is a measure of inelasticity of the anodes and therefore must be minimized in order to minimize the strain due to thermal shock in the potroom. The relationship between the BAD and Young’s modulus is not as clear as with CRR and electrical resistivity. Furthermore, it seems that pitch ratio had no effect on compressive strength for the range of pitch % tested in these two experiments. Based on these results, it can be concluded that the quality of the anodes increased with pitch ratio and the use of BAD to assess the changes in pitch demand was deemed valid for these experiments. However, the implemented range of variations was too small to confirm what the optimum pitch levels were for each individual anode.

PLS regression models were computed between the image features (X) and the paste

formulation variables (Y). Again, the model is not intended to be used for predicting

formulation. Indeed, two pastes with the same pitch level, but formulated using a different

coke will very likely have different textural characteristics, and therefore poor prediction

performance of the Y data are expected from such a model. The objective of using PLS is

rather to extract a small number of orthogonal combinations of textural features that are

correlated with the various changes made to paste formulation variables and that allow

detecting what the optimal pitch demand is for a given coke source. Unfortunately, the

pitch demand of the paste could not be measured in any of the industrial experiments.
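The idea of extracting a few orthogonal combinations of features can be illustrated with a minimal NIPALS sketch. It is a simplification under stated assumptions: a single response (PLS1) instead of the five formulation variables, three made-up feature columns, and plain Python instead of the software actually used in this work. The function and variable names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(columns):
    """Mean-center each column (one list per variable)."""
    return [[v - sum(col) / len(col) for v in col] for col in columns]

def pls1_nipals(x_cols, y, n_components):
    """Single-response PLS by NIPALS; returns score and weight vectors."""
    x_cols = center(x_cols)
    y = center([y])[0]
    n, n_feat = len(y), len(x_cols)
    scores, weights = [], []
    for _ in range(n_components):
        # Weights: covariance of each feature with the current y residual.
        w = [dot(col, y) for col in x_cols]
        norm = math.sqrt(dot(w, w))
        w = [wi / norm for wi in w]
        # Scores: projection of every observation onto the weight vector.
        t = [sum(w[j] * x_cols[j][i] for j in range(n_feat)) for i in range(n)]
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in x_cols]  # X loadings
        q = dot(y, t) / tt                        # y loading
        # Deflation makes the next score vector orthogonal to this one.
        x_cols = [[x_cols[j][i] - t[i] * p[j] for i in range(n)] for j in range(n_feat)]
        y = [y[i] - q * t[i] for i in range(n)]
        scores.append(t)
        weights.append(w)
    return scores, weights

# Hypothetical data: three texture features for six paste samples, and pitch deviation.
features = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
            [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
            [0.5, 0.1, 0.9, 0.3, 0.2, 0.8]]
pitch = [-0.4, -0.2, 0.0, 0.1, 0.3, 0.2]

scores, weights = pls1_nipals(features, pitch, n_components=2)
print("t1 . t2 =", dot(scores[0], scores[1]))  # close to zero: orthogonal scores
```

The score vectors play the role of the LV scores discussed in this section, and the weights indicate which features drive each component.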

The statistics of the PLS models are presented in Table 31. Once again, the models capture

most of the variance in the features. Thus, the PLS scores and loading weights can be

used for the interpretation of the relationship between the paste image texture and the

variations in formulation. The averaged replicates (i.e. the three images per sampling time)

model has a very good percentage of variance explained in fit (73%) and in prediction

(58%, shown in Figure 81) for pitch ratio. This is a good performance for a model built from

industrial data. For comparison, the standard deviations of the formulation variables were 1.76 %, 0.24 %, 0.67 %, 1.94 % and 0.45 % for the coarse %, intermediate %, fines %, butts % and pitch %, respectively. In this case, the RMSEPCV values are lower than the dataset variability for all variables except the intermediate % and the fines %.
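The comparison between cross-validated error and dataset variability can be reproduced in a few lines. The sketch below uses the standard deviations quoted above and the RMSEPCV values of the averaged-replicates model from Table 31; the dictionary names are illustrative only.

```python
# Standard deviation of each formulation variable (values quoted in the text, in %).
std_dev = {"Coarse %": 1.76, "Intermediate %": 0.24, "Fines %": 0.67,
           "Butts %": 1.94, "Pitch %": 0.45}

# RMSEPCV of the averaged-replicates PLS model (Table 31, in %).
rmsep_cv = {"Coarse %": 1.49, "Intermediate %": 0.29, "Fines %": 0.83,
            "Butts %": 1.54, "Pitch %": 0.30}

# A ratio below 1 means the cross-validated error is smaller than the natural spread.
ratio = {name: rmsep_cv[name] / std_dev[name] for name in std_dev}
not_resolved = [name for name, r in ratio.items() if r >= 1.0]

for name, r in ratio.items():
    print(f"{name:15s} RMSEPCV / SD = {r:.2f}")
print("RMSEPCV not below SD for:", not_resolved)
```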


Table 31 – Statistics of the PLS model for the design of experiments on pitch ratio

The predicted versus measured pitch ratio (in fit) obtained from the averaged replicates

model is presented in Figure 81. This confirms the sensitivity of the paste image texture to

changes in pitch ratio.
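The difference between variance explained in fit (R2) and in prediction (Q2) can be sketched as follows. This is only an illustration under assumptions: a one-predictor least squares model stands in for the PLS model, leave-one-out is used as the cross-validation scheme, Q2 follows one common definition (1 - PRESS / total sum of squares), and the data are made up.

```python
def fit_line(x, y):
    """Ordinary least squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def r2_and_q2(x, y):
    """Return (R2 in fit, Q2 from leave-one-out cross-validation)."""
    n = len(y)
    my = sum(y) / n
    ss_tot = sum((b - my) ** 2 for b in y)
    slope, intercept = fit_line(x, y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    press = 0.0
    for i in range(n):
        # Refit without observation i, then predict the left-out value.
        s, c = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (s * x[i] + c)) ** 2
    return 1 - ss_res / ss_tot, 1 - press / ss_tot

# Hypothetical score values and pitch deviations (%), for illustration only.
t_scores = [-4.0, -2.5, -1.0, 0.5, 2.0, 3.5, 5.0]
pitch_dev = [-0.5, -0.2, -0.3, 0.1, 0.0, 0.4, 0.5]

r2, q2 = r2_and_q2(t_scores, pitch_dev)
print(f"R2 = {100 * r2:.1f} %, Q2 = {100 * q2:.1f} %")
```

Q2 is never larger than R2 in this setting, since each left-out observation is predicted by a model that has not seen it.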

Figure 81 – Predicted versus measured pitch ratio obtained using the PLS model built on

data collected during the design of experiments on pitch ratio (averaged replicates)

The PLS scores and loadings bi-plots for the averaged replicates data are presented in

Figure 82, Figure 83 and Figure 84. Only the first 3 components are discussed since the

last component does not improve the interpretation. This component seems to focus on

some low frequency information contained in level 7 that may be due to the manual

flattening of the paste in the aluminium containers. The variations captured by each

component can be interpreted from the Y loadings (Figure 82 b, Figure 83 b and Figure 84

b) and the scatter plots of the scores against paste formulation variables (Figure 82 c and

d and Figure 83 c).

The first component (Figure 82) captures two phenomena. First, it models the feedback

control strategy where pitch ratio is adjusted to attenuate the changes in pitch demand

introduced by fluctuations in the coarse coke and anode butts fractions as discussed

previously (i.e. limitations on butts inventory). The variations in formulation are illustrated in

Figure 79 a). The scatter plots of the LV1 scores against the percentage of the coarse

Model | Number of LV | R2X (%) | R2Y (%) | CV Q2Y (%) | Pitch % R2 (%) | Pitch % Q2 (%) | Coarse % RMSEPCV | Inter % RMSEPCV | Fines % RMSEPCV | Butts % RMSEPCV | Pitch % RMSEPCV
Replicate averages | 4 | 89.88 | 48.86 | 27.66 | 72.66 | 58.20 | 1.49 | 0.29 | 0.83 | 1.54 | 0.30
All samples | 4 | 84.87 | 42.42 | 24.39 | 62.12 | 48.30 | 1.51 | 0.28 | 0.80 | 1.57 | 0.33


fraction (Figure 82 c and d) also support this interpretation. Second, this component also

captures the designed experiments pitch variations. The variations in LV1 (Figure 82 a)

clearly match the overall trends in pitch percentage (Figure 79 b). The span of score values (LV1) for each coarse % set-point in Figure 82 c) is due to the pitch variations within each experiment. The linear variation of the scores as a function of the pitch % (Figure 82 d) is also an indication of the sensitivity of this component to the pitch variations due to the change in formulation and to the designed experiments.


the designed experiments on pitch ratio: a) LV1 scores, b) LV1 weights and loadings, c)

scatter plots of LV1 scores and coarse % and d) scatter plots of LV1 scores and pitch %

It is very difficult to interpret the texture feature loadings from Figure 82 b). The first model component captures both the variations in coarse/butts % with their associated pitch % adjustment and the designed pitch % experiments. First, as pitch and coarse percentages are increased, the energy decreases in almost all decomposition levels except for a small increase in levels 1 and 2. From previous results, an increase in pitch should increase the reflectivity of the paste and the energy of all the detail coefficients.

[Figure 82 panels: a) t LV1 (32.12 %) versus observations; b) w*q LV1 (32.12 %) versus variables (texture features at decomposition levels 1 to 7, then Coarse %, Fines %, Inter. %, Butts % and Pitch %); c) t LV1 versus coarse %; d) t LV1 versus pitch %; legend: A, B, C, D, E]

However, the decrease in butts and increase in coarse % also increases the pitch demand

and should have the opposite effect. As discussed using Figure 80, the samples in

experiments C and E were on the dry side of the optimum pitch demand so these energy

loadings suggest that the changes in formulation were not compensated completely by an increase in pitch %. Second, in this particular dataset, except for experiment E, the increase in coarse % was a response to the decrease in butts %. It was not compensated by the fines fraction as was previously done in the laboratory experiments. The main difference in the particle size distribution between the coarse and the butts fractions is the +3/8 inch particles. Only the butts fraction contains this particle size range (Table 5)

which falls in the level 7 detail coefficient sensitivity (Table 13). Figure 82 b) shows that

skewness and kurtosis increase in levels 1-5, but decrease in levels 6-7 which are more

sensitive to the particle size range of the butts fraction.

An interesting point to note in Figure 82 b) is that the samples from experiment D (solid black arrow) do not cluster in the same region as all the other samples (dashed black arrow). This indicates that the textural features captured by the LV1 component are different for the same pitch % compared to the other paste mixes. However, both clusters showed a positive correlation between pitch % and t1, which means that the same combination of features captures the pitch % but the pitch baseline is not the same. Variations in raw materials have an effect on the dry aggregate pitch demand. Two paste samples manufactured using different coke blends but with the same pitch % may have a different appearance. The shift for the experiment D samples shown in Figure 82 b) illustrates the point that trying to predict the pitch content of the paste using the images, as was attempted in the past, can lead to erroneous results. The models will not be robust to changes in raw materials. It is more important to determine what combinations of image textural features are sensitive to pitch demand and to build models that have the ability to predict the extent of the deviation from the OPD instead of trying to predict the absolute value of pitch ratio.

The second component also captures some of the variations in pitch ratio as shown by the

black dashed arrow in Figure 83 c). However, these variations are orthogonal to those

explained by LV1 (i.e. both components are orthogonal). In addition, the Y loadings

provided in Figure 83 b) show that pitch ratio in LV2 is negatively correlated with all the dry

aggregate formulation variables. This suggests that LV2 mainly captures the variations in

pitch ratio that are independent from the changes in dry aggregate formulation.



the designed experiments on pitch ratio: a) LV2 scores, b) LV2 weights and loadings and

c) scatter plots of LV2 scores and pitch %

Furthermore, the trends in the LV2 scores (Figure 83 a) seem related to the variations

introduced in pitch ratio as part of the experimental design (Table 28). This is particularly

clear for the first two days (red and green dots) where an optimum in LV2 clearly appears.

Whether or not these correspond to the optimum pitch demand for the dry aggregates

used in the formulation during the first two days cannot, however, be confirmed. In this

case, the pitch ratio is negatively correlated to the reflectivity of the paste since all energy

features decrease in opposition to the pitch increase. This seems opposite to what was

previously observed for the laboratory paste samples. However, this dataset also contains

changes in raw material blends, which were not explored in previous case studies. It can

be seen in Figure 83 c) that the main correlation between the pitch and the scores of LV2

is due to the difference between each cluster of experiments. However, the local variations

(i.e. black solid arrow) are almost orthogonal to the main variations. In each of these local

variations, the pitch is negatively correlated to the LV2 scores. Hence the energy

[Figure 83 panels: a) t LV2 (10.78 %) versus observations; b) w*q LV2 (10.78 %) versus variables (texture features at decomposition levels 1 to 7, then Coarse %, Fines %, Inter. %, Butts % and Pitch %); c) t LV2 versus pitch %; legend: A, B, C, D, E]

increases with the pitch % in the local variations within each different experiment. Changes

in raw materials may have also affected paste appearance.

The third component focuses on changes made to the fines % and pitch ratio as indicated

by the Y loadings of LV3 (Figure 84 b). The ellipses labeled #1 in Figure 79 a) and Figure 84 a) illustrate the step change implemented in fines % in experiment E and the impact on LV3, respectively. For this component, the loading weights of the energy features in the high frequency levels are negative and so is the fines % loading. This indicates a positive relationship between fines content and the energy features, that is, the energy increases in the high frequency bands when more fines are added to the paste. This is an indication that the reflectivity of the paste increases in the high frequencies due to the reflectivity of the fines, as was observed in the laboratory experiments. At the same time as

the pitch and fines decrease, the skewness and correlation in all levels and kurtosis in the

high frequency levels increase which is an indication of more homogeneity in the paste

texture.


the designed experiments on pitch ratio: a) LV3 scores and b) LV3 weights and loadings

Finally, the uncertainties in the scores of the PLS model are presented in Figure 85. The

one standard deviation error bars around the mean score values of the replicated samples

are used to assess reproducibility. As was observed with the normal operation dataset (section 7.3.1), the standard deviation of the replicates is smaller than the variability captured by the component for each LV of the model.
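The reproducibility check described above amounts to comparing the spread of the replicate scores with the spread between sampling times. A minimal sketch is given below, assuming three replicate images per sampling time; the score values and sample labels are hypothetical and are not taken from Figure 85.

```python
from statistics import mean, stdev

def replicate_summary(scores_by_sample):
    """Mean and one-standard-deviation spread of the replicate scores per sample."""
    return {k: (mean(v), stdev(v)) for k, v in scores_by_sample.items()}

# Hypothetical LV1 scores: three replicate images per sampling time.
lv1_scores = {
    "t01": [-4.1, -3.6, -4.4],
    "t02": [0.3, 0.9, 0.5],
    "t03": [5.2, 4.7, 5.6],
}
summary = replicate_summary(lv1_scores)

# Largest replicate spread versus the spread of the sample means along the component.
within = max(sd for _, sd in summary.values())
between = stdev([m for m, _ in summary.values()])
print(f"largest replicate SD = {within:.2f}, between-sample SD = {between:.2f}")
```

A component is considered reproducible in this sense when the replicate spread is clearly smaller than the between-sample spread.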

[Figure 84 panels: a) t LV3 (1.81 %) versus observations; b) w*q LV3 (1.81 %) versus variables (texture features at decomposition levels 1 to 7, then Coarse %, Fines %, Inter. %, Butts % and Pitch %); legend: A, B, C, D, E]

Figure 85 – Uncertainties in the scores of the PLS model (all samples) for the data obtained during the design of experiments on pitch ratio: a) LV1, b) LV2 and c) LV3. One standard deviation error bars of the scores are shown in the figure.

7.4 Joint modelling of image features and paste plant data using SMB-PLS

Previous work by Lauzon-Gauthier et al. (Lauzon-Gauthier 2011; Lauzon-Gauthier et al.

2012) demonstrated the possibility to use PLS modelling to predict baked anode properties

at the end of the baking cycle from raw material, paste plant and baking furnace data. An

update of this work is also presented in Appendix A. The proposed machine vision sensor

provides 42 new measurements (i.e. image features) that can be fused with the data

collected from the plant instrumentation in order to verify if the characterization of the

paste and the prediction of anode properties can be improved further.


The new Sequential Multi-block PLS algorithm (SMB-PLS) presented in Chapter 5 was

shown to improve interpretability of large industrial datasets consisting of multiple data

blocks. The new data block containing the paste image textural features adds to the

complexity of the already available data structure. Thus, the use of SMB-PLS is even more

important now that new blocks of data become available.

This section presents a joint analysis of raw material properties, paste plant data and paste image features using an SMB-PLS model. Both industrial datasets, collected during normal operation (7.3.1) and during the experimentation on pitch ratio (7.3.3), were used to build the model. The green anode density (GAD) was used as the single Y variable since it is the only on-line measurement available for anode quality. The baking furnace data were not included in the model since no baked anode properties were available in the datasets selected for building the SMB-PLS model.

Figure 86 – Data blocks and variables used in the SMB-PLS model for predicting GAD

The multi-block structure of the dataset used to build the SMB-PLS model is presented in

Figure 86. The number of variables within each block is provided in Table 32. The raw

materials block (Z) contains the coke density and particle size distribution and the pitch

physical properties. The impurities in the butts do not have an effect on the green density

and were not included in this model. The formulation block (X1) contains the paste formulation as well as the particle size distributions of the dry aggregate and of some key dry aggregate fractions. The X2 block discussed previously in Chapter 5 was split in two blocks

[Figure 86 diagram, blocks in sequence:
Z (raw materials): coke density, coke size distribution, pitch physical properties
X1 (classification of materials): aggregate size distribution (shift based), paste formulation
X2 (paste mixing): temperatures, mixing power, etc.
X3 (image features): DWT detail coefficients features
X4 (forming): bellows pressure, anode height
Y (anode block): GAD]

containing the paste mixing conditions (X2) and the forming variables (X4). The image

features (X3) were inserted in the dataset before the forming variables since the sampling

is done prior to the compaction. The prediction dataset (Y) consists of the GAD.

The statistics of the SMB-PLS model are presented in Table 32. The important information

from these statistics is not the high predictive ability of the GAD for these historical data,

but the very small (i.e. 3%) difference between the fit of the model (R2Y) and the prediction

ability (Q2Y). The number of components for each block was selected sequentially based

on the lowest RMSEP and 1% improvement of the Q2Y as described in Chapter 5. The

training dataset contains 2/3 of the observations (anodes) selected randomly from the

original dataset. The rest were used as the validation dataset.
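The random 2/3 and 1/3 partition of the observations can be sketched as follows. The number of anodes (150) and the seed are placeholders chosen for illustration; the actual dataset size is not restated here.

```python
import random

def split_two_thirds(n_obs, seed=0):
    """Randomly assign 2/3 of the observation indices to training, the rest to validation."""
    rng = random.Random(seed)  # fixed seed so the split can be reproduced
    indices = list(range(n_obs))
    rng.shuffle(indices)
    n_train = (2 * n_obs) // 3
    return sorted(indices[:n_train]), sorted(indices[n_train:])

# Hypothetical number of anodes.
train_idx, valid_idx = split_two_thirds(150)
print(len(train_idx), "training anodes,", len(valid_idx), "validation anodes")
```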

Table 32 – Statistics of the GAD SMB-PLS model

The main result shown in Table 32 is that the paste image features do not add information

for the prediction of the GAD since no component is required from this block to improve

the prediction of Y. This was expected since it was shown in previous work (Lauzon-

Gauthier et al. 2012) that GAD is well predicted by using only the routinely collected raw

material and paste plant data. However, if baked anode properties had been available in these experiments (e.g. anode electrical resistivity, baked anode density and mechanical properties), the image features would likely have provided additional information. Nevertheless, the fact that 77% of the variance in the features block (X3) falls in the space of the three previous blocks (Z, X1 and X2) deserves further discussion. The high degree of correlation between the image features and the raw materials and formulation blocks is very important for two reasons. First, this validates the sensitivity of the machine vision sensor to the raw material and formulation variations. Second, the raw material properties (Z) are only available as weekly averages and the aggregate size distributions (X1) are measured one to three times per 12 h shift. Since the paste image

Block | Number of variables | Number of LV | R2 (%) | Q2 (%)
Z | 15 | 5 | 96.66 | 96.33
X1 | 18 | 3 | 86.47 | 83.18
X2 | 17 | 2 | 83.93 | 81.82
X3 | 42 | 0 | 76.96 | 73.63
X4 | 2 | 1 | 83.93 | 79.40
Total X | 94 | 11 | 85.59 | 82.59
Total Y | 1 | 11 | 89.19 | 86.05

features are highly correlated with those infrequent measurements, this suggests the images may reveal changes in raw materials and aggregate size distribution in real time, which is a major advantage of the proposed sensor. Hence, the machine vision system could be used to monitor and control the paste plant and compensate for infrequently measured variables.
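The figures quoted from Table 32 can be cross-checked with a short script. The values below are transcribed from the table; the variable names are illustrative.

```python
# (block, number of variables, number of LV, R2 %, Q2 %) as listed in Table 32.
blocks = [
    ("Z", 15, 5, 96.66, 96.33),
    ("X1", 18, 3, 86.47, 83.18),
    ("X2", 17, 2, 83.93, 81.82),
    ("X3", 42, 0, 76.96, 73.63),
    ("X4", 2, 1, 83.93, 79.40),
]
r2y, q2y = 89.19, 86.05  # Total Y row

total_variables = sum(b[1] for b in blocks)
total_lv = sum(b[2] for b in blocks)
fit_prediction_gap = round(r2y - q2y, 2)
# Blocks contributing no component of their own to the prediction of Y.
unused_blocks = [b[0] for b in blocks if b[2] == 0]

print(total_variables, total_lv, fit_prediction_gap, unused_blocks)
```

The totals agree with the Total X row (94 variables, 11 components), the gap between fit and prediction is close to the 3% mentioned above, and X3 is the only block from which no component is drawn.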

The amount of variance of each block and GAD explained by the SMB-PLS model as well

as the relative contribution of the regressor blocks in each component are shown in Figure

87. The first observation is that most of the blocks contain some information about subsequent blocks. This indicates that the blocks are correlated with each other. A useful feature of the SMB-PLS algorithm is that it allows quick quantification of the amount of new information each data block adds to the model. This cannot be obtained by traditional multi-block PLS methods.

Figure 87 – Relative contribution (bars) of each regressor block in the SMB-PLS model.

The explained variance of each regressor block R2X (black lines) and of the Y block R2Y

(gray line) are also shown

Furthermore, it is possible that the image features would add information to a multivariate

statistical model of the paste plant data (i.e. raw material and operating conditions) if the

baked anode properties at the optimum pitch demand were used as the predicted variable

instead of the GAD. The laboratory results of the pitch demand experiment with the

laboratory formulation (section 6.4.3) showed sensitivity of the machine vision algorithm to

optimum pitch demand.

[Figure 87 plot: x-axis LV (Z-1 to Z-5, X1-1 to X1-3, X2-1, X2-2, X4-1); y-axis relative weights, block R2X and R2Y (0 to 1); legend: Z, X1, X2, X3, X4, R2 Z, R2 X1, R2 X2, R2 X3, R2 X4, R2 Y]

To complement the above discussion, some interpretation of the SMB-PLS model is now

provided using plots of the loading weights for the different data blocks. In particular, some

of the correlations between the image features block (X3) and the raw materials and

process blocks (Z, X1 and X2) are analyzed. The loading plots are shown in Figure 88 to

Figure 90.

Figure 88 – Loading weights of the raw material properties (Z) in component LV Z-1

The loading weights of the first LV of the raw material block are presented in Figure 88.

This component (Z-1) explains 46% of the variance of the GAD using only 15% of the

variance of the raw materials block. Note that the variance explained from the subsequent

blocks in this component is correlated with the information extracted from the first block in

the sequence (Z). This explains why only the Z block loadings are interpreted here. Coke

density and coke size distribution seem to be the main drivers for changes in GAD as

indicated by the loading weights (Figure 88).

[Figure 88 plot: w Z block, LV Z-1 (46.42 %), for the 15 raw material variables: Coke real dens, Coke 28/48 app dens, Coke RT 4, Coke RT 8, Coke RT 14, Coke RT 30, Coke RT 50, Coke RT 100, Coke RT 200, Pitch SP, Pitch TS, Pitch Beta, Pitch QI, Pitch CV, Pitch dist]

Figure 89 – Block weights of LV Z-2: a) raw material (Z) and b) image features (X3)

The loadings of the second raw material block component (Z-2) are presented in Figure

89. This component captures 4 % of the variance in GAD, and 24 % and 17 % of the

variance in the raw material and the image features blocks, respectively. The main drivers

in this latent variable are a combination of coke particle size distribution and, most

importantly, pitch properties as shown in Figure 89 a). The paste’s pitch demand is

positively correlated with the Pitch QI (Hulse 2000). For these experiments, this change in

QI and its effect on the paste is captured by the higher energy in the high frequency band

(1-3) and lower energy in the lower frequencies (4-7) (Figure 89 b). Also, the skewness, which is an indication of inhomogeneity, increases.

[Figure 89 panels: a) w Z block, LV Z-2 (3.66 %), for the 15 raw material variables (Coke real dens to Pitch dist); b) w X3 block, LV Z-2 (3.66 %), for the image features (decomposition levels 1 to 7 per feature)]

Figure 90 – Block weights for LV X1-2: a) formulation (X1) and b) image features (X3)

Finally, the loadings of the second component of the formulation block (LV X1-2) are presented in Figure 90. The X1-2 component captures an additional 4% of the variance in

the GAD, and 6% and 2% of the variance of the formulation and the image features

blocks, respectively. This LV focuses mostly on changes in the dry aggregate particle size

distribution. The loading of the coarse fraction is negative while those of the butts % and

pitch % are positive (Figure 90 a). This relationship between the three variables has been

well explained in this chapter and is again captured by the SMB-PLS model. The positive

correlation between the butts largest sizes (Rt 3/8 (+3/8) and Rt4 (-3/8/+4)) and the

amount of large particles (Rt 3/8) in the dry aggregate is also captured by this LV. This

component clearly models the increase in the coarseness of the anode paste and it is

correlated with an increase in the energy of the low frequency detail coefficients number 5

to 7 (Figure 90 b) which correspond to a detail size of 1.3 mm to 10.4 mm (i.e. the coarsest

fraction of the anode paste).
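The correspondence between decomposition levels and detail sizes can be sketched as follows. This is an illustration under two assumptions that are not restated in this section: each DWT level doubles the detail size, and levels 5 to 7 together cover the 1.3 mm to 10.4 mm range quoted above. The exact per-level mapping is the one given in Table 13.

```python
def detail_size_bands(levels, first_level, lower_bound_mm):
    """Size band (mm) per DWT level, assuming each level doubles the detail size."""
    bands = {}
    for level in levels:
        low = lower_bound_mm * 2 ** (level - first_level)
        bands[level] = (low, 2 * low)
    return bands

bands = detail_size_bands(range(5, 8), first_level=5, lower_bound_mm=1.3)
for level, (low, high) in bands.items():
    print(f"level {level}: {low:.1f} mm to {high:.1f} mm")

# The +3/8 inch butts particles mentioned earlier, converted to millimetres.
three_eighths_inch_mm = 3 / 8 * 25.4
print(f"3/8 inch = {three_eighths_inch_mm:.3f} mm")
```

Under these assumptions the 3/8 inch size falls inside the level 7 band, in line with the statement that the largest butts particles are seen by the level 7 detail coefficients.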

7.5 Conclusions

It was easier to develop the machine vision algorithm with laboratory paste and anodes

due to the more controlled and large span of formulation changes possible. The objective

of this thesis, however, is to apply this new measurement method on a real life industrial

[Figure 90 panels: a) w X1 block, LV X1-2 (4.27 %), for the 18 formulation variables: Paste (tph), Coarse %, Fines %, Inter. %, Butts %, Pitch %, Green recyc %, Fines rot valve speed, Butts Rt3/8+Rt4, Coarse Rt4, Coarse Rt8, Inter Rt50+Rt100, Fines Pt200, Agg Rt3/8, Agg Rt4@Rt30, Agg Rt50+Rt100, Agg Rt200+Pt200, Agg Pt200; b) w X3 block, LV X1-2 (4.27 %), for the image features (decomposition levels 1 to 7 per feature)]

anode manufacturing process. Therefore, it was necessary to test this sensor on real

industrial paste variations.

Industrial paste contains day to day variations in formulation, changes in raw materials and

other normal process variability. The objectives of the chapter were to validate the

sensitivity and robustness of the machine vision algorithm. Also, the understanding of the texture features gained from the laboratory experiments presented in Chapter 6 was used to interpret the texture variations of the industrial paste images. A second objective was to

verify if the paste image texture could add new information or complement existing

measurements. A multivariate statistical model of the paste plant using the SMB-PLS

algorithm developed in Chapter 5 was also built after fusion of plant and imaging sensor

data.

Three different datasets containing changes in paste formulation, different raw materials,

pitch design of experiments as well as a plant start-up transient operation were used to

test the sensitivity of the machine vision sensor to the industrial variability. The sensor was found sensitive to normal operation formulation variability. It could capture daily butts/coarse % variations and the associated change in pitch level due to the effect of the

dry aggregate formulation on pitch demand. It could also capture independent variations of

the intermediate %. The variations of pitch and formulation within each day of the

experiment could also be observed with the image features. In addition, the transient state

of the paste during the start-up was tracked by the image features and this suggests that

the sensor is sensitive to fast variations of the paste appearance.

Then, the sensitivity to the variations in pitch was tested with six pitch optimization

experiments. In this case, the dry aggregate formulation variations as well as the pitch

changes could be detected using the image features. The interpretation of the texture

features using the PLS model weights is not as straightforward as with the laboratory

samples. However, the energy was still found sensitive to the pitch % in the paste and the

skewness and kurtosis to changes in formulation.

In an effort to obtain a quantitative measurement of the optimum pitch demand, the BAD of anode core samples was measured on a few anodes spanning the range of tested pitch levels for three experiments. Green core samples of anodes were baked in controlled conditions to obtain a representative measurement of the BAD. Unfortunately, all the anodes measured were on the dry side of the optimum pitch demand and the range of pitch


variations was not large enough to reach the optimum. Additional work is needed to verify

if the machine vision is sensitive to the optimum pitch demand with the industrial paste.

Furthermore, the repeatability of the image features was studied using replicates of the paste samples. It was found that the range of variability from the replicates is smaller than

the process variations measured by each component of the statistical models for both the

normal operation and pitch variations experiments.

Finally, an SMB-PLS model combining the raw material, paste plant operating conditions and image features for the prediction of the GAD was presented. This was used to verify the correlation of the machine vision sensor with the raw material and formulation variables. The results have shown that the sensor could compensate for the lack of real-time raw material properties and particle size distribution data.


Chapter 8 Conclusions and recommendations

The most important aspect of anode quality regarding the performance of the aluminium reduction cells is its consistency. This has become a challenge in the last few years with the increasing variability of incoming raw material to the carbon plant due to the degrading crude oil quality and the increasing frequency of supplier changes and blending of different cokes. The lack of real-time quality monitoring and control of the green and baked anodes, as well as the lack of sensors for the most important raw material and process parameters (e.g. variations in particle size distribution, OPD, etc.), impairs the plant's ability to compensate for this increased variability.

The general objective of this thesis was to address some of the issues related to the lack of real-time control of the anode quality and the lack of fast and relevant measurements to cope with raw materials and process variability.

Based on multiple industrial machine vision applications, a non-destructive image texture

analysis algorithm was developed to track changes in the anode paste texture (visual

appearance) and eventually relate this information to the anode paste quality. The anode

paste was the focus of this machine vision application because it enables the fast

sampling of large proportions of paste compared to the use of images from formed anode

surfaces. Also, sampling the paste minimizes the delays if the measurements are to be

used in a feedback control strategy.

A new sequential multi-block PLS algorithm was also developed for improving the interpretation of existing empirical latent variable models of the baked anode quality, with the perspective of adding new real-time measurements from the carbon plant to this model in the future.

This chapter discusses the main conclusions and contributions from the specific objectives

and presents some future perspectives on the real-time quality control of the carbon plant.

8.2 Development of the machine vision sensor

The first specific objective was the development of a machine vision sensor using

laboratory scale paste and pressed anodes. This new sensor needed to be sensitive to

changes in formulation and in the pitch demand of the paste. The laboratory scale paste

was used since it enables more control on the paste formulation and a wider range of


variations. These datasets were also used to understand the relationship between the variations in the paste and the image texture features.

The anode paste appearance is very dark and has low contrast variations. In addition, the

relevant changes in the paste formulation affecting the paste surface appearance were

found to modify the size of the objects in the paste (finer/coarser), its degree of roughness

(rougher/smoother) and homogeneity. Hence, the use of image texture features seemed

the most appropriate approach to characterize the paste visual appearance variability.

Combinations of preprocessing, wavelet type, GLCM parameters and textural features were tested using a dataset of images coming from a pitch optimization experiment on two different cokes. The best preprocessing option was found to consist of using contrast enhancement of the images by adjusting the saturation of the extreme values of the pixel gray level intensity distribution. This removed some of the sample-to-sample illumination variations. It also improved the differentiation of the paste samples based on the optimum pitch demand. A set of six features computed from seven DWT decomposition levels (i.e. 42 features in total) was found to provide optimum prediction and clustering of the anodes at the optimum pitch demand (OPD). The optimum pastes were characterized by the concentration of the textural information in a particular frequency band (i.e. decomposition level #4) and resulted in similar texture feature combinations for both types of coke even if the optimum BAD and pitch % were not the same.
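The contrast enhancement step can be illustrated with a simple percentile stretch. This is a sketch of the general technique only: the 1 % and 99 % saturation limits, the 0 to 255 output range and the function name are assumptions made for the example and are not the settings reported in this work.

```python
def stretch_contrast(pixels, low_pct=1.0, high_pct=99.0):
    """Saturate gray levels outside two percentiles, then rescale linearly to 0..255."""
    ordered = sorted(pixels)
    n = len(ordered)
    lo = ordered[int(round((n - 1) * low_pct / 100.0))]
    hi = ordered[int(round((n - 1) * high_pct / 100.0))]
    if hi == lo:
        # Flat image: nothing to stretch.
        return [0 for _ in pixels]
    stretched = []
    for p in pixels:
        clipped = min(max(p, lo), hi)  # saturate the extreme values
        stretched.append(round(255 * (clipped - lo) / (hi - lo)))
    return stretched

# Hypothetical dark, low-contrast patch: gray levels between 20 and 60.
dark_patch = list(range(20, 61))
enhanced = stretch_contrast(dark_patch)
print(min(dark_patch), max(dark_patch), "->", min(enhanced), max(enhanced))
```

Because the limits are taken from each image's own intensity distribution, two images of the same paste taken under slightly different illumination are mapped to a similar gray level range before the texture features are computed.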

The pitch demand dataset was used to develop the algorithm, but the other two datasets were used to understand the relationship between the features and the paste variations. The first dataset included variations in pitch ratio and in fines ratio. An increase in either pitch or fines caused the paste to be more reflective. However, the pitch increased the

inhomogeneity of the paste in the high frequency details due to the increased specular

reflection of the pitch on the surface of the particles, but the fines had the opposite effect of

smoothing the paste image texture. Both phenomena could be captured by different latent

variables (components) of the PLS regression models between the image features and

formulation variables. The other dataset was designed to incorporate additional sources of

pitch demand variations such as change in formulation, mixing temperature, fines

fineness, etc. The interpretation of these results was consistent with the first dataset. An

increase in pitch had the same effect as a decrease in pitch demand (e.g. addition of low

porosity shot coke) and resulted in less homogeneity in the high frequencies.

The ability to detect the OPD for two different types of coke is a major contribution to the

characterization of the paste quality since it is reported here for the first time. The

understanding of the relationship between the process variations and the change in the

image texture is also an important contribution. In addition, the features used in the paste

quality sensor were already reported in the literature, but this particular combination of

features was not reported before. Finally, no reliable machine vision application on any

type of paste was ever reported prior to this work.

8.3 Sensitivity and robustness to industrial paste

The final objective of the project was to develop a sensor for real-time paste quality

monitoring and control. It is not enough to show that it is sensitive to laboratory paste

quality. It was thus important to test the sensitivity and robustness of the machine vision

sensor on real industrial samples.

Three different datasets containing normal operation variations, plant start-up and

designed pitch variations were collected from the Alcoa Deschambault smelter’s paste

plant. It was shown that the sensor could capture the pitch demand variations due to the

change in formulation in butts and coarse fraction and its effect on the pitch % (i.e. less

homogeneity). Also, independent pitch variations introduced by the formulation could be

captured separately by a different combination of features (i.e. smoother paste). The

sensor could also differentiate variations in the fines % and intermediate fractions using

different and orthogonal latent variables (i.e. they were driven by different combinations of

features).

The selection of the BAD to determine the optimum pitch demand of the dry aggregates

was validated using laboratory analysis based on four anode properties measured on

baked green core samples collected during two of the pitch variation experiments.

Finally, the major contribution from this work is the validation that it is possible to quantify

the variability in the paste plant from various process parameters using images of the

paste.

8.4 SMB-PLS algorithm

The development of the new sequential multi-block algorithm is a major fundamental

contribution in this thesis. The idea of this new method arose from the difficulty of interpreting

PLS models containing many variables (e.g. 100) from different process units for process

monitoring and troubleshooting. Multi-block PLS methods already exist in the literature and

some of them were developed more than 20 years ago in order to simplify the

interpretation of complex multi-block data sets. However, they have some drawbacks. The

mixing of information between the blocks and the absence of explicit consideration of the

correlations between the data blocks can both lead to misleading interpretation. In addition,

some multi-block methods simply remove the variations correlated between blocks, which

results in loss of information for interpretation purposes.

The most important improvement of the new algorithm is that correlated information

between a given block and subsequent ones is captured in the same latent variable space

as opposed to the orthogonal space which is captured by other components. When used

for the investigation of a process dataset, this feature highlights the effect of raw materials

on downstream process operating conditions and the effect of control loops between

variables from different blocks. Also, each new block in this sequential approach only

contains new information so these components focus the interpretation on the most

important parameters without interference from previous blocks. Another key feature is the

possibility to select a different number of components for each block, which is particularly

useful when the blocks have very different statistical ranks.
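
The idea that a later block contributes only new information can be illustrated by projecting that block onto the scores already extracted from an earlier block and keeping the residual. The sketch below shows this single projection step with NumPy. It is not the SMB-PLS algorithm itself, and the function name `split_block` and the variable names are hypothetical.

```python
import numpy as np


def split_block(X_next, T_prev):
    """Split a block into parts correlated with and orthogonal to prior scores.

    X_next: (n_obs, n_vars) data of the later block (assumed mean-centered).
    T_prev: (n_obs, n_comp) scores already extracted from an earlier block.
    Returns (correlated_part, orthogonal_part), which sum to X_next.
    """
    X_next = np.asarray(X_next, dtype=float)
    T_prev = np.asarray(T_prev, dtype=float)
    # Least-squares regression of each column of X_next on the previous scores.
    coef, *_ = np.linalg.lstsq(T_prev, X_next, rcond=None)
    correlated = T_prev @ coef
    orthogonal = X_next - correlated  # variation not explained by T_prev
    return correlated, orthogonal
```

Components fitted on the orthogonal part can then be interpreted without interference from the earlier block, while the correlated part shows how much of the later block is driven by upstream variations.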

The modeling performance and interpretation of the new SMB-PLS were illustrated using

two datasets. First, the simulated polymer film blowing process dataset contained two

different case studies, one without and the other with correlation between the blocks, that

were used to illustrate the pathway orthogonalization properties of the SMB-PLS. As

opposed to the MB-PLS algorithm, the correlated variations due to the feedback control

actions were captured by a different set of latent variables than the orthogonal variability.

Second, the anode manufacturing dataset was used to validate the new algorithm on a

real life dataset. It was again proven that the distribution of information contained in each

latent variable was different than with the MB-PLS algorithm in which it is not possible to

differentiate correlated and orthogonal information. Improvements in the interpretation

were also illustrated using the scores and block weights of the different algorithms using

the anode dataset.

Finally, the SMB-PLS algorithm was used to validate the correlation between the raw

material properties, paste plant operating conditions and the paste image features. The

dataset was used to predict the GAD. It was found that the image feature variations

correlated with the GAD were also correlated to raw material and process conditions,

validating the sensitivity of the sensor to paste plant variability. It was also used to

demonstrate the possible use of the SMB-PLS model once quantitative measurements of

the optimum pitch demand and baking furnace data become available.

8.5 Recommendations

8.5.1 Multivariate monitoring and control

The importance and use of multi-block PLS models will increase in the future because

databases continuously grow in size and complexity. For prediction only,

the usual PLS models are still the most effective and simple tools to use. However, for

troubleshooting and understanding, the multi-block models become more useful. As the

number of real-time measurements increases with the development of new non-destructive

sensors, access to good and reliable multi-block

methods will become increasingly important.

The focus of the presented multi-block results has been on interpretation, but many

more aspects of this algorithm remain to be tested:

• Fault detection ability compared to normal PLS and MB-PLS methods

• Using the orthogonalization for better selection of multivariate specifications on raw

material properties and process operating conditions

• Implementing block-based monitoring or control schemes in the latent variable

space

• Using SMB-PLS for other types of problems such as batch process analysis and

monitoring, where each batch phase could be regarded as a different block.

The effect of batch trajectory deviations in a specific block on the rest

of that batch could then be distinguished from the within-phase variations.

• Comparing different approaches for the selection of the number of latent variables.

A sequential approach was used in this thesis, but a more global approach could

be used by testing all possible combinations of the number of components for each

block.
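
The global approach mentioned in the last item amounts to an exhaustive search over the number of components per block. A minimal sketch follows, assuming a user-supplied function `cv_error` (hypothetical) that fits the multi-block model for a given combination and returns its cross-validated prediction error.

```python
from itertools import product


def best_component_combination(max_per_block, cv_error):
    """Try every combination of component counts and keep the lowest error.

    max_per_block: largest number of components to try for each block.
    cv_error: hypothetical callable mapping a tuple of counts to an error.
    """
    candidates = product(*(range(1, m + 1) for m in max_per_block))
    return min(candidates, key=cv_error)
```

The number of candidate models is the product of the per-block maxima (e.g. 3 × 2 × 4 = 24 fits for three blocks), which is why a sequential selection is cheaper when there are many blocks.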

For a monitoring strategy in the carbon plant, latent variable models should be

implemented to detect major changes in raw material properties, the coke fractions and

dry aggregate particle size distribution and in the combination of baked anode properties.

The current practice is to use univariate statistical process control (SPC) to detect

abnormal situations, which is time-consuming and inefficient due to the large number of

variables.
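
To illustrate how a single multivariate statistic can replace many univariate charts, the sketch below computes Hotelling's T² from the scores of a principal component model fitted on reference data. PCA is used here only as a simple stand-in for the latent variable models discussed in this thesis, and the function names are hypothetical.

```python
import numpy as np


def fit_pca_monitor(X_ref, n_components):
    """Fit a PCA model on autoscaled reference (normal operation) data."""
    X_ref = np.asarray(X_ref, dtype=float)
    mean = X_ref.mean(axis=0)
    std = X_ref.std(axis=0, ddof=1)
    Z = (X_ref - mean) / std
    # Rows of Vt are the loading vectors, ordered by explained variance.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T
    score_var = s[:n_components] ** 2 / (X_ref.shape[0] - 1)
    return mean, std, P, score_var


def hotelling_t2(X_new, model):
    """Hotelling's T2 of each row of X_new in the latent variable space."""
    mean, std, P, score_var = model
    T = ((np.asarray(X_new, dtype=float) - mean) / std) @ P
    return np.sum(T**2 / score_var, axis=1)
```

In practice the T² values would be compared with a control limit, typically derived from an F-distribution, and complemented by a chart of the model residuals.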

An additional research area that still has to be studied is the use of latent variable models in

optimization and control strategies. For example, a PLS model between all the paste plant

data (X) and the image features (Y) could be used to compute the necessary combination

of changes to make in X to compensate for a deviation measured in the features from the

target Y scores. This could be a good implementation of the SMB-PLS model because X

would be a mix of variables that cannot be manipulated and others that can be changed.
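
The compensation idea can be sketched as a small least-squares problem: given a linear model relating changes in X to changes in Y, find the smallest change in the manipulable X variables that offsets a measured deviation. The sketch below works directly with a coefficient matrix `B` rather than in the latent variable space, which is a simplifying assumption, and all names are hypothetical.

```python
import numpy as np


def compensating_move(B, y_dev, manipulable):
    """Change in X that offsets a deviation in Y, using manipulable variables only.

    B: (n_x, n_y) coefficients of an assumed linear model, y = x @ B.
    y_dev: (n_y,) measured deviation from target (measured minus target).
    manipulable: (n_x,) booleans, True where the X variable can be adjusted.
    """
    B = np.asarray(B, dtype=float)
    idx = np.flatnonzero(manipulable)
    # Solve B[idx].T @ dx_m = -y_dev in the least-squares sense; lstsq returns
    # the minimum-norm solution when several moves achieve the same effect.
    dx_m, *_ = np.linalg.lstsq(B[idx].T, -np.asarray(y_dev, dtype=float), rcond=None)
    dx = np.zeros(B.shape[0])
    dx[idx] = dx_m  # non-manipulable variables are left unchanged
    return dx
```

A practical implementation would also need constraints on the size of the moves and on the validity region of the model.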

8.5.2 Real-time paste quality measurement

The laboratory results have shown the possibility to measure the optimum pitch demand

for two different cokes. The sensor could not only differentiate between under-pitched

and over-pitched anodes, but the combination of image features was also similar for both

cokes at the optimum pitch demand. This indicates that there is an opportunity to use the

sensor in a feedback control system to adjust in real-time the amount of pitch in the paste

in response to raw material variability. Additional laboratory work is needed to test if the

paste texture features are similar at the OPD for a different formulation as well. If so, this

would enable control of the pitch % in response to changes in formulation. It is suggested

that plant trials be conducted to find the optimum pitch level based on a full pitch

optimization procedure. This is the only way to verify that the laboratory results (i.e.

detection of the OPD) can also be repeated in the industrial paste plant.

For industrial implementation, it is necessary to develop an automatic paste sampling and

imaging device. Manual sampling could be sufficient for an off-line industrial use, but an

automatic method with a fast sampling rate for each anode produced can enable the use

of the machine vision sensor for real-time monitoring or control. Then the machine vision

control scheme could be used for many applications:

1. Off-line monitoring during raw material changes

2. Off-line monitoring of designed experiment variations in pitch, formulation and

process operating conditions

3. On-line monitoring of target texture

4. On-line control of the pitch ratio in the paste

5. On-line multivariate control of both the formulation and pitch % in response to raw

material variations

The success of long term implementation of the sensor will depend more on the

robustness of the automatic sampling and imaging procedure than on the analysis

algorithm itself. Representative sampling and consistent and uniform illumination are some

of the challenges of industrial imaging applications. For anode paste machine vision,

the fumes emitted from the hot paste could be an issue for illumination and image

quality, as volatile compounds may stick to the lights and lens surfaces.

However, devices using compressed air already exist to keep the lens surface clean. Also,

if the images cannot be obtained directly from an existing conveyor, the

automatic paste sampler will need to be designed to be self-cleaning to avoid paste clogging.

Finally, the industrial samples collected show that the paste plant is operated on the dry

side of the optimum pitch demand. An effort should be made to quantify the cost of

consistently using anodes that are below the optimum. This would help build a case study

for the implementation of this machine vision sensor in real-time in the process. The

control of this optimum pitch demand is probably one of the most important tools needed

to face the current paste plant variability challenges.

Bibliography

Adams, A.N. et al., 2009. Personal Communication.

Adams, A.N., Coleman, D.E. & Blake, R.A., 2007. Personal Communication.

Adams, A.N., Mathews, J.P. & Schobert, H.H., 2002. The Use of Image Analysis for the Optimization of Pre-Baked Anode Formulation. In Light Metals 2002. TMS, pp. 547–552.

Antonini, M. et al., 1992. Image coding using wavelet transform. IEEE Transactions on Image Processing, 1(2), pp.205–220.

Azari Dorcheh, K., 2013. Investigation of the materials and paste relationships to improve forming process and anode quality. Ph.D. Thesis. Université Laval.

Azari, K. et al., 2013. Mixing variables for prebaked anodes used in aluminum production. Powder Technology, 235, pp.341–348.

Baron, J.T., McKinney, S.A. & Wombles, R.H., 2009. Coal tar pitch – past, present, and future. In Light Metals 2009. TMS, pp. 935–939.

Belitskus, D., 1993. An Evaluation of Relative Effects of Coke, Formulation, and Baking Factors on Aluminum Reduction Cell Anode Performance. Light Metals 1993, pp.677–681.

Belitskus, D., 1981. Effect of carbon recycle materials on properties of bench scale prebaked anodes for aluminum smelting. Metallurgical Transactions B, 12, pp.135–139.

Belitskus, D., 1978. Effects of Coke and Formulation Variables on Fracture of Bench Scale Prebaked Anodes for Aluminum Smelting. Metallurgical Transactions B, 9, pp.705–710.

Belitskus, D., 2013. Effects of Mixing Variables and Mold Temperature on Prebaked Anode Quality. In A. Tomsett & J. Johnson, eds. Essential Readings in Light Metals. John Wiley & Sons, Inc., pp. 328–332.

Belitskus, D. & Danka, D.J., 1988. The effects of petroleum coke properties on carbon anode quality. JOM, 40(11), pp.28–29.

Bharati, M.H., Liu, J.J. & MacGregor, J.F., 2004. Image texture analysis: methods and comparisons. Chemometrics and intelligent laboratory systems, 72(1), pp.57–71.

Bruno, L., Parla, G. & Celauro, C., 2012. Image analysis for detecting aggregate gradation in asphalt mixture from planar images. Construction and Building Materials, 28(1), pp.21–30.

Burnham, A.J., MacGregor, J.F. & Viveros, R., 1999. Latent variable multivariate regression modeling. Chemometrics and Intelligent Laboratory Systems, 48(2), pp.167–180.

Burnham, A.J., Viveros, R. & MacGregor, J.F., 1996. Frameworks for latent variable multivariate regression. Journal of Chemometrics, 10(1), pp.31–45.

Charmier, F., Martin, O. & Gariepy, R., 2015. Development of the AP Technology Through Time. JOM, 67(2), pp.336–341.

Chen, S.S., Keller, J.M. & Crownover, R.M., 1993. On the calculation of fractal features from images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 15(10), pp.1087–1090.

Chui, C.K., 1992. An Introduction to Wavelets, Academic Press.

Clausi, D.A., 2002. An analysis of co-occurrence texture statistics as a function of grey level quantization. Canadian Journal of Remote Sensing, 28(1), pp.45–62.

Cross, G.R. & Jain, A.K., 1983. Markov Random Field Texture Models. Pattern Analysis and Machine Intelligence, IEEE Transactions on, PAMI-5(1), pp.25–39.

Dayal, B.S. & MacGregor, J.F., 1997. Improved PLS algorithms. Journal of Chemometrics, 11(1), pp.73–85.

Debnath, L. & Shah, F.A., 2015. Wavelet Transforms and Their Applications, Boston, MA: Birkhäuser Boston.

Dequiedt, A.-S. et al., 2001. Study of phase dispersion in concrete by image analysis. Cement and Concrete Composites, 23(2–3), pp.215–226.

Duchesne, C., 2000. Improvement of processes and product quality through multivariate data analysis. Ph.D. Thesis. McMaster University.

Duchesne, C., 2010. Multivariate Image Analysis in Mineral Processing. In D. Sbárbaro & R. del Villar, eds. Advanced Control and Supervision of Mineral Processing Plants. Advances in Industrial Control. Springer London, pp. 85–142.

Duchesne, C., Liu, J.J. & MacGregor, J.F., 2012. Multivariate image analysis in the process industries: A review. Chemometrics and Intelligent Laboratory Systems, 117, pp.116–128.

Duchesne, C. & MacGregor, J.F., 2004. Establishing Multivariate Specification Regions for Incoming Materials. Journal of Quality Technology, 36(1), pp.78–94.

Duchesne, C. & MacGregor, J.F., 2001. Jackknife and bootstrap methods in the identification of dynamic models. Journal of Process Control, 11(5), pp.553–564.

Edwards, L. et al., 2012. Evolution of anode grade coke quality. In Light Metals 2012. TMS.

Edwards, L. et al., 2009. Use of shot coke as an anode raw material. In Light Metals 2009. TMS, pp. 985–990.

Eilertsen, J.L. et al., 1996. An automatic image analysis of coke texture. Carbon, 34(3), pp.375–385.

Eriksson, L. et al., 2001. Multi- and megavariate data analysis: principles and applications.

Facco, P. et al., 2010. Automatic characterization of nanofiber assemblies by image texture analysis. Chemometrics and Intelligent Laboratory Systems, 103(1), pp.66–75.

Facco, P. et al., 2009. Monitoring roughness and edge shape on semiconductors through multiresolution and multivariate image analysis. AIChE Journal, 55(5), pp.1147–1160.

Fischer, W.K. et al., 1995. Anodes for the aluminium industry, R & D Carbon Ltd.

Fischer, W.K. et al., 1993. Baking Parameters and the Resulting Anode Quality. In Light Metals 1993. TMS, pp. 683–694.

Fischer, W.K. & Perruchoud, R.C., 1985. Influence of Coke Calcining Parameters on Petroleum Coke Quality. In Light Metals 1985. TMS, pp. 811–826.

Fischer, W.K. & Perruchoud, R.C., 1991. Interdependence Between Properties of Anode Butts and Quality of Prebaked Anodes. In Light Metals 1991. TMS, pp. 721–724.

Galloway, M.M., 1975. Texture analysis using gray level run lengths. Computer Graphics and Image Processing, 4(2), pp.172–179.

García-Muñoz, S. & Carmody, A., 2010. Multivariate wavelet texture analysis for pharmaceutical solid product characterization. International Journal of Pharmaceutics, 398(1-2), pp.97–106.

Geladi, P. & Kowalski, B.R., 1986. Partial least-squares regression: a tutorial. Analytica Chimica Acta, 185, pp.1–17.

Gonzalez, R.C. & Woods, R.E., 2008. Digital Image Processing, Prentice Hall.

Gosselin, R. et al., 2009. Potential of Hyperspectral Imaging for Quality Control of Polymer Blend Films. Industrial & Engineering Chemistry Research, 48(6), pp.3033–3042.

Gosselin, R., Duchesne, C. & Rodrigue, D., 2008. On the characterization of polymer powders mixing dynamics by texture analysis. Powder Technology, 183(2), pp.177–188.

Grahn, H. & Geladi, P., 2007. Techniques and applications of hyperspectral image analysis, Chichester England; Hoboken NJ: J. Wiley.

Grégoire, F., Gosselin, L. & Alamdari, H., 2013. Sensitivity of Carbon Anode Baking Model Outputs to Kinetic Parameters Describing Pitch Pyrolysis. Industrial & Engineering Chemistry Research, 52(12), pp.4465–4474.

Grjotheim, K. & Kvande, H., 1993. Introduction to Aluminium Electrolysis: Understanding the Hall-Héroult Process 2nd ed., Düsseldorf, Germany: Aluminium-Verlag.

Hanafi, M. et al., 2006. Common components and specific weight analysis and multiple co-inertia analysis applied to the coupling of several measurement techniques. Journal of Chemometrics, 20(5), pp.172–183.

Haralick, R.M., Shanmugam, K. & Dinstein, I., 1973. Textural Features for Image Classification. IEEE Transactions on Systems, Man and Cybernetics, 3(6), pp.610–621.

Hassani, S. et al., 2012. Model validation and error estimation in multi-block partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 117, pp.42–53.

Höskuldsson, A., 2008. Multi-block and path modelling procedures. Journal of Chemometrics, 22(11-12), pp.571–579.

Höskuldsson, A., 2014. Path regression models and process control optimisation. Journal of Chemometrics, 28(4), pp.235–248.

Höskuldsson, A., 1988. PLS regression methods. Journal of Chemometrics, 2(3), pp.211–228.

Höskuldsson, A. & Svinning, K., 2006. Modelling of multi-block data. Journal of Chemometrics, 20(8-10), pp.376–385.

Hulse, K.L., 2000. Anode manufacture: raw materials, formulation and processing parameters, Sierre, Switzerland: R&D Carbon Ltd.

Jentoftsen, T. et al., 2009. Correlation between anode properties and cell performance. In Light Metals. pp. 301–304.

Jones, S.S., 1986. Anode-Carbon Usage in the Aluminum Industry. In J. D. Bacha, J. W. Newman, & J. L. White, eds. Petroleum-Derived Carbons. Washington, DC: American Chemical Society, pp. 234–250.

Jørgensen, K. et al., 2004. A comparison of methods for analysing regression models with both spectral and designed variables. Journal of Chemometrics, 18(10), pp.451–464.

Jørgensen, K., Mevik, B.-H. & Næs, T., 2007. Combining designed experiments with several blocks of spectroscopic data. Chemometrics and Intelligent Laboratory Systems, 88(2), pp.154–166.

Jørgensen, K. & Næs, T., 2008. The use of LS–PLS for improved understanding, monitoring and prediction of cheese processing. Chemometrics and Intelligent Laboratory Systems, 93(1), pp.11–19.

Keller, F. & Sulger, P.O., 2008. Anode Baking: Baking of Anodes for the Aluminum Industry 2nd ed., Sierre, Switzerland: R&D Carbon Ltd.

Kohonen, J. et al., 2008. Multi-block methods in multivariate process control. Journal of Chemometrics, 22(3-4), pp.281–287.

Kourti, T., 2005. Application of latent variable methods to process control and multivariate statistical process control in industry. International Journal of Adaptive Control and Signal Processing, 19(4), pp.213–246.

Kourti, T. & MacGregor, J.F., 1995. Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemometrics and Intelligent Laboratory Systems, 28(1), pp.3–21.

Kourti, T., Nomikos, P. & MacGregor, J.F., 1995. Analysis, monitoring and fault diagnosis of batch processes using multiblock and multiway PLS. Journal of Process Control, 5(4), pp.277–284.

Lauzon-Gauthier, J., 2011. Multivariate latent variable modelling of the pre-baked anode manufacturing process used in aluminum smelting. M.Sc. Thesis. Québec city, Canada: Laval University.

Lauzon-Gauthier, J., Duchesne, C. & Tessier, J., 2012. A Database Approach for Predicting and Monitoring Baked Anode Properties. JOM Journal of the Minerals, Metals and Materials Society, 64(11), pp.1334–1342.

Lauzon-Gauthier, J., Duchesne, C. & Tessier, J., 2013. Diagnosing Changes in Baked Anode Properties using a Multivariate Data-Driven Approach. In B. A. Sadler, ed. Light Metals 2013. John Wiley & Sons, Inc., pp. 1219–1223.

Lauzon-Gauthier, J., Duchesne, C. & Tessier, J., 2014. Texture Analysis of Anode Paste Images. In J. Grandfield, ed. Light Metals 2014. John Wiley & Sons, Inc., pp. 1123–1126.

Liu, J., 2005. Machine Vision for Process Industries: Monitoring, Control, and Optimization of Visual Quality of Processes and Products. Hamilton, Ont., Canada: McMaster University.

Liu, J.J. et al., 2005. Flotation froth monitoring using multiresolutional multivariate image analysis. Minerals Engineering, 18(1), pp.65–76.

Liu, J.J. & Han, C., 2011. Wavelet texture analysis in process industries. Korean Journal of Chemical Engineering, 28(9), pp.1814–1823.

Liu, J.J. & MacGregor, J.F., 2006. Estimation and monitoring of product aesthetics: application to manufacturing of “engineered stone” countertops. Machine Vision and Applications, 16(6), pp.374–383.

Liu, J.J. & MacGregor, J.F., 2008. Froth-based modeling and control of flotation processes. Minerals Engineering, 21(9), pp.642–651.

Liu, J.J. & MacGregor, J.F., 2005. Modeling and Optimization of Product Appearance: Application to Injection-Molded Plastic Panels. Industrial & Engineering Chemistry Research, 44(13), pp.4687–4696.

Liu, J.J. & MacGregor, J.F., 2007. On the extraction of spectral and spatial information from images. Chemometrics and intelligent laboratory systems, 85(1), pp.119–130.

Livens, S. et al., 1997. Wavelets for texture analysis, an overview. In Sixth International Conference on Image Processing and Its Applications, 1997. pp. 581–585 vol.2.

MacGregor, J.F. et al., 1994. Process monitoring and diagnosis by multiblock PLS methods. AIChE Journal, 40(5), pp.826–838.

MacGregor, J.F. & Kourti, T., 1995. Statistical process control of multivariate processes. Control Engineering Practice, 3(3), pp.403–414.

Maillard, P., 2003. Comparing Texture Analysis Methods through Classification. Photogrammetric Engineering & Remote Sensing, 69(4), pp.357–367.

Mallat, S.G., 1989. A theory for multiresolution signal decomposition: the wavelet representation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 11(7), pp.674–693.

Mannweiler, U., Fischer, W.K. & Perruchoud, R.C., 2009. Carbon products: A major concern to aluminum smelters. In Light Metals 2009. TMS, pp. 909–911.

Mannweiler, U. & Keller, F., 1994. The design of a new anode technology for the aluminium industry. JOM, (46), pp.15–21.

Martens, H., 2001. Reliable and relevant modelling of real world data: a personal account of the development of PLS Regression. Chemometrics and Intelligent Laboratory Systems, 58(2), pp.85–95.

Materka, A., Strzelecki, M. & others, 1998. Texture analysis methods–a review. Technical university of lodz, institute of electronics, COST B11 report, Brussels, pp.9–11.

McHenry, E.R., Baron, J.T. & Krupinski, K.C., 1998. Development of Anode Binder Pitch Laboratory Characterization Methods. Light Metals, pp.769–778.

Menichelli, E. et al., 2014. SO-PLS as an exploratory tool for path modelling. Food Quality and Preference, 36, pp.122–134.

Næs, T. et al., 2011. Path modelling by sequential PLS regression. Journal of Chemometrics, 25(1), pp.28–40.

Nomikos, P. & MacGregor, J.F., 1995. Multivariate SPC Charts for Monitoring Batch Processes. Technometrics, 37(1), pp.41–59.

Prasad, L. & Iyengar, S.S., 1997. Wavelet Analysis with Applications to Image Processing, CRC Press.

Prats-Montalbán, J.M. et al., 2009. Prediction of skin quality properties by different Multivariate Image Analysis methodologies. Chemometrics and Intelligent Laboratory Systems, 96(1), pp.6–13.

Prats-Montalbán, J.M., de Juan, A. & Ferrer, A., 2011. Multivariate image analysis: A review with applications. Chemometrics and Intelligent Laboratory Systems, 107(1), pp.1–23.

Reis, M.S. & Bauer, A., 2010. Image-based classification of paper surface quality using wavelet texture analysis. Computers & Chemical Engineering, 34(12), pp.2014–2021.

Reis, M.S. & Bauer, A., 2009. Wavelet texture analysis of on-line acquired images for paper formation assessment and monitoring. Chemometrics and Intelligent Laboratory Systems, 95(2), pp.129–137.

Rioul, O. & Vetterli, M., 1991. Wavelets and signal processing. IEEE signal processing magazine, pp.14–38.

Rorvik, S., Ratvik, A.P. & Foosnaes, T., 2006. Characterisation of green anode materials by image analysis. Light metals, pp.553–558.

Sadler, B.A., 2012. Diagnosing Anode Quality Problems Using Optical Macroscopy. In Light Metals 2012. TMS, pp. 1289–1292.

Sarkar, T.K. et al., 1998. A tutorial on wavelets from an electrical engineering perspective. I. Discrete wavelet techniques. IEEE Antennas and Propagation Magazine, 40(5), pp.49–68.

Scheunders, P. et al., 1997. Wavelet-based Texture Analysis. Int. Journal of Computer Science and Information Management, Special issue on Image Processing (IJCSIM, 1.

Selvan, S. & Ramakrishnan, S., 2007. SVD-Based Modeling for Image Texture Classification Using Wavelet Transformation. IEEE Transactions on Image Processing, 16(11), pp.2688–2696.

Sinclair, K.A. & Sadler, B.A., 2006. Improving carbon plant operations through the better use of data. In Light Metals 2006. TMS, pp. 577–582.

Sinclair, K.A. & Sadler, B.A., 2009. Which strategy to use when sampling anodes for coring and analysis? Start with how the data will be used. In Light Metals 2009. TMS, pp. 1037–1041.

Smilde, A.K., Westerhuis, J.A. & de Jong, S., 2003. A framework for sequential multiblock component methods. Journal of Chemometrics, 17(6), pp.323–337.

Soh, L.-K. & Tsatsoulis, C., 1999. Texture analysis of SAR sea ice imagery using gray level co-occurrence matrices. IEEE Transactions on Geoscience and Remote Sensing, 37(2), pp.780–795.

Sonka, M., Hlavac, V. & Boyle, R., 2008. Image processing, analysis, and machine vision 3rd ed., Thomson Learning.

Srinivasan, G.N. & Shobha, G., 2008. Statistical texture analysis. In Proceedings of world academy of science, engineering and technology. pp. 1264–1269.

Stark, H.-G., 2005. Wavelets and signal processing: an application-based introduction, Berlin ; New York: Springer.

Sun, C. & Wee, W.G., 1983. Neighboring gray level dependence matrix for texture classification. Computer Vision, Graphics, and Image Processing, 23(3), pp.341–352.

Tabereaux, A., 2000. Prebake cell technology: A global review. JOM, 52(2), pp.23–29.

Tessier, J. et al., 2008. Estimation of alumina content of anode cover materials using multivariate image analysis techniques. Chemical Engineering Science, 63(5), pp.1370–1380.

Tessier, J., Duchesne, C., Tarcy, G.P., et al., 2011. Multivariate Analysis and Monitoring of the Performance of Aluminum Reduction Cells. Industrial & Engineering Chemistry Research, 51(3), pp.1311–1323.

Tessier, J., Duchesne, C. & Bartolacci, G., 2007. A machine vision approach to on-line estimation of run-of-mine ore composition on conveyor belts. Minerals Engineering, 20(12), pp.1129–1144.

Tessier, J., Duchesne, C. & Tarcy, G.P., 2011. Multiblock Monitoring of Aluminum Reduction Cells Performance. In S. J. Lindsay, ed. Light Metals 2011. John Wiley & Sons, Inc., pp. 407–412.

Usevitch, B.E., 2001. A tutorial on modern lossy wavelet image compression: foundations of JPEG 2000. Signal Processing Magazine, IEEE, 18(5), pp.22–35.

Valle, S., Li, W. & Qin, S.J., 1999. Selection of the Number of Principal Components: The Variance of the Reconstruction Error Criterion with a Comparison to Other Methods. Industrial & Engineering Chemistry Research, 38(11), pp.4389–4401.

Van de Wouwer, G., Scheunders, P. & Van Dyck, D., 1999. Statistical texture characterization from discrete wavelet representations. IEEE Transactions on Image Processing, 8(4), pp.592–598.

Vitchus, B., Cannova, F. & Childs, H., 2013. Calcined Coke from Crude Oil to Customer Silo. In A. Tomsett & J. Johnson, eds. Essential Readings in Light Metals. John Wiley & Sons, Inc., pp. 1–10.

Wangen, L.E. & Kowalski, B.R., 1989. A multiblock partial least squares algorithm for investigating complex chemical systems. Journal of Chemometrics, 3(1), pp.3–20.

Wavelet Toolbox Documentation, 2015. Wavelet Toolbox Documentation. Mathworks.com. Available at: http://www.mathworks.com/help/wavelet/index.html [Accessed June 28, 2015].

Westerhuis, J.A. & Coenegracht, P.M.J., 1997. Multivariate modelling of the pharmaceutical two-step process of wet granulation and tableting with multiblock partial least squares. Journal of Chemometrics, 11(5), pp.379–392.

Westerhuis, J.A., Gurden, S.P. & Smilde, A.K., 2000. Generalized contribution plots in multivariate statistical process monitoring. Chemometrics and Intelligent Laboratory Systems, 51(1), pp.95–114.

Westerhuis, J.A., Kourti, T. & MacGregor, J.F., 1998. Analysis of multiblock and hierarchical PCA and PLS models. Journal of Chemometrics, 12(5), pp.301–321.

Westerhuis, J.A. & Smilde, A.K., 2001. Deflation in multiblock PLS. Journal of Chemometrics, 15(5), pp.485–493.

Wise, B.M. & Gallagher, N.B., 1996. The process chemometrics approach to process monitoring and fault detection. Journal of Process Control, 6(6), pp.329–348.

Wold, S., 1995. Chemometrics; what do we mean with it, and what do we want from it? Chemometrics and Intelligent Laboratory Systems, 30(1), pp.109–115.

Wold, S., 1978. Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models. Technometrics, 20(4), pp.397–405.

Wold, S., Trygg, J., et al., 2001. Some recent developments in PLS modeling. Chemometrics and Intelligent Laboratory Systems, 58(2), pp.131–150.

Wold, S., Esbensen, K. & Geladi, P., 1987. Principal component analysis. Chemometrics and Intelligent Laboratory Systems, 2(1-3), pp.37–52.

Wold, S., Sjöström, M. & Eriksson, L., 2001. PLS-regression: a basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 58(2), pp.109–130.

Yousefian-Jazi, A. et al., 2014. Decision support in machine vision system for monitoring of TFT-LCD glass substrates manufacturing. Journal of Process Control, 24(6), pp.1015–1023.

Yue, Z.Q. & Morin, I., 1996. Digital image processing for aggregate orientation in asphalt concrete mixtures. Canadian Journal of Civil Engineering, 23(2), pp.480–489.

Zhang, J., Wang, X. & Palmer, S., 2007. Objective Grading of Fabric Pilling with Wavelet Texture Analysis. Textile Research Journal, 77(11), pp.871–879.


Appendix A Update of the anode properties prediction model

Previous work presented a PLS model used for the prediction of baked anode properties (Lauzon-Gauthier 2011; Lauzon-Gauthier et al. 2012) and for the investigation of process deviations (Lauzon-Gauthier et al. 2013). Updated results are presented in this appendix to discuss the robustness of this monitoring approach on a long-term (i.e. six-year) industrial dataset.

The model used as a basis for comparison was computed from the dataset presented in JOM (Lauzon-Gauthier et al. 2012). This dataset contained 708 and 375 anodes in the training and validation sets, respectively, and spanned the period from February 2009 to December 2011. Since then, 965 new anodes have been collected, up to July 2014. For this PLS model, 88 X variables are used instead of the 92 in the reported results. One coke property and one process variable stopped being measured. Also, because the other dimensions of the anode are fixed, the anode height is almost a direct measurement of the GAD, which is in the Y dataset. This variable was therefore removed to verify the ability to predict the GAD without it. The variables in the Y dataset are the same as in the physical properties model in JOM.

A PLS model was computed on the training set (2/3 of the original dataset), and the number of latent variables (LV) was chosen by cross-validation on 10 random subsets of observations. Nine components were chosen because this number minimized the RMSEPCV of most variables. The validation set (1/3 of the original dataset) was used to compute the prediction performance. Table 33 presents the fit (R²Y) and the prediction performance in cross-validation (Q²Y CV), for the validation set (Q²Y Pred original data) and for new data (Q²Y Pred new data), for each variable and overall.
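The selection of the number of latent variables by cross-validation can be illustrated with a minimal sketch. It is not the implementation used for the results above: it assumes a single Y variable (PLS1 via NIPALS) rather than the full multi-response model, applies mean-centring only, and defines Q² as 1 − PRESS/SS about the overall mean of y. The function names (`pls1_fit`, `coefficients`, `cross_validate`) and the synthetic data are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_comp):
    """NIPALS PLS for one centred response; returns weights W, loadings P and q."""
    Xr, yr = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)      # unit-length weight vector
        t = Xr @ w                  # scores of this component
        tt = t @ t
        p = Xr.T @ t / tt           # X loadings
        qa = (yr @ t) / tt          # y loading
        Xr -= np.outer(t, p)        # deflate X
        yr -= qa * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(qa)
    return np.array(W).T, np.array(P).T, np.array(q)


def coefficients(W, P, q, a):
    """Regression vector of the model truncated to its first `a` components."""
    Wa, Pa = W[:, :a], P[:, :a]
    return Wa @ np.linalg.solve(Pa.T @ Wa, q[:a])


def cross_validate(X, y, max_comp, n_subsets=10, seed=0):
    """Return RMSEP and Q2 (%) in cross-validation for 1..max_comp components."""
    rng = np.random.default_rng(seed)
    subsets = np.array_split(rng.permutation(len(y)), n_subsets)
    press = np.zeros(max_comp)
    for held_out in subsets:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        x_mean, y_mean = X[train].mean(axis=0), y[train].mean()
        W, P, q = pls1_fit(X[train] - x_mean, y[train] - y_mean, max_comp)
        for a in range(1, max_comp + 1):
            y_hat = (X[held_out] - x_mean) @ coefficients(W, P, q, a) + y_mean
            press[a - 1] += np.sum((y[held_out] - y_hat) ** 2)
    rmsep = np.sqrt(press / len(y))
    q2 = 100.0 * (1.0 - press / np.sum((y - y.mean()) ** 2))
    return rmsep, q2


# Synthetic example (not plant data): 8 X variables driven by 3 latent factors.
rng = np.random.default_rng(1)
scores = rng.normal(size=(120, 3))
X = scores @ rng.normal(size=(3, 8)) + 0.05 * rng.normal(size=(120, 8))
y = scores @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=120)

rmsep, q2 = cross_validate(X, y, max_comp=6)
n_lv = int(np.argmin(rmsep)) + 1    # number of LV minimizing the RMSEP
print(n_lv, rmsep.round(3), q2.round(1))
```

With several Y variables, as in the model above, the same loop would be evaluated per property and the number of components retained would be the one that suits most of them.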


Table 33 – Performance statistics of the original dataset PLS model in cross-validation, prediction of the validation set and prediction of new data

Variable                        R²Y (%)   Q²Y CV (%)   Q²Y Pred original data (%)   Q²Y Pred new data (%)
GAD (green apparent density)    59.03     55.29        46.90                        4.66
Green weight                    68.30     65.01        61.29                        0.04
Baked weight (mean)             81.63     79.53        83.18                        3.70
Thermal conductivity            25.61     20.54        21.46                        0.89
BAD (baked apparent density)    39.60     32.95        30.56                        9.21
Real density                    41.93     37.74        44.91                        16.76
Compressive strength            25.66     16.93        22.68                        7.24
Lc                              54.85     50.22        57.80                        29.54
Young's modulus                 29.50     21.88        17.29                        11.60
Electrical resistivity          46.93     42.37        55.27                        14.87
Total                           47.30     42.25        44.13                        9.85

The prediction performance of the model has been discussed in the previous work, but it is important to observe that Q²Y on the new dataset is not adequate. Plots of the residual and of Hotelling’s T² (Figure 91) can be used to diagnose these prediction issues. The computation of the two statistics and of their 95% control limits (i.e. the red dashed lines) is presented in Kourti & MacGregor (1995).

[Figure 91: a) Hotelling’s T² (axis from 0 to 120) and b) residual (axis ticks at 10² and 10³), both plotted against the new anode cores (0 to 900, covering 2012 to 2014); observations A, B and C are marked.]

Figure 91 – Model residuals: a) Hotelling’s T² and b) prediction residual
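The two monitoring statistics of Figure 91 can be sketched as follows. This is an illustration under stated assumptions, not the computation used for the figure: a PCA model of the autoscaled X data stands in for the X-space of the PLS model, and the 95% limits are taken as empirical percentiles of the training values instead of the theoretical limits of Kourti & MacGregor (1995). The function names and the synthetic data are hypothetical.

```python
import numpy as np


def t2_and_spe(X, model):
    """Hotelling's T2 and squared prediction error (SPE) of each row of X."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                      # scores in the latent space
    E = Z - T @ model["P"].T                # part of Z not explained by the model
    t2 = np.sum(T ** 2 / model["score_var"], axis=1)
    spe = np.sum(E ** 2, axis=1)
    return t2, spe


def fit_pca_monitor(X_train, n_comp):
    """Fit an autoscaled PCA model with empirical 95 % limits on T2 and SPE."""
    mean, std = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_comp].T                       # loadings (variables x components)
    score_var = (Z @ P).var(axis=0, ddof=1)
    model = {"mean": mean, "std": std, "P": P, "score_var": score_var}
    t2, spe = t2_and_spe(X_train, model)
    # Assumption: percentile limits replace the theoretical control limits.
    model["t2_limit"] = np.percentile(t2, 95)
    model["spe_limit"] = np.percentile(spe, 95)
    return model


# Synthetic history (not plant data): 6 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(2, 6))
X_train = rng.normal(size=(200, 2)) @ mixing + 0.1 * rng.normal(size=(200, 6))
model = fit_pca_monitor(X_train, n_comp=2)

# One observation far along a latent direction, one that breaks the correlations.
extreme = 6.0 * mixing[0]
broken = X_train[0] + 5.0 * model["std"] * np.array([1, -1, 1, -1, 1, -1])
t2_new, spe_new = t2_and_spe(np.vstack([extreme, broken]), model)
print(t2_new > model["t2_limit"], spe_new > model["spe_limit"])
```

A large T² flags an observation that is unusual within the correlation structure of the model, while a large residual flags an observation whose correlation structure the model has not seen; an observation with both is a candidate gross outlier.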


Figure 91 a) presents the T² statistic for the projection of the new observations onto the original PLS model. Except for the excursions indicated by the red arrows, the projection in the latent variable space is normal for most observations.

Figure 91 b) presents the model residuals for the prediction of the new anode properties (Y) from the original model. Up to observation 300 (November 2013), the residual is higher than the limit, but it is still acceptable. The most important variations in the carbon plant come from the raw materials. If every new batch of coke or pitch has properties different from those previously used in the plant, the model will always be less robust to new data and will need periodic updating. After observation 300, the residual is very large, with some spikes indicated by the red arrows. The fact that the observations marked with these arrows have both a large residual and a large T² indicates gross outliers in the data, which need investigation.

It is possible to compute contribution plots (Kourti & MacGregor 1995) of the residual for each observation. These plots (Figure 92) indicate the combination of X variables associated with the lack of prediction for a particular observation. Three observations, indicated by letters in Figure 91, were investigated.

Observation A belongs to the small-residual period of the first 300 new observations. Its contribution plot is presented in Figure 92 a). The main contributors are coke and pitch properties that differ from the historical dataset. The fine feeder’s rotating valve speed is also a contributor, since the valve was rebuilt in 2013 and has been operated at a different set point since then. The lack of robustness to raw material variations has been discussed earlier, but it is possible that this issue will diminish with enough historical data spanning a wide range of possible property combinations.

The contribution plot for observation B is presented in Figure 92 b). The same fine feeder valve speed is a contributor, but this time the major contributors are the particle size distributions of the coke and butts fractions and of the dry aggregate. This is due to the tightening of the size distribution range of the coke and butts fractions. The span of the new operating parameters was not contained in the original dataset, which caused the deviation in prediction. Only one observation was used to illustrate this case, but it is consistent for most observations after #300.

The last observation investigated is one of the gross outliers (point C). Its contribution plot is presented in Figure 92 c). Only one variable contributes to this lack of prediction: a coke property measured in the laboratory. The value entered in the database was 10 times higher than its normal range. This is probably due to a manual entry error, and it was not detected by the normal weekly average SPC monitoring of this variable. If it had been detected in a timely manner (i.e. as soon as the data were available), the sample could have been retested or checked against the recorded results.
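The case of observation C can be reproduced in a minimal sketch of a residual contribution calculation. The assumptions are the following: a two-component PCA model of autoscaled synthetic data stands in for the X-space of the PLS model, the contribution of a variable is taken as its signed residual (the squares sum to the SPE), and the function name `residual_contributions` and all data are hypothetical.

```python
import numpy as np


def residual_contributions(x_new, mean, std, P):
    """Signed residual of each autoscaled variable; their squares sum to the SPE."""
    z = (x_new - mean) / std
    return z - P @ (P.T @ z)


# Synthetic history (not plant data): variables 0-3 follow one latent factor,
# variables 4-7 follow another.
rng = np.random.default_rng(0)
loadings = np.zeros((2, 8))
loadings[0, :4] = 1.0
loadings[1, 4:] = 1.0
X = 10.0 + rng.normal(size=(300, 2)) @ loadings + 0.1 * rng.normal(size=(300, 8))

mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
_, _, Vt = np.linalg.svd((X - mean) / std, full_matrices=False)
P = Vt[:2].T                        # two-component PCA loadings

x_bad = X[0].copy()
x_bad[2] *= 10.0                    # simulated manual entry error on variable 2
contrib = residual_contributions(x_bad, mean, std, P)
worst = int(np.argmax(np.abs(contrib)))
print(worst, contrib.round(1))
```

In this example the corrupted variable has the largest contribution, but part of the error also appears, with a smaller magnitude, on the variables correlated with it. A contribution plot therefore points to a group of variables to investigate rather than proving a single cause.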

[Figure 92: three residual contribution plots against the X variables (0 to 90). Labelled contributors: a) coke and pitch properties, fine feeder valve; b) particles size distribution, fine feeder valve; c) coke impurity.]

Figure 92 – Residual contribution: a) Observation A, b) Observation B and c) Observation C of Figure 91

Due to changes in raw materials and process operating conditions, the model was no longer adequate and needed to be computed again. All the available observations were split into a training set (2/3) and a validation set (1/3). The fit and prediction statistics of this new model are presented in Table 34.


Table 34 – Performance statistics of the new PLS model in cross-validation and prediction of the validation set

Variable                        R²Y (%)   Q²Y CV (%)   Q²Y Pred (%)
GAD (green apparent density)    41.06     36.07        39.70
Green weight                    37.94     31.02        40.18
Baked weight (mean)             73.60     71.99        73.45
Thermal conductivity            26.12     23.14        22.47
BAD (baked apparent density)    32.82     29.93        27.57
Real density                    42.47     40.27        44.88
Compressive strength            22.66     18.50        20.24
Lc                              53.62     50.25        57.53
Young's modulus                 26.25     22.85        22.05
Electrical resistivity          37.16     33.57        44.66
Total                           48.12     35.76        39.27

After the computation of a new model including the anodes up to July 2014, the prediction performance is acceptable and similar to that of the original model.

This is a good example of monitoring the performance of a PLS model over time. Being able to detect changes in the correlation structure of the regressor (X) dataset is one of the most powerful features of the PLS algorithm.

Finally, this model should be used in real time, which in this case means every time a new set of laboratory measurements is available. It can be used to monitor multiple aspects of the process at the same time: for example, to verify whether a particular combination of raw materials has been used previously, to check for gross manual entry errors in the database and to monitor the process operating conditions. All of these tasks can be accomplished using only three plots: the model residuals, T² and a contribution plot.
