Monitoring of a carbon anode paste manufacturing process using machine vision
and latent variable methods
Thesis
Doctorate in chemical engineering
Philosophiæ doctor (Ph.D.)
© Julien Lauzon-Gauthier, 2015
Résumé
The Hall-Héroult electrolytic reduction process is used for the industrial production of
primary aluminium. This process requires carbon anodes. Uniform anode quality is a very
important parameter for ensuring the stability and optimal performance of the
electrolysis cells.
Unfortunately, anode manufacturers currently face increasing variability in raw
materials. This situation is due to the declining availability of good-quality, low-cost
raw materials. To compensate, anode manufacturers must diversify their choice of
suppliers, which increases variability. However, the plants are not prepared to react to
this situation while maintaining stable anode quality. This is due, among other things,
to a lack of real-time quantitative measurements of anode quality.
Several examples of industrial machine vision applications have been presented in the
literature. There is therefore an opportunity to develop such a system to obtain a
non-destructive, real-time measurement of anode paste quality.
The sensor was developed using paste and pressed anodes made at laboratory scale. A set of
image texture features computed from the discrete wavelet transform (DWT) and gray level
co-occurrence matrices (GLCM) was selected. These features were sensitive to variations
in the formulation and in the amount of pitch in the paste. The sensor is also able to
detect the optimum pitch demand (OPD) for different cokes. The sensitivity and robustness
of the sensor were then tested with industrial paste.
Finally, plants already collect many real-time process measurements. These data can be
used in a statistical monitoring strategy to detect and investigate quality deviations. A
new multivariate latent variable statistical method, sequential multi-block PLS
(SMB-PLS), was developed to improve the interpretation of industrial data compared with
the usual multi-block PLS methods. This method was also used to discuss the relevance of
adding paste image features to a statistical model for monitoring process variability.
Abstract
The Hall-Héroult electrolytic reduction process used for industrial aluminium smelting
relies on the consumption of carbon anodes. The quality and consistency of these anodes
are very important for the stability and performance of the reduction cells.
Unfortunately, anode manufacturers currently face an increase in raw material
variability. This is due to the declining availability of high-quality, low-cost and
consistent materials on the market, which forces anode manufacturers to diversify their
suppliers. However, anode plants are not prepared to compensate for this increase in
variability while still maintaining consistent anode quality. There is a lack of real-time
monitoring and control of the baked anode properties and of the most important raw
material and process parameters.
Machine vision has been successful in many industrial applications. There is therefore an
opportunity to develop such a system to obtain a non-destructive, online measurement of
anode paste quality. This sensor could then be used in a feedback/feedforward control
strategy to attenuate the unmeasured raw material and process variations.
The sensor was developed using laboratory-scale paste and pressed anodes. A set of image
texture features computed from discrete wavelet transform (DWT) and gray level
co-occurrence matrix (GLCM) methods was selected. These features could capture variations
in formulation, in the pitch ratio of the paste and in pitch demand. The sensor was also
found to be sensitive to the optimum pitch demand (OPD) of two different cokes. The
sensitivity and robustness of the sensor were then tested using industrial paste.
Finally, anode plants already collect some real-time process measurements and off-line
raw material and baked anode properties that can be used to monitor and troubleshoot
process and quality deviations. A new sequential multi-block PLS (SMB-PLS) method was
developed to improve the interpretation of complex industrial datasets compared with
existing multi-block PLS methods. This method was also used to discuss the relevance of
adding real-time paste image features to a statistical model for monitoring process
variability.
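As an illustration of the GLCM texture features mentioned in the abstract, the following is a minimal sketch and not the thesis implementation. It assumes an image already quantized to a small number of gray levels and a single non-negative pixel offset. The function names `glcm` and `texture_features`, the symmetric normalization and the three features shown (contrast, angular second moment, homogeneity) are illustrative choices that do not come from the source.

```python
import numpy as np


def glcm(image, dx=1, dy=0, levels=8):
    """Symmetric, normalized gray level co-occurrence matrix for offset (dy, dx).

    Assumptions of this sketch: `image` is a 2-D array of integers in
    [0, levels) and the offsets dx, dy are non-negative.
    """
    img = np.asarray(image, dtype=int)
    if img.ndim != 2:
        raise ValueError("image must be 2-D")
    if dx < 0 or dy < 0:
        raise ValueError("this sketch only handles non-negative offsets")
    if img.min() < 0 or img.max() >= levels:
        raise ValueError("gray levels must lie in [0, levels)")
    rows, cols = img.shape
    # Reference pixels and their neighbours located dy rows down, dx columns right.
    ref = img[:rows - dy, :cols - dx]
    nbr = img[dy:, dx:]
    if ref.size == 0:
        raise ValueError("offset is larger than the image")
    # Count how often gray level i (reference) occurs next to gray level j (neighbour).
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    # Make the matrix symmetric, then normalize it to joint probabilities.
    counts = counts + counts.T
    return counts / counts.sum()


def texture_features(p):
    """A few scalar texture descriptors computed from a normalized GLCM."""
    i, j = np.indices(p.shape)
    diff2 = (i - j) ** 2
    return {
        "contrast": float(np.sum(p * diff2)),
        "asm": float(np.sum(p ** 2)),  # angular second moment
        "homogeneity": float(np.sum(p / (1.0 + diff2))),
    }
```

For a two-level checkerboard with a horizontal offset of one pixel, every pixel pair mixes the two levels, so the contrast is at its maximum for two levels; for a uniform image the contrast is zero and the angular second moment equals one. Features of this kind are typically stacked into a feature vector per image before being passed to a latent variable model.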
Contents
Résumé
Abstract
Contents
Table list
Figure list
Acknowledgments
Chapter 1 Introduction
1.1 Aluminium manufacturing
1.2 Anode manufacturing
1.3 Anode raw materials
1.4 Anode fabrication process
1.5 Anode properties
1.6 Problems
1.7 Objectives
1.8 Thesis organization
Chapter 2 Latent variable methods
2.1 Principal Component Analysis (PCA)
2.2 Projection to Latent Structures (PLS)
2.3 Data scaling
2.4 Number of latent variables (A)
2.5 Model interpretation tools
Chapter 3 Image texture analysis
3.1 Machine vision
3.2 Digital image
3.3 Image texture analysis
3.3.1 Gray level co-occurrence matrix (GLCM)
3.3.2 Wavelet texture analysis (WTA)
Chapter 4 Experimental
4.1 List of software
4.2 Laboratory anode fabrication
4.2.1 Industrial raw material formulation
4.2.2 Laboratory raw material formulation
4.2.3 Laboratory anode fabrication
4.2.4 Industrial paste sampling
4.3 Image analysis methodology
4.3.1 Description of the imaging set-up
4.3.2 Description of the image analysis methodology
Chapter 5 A new Multi-block PLS algorithm including a sequential pathway
5.1 Introduction
5.2 Description of the multi-block methods
5.2.1 Multi-block PLS (MB-PLS)
5.2.2 Sequential Orthogonal PLS (SO-PLS)
5.2.3 Proposed algorithm: the Sequential Multi-block PLS (SMB-PLS)
5.3 Description of the dataset used for the case studies
5.3.1 Simulated data from film blowing process
5.3.1.1 First case – No correlation between raw materials and process data
5.3.1.2 Second case – Correlation between raw materials and process data
5.3.2 Industrial data from the anode manufacturing process
5.4 Results and discussion
5.4.1 Selecting the number of components
5.4.2 Results for the film blowing example
5.4.3 Industrial data from the anode manufacturing process
5.5 Conclusion
Chapter 6 Paste image texture analysis
6.1 Introduction
6.2 Laboratory paste and anode experiments
6.2.1 Preliminary design on paste formulation
6.2.2 Detailed design on paste formulation
6.2.3 Pitch optimization experiments
6.3 Selection of preprocessing operations and image textural features
6.3.1 Dataset and criteria used for the comparative analysis
6.3.2 Choice of preprocessing
6.3.3 Choice of wavelet
6.3.4 Selection of textural features
6.4 Results
6.4.1 Preliminary design on paste formulation
6.4.2 Detailed design on paste formulation
6.4.3 Pitch optimization experiment anodes
6.5 Conclusion
Chapter 7 Industrial paste imaging
7.1 Introduction
7.2 Sampling and data synchronization
7.3 Datasets and results
7.3.1 Normal operation
7.3.2 Paste plant start-up
7.3.3 Industrial pitch optimization experiments
7.4 Joint modelling of image features and paste plant data using SMB-PLS
7.5 Conclusions
Chapter 8 Conclusions and recommendations
8.2 Development of the machine vision sensor
8.3 Sensitivity and robustness to industrial paste
8.4 SMB-PLS algorithm
8.5 Recommendations
8.5.1 Multivariate monitoring and control
8.5.2 Real-time paste quality measurement
Bibliography
Appendix A Update of the anode properties prediction model
Table list
Table 1 – Typical dry aggregate particle size (Jones 1986)
Table 2 – Anode properties typically measured from core samples
Table 3 – GLCM features of the images in Figure 16
Table 4 – Properties of the industrial coke used for the laboratory paste manufacturing
Table 5 – Particle size distribution (measured at the plant) for each material fraction
Table 6 – Properties of the industrial pitch supplied by ADQ for the laboratory anodes
Table 7 – Base mix formulation for the laboratory anode fabricated with the industrial raw materials
Table 8 – Laboratory coke aggregate formulation
Table 9 – Laboratory coke properties
Table 10 – Laboratory pitch properties
Table 11 – Heat-up rate during the laboratory anode baking
Table 12 – Choice of GLCM distance L and comparison to the particle size distribution
Table 13 – Band pass size in period (i.e. spatial dimensions) for each decomposition level of the DWT
Table 14 – List of the Y variables used for the anode manufacturing dataset case study
Table 15 – Formulations used in the first series of experiments aiming at varying the amounts of coke fines and pitch in the paste
Table 16 – Changes in the paste formulation tested in the second set of experiments
Table 17 – List of experiments for the laboratory pitch optimization
Table 18 – Impact of adding contrast enhancement on PLS model statistics
Table 19 – Impact of wavelet type and filter length on PLS model statistics
Table 20 – Impact of different combinations of textural features on PLS model statistics
Table 21 – PLS model statistics for changes in fines and pitch percentages in the paste formulation
Table 22 – PLS model statistics for the detailed design on paste formulation
Table 23 – PLS model statistics for the pitch optimization experiments
Table 24 – Correlation coefficients between the paste formulation variables for the normal operation data
Table 25 – Statistics of the PLS models built on normal operation data
Table 26 – Sample number and elapsed time since the first start-up sample
Table 27 – Statistics of the PCA model built on the paste plant start-up data
Table 28 – Changes implemented on pitch % set-point in the industrial pitch variations dataset
Table 29 – Correlation coefficients between the paste formulation variables for the experiments on pitch ratio
Table 30 – Coke and pitch properties for each pitch variation experiment
Table 31 – Statistics of the PLS model for the design of experiments on pitch ratio
Table 32 – Statistics of the GAD SMB-PLS model
Table 33 – Performance statistics of the original dataset PLS model in cross-validation, prediction of the validation set and prediction of new data
Table 34 – Performance statistics of the new PLS model in cross-validation and prediction of the validation set
Figure list
Figure 1 – Cross section of a prebaked reduction cell technology (Courtesy of Alcoa)
Figure 2 – Anode manufacturing process flowsheet (Fischer et al. 1995)
Figure 3 – New anode assembly (Courtesy of Alcoa)
Figure 4 – Illustration of the difference in pitch demand for two paste mixes
Figure 5 – Schematic of a baking furnace section (Grégoire et al. 2013)
Figure 6 – Illustration of the different behavior of GAD and BAD as a function of pitch %
Figure 7 – Effect of constant operating conditions
Figure 8 – Illustration of the effect of different raw material and processing conditions on the anode paste visual appearance: a) and b) 2 different industrial pastes and c) laboratory paste
Figure 9 – Schematic of the machine vision methodology for anode paste
Figure 10 – Schematic representation of PCA
Figure 11 – NIPALS algorithm for PCA
Figure 12 – Matrices of PLS
Figure 13 – NIPALS algorithm for PLS
Figure 14 – Schematic of the machine vision approach (Liu 2005; Duchesne 2010)
Figure 15 – Examples of GLCM matrices (Tessier et al. 2008)
Figure 16 – Two stone surfaces with different texture used for the GLCM features example (http://www.highresolutiontextures.com)
Figure 17 – Frequency band divisions of the DWT of a vector (one-dimensional signal)
Figure 18 – 2-D DWT decomposition: a) schematics of the filter bank used at the jth decomposition and b) frequency distribution of the detail and approximation images (Liu & MacGregor 2007)
Figure 19 – Composite image of different textures (http://www.highresolutiontextures.com)
Figure 20 – Approximation and detail coefficients of the composite texture image: a) approximation at scale 3, b) comparison of the reconstructed detail at scale 1 for sub-images 1 and 7, c) comparison of the reconstructed detail at scale 1 and 3 of sub-image 5 and d) comparison of the direction sensitivity for the reconstructed detail at scale 1 of sub-image 2
Figure 21 – Details of the mixer and oven for laboratory paste preparation
Figure 22 – Details of the press: a) cylindrical mold and die and b) the press with the oven to control the pressing temperature (Azari Dorcheh 2013)
Figure 23 – Laboratory baking furnace and baking box
Figure 24 – Imaging set-up installed at the ADQ industrial plant
Figure 25 – Image acquisition set-up
Figure 26 – Anode paste machine vision flowsheet
Figure 27 – Example of paste image: a) laboratory paste and b) industrial paste
Figure 28 – Results of the image pre-processing: a) low-pass filtered grayscale image, b) image after contrast enhancement and c) comparison of the intensity histogram for both images
Figure 29 – Symlet 4 wavelet function
Figure 30 – Illustration of the block order for an industrial process
Figure 31 – The MB-PLS algorithm for 2 regressor blocks (adapted from (Westerhuis et al. 1998))
Figure 32 – The SO-PLS algorithm shown for 2 regressor blocks
Figure 33 – The SMB-PLS algorithm for two X blocks
Figure 34 – Simulated end section of a film blowing process (adapted from (Duchesne 2000))
Figure 35 – Data blocks collected from the anode manufacturing process (Modified from (Lauzon-Gauthier et al. 2012))
Figure 36 – Q2Y and RMSEP statistics for selecting the number of components of the MB-PLS algorithm for case 1 (Z and X are orthogonal)
Figure 37 – Q2Y and RMSEP statistics for selecting the number of components of the SO-PLS model for case 1
Figure 38 – Q2Y and RMSEP statistics used for selecting the number of components of the SMB-PLS model for case 1
Figure 39 – Q2Y and RMSEP statistics used for selecting the number of components for the MB-PLS algorithm for case 2
Figure 40 – Q2Y and RMSEP statistics used for selecting the number of components of the SO-PLS model for case 2
Figure 41 – Q2Y and RMSEP statistics used for selecting the number of components of the SMB-PLS model for case 2
Figure 42 – Explained Y variance for the three multi-block methods built on the film blowing datasets: a) case 1 and b) case 2. Z and X block variance explained and total (i.e. concatenated regressor blocks) variance explained: c) case 1 and d) case 2
Figure 43 – Relative importance of each block by LV for: a) MB-PLS case 1, b) MB-PLS case 2, c) SMB-PLS case 1 and d) SMB-PLS case 2
Figure 44 – Loadings of Z, X and Y blocks in the 3rd SMB-PLS component (Z-3) for case 2
Figure 45 – Loadings of Z, X and Y blocks in the 4th and 5th SMB-PLS components (X-1 and X-2) for case 2
Figure 46 – Selection of the number of LVs for the MB-PLS model computed from the anode manufacturing dataset: a) Q2Y and b) RMSEP for all Y variables
Figure 47 – Selection of the number of LVs for the SO-PLS model computed from the anode manufacturing dataset
Figure 48 – Selection of the number of LVs for the SMB-PLS anode model
Figure 49 – Results obtained with the multi-block algorithms on the anode manufacturing dataset: a) R2Y and Q2Y for all methods, b) overall R2X by block for all methods, relative weights (bars) and block variance explained R2X (lines) by LV for c) MB-PLS and d) SMB-PLS
Figure 50 – Bi-plot of the block weights and Y loadings for the first two components (Z-1 and Z-2) of the SMB-PLS model built on the anode manufacturing dataset
Figure 51 – Amount of pitch used in the formulation as a function of the amount of coke fines particles for different raw material blends (combinations of coke and pitch suppliers)
Figure 52 – Z and X1 block weights bi-plot for LV1 and LV2 of MB-PLS ........................... 95
Figure 53 – Bi-plots of X2 block weights and Y loadings: a) LV 5 of MB-PLS and b) LV6
(X2-1) of SMB-PLS........................................................................................................... 96
Figure 54 – Baking block (X3) scores and loadings bi-plot: a) MB-PLS block scores, b) MB-
PLS block weights for LV4-LV5, c) SMB-PLS block scores and d) SMB-PLS block weights
for LV7-LV8 (X3-1 and X3-2). The blue and red markers indicate the anodes baked in the
coldest and hottest positions in the furnace ...................................................................... 97
Figure 55 – Comparison of the information mixing in MB-PLS and SMB-PLS models: a)
super scores (LV1-LV2) of MB-PLS, b) super scores (LV1-LV2) of SMB-PLS, c) Z scores
(LV2-LV3) of MB-PLS and d) Z scores (LV2-LV3) of SMB-PLS ........................................ 99
Figure 56 – Anode paste image...................................................................................... 105
Figure 57 – Baked and green anode density (BAD and GAD) for the pitch optimization
anodes using cokes from two different sources (A and B) .............................................. 113
Figure 58 – ∆BAD of the lab formulation anodes ............................................................ 116
Figure 59 – Scores of the PLS models for the lab formulated anodes: a) no contrast
enhancement and b) with contrast enhancement ........................................................... 118
Figure 60 – Score plots for the first two PLS components (LVs 1-2) of four models from
Table 20: a) model 1, b) model 4, c) model 9 and d) model 7 ......................................... 122
Figure 61 – Final image texture analysis procedure ....................................................... 123
Figure 62 – Scores and loadings weights of the PLS model (replicates averaged) for the
case where fines and pitch variations were introduced in the paste formulation: a) LV1-LV2
scores, b) weights and loadings of LV1 and c) weights and loadings of LV 2 ................. 125
Figure 63 – Reproducibility of the imaging sensor in the case of the preliminary design on
formulation. The averaged LV1 and LV2 scores are shown for replicated samples along
with their one standard deviation error bars .................................................................... 127
Figure 64 – Butts size distribution span .......................................................................... 128
Figure 65 – Scores and loadings weights of the PLS model built on averaged replicated
samples data for the case of the detailed design on formulation: a) X scores on LV1 and
LV2, b) Y scores on LV1 and LV2, c) weights and loadings of LV1 and d) weights and
loadings of LV2 .............................................................................................................. 130
Figure 66 – Interpretation of the PLS model built using averaged replicated samples data
for the case of the detailed design on formulation. Variations in the scores and associated
contribution plots: a) and b) increase in the pitch ratio, c) and d) shot coke addition, e) and
f) decrease in the fines ratio and g) and h) change from a coarser to a finer formulation 131
Figure 67 – Reproducibility of the imaging sensor in the detailed design on formulation.
The averaged LV1 and LV2 score values are shown for a) image replicates and b) mix
replicates along with their one-standard deviation error bars .......................................... 134
Figure 68 – Comparison of the predicted and measured ∆BAD for the replicated averages
model ............................................................................................................................. 135
Figure 69 – Scores and loadings of the PLS model (averaged replicates) for the pitch
optimization experiments: a) LV1 scores, b) LV1 weights and loadings, c) LV2 scores and
d) LV2 weights and loadings .......................................................................................... 136
Figure 70 – Scores of the 3rd and 4th components of the PLS model built on the pitch
optimization dataset (averaged features) ........................................................................ 137
Figure 71 – Reproducibility of the imaging sensor in the pitch optimization experiments.
The scores of the first two components of the PLS model built on all samples are shown
along with one standard deviation error bars .................................................................. 138
Figure 72 – Formulation variables for the normal operation industrial dataset: a) dry
aggregate % and b) pitch ratio ........................................................................................ 144
Figure 73 – First component’s scores (a) and loadings (b) of the PLS model (averaged
replicate data) built on normal operation data of the ADQ paste plant ............................ 146
Figure 74 – Second component’s scores (a) and loadings (b) of the PLS model (averaged
replicate data) built on normal operation data of the ADQ paste plant ............................ 147
Figure 75 – Third component’s scores (a) and loadings (b) of the PLS model (averaged
replicate data) built on normal operation data of the ADQ paste plant ............................ 147
Figure 76 – Uncertainties in the scores of the PLS model built on normal operation data: a)
LV1, b) LV2 and c) LV3. One standard deviation error bars on the scores are shown. ... 149
Figure 77 – Time series of a 5h plant start-up period: a) GAD and b) scores of LV1, LV2
and LV3 .......................................................................................................................... 152
Figure 78 – Scores and loadings of the PCA model built on the industrial paste start-up
data (averaged image replicates): a) LV1 and LV3 score plot and b) LV1 loadings plot.. 153
Figure 79 – Changes in the formulation variables for the industrial dataset where pitch ratio
was varied. The five sampling campaigns are indicated by letters A-E. .......................... 154
Figure 80 – Baked anode core properties: a) BAD for experiments C and E, b) electrical
resistivity, c) compressive strength, d) CO2 reactivity residue (CRR) and e) Young’s
modulus ......................................................................................................................... 158
Figure 81 – Predicted versus measured pitch ratio obtained using the PLS model built on
data collected during the design of experiments on pitch ratio (averaged replicates) ...... 160
Figure 82 – Scores and loadings of the PLS model (averaged replicates) component 1 for
the designed experiments on pitch ratio: a) LV1 scores, b) LV1 weights and loadings, c)
scatter plots of LV1 scores and coarse % and d) scatter plots of LV1 scores and pitch %
....................................................................................................................................... 161
Figure 83 – Scores and loadings of the PLS model (averaged replicates) component 2 for
the designed experiments on pitch ratio: a) LV2 scores, b) LV2 weights and loadings and
c) scatter plots of LV2 scores and pitch % ...................................................................... 163
Figure 84 – Scores and loadings of the PLS model (averaged replicates) component 3 for
the designed experiments on pitch ratio: a) LV3 scores and b) LV3 weights and loadings
....................................................................................................................................... 164
Figure 85 – Uncertainties in the scores of the PLS model (all samples) for the data obtained
during the design of experiments on pitch ratio: a) LV1, b) LV2 and c) LV3. One standard
deviation error bars of the scores are shown in the figure. .............................................. 165
Figure 86 – Data blocks and variables used in the SMB-PLS model for predicting GAD 166
Figure 87 – Relative contribution (bars) of each regressor block in the SMB-PLS model.
The explained variance of each regressor block R2X (black lines) and of the Y block R2Y
(gray line) are also shown .............................................................................................. 168
Figure 88 – Loading weights of the raw material properties (Z) in component LV Z-1 ..... 169
Figure 89 – Block weights of LV Z-2: a) raw material (Z) and b) image features (X3) ..... 170
Figure 90 – Block weights for LV X1-2: a) formulation (X1) and b) image features (X3).. 171
Figure 91 – Model residuals: a) Hotelling’s T² and b) prediction residual ......................... 194
Figure 92 – Residual contribution: a) Observation A, b) Observation B and c) Observation
C of Figure 91 ................................................................................................................ 196
Dedicated to Christophe, Justin and Marilou
Acknowledgments
For the last four years I have spent a significant amount of time working on this Ph.D.
project. It was a very challenging but also rewarding period of my life. I am grateful for the
support of many people and organizations and I would like to take this opportunity to
express my gratitude to all of these important people.
I am thankful for the financial support of Alcoa, the Fonds de recherche du Québec –
Nature et technologies (FRQNT), the Aluminum Research Centre – REGAL, NSERC,
Université Laval and Rio Tinto Alcan. With this support, I was able to focus my attention on
my project and my family.
I would like to express my profound gratitude and respect for my supervisor, Dr. Carl
Duchesne. You have given me all the tools, support and opportunities that I needed to
accomplish this project and much more. With your guidance, I have become confident in
my abilities and knowledge.
Special thanks to Dr. Jayson Tessier: your continued support of my project within Alcoa and
your thoughtful input on the problems and results were invaluable.
The Université Laval chemical engineering department has been my home for the last ten
years. I need to mention the contribution of the many people that support the students
every day. Thanks to Dr. Alain Garnier, Ann Bourassa, Nadia Dumontier, Pierrette
Vachon, Jean-Nicolas Ouellet and Yann Giroux.
I would like to thank the technicians and research assistants who helped me during my
experimental work: Guillaume Gauvin, Donald Picard, Hughes Ferland and Vicky Dodier. I
had a lot of fun working with you. I would also like to thank Jean-Phillip Giguère who
helped me as an intern to fabricate numerous laboratory anodes and for the BAD
measurements.
Many thanks to Kamran Azari Dorcheh and Francois Chevarin: this Ph.D. would have
been much more difficult without your previous laboratory work on anode fabrication and
your generous help in the lab. I also appreciated working with the MACE3 chair and RDC-
anode groups: Geoffroy, Ramzi, Behzad, François G., Dave, Pierre-Olivier, Stéphane and
the others.
Many folks within Alcoa have contributed to the success of this project. I have felt at home
every time I went to the plant and I felt that you supported the project. This was important
for me. Thanks to Francis-Joé, Isabelle, Réal, Romain, Christian, Marc, Katie, Don, John
and the many operators who helped me during my experiments.
To my colleagues in the office, much of the day-to-day life in the department was shared
with you and I enjoyed it very much. It was also a pleasure to share many activities with all
of you. Thanks to Amélie, Pierre-Marc, Wilinthon, Alexandre, Jean-Pascal, Thierry,
Mathias, Juliette, Karl, Simon, Corinne and Moez.
Massoud, thanks for your friendship and availability when we needed to discuss the latent
variable methods or the interpretation of the results.
Finally, many thanks to my family for the support during all these years, especially to my
wife Marilou. I could not have completed a Ph.D., raised two beautiful kids, started my
professional career and still had a normal life without you.
Merci!
Chapter 1 Introduction
The aluminium industry is a very important component of the Canadian economy. In 2010,
Canada was the third largest aluminium producer with 7% of the world production (source:
International Aluminium Institute http://www.world-aluminium.org). It sustains
approximately 10,000 direct jobs in the Province of Québec alone and injects 2.5 billion
dollars per year into its economy (source: Association de l'aluminium du Canada
http://ledialoguesurlaluminium.com).
Several manufacturing challenges must be addressed during the fabrication of the carbon
anodes used to produce aluminium. The most important aspect in terms of smelter
operation is the consistency of the anode quality. Unfortunately, anode manufacturers
have to cope with increasing raw material variability and they are not adequately prepared
to face this situation. To maintain consistent anode quality over time, carbon plants need
to adapt the formulation and the processing conditions in response to raw material
variations. However, the key raw material properties and process measurements are not
available in real-time to implement such adjustments. This thesis focuses on issues related
to real-time quality control of baked carbon anodes and the lack of fast and relevant
measurements to cope with raw material variability. New data-driven methods and
non-destructive sensing techniques are proposed to improve process understanding,
monitoring and control of the anode manufacturing process.
Some sections of this chapter (i.e. 1.1 to 1.5) reproduce, with minor modifications
and additions, the most important parts of Chapter 2 of the author’s M.Sc. thesis
(Lauzon-Gauthier 2011). They are reproduced here to give readers the necessary
background on the manufacturing of industrial carbon anodes.
1.1 Aluminium manufacturing
The industrial production of primary aluminum is performed using the so-called Hall-
Héroult process (Grjotheim & Kvande 1993). Basically, aluminum is obtained through the
electrolytic reduction of alumina taking place within a typically large number of
metallurgical reactors (reduction cells) electrically connected in series. The
electrochemical reaction (shown below) involves dissolved alumina (in a cryolitic bath) and
carbon as the reactants, and yields liquid aluminum and carbon dioxide (gaseous
emission).
$$2\,\mathrm{Al_2O_3}_{(diss)} + 3\,\mathrm{C}_{(s)} = 4\,\mathrm{Al}_{(l)} + 3\,\mathrm{CO_2}_{(g)} \qquad (1.1)$$
Figure 1 presents a schematic diagram of a pre-baked anode reduction cell, also called
a pot in the industry. A high electrical current (i.e. from 100 kA to 600 kA for current
technologies (Tabereaux 2000; Charmier et al. 2015)) is passed through the cell, entering
from the conducting rods and the baked carbon anodes, and exiting by the cathode block
after passing through the cryolitic bath and the liquid aluminum pad. The anodes are
immersed into a bath made of cryolite, a chemical that dissolves alumina. The reduction
reaction takes place in the bath and liquid aluminium settles at the bottom of the pot. The
metal is tapped on a daily basis to ensure a constant height of liquids (i.e. liquid aluminium
and molten electrolytic bath) in the pot. Since the anodes are consumed by the alumina
reduction reaction (equation 1.1), they need to be periodically replaced. During pot
operation, the anodes are lowered continually as they are consumed to keep the anode to
cathode distance (ACD) constant. When the anodes reach approximately 1/3 of their
original size, they are replaced by new ones. The residual anodes, called butts, are
recycled to produce new anodes.
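As a rough illustration of equation 1.1, the stoichiometric carbon requirement per unit mass of aluminium can be computed directly. The Python sketch below is only an illustration and is not part of the original work: the function name and the rounded molar masses are assumptions made for the example, and the result is a theoretical minimum because it accounts for no carbon loss other than the reduction reaction itself.

```python
# Molar masses in g/mol (standard atomic weights, rounded; assumed for this sketch).
M_AL = 26.982
M_C = 12.011


def theoretical_carbon_per_kg_al() -> float:
    """Return the kg of carbon consumed per kg of aluminium from equation 1.1."""
    mol_c, mol_al = 3, 4  # stoichiometric coefficients of C and Al in equation 1.1
    return (mol_c * M_C) / (mol_al * M_AL)


print(f"{theoretical_carbon_per_kg_al():.3f} kg C per kg Al")
```

The ratio is about 0.33 kg of carbon per kg of aluminium, which is consistent with the need to replace the anodes periodically.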
Figure 1 – Cross section of a prebaked reduction cell technology (Courtesy of Alcoa)
1.2 Anode manufacturing
The anode manufacturing plant is a vital part of a smelter’s operation because it supplies
one of the main raw materials for the aluminium reduction process (i.e. the baked carbon
anodes). A typical process flowsheet is shown in Figure 2 (Fischer et al. 1995) and is
briefly described in the following paragraphs. Interested readers are referred to (Hulse
2000; Lauzon-Gauthier 2011; Azari Dorcheh 2013) for more detailed descriptions of the
effect of raw material properties and process operation on anode quality.
Figure 2 – Anode manufacturing process flowsheet (Fischer et al. 1995)
The anode raw materials consist of calcined petroleum coke, liquid coal tar pitch and
recycled anode butts. The anode filler particles (e.g. coke and butts) are classified and
ground into a desired particle size distribution. The mix of coke and butts is called the dry
aggregate mix. The dry aggregate is then pre-heated before it is mixed with liquid pitch
(i.e. the binder) to obtain the so-called anode paste. The paste is then formed into an
anode block of specific dimensions using either a press or a vibrocompactor to obtain a
“green” anode. Finally, the green anode is baked in a furnace. The baked anode block is
then attached to a conducting rod, and the assembly is finally ready to be set in the pots.
Figure 3 – New anode assembly (Courtesy of Alcoa)
An anode assembly (i.e. baked anode and connecting rod) is presented in Figure 3. The
aluminium rod is used to connect the anode assembly to the pots. The tripod is fixed to the
anode by pouring cast iron into the stub hole gaps.
1.3 Anode raw materials
Calcined coke is manufactured from the residual heavy oil fractions of the petroleum
refining industry. It is a low value by-product (i.e. waste) and refineries therefore have no
incentive to control and/or improve its quality. As a result, the quantity, quality and price of
calcined cokes available on the market vary significantly over time. This implies that
carbon plants need to adapt to cokes having important differences in physical properties
and chemical impurities from shipment to shipment (McClung and Ross 2000).
The following steps are required to transform heavy oil into coke: a delayed coking
process yields the green coke, and a subsequent calcining operation produces the
calcined coke of interest for the aluminum industry. Calcined coke quality is influenced
by the calcining conditions and by green coke quality, which is in turn influenced by crude
oil quality, refining operation and delayed coking operation parameters (Fischer et al.
1995). Several papers describe the effects of oil quality and process operation on green
coke quality (e.g. (Fischer & Perruchoud 1985) and (Vitchus et al. 2013)).
Coal tar pitch (CTP) is the binder used for making the baked anodes for the aluminum
industry. This pitch is produced from coal tar through a distillation process. Coal tar is a by-
product of the metallurgical coke production from coal. The role of the pitch in the anode
recipe is to bind the dry aggregate together to enable forming the anode into a block of
specific dimensions. It is also useful to fill some of the coke particle porosity. To obtain
good mechanical properties after forming, the anodes are baked to transform the
amorphous pitch into semi-crystalline coke.
Anode butts consist of the unconsumed portion of the anodes left after they are removed
from the pots (typically about 1/3 of their original size). Anodes are not consumed
completely to avoid metal contamination from the steel stubs. However, anode butt
surfaces are contaminated by sodium and other contaminants from the anode cover
material and frozen bath. Thus, a cleaning step is required before the butts are stripped
from the stubs. After cleaning and stripping, the butts are crushed, screened to the desired
size distribution and stored in silos for use in the production of fresh anodes. This reduces
the amount of waste materials and the amount of fresh coke needed to formulate the
anodes. Butts constitute approximately 15-30% of the green anode formulation (Fischer
and Perruchoud 1991).
1.4 Anode fabrication process
In the first step of the process, the dry aggregate particles (coke and butts) are
pre-processed by screening and crushing. The finer coke particles are produced by milling
some of the material in a ball mill as well as by collecting the dust throughout the anode plant.
The coke is usually classified in three distinct fractions: coarse, intermediate and fines.
The butts, which are less porous than the coke, consist mainly of coarse material (Fischer
et al. 1995). The typical particle size for each fraction is given in Table 1 (Jones 1986).
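The size classes of Table 1 can be expressed as a simple classification rule. The sketch below is only an illustration: the function name, the handling of sizes that fall exactly on a sieve opening, and the "oversize" label for coke particles larger than 6.3 mm are assumptions made for the example and do not come from (Jones 1986).

```python
def classify_coke_particle(size_um: float) -> str:
    """Assign a coke particle to a size fraction using the limits of Table 1.

    Limits in micrometres: fine < 150, intermediate 150 to 600, coarse 600 to 6300.
    A particle exactly on a limit is assumed to be retained by that sieve.
    """
    if size_um <= 0:
        raise ValueError("particle size must be positive")
    if size_um < 150:
        return "fine"
    if size_um < 600:
        return "intermediate"
    if size_um < 6300:
        return "coarse"
    return "oversize"  # hypothetical label, not a fraction defined in Table 1


for size in (75, 300, 2000):
    print(size, "um ->", classify_coke_particle(size))
```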
Table 1 – Typical dry aggregate particle size (Jones 1986)
Dry aggregate          Fraction        Particle size (US mesh)   Particle size (µm)
Coke                   Coarse          -¼ in/+30                 -6.3 mm/+600
Coke                   Intermediate    -30/+100                  -600/+150
Coke                   Fine            -100                      -150
Butts & baked scrap                    max 1 in                  max 25 mm
The fineness of the fines size fraction is characterized by the Blaine number, which
measures the particle surface area. Blaine number increases with decreasing particle size
because the particle surface area increases for smaller sizes. Hence, it is used to
characterize particles too small to be classified by sieve analysis. This parameter is usually
closely monitored by the paste plant operators.
The dry aggregate blend is formulated using weigh belts and is discharged into
pre-heating equipment. Dry aggregate temperature is raised to between 150 and 200°C (Hulse
2000) prior to adding pitch to the dry aggregate blend. Pitch is also pre-heated to a
temperature ranging between 170 and 230°C (Hulse 2000) before it is incorporated in the
dry aggregate mix to form the anode paste. The temperature difference between the pitch
and the dry aggregate is closely monitored to avoid partial solidification of the pitch on the
coke particles when these are put into contact, which would hinder proper pitch
penetration in the filler matrix and lead to a more heterogeneous paste. The paste is then
fed into a mixer in order to evenly distribute the pitch within the dry aggregate and to
ensure that the internal pores of the coke particles are filled with the binder. Mixing
temperature is usually set between 155°C and 180°C. Anode quality generally increases
with increased coke and pitch temperature up to the degassing temperature of pitch
volatiles. The paste viscosity decreases as the temperature increases, which improves
the mixing, spreading and penetration of the binder matrix in the paste (Hulse
2000).
The paste’s pitch ratio is also of great importance. Under-pitched anodes have
insufficient mechanical properties, leading to anode failure in the pots, and high electrical
resistivity due to poor binding. Over-pitched anodes lead to slump formation
(i.e. problems when forming the anodes), to high weight loss, swelling and crack formation
during baking due to greater volatiles degassing, to packing material sticking to the anodes
during baking and, finally, to stub hole deformations (Mannweiler & Keller 1994; Hulse 2000).
Pitch demand (i.e. the appropriate amount of pitch for a given dry aggregate) is a function of
the coke fines fraction and filler particle properties, but also of mixing temperature and duration.
There exists an optimum combination of particle size distribution, formulation, mixing duration
and temperature, and pitch ratio (Belitskus 1978; Hulse 2000).
Optimum pitch demand (OPD) is defined by the amount of pitch needed to obtain optimum
anode properties for a given type of coke, formulation and processing parameters. This is
illustrated in Figure 4 where the baked apparent anode density (BAD), a key anode
property, and the amount of pitch needed to reach the OPD are different for two paste
recipes. It is shown in this figure that the BAD increases with pitch % up to the optimum.
Then, adding more pitch becomes detrimental to the anode quality.
Figure 4 – Illustration of the difference in pitch demand for two paste mixes
In this thesis, the optimum pitch demand will often be defined using the baked anode
apparent density (BAD). The BAD is used because it usually correlates well with the
optimum of other anode properties (Belitskus 1978; Belitskus 1981; Belitskus 1993;
Belitskus & Danka 1988; McHenry et al. 1998; Hulse 2000; Belitskus 2013). It is also
straightforward to measure on small anodes (i.e. lab scale or core samples). Other
properties such as electrical resistivity or any groups of properties can also be used to
define the OPD (Hulse 2000).
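One simple way to locate the optimum from a pitch series is to fit a parabola to the measured BAD values and take its vertex. The sketch below is only an illustration of that idea and is not the procedure used in this thesis: the function names are hypothetical, the quadratic shape is an assumption, and the five data points are synthetic values chosen for the example, not measurements.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c. Returns (a, b, c)."""
    s = [sum(x ** k for x in xs) for k in range(5)]                  # s[k] = sum(x^k)
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]  # t[k] = sum(y*x^k)
    # Augmented normal equations for the unknowns (a, b, c).
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def optimum_pitch(pitch_pct, bad):
    """Estimate the pitch % that maximizes BAD from a fitted parabola."""
    x_mean = sum(pitch_pct) / len(pitch_pct)
    # Centering the pitch values keeps the normal equations well conditioned.
    a, b, _ = fit_quadratic([x - x_mean for x in pitch_pct], bad)
    if a >= 0:
        raise ValueError("fitted curve has no maximum in pitch %")
    return x_mean - b / (2 * a)


# Synthetic pitch series (hypothetical values, for illustration only).
PITCH_PCT = [13.0, 14.0, 15.0, 16.0, 17.0]
BAD = [1.54, 1.57, 1.58, 1.57, 1.54]
print(f"Estimated OPD: {optimum_pitch(PITCH_PCT, BAD):.2f} % pitch")
```

Because the synthetic series is symmetric about 15 % pitch, the estimated optimum falls at that value. With real data, the number and spacing of pitch levels would determine how reliable such an estimate is.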
Anode forming is performed either by pressing or vibro-compaction. The quality of pressed
anodes depends largely on raw material properties and recipe. Vibrated anode quality
depends also on raw material quality but is more sensitive to anode forming process
parameters (e.g. paste temperature during vibro-compaction) (Hulse 2000). If the
temperature is too high, the paste viscosity will be too low and the anode could collapse
when taken out of the mold. Conversely, a low temperature causes high viscosity, which leads to
improper compaction. How evenly the paste is distributed within the mold also has an
impact on anode quality. Uneven distribution usually leads to anisotropic anode properties
within the block.
Anode baking is typically performed using an open ring baking furnace. Details of the
operation of this type of unit are provided in (Fischer et al. 1995; Keller & Sulger 2008). In
brief, a section of the furnace is made of several pits (generally 6 or 7) where the anodes
are stacked vertically (e.g. 6 anodes wide by 3 anodes high). The space between each pit
(i.e. flue wall) is a cavity where natural gas and pitch volatiles are burned in order to supply
heat to the anodes according to a pre-defined baking cycle. A schematic of a baking
furnace section is provided in Figure 5 (Grégoire et al. 2013).
Figure 5 – Schematic of a baking furnace section (Grégoire et al. 2013)
Anode baking aims essentially at developing the mechanical properties of the anodes. The
pitch needs to be coked in order to increase the anode mechanical strength so that it can
withstand the pot’s operating temperature (about 960°C). The heat-up rate, the final temperature
and the soaking time (i.e. the amount of time the anodes are maintained at the final temperature)
are the most important baking parameters (Mannweiler & Keller 1994). Also, a minimal
temperature gradient between the different positions within the furnace needs to be
maintained to minimise the variability in the anode properties at different positions in the
furnace (Fischer et al. 1993). This is accomplished by an appropriate design of the baffles
and flow path in the flue walls as well as by adjusting the pressure and diameters of the
burners to obtain an optimum flame profile. Some process parameters can also be adjusted.
First, the under-pressure can be manipulated to adjust the amount of oxygen in the flues.
Second, temperature profiles can be adjusted in response to variations
in raw material (e.g. amount of volatiles in the pitch).
1.5 Anode properties
Anode quality is defined by a number of properties measured in the laboratory from core
samples collected from a certain number of baked anodes according to a pre-defined
sampling plan. These quality attributes (listed in Table 2) are grouped into four categories:
physical properties, mechanical properties, reactivity and chemical composition (e.g.
contaminants). Details on the laboratory analyses are available in (Fischer et al. 1995).
Table 2 – Anode properties typically measured from core samples
Group                  Property                            Unit
                       Air permeability                    nPm
                       Apparent density                    kg/dm³
                       Thermal conductivity                W/(m·K)
                       Electrical resistivity              mΩ·cm
                       Flexural strength                   MPa
                       Fracture energy                     J/m²
                       Coefficient of thermal expansion    K⁻¹
                       Compressive strength                MPa
                       Young's modulus                     GPa
                       Real density                        kg/dm³
                       Crystallite size (Lc)               nm
                       Ash content                         %
CO2 reactivity         CO2 loss (CRL)                      %
                       CO2 dust (CRD)                      %
                       CO2 residue (CRR)                   %
Air reactivity         Air loss (ARL)                      %
                       Air dust (ARD)                      %
                       Air residue (ARR)                   %
Chemical impurities    Sulphur (S)                         %
                       Vanadium (V)                        ppm
                       Nickel (Ni)                         ppm
                       Silicon (Si)                        ppm
                       Iron (Fe)                           ppm
                       Aluminium (Al)                      ppm
                       Sodium (Na)                         ppm
                       Calcium (Ca)                        ppm
The anode properties measured in the laboratory are used as an indication of process
stability and anode quality, but there are some issues with the use of core samples for
quality control. Typically, less than 1% of the weekly anode production is sampled (i.e.
core physically extracted from the anode). Moreover, the core samples (50 mm diameter
and approximately 400 mm in length) are collected from a specific location on the anode
and are not necessarily representative of the anode block (i.e. approximately 0.6 m³ and
930 kg), whose properties can be anisotropic. Furthermore, the cores are generally not
long enough to measure all the properties on the same sample. Thus, the lab results might
not be representative of the whole anode population (Sinclair & Sadler 2009). There is also
a delay of a few weeks between the sampling and the availability of the lab results, so
oftentimes, the anodes have already been set in the pots and it is too late to take
corrective actions on the anode manufacturing process. However, when used correctly
(i.e. good sampling strategies), these laboratory measurements can be used to assess the
overall anode quality over a certain time window (aggregated measurements) or to
compare the effect of different operating parameters. But due to the long processing time,
the anode properties obtained from the quality control laboratory cannot be used for real-
time feedforward/feedback control (Sinclair & Sadler 2006; Sinclair & Sadler 2009).
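The representativeness issue can be made concrete with the dimensions quoted above. The sketch below compares the volume of one cylindrical core (50 mm diameter, about 400 mm long) with the approximate volume of an anode block (0.6 m³). The function and constant names are assumptions made for this example; only the dimensions come from the text.

```python
import math

CORE_DIAMETER_M = 0.050   # core diameter quoted in the text
CORE_LENGTH_M = 0.400     # approximate core length quoted in the text
ANODE_VOLUME_M3 = 0.6     # approximate anode block volume quoted in the text


def core_volume_fraction(diameter_m: float, length_m: float, anode_volume_m3: float) -> float:
    """Return the fraction of the anode volume represented by one cylindrical core."""
    core_volume_m3 = math.pi * (diameter_m / 2) ** 2 * length_m
    return core_volume_m3 / anode_volume_m3


fraction = core_volume_fraction(CORE_DIAMETER_M, CORE_LENGTH_M, ANODE_VOLUME_M3)
print(f"One core represents about {100 * fraction:.2f} % of the anode volume")
```

One core therefore represents roughly 0.13 % of the volume of the anode it is taken from. Since fewer than 1 % of the anodes are cored, the sampled material is an even smaller share of the total production.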
1.6 Problems
As introduced earlier, anode quality is critical to the optimum operation of the aluminium
smelters. However, the quality of baked anodes is becoming less consistent over time for
three main reasons. First, the availability of good quality anode raw materials (coke and
pitch) is declining. Second, most smelters purchase coke materials from an increasing
number of suppliers to meet availability and quality targets (i.e. they blend cokes) but also
to reduce purchasing costs. This introduces supplier-to-supplier variations in addition to
lot-to-lot variability from any given supplier. Finally, some of the anode plants (especially
the older plants) do not have the flexibility to cope with such an increased variability.
Indeed, frequent adjustments to process operating parameters are required to attenuate
the impact of raw material variability, and most plants are not equipped to make those
corrections in a timely fashion. For example, there is a general lack of on-line sensors to
measure critical-to-quality attributes at the various stages of the anode manufacturing,
from raw materials characterization, to the different processing steps and final anode
quality assessment. Minimizing the impact of raw material variability is of great importance
for aluminium smelters because of the significant impact it has on the performance of the
reduction cells and the economic performance of the smelters (Fischer & Perruchoud
1991; Jentoftsen et al. 2009).
In the future, the suppliers of coke and pitch are expected to support the growing demand
for aluminum, but the major issues will be the increasing cost and the limited availability of
high quality raw material (Mannweiler et al. 2009; Baron et al. 2009; Edwards et al. 2012).
There are several reasons for the decrease in quality of the carbon materials. The most
important one is that both coke and pitch are by-products of their respective industries,
which have little or no economic incentive to improve or control their quality. The
second reason is that both raw materials depend on the source of crude oil or coal
and on the conditions of the coking or distillation process used to produce them. Coke
properties are highly dependent on the diversity of crude oil sources. As low sulfur, low
contaminant crudes become rarer, more contaminated crudes are being refined. This
leads to higher levels of contamination in the coke, especially for vanadium and sulphur.
It also leads to changes in the coke micro-texture from a sponge-like appearance to a
more isotropic texture, which can increase anode cracking in the cells (Edwards et al.
2009; Edwards et al. 2012).
Due to the scarcity of high quality raw materials on the market, the higher cost of coke and
pitch drives carbon plants to change suppliers more frequently, which in turn further
increases the variability of the raw materials entering the carbon plants. Some
manufacturers are even considering using non-traditional, lower cost types of coke in the
anode paste (e.g. shot coke (Edwards et al. 2009)).
Unfortunately, the anode manufacturers are not well prepared to manage this increase in
raw material variability. There is a general lack of real-time measurements of key raw
material, green anode and baked anode properties. The coke and pitch properties are
characterized by laboratory analysis or by simply using the certificate of analysis (COA)
from the manufacturer, but these are often available only after the anodes have been
manufactured. Furthermore, the baked anode properties are measured on cylindrical cores
sampled from less than 1% of the production.
The issues with the representativeness of the anode core sample properties have been
discussed earlier, but the main problem is with the long delay to obtain those
measurements (i.e. a few weeks). The results are available too long after the anode has
been produced to be used for implementing corrective actions in the process (Sinclair &
Sadler 2006; Sinclair & Sadler 2009). Only long term deviations can be observed by this
monitoring strategy, and it is of limited use when raw material supply changes frequently.
Currently, the green anode quality is controlled by manipulating the amount of pitch in the
paste (i.e. pitch ratio in the formulation). Hulse (2000) presented a review of
empirical model-based pitch optimization techniques. At the plant, a combination of
operator experience, visual inspection of the newly formed green anode and the use of the
green anode density (GAD) as a quantitative metric is used to estimate the required
amount of binder (i.e. pitch demand) in the anode. Pitch is also adjusted to ensure smooth
operation of the paste plant for any given formulation of paste or raw material blend.
Unfortunately, the GAD is not a good indicator of baked anode properties since it does not
show an optimum based on the pitch demand. The GAD increases even if the baked
anode properties are decreasing due to over-pitching (Hulse 2000). This is illustrated in
Figure 6, in contrast to the BAD, which has an optimum. Also, since the choice of the
optimum pitch level depends on the operator’s experience, the quality of the anode can
change from one operator to another.
Figure 6 – Illustration of the different behavior of GAD and BAD as a function of pitch %
Other process conditions (e.g. mixing temperature, paste formulation, etc.) are generally
kept at constant operating values. Almost no real-time changes are implemented in
response to the raw material variability due to the lack of on-line quantitative quality
measurements. This situation is illustrated in Figure 7: given the increasing variability
of incoming raw materials, if the process conditions are kept constant, the variability
propagates through to the final baked anode properties. Real-time adjustments of process
conditions, through feedforward and feedback control actions, are necessary to help
produce anodes of consistent quality.
Figure 7 – Effect of constant operating conditions
In summary, the long delays in obtaining the raw material and baked anode properties
from the laboratory and the lack of online quantitative measurements in the paste plant
make it very difficult to face the increase in coke and pitch variability.
This situation can be improved by using the data that are already available at the smelters
and carbon plants. Tessier et al. (Tessier, Duchesne, Tarcy, et al. 2011) identified a set of
combined (i.e. multivariate) anode properties that could help ensure good pot performance
in the smelter. The availability of the information on anode quality, in real-time and for all
anodes, could prevent the introduction of faulty anodes in the reduction cells. The carbon
plant data coming from the raw material properties and the process operation conditions
can be used to predict the baked anode properties. Lauzon-Gauthier et al. (Lauzon-Gauthier
et al. 2012) have shown that, at the Deschambault smelter (ADQ), between 20 and
60% of the variance (i.e. model fit) in the anode properties (i.e. physical and mechanical
properties and gas reactivity) can be explained by using the routinely collected raw
material and process data. The multivariate statistical model proposed in that work only
used the data routinely measured in the plant and was shown to provide useful predictions
of the baked anode properties, available right after the baking process. This could allow for
early detection of faulty anodes and investigation of process deviations. However, the
model could only predict the properties for anodes baked at two specific positions within
the furnace due to the anode sampling strategy in place at ADQ. The above studies have
both shown that the data already routinely collected at the plant contains relevant and
useful information, but that new measurements are still necessary to explain a greater
percentage of the variations in baked anode properties which, in turn, will help improve
quality control. A combination of the currently available data and new measurement
techniques is therefore sought as a promising solution.
There are good opportunities in the anode manufacturing process to develop new tools
and sensors to improve the measurement of process variability and increase the ability of
the manufacturers to reduce the impact of the raw material variability. Machine vision
applications for process monitoring and control have become increasingly popular in
recent years. Duchesne et al. (Duchesne et al. 2012) reviewed several of these
applications including the detection of defects in lumber, monitoring of the mineral
froth in a flotation process and detection of steel slab surface defects. Since the variations
in raw material quality and operating conditions of the paste plant can influence the visual
appearance of the anode paste, there is a good opportunity to develop a machine vision
sensor capable of monitoring change in the paste quality.
Several methods for characterizing coke or paste properties using images have been
reported in the literature. Most of the proposed methods used some automatic image
analysis scheme, but none of them can be applied in real time. Eilertsen et
al. (Eilertsen et al. 1996) have proposed a method for analysing the coke micro-texture
(i.e. coarseness and anisotropy) using a polarising light microscopy technique. Adams et
al. (Adams et al. 2002) developed a method to measure the thickness of the pitch layer on
coke particles by microscopy. Rorvik et al. (Rorvik et al. 2006) also proposed a method
using a microscope to measure the pitch layer thickness and the pore sizes. The main
disadvantages of these methods are that the sample size is small and that time-consuming
sample preparation is required for each measurement. These techniques are not rapid
enough to support on-line monitoring of the process.
Sadler (Sadler 2012) proposed a method to monitor the macroscopic visual appearance of
the baked anode surfaces using a microscope and found that visible structural changes in
the surface texture could be observed on anodes fabricated under different operating
conditions. The method was used on baked anodes, but a method applicable to green anodes
or paste would enable carbon plants to react to process changes before the baking step.
Moreover, this approach was not automated and suffers from the same drawbacks as the
other microscopy-based methods described in the previous paragraph.
An internal report from Alcoa (Adams et al. 2007) describes a method based on images
used to measure the amount of pitch in the paste. This is the only known method of
automatic paste image analysis so far. Its major drawback was its lack of robustness to
changes occurring in industrial paste samples.
The fundamental hypothesis made and tested in this Ph.D. thesis is that the visual
textural appearance of the paste is influenced by the dry aggregate particle size
distribution, the coke particle porosity (i.e. pitch demand), the amount of pitch and the
processing conditions of the paste, and that a machine vision approach should allow
quantifying the effect of these parameters. To support this, a few anode paste images
obtained under different formulation and processing conditions are shown in Figure 8.
Figure 8 – Illustration of the effect of different raw material and processing conditions on
the anode paste visual appearance: a) and b) two different industrial pastes and c) laboratory
paste
1.7 Objectives
The general objective of this thesis is to address issues related with the lack of fast and
relevant measurements to help cope with raw materials and process variability and enable
real-time quality control of the green and baked anodes. The approach is twofold. First, a
new non-destructive machine vision system is developed to add real-time information
about the green anode paste quality. This sensor could be used in a feedforward/feedback
control strategy to compensate for disturbances in raw material properties or formulation
variability that are difficult to measure with the usual monitoring approach. Second, a new
multi-block latent variable method is developed to improve the interpretation of the data
already available from the manufacturing process and to allow the additional data coming
from the machine vision sensor, and from other non-destructive real-time measurements
in the future, to be included in empirical models of the carbon plant.
The new sensor should be sensitive to changes in formulation and in the pitch demand of
the paste. The images are taken on the paste after mixing, but before compaction. An
illustration of the methodology is presented in Figure 9.
Figure 9 – Schematic of the machine vision methodology for anode paste
The first specific objective is to develop a machine vision algorithm (i.e. image
preprocessing, image analysis and features selection) at a laboratory scale. This method
was developed with lab scale anodes in the laboratory at Université Laval. Paste samples
were prepared by varying the conditions of fabrication. These variations included the use
of different types of coke and pitch, variations of the dry aggregate particle size
distribution, of the fine particles fineness (i.e. the Blaine number), of the amount of pitch as
well as the mixing temperature of the paste. Each paste sample was imaged using a
camera in the visible spectrum (i.e. RGB). The image texture characteristics that enabled
the differentiation and classification of the different blends of paste were identified. The
image texture features were computed using advanced image texture analysis methods: the
gray level co-occurrence matrix (GLCM) and wavelet texture analysis (WTA).
Multivariate latent variable statistical methods such as principal component analysis (PCA)
and projection to latent structures (PLS) were used to analyse the image features.
The second specific objective is to test the robustness of the machine vision sensor for
industrial scale anode paste. This was performed in the Alcoa Deschambault smelter’s
(ADQ) carbon plant. Off-line samples of paste were taken from the process after mixing
during several days of operation and the sensitivity of the sensor developed in the
laboratory to the various process conditions was tested.
Figure 9 labels: formulation, pitch, particle size distribution, ...; image texture features extraction (GLCM, WTA); classification (PCA) and regression (PLS); anode properties (BAD, resistivity, ...); feedforward controller; feedback controller; forming and baking parameters.
The third specific objective is the development of a new sequential multi-block PLS
algorithm (SMB-PLS). Based on observations made in previous work by Tessier et al.
(Tessier, Duchesne & Tarcy 2011) and Lauzon-Gauthier et al. (Lauzon-Gauthier et al.
2012) it was found that there is a need to improve the visualization and interpretation of
PLS models for large and complex industrial datasets. It is also important to develop such
an algorithm as new real-time measurements (e.g. the paste machine vision sensor) become
available. This new algorithm will be useful in the future to integrate all the data related to
anode quality into one single empirical model. These data can be available from the raw
materials, the process operating conditions, some real-time non-destructive
measurements of the paste, green anodes and baked anodes, the baking furnace
operation data, etc. The algorithm was developed using a simulation dataset and the
anode manufacturing data from (Lauzon-Gauthier et al. 2012). The new method is
described and compared to the multi-block PLS (MB-PLS) and sequential orthogonal PLS
(SO-PLS). The use of the machine vision data from the industrial paste is also discussed.
The SMB-PLS algorithm was not developed to be specific to the anode manufacturing
process and could also be applied to other complex multi-block structured problems.
1.8 Thesis organization
This thesis is organized as follows. Chapter 2 and Chapter 3 provide background
information on statistical and image analysis methods, respectively. Chapter 4 discusses
the material properties and the experimental procedures. Chapter 5 presents a new
sequential multi-block PLS algorithm used to improve the interpretation of PLS models built
on industrial data, using the anode manufacturing process data. Chapter 6 discusses the
choice of texture features for the anode paste machine vision methodology and the
results obtained with laboratory pastes and anodes. Chapter 7 focuses on describing the
industrial paste results obtained with the machine vision method and the use of these data in
an SMB-PLS model of the paste plant. Finally, some conclusions are drawn and future work
is discussed.
Chapter 2 Latent variable methods
This chapter presents the relevant statistical background information useful for the
understanding of the work presented in this thesis. It is a reproduction of chapter 3 of the
author’s M.Sc. thesis (Lauzon-Gauthier 2011) with modifications to sections 2.4 and 2.5.
The basic latent variable methods for multivariate statistical analysis are presented in this
chapter. These methods were developed in the field of chemometrics, defined by Svante
Wold as “How to get chemically relevant information out of measured chemical data, how
to represent and display this information, and how to get such information into data” (Wold
1995). The goal of these methods is to extract the most useful information from large and
complex datasets. They were also extended to chemical process analysis and monitoring
in the early 1990s (Wise & Gallagher 1996; MacGregor & Kourti 1995). Two of the
most used methods, Principal Component Analysis (PCA) and Projection to Latent
Structures (PLS), also referred to as Partial Least Squares, are presented in the following
sections together with a discussion on data scaling, the selection of the number of latent
variables to include in the models and various interpretation tools.
In this thesis, the following notation is used. Scalar quantities are identified using normal
lower case characters (scalar). Vectors are shown using bold lowercase characters
(lowercase), matrices are represented by bold capital characters (CAPITAL) and the
transpose operator is denoted by a superscript T (e.g. $\mathbf{X}^T$ or $\mathbf{t}^T$).
2.1 Principal Component Analysis (PCA)
Principal Component Analysis is the basic multivariate data analysis approach. It is used
to model and investigate multivariate datasets. Detailed tutorials and examples can be
found in (Wold et al. 1987; Kourti 2005). Assume a data matrix X is available consisting of
I rows, commonly called observations or measurements, obtained from J different
variables (columns of X) as illustrated in Figure 10. Most industrial datasets are very large,
noisy, and the variables are typically highly collinear (i.e. X is not full rank). However,
measuring hundreds to thousands of variables on a given process does not necessarily
mean that as many independent events occurred on this process. In fact, process
operation is usually driven by a much lower number of underlying independent events
called lurking or latent variables (LV) involving linear combinations of the original variables
(the p’s in Figure 10). These LVs cause the large number of process variables to vary
together in certain directions (i.e. in a correlated fashion). PCA is one of the basic methods
for extracting these few latent variables capturing most of the variance in a dataset. The
projection of the dataset onto the lower dimensional space of A dimensions spanned by
the latent variables can then be used to visualise and interpret the relationships between
the variables and between the observations.
Figure 10 – Schematic representation of PCA
The first principal component is the linear combination of the J columns (variables) of X,
defined by the orthonormal vector p1, explaining the greatest amount of variance in the
dataset. This is mathematically formulated as an eigenvector-eigenvalue problem with the
following objective function:
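$$\max_{\mathbf{p}_1}\left\{\mathbf{p}_1^T\mathbf{X}^T\mathbf{X}\mathbf{p}_1\right\}\quad\text{subject to}\quad\mathbf{p}_1^T\mathbf{p}_1 = 1 \tag{2.1}$$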
Where the term within brackets represents the variance of the first latent variable t1
defined as the projection of X in the direction of p1:
$$\mathbf{t}_1 = \mathbf{X}\mathbf{p}_1 \tag{2.2}$$
This latent variable explains the most variance in X and it is removed from the dataset
leaving the residual matrix E1:
$$\mathbf{E}_1 = \mathbf{X} - \mathbf{t}_1\mathbf{p}_1^T \tag{2.3}$$
If the first component is not sufficient for explaining the variations in X, a second PCA
component can be added to the model. It corresponds to the linear combination of the J
variables explaining the greatest amount of variance not captured by the first component,
(i.e. left in the residual matrix E1). The second component is the solution to the following
eigen problem:
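$$\max_{\mathbf{p}_2}\left\{\mathbf{p}_2^T\mathbf{X}^T\mathbf{X}\mathbf{p}_2\right\}\quad\text{subject to}\quad\mathbf{p}_2^T\mathbf{p}_2 = 1\ \text{and}\ \mathbf{p}_1^T\mathbf{p}_2 = 0 \tag{2.4}$$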
The additional constraint for this second component ensures that the latent variables are
orthogonal to each other. Additional components can be added sequentially to the PCA
model using expression 2.4 until the desired number of latent variables (A) is computed.
The maximum number of LVs is J, but for industrial data A is usually smaller than J (A <<
J) due to the highly collinear structure of the data. The final model has the following
structure:
$$\mathbf{X} = \mathbf{T}\mathbf{P}^T + \mathbf{E} \tag{2.5}$$
Where the score and loading vectors are collected in the matrices T (I×A) and P (J×A) and
the residuals are stored in matrix E (I×J).
In summary, PCA performs an eigenvector decomposition: the p vectors are the
eigenvectors of $\mathbf{X}^T\mathbf{X}$ and the t vectors are, up to a scaling factor, the eigenvectors of $\mathbf{X}\mathbf{X}^T$.
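The eigenvector relationships stated above can be checked numerically. The following is only an illustrative sketch on synthetic data: the dataset, the random seed and all variable names are assumptions introduced here, and NumPy's SVD and symmetric eigensolver stand in for whatever software was actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic collinear data (assumption): 50 observations of 6 variables
# driven by 2 underlying latent events plus a small amount of noise.
latent = rng.normal(size=(50, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(50, 6))
X = X - X.mean(axis=0)  # mean-center each column

# SVD route: X = U S V^T, so the loadings are P = V and the scores are T = U S.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
P_svd = Vt.T
T_svd = U * s

# Eigenvector route: the p vectors are the eigenvectors of X^T X.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
order = np.argsort(eigvals)[::-1]  # sort by decreasing eigenvalue
eigvals, P_eig = eigvals[order], eigvecs[:, order]

# Fraction of the total variance captured by each component.
explained = s**2 / np.sum(s**2)
print(np.round(explained, 4))
```

In this sketch the eigenvalues of $\mathbf{X}^T\mathbf{X}$ equal the squared singular values, and the loading vectors from both routes agree up to their sign.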
For the numerical computation of the p and t vectors, two alternative algorithms are
available. First, one can use Singular Value Decomposition (SVD) to compute all possible
components (a=1, 2, … J) simultaneously and then select the number of LVs to keep in
the model (A<J) using some selection criteria (discussed later in this chapter). The second
option is to use the Nonlinear Iterative Partial Least Squares (NIPALS) algorithm, which computes
the components one at a time followed by significance testing. A decision to add the next
component is again taken based on some selection criteria. The advantage of this
approach is computational economy in the sense that only those components that are
deemed significant are calculated. The NIPALS algorithm is used in most commercially
available software packages. PCA components are scaling dependent and this issue will be
discussed in section 2.3. Figure 11 shows the details of the NIPALS algorithm for PCA
(Geladi & Kowalski 1986).
Figure 11 – NIPALS algorithm for PCA
When J is very large, which is normally the case with industrial data, this method is
advantageous compared to SVD decomposition since it is often not necessary to compute
all latent variables (in this case, A<<J).
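The NIPALS steps for PCA (Figure 11) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this work: the function name, the convergence tolerance, the iteration limit, the choice of starting column and the synthetic data are all assumptions.

```python
import numpy as np

def nipals_pca(X, n_components, tol=1e-10, max_iter=500):
    """NIPALS PCA following the steps of Figure 11; X is assumed pre-scaled."""
    E = np.array(X, dtype=float)
    I, J = E.shape
    T = np.zeros((I, n_components))
    P = np.zeros((J, n_components))
    for a in range(n_components):
        # Step 1: start from a column of E (assumption: the largest-variance one).
        t = E[:, np.argmax(E.var(axis=0))].copy()
        for _ in range(max_iter):
            p = E.T @ t / (t @ t)              # step 2.1
            p /= np.sqrt(p @ p)                # step 2.2
            t_new = E @ p                      # step 2.3
            done = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
            t = t_new
            if done:                           # step 2.4: convergence check on t
                break
        E = E - np.outer(t, p)                 # step 3: remove the component
        T[:, a], P[:, a] = t, p                # step 4: store t and p
    return T, P, E                             # step 5 is the loop over a

# Synthetic example (assumption): 2 dominant directions in 5 variables.
rng = np.random.default_rng(1)
latent = rng.normal(size=(40, 2)) * np.array([5.0, 2.0])
X = latent @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(40, 5))
X = X - X.mean(axis=0)

T, P, E = nipals_pca(X, n_components=2)
```

Only the two requested components are computed here, which is the computational economy mentioned above. Comparing the norms of the score vectors with the leading singular values of X is one way to check such a sketch.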
2.2 Projection to Latent Structures (PLS)
Projection to Latent Structures is a multivariate regression method. Assume that a second
dataset Y is also available consisting of H variables and I observations (e.g. response
variables such as product quality attributes) as shown in Figure 12. The PLS method is
used to explore the relationships existing within and between both datasets, X and Y. It
can be seen as an extension of PCA to two sets of data.
Figure 12 – Matrices of PLS
The basic assumption behind PLS is that variations in X and Y are linked together by a few
underlying events described by a common set of A latent variables T and U, respectively
(I×A). These latent variables in the X and Y spaces are selected in such a way that the
covariance between the two datasets is maximized (i.e. that T is most predictive of U).
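NIPALS algorithm for PCA (Figure 11):
1. Set t to any column of X.
2. Start convergence loop.
2.1. $\mathbf{p} = \mathbf{X}^T\mathbf{t}/(\mathbf{t}^T\mathbf{t})$
2.2. $\mathbf{p} = \mathbf{p}/(\mathbf{p}^T\mathbf{p})^{1/2}$
2.3. $\mathbf{t} = \mathbf{X}\mathbf{p}$
2.4. Check for convergence of t and p. Continue to step 3 if converged.
3. $\mathbf{E} = \mathbf{X} - \mathbf{t}\mathbf{p}^T$
4. Store p and t as new columns in P and T.
5. Restart at step 1, replacing X by E.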
Additional details and tutorials can be found in (Geladi & Kowalski 1986; Höskuldsson
1988; Burnham et al. 1996; Burnham et al. 1999; Wold, Sjöström, et al. 2001; Martens
2001; Kourti 2005).
Mathematically, the latent variables are defined as a set of linear combinations of the X
variables expressed by the so-called weight vectors $\mathbf{w}_i$ (i = 1, ..., A). These weights are
computed in such a way as to maximize the squared covariance between X and Y. The
solution to this problem is again formulated as an eigen problem with the following
objective function:
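$$\max_{\mathbf{w}_i}\left\{\mathbf{w}_i^T\mathbf{X}^T\mathbf{Y}\mathbf{Y}^T\mathbf{X}\mathbf{w}_i\right\}\quad\text{subject to}\quad\mathbf{w}_i^T\mathbf{w}_i = 1\ \text{and to}\ \mathbf{w}_i^T\mathbf{w}_j = 0\ \text{for}\ i \neq j \tag{2.6}$$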
As for PCA, the set of constraints ensure that the weight vectors wi are orthonormal and
that latent variables are orthogonal to each other. The PLS model structure is described
below, and is also shown schematically in Figure 12.
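$$\mathbf{X} = \mathbf{T}\mathbf{P}^T + \mathbf{E} \tag{2.7}$$
$$\mathbf{Y} = \mathbf{T}\mathbf{Q}^T + \mathbf{F} \tag{2.8}$$
$$\mathbf{T} = \mathbf{X}\mathbf{W}^* \tag{2.9}$$
$$\mathbf{W}^* = \mathbf{W}\left(\mathbf{P}^T\mathbf{W}\right)^{-1} \tag{2.10}$$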
Where T (I×A) is the set of A latent variables defining the common latent variable space
capturing the relationships between X and Y. They correspond to those combinations of
the X variables that are the most highly correlated with the Y data. The weights of each
variable in each component are collected in the weight matrix W* (J×A). The P (J×A) and
Q (H×A) matrices contain the loading vectors defining the latent variable spaces of X and
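Y, respectively. E (I×J) and F (I×H) are the PLS model residuals. It was shown by
(Höskuldsson 1988) that the vectors w, q, t and u are the eigenvectors of the
matrices $\mathbf{X}^T\mathbf{Y}\mathbf{Y}^T\mathbf{X}$, $\mathbf{Y}^T\mathbf{X}\mathbf{X}^T\mathbf{Y}$, $\mathbf{X}\mathbf{X}^T\mathbf{Y}\mathbf{Y}^T$ and $\mathbf{Y}\mathbf{Y}^T\mathbf{X}\mathbf{X}^T$, respectively.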
The NIPALS algorithm was adapted for PLS regression by (Geladi & Kowalski 1986;
Höskuldsson 1988) in order to compute the PLS latent variables sequentially. Again, only
the desired number of LVs is calculated. The algorithm is shown in Figure 13. The PLS
vectors are also scaling dependent. Data scaling and the selection of the number
of latent variables are discussed in sections 2.3 and 2.4, respectively.
Figure 13 – NIPALS algorithm for PLS
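The NIPALS steps for PLS can be sketched in the same way as for PCA. The code below is a minimal illustration and not the implementation used in this work: the function name, the tolerance, the iteration limit, the starting column for u and the synthetic data are assumptions. The last line of the function applies equation 2.10.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """NIPALS PLS following the steps of Figure 13; X and Y are assumed pre-scaled."""
    E = np.array(X, dtype=float)
    F = np.array(Y, dtype=float).reshape(E.shape[0], -1)
    (I, J), H, A = E.shape, F.shape[1], n_components
    W, P, Q = np.zeros((J, A)), np.zeros((J, A)), np.zeros((H, A))
    T, U = np.zeros((I, A)), np.zeros((I, A))
    for a in range(A):
        # Step 1: start from a column of F (assumption: the largest-variance one).
        u = F[:, np.argmax(F.var(axis=0))].copy()
        for _ in range(max_iter):
            w = E.T @ u / (u @ u)              # X weights
            w /= np.sqrt(w @ w)                # normalize to unit length
            t = E @ w                          # X scores
            q = F.T @ t / (t @ t)              # Y loadings
            u_new = F @ q / (q @ q)            # Y scores
            done = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if done:                           # convergence check on u
                break
        p = E.T @ t / (t @ t)                  # X loadings
        E = E - np.outer(t, p)                 # deflate X
        F = F - np.outer(t, q)                 # deflate Y
        W[:, a], P[:, a], Q[:, a], T[:, a], U[:, a] = w, p, q, t, u
    W_star = W @ np.linalg.inv(P.T @ W)        # equation 2.10
    return T, U, W, W_star, P, Q, E, F

# Synthetic example (assumption): 5 X variables, 2 Y variables, 30 observations.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 5))
Y = X @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(30, 2))
X, Y = X - X.mean(axis=0), Y - Y.mean(axis=0)

T, U, W, W_star, P, Q, E, F = nipals_pls(X, Y, n_components=3)
```

With these outputs, the model structure of equations 2.7 to 2.9 can be verified directly, for example by comparing `X @ W_star` with `T`.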
2.3 Data scaling
Both PCA and PLS methods are sensitive to how the X and Y data matrices are scaled.
When no prior knowledge is available on the relative importance of the variables, the
common practice is to scale them to unit variance after applying mean-centering. This
scaling procedure is applied to each variable (i.e. columns) of the X and Y data matrices.
Consider a column vector (xj) in the X data matrix and its mean value (xj,mean) and standard
deviation (xj,std). The scaled values (xj*) are obtained using the following equation (element
by element division is assumed):
$$\mathbf{x}_j^* = \frac{\left(\mathbf{x}_j - x_{j,mean}\right)}{x_{j,std}} \tag{2.11}$$
This method is also called auto-scaling. Mean-centering allows the computation of the
variations of the variables around their mean, and scaling to unit variance gives equal
importance to all the variables in the models, as not all of them are measured in the same
engineering units (Geladi & Kowalski 1986).
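As an illustration of equation 2.11, auto-scaling can be sketched as follows. The function name and the numerical values are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption since the text does not specify the estimator.

```python
import numpy as np

def autoscale(X, mean=None, std=None):
    """Mean-center and scale each column to unit variance (equation 2.11)."""
    X = np.asarray(X, dtype=float)
    if mean is None:
        mean = X.mean(axis=0)
    if std is None:
        std = X.std(axis=0, ddof=1)  # sample standard deviation (assumption)
    return (X - mean) / std, mean, std

# Hypothetical values: column 0 is a temperature (deg C), column 1 a pitch ratio (%).
X_train = np.array([[168.0, 15.2], [172.0, 15.8], [175.0, 16.1], [170.0, 15.5]])
X_scaled, x_mean, x_std = autoscale(X_train)

# New observations are scaled with the mean and standard deviation of the training data.
x_new_scaled, _, _ = autoscale(np.array([[171.0, 15.9]]), x_mean, x_std)
```

Reusing the training mean and standard deviation for new observations keeps them on the same scale as the data used to build the model.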
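NIPALS algorithm for PLS (Figure 13):
1. Set u to any column of Y.
2. Start convergence loop.
2.1. $\mathbf{w} = \mathbf{X}^T\mathbf{u}/(\mathbf{u}^T\mathbf{u})$
2.2. $\mathbf{w} = \mathbf{w}/(\mathbf{w}^T\mathbf{w})^{1/2}$
2.3. $\mathbf{t} = \mathbf{X}\mathbf{w}$
2.4. $\mathbf{q} = \mathbf{Y}^T\mathbf{t}/(\mathbf{t}^T\mathbf{t})$
2.5. $\mathbf{u} = \mathbf{Y}\mathbf{q}/(\mathbf{q}^T\mathbf{q})$
2.6. Check for convergence of t or u. Continue to step 3 if converged.
3. $\mathbf{p} = \mathbf{X}^T\mathbf{t}/(\mathbf{t}^T\mathbf{t})$
4. $\mathbf{E} = \mathbf{X} - \mathbf{t}\mathbf{p}^T$ and $\mathbf{F} = \mathbf{Y} - \mathbf{t}\mathbf{q}^T$
5. Store w, p, q, t and u as new columns in W, P, Q, T and U.
6. Restart at step 1, replacing X by E and Y by F.
2.4 Number of latent variables (A)
Industrial data are typically highly collinear and noisy. Collinearity implies that a small
number of latent variables are sufficient to capture and explain most of the variations in a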
dataset (X and/or Y). The corruption of the data by noise means that care must be
taken to model only the systematic variation (i.e. structured variations) and to guard against
overfitting the model with noise. When the correct number of latent variables (A) is
selected, the important information is stored in the loadings and weight matrices (P, Q and
W*) and the irrelevant variations are left in the residuals (E and F). The most commonly
used method for selecting the number of latent variables is cross-validation (Wold 1978),
but other methods also exist to determine the model order (Nomikos & MacGregor 1995;
Valle et al. 1999; Duchesne & MacGregor 2001).
The cross-validation (CV) method consists of adding latent variables to the model
until the latest component does not significantly improve predictions of X (PCA) or Y
(PLS). For the cross-validation procedure, the I observations in X and/or Y are divided into
g sub-groups of n observations (I=gn). Each sub-group is removed from the data once and
only once and the data in the remaining g-1 sub-groups are then used to build a PCA or
PLS model using a latent variables. Predictions are computed for the group left out of the
analysis and the prediction error sum of squares (PRESS) is computed for this sub-group.
PRESS(a) is the sum of the PRESS values for all g sub-groups for a model with a latent
variables (a = 1, 2, …, A). The model predictive ability is then evaluated with the predictive
multiple correlation coefficient ($Q^2_{CV}$) (Wold 1978):
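$$Q^2_{CV}(a) = 1 - \frac{\mathrm{PRESS}(a)}{\mathrm{SS}_R(a-1)} \tag{2.12}$$
where
$$\mathrm{PRESS}(a) = \sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{i,j} - \hat{y}^{\,pred}_{i,j}(a)\right)^2 \tag{2.13}$$
and
$$\mathrm{SS}_R(a) = \sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{i,j} - \hat{y}^{\,training}_{i,j}(a)\right)^2 \tag{2.14}$$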
In the above equations, I is the number of observations, J is the number of variables, a is
the number of model components (a = 1, 2, …, A), and $\mathrm{SS}_R(a-1)$ is the residual sum of
squares in fit of the model with a-1 latent variables. Equations 2.13 and 2.14 have exactly
the same form. However, they are not computed on the same dataset: $\mathrm{SS}_R$ is the residual in fit
while PRESS is the residual in prediction (i.e. data left out in cross-validation rounds).
$Q^2_{CV}$ is computed sequentially for each new component. A value of $Q^2_{CV}$ > 0 means that the
component improves the prediction ability of the model, whereas $Q^2_{CV}$ < 0 means that it
deteriorates it. The number of components chosen is the last component having $Q^2_{CV}$ > 0.
Another definition was also proposed for $Q^2_{CV}$ and is currently used in the ProMV™
software package (ProSensus Inc.). It was described in (Wold, Trygg, et al. 2001):
$$Q^2_{CV}(a) = 1 - \frac{\mathrm{PRESS}(a)}{\mathrm{SS}_Y} \tag{2.15}$$
Where $\mathrm{SS}_Y$ is the total sum of squares of Y. In this case, $Q^2_{CV}$ increases with
the number of components and starts decreasing when overfitting occurs. Usually, the
component chosen is the last one that increases $Q^2_{CV}$ by more than 1% (i.e. 0.01).
This definition of $Q^2$ (equation 2.15) is the one reported in this thesis when the predictive
ability of the models is assessed.
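The cross-validation procedure and the $Q^2_{CV}$ definition of equation 2.15 can be sketched as follows. This is a simplified illustration under several assumptions that do not come from this work: a single response variable (so that the PLS weights need no inner iteration), mean-centering only, an interleaved assignment of observations to sub-groups, synthetic data, and hypothetical function names. The selection rule is one possible reading of the 1% criterion, stopping at the first component that fails it.

```python
import numpy as np

def pls1_coefficients(X, y, A):
    """Regression vector b of a PLS model with A LVs and one response (centered data)."""
    E, f = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(A):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        p = E.T @ t / (t @ t)
        qa = f @ t / (t @ t)
        E, f = E - np.outer(t, p), f - qa * t
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)   # b = W (P^T W)^-1 q

def q2_cv(X, y, max_components, n_groups=5):
    """Cumulative Q2 in cross-validation for a = 1..max_components (equation 2.15)."""
    groups = np.arange(len(y)) % n_groups    # each observation is left out exactly once
    press = np.zeros(max_components)
    for g in range(n_groups):
        out = groups == g
        x_mean, y_mean = X[~out].mean(axis=0), y[~out].mean()
        for a in range(1, max_components + 1):
            b = pls1_coefficients(X[~out] - x_mean, y[~out] - y_mean, a)
            y_pred = (X[out] - x_mean) @ b + y_mean
            press[a - 1] += np.sum((y[out] - y_pred) ** 2)
    ss_y = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_y

def select_components(q2, min_gain=0.01):
    """Count components until one increases Q2 by less than min_gain."""
    gains = np.diff(np.concatenate(([0.0], q2)))
    n = 0
    for gain in gains:
        if gain <= min_gain:
            break
        n += 1
    return n

# Synthetic example (assumption): 8 X variables driven by 2 latent events.
rng = np.random.default_rng(3)
latent = rng.normal(size=(60, 2))
X = latent @ rng.normal(size=(2, 8)) + 0.05 * rng.normal(size=(60, 8))
y = latent @ np.array([1.0, -0.5]) + 0.05 * rng.normal(size=60)

q2 = q2_cv(X, y, max_components=4)
n_lv = select_components(q2)
print(np.round(q2, 3), n_lv)
```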
The number of components can also be selected based on the smallest root mean
squared error of prediction in cross-validation ($\mathrm{RMSEP}_{CV}$). This metric is an estimate of
the standard deviation of the prediction error for the data left out in the cross-validation
procedure and is computed for each variable. It is possible to select the number of components based on
the minimum $\mathrm{RMSEP}_{CV}$ obtained. When more than one Y variable is present, a
compromise must be made since the minimum $\mathrm{RMSEP}_{CV}$ might not be obtained for all
variables with the same number of components.
$$\mathrm{RMSEP}_{CV,j}(a) = \sqrt{\frac{1}{N}\sum_{i=1}^{I}\left(y_{i,j} - \hat{y}^{\,pred}_{i,j}(a)\right)^2} \tag{2.16}$$
In equation 2.16, j is the selected variable index, a is the number of model components
(a = 1, 2, …, A), and I and N both denote the number of observations.
Selecting too few latent variables leaves some structured information in the residuals.
However, selecting too many latent variables leads to overfitting and modeling of the noise
in the data.
Alternatively, one could use a separate validation dataset for computing predictive ability.
While adding one LV at a time, it is possible to compute the PRESS on the validation set
until the predictive ability starts to degrade due to overfitting. This approach with external
data is a better way to validate a model, but a high number of observations is required in
order to split a dataset into a training set and a validation set (typically 2/3 and 1/3).
2.5 Model interpretation tools
Aside from the model structure of PCA and PLS, which are powerful methods for process
modeling, a number of tools can be used to help interpret the models and learn from the
data. First, the score scatter plots (ti-tj) and loadings plots (pi-pj) are used to interpret the
relationships between the observations and the variables, respectively. A combination of
two or three latent variables can be simultaneously visualized through these tools using 2D
or 3D scatter plots. The use of these score plots will be illustrated later in the results
section.
The residual Q statistic is the perpendicular projection distance of an observation off the
latent variable space. It is useful for detecting outliers because it highlights observations
with a different correlation structure than that of the data used to build the PCA and PLS
models (i.e. outliers in the space orthogonal to the LV space).
$$Q_i(a) = \sum_{j=1}^{J} e_{ij}(a)^2 \tag{2.17}$$
$Q_i(a)$ is computed from equation 2.17, where $e_{ij}(a)$ is the residual of observation i and
variable j obtained with a model built using a latent variables.
$$e_{ij}(a) = x_{i,j} - \hat{x}_{i,j}(a) \tag{2.18}$$
Equations 2.17 and 2.18 define Q for the X space, but it is possible to compute this
statistic for the Y space by replacing e by the residuals of Y (i.e., elements of the F matrix)
and the x variables by y variables.
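Hotelling’s $T^2$ is the Mahalanobis distance of an observation to the center of the LV
space. It can also be used for detecting outliers in the LV space.
$$T_i^2 = \sum_{a=1}^{A}\frac{t_{ia}^2}{s_{t_a}^2} \tag{2.19}$$
Where $t_{ia}$ is the score of observation i on the ath latent variable and $s_{t_a}^2$ is the variance of the ath score.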
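The two outlier detection statistics can be illustrated on a small PCA model. The sketch below is an assumption-laden example, not the procedure used in this work: the data are synthetic, the function and variable names are hypothetical, the PCA model is obtained by SVD, and the score variances are estimated with `ddof=1`.

```python
import numpy as np

def monitoring_statistics(X, P, score_var):
    """Residual Q (equation 2.17) and Hotelling's T2 (equation 2.19) for scaled rows of X."""
    X = np.atleast_2d(X)
    T = X @ P                       # scores: projection onto the LV space
    E = X - T @ P.T                 # residuals (equation 2.18)
    Q = np.sum(E**2, axis=1)
    T2 = np.sum(T**2 / score_var, axis=1)
    return Q, T2

# Synthetic training data (assumption): 5 variables driven by 2 latent events.
rng = np.random.default_rng(4)
latent = rng.normal(size=(100, 2))
X = latent @ rng.normal(size=(2, 5)) + 0.05 * rng.normal(size=(100, 5))
X = X - X.mean(axis=0)

# PCA model with A = 2 components obtained by SVD.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:2].T
score_var = (X @ P).var(axis=0, ddof=1)

Q_train, T2_train = monitoring_statistics(X, P, score_var)

# An observation lying off the model plane: large Q, negligible T2.
x_off = 5.0 * Vt[4]
Q_off, T2_off = monitoring_statistics(x_off, P, score_var)
```

In this example the artificial observation `x_off` breaks the correlation structure of the training data, so it is flagged by Q but not by $T^2$.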
An additional tool which can help identify important variables in a PLS model is the
variable importance in projection (VIP) which is an indication of the importance of a
variable in predicting the Y variables (Eriksson et al. 2001):
$$\mathrm{VIP}_j = \sqrt{\frac{J\sum_{a=1}^{A} w_{ja}^2\,\mathrm{SS}_Y(a)}{\mathrm{SS}_Y(A)}} \tag{2.20}$$
Where wja is the weight of the jth variable (from X) in the ath PLS latent variable, SSY(a) is
the sum of squares of Y explained by the ath LV of the PLS model and SSY(A) is the total
sum of squares of Y explained by the model. Those variables having a VIP value greater
than 1 are considered to be the most influential in the model (Eriksson et al. 2001).
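A short numerical sketch of the VIP calculation is given below. It assumes that the weight vectors have unit length, that the score vectors are mutually orthogonal, and that the sum of squares of Y explained by the ath LV can be computed as $(\mathbf{t}_a^T\mathbf{t}_a)(\mathbf{q}_a^T\mathbf{q}_a)$. The function name and the small matrices are illustrative only and are not results from this work.

```python
import numpy as np

def vip_scores(W, T, Q):
    """VIP of each X variable from weights W (JxA), scores T (IxA) and Y loadings Q (HxA)."""
    J = W.shape[0]
    # Sum of squares of Y explained by each LV (assumes orthogonal score vectors).
    ss_y = np.sum(T**2, axis=0) * np.sum(Q**2, axis=0)
    return np.sqrt(J * (W**2 @ ss_y) / ss_y.sum())

# Illustrative numbers only: J = 3 variables, A = 2 LVs, H = 1 response.
W = np.array([[0.8, 0.0], [0.6, 0.0], [0.0, 1.0]])     # unit-length columns
T = np.array([[2.0, 0.0], [-1.0, 0.5], [-1.0, -0.5]])  # orthogonal columns
Q = np.array([[1.0, 0.4]])

vip = vip_scores(W, T, Q)
print(np.round(vip, 3))
```

With unit-length weight vectors, the squared VIP values sum to J, which explains why a value of 1 is a natural threshold between more and less influential variables.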
Finally, another useful interpretation tool is the contribution plot. It essentially consists of
the difference in the values of a particular variable between two observations (or averaged
over some clustered observations) weighted by the importance of that variable in the
model given by the PLS model weights (w*). It indicates which combination of variables
contributes the most to a deviation in the score space (T) of a latent variable model. It
does not generally reflect a cause and effect relationship, but it is a good indicator of
possible root causes. The calculation of the contributions is explained in (Westerhuis et al.
2000; Kourti 2005). The contribution of variable j, to the shift between two observations (k1
and k2) is computed using the expression below.
$$C_j = \sum_{a=1}^{A}\left[\frac{\left(t_{a,k_1} - t_{a,k_2}\right)}{s_{t_a}^2}\, w^*_{ja}\right]\times\left(x_{jk_1} - x_{jk_2}\right) \tag{2.21}$$
Where $x_{jk_1}$ and $x_{jk_2}$ are the values of the jth variable at time (or observation number) $k_1$ and
$k_2$, $w^*_{ja}$ is the weight associated with the jth variable in the ath latent variable and $s_{t_a}^2$ is the
variance of the ath score. Dividing by the score variance gives equal importance to
deviations in each LV. For the contribution from one group of observations to another group, the
difference in the mean value of the observations in each group for each variable is used.
Chapter 3 Image texture analysis
3.1 Machine vision
The use of digital imaging sensors as data acquisition devices is now widespread in very
diverse areas such as for laboratory applications, for medical imaging or for industrial
process control. Machine vision sensors are now used to measure and collect data in the
same way as flowmeters, thermocouples and pH probes for instance.
Machine vision sensors are typically developed according to the general framework
presented in Figure 14 involving four successive stages: 1) image acquisition, 2) image
pre-treatment (when necessary), 3) extraction of image features (spectral, textural or both)
and 4) analysis of these features (Liu 2005; Duchesne 2010).
Figure 14 – Schematic of the machine vision approach (Liu 2005; Duchesne 2010)
The acquisition of a digital image is usually accomplished using a camera equipped with a
CCD sensor (i.e. charge-coupled device) which converts the photon intensity of the
captured light to a digital signal. However, any other type of system capturing a digital
image can be used. Various pre-treatments can be applied to the raw image in order to
filter noise or to remove irrelevant sources of variations such as non-uniform illumination or
instrumental variations (e.g. pixel-to-pixel variations of a CCD sensor). Gonzalez and
Woods (Gonzalez & Woods 2008) and Sonka et al. (Sonka et al. 2008) describe several
traditional techniques used for image pre-treatment.
Multivariate image analysis techniques were recently reviewed by Prats-Montalbán et al.
(Prats-Montalbán et al. 2011) and Duchesne et al. (Duchesne et al. 2012). The methods
are essentially classified according to the nature of the features they are extracting from
images: spectral features only, textural features only, or a combination of both. First,
multivariate image analysis (MIA) is used for extracting spectral features from a
multivariate (i.e. multi-channel) image such as RGB color images or hyperspectral images.
On the other hand, texture analysis methods are used when the spatial relationship
between the pixels is important. A combination of both approaches (i.e. spectral and
texture) can be used if both types of information are present in the image (Liu &
MacGregor 2007).
In this thesis, the most useful information to be extracted from anode paste images is
related to their texture. Thus, only image texture analysis methods are reviewed in this
chapter. For a broader perspective on multivariate imaging techniques, the interested
reader is referred to the following review papers (Prats-Montalbán et al. 2011; Duchesne
et al. 2012). Among the several methods available for texture analysis, the following two
were used in this work: wavelet texture analysis (WTA) and the gray level co-occurrence
matrix (GLCM). Both are considered state-of-the-art methods in the machine
vision field.
3.2 Digital image
Digital images are stored as data matrices or arrays depending on whether the image is
gray-level or multi-channel (e.g. RGB or hyperspectral). A gray-level image is a two-
dimensional function f(x,y) where x and y are the spatial coordinates within the image and f
is the light intensity recorded at each (x,y) position in the image. The (x,y) positions
correspond to a discretization of the image scene into so-called pixels. The number of
pixels in both spatial directions I × J defines the image size and its spatial resolution.
Usually, the gray levels of each pixel are represented by integer values (spectral
discretization) ranging between 0 and 255 for an 8-bit system (i.e. $2^8$ = 256 possible
integers). A simple gray level image is a matrix of size I × J. A color
image is a three-dimensional array of size I × J × K, where each of the K slices along the third
dimension corresponds to the gray levels in a particular channel (color or wavelength). These colors are red, green
and blue in the case of a traditional color image (i.e. also called RGB images). The number
of channels K in multi- and hyperspectral images typically ranges from tens to hundreds
(Grahn & Geladi 2007).
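The array layout described above can be illustrated with NumPy. This is only a sketch: the image size and pixel values are arbitrary, and the (row, column, channel) storage order is an assumption that matches common imaging libraries.

```python
import numpy as np

I, J = 4, 6                                   # image size in pixels (rows, columns)
gray = np.zeros((I, J), dtype=np.uint8)       # gray-level image: I x J matrix
gray[1, 2] = 255                              # brightest 8-bit value at one pixel

rgb = np.zeros((I, J, 3), dtype=np.uint8)     # color image: I x J x K array, K = 3
rgb[..., 0] = 200                             # fill the red channel only

levels = 2 ** 8                               # number of gray levels with 8 bits
print(gray.shape, rgb.shape, levels)
```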
3.3 Image texture analysis
There are several definitions for image texture. For example, the texture of an image is
defined as the spatial variations of pixel intensities (Bharati et al. 2004). It is also a
representation of the structure of an image that is a regular repetition of an element or a
pattern on a surface (Srinivasan & Shobha 2008). In any case, image texture involves
extracting the relationships between the pixel intensities within a certain neighborhood in
the image. The image texture methods developed in the past fall into four categories (Liu
2005; Duchesne et al. 2012) and differ mainly by the way they quantify the spatial
relationships between the pixels: 1) geometric, 2) model based, 3) statistical and 4)
transform based methods.
Geometric methods (or structural approaches) are described in detail in (Gonzalez &
Woods 2008). These methods are best suited to describe well defined geometric shapes
or regular textural patterns. These methods are not appropriate for anode paste images
because their textural patterns are not regular (they depend on the irregular shapes of the coke
particles and the amount of pitch in the formulation).
Model based methods make use of parametric models to extract textural information from
the images. For example, the Markov random field (MRF) and fractal models (Cross & Jain
1983; Chen et al. 1993) are the most widely used models. However, these methods require high
computing power (Materka et al. 1998). This is not an issue for off-line applications, but
could still be a problem for on-line real-time image analysis applications.
In statistical methods, the image texture is represented by stochastic characteristics that
are calculated from the distribution of gray levels in the image. Simple first order statistics
computed from the intensity distribution (i.e. the average, variance, skewness and kurtosis
of the intensities across the image) describe intensity variations within the image but do
not account for the spatial relationships between the pixels. Second order statistics are
more appropriate for extracting relationships between pixels because they use the joint
distributions of intensities of pairs of pixels located within a certain neighborhood. Hence,
they are more efficient for describing image textural patterns because they maintain the
spatial relationship between the pixels (Gonzalez & Woods 2008; Srinivasan & Shobha
2008). The most widely used statistical method is the gray level co-occurrence matrix
(GLCM) (Haralick et al. 1973). The GLCM is a matrix whose elements contain the
probability of occurrence of the gray levels of pairs of pixels located at a given distance and angle from each
other. A number of scalar features (or textural descriptors) calculated from this matrix are
typically used to characterize the texture of an image. Similar methods such as the gray
level pixel-run matrix (GLPRM) (Galloway 1975) and the neighbouring gray level
dependence matrix (NGLDM) (Sun & Wee 1983) have been proposed but are less used in
the literature.
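The limitation of first order statistics noted above can be checked numerically. The following sketch uses a made-up striped image and a random rearrangement of its pixels; the function name `first_order_stats` is an assumption. Both images share exactly the same intensity distribution although their textures are very different.

```python
import numpy as np

def first_order_stats(img):
    """Mean, variance, skewness and kurtosis of the pixel intensities."""
    v = img.astype(float).ravel()
    mu, sigma = v.mean(), v.std()
    z = (v - mu) / sigma
    return mu, sigma ** 2, (z ** 3).mean(), (z ** 4).mean()

rng = np.random.default_rng(0)
stripes = np.tile(np.array([0, 0, 255, 255]), (8, 2))   # 8 x 8 image with vertical stripes
shuffled = rng.permutation(stripes.ravel()).reshape(stripes.shape)

print(first_order_stats(stripes))
print(first_order_stats(shuffled))

# A crude spatial measure: number of gray level changes between horizontal neighbors
transitions_stripes = np.count_nonzero(np.diff(stripes, axis=1))
transitions_shuffled = np.count_nonzero(np.diff(shuffled, axis=1))
print(transitions_stripes, transitions_shuffled)
```

The four statistics are identical for the two images, whereas a measure based on pairs of neighboring pixels is free to differ, which is the motivation for second order statistics.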
The last class of texture methods are transform based approaches. These methods, often
referred to as multi-resolution texture analysis, perform frequency or spatial-frequency
decomposition of the image 2-D signals. It has been shown that the power of the signals in
different frequency bands and in different areas within an image can be related with
textural patterns (Livens et al. 1997; Van de Wouwer et al. 1999; Bharati et al. 2004).
Several different methods using different types of transforms exist (e.g. Fourier and
Wavelet transform). The wavelet transform has been preferred to the Fourier transform for
image texture analysis because it retains the spatial and frequency information instead of
frequency only (i.e. Fourier). Indeed, the spatial information is lost when the Fourier
transform is used (Bharati et al. 2004; Liu & Han 2011).
Both the GLCM (statistical) and the Wavelet Texture Analysis (transform-based)
approaches were used in this thesis for the analysis of anode paste images. Therefore,
these two methods are described in greater detail in the rest of this chapter.
3.3.1 Gray level co-occurrence matrix (GLCM)
The GLCM method was proposed by Haralick et al. (Haralick et al. 1973). The GLCM of an
image is an estimate of the joint probability distribution (P(x,y)) of the intensities of two
pixels (x,y) separated by a distance (L) and an angle (θ) (Bharati et al. 2004). P(x,y) is a
square matrix whose dimensions correspond to the number of gray levels in the image
(e.g. typically 256 intensity values for an 8-bit image) or to that used for the analysis
because the number of gray levels can be reduced by binning. Figure 15 shows an
example of 4 GLCM matrices for a simple image matrix with 4 gray levels.
Figure 15 – Examples of GLCM matrices (Tessier et al. 2008)
In this example published by Tessier et al. (Tessier et al. 2008), the image I contains pixels
with gray level values ranging from 1 to 4. This means that the size of the GLCM matrices
M1, ..., M4 is 4×4. For all the pixels at a given distance L and angle θ from each other
across the image, the number of occurrences (i.e. probability) of the different combinations
of gray levels taken by the pairs of pixels are stored in the GLCM (M). For L = 1 and θ =
90° (i.e. pixels on the same row or horizontal) there are 4 occurrences of pixels with
intensities x=1 and y=1 (yellow rectangles), and so P(1,1) = 4, and 1 occurrence with
pixels of intensities x=3 and y=4 (green rectangle). Thus P(3,4) = 1. These calculations
can be repeated for different values of L and θ to obtain a set of GLCMs (i.e. M2, ..., M4),
which yields a multi-resolution description of the image texture. Small distances L
extract finer textural patterns whereas longer pixel-to-pixel distances focus on
coarser textures. The GLCMs obtained from different angles can be used to assess the
level of anisotropy of the textural patterns (i.e. preferential orientations).
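The counting procedure described above can be sketched as follows. The function `glcm` and the 4 × 4 test image are illustrative assumptions (the image is not the one shown in Figure 15), gray levels are numbered from 0 instead of 1, and the pixel offset is given explicitly as (rows, columns) so that no particular angle convention is implied.

```python
import numpy as np

def glcm(img, d_row, d_col, levels):
    """Count co-occurrences of gray levels for pixel pairs offset by (d_row, d_col).

    img holds integer gray levels in 0 .. levels-1. Entry [x, y] counts the pairs whose
    first pixel has level x and whose second pixel, at the given offset, has level y.
    """
    img = np.asarray(img)
    rows, cols = img.shape
    # Overlapping windows so that first[r, c] and second[r, c] form one pixel pair
    r0, r1 = max(0, -d_row), min(rows, rows - d_row)
    c0, c1 = max(0, -d_col), min(cols, cols - d_col)
    first = img[r0:r1, c0:c1]
    second = img[r0 + d_row:r1 + d_row, c0 + d_col:c1 + d_col]
    counts = np.zeros((levels, levels), dtype=int)
    np.add.at(counts, (first.ravel(), second.ravel()), 1)
    return counts

image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 2, 2, 2],
                  [2, 2, 3, 3]])

counts = glcm(image, d_row=0, d_col=1, levels=4)   # neighbor on the same row, L = 1
P = counts / counts.sum()                          # normalize counts to probabilities
print(counts)
```

Calling `glcm` with other offsets, for example `(1, 0)` or `(0, 2)`, gives the matrices for other directions and distances.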
To compare the texture of different images quantitatively, it is necessary to summarize the
information contained in the set of GLCMs into a row vector containing different scalar
features computed from the various matrices. Haralick et al. (Haralick et al. 1973)
proposed 14 different textural features: angular second moment, contrast, correlation,
variance, inverse difference moment, sum average, sum variance, sum entropy, entropy,
difference variance, difference entropy, two measures of correlation (i.e. f12 and f13) and
the maximal correlation coefficient (i.e. f14). However, only a subset of 4-5 features is
commonly used (Soh & Tsatsoulis 1999; Van de Wouwer et al. 1999; Clausi 2002;
Maillard 2003; Bharati et al. 2004). Maillard (Maillard 2003) also compared the choice of
the textural features for several articles reporting the use of GLCM. The five most used
features are angular second moment (ASM), entropy, contrast, correlation and
homogeneity.
The angular second moment (equation 3.1), also called energy, is the sum of squares of all
the GLCM elements. Sometimes, the sum of GLCM elements is used instead of the
squares. It is a measure of relative homogeneity when it is used to compare different
images. A more homogeneous image contains fewer transitions of gray level intensities.
As a result, the P(x,y) values around the diagonal of its GLCM are of high magnitude
and the values elsewhere (off-diagonal elements) are much lower. The GLCM of a less
homogeneous image is characterized by lower P(x,y) values spread across the various
GLCM elements. Thus, the ASM of a homogeneous image is higher than that of
a non-homogeneous image. In equations 3.1 to 3.9, $n_g$ is the number of gray levels
used to compute the GLCM.
$$\mathrm{ASM} = \sum_{x=1}^{n_g} \sum_{y=1}^{n_g} \left\{ P(x,y) \right\}^{2} \tag{3.1}$$
Entropy (equation 3.2) is also a measure of homogeneity, but it behaves in the opposite way to the ASM:
when the homogeneity increases, the entropy decreases since high values of P(x,y)
correspond to log{P(x,y)} values of low magnitude.
$$\mathrm{Ent} = -\sum_{x=1}^{n_g} \sum_{y=1}^{n_g} P(x,y) \, \log\left\{ P(x,y) \right\} \tag{3.2}$$
Contrast (equation 3.3) is a measure of the local variations in the image. In equation 3.3,
the probability values are weighted by the squared difference in pixel intensities. The value
for this feature will be higher for images containing sharp transitions (i.e. large difference in
intensity from pixel to pixel).
$$\mathrm{Cont} = \sum_{x=1}^{n_g} \sum_{y=1}^{n_g} (x - y)^{2} \, P(x,y) \tag{3.3}$$
Homogeneity (equation 3.4), or the inverse difference moment, is also a measure of the importance
of local variations in intensity. The elements of the GLCM that are far away from the
diagonal (i.e. large difference between x and y) are penalized by the weighting
denominator. Sometimes, the square of the difference between x and y is used instead of
the absolute value. The behavior of this feature is opposite to that of the contrast.
$$\mathrm{Hom} = \sum_{x=1}^{n_g} \sum_{y=1}^{n_g} \frac{P(x,y)}{1 + \left| x - y \right|} \tag{3.4}$$
Finally, the correlation (equation 3.5) is a measure of the structure within the image.
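$$\mathrm{Corr} = \frac{\sum_{x=1}^{n_g} \sum_{y=1}^{n_g} (x \times y) \, P(x,y) - \mu_x \mu_y}{\sigma_x \sigma_y} \tag{3.5}$$

where the variances and means of the marginal distributions are given by equations 3.6 to 3.9:

$$\sigma_x^{2} = \sum_{x=1}^{n_g} (x - \mu_x)^{2} \sum_{y=1}^{n_g} P(x,y) \tag{3.6}$$

$$\sigma_y^{2} = \sum_{y=1}^{n_g} (y - \mu_y)^{2} \sum_{x=1}^{n_g} P(x,y) \tag{3.7}$$

$$\mu_x = \sum_{x=1}^{n_g} x \sum_{y=1}^{n_g} P(x,y) \tag{3.8}$$

$$\mu_y = \sum_{y=1}^{n_g} y \sum_{x=1}^{n_g} P(x,y) \tag{3.9}$$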
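The five features of equations 3.1 to 3.5 can be sketched in NumPy as shown below. The function name `glcm_features` and the 2 × 2 test matrix are illustrative assumptions. The natural logarithm is assumed for the entropy because the base is not specified here, and empty GLCM cells are skipped so that 0 log 0 is treated as 0. The correlation is undefined for a constant image, for which the standard deviations are zero.

```python
import numpy as np

def glcm_features(P):
    """ASM, entropy, contrast, homogeneity and correlation of a normalized GLCM."""
    P = np.asarray(P, dtype=float)
    n_g = P.shape[0]
    levels = np.arange(1, n_g + 1)
    x, y = np.meshgrid(levels, levels, indexing="ij")   # gray levels of each cell
    nz = P[P > 0]                                       # skip empty cells in the entropy
    mu_x, mu_y = np.sum(x * P), np.sum(y * P)
    sigma_x = np.sqrt(np.sum((x - mu_x) ** 2 * P))
    sigma_y = np.sqrt(np.sum((y - mu_y) ** 2 * P))
    return {
        "asm": np.sum(P ** 2),
        "entropy": -np.sum(nz * np.log(nz)),
        "contrast": np.sum((x - y) ** 2 * P),
        "homogeneity": np.sum(P / (1 + np.abs(x - y))),
        "correlation": (np.sum(x * y * P) - mu_x * mu_y) / (sigma_x * sigma_y),
    }

# Made-up normalized GLCM with two gray levels, dominated by its diagonal
P = np.array([[0.4, 0.1],
              [0.1, 0.4]])
features = glcm_features(P)
print(features)
```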
As an example, two images (Figure 16) of two different stone surfaces
(http://www.highresolutiontextures.com) are used to illustrate the behavior and
interpretation of the five GLCM features presented in equations 3.1 to 3.5.
Figure 16 – Two stone surfaces (A and B) with different textures used for the GLCM features example
(http://www.highresolutiontextures.com)
Figure 16 contains two stone surface images. Image A has a more uniform surface with
some gray tone differences that vary slowly from the left to the right of the image and
some darker lines. Image B is less homogeneous, with more high frequency
(i.e. small details) and structured variations of black and white dots.
A GLCM was computed for each image using a distance L = 1 and angle θ = 90°. The
scalar textural features calculated using the GLCM of each image are presented in Table
3. The number of gray level values used was ng=256.
Table 3 – GLCM features of the images in Figure 16

| Image | ASM | Entropy | Homogeneity | Contrast | Correlation |
|---|---|---|---|---|---|
| A | 0.0056 | 5.66 | 0.379 | 30.36 | 0.939 |
| B | 0.0008 | 7.55 | 0.317 | 44.93 | 0.980 |

The ASM of image A is higher than that of image B since it is more homogeneous. The
entropy has the opposite behaviour since it is lower for more homogeneous
images. Homogeneity and contrast are both related to the magnitude of the gray level
differences between adjacent pixels (L = 1 in this case). Since image B has smaller
patterns, and thus a high contrast at low L values, its contrast is higher and its homogeneity lower
than those of image A. Finally, image B contains regular patterns, hence its correlation feature is
high compared to that of image A.
To obtain a multi-resolution analysis of an image, it is possible to compute multiple GLCM
matrices for different distances L. The GLCM features will describe the texture at different
scales in the image using this approach. Furthermore, if the orientation of the texture is
important, one could compute the GLCM for different angles θ. Usually, however, the average of
the GLCM features for all angles at each distance L is used to compare the images. This
averaging greatly reduces the number of features and simplifies the classification of the
images.
Van de Wouwer et al. (Van de Wouwer et al. 1999) and Clausi (Clausi 2002) both
discussed the issue of redundancy in the features (i.e. some of the features are
redundant). They have shown that some features explained the same information in the
GLCM and therefore it was not necessary to compute all the features, but only one for
each set of redundant features (i.e. choose a set of independent features). For example, in
the set of five features described above, the ASM and entropy are redundant, as are
the contrast and homogeneity, so only one feature of each pair is necessary to characterize the image
texture.
3.3.2 Wavelet texture analysis (WTA)
The other state of the art method for image texture analysis is based on the wavelet
transform. It was originally developed for signal processing, denoising and compression
(Rioul & Vetterli 1991; Usevitch 2001), but was also used for feature extraction and image
classification (Mallat 1989; Livens et al. 1997; Liu & Han 2011). This section presents an
overview of the wavelet transform and its application to image texture analysis. However,
several tutorials and books covering the history, theoretical aspects, and a broader range
of applications of the wavelets are available for interested readers (Antonini et al. 1992;
Chui 1992; Prasad & Iyengar 1997; Sarkar et al. 1998; Stark 2005; Debnath & Shah
2015).
An image is a two dimensional signal representing the variations of light intensities within a
scene (2-D space). These signals typically contain multiple frequencies. Large objects or
coarse textural patterns in the image generate low frequency variations whereas smaller
objects or finer textural patterns appear as high frequency information. In image texture
analysis, it is often desired to extract textural features at different resolutions (i.e. sizes or
levels of scrutiny), which requires the image signals to be decomposed into their frequency
components. The Fourier transform can be used to obtain the signal frequency content but
the spatial resolution is lost because sine (or cosine) waves are infinite signals in
the spatial (or time) domain. Wavelet functions, however, have a finite length, which
enables a signal (e.g. an image) to be decomposed in both the frequency and spatial
domains. Applying the 2-dimensional wavelet decomposition to an image results in a
series of new images, each capturing the variations at a different scale or frequency band
contained within the original image.
How the wavelet transform decomposes the information content of an image (i.e. a 2-
dimensional discrete signal) will be explained progressively. The application of the
continuous wavelet transform to a 1-dimensional signal will be presented first. The discrete
representation of a wavelet and its relationship to the traditional linear filters used in signal
processing will then be introduced. Finally, the application of the discrete wavelet
transform to a 2-dimensional signal (i.e. an image) will be described.
Wavelets are waveforms of finite length. Several types of wavelets having different shapes
exist (e.g. Haar, Daubechies, Symlet and Mexican Hat). The wavelet is
usually chosen by selecting the wavelet shape that best matches the analyzed signal.
However, this choice of best wavelet is not unique and more than one type of wavelet can
give similar results. This will be discussed in section 6.3.3.
To perform the wavelet decomposition of a 1-D signal, a series of wavelet bases $\psi_{a,b}(x)$
is generated from the mathematical representation of a mother wavelet $\psi(x)$ using
equation 3.10. In this equation, a and b are the parameters used to scale and shift the mother
wavelet. The scaling coefficient a compresses or stretches the wavelet to capture different
frequencies and the shift coefficient b moves the wavelet along the signal to capture the
time or space variations.
$$\psi_{a,b}(x) = \frac{1}{\sqrt{a}} \, \psi\!\left( \frac{x - b}{a} \right) \tag{3.10}$$
The continuous wavelet transform (CWT) is based on the convolution (equation 3.11) of
the scaled and shifted wavelet bases $\psi_{a,b}(x)$ with the signal f(x) to obtain a series of detail
coefficients $d_{a,b}$. These detail coefficients measure the similarity between the
signal and the wavelet base at a particular location in space or time, shifted by b, and at the
frequency specified by a. Computing $d_{a,b}$ for multiple values of a and for each possible
value of b for a given signal f(x) yields a complete space/time-frequency decomposition of
the signal (Liu & MacGregor 2007; Duchesne 2010).
$$d_{a,b} = \int f(x) \, \psi_{a,b}(x) \, \mathrm{d}x \tag{3.11}$$
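Equations 3.10 and 3.11 can be sketched numerically as follows. The Mexican hat wavelet is used only as an example of a mother wavelet, the integral is approximated by a simple rectangle rule on a fine grid, and the function names, test signal and grids of a and b values are illustrative assumptions.

```python
import numpy as np

def mexican_hat(x):
    """Mexican hat mother wavelet (unnormalized second derivative of a Gaussian)."""
    return (1.0 - x ** 2) * np.exp(-x ** 2 / 2.0)

def wavelet_basis(x, a, b, mother=mexican_hat):
    """Scaled and shifted wavelet of equation 3.10."""
    return mother((x - b) / a) / np.sqrt(a)

def cwt_coefficient(f, x, a, b):
    """Detail coefficient d_ab of equation 3.11 by rectangle-rule integration."""
    dx = x[1] - x[0]
    return np.sum(f * wavelet_basis(x, a, b)) * dx

x = np.linspace(-20.0, 20.0, 4001)
f = mexican_hat((x - 3.0) / 2.0)          # test signal: a bump of scale 2 centered at x = 3

scales = [0.5, 1.0, 2.0, 4.0]
shifts = [-3.0, 0.0, 3.0]
d = np.array([[cwt_coefficient(f, x, a, b) for b in shifts] for a in scales])

best = np.unravel_index(np.argmax(np.abs(d)), d.shape)
print(scales[best[0]], shifts[best[1]])
```

The coefficient of largest magnitude is obtained for the scale and shift that match the bump in the test signal, which illustrates the interpretation of the detail coefficients as a measure of similarity between the signal and the wavelet.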
For a discrete signal, it is not necessary to compute the detail coefficients $d_{a,b}$ for every
possible value of the shift and scale coefficients. For the efficient computation of the wavelet
transform, the discrete wavelet transform (DWT) can be used. The DWT is performed on a
smaller number of discrete locations and frequencies based on a dyadic scale (i.e. powers
of 2) without significantly degrading the accuracy of the wavelet decomposition. The scale
and shift coefficients are computed using equation 3.12 where m and n are integers.
$$a = 2^{m} \quad \text{and} \quad b = 2^{m} n \tag{3.12}$$
The DWT detail coefficients are computed similarly to the CWT (equation 3.11) except that
the wavelet bases are computed only for the dyadic scale and shift coefficients.
The implementation of the DWT is performed using a set of low pass and high pass filters.
This concept was introduced by Mallat (Mallat 1989). The work of Mallat adapted the DWT
convolution integral to the signal processing field by developing a set of low pass and high
pass filters that could be applied to a signal to compute the wavelet coefficients. It has
been used in many industrial machine vision applications since then (Liu & MacGregor
2005; Liu et al. 2005; Liu & MacGregor 2006; Tessier et al. 2007; Tessier et al. 2008;
Prats-Montalbán et al. 2009; Reis & Bauer 2009; Facco et al. 2010).
Similarly to the wavelet function ψ(x), Mallat introduced a new function called the scaling
function φ(x). This function is orthogonal to the wavelet function and it is used to capture
the low frequency details, whereas the wavelet function is used to capture the high
frequency details. The scaling and wavelet functions are given by equations 3.13 and 3.14.
They correspond to the DWT decomposition for a dyadic scale $a = 2^{s}$ and shift $b = 2^{s} l$. Both
functions are expressed in terms of filters: a low pass filter h0 for the scaling function and a high pass filter
h1 for the wavelet function. h0 and h1 are orthogonal filters and they are related to each
other by equation 3.15.
$$\phi_{s,l}[k] = 2^{s/2} \, h_0^{s}\!\left( k - 2^{s} l \right) \tag{3.13}$$

$$\psi_{s,l}[k] = 2^{s/2} \, h_1^{s}\!\left( k - 2^{s} l \right) \tag{3.14}$$

$$h_1[k] = (-1)^{k} \, h_0[1 - k] \tag{3.15}$$
In these equations, s and l are the scale and shift indices and k is the discrete location in
the signal.
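The relation between the two filters in equation 3.15 can be checked numerically. The sketch below uses the Daubechies wavelet with four coefficients as an example; this choice is an assumption and not necessarily the wavelet used in this work. For a filter stored on the indices 0 to N-1, the index 1-k is shifted by the even offset N-2, which gives h1[k] = (-1)^k h0[N-1-k] and leaves the properties of the filter pair unchanged.

```python
import numpy as np

# Low pass filter h0 of the Daubechies wavelet with four coefficients (db2)
s3 = np.sqrt(3.0)
h0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

# High pass filter h1 from equation 3.15, with the index shifted by N - 2
N = len(h0)
k = np.arange(N)
h1 = (-1.0) ** k * h0[N - 1 - k]

print("sum of h0:", h0.sum())        # a low pass filter keeps the mean of the signal
print("sum of h1:", h1.sum())        # a high pass filter removes it
print("h0 . h1  :", h0 @ h1)         # the two filters are orthogonal
```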
The approximation (i.e. low frequency residual) and the detail coefficients (i.e. captured
high frequency content at each scale) are computed as the inner products of the scaling
and wavelet functions, respectively, with the signal f using equations 3.16 and 3.17.
$$a_{(s)}[l] = \left\langle f[k], \, \phi_{s,l}[k] \right\rangle \tag{3.16}$$

$$d_{(s)}[l] = \left\langle f[k], \, \psi_{s,l}[k] \right\rangle \tag{3.17}$$
In the case of the DWT, the decomposition is not done by scaling the wavelet, but by sub-
sampling the signal. The coefficients are computed sequentially at each decomposition
level s. To capture the details at different frequencies, the 1-D signal or the image (2-D
signal) is sub-sampled dyadically, that is, each dimension is reduced by a factor of
two after every decomposition level s.
When the wavelets are applied to a vector (e.g. time series of a signal or a one-
dimensional spatial vector), the decomposition yields a series of detail coefficient vectors and an
approximation vector. It captures the highest frequency details first, then the next highest,
etc., and the residual of the signal after S decompositions (i.e. low frequency) is left in the
approximation. Using the DWT, the frequency band is cut in half at each decomposition
level (Figure 17). In Figure 17 and Figure 18, fn is the normalized maximum frequency of
the signal.
Figure 17 – Frequency band divisions of the DWT of a vector (one-dimensional signal)
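The sequential decomposition of a vector described above can be sketched with the Haar wavelet, the simplest orthogonal wavelet. The wavelet, the function names and the test signal are illustrative assumptions. Each level filters the current approximation and keeps one sample out of two, so the vectors are halved at every level.

```python
import numpy as np

def haar_step(signal):
    """One DWT level with Haar filters: filter, then keep one sample out of two."""
    pairs = signal.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)   # low pass branch
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)   # high pass branch
    return approx, detail

def haar_dwt(signal, n_levels):
    """Return the detail vectors [d1, d2, ..., dS] and the final approximation aS."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(n_levels):
        approx, d = haar_step(approx)
        details.append(d)
    return details, approx

k = np.arange(64)
f = np.sin(2 * np.pi * k / 32) + 0.5 * (-1.0) ** k   # slow wave plus fastest oscillation
details, a3 = haar_dwt(f, 3)

energies = [np.sum(d ** 2) for d in details] + [np.sum(a3 ** 2)]
print([len(d) for d in details], len(a3))
print(energies, np.sum(f ** 2))
```

With this orthonormal wavelet the energies of d1, d2, d3 and a3 add up to the energy of the signal, and the fastest oscillation of the test signal ends up in d1.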
In an image (two-dimensional signal), the DWT is simply applied in each direction. This
yields a set of three detail images, each more sensitive to a specific direction (i.e.
horizontal, vertical and diagonal directions respectively), and one approximation image at
each scale. The process is illustrated in Figure 18 (Liu & MacGregor 2007).
Figure 18 a) shows the filtering and down sampling strategy used for the DWT based on
Mallat’s filtering strategy. First, the low pass and high pass filters are applied horizontally,
then the image dimensions are reduced by removing one column out of two. The same low
pass and high pass filters are then applied vertically on the results of the previous step.
The last operation is the deletion of half of the rows. The dimensions of the images (i.e.
details and approximation) at the decomposition level j correspond to a quarter of the
image at j-1 due to down sampling. The decomposition continues with the approximation
image at level j.
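[Figure 17: frequency axis from 0 to fn divided into the bands a3 (0 to fn/8), d3 (fn/8 to fn/4), d2 (fn/4 to fn/2) and d1 (fn/2 to fn) for decomposition levels 1 to 3.]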
Figure 18 – 2-D DWT decomposition a) schematics of the filter bank used at the jth
decomposition and b) frequency distribution of the detail and approximation images (Liu &
MacGregor 2007)
The DWT yields detail coefficient and approximation images that are of a different size at
each decomposition level. For visualization and computation of features, it is possible to
reconstruct each individual sub-image to the original image size using the corresponding reconstruction
filters. For certain families of wavelets, the reconstruction is perfect and no information is
lost in the transformation. However, images reconstructed from only one detail sub-image
may sometimes contain artifacts since perfect reconstruction is only valid when all detail
coefficient and approximation images are used for the reconstruction. The detail images
therefore need to be checked visually for artifacts. In this thesis, all the features were computed on
the reconstructed detail coefficient images and no obvious artifacts were observed.
Figure 18 b) shows a representation of the band-pass frequency captured by the
decomposition. The horizontal detail contains the high pass information in the horizontal
direction and the low pass frequencies in the vertical direction, and vice versa for the
vertical detail. The diagonal detail contains the high pass frequencies in both directions.
Finally, the approximation contains the low pass information in both directions. The
frequency spectrum is cut in half at each decomposition level in both directions.
Another interesting property of the DWT is that the detail coefficient images are
orthogonal to each other. This means that there is no redundancy between the frequency bands captured by
the detail images and the approximation.
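One level of the 2-D decomposition can be sketched with the Haar wavelet as shown below. The wavelet, the function name `haar_dwt2` and the test image are illustrative assumptions. The labels follow the description given above: the horizontal detail is high pass in the horizontal direction and low pass in the vertical direction, which may differ from the naming used by some software libraries.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2-D Haar DWT: returns A, Dh, Dv and Dd at half the size."""
    img = np.asarray(img, dtype=float)
    r2 = np.sqrt(2.0)
    # Filter along each row (horizontal direction) and keep one column out of two
    lo = (img[:, 0::2] + img[:, 1::2]) / r2
    hi = (img[:, 0::2] - img[:, 1::2]) / r2
    # Filter along each column (vertical direction) and keep one row out of two
    A = (lo[0::2, :] + lo[1::2, :]) / r2     # low pass in both directions
    Dv = (lo[0::2, :] - lo[1::2, :]) / r2    # high pass vertically only
    Dh = (hi[0::2, :] + hi[1::2, :]) / r2    # high pass horizontally only
    Dd = (hi[0::2, :] - hi[1::2, :]) / r2    # high pass in both directions
    return A, Dh, Dv, Dd

# 8 x 8 test image whose intensity alternates from one row to the next,
# i.e. high frequency content in the vertical direction only
img = (-1.0) ** np.arange(8)[:, None] * np.ones((1, 8))

A, Dh, Dv, Dd = haar_dwt2(img)
energy_in = np.sum(img ** 2)
energy_out = sum(np.sum(c ** 2) for c in (A, Dh, Dv, Dd))
print(A.shape, energy_in, energy_out)
```

For this test image all the energy is found in the vertical detail, and the four sub-images together conserve the energy of the original image. Further levels would be obtained by applying the same function to A.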
Figure 19 is used to illustrate the band pass frequency decomposition of the DWT. It is a
composite of 8 different images of surfaces exhibiting different visual textures. The Roman
numerals correspond to a class of texture and the Arabic numerals to the individual sub-
images. The texture i (sub-image 1) is a smooth surface with only low frequency variations.
The texture ii (6-8) is finer (i.e. higher frequency). In iii (3), the texture is coarser than ii but
still very homogeneous. The texture iv (4) is a mix of fine and coarse texture and it
contains high contrast. The pavement in v (5) contains high frequency information in the
tiles and low frequency information (i.e. the mortar); these two features should be captured by
different wavelet scales. Finally, the texture in vi (2) is highly oriented and contains low
frequency details in the horizontal direction and high frequency information in the vertical
direction.
Figure 19 – Composite image of different textures (http://www.highresolutiontextures.com)
Figure 20 presents the results of the DWT decomposition of the image in Figure 19.
Different parts (i.e. sub-images) of the original image at different decomposition levels are
used as examples of the properties of the DWT.
Figure 20 a) shows the reconstructed approximation (A3) image after 3 decomposition
levels. It is blurry since the high frequency details have been removed from the image.
Figure 20 b) contains the reconstructed detail images at scale 1 (D1) (i.e. high frequency)
for the sub-images 1 and 7. Sub-image 1 is very smooth and its detail coefficient contains
almost no information at that scale. However, the detail coefficient image of the sub-image
(7) (i.e. much finer texture) contains information in this frequency band.
Figure 20 c) shows the reconstructed detail coefficient images of sub-image 5 at two
different scales. D1 is the high frequency band and captures the fine details inside the
pavement tiles. D3 captures the mortar lines between the tiles. The mortar around the tiles
is a lower frequency detail in comparison with the fine texture inside the tiles and it is
captured at a coarser scale (i.e. a higher decomposition level).
Figure 20 d) shows the reconstructed detail coefficient images separately for the
horizontal and vertical directions for sub-image 2. That image contains an oriented texture,
which is captured by the vertical detail $D_1^v$ but not by the horizontal detail $D_1^h$.
Figure 20 – Approximation and detail coefficients of the composite texture image: a)
approximation at scale 3, b) comparison of the reconstructed detail at scale 1 for sub-
images 1 and 7, c) comparison of the reconstructed detail at scale 1 and 3 of sub-image 5
and d) comparison of the direction sensitivity for the reconstructed detail at scale 1 of sub-
image 2.
To compare the texture of a set of images it is common to compute scalar textural
descriptors (or features) from the detail and approximation sub-images, and then use
multivariate clustering and classification techniques to compare the images in a
quantitative manner. The features are usually computed from the detail sub-images at
each scale and direction ($D_s^k$) but they can also be computed from the approximation sub-
images at each scale ($A_s$) (Facco et al. 2010). In this case however, the approximations at
each scale contain redundant information from the previous approximations since the
approximation at scale s+1 contains part of the frequency distribution of scale s. Finally, the
features can be computed either on the sub-sampled (i.e. not reconstructed) or on the
reconstructed detail coefficients or approximation.
The most frequently used scalar feature is the energy (Scheunders et al. 1997; Liu &
MacGregor 2005; Liu et al. 2005; Tessier et al. 2007; Selvan & Ramakrishnan 2007; Prats-
Montalbán et al. 2009; Liu & Han 2011). Facco et al. (Facco et al. 2010) also proposed the
use of 4 additional features for texture analysis: the entropy, standard deviation, skewness
and kurtosis. The five scalar features are presented in equations 3.18-3.22. In image
texture analysis they are typically computed using the coefficients of the detail sub-images
($D_s^k$) obtained at each scale and direction or those of the approximation sub-images.
Alternatively, the scalar textural descriptors can be calculated on the reconstructed
versions of the sub-images $R_s^k$ at each scale and direction. One advantage of using the
reconstructed sub-images is that their size is the same as that of the original image. It
was also proposed by some authors to apply the GLCM to the detail or approximation
images, then compute the GLCM features presented in section 3.3.1 (Van de Wouwer et
al. 1999; Liu & Han 2011; Yousefian-Jazi et al. 2014) and use these to quantify image
texture.
The energy (equation 3.18) is the sum of squares of the detail coefficients in a given sub-
image, normalized by the number of pixels. Since the DWT decomposition conserves the energy of the original image (i.e. the
sum of the energy of all detail images and the approximation is equal to the energy of the
original image), the energy is a measure of how much information is contained at each
scale (i.e. frequency band) and direction. It is also a measure of homogeneity similar to the
energy of the GLCM (equation 3.1).
$$E_s^k = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} R_s^k(i,j)^{2}}{I \times J} \tag{3.18}$$
The entropy (equation 3.19) and the standard deviation (equation 3.20) are also
measures of homogeneity. The entropy is computed from the probability density
function $P_s^k$ (i.e. histogram) of the reconstructed detail coefficient sub-images $R_s^k$.
$$\mathrm{Ent}_s^k = -\sum_{i=1}^{I} \sum_{j=1}^{J} P_s^k(l) \, \log\left( P_s^k(l) \right) \tag{3.19}$$
$$\sigma_s^k = \sqrt{\frac{\sum_{i=1}^{I} \sum_{j=1}^{J} \left( R_s^k(i,j) - \mu \right)^{2}}{I \times J}} \tag{3.20}$$

In equation 3.20, µ is the mean of $R_s^k$.
The skewness (equation 3.21) is a measure of the lopsidedness of a distribution. The
skewness increases when values of the distribution of coefficients are skewed to one side
of the mean.
$$\mathrm{Skew}_s^k = \frac{\dfrac{1}{I \times J} \sum_{i=1}^{I} \sum_{j=1}^{J} \left( R_s^k(i,j) - \mu \right)^{3}}{\left( \sigma_s^k \right)^{3}} \tag{3.21}$$
The last feature is the kurtosis (equation 3.22). It is a measure of the broadness of the
wavelet coefficients distributions.
$$\mathrm{Kurt}_s^k = \frac{\dfrac{1}{I \times J} \sum_{i=1}^{I} \sum_{j=1}^{J} \left( R_s^k(i,j) - \mu \right)^{4}}{\left( \sigma_s^k \right)^{4}} \tag{3.22}$$
There is no discussion in the literature about the redundancy of these WTA features, but
the energy, standard deviation and entropy are correlated characteristics. Using the same
reasoning as with the GLCM features (Van de Wouwer et al. 1999; Clausi 2002), only one
of these three features together with the skewness and the kurtosis should be enough to
classify the image texture.
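The five features of equations 3.18 to 3.22 can be sketched for a single sub-image as follows. The function name `wta_features`, the number of histogram bins and the test array are illustrative assumptions. The entropy is computed here by summing over the bins of the histogram with the natural logarithm, which is one possible reading of equation 3.19.

```python
import numpy as np

def wta_features(R, n_bins=32):
    """Energy, entropy, standard deviation, skewness and kurtosis of a sub-image R."""
    R = np.asarray(R, dtype=float)
    mu = R.mean()
    sigma = np.sqrt(np.mean((R - mu) ** 2))
    counts, _ = np.histogram(R, bins=n_bins)     # histogram of the coefficients
    p = counts[counts > 0] / counts.sum()        # skip empty bins: 0 log 0 -> 0
    return {
        "energy": np.mean(R ** 2),
        "entropy": -np.sum(p * np.log(p)),
        "std": sigma,
        "skewness": np.mean((R - mu) ** 3) / sigma ** 3,
        "kurtosis": np.mean((R - mu) ** 4) / sigma ** 4,
    }

# Made-up 2 x 2 "sub-image" with a symmetric distribution of coefficients
R = np.array([[-1.0, 1.0],
              [1.0, -1.0]])
feats = wta_features(R)
print(feats)
```

In practice the function would be applied to each detail sub-image at each scale and direction, and the resulting values collected in one feature vector per image.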
Chapter 4 Experimental
This chapter introduces the experimental procedures, raw materials and imaging setup
used for the development of the machine vision algorithm. First, the software used to
analyze the images is presented. This is followed by a description of the laboratory paste
and anode manufacturing procedures and raw material properties. Then, the industrial
sampling strategy is explained. Finally, the imaging system and image analysis methods
are presented in detail.
4.1 List of software
Several software packages were used to obtain the results presented in this thesis. First,
all the computations involved in image processing and texture analysis were performed in
Matlab™ version 7.13 (2011b) from The MathWorks. The Wavelet and Image Processing
toolboxes are required to perform the calculations. A third-party toolbox developed for
Matlab, the PLS Toolbox™ version 7.3 from Eigenvector Research, was also used to
compute most of the latent variable models presented in the thesis. The multi-block
algorithm was also developed within the Matlab environment. Finally, the ProMV™
software version 13.08 r1685 from ProSensus was used to benchmark the developed PLS
and MB-PLS algorithms programmed in Matlab as well as for exploratory analysis of the
datasets because of its user-friendly interface.
4.2 Laboratory anode fabrication
Two series of experiments were conducted using different sets of raw materials to
fabricate laboratory anodes. An industrial raw material formulation was used for the first
set. This decision was made to obtain lab anode paste samples that were the most
representative of the real industrial paste. To achieve this, already classified raw materials
from the three industrial coke fractions and butts were collected from the Alcoa
Deschambault Quebec smelter (ADQ). The second set of experiments was conducted
using commercially available cokes (no butts). The crushing, grinding and sizing were
performed in the laboratory to obtain a typical aggregate size distribution for laboratory
fabricated anodes (Azari et al. 2013).
4.2.1 Industrial raw material formulation
At the Alcoa Deschambault smelter’s carbon plant, a mix of two cokes is typically used to
formulate the anode paste. The day the material was sampled, the ratio of the two cokes
was 65% coke 1 and 35% coke 2. The raw material properties of the cokes are listed in
Table 4. In this table, the vibrated bulk density (VBD) is presented for two particle size
ranges. The VBD was measured using the ASTM D4292 standard test method. Finally, the
impurities were measured using X-ray fluorescence.
Table 4 – Properties of the industrial coke used for the laboratory paste manufacturing
Property                            Coke 1    Coke 2
VBD (-8/+14 US mesh) (g/cm³)        0.80      0.91
VBD (-30/+50 US mesh) (g/cm³)       0.86      0.95
Real density (g/cm³)                2.065     2.071
Fe (ppm)                            0.033     0.014
Si (ppm)                            0.016     0.003
S (%)                               2.91      1.43
V (ppm)                             0.029     0.019
Na (ppm)                            0.01      0.006
Ca (ppm)                            0.01      0.009
Ni (ppm)                            0.01      0.012
+4 (US mesh) (%)                    22.1      27.4
-4/+14 (US mesh) (%)                58.5      58.2
-14/+30 (US mesh) (%)               77.8      74.5
-30/+50 (US mesh) (%)               92.2      86.9
-50/+100 (US mesh) (%)              96.9      93.5
-100/+200 (US mesh) (%)             98.9      97
-200 (US mesh) (%)                  99.9      100
Several paste samples were fabricated using this material blend, but these paste samples
were not pressed into anodes. The typical particle size distributions of the blended and
classified coke and butts particles are listed in Table 5. The reference mix formulation is
presented in Table 7.
Table 5 – Particle size distribution (measured at the plant) for each material fraction
Size (US mesh)         Coarse    Inter.    Fines        Butts
+3/8 (in) (%)          ---       ---       ---          16-19
-3/8 (in) /+4 (%)      17-20     ---       ---          23-26
-4/+8 (%)              24-27     ---       ---          20-23
-8/+14 (%)             21-23     <1        <1           13-16
-14/+30 (%)            26-29     14-17     <1           14-17
-30/+50 (%)            4-6       50-53     <1           4-7
-50/+100 (%)           <1        22-25     3-6          <1
-100/+200 (%)          <1        6-9       19-22        <1
-200 (%)               <1        1-3       74-77        <1
VBD (g/cm³)            ---       <1        ---          ---
Blaine (cm²/g)         ---       ---       5000-6000    ---
The proportion of fine particles (%) shown in Table 5 is the average of three
measurements for each 12 h period of plant operation.
The size distribution data presented in Table 5 are given by particle size band, i.e. the
particles that passed a given screen size (-) but were retained on the next screen (+). Also,
the Blaine number (BN) available for the fine fraction is a measure of the specific surface
area and is representative of the fineness of the fines. This measurement is obtained using a
Malvern Mastersizer 2000™ laser diffraction particle size analyser.
The pitch properties used for laboratory anodes are listed in Table 6. Each property is the
weekly average of the pitch properties received that week based on the supplier’s
certificate of analysis (COA).
Table 6 – Properties of the industrial pitch supplied by ADQ for the laboratory anodes
Property                   ADQ pitch
Softening point (°C)       109.1
Toluene soluble (%)        71.9
B fraction (%)             14.5
Quinoline insoluble (%)    13.5
Coking value (%)           58.8
Viscosity 160°C (cP)       1890.8
Viscosity 180°C (cP)       517.5
The base formulation is presented in Table 7. It is typical of an average industrial
formulation. The percentages were calculated based on the dry aggregate total mass.
Thus, the proportion of pitch is the ratio of the amount of pitch to the amount of dry
particles (i.e. weight ratio).
Table 7 – Base mix formulation for the laboratory anode fabricated with the industrial raw materials
Fraction    Coarse    Inter    Fines    Butts    Pitch
%           32.5      26.0     18.2     23.4     16.9
4.2.2 Laboratory raw material formulation
The second set of experiments was also conducted at the laboratory scale, but used a
typical laboratory formulation. These paste samples were pressed into cylindrical anodes.
The diameter of the mold was 68 mm. Due to the small anode dimensions, the size
distribution of the particles was limited to a maximum size of approximately 4 mm. The
formulation used for the dry aggregate is presented in Table 8.
Table 8 – Laboratory coke aggregate formulation
Fraction (US mesh)    -4/+8    -8/+14    -14/+30    -30/+50    -50/+100    -100/+200    Fine (BN 4000)
%                     22.0     10.0      11.5       12.7       9.2         10.8         23.8
Two different commercially available calcined petroleum sponge cokes (A and B) were
used to prepare the laboratory anode formulations. Table 9 shows the physical properties
of these cokes. They were selected because they had the largest difference in density in
the -30/+50 US mesh size fraction of all the cokes available at the University. The crushing
and classification were performed in the laboratory. Both cokes were crushed with jaw and
roll crushers. The particles were separated into six different size fractions (i.e. -4/+8,
-8/+16, -16/+30, -30/+50, -50/+100 and -100/+200 US mesh) using a Sweco™ vibro-energy
round separator. The fine particles were obtained by grinding the -8/+16 US mesh
particles with a ball mill. This fraction is called ball mill fines. Details on the grinding
method are available in Azari Dorcheh (2013). The real density and impurities
measurements were performed at the Deschambault smelter’s laboratory.
Table 9 – Laboratory coke properties
Property                            Coke A    Coke B
Real density (g/cm³)                2.073     2.057
VBD (-8/+14 US mesh) (g/cm³)        0.88      0.77
VBD (-30/+50 US mesh) (g/cm³)       0.94      0.86
Si (ppm)                            210       120
S (%)                               1.73      2.13
Ca (ppm)                            240       130
V (ppm)                             90        360
Fe (ppm)                            340       460
Ni (ppm)                            230       250
The pitch used for the laboratory formulated anodes was also supplied by ADQ. This pitch
was used instead of the one used in the industrial formulations for easier comparison and
troubleshooting of the laboratory anode manufacturing since all the other research projects
carried out at Université Laval involving laboratory anode manufacturing (i.e. Prof. Mario
Fafard’s MACE3 industrial research Chair and Prof. Houshang Alamdari’s Collaborative
Research and Development program on anode manufacturing) used this particular pitch.
The laboratory anode pitch properties are listed in Table 10.
Table 10 – Laboratory pitch properties
Property                   Lab pitch
Softening point (°C)       109.1
Toluene soluble (%)        70.6
B fraction (%)             12.9
Quinoline insoluble (%)    16.5
Coking value (%)           58.4
Viscosity 160°C (cP)       1929.7
Viscosity 180°C (cP)       548.5
4.2.3 Laboratory anode fabrication
The procedure for the production of the laboratory paste is the same for both the industrial
and the laboratory formulations. First, the dry aggregate mix was prepared for each paste
sample by weighing each coke fraction separately. For the laboratory formulations, the
narrow size distribution of each of the coke fractions ensures that the variations in dry
aggregate size distribution between the different samples are minimized.
For the mixing step, a dough mixer (Hobart N50) was fitted inside a laboratory oven
(Precision Scientific model 28) to control the mixing temperature (Figure 21). This oven
was also used for pre-heating the dry aggregate at the mixing temperature of 178°C for
approximately 16 h (i.e. overnight). The paste was prepared by adding the solid pitch to the
dry aggregate mix in the mixing bowl. The pitch was then pre-heated for 30 min. The
materials were mixed for the desired mixing time and a sample of the paste was
discharged into an aluminium container for image acquisition.
Figure 21 – Details of the mixer and oven for laboratory paste preparation (labels: bowl,
mixing blade, motor inside vent duct)
The image acquisition was the last step of the sample preparation for the industrial
formulation paste samples. The laboratory formulated paste samples were subsequently
pressed into anodes.
The laboratory formulated paste samples were pressed using an MTS servo-hydraulic
press. A 250 kN MTS load cell and a 150 mm position transducer (LVDT) were used to
control the press. The pressing was done at a constant displacement rate of 10 mm/min
up to a maximum force of 220 kN (60 MPa). The diameter of the mold was 68 mm. The
temperature of the mold and the press was controlled by a three-zone split-tube furnace
(LAB-TEMP Thermacraft) mounted on the press. Anodes were pressed at 150°C. Figure
22 presents details of the mold and the press used to produce the anodes.
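As a quick check of the pressure quoted above, the maximum force can be converted to a compaction pressure. The sketch below assumes that the load is applied uniformly over the circular cross-section of the 68 mm mold; the function name is hypothetical.

```python
import math

def compaction_pressure_mpa(force_kn, mold_diameter_mm):
    """Pressure (MPa) for a force (kN) applied on a circular mold cross-section."""
    area_mm2 = math.pi * (mold_diameter_mm / 2.0) ** 2
    return force_kn * 1000.0 / area_mm2      # N/mm^2 is numerically equal to MPa

pressure = compaction_pressure_mpa(220.0, 68.0)
print(f"{pressure:.1f} MPa")
```

The result is slightly above 60 MPa, consistent with the rounded value given in the text.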
Figure 22 – Details of the press: a) cylindrical mold and die and b) the press with the oven
to control the pressing temperature (Azari Dorcheh 2013)
The green anodes obtained after pressing were baked in a Pyradia furnace. To protect the
anodes from oxidation, they were placed in a metal box (i.e. made of Inconel) and covered
with packing coke (Figure 23). The heat-up rates are listed in Table 11. After the anode
samples were maintained at 1100 °C for 20 hours (i.e. soaking time), the furnace was
turned off and the anodes were cooled in the closed furnace down to room temperature
(i.e. approximately 30 hours).
Table 11 – Heat-up rate during the laboratory anode baking
Temperature range (°C)    Heat-up rate (°C/h)
30-150                    60
150-650                   20
560-1100                  50
Figure 23 – Laboratory baking furnace and baking box
The apparent density of the green and baked anodes was measured on each sample. The
volume of the anodes was measured geometrically using the average of 4 length
measurements around the sample and the average of 6 diameter measurements (i.e. 3
diameters at 2 different heights). The weight of each sample was measured using a
Sartorius CPA160015 scale. The density (in g/cm³) was then computed by dividing the
sample weight by its measured volume.
4.2.4 Industrial paste sampling
The machine vision algorithm was developed and tested with the laboratory paste, but it
was also validated using industrial paste samples. These samples were collected at the
Alcoa Deschambault smelter’s paste plant. The paste was collected manually on a
conveyor belt after the mixing step, but before compaction. At each sampling time, three
aluminium containers were filled with paste. Each sample had a volume of approximately
550 cm³ and weighed around 500 g. The imaging set-up used at the plant is shown in
Figure 24.
Figure 24 – Imaging set-up installed at the ADQ industrial plant
Paste samples were collected during normal (i.e. steady-state) operation and plant start-up
operations, and also from controlled pitch optimization (PO) experiments. For each paste
sample, the corresponding green anode density was available as well as the process
operating conditions data. Details of the data synchronization are available in
Lauzon-Gauthier (2011).
These data are not sufficient to fully characterize the quality of the paste samples. Subsets
of green anode cores were therefore baked in the laboratory furnace at Université Laval
(Figure 23) to obtain the baked anode density. Some of these anodes were also
characterized at the Deschambault laboratory to measure the electrical resistivity, Young’s
modulus, compressive strength and CO2 reactivity. These core samples were drilled from
the test anodes before baking. The baking was performed in the laboratory furnace to obtain
uniform baking profiles for all cores, which is impossible if the anodes are baked
in the industrial furnace. The green cores have a diameter of 55 mm and a length of
approximately 356 mm.
4.3 Image analysis methodology
4.3.1 Description of the imaging set-up
The imaging set-up developed in this project is shown in Figure 25. It consists of an Allied
Vision Technologies Prosilica GX 6-megapixel color camera with a 50 mm Kowa lens,
two 4.5 W LED light bulbs and Fresnel lenses to ensure uniformity of lighting. This set-up
allows a wide range of adjustments of the lighting angle and camera height.
Figure 25 – Image acquisition set-up
The camera is connected to a laptop computer using an Ethernet cable and the network
card of the computer. Allied Vision Technologies’ proprietary software AcquireControl™
version 4.0.2 is used for controlling the camera as well as for image acquisition. The images
were saved in Tagged Image File Format (TIFF).
The paste samples were placed into an aluminium food container (109 × 127 × 40 mm)
for image acquisition. Each sample was spread into the container and its
surface was flattened using a metal spatula.
The working distance between the sample and the camera was adjusted to maximize the
visible paste surface without imaging the shiny aluminium container’s edge. The distance
between the lens and the paste surface was 461 mm. At this distance, the pixel resolution
was 40.7 µm. The illumination angle (i.e. 45°) was chosen to obtain the most uniform
lighting of the surface possible. The exposure time was selected to minimize pixel
saturation in the images: 80 ms for the industrial samples and 70 ms for the
lab samples. This difference is most probably due to the difference in surrounding light
intensity: it was much darker inside the paste plant than in the well-lit laboratory.
4.3.2 Description of the image analysis methodology
The image texture analysis method consists of computing GLCMs and the DWT
decomposition of each paste image. This is followed by feature extraction and analysis
steps and finally by the interpretation of the information obtained through the texture
analysis scheme. A flowsheet of the methodology is presented in Figure 26.
Figure 26 – Anode paste machine vision flowsheet
Figure 27 shows an example of laboratory (a) and industrial paste (b) samples. Their
appearance is similar, but the industrial paste is coarser (i.e. contains larger particles) than
the laboratory paste.
[Figure 26 flowsheet, reconstructed from the extracted box labels: original image → pre-processing
(RGB to grayscale, ROI, low-pass filtering, contrast enhancement) → GLCM and DWT. GLCM branch:
features for each L and average of θ (ASM, entropy, contrast, correlation, homogeneity). DWT
branch: detail coefficients and approximation, giving the WTA features (energy, entropy, standard
deviation, skewness, kurtosis) and GLCM features for L(s) and average of θ (ASM, entropy,
contrast, correlation, homogeneity). All features → feature analysis (PCA, PLS) → desired
information → interpretation, use in monitoring or control scheme.]
Figure 27 – Example of paste image: a) laboratory paste and b) industrial paste
Unlike most off-the-shelf commercial cameras, the camera used in this project has no optical
low-pass filter installed on its sensor. Although it is not clearly visible in Figure 27,
the images contain some high-frequency noise due to the lack of a low-pass filter. To
eliminate that noise, a 3×3 Gaussian low-pass filter was applied to the images in the pre-
processing steps. This operation was performed just after the transformation of the RGB
image into a grayscale image. The low-pass filtered grayscale image is presented in
Figure 28 a).
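The grayscale conversion followed by the 3×3 Gaussian low-pass filter can be sketched as below. This is only an illustration in plain Python: the luminance weights, the filter standard deviation (0.5) and the edge handling (border pixels replicated) are assumptions, since the thesis only specifies the filter type and size.

```python
import math

def gaussian_kernel(size=3, sigma=0.5):
    """Normalized size x size Gaussian kernel (sigma is an assumed value)."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

def to_gray(rgb):
    """Weighted sum of the R, G and B channels (assumed luminance weights)."""
    return [[0.2989 * r + 0.5870 * g + 0.1140 * b for (r, g, b) in row]
            for row in rgb]

def low_pass(gray, kernel):
    """Convolve the image with the kernel, replicating the border pixels."""
    height, width = len(gray), len(gray[0])
    half = len(kernel) // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii = min(max(i + di, 0), height - 1)
                    jj = min(max(j + dj, 0), width - 1)
                    acc += kernel[di + half][dj + half] * gray[ii][jj]
            out[i][j] = acc
    return out
```

Because the kernel is normalized to sum to one, the filter leaves uniform regions unchanged and only attenuates pixel-to-pixel variations.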
Paste images are very dark and have a low contrast. Enhancing the contrast of the images
gives better results in image texture analysis. The imadjust function built into
Matlab™ was used to pre-process the paste images. This function adjusts the
distribution of the pixel intensities so that 1% of the pixels are saturated
at minimum intensity (i.e. 0 or black) and 1% at maximum intensity (i.e. 1 or white). The image
before contrast enhancement (Figure 28 a) is compared to the results after pre-processing
(Figure 28 b). The pixel intensity histogram for both images is presented in Figure 28 c). It
shows that the distribution of pixel intensity is more uniform after pre-processing, which
results in more contrast in the image.
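The contrast enhancement described above can be approximated by a percentile-based linear stretch. The sketch below is not the Matlab function itself; the function name and the simple index-based percentile rule are assumptions chosen to keep the example short.

```python
def stretch_contrast(gray, saturation=0.01):
    """Linearly rescale intensities so that roughly `saturation` of the pixels
    clip at 0 and the same fraction clips at 1."""
    values = sorted(v for row in gray for v in row)
    n = len(values)
    low = values[int(saturation * (n - 1))]
    high = values[int((1.0 - saturation) * (n - 1))]
    if high == low:                       # flat image: nothing to stretch
        return [row[:] for row in gray]
    scale = high - low
    return [[min(max((v - low) / scale, 0.0), 1.0) for v in row]
            for row in gray]

# Hypothetical dark, low-contrast image: intensities between 0.10 and 0.199
dark = [[0.10 + 0.001 * k for k in range(100)]]
enhanced = stretch_contrast(dark)
```

After the stretch, the intensities of the toy image span the full range from 0 to 1 instead of a narrow dark band.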
Figure 28 – Results of the image pre-processing: a) low-pass filtered grayscale image, b)
image after contrast enhancement and c) comparison of the intensity histogram for both
images
The original image size was 2751×2199 pixels. In some images, parts of the aluminium
container or holes in the paste were observed at the edges. To avoid interference with the
texture analysis, the images were cropped to a smaller region of interest (ROI) of
2560×1882 pixels. Both the GLCM and DWT were then applied to the pre-processed images.
Next, the GLCMs are computed at four different angles θ (i.e. 0°, 45°, 90° and 135°) for
each distance L. The GLCM distances were chosen to match the coke particle size
distribution classes as closely as possible, as shown in Table 12. The features listed in
section 3.3.1 are computed as the average over the four angles θ at each distance L. The
average is used in the case of paste images since the orientation of the particles is
stochastic and no structured or oriented patterns were observable.
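The averaging of GLCM features over the four angles can be illustrated as follows. This is a minimal plain-Python sketch, assuming an image already quantized to integer gray levels and a symmetric, normalized GLCM; only two features (ASM and contrast) are shown, and all function names are hypothetical.

```python
# (dx, dy) unit offsets for 0, 45, 90 and 135 degrees (image rows grow downwards)
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

def glcm(image, dx, dy, levels):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    total = 0
    for i in range(height):
        for j in range(width):
            ii, jj = i + dy, j + dx
            if 0 <= ii < height and 0 <= jj < width:
                a, b = image[i][j], image[ii][jj]
                counts[a][b] += 1
                counts[b][a] += 1         # symmetric matrix (assumption)
                total += 2
    if total == 0:
        raise ValueError("distance is larger than the image")
    return [[c / total for c in row] for row in counts]

def asm(p):
    return sum(v * v for row in p for v in row)

def contrast(p):
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))

def averaged_features(image, distance, levels):
    """Average each feature over the four angles at one distance L."""
    mats = [glcm(image, dx * distance, dy * distance, levels)
            for dx, dy in OFFSETS.values()]
    return {"asm": sum(asm(p) for p in mats) / len(mats),
            "contrast": sum(contrast(p) for p in mats) / len(mats)}

# Hypothetical two-level image with vertical stripes
stripes = [[0, 1, 0, 1] for _ in range(4)]
features = averaged_features(stripes, distance=1, levels=2)
```

For the striped toy image, the contrast is zero along the stripes (90°) and one in the three other directions, so the averaged contrast is 0.75. Such a strongly oriented pattern is exactly the case in which averaging would hide information, which is why the text justifies the average by the absence of oriented patterns in the paste images.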
[Figure 28 c) axes: normalized pixel intensity value (0 to 1) on the horizontal axis and pixel
count (×10⁴) on the vertical axis; legend: Before, After]
Table 12 – Choice of GLCM distance L and comparison to the particle size distribution
L (pixel)    Length (mm)    Length (US mesh)
1            0.04           -200
2            0.08           +200
4            0.16           +100
8            0.33           +50
16           0.65           +30
36           1.47           +14
60           2.44           +8
120          4.88           +4
240          9.77           +3/8 (in)
Prior to applying the Discrete Wavelet Transform (DWT) to the paste images, one needs to
determine the number of decomposition levels to use. It was chosen based on the smallest
dimension of the pre-processed image (i.e. 1882 pixels). Since each decomposition level
decreases the size of each dimension by a factor of 2 and it is not recommended to reduce
the image dimensions below 10 pixels, seven decomposition levels were selected for this
application. After seven decomposition levels, the image has a residual size of 20×14.
The symlet wavelet was selected for the preliminary work (i.e. proof of concept and choice
of preprocessing) since it is a simple wavelet with only one parameter to define. This
parameter is the length of the wavelet filter. It was decided to use the symlet 4 (sym4),
whose shape is presented in Figure 29. Several types and lengths of wavelets have been
tested and these results will be presented in section 6.3.3 of this thesis.
Figure 29 – Symlet 4 wavelet function
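The choice of seven decomposition levels can be checked with a short calculation. The sketch below simply halves each image dimension until the next level would fall below the 10-pixel limit; it ignores the boundary effects of the wavelet filter length, and the function name is hypothetical.

```python
def max_decomposition_levels(shape, min_size=10):
    """Number of dyadic levels keeping both dimensions at or above min_size,
    and the residual image size after those levels."""
    height, width = shape
    levels = 0
    while min(height // 2, width // 2) >= min_size:
        height, width = height // 2, width // 2
        levels += 1
    return levels, (height, width)

levels, residual = max_decomposition_levels((1882, 2560))
print(levels, residual)
```

For the 2560×1882 ROI this gives seven levels with a 20×14 residual image; an eighth level would reduce the smallest dimension to 7 pixels.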
The DWT splits the original image into a series of detail coefficient sub-images capturing
defined frequency bands. Table 13 presents the size range (i.e. period of the discrete
signals) of the features in mm as well as the GLCM distance L used for the machine vision
algorithm. The feature size ranges were computed based on the pixel spatial resolution in
the image and the frequency range of each decomposition level. The GLCM distances
correspond, in number of pixels, to the middle of the decomposition level size range. For
example, 3 pixels correspond to approximately 0.12 mm, which is the middle of the level 1
size range. The feature size is also compared to the approximate (i.e. closest)
corresponding particle size range.
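The size ranges described above can be reproduced under the assumption that decomposition level s captures periods between 2^s and 2^(s+1) pixels, converted to mm with the 40.7 µm pixel resolution. The function name is hypothetical.

```python
PIXEL_MM = 0.0407    # pixel resolution reported for the imaging set-up (mm)

def level_band(level, pixel_mm=PIXEL_MM):
    """Return (lower period in mm, upper period in mm, mid-band distance in pixels)."""
    low_px, high_px = 2 ** level, 2 ** (level + 1)
    return low_px * pixel_mm, high_px * pixel_mm, (low_px + high_px) // 2

for s in range(1, 8):
    low_mm, high_mm, mid_px = level_band(s)
    print(f"level {s}: {low_mm:.2f} to {high_mm:.2f} mm, mid-band L = {mid_px} px")
```

Under this assumption, levels 1 to 3 give mid-band distances of 3, 6 and 12 pixels. For levels 4 to 7 the computed midpoints are 24, 48, 96 and 192 pixels, so the distances of 25, 50, 100 and 200 pixels listed in Table 13 appear to be rounded values close to these midpoints.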
Table 13 – Band pass size in period (i.e. spatial dimensions) for each decomposition level of the DWT
Decomposition    Feature size      GLCM L on detail    Approximate particle
level (s)        (period in mm)    coeff. (pixel)      size (US mesh)
1                +0.08/-0.16       3                   -100/+200
2                +0.16/-0.33       6                   -50/+100
3                +0.33/-0.65       12                  -30/+50
4                +0.65/-1.30       25                  -14/+30
5                +1.30/-2.60       50                  -8/+14
6                +2.60/-5.21       100                 -4/+8
7                +5.21/-10.42      200                 -3/8 (in) /+4
The WTA features listed in section 3.3.2 are computed on all the detail coefficient images
and the last approximation image. GLCMs were also computed on the wavelet detail sub-
images for all angles and one distance (i.e. the middle of the level’s band pass). The same
features as for the image’s GLCMs are computed for these detail coefficients.
A vector of features was therefore obtained for each paste image after applying both the
GLCM and DWT decomposition. To compare the images, either PCA or PLS
was used as a data clustering or classification tool. PCA was used when no other
information was available on the paste other than the textural features (i.e. when a single
data matrix was available). When additional data on the paste was available, such as
green/baked anode properties or class assignments for each paste sample (i.e. supervised
classification), then a regression problem could be formulated. In those cases, PLS
regression was used.
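To illustrate how a matrix of feature vectors (one row per paste image) is compressed by PCA, the sketch below autoscales the columns and extracts the first principal component with a NIPALS-type iteration. It is a plain-Python illustration with hypothetical function names and toy data, not the PLS Toolbox or ProMV computation used in the thesis.

```python
import math

def autoscale(X):
    """Mean-centre each column and scale it to unit variance."""
    n = len(X)
    cols = list(zip(*X))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def first_component(X, max_iter=500, tol=1e-12):
    """Scores t and loadings p of the first principal component (NIPALS)."""
    n, m = len(X), len(X[0])
    t = [row[0] for row in X]             # start from the first column
    p = [0.0] * m
    for _ in range(max_iter):
        tt = sum(v * v for v in t)
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        t_new = [sum(X[i][j] * p[j] for j in range(m)) for i in range(n)]
        converged = sum((a - b) ** 2 for a, b in zip(t, t_new)) < tol
        t = t_new
        if converged:
            break
    return t, p

# Hypothetical feature matrix: 4 images, 2 perfectly correlated features
features = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
scores, loadings = first_component(autoscale(features))
```

With two perfectly correlated features, a single component captures all the variance and both features receive the same loading. The scores can then be plotted to look for clusters of images with similar texture.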
Chapter 5 A new Multi-block PLS algorithm including a sequential pathway
5.1 Introduction
The most common latent variable methods (i.e. PCA and PLS) have been presented in
Chapter 2 along with their interpretation tools. However, graphical interpretation tools such
as loadings and contribution plots can become difficult to interpret when the number of
variables in a model becomes very large. To improve the interpretation, variables in a
dataset can be grouped into meaningful blocks. Usually, a priori knowledge is used to
decide which variables belong in each block. For process data, the blocks often represent
the sequence of steps or unit operations (i.e. pieces of equipment). A typical multi-block
data structure collected from an industrial process is shown in Figure 30. For example, Z
may contain the properties of raw materials at different times or for different lots; the
operating conditions of the different process units used to process each lot of raw
materials would be stored in the Xb blocks (b = 1, 2, …, B); and the resulting properties of
the final product (i.e. quality attributes) could be collected in Y.
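Figure 30 – Illustration of the block order for an industrial process (rows: observations;
columns: variables; blocks: Z, raw materials; X1 to XB, process units; Y, product quality)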
Several multi-block methods have been proposed in the literature and used in various
applications (Westerhuis et al. 1998; Smilde et al. 2003; Kohonen et al. 2008). Early
developments on the multi-block PLS methods were made by Wangen and Kowalski
(Wangen & Kowalski 1989), MacGregor et al. (MacGregor et al. 1994) and Westerhuis and
Coenegracht (Westerhuis & Coenegracht 1997). They proposed a hierarchical structure
allowing visualization and interpretation of the data at two levels: the super level where the
information from all the blocks is combined, and the local block level. These two levels of
scrutiny greatly help in dealing with the large number of variables because each individual
block includes a smaller number of them while an overview of the information contained in
all blocks is maintained at the super level. This feature was found particularly useful for
process monitoring and fault detection applications (MacGregor et al. 1994; Kourti et al.
1995) and also for improved process understanding (Kohonen et al. 2008; Tessier,
Duchesne, Tarcy, et al. 2011).
The multi-block PLS models (MB-PLS) are computed based on extensions of the non-
linear iterative partial least squares (NIPALS) algorithm (Geladi & Kowalski 1986;
Höskuldsson 1988; Wold, Trygg, et al. 2001) traditionally used to estimate PLS models.
The main difference between the first two algorithms proposed in the literature is in the
way the regressor blocks (i.e. X’s) are deflated. The first one, proposed by Wangen and
Kowalski (Wangen & Kowalski 1989), uses the block scores to deflate each X block.
Alternatively, Westerhuis and Coenegracht (Westerhuis & Coenegracht 1997) proposed to
use the super scores for deflating the X blocks. In a review article on multi-block PCA and
PLS methods, Westerhuis et al. (Westerhuis et al. 1998) proved that loss of information
occurs when block score deflation is used. This leads to poorer prediction performance of
the models. The super score deflation approach solved that issue and therefore the
Westerhuis and Coenegracht MB-PLS algorithm is considered as a benchmark in this
chapter. Another interesting feature of this algorithm is that it can be computed directly
from a standard PLS model after applying block scaling to each regressor block, and
combining them in a single X matrix (Westerhuis et al. 1998).
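The block scaling and concatenation step mentioned above can be sketched as follows. The weight used here, one over the square root of the number of variables in the block, is a common block-scaling choice and is an assumption of this example, as is the function name; the blocks are assumed to be autoscaled beforehand.

```python
import math

def block_scale(blocks):
    """Concatenate autoscaled blocks column-wise after dividing each block by
    the square root of its number of variables (assumed weighting)."""
    n_obs = len(blocks[0])
    rows = [[] for _ in range(n_obs)]
    for block in blocks:
        weight = 1.0 / math.sqrt(len(block[0]))
        for i, row in enumerate(block):
            rows[i].extend(v * weight for v in row)
    return rows
```

With this weighting, every block contributes the same total sum of squares to the concatenated matrix, so a block with many variables does not dominate the model simply because of its size. A standard PLS model would then be fitted between the concatenated matrix and Y.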
Other alternative methods and hierarchical frameworks have also been proposed in the
literature to accommodate specific data structures (i.e. pathways or correlations) or to
introduce some prior knowledge into the statistical models and improve their performance.
Smilde et al. (Smilde et al. 2003), Höskuldsson and Svinning (Höskuldsson & Svinning
2006) and Höskuldsson (Höskuldsson 2008; Höskuldsson 2014) proposed different
frameworks for computing PCA and PLS-like models for many different types of pathways
from parallel to sequential blocks or a mix of both cases. All of these methods were
focused on modeling the X or Y data and not so much on the interpretability of the models.
Hanafi et al. (Hanafi et al. 2006) proposed the use of either Common Components and
Specific Weights Analysis (CCSWA) or Multiple Co-inertia Analysis (MCoA) to combine
information from multiple measurement techniques applied to the same set of samples in a
food processing application. Finally, Hassani et al. (Hassani et al. 2012) discussed
methods to choose the number of components and assess the importance of each block in
MB-PLS models.
However, the Westerhuis and Coenegracht MB-PLS method also has some drawbacks.
First, Westerhuis and Smilde (Westerhuis & Smilde 2001) reported that the use of super
score deflation in the MB-PLS algorithm introduced information mixing between the blocks.
This is explained by the fact that the super scores carry information from all the blocks.
When they are regressed onto each X block as part of the deflation procedure, variation
belonging to all the blocks is introduced into blocks in which this information was not
present originally. Hence, the interpretation of the block scores could be misleading. They
proposed to modify the super score deflation to deflate the Y matrix only, based on
previous work published by Dayal and MacGregor (Dayal & MacGregor 1997) for the PLS
algorithm. This effectively removed the information mixing, but no further use of the
modified algorithm was reported in the literature.
The second issue is that all X blocks are constrained to be modelled by the same number
of principal components. In a case where the effective ranks of the blocks (data matrices)
are different, the model could be improved by selecting a different number of latent variables
(LVs) for each of them. For example, a block of data may include process variables
involved in an orthogonal design of experiments (DOE), to be combined with a second
block containing highly correlated spectral data collected from raw materials using some
spectroscopic techniques (Næs et al. 2011). In such a case, it is expected that as many
LVs as the number of variables would be necessary for the DOE block since it is full rank,
whereas the reduced-rank nature of the spectral data block may require only a few LVs
(e.g. 1-2) to capture most of the information in that block. This is not possible with MB-
PLS.
Finally, interpreting the relationships between the variables at the block level in the presence
of between-block correlations is not straightforward and may result in misleading
conclusions. The loading vectors are used to interpret the relationships between the
variables in latent variable models. At the super level, the loadings indicate the contribution
of each block in modelling the Y data. Information about the variables is captured by the
block loadings. When interpreting the variations within a given block, one cannot assume
that these variations were introduced by the corresponding process unit (or step or
instrument), since they may have been caused by other blocks. Consider a simple two regressor
block example where Z and X represent raw material properties and process variables,
respectively, and Y contains the final product quality attributes. Variations in raw material
properties typically affect both the process operation and the final product quality.
Additional process variations introduced by disturbances (other than raw materials) and
changes in operation policies may also impact the final product. Hence, the variations in
the process block X are not only caused by the process itself but also by variations in raw
materials contained in the other block Z. When interpreting the loadings of the X block
individually, variations in process variables caused by raw materials cannot be identified
explicitly and distinguished from other sources of variations, and may be attributed wrongly
to some other process disturbances. However, when prior knowledge is available about
the existing pathway between the blocks, this information should be used to enhance the
model interpretation. For instance, the variance of X could be decomposed into that
correlated with Z (Xcorr) and that orthogonal to Z (Xortho). The effect of raw material
properties on certain process state variables as well as feedback/feedforward control
actions made to compensate for raw material variations would be identified in Xcorr,
whereas Xortho would contain information about other process disturbances and operation
decisions.
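One way to write the decomposition mentioned above is given below. It assumes an ordinary least-squares projection of X onto the column space of Z; this is an assumption made for illustration only, since with collinear raw material data the projection would typically be made onto a few latent variable scores of Z rather than onto Z itself.

```latex
\begin{aligned}
X_{corr}  &= Z\,(Z^{T}Z)^{-1}Z^{T}X, \\
X_{ortho} &= X - X_{corr}, \\
Z^{T}X_{ortho} &= 0, \qquad
\lVert X\rVert_F^{2} = \lVert X_{corr}\rVert_F^{2} + \lVert X_{ortho}\rVert_F^{2}.
\end{aligned}
```

The last line states that the two parts are orthogonal, so the total variation of X splits exactly into a part related to the raw materials and a part unrelated to them.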
The sequential orthogonal PLS (SO-PLS) introduced by Næs et al. (Næs et al. 2011)
addresses the last two issues of MB-PLS to some extent. It follows the framework
proposed by Jørgensen et al. (Jørgensen et al. 2004; Jørgensen et al. 2007) in which
separate individual PLS models are estimated using each regressor block in a stepwise
fashion after the blocks are sequentially orthogonalized with respect to each other.
Basically, a sequential pathway is assumed for the regressor blocks and defines their
ordering. A PLS model is built between the first block and the response data Y.
Subsequent blocks are then orthogonalized with respect to the information used from the
first block by regressing them against the scores of the PLS model. The procedure is
repeated for subsequent regressor blocks using only their orthogonal information and the
Y-residuals of the PLS model obtained at the previous step. Hence, at each step, the
regressor block only contains new information to explain the remaining variance left in Y.
One of the motivations for developing the SO-PLS approach was to model the Y data by
combining regressor blocks having very different ranks, such as a DOE block and several
blocks of spectral data (non-full rank) (Jørgensen & Næs 2008; Menichelli et al. 2014). The
proposed method allows using a different number of latent variables per block. It also
helped in the interpretation because each PLS model uses a single regressor block and
only new information is modelled at each step due to the orthogonalization.
However, the main issue with SO-PLS is that it totally ignores the correlated information
between the blocks. Although this information is not useful for making predictions, it is
important for interpreting the relationships between the variables especially in industrial
process applications. Consider again the simple two regressor block example discussed
previously. Applying SO-PLS to that dataset would result in two PLS models, the first
between Z and Y, and the second between Xortho and F, where Xortho is X orthogonalized
with respect to Z and F consists of the Y-residuals of the first model. Hence, the correlated
information between Z and X (i.e. Xcorr) is completely removed from the analysis. As
mentioned earlier, Xcorr contains process variations introduced by raw materials as well as
any feedback/feedforward control actions made to attenuate variations in raw material
properties. The authors consider this information very important for process understanding
and improvement, and for quality control. For example, Xcorr is required for establishing
multivariate specification regions for raw material properties in the presence of
feedback/feedforward control (Duchesne & MacGregor 2004).
The goal of this chapter is to propose a new sequential multi-block PLS algorithm (SMB-
PLS) that combines the advantages of both MB-PLS and SO-PLS methods. This includes
the integrated two level hierarchical structure (super and block levels) of the first, the
separation of correlated and orthogonal information and the use of a different number of
latent variables per block from the second. This is achieved by incorporating a sequential
pathway structure and block orthogonalization within the MB-PLS algorithm. In process
applications, the pathway typically represents the flowsheet of process units, from raw
materials to final product. The key feature of the new algorithm is that correlated
information between a given block and subsequent ones is pooled in a common latent
variable space whereas orthogonal information is captured by other components. Hence,
both between block correlated and orthogonal information is considered in the model and
therefore, no information is lost.
This chapter is organized as follows. First, the different multi-block methods are introduced
in more technical detail. Second, two datasets are used to illustrate the properties of the
proposed SMB-PLS algorithm. The first was obtained from a simulated film blowing
polymer extrusion process, and the second from a real anode manufacturing process.
Modeling results and interpretation are then discussed, and conclusions are drawn.
5.2 Description of the multi-block methods
5.2.1 Multi-block PLS (MB-PLS)
There are two different implementations of the MB-PLS algorithm. The first was proposed
by Wangen and Kowalski (Wangen & Kowalski 1989) and uses the block score deflation,
whereas the second algorithm proposed by Westerhuis and Coenegracht (Westerhuis &
Coenegracht 1997) uses super score deflation. The computation of both algorithms is
based on the PLS NIPALS algorithm and is essentially the same except for the deflation
step. The MB-PLS based on block score deflation suffers from loss of information in the
deflation step (Westerhuis & Coenegracht 1997; Westerhuis et al. 1998) leading to inferior
prediction ability compared with the super score deflation method. Also, the super score
deflation method gives the same result as PLS when all blocks are concatenated in a
single regressor matrix and block scaling is applied (Westerhuis et al. 1998). For these
reasons, only the second MB-PLS algorithm (super score deflation) is described in this
section. A detailed description of the algorithm and the Matlab code for computing the
model are available in Westerhuis et al. (Westerhuis et al. 1998).
The schematic of the method is presented in Figure 31. In this figure, tb corresponds to the
block score and tT to the super score. This figure shows the main steps in computing the
MB-PLS model. First, vector u is initialized and is then regressed on all blocks X1, X2, ...,
XB separately to obtain block weights w1T, w2T, ..., wBT. These block weights are used to
compute block scores t1, t2, ..., tB after normalization to length one. The block scores are
combined column-wise in a super block T. Then super weights wTT and super
scores tT are computed using T (i.e. concatenated block scores). Here the super weights
are normalized to length one. This computation cycle is repeated until the convergence of
the super score tT. The super score tT is then used for the deflation of all Xb blocks.
Figure 31 – The MB-PLS algorithm for 2 regressor blocks
(adapted from (Westerhuis et al. 1998))
Since the number of variables in each block is different, it is important to apply block
scaling (equation 5.1) to give the same importance to each block. If this step is not
performed, the latent variable model will focus more on those blocks containing many
variables (i.e., carrying more variability) and may hide the variance contained in the
smaller blocks. Block scaling divides each auto-scaled variable by the square root of the
number of variables in its block. This operation sets the total variance of each block to one,
as opposed to each variable having a variance of one after the auto-scaling normally
applied for PLS models.
$$x_{j,bl} = \frac{x_j^{*}}{\sqrt{k_b}} \qquad (5.1)$$
Where x*j is the auto-scaled variable, j is the variable index and kb is the number of
variables in the block.
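As a minimal sketch of equation 5.1, assuming NumPy and two hypothetical blocks of random data (the function name block_scale and the block sizes are illustrative and are not taken from the thesis), the following checks that the total variance of each block becomes one after block scaling, whatever the number of variables it contains:

```python
import numpy as np

def block_scale(X):
    """Auto-scale each column, then divide by sqrt(number of columns)."""
    X = np.asarray(X, dtype=float)
    x_star = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # auto-scaling
    return x_star / np.sqrt(X.shape[1])                    # equation 5.1

rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 12))   # hypothetical block with 12 variables
X = rng.normal(size=(50, 3))    # hypothetical block with 3 variables
Zs, Xs = block_scale(Z), block_scale(X)

# The column variances of each scaled block sum to one
total_var_Z = Zs.var(axis=0, ddof=1).sum()
total_var_X = Xs.var(axis=0, ddof=1).sum()
```

Both blocks therefore enter the model with the same total variance, even though the first has four times as many variables.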
1. Set u to any column of Y.
2. Start convergence loop.
   For b = 1:B
      2.1. w_b = X_b^T u / (u^T u)
      2.2. w_b = w_b / (w_b^T w_b)^½
      2.3. t_b = X_b w_b
   End
   2.4. T = [t_1 ... t_B]
   2.5. w_T = T^T u / (u^T u)
   2.6. w_T = w_T / (w_T^T w_T)^½
   2.7. t_T = T w_T / (w_T^T w_T)
   2.8. q = Y^T t_T / (t_T^T t_T)
   2.9. u = Y q / (q^T q)
   2.10. Check for convergence of t_T or u. Continue to step 3 if converged.
For b = 1:B
   3. p_b = X_b^T t_T / (t_T^T t_T)
   4. E_b = X_b - t_T p_b^T
End
5. F = Y - t_T q^T
6. Store w_b, p_b, t_b, t_T and u as new columns in W, P, T, T_T and U.
7. Restart at step 1, replacing X_b by E_b and Y by F.
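The steps listed above can be sketched in Python as follows. This is only an illustrative reading of the listing, assuming NumPy and already centred and block scaled data; it is not the Matlab code of Westerhuis et al. (1998), and the function name, tolerance and random test data are assumptions made for the example.

```python
import numpy as np

def mbpls_component(blocks, Y, tol=1e-10, max_iter=500):
    """Extract one MB-PLS latent variable using super score deflation."""
    u = Y[:, [0]].copy()                                    # step 1
    t_T = np.zeros_like(u)
    for _ in range(max_iter):                               # step 2
        w = [Xb.T @ u / (u.T @ u) for Xb in blocks]         # 2.1
        w = [wb / np.linalg.norm(wb) for wb in w]           # 2.2
        t = [Xb @ wb for Xb, wb in zip(blocks, w)]          # 2.3
        T = np.hstack(t)                                    # 2.4
        w_T = T.T @ u / (u.T @ u)                           # 2.5
        w_T = w_T / np.linalg.norm(w_T)                     # 2.6
        t_new = T @ w_T / (w_T.T @ w_T)                     # 2.7
        q = Y.T @ t_new / (t_new.T @ t_new)                 # 2.8
        u = Y @ q / (q.T @ q)                               # 2.9
        converged = np.linalg.norm(t_new - t_T) < tol * np.linalg.norm(t_new)
        t_T = t_new
        if converged:                                       # 2.10
            break
    p = [Xb.T @ t_T / (t_T.T @ t_T) for Xb in blocks]       # step 3
    E = [Xb - t_T @ pb.T for Xb, pb in zip(blocks, p)]      # step 4
    F = Y - t_T @ q.T                                       # step 5
    return {"w": w, "t": t, "w_T": w_T, "t_T": t_T, "p": p, "q": q, "E": E, "F": F}

# Hypothetical data: two regressor blocks and one response
rng = np.random.default_rng(1)
X1, X2 = rng.normal(size=(40, 5)), rng.normal(size=(40, 3))
Y = X1[:, :1] + X2[:, :1] + 0.1 * rng.normal(size=(40, 1))
lv1 = mbpls_component([X1, X2], Y)
lv2 = mbpls_component(lv1["E"], lv1["F"])   # step 7: restart on the residuals
```

Because every block is deflated with the super score, the super scores of successive components (lv1["t_T"] and lv2["t_T"]) are orthogonal to each other.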
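[Schematic of Figure 31: at the block level, u is regressed on X1 and X2 to give w1T, w2T, t1 and t2 (steps 2.1 and 2.3); at the super level, the block scores are collected in T (step 2.4) to give wTT and tT (steps 2.5 and 2.7), followed by qT and u (steps 2.8 and 2.9) and the loadings p1T and p2T.]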
5.2.2 Sequential Orthogonal PLS (SO-PLS)
Sequential orthogonal PLS (SO-PLS) was proposed by Næs et al. (Næs et al. 2011).
They developed this method for the analysis of datasets combining design of experiment
data (i.e. a full rank data matrix) and very low rank spectral data blocks obtained from
analytical instruments such as spectroscopy or chromatography. In this case, selecting the
number of LVs for the traditional MB-PLS is a compromise between the full rank DOE data
and the low rank data. With SO-PLS, a different number of LVs can be computed for each
block, overcoming the problem. The schematic of the SO-PLS method is presented in
Figure 32.
Figure 32 – The SO-PLS algorithm shown for 2 regressor blocks
The SO-PLS method is a stepwise procedure by which a separate PLS model is estimated
for each regressor block Xb. The particularity of the method is that each subsequent block
is orthogonalized with respect to the scores (T) of the PLS models built at the previous
steps. The orthogonalization step ensures that only new information not modeled by the
previous blocks is left in the subsequent blocks. The framework is as follows (Figure 32).
First, the regressor blocks Xb are ordered according to a sequential pathway defined by
the user. A PLS model is built between the first block in the sequence X1 and Y. Then the
scores of the first block T1 are used to orthogonalize the data in the second block (and
subsequent blocks if B > 2) using multiple linear regression (MLR):
For b = 1, 2, …, B-1 and for k = 1, 2, …, B-b
$$X_{b+k}^{orth} = X_{b+k} - T_b \left(T_b^{T} T_b\right)^{-1} T_b^{T} X_{b+k} \qquad (5.2)$$
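[Schematic of Figure 32: a PLS model between X1 and Y (u1, w1T, t1, p1T, q1T) yields the Y-residuals F; X2 is orthogonalized into X2orth; a second PLS model is built between X2orth and F (u2, w2orth,T, t2orth,T, p2orth,T, q2T). The three steps are numbered 1 to 3.]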
Finally, a PLS model is built between the orthogonalized second block (X2orth) and the Y-
residuals (F) of the first PLS model. This sequence is repeated for all subsequent blocks
and is not limited to datasets with only two X blocks. When b=B, then a simple PLS model
is built between the information left in XB after the sequential orthogonalization steps and
the Y-residual at this step.
For this method, the blocks do not need to be block scaled prior to the analysis. Moreover,
contrary to the MB-PLS method, the number of components selected for each block can
be different. This allows capturing information from datasets with very different ranks.
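The stepwise procedure described in this section can be sketched as follows. This is a simplified illustration, assuming NumPy, centred data and a basic NIPALS PLS routine written for the example; it is not the implementation of Næs et al. (2011), and the function names and simulated blocks Z and X are hypothetical.

```python
import numpy as np

def pls_scores(X, Y, n_lv, tol=1e-10, max_iter=500):
    """NIPALS PLS: return the X scores (n x n_lv) and the Y residuals."""
    E, F, scores = X.copy(), Y.copy(), []
    for _ in range(n_lv):
        u = F[:, [0]]
        t_old = np.zeros_like(u)
        for _ in range(max_iter):
            w = E.T @ u
            w /= np.linalg.norm(w)
            t = E @ w
            q = F.T @ t / (t.T @ t)
            u = F @ q / (q.T @ q)
            if np.linalg.norm(t - t_old) < tol * np.linalg.norm(t):
                break
            t_old = t
        p = E.T @ t / (t.T @ t)
        E, F = E - t @ p.T, F - t @ q.T          # deflate X and Y
        scores.append(t)
    return np.hstack(scores), F

def so_pls(blocks, Y, n_lvs):
    """SO-PLS sketch: one PLS model per block, fitted on the Y residuals."""
    blocks = [Xb.copy() for Xb in blocks]
    F, all_scores = Y.copy(), []
    for b, n_lv in enumerate(n_lvs):
        T_b, F = pls_scores(blocks[b], F, n_lv)
        all_scores.append(T_b)
        for k in range(b + 1, len(blocks)):      # equation 5.2
            coef = np.linalg.lstsq(T_b, blocks[k], rcond=None)[0]
            blocks[k] = blocks[k] - T_b @ coef
    return all_scores, F

# Hypothetical two-block example in which X is partly correlated with Z
rng = np.random.default_rng(3)
Z = rng.normal(size=(60, 6))
X = 0.5 * Z[:, :3] + rng.normal(size=(60, 3))
Y = Z[:, :1] + X[:, 1:2] + 0.1 * rng.normal(size=(60, 1))
(T_Z, T_X), F = so_pls([Z, X], Y, n_lvs=[2, 1])
```

The number of components is passed per block (n_lvs), and the scores of the second block are orthogonal to those of the first because X is orthogonalized before its PLS model is fitted.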
5.2.3 Proposed algorithm: the Sequential Multi-block PLS (SMB-PLS)
The proposed multi-block method is called the Sequential Multi-block PLS (SMB-PLS).
The schematic of the method is presented in Figure 33. It is presented for two blocks only,
but the algorithm can be applied to any number of blocks.
An orthogonalization procedure according to a sequential pathway is introduced into the
MB-PLS framework, while keeping the super level and block level structure for ease of
interpretation. The first step of the algorithm is to compute the first block weight w1T by the
regression of an initial Y score u onto X1. Then, in order to differentiate the correlated
information from the orthogonal information, the subsequent blocks Xb are split using the
following equation:
For b = 1, 2, …, B-1 and for k = 1, 2, …, B-b
$$X_{b+k}^{corr} = T_b \left(T_b^{T} T_b\right)^{-1} T_b^{T} X_{b+k} \qquad (5.3)$$
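Equations 5.2 and 5.3 are complementary projections: the correlated and orthogonal parts add up to the original block. A minimal numerical check is sketched below, assuming NumPy and hypothetical random scores and data (the function name split_block is illustrative):

```python
import numpy as np

def split_block(T_b, X_next):
    """Split X_next into parts correlated with and orthogonal to scores T_b."""
    coef = np.linalg.solve(T_b.T @ T_b, T_b.T @ X_next)   # (T^T T)^-1 T^T X
    X_corr = T_b @ coef                                   # equation 5.3
    X_orth = X_next - X_corr                              # equation 5.2
    return X_corr, X_orth

rng = np.random.default_rng(2)
T_b = rng.normal(size=(30, 2))                            # hypothetical scores of block b
X_next = rng.normal(size=(30, 4)) + T_b @ rng.normal(size=(2, 4))
X_corr, X_orth = split_block(T_b, X_next)

# X_corr + X_orth recovers X_next, and X_orth is orthogonal to the scores
residual = np.abs(T_b.T @ X_orth).max()
```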
The block scores for the subsequent blocks are computed by regressing u onto Xbcorr to
obtain the block weights wbcorr,T. Then the X block scores t1, ..., tB are combined in the super
level score matrix T as in the MB-PLS method. The last step is the computation of a PLS cycle
between u and T to compute the super level weights (wTT) and super scores tT. This
computation cycle is repeated until convergence on tT. Deflation using the super score is
then performed. Once all the information from X1 has been explained, the same
methodology is applied to the subsequent blocks. Since only the correlated information
with the previous block was removed by the deflation step, the components for the
subsequent block will only model new information not explained by the previous block
components.
Figure 33 – The SMB-PLS algorithm for two X blocks
The advantages of this method are that a different number of components can be used for
each block and it also enables the visualization of between blocks correlated information.
A different number of components can be computed for each block since only the correlated
information is removed after the deflation for the subsequent block. This leaves orthogonal
(i.e. new) information in the Xb blocks to further explain variations in Y by additional
components. Finally, for each LV, block scores and loadings are computed for each block
to enable interpretation of relationships between variables, outlier detection, visualization
of clustering patterns and so on. The super scores also give important information on the
correlation structure between the blocks.
For b = 1, ..., B-1
   1. Set u to any column of Y.
   2. Start convergence loop.
      2.1. w_b = X_b^T u / (u^T u)
      2.2. w_b = w_b / (w_b^T w_b)^½
      2.3. t_b = X_b w_b
      For k = 1, ..., B-b
         2.4. X_{b+k}^corr = t_b (t_b^T t_b)^-1 t_b^T X_{b+k}
         2.5. w_{b+k} = X_{b+k}^{corr,T} u / (u^T u)
         2.6. w_{b+k} = w_{b+k} / (w_{b+k}^T w_{b+k})^½
         2.7. t_{b+k} = X_{b+k}^corr w_{b+k}
      End
      2.8. T = [t_b ... t_B]
      2.9. w_T = T^T u / (u^T u)
      2.10. w_T = w_T / (w_T^T w_T)^½
      2.11. t_T = T w_T / (w_T^T w_T)
      2.12. q = Y^T t_T / (t_T^T t_T)
      2.13. u = Y q / (q^T q)
      2.14. Check for convergence of t_T or u. Continue to step 3 if converged.
   For k = b, ..., B
      3. p_k = X_k^T t_T / (t_T^T t_T)
      4. E_k = X_k - t_T p_k^T
   End
   5. F = Y - t_T q^T
   6. Store w_b, p_b, t_b, t_T and u as new columns in W, P, T, T_T and U.
   7. Restart at step 1, replacing X_b by E_b and Y by F.
   8. When the information in block X_b is depleted, increment b and start at step 1.
End
9. When b = B, compute a normal PLS between E_B and F.
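One possible reading of the listing above is sketched below in Python. It assumes NumPy and centred, block scaled data, and it replaces the "information depleted" test of step 8 by a user-supplied number of components per block (n_lvs). The function names and simulated data are hypothetical, and this sketch should not be taken as the code used to produce the results of this chapter.

```python
import numpy as np

def _unit(v):
    """Scale a vector to length one (left unchanged if it is all zeros)."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def smbpls_component(blocks, Y, tol=1e-10, max_iter=500):
    """One SMB-PLS latent variable driven by blocks[0] (steps 1 to 5)."""
    u = Y[:, [0]].copy()                                         # step 1
    t_T = np.zeros_like(u)
    for _ in range(max_iter):
        w = [_unit(blocks[0].T @ u / (u.T @ u))]                 # 2.1, 2.2
        t = [blocks[0] @ w[0]]                                   # 2.3
        for Xk in blocks[1:]:
            X_corr = t[0] @ (t[0].T @ Xk) / (t[0].T @ t[0])      # 2.4
            w.append(_unit(X_corr.T @ u / (u.T @ u)))            # 2.5, 2.6
            t.append(X_corr @ w[-1])                             # 2.7
        T = np.hstack(t)                                         # 2.8
        w_T = _unit(T.T @ u / (u.T @ u))                         # 2.9, 2.10
        t_new = T @ w_T                                          # 2.11
        q = Y.T @ t_new / (t_new.T @ t_new)                      # 2.12
        u = Y @ q / (q.T @ q)                                    # 2.13
        done = np.linalg.norm(t_new - t_T) < tol * np.linalg.norm(t_new)
        t_T = t_new
        if done:                                                 # 2.14
            break
    E = [Xk - t_T @ (t_T.T @ Xk) / (t_T.T @ t_T) for Xk in blocks]  # 3, 4
    F = Y - t_T @ q.T                                            # step 5
    return {"w": w, "t": t, "w_T": w_T, "t_T": t_T, "q": q, "E": E, "F": F}

def smb_pls(blocks, Y, n_lvs):
    """Extract n_lvs[b] components per block, following the block order."""
    E, F, super_scores = [Xb.copy() for Xb in blocks], Y.copy(), []
    for b, n_lv in enumerate(n_lvs):
        for _ in range(n_lv):
            out = smbpls_component(E[b:], F)    # last block: ordinary PLS (step 9)
            E[b:], F = out["E"], out["F"]
            super_scores.append(out["t_T"])
    return np.hstack(super_scores), E, F

# Hypothetical two-block example in which X is partly correlated with Z
rng = np.random.default_rng(3)
Z = rng.normal(size=(60, 6))
X = 0.5 * Z[:, :3] + rng.normal(size=(60, 3))
Y = Z[:, :1] + X[:, 1:2] + 0.1 * rng.normal(size=(60, 1))
T_T, E, F = smb_pls([Z, X], Y, n_lvs=[2, 1])
```

In this sketch the components driven by Z also deflate X with the same super score, so the X components extracted afterwards only model what is left, and all super scores in T_T are mutually orthogonal.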
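[Schematic of Figure 33: for the first block components, u is regressed on X1 and on X2corr at the block level (w1T, w2corr,T, t1, t2corr,T), and the block scores are combined in T at the super level (wTT, tT, qT, p1T, p2T), with step numbers 2.1 to 2.13 annotated; for the second block components, a PLS model is built between E2 and F (u, w2T, t2, p2T, qT).]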
This algorithm has other interesting properties. First, both the block scores and super
scores are orthogonal. In MB-PLS, the block scores are correlated. In fact, the information
captured by the block scores is the same as that captured by the super scores for each latent
variable; only the numerical values differ. It is therefore only necessary to use either the super
scores or the block scores for interpretation. As for the weights, the block weights are
important for the interpretation of what is going on inside each block, but the super
weights are equally important since they provide information about how the information
extracted from each block is distributed among the components of the model.
5.3 Description of the datasets used for the case studies
This section describes the two datasets used to illustrate the properties of the new SMB-
PLS algorithm and compare them against the traditional MB-PLS and SO-PLS methods.
The first dataset was obtained from a simulated polymer film blowing process. The second
is an industrial dataset collected from an anode manufacturing process.
5.3.1 Simulated data from film blowing process
The data used in this part of the thesis were retrieved from a simulated polymer film
blowing process presented by Duchesne (Duchesne 2000) and Duchesne & MacGregor
(Duchesne & MacGregor 2004). Two case studies are used to illustrate the differences in
the multi-block methods. The first case represents the situation when there is no
correlation between the raw material variations and process variations. That is, the
variability in the raw material does not have an effect on the process parameters and there
is no between block correlation. In the second case, correlation between the raw material
and the process variables is introduced through a feedforward control strategy. The final
section of the film blowing process showing the bubble inflation and cooling is presented in
Figure 34.
Polymer films are manufactured by an extrusion process. The material is melted and
mixed in a screw extruder and passed through a hollow circular die. As the polymer melt
reaches the die, air is blown inside the extruded film to maintain a given inflation pressure.
This creates a bubble-shape film of a desired diameter. The film is then cooled by blowing
air at a given temperature on its outer surface.
Figure 34 – Simulated end section of a film blowing process
(adapted from (Duchesne 2000))
The data is organized in three different blocks. The raw material block (Z) contains the raw
polymer properties consisting of ten temperature dependent viscosities (η), the heat
capacity (Cp) and the density (ρ). The second block X contains process variables. The
process manipulated variables are the polymer flow rate (Q) and cooling air flow rate. The
cooling air flow rate is however represented by the maximum local heat transfer coefficient
along the bubble (h0). The ambient air temperature (Ta), a measured process disturbance,
is also included in the X block. Two film quality variables (Y) are measured in this
simulation. The first is the film frost line height (FLH), which is the position along the film
beyond which cooling no longer affects the film properties (it affects film crystallinity). The
second is the full stress in the machine direction (FMDS), taken beyond the FLH. It is
a measure of the mechanical strength of the film in the extrusion direction.
For both case studies, 50 lots of raw materials were simulated by introducing variations in
the polymer resin properties (viscosity curve, heat capacity and density). These were
processed in the simulator according to some operating policies. Details of the simulation
are available in Duchesne’s Ph.D. thesis (Duchesne 2000).
5.3.1.1 First case – No correlation between raw materials and process data
The first case study illustrates the properties of the different multi-block methods when
there is no correlation between the regressor blocks. Variations in both blocks affect final
product properties (Y), but no correlation exists between the two regressor blocks (i.e. Z
and X). In this case, random variations were added to raw material properties (Z) and to
process variables (X) Q, h0, and Ta to simulate the effect on the product quality.
[Labels of Figure 34: die; molten polymer (Cp, ρ, η(T) and flow rate Q); cooling air (Ta, h0); FLH.]
There is, however, correlated information inside each block. In the Z block, the viscosities
(η) are correlated, but the variations in ρ and Cp are independent. The correlations in X are
due to the feedforward control actions implemented to attenuate variations in Ta by
adjusting Q and h0 (control scheme correcting for a process disturbance and not for
changes in resin properties).
5.3.1.2 Second case – Correlation between raw materials and process data
In this second case study, variations in raw material properties were implemented similarly
as for the first case, but this time process parameters are adjusted by a second
feedforward controller to attenuate variations caused by raw material properties. This
introduces correlations between the raw material and process blocks (Z and X). The
feedforward controller corrects for some of the variability in the polymer heat capacity Cp
by adjusting the flow rate Q. The existing feedforward control for Ta is modified to use only
h0 as the manipulated variable.
5.3.2 Industrial data from the anode manufacturing process
The anode manufacturing process has been described in detail in Chapter 1
of this thesis. Figure 35 presents a non-exhaustive list of the variables
included in each data block.
The ordering of the blocks is chosen to represent the natural order of the process units.
The first block (Z) includes the properties of all three types of raw materials (coke, pitch
and anode butts). The first process block (X1) contains the variables associated with the
formulation of the paste, that is, the relative amount of each type of material as well as the
particle size distribution of the aggregates. The second process block (X2) represents the
operating conditions measured during the mixing and the vibro-forming of the anode paste.
The operating conditions of the anode baking furnace are stored in the third process block
(X3). Finally, the baked anode quality data obtained from the laboratory (i.e. testing of
anode core samples) are collected in the response block Y. These are listed in Table 14.
Figure 35 – Data blocks collected from the anode manufacturing process (Modified from
(Lauzon-Gauthier et al. 2012))
Table 14 – List of the Y variables used for the anode manufacturing dataset case study
This industrial dataset previously investigated by the author (Lauzon-Gauthier 2011;
Lauzon-Gauthier et al. 2012) was found to be a good candidate for testing the multi-block
algorithms for two main reasons. First, it contains repeated observations in the raw
material block, which makes it possible to assess between block information mixing
(Westerhuis & Smilde 2001). More importantly, some of the blocks are correlated (e.g. raw
material Z and formulation X1) whereas others are not (e.g. raw material Z and baking X3).
Step  Process unit                 Block  Variables
1     Raw materials                Z      Coke density and impurities; pitch physical properties; butts impurities
2     Classification of materials  X1     Aggregate size distribution (shift based); paste formulation
3     Paste mixing & forming       X2     Temperatures; mixing power, etc.; bellows pressure; anode height
4     Baking                       X3     Maximum flue wall temperature; 2 anodes temperature; cycle time, etc.
5     Core sampling                Y      Physical properties
Number  Variable
1       Green anode apparent density
2       Green anode weight
3       Baked anode weight (mean)
4       Thermal conductivity
5       Baked anode apparent density
6       Real density
7       Compressive strength
8       LC
9       Young's modulus
10      Electrical resistivity
5.4 Results and discussion
All datasets were auto-scaled and block scaled prior to the analysis. The original anode
manufacturing process dataset contained missing data. These were imputed by values
estimated using the PLS Toolbox mdcheck function, which uses a PCA model to iteratively
replace the missing data until convergence. This operation was performed on the whole
concatenated dataset (i.e. [Z, X1, X2, X3]).
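The general idea of replacing missing values with a PCA model until convergence can be illustrated with the generic sketch below. It assumes NumPy and a hypothetical rank-2 dataset, and it is not the algorithm implemented in the PLS Toolbox mdcheck function, whose internal details are not described here; the function name pca_impute and its parameters are chosen for the example only.

```python
import numpy as np

def pca_impute(X, n_pc=2, tol=1e-8, max_iter=500):
    """Fill NaN entries by iterating a PCA reconstruction until it stabilizes."""
    X = np.array(X, dtype=float)                 # work on a copy
    missing = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[missing] = np.take(col_means, np.nonzero(missing)[1])   # initial guess
    for _ in range(max_iter):
        mean = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        X_hat = (U[:, :n_pc] * s[:n_pc]) @ Vt[:n_pc] + mean   # PCA reconstruction
        change = np.linalg.norm(X_hat[missing] - X[missing])
        X[missing] = X_hat[missing]              # update only the missing cells
        if change < tol:
            break
    return X

rng = np.random.default_rng(4)
X_true = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 6))   # hypothetical rank-2 data
X_miss = X_true.copy()
X_miss[[0, 5, 9], [1, 3, 4]] = np.nan                          # three missing entries
X_filled = pca_impute(X_miss, n_pc=2)
```

Only the missing cells are modified; the observed values are left untouched at every iteration.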
5.4.1 Selecting the number of components
Selecting the number of components in any latent variable method is very important for the
interpretation but also for the application of the models. The commonly used cross-validation
technique was described in section 2.4 of this thesis.
To compare the prediction ability and the interpretation obtained with the three multi-block
methods, the same criterion was used to select the number of components. This is
especially important for sequential algorithms such as SMB-PLS and SO-PLS since the number
of components captured after each block will have an effect on the orthogonalization and
thus the information left in the subsequent blocks.
The selected criterion is a modified version of the Q2 statistic as defined in the ProMV
software and in (Wold, Sjöström, et al. 2001). It is based on the prediction ability of a
model obtained on an external validation dataset instead of using the traditional cross-
validation procedure. Equation 5.4 describes the statistic. An increasing Q2 value (i.e.
predictive ability) from one LV to the next indicates the significance of the added
component.
$$Q^2Y_a = 1 - \frac{\mathrm{PRESS}_a}{SS_Y} \qquad (5.4)$$
Where a corresponds to the number of latent variables and SSY is the total sum of squares
of the Y data. PRESSa is the prediction error sum of squares calculated based on the
external validation dataset and a model containing a latent variables:
$$\mathrm{PRESS}_a = \sum_{i=1}^{I} \sum_{h=1}^{H} \left(y_{ih} - \hat{y}_{ih,a}\right)^2 \qquad (5.5)$$
In equation 5.5, I and H respectively correspond to the number of observations in the
external validation dataset and the number of variables in Y.
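A small worked example of equations 5.4 and 5.5 is sketched below, assuming NumPy. The function name q2y and the toy validation data are hypothetical, and the sketch assumes that SSY is computed on the mean-centred Y data of the external validation set unless a value is supplied.

```python
import numpy as np

def q2y(Y_val, Y_hat, ss_y=None):
    """Q2Y = 1 - PRESS / SS_Y for predictions on an external validation set."""
    Y_val, Y_hat = np.asarray(Y_val, float), np.asarray(Y_hat, float)
    press = np.sum((Y_val - Y_hat) ** 2)          # equation 5.5
    if ss_y is None:
        ss_y = np.sum(Y_val ** 2)                 # assumes mean-centred Y
    return 1.0 - press / ss_y                     # equation 5.4

# I = 3 validation observations, H = 2 response variables
Y_val = np.array([[1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]])
Y_hat = np.array([[0.5, -1.0], [-1.0, 0.5], [0.0, 0.0]])
q2 = q2y(Y_val, Y_hat)   # PRESS = 0.5 and SS_Y = 4, so Q2Y = 0.875
```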
The root mean square error of prediction (RMSEP) defined in the following equation is also
used to choose the number of LVs. Usually the RMSEP decreases rapidly and then
stabilizes. The LV at which the RMSEP stops decreasing indicates that no more
information in X can be used to improve predictions of Y.
$$\mathrm{RMSEP}_{h,a} = \sqrt{\frac{\sum_{i=1}^{I} \left(y_{i} - \hat{y}_{i,a}\right)^2}{n}} \qquad (5.6)$$
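Equation 5.6 can be illustrated with the short sketch below, assuming NumPy, toy validation data and that n is the number of observations in the validation set; the function name rmsep is chosen for the example.

```python
import numpy as np

def rmsep(y_val, y_hat):
    """Root mean square error of prediction for each Y variable (column)."""
    y_val, y_hat = np.asarray(y_val, float), np.asarray(y_hat, float)
    return np.sqrt(np.mean((y_val - y_hat) ** 2, axis=0))

y_val = np.array([[1.0, 2.0], [3.0, 4.0]])
y_hat = np.array([[1.0, 0.0], [3.0, 2.0]])
errors = rmsep(y_val, y_hat)    # first variable: 0.0, second variable: 2.0
```

Since the Y variables are auto-scaled, an RMSEP close to one means that the prediction error is about as large as the standard deviation of the variable itself.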
The number of latent variables of the multi-block PLS models, selected based on the Q2Y
and RMSEP statistics, was set to the smallest number that meets one of the following two
criteria: the last LV (i.e. a-1) before the increase in Q2Y falls below 0.01 (i.e. 1% of additional
prediction performance), or the first LV at which the RMSEP of all Y variables stops decreasing.
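One way to express this stopping rule in code is sketched below. It is an interpretation of the two criteria rather than the exact procedure used in this work, and the function name, the 0.01 default and the example statistics are hypothetical.

```python
def select_n_lv(q2, rmsep, min_gain=0.01):
    """Pick the number of LVs from Q2Y and per-variable RMSEP sequences.

    q2[a-1] is Q2Y with a LVs; rmsep[a-1] lists the RMSEP of each Y variable.
    """
    for a in range(1, len(q2)):
        # Criterion 1: adding LV a+1 increases Q2Y by less than min_gain
        small_gain = q2[a] - q2[a - 1] < min_gain
        # Criterion 2: no Y variable sees its RMSEP decrease with LV a+1
        no_rmsep_drop = all(new >= old for new, old in zip(rmsep[a], rmsep[a - 1]))
        if small_gain or no_rmsep_drop:
            return a            # keep a LVs
    return len(q2)

# Hypothetical statistics for 1 to 5 LVs and two Y variables
q2 = [0.40, 0.62, 0.71, 0.715, 0.72]
rmsep = [[0.9, 0.8], [0.7, 0.6], [0.5, 0.55], [0.5, 0.54], [0.5, 0.54]]
n_lv = select_n_lv(q2, rmsep)   # the gain from 3 to 4 LVs is 0.005, so 3 LVs
```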
To apply this procedure, both the simulated film blowing datasets and the anode
manufacturing data were split in a calibration and a validation set. The calibration set was
formed by selecting randomly two-thirds of the original dataset. The remaining data was
used as the external validation set. To make sure that both datasets spanned the same
range of variations, a PCA model was built on the calibration data (i.e. the concatenation
of Z, all X’s and Y) and the prediction set was then projected onto the model. The score
plots, residuals and T2 were checked to make sure that both datasets spanned the same
sub-space.
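The splitting and checking procedure can be sketched as follows, assuming NumPy, a hypothetical concatenated data matrix and a PCA model computed by singular value decomposition. The function name, the number of principal components and the random data are assumptions made for the illustration.

```python
import numpy as np

def split_and_check(data, n_pc=2, seed=0):
    """Random 2/3 calibration split, then project the validation set on a PCA model."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_cal = round(2 * len(data) / 3)
    cal, val = data[idx[:n_cal]], data[idx[n_cal:]]
    mean, std = cal.mean(axis=0), cal.std(axis=0, ddof=1)
    cal_s, val_s = (cal - mean) / std, (val - mean) / std     # scale with calibration stats
    _, _, Vt = np.linalg.svd(cal_s, full_matrices=False)
    P = Vt[:n_pc].T                                           # PCA loadings
    t_cal, t_val = cal_s @ P, val_s @ P                       # scores
    t2_val = np.sum(t_val ** 2 / t_cal.var(axis=0, ddof=1), axis=1)   # Hotelling T2
    spe_val = np.sum((val_s - t_val @ P.T) ** 2, axis=1)      # squared residuals
    return cal, val, t2_val, spe_val

rng = np.random.default_rng(5)
data = rng.normal(size=(60, 5))          # hypothetical concatenation of Z, X and Y
cal, val, t2_val, spe_val = split_and_check(data)
```

Validation observations with unusually large T2 or residual values compared with the calibration set would indicate that the two sets do not span the same sub-space.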
Note that the SO-PLS and SMB-PLS algorithms allow for a different number of LVs to be
selected for each regressor block, as opposed to the traditional MB-PLS for which the
number of LVs is the same for all the blocks. Hence, the procedure discussed above for
selecting the number of LVs applies to each regressor block sequentially for SO-PLS and
SMB-PLS. The number of components could also be selected by computing the Q2Y and
RMSEP for all the possible combinations of LVs for each block and selecting the optimum
as discussed in (Næs et al. 2011). In this study, the number of components was selected
sequentially for each block for the 3 methods.
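For comparison, the global selection strategy amounts to an exhaustive search over all combinations of LVs. The sketch below uses only the Python standard library; the score function would in practice fit a model and return its Q2Y, and it is replaced here by a hypothetical toy function.

```python
from itertools import product

def global_lv_search(max_lvs, score):
    """Return the LV combination (one entry per block) with the highest score."""
    grid = product(*(range(1, m + 1) for m in max_lvs))
    return max(grid, key=score)

# Hypothetical Q2Y-like surface peaking at 8 LVs for Z and 3 LVs for X
def toy_score(lvs):
    return -((lvs[0] - 8) ** 2 + (lvs[1] - 3) ** 2)

best = global_lv_search([10, 3], toy_score)    # (8, 3)
```

The number of models to fit grows as the product of the maximum number of LVs of each block, which is one reason for preferring the sequential selection when there are many blocks.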
5.4.2 Results for the film blowing example
This section describes the results for the film blowing example. The number of LVs was
first selected for the six multivariate models obtained by estimating the 3 multi-block
models (i.e. MB-PLS, SO-PLS and SMB-PLS) on the datasets of the 2 case studies. The
prediction performances of the models for case 1 are shown in Figure 36 and for case 2 in
Figure 39. The models were then interpreted and the distribution of the information
captured from each block in each latent variable is discussed in order to illustrate the
properties of the algorithms.
Figure 36 shows both Q2Y statistics as well as the RMSEP for FLH and FMDS for the MB-
PLS model built on the case 1 data (no correlation between Z and X). A total of 6 LVs were
selected for this model because the increase in Q2Y is less than 0.01 and the RMSEP does
not decrease for either Y-variable after 6 components.
Figure 36 – Q2Y and RMSEP statistics for selecting the number of components of the MB-
PLS algorithm for case 1 (Z and X are orthogonal)
For both sequential methods, the selection of the number of components is performed
sequentially and the results are presented in separate figures for each block. Figure 37
shows the statistics for the SO-PLS method.
[Plot: x-axis LV (1 to 10); y-axis Q2 and RMSEP; legend: Q²Y, RMSEP FLH, RMSEP FMDS.]
Figure 37 – Q2Y and RMSEP statistics for selecting the number of components of the SO-PLS
model for case 1
Both statistics indicate that 4 components should be used for the raw material block (Z).
For the second block (X), 2 latent variables are selected. This choice was based on the
Q2Y statistics which stops increasing at the 2nd LV. The maximum numbers of components
for the Z and X blocks are 10 and 3, respectively, which corresponds to the number of
variables in each block.
Figure 38 presents the statistics for the SMB-PLS model for case 1. The number of LVs for
the Z block is also 4, as for SO-PLS. The RMSEP for FMDS is low and the Q2Y stops
increasing after 4 components. The number of LVs used for the X block is 3 since it is the
number of variables in the block. In this case, there are at least two phenomena included
in the X block. The first is the variations in Q and the second are the variations of h0 based
on Ta. There are also some random variations due to the simulation included in the
dataset. It is reasonable to assume that this block needs to be modeled by 3 LVs.
[Plot panels: "Z block" (LV 1 to 10) and "X block after 4 Z LV" (LV 1 to 3); x-axis LV; y-axis Q2 and RMSEP; legend: Q²Y, RMSEP FLH, RMSEP FMDS.]
Figure 38 – Q2Y and RMSEP statistics used for selecting the number of components of the
SMB-PLS model for case 1
The selection of the number of components for the second case study (correlated Z and X
blocks) is presented in Figure 39.
Figure 39 – Q2Y and RMSEP statistics used for selecting the number of components for the
MB-PLS algorithm for case 2
As shown in Figure 39, a total of 7 LVs are recommended because the minimum value for
RMSEP is reached for both Y-variables and this also corresponds to the maximum value of
Q2Y.
[Plot panels: "Z block" (LV 1 to 10) and "X block after 4 Z LV" (LV 1 to 3); x-axis LV; y-axis Q2 and RMSEP; legend: Q²Y, RMSEP FLH, RMSEP FMDS.]
[Plot: x-axis LV (1 to 10); y-axis Q2 and RMSEP; legend: Q²Y, RMSEP FLH, RMSEP FMDS.]
Figure 40 – Q2Y and RMSEP statistics used for selecting the number of components of the
SO-PLS model for case 2
The statistics for the SO-PLS model are presented in Figure 40. For the Z block, the
number of LVs was selected to be 3, mainly based on the Q2Y statistic. For the X block, 3
components were also selected.
Figure 41 – Q2Y and RMSEP statistics used for selecting the number of components of the
SMB-PLS model for case 2
[Plot panels: "Z block" (LV 1 to 10) and "X block after 3 Z LV" (LV 1 to 3); x-axis LV; y-axis Q2 and RMSEP; legend: Q²Y, RMSEP FLH, RMSEP FMDS.]
[Plot panels: "Z block" (LV 1 to 10) and "X block after 3 Z LV" (LV 1 to 3); x-axis LV; y-axis Q2 and RMSEP; legend: Q²Y, RMSEP FLH, RMSEP FMDS.]
For the SMB-PLS model (Figure 41), all statistics suggest using 3 components for both
blocks as for SO-PLS.
The film FLH is not very well predicted, since its RMSEP is above 1 for all LVs. In
this case, the error variance of the MB-PLS is better than that of both sequential models. If a
global selection approach is used (i.e. testing all possible combinations of LVs for both
blocks), the FLH RMSEP is minimal for 8 LVs for the Z block and 3 LVs for the X block, with a
value of 1.07 instead of 1.19, compared to 0.47 for the MB-PLS.
Figure 42 presents the total Y variance captured for both cases and all three algorithms.
This figure allows comparing the prediction ability of the different algorithms. Since each
model does not capture the same latent variable space due to the orthogonalization
process, the number of LVs is different for each method. The choice of the number of LV
can have an impact on the comparison of the R2 and Q2 between the methods. But the
same criteria were used to ensure a fair comparison.
Figure 42 – Explained Y variance for the three multi-block methods built on the film
blowing datasets: a) case 1 and b) case 2. Z and X block variance explained and total (i.e.
concatenated regressor blocks) variance explained: c) case 1 and d) case 2
[Figure 42 panels a) and b): bars for MB, SMB and SO; y-axis R2 and Q2 (0 to 100); legend: Cal R2 Y, Val Q2 Y. Panels c) and d): bars for the Total, Z and X blocks; y-axis R2 (0 to 100); legend: MB-PLS, SMB-PLS, SO-PLS.]
Figure 42 (a) presents the explained variance for the three models built on the case 1
dataset. The MB-PLS has the highest R2 of all three algorithms. Based on the validation
Q2, the prediction performances are very similar for all the methods.
For the second case study (Figure 42 b), the MB-PLS model performs significantly better
(explained variance almost 30% higher in calibration and 13% in validation) compared with
the two other methods. This is most probably due to the way the numbers of latent
variables for the sequential methods were selected. If a global search is used for SMB-
PLS, the Q2Y is maximized by using 8 and 3 LVs for the Z and X blocks, respectively. In
this case, the Q2Y for SMB-PLS is 70.3% compared to 77.3% for MB-PLS.
Selecting the number of components of latent variable models having different structures
(i.e. MB-PLS vs sequential methods) is not straightforward. Although the same criteria
were applied to all three multi-block methods to ensure a fair comparison, it is clear that in
the simulated film blowing study, the stopping criteria based on Q2Y and RMSEP statistics
do not allow finding the optimal number of components for the sequential methods, whereas
they do for MB-PLS. However, this situation should not be generalized because all
multi-block methods perform equally well on the anode manufacturing dataset as will be shown
later. The main goal of sequential methods is to improve interpretability of the models and
not prediction ability. A comparative analysis of model interpretability is performed next.
The variance of each regressor block and the total variance (i.e. of the concatenated Z and X
blocks) explained by the multi-block algorithms in both case studies are shown in Figure 42.
For the first case study (Figure 42 c) the variances explained by each algorithm are very
similar, as was the case for the Y-variance (Figure 42 a and b). When there is
no correlation between the blocks, there is no difference between the algorithms.
However, for case 2 presented in Figure 42 d), the correlation between the two blocks has
an impact on the X-variance explained by the models. In this case, MB-PLS captures a
greater percentage of variance compared with the two sequential methods. For SMB-PLS
and SO-PLS, the results are very similar except for the X block. This is due to the
orthogonalization of the X block using the scores of the Z block. In fact, the variance
removed in the second block (X) by the orthogonalization from the SO-PLS is 1.4% for the
first case and 7.9% for case 2. It is higher in case 2 due to the correlation between the
blocks. It is important to note that the variance removed in X (i.e. falling in the LV space of
Z) did not have a significant impact on the prediction of Y because this information is
redundant with that included in Z. However, SO-PLS does not consider between-block
correlations, so the information in X correlated with Z is not available for interpretation.
This explains why MB-PLS and SMB-PLS capture more variance of X than SO-PLS.
Although SMB-PLS performs block orthogonalization, the information in X correlated
with Z is still available for interpretation in the block weights and scores of the previous block
(Z in this case). As argued earlier, between-block correlated information may be important
for process data analysis and interpretation and should not be removed from the analysis.
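The orthogonalization step discussed above can be illustrated with a minimal numerical sketch. It assumes that X and the Z score matrix T are already mean-centered, and that the variance removed is measured as the fraction of the sum of squares of X lying in the column space of T. The function name `orthogonalize_block` and the synthetic data are hypothetical and are not taken from the simulated case studies.

```python
import numpy as np


def orthogonalize_block(X, T):
    """Remove from X the part lying in the column space of the scores T.

    Returns the orthogonalized block and the fraction of the sum of squares
    of X that was removed. X and T are assumed to be mean-centered.
    """
    X = np.asarray(X, dtype=float)
    T = np.asarray(T, dtype=float)
    # Least-squares regression of X on T, then subtraction of the fitted part.
    coefficients, *_ = np.linalg.lstsq(T, X, rcond=None)
    X_orth = X - T @ coefficients
    removed_fraction = 1.0 - np.sum(X_orth ** 2) / np.sum(X ** 2)
    return X_orth, removed_fraction


# Synthetic illustration: the first column of X is partly driven by the Z scores.
rng = np.random.default_rng(0)
T = rng.standard_normal((50, 2))
X = rng.standard_normal((50, 3))
X[:, 0] += 2.0 * T[:, 0]
X_orth, removed = orthogonalize_block(X, T)
print(f"fraction of X sum of squares removed: {removed:.3f}")
```

The removed fraction grows with the correlation between the blocks, which is the behaviour reported above for case 2 relative to case 1.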
How the information captured from each block is distributed in each latent variable of MB-
PLS and SMB-PLS models is shown in Figure 43 for both simulated case studies. The
relative contribution of each block is used to illustrate this point. It is calculated as the
square of the super weight of a given block in each LV. Since both MB-PLS and SMB-PLS
have a similar hierarchical structure (i.e. block and super levels) and block contributions
can be computed in the same way, only those two algorithms are compared in Figure 43.
For the SO-PLS algorithm, block contributions could be calculated based on the variance
of Y explained by each model (i.e. block).
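The block contribution defined above can be computed directly from the super weights. The sketch below assumes that the super weights are stored as one list per LV with one entry per block, and it rescales the squared weights so that they sum to one within each LV (a no-op when the super weight vector already has unit length). The function name and the example values are hypothetical.

```python
def relative_block_contributions(super_weights):
    """Return squared super weights per LV, rescaled to sum to 1 in each LV.

    super_weights: list of LVs, each a list with one super weight per block.
    """
    contributions = []
    for lv_weights in super_weights:
        squared = [w * w for w in lv_weights]
        total = sum(squared)
        if total == 0:
            raise ValueError("all-zero super weights give no defined contribution")
        contributions.append([s / total for s in squared])
    return contributions


# Hypothetical super weights for two blocks (Z, X) and three LVs.
example = [[0.6, 0.8], [1.0, 0.0], [-0.8, 0.6]]
for lv, contrib in enumerate(relative_block_contributions(example), start=1):
    print(f"LV{lv}: Z={contrib[0]:.2f}, X={contrib[1]:.2f}")
```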
As shown in Figure 43 a) and b), MB-PLS extracts information from all the blocks in every
component and thus distributes the information among all latent variables no matter the
correlation structure between the blocks. Even when the blocks are orthogonal to each
other (Figure 43 a) the LVs capture information from both blocks. However, SMB-PLS
distributes the information differently depending on the between block correlation structure.
For the case where Z and X are uncorrelated (Figure 43 c) the first 4 LVs extract
information from the raw material block and almost nothing from the process block. The
latter is modeled by subsequent LVs. On the other hand, when Z and X are correlated to
some extent (Figure 43 d) the third component (Z-3) clearly captures correlated
information between the two blocks whereas the first two (Z-1 and Z-2) and the last two
(X-1 and X-2) focus on the orthogonal information in Z and X, respectively.
Figure 43 – Relative importance of each block by LV for: a) MB-PLS case 1, b) MB-PLS
case 2, c) SMB-PLS case 1 and d) SMB-PLS case 2
The loading plot presented in Figure 44 shows that component Z-3 essentially captures
the feedforward control adjustments made on the process to compensate for variations in
some raw material properties (i.e. the source of correlated variations between Z and X).
The loading values show that Cp is strongly negatively correlated with Q because when Cp
increases the production rate Q is reduced to mitigate the impact on FLH. Basically, when
the heat capacity increases, more heat needs to be removed from the molten polymer to
reach solidification temperature (i.e. FLH) at given heat transfer conditions (h0 and Ta).
Decreasing production rate reduces the heat load and therefore attenuates the effect of
Cp. FLH is positively correlated with Q because the control adjustments are
not perfect (i.e. the effect of Cp is not removed completely).
[Figure 43 plots: relative weights (0 to 1) of the Z and X blocks versus LV, panels a) to d)]
Figure 44 – Loadings of Z, X and Y blocks in the 3rd SMB-PLS component (Z-3) for case 2.
The loadings of the X and Y blocks in the last two SMB-PLS model components (LV4 and
LV5 or X-1 and X-2) for case 2 are shown in Figure 45. Component LV4 captures the
feedforward adjustments made on convective heat transfer (h0) to attenuate the variations
in film properties introduced by environment temperature Ta (main process disturbance).
When Ta increases, the heat transfer rate decreases. Thus h0 needs to be increased, for
instance by increasing the cooling air flow rate, to attenuate the impact of the disturbance on
FLH. The last component (LV5 or X-2) models the impact of additional variations in Ta and
Q on film properties (FLH, FMDS). The information captured by the two X block
components has nothing to do with variations in raw material properties and is therefore
captured in a separate latent variable space. Note that the loadings of the Z block are all
zero in these components because information of raw materials has been captured when
modelling the Z block (LVs 1-3).
[Figure 44 plot: loadings on LV 3 (Z-3) by variable; legend: Z, X, Y]
Figure 45 – Loadings of Z, X and Y blocks in the 4th and 5th SMB-PLS components (X-1 and
X-2) for case 2.
This demonstrates the advantage of the method, which can distinguish between
correlated information, which is explained in the first blocks, and new information from the
subsequent blocks. This is very useful for the interpretation of the models.
5.4.3 Industrial data from the anode manufacturing process
The proposed SMB-PLS algorithm is now applied to the industrial dataset collected from
the anode manufacturing process. The selection of the number of components is
presented first and the predictive ability of the three multi-block algorithms is compared.
Finally, how the information extracted from each block is distributed among the
components is discussed along with model interpretation.
Based on the Q2Y statistic, a total of 5 LVs were selected for the MB-PLS model (Figure
46) because Q2Y increases by less than 1% when a further component is added. According to the
RMSEP, model predictions for a few Y-variables improve slightly beyond 5 LVs but most of
them remain fairly constant. The list of variable names and numbers is given in Table 14.
Since it was decided to stop adding LVs when one of the 2 criteria is met, 5 components
were used for MB-PLS.
[Figure 45 plot: loadings on LV 4 (X-1) versus LV 5 (X-2) for Q, Ta, h0, FLH and FMDS; legend: Z, X, Y]
Figure 46 – Selection of the number of LVs for the MB-PLS model computed from the
anode manufacturing dataset: a) Q2Y and b) RMSEP for all Y variables
The statistics used for selecting the number of LVs for both sequential algorithms are
presented in Figure 47 and Figure 48. Since four regressor blocks are involved in this
application instead of two, it was decided to present only the global cumulative Q2Y
statistics for all the blocks on the same plot rather than showing them for each individual block.
This reduces the number of figures and simplifies the interpretation. Also, the RMSEP
statistics are not presented since RMSEP was not the critical criterion in any of the models.
Figure 47 – Selection of the number of LVs for the SO-PLS model computed from the
anode manufacturing dataset
[Figure 46 plots: Q2Y and RMSEP versus number of LVs (1 to 10). Figure 47 plot: Q2 versus number of LVs (1 to 10); legend: Q²Y Z, Q²Y X1, Q²Y X2, Q²Y X3]
The numbers of components selected for each block when building the SO-PLS model are
shown in Figure 47. Note that the curves in the plot should be interpreted sequentially,
using the same order as the one established for the blocks. Thus, the variance of Y
explained by Z (Q2Y_Z) as a function of the number of latent variables must be used first
to determine the number of Z components. The increase in Q2Y_Z when adding the 6th
component is less than 1%. Therefore 5 LVs were selected for the Z block. The explained
variance for the second block Q2Y_X1 is then used to determine the number of
components for the X1 residuals remaining after orthogonalization of X1 with respect to the
5 latent variables (scores) selected for Z. The Q2Y_X1 value after adding one component
for X1 is the cumulative Y-variance explained after using 5 LVs for Z and 1 LV for X1. Since
the additional variance explained by the latter X1 component increases the cumulative Q2Y
by less than 1%, no components were retained in X1. This means that after
orthogonalization, X1 does not contain new information to explain additional Y-variance.
The curves for Q2Y_X2 and Q2Y_X3 should then be interpreted sequentially in a similar
way as for the first two blocks. At the end of the procedure, the numbers of components
selected for the SO-PLS model are 5, 0, 1 and 2 for the Z, X1, X2 and X3 blocks,
respectively.
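The sequential reading of the Q2Y curves described above can be sketched as follows. The sketch assumes that the 1% threshold applies to the absolute gain in cumulative Q2Y (expressed as a fraction) and that each block curve was computed after fixing the number of LVs of the preceding blocks; in practice the curves of later blocks must be recomputed once the earlier blocks are settled. The function name and the numerical values are hypothetical and were chosen only to reproduce the 5, 0, 1 and 2 pattern; they are not read from Figure 47.

```python
def select_sequential(block_curves, min_gain=0.01):
    """Pick the number of LVs per block, block by block, in the given order.

    block_curves: list of (block name, cumulative Q2Y values) pairs, where each
    value already includes the Y-variance explained by the LVs retained for
    the preceding blocks.
    """
    selected = {}
    baseline = 0.0
    for name, curve in block_curves:
        kept = 0
        for q2 in curve:
            # Stop at the first component whose gain falls below the threshold.
            if q2 - baseline < min_gain:
                break
            kept += 1
            baseline = q2
        selected[name] = kept
    return selected


# Hypothetical cumulative Q2Y curves for the four blocks, in sequence order.
curves = [
    ("Z", [0.20, 0.28, 0.33, 0.36, 0.38, 0.385]),
    ("X1", [0.384, 0.386]),
    ("X2", [0.40, 0.405]),
    ("X3", [0.42, 0.44, 0.445]),
]
print(select_sequential(curves))
```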
Figure 48 – Selection of the number of LVs for the SMB-PLS anode model
The SMB-PLS statistics (Figure 48) are very similar to those of SO-PLS shown in
Figure 47. Hence, the numbers of LVs selected for the blocks are also 5, 0, 1 and 2. The variance of
each regressor block and of Y explained by the algorithms is presented in Figure 49. The
total variance of Y explained by each algorithm on the training and validation datasets is
very similar (Figure 49 a). Hence, the sequential algorithms (SMB-PLS in particular) do not
lead to any loss of information. The main differences are the extent to which the algorithms
use each regressor block and how the information is distributed in the various latent
variables.
The variance of each regressor block explained by the algorithms is compared in Figure
49 b). Both sequential algorithms make a slightly greater use of the Xb blocks to model Y
since the total variance explained by these algorithms is about 6% higher compared with
MB-PLS (total variance is the sum of the variance explained of each block). This is most
likely due to the fact that the number of latent variables is different for each block in
sequential algorithms. The Z block contributes the most in all three methods, which is
expected because the anode manufacturing process is strongly driven by raw material
variability. However, the variance of Z explained by MB-PLS is lower than with the
sequential methods: MB-PLS tends to capture information from all the blocks
more evenly because of the existing correlations between the blocks (i.e. Z, X1 and X2).
Sequential methods seem more selective because the effect of Z on subsequent blocks is
captured. This is consistent with observations made on the simulation dataset. The
greatest difference between the sequential methods is how they model the subsequent
blocks. The variance of X1, X2, and X3 explained by SMB-PLS is higher by 18%, 30% and
5%, respectively, mainly because SO-PLS ignores the between block correlated
information as opposed to SMB-PLS which keeps it in the model.
How the information contained in the regressor blocks is distributed in each latent variable
of MB-PLS and SMB-PLS models is shown in Figure 49 c) and d), respectively. Again,
each MB-PLS component captures information from all the blocks in different proportions,
even if X3 is almost uncorrelated with the previous blocks in the sequence (i.e. LVs 3-5
explain variance from all four blocks, including X3). However, the first 5 latent variables of
SMB-PLS (Z-1 to Z-5) concentrate on raw material variations and their impact on
subsequent blocks, X1 and X2 in particular. Anode paste formulation (X1) is typically
adjusted according to changes in raw material properties (i.e. control actions) and, in turn,
raw material properties and formulation affect the mixing and forming process units (X2).
The first 5 LVs capture the correlations between these blocks all originating from variations
in raw material properties. No additional component was found significant for the X1 block
after exhausting information from Z. Hence, all the X1 variations relevant to Y fall in the
space of Z (correlated information) and the additional (orthogonal) information left in X1 did not
contribute to explaining more variance of Y. The 6th latent variable (X2-1) captures
additional information in the mixing and forming block (orthogonal to raw materials) that
improved Y predictions. Finally, the last two components (X3-1 and X3-2) focus on the
baking block (X3) exclusively. SMB-PLS clearly shows that X3 is nearly orthogonal to the
other blocks because components 1-6 explain almost no variance of X3 and LVs 7-8
extract information from that block only. On the other hand, it would be difficult to draw
similar conclusions using the MB-PLS results since the information captured from all the
regressor blocks is distributed within most components.
Figure 49 – Results obtained with the multi-block algorithms on the anode manufacturing
dataset: a) R2Y and Q2Y for all methods, b) overall R2X by block for all methods, relative
weights (bars) and block variance explained R2X (lines) by LV for c) MB-PLS and d) SMB-
PLS
The enhanced interpretation ability provided by SMB-PLS is now discussed in terms of the
relationships identified between the variables included in different data blocks. Loading bi-
plots will be used to illustrate the similarities and differences between SMB-PLS and MB-
PLS. Comparing the interpretation ability of these two methods is not straightforward
because, in general, there may not be a direct correspondence between the latent
variables of the two approaches (e.g. LV-1 of MB-PLS and SMB-PLS do not necessarily
extract the same information). Nevertheless, after careful analysis of the latent variables in
each model, it was possible to find five pairs of latent variables (i.e. one in each method)
explaining similar variance of the regressor blocks (Z and Xb’s) and Y. These comparisons
are presented in Figures 50-53. Note that SO-PLS is not included in the comparative
analysis because it provides a similar interpretation to SMB-PLS for the orthogonal
information between the blocks, but no interpretation of the between-block correlations is
possible because this information is completely removed from the model.
The loading bi-plot of the first two SMB-PLS components (i.e. Z-1 and Z-2) is shown in
Figure 50. These components were calculated using Z as the first block in the sequence,
which means that they capture the relationships between Z and Y and the variations in the
subsequent blocks (X1, X2 and X3) that are correlated with the block scores of Z. Hence,
the block loadings are calculated for each block in this modelling step (just as in MB-PLS),
even for the X1 block (i.e. block with no component in the model) since the variance
contained in X1 that is relevant for Y falls in the latent variable space of Z. Note that some
variable names and the loadings of X3 are not shown in order to declutter the loading bi-
plot. The first two Z components do not explain much variance from the X3 block.
Figure 50 – Bi-plot of the block weights and Y loadings for the first two components (Z-1 and
Z-2) of the SMB-PLS model built on the anode manufacturing dataset
[Figure 50 plot: LV1 (Z-1) versus LV2 (Z-2); legend: Z, X1, X2, Y; the groups of variables discussed in the text are marked A to D]
The loading bi-plot essentially reveals how the anode paste formulation (X1) was adjusted
in response to variations in raw material properties due to supplier changes (Z) in order to
achieve the desired green anode density or GAD (Y-variable). The GAD is mainly affected
by the properties of the coke aggregate mix (i.e. composition, size distribution and coke
properties) as well as the amount of pitch used to formulate the paste. The latter is
typically adjusted in so-called pitch optimization experiments which are performed
periodically and also every time the coke and/or pitch supplier changes. Hence, the coke
aggregate mix and pitch properties are the main disturbances affecting GAD whereas
the amount of pitch is the manipulated variable used to correct for these disturbances.
The pitch quinoline insoluble or pitch QI (Z) is the main pitch property requiring the amount
of pitch (X1) to be adjusted. When QI increases, the pitch has a reduced wetting capacity
and so this is compensated by adding more pitch to the formulation in a proportional
manner. The relationship between pitch QI and pitch demand is well characterized in the
literature (Hulse 2000). The positive correlation between pitch % and pitch QI caused by
the feedforward control adjustment is clearly shown in Figure 50 (ellipses labelled A).
The aggregate mix composition (i.e. proportions of butts, coarse and fine particles in the
blend) varies significantly at this particular plant and simultaneously with coke supplier
changes due to plant design and operating policies. Since the dimensions of the green
anodes are fixed, fluctuations in coke density occurring when the supplier changes affect the
anode weight and, in turn, the amount of butts returning from the potroom after the anodes
are partially consumed by the reduction reaction. Since there is no inventory for crushed
butts particles at this particular plant, more butts are included in the formulation when the
coke has a higher density. Hence, the amount of butts in the recipe is correlated with coke
supplier changes and this explains why this relationship is extracted in this modeling step
(ellipses labelled B). The loading bi-plot (Figure 50) shows that the amount of butts is
negatively correlated with the amount of coarse and fine coke fractions. Butts particles are
generally coarser and so they replace coarse coke particles in the aggregate mix in order
to obtain the desired size distribution. The role of fine coke particles is to fill the pores of
the coarser coke particles and the voids in the aggregate mix to ensure high anode
density. Since butt particles are less porous than coke particles (Fischer & Perruchoud
1991), fewer fines are required when more butts are added to the mix.
The bi-plot shows that the amount of pitch is adjusted in a positively correlated fashion
with the amount of butts in the mix and in the opposite direction to the fines ratio (rectangles
labelled C). These relationships are counterintuitive with respect to process knowledge
and the literature on this subject (Belitskus 1978, p.78; Belitskus 1993). At this point, the
reader is reminded that the model was built on routine production data and no designs of
experiments were performed. Therefore, the reader should not interpret the results based
on known cause and effect relationships between each regressor and response variable
as if the regressor variables were changed according to an orthogonal experimental
design. The amount of pitch used in the formulation is the result of the pitch optimization
experiments (i.e. some form of feedback control) compounding all the changes occurring
in the aggregate mix affecting pitch demand. It is therefore difficult to explain exactly why
the amount of pitch was changed in this way. However, the scatter plot of the amount of
pitch used in the formulation against the proportion of fines in the mix presented in Figure
51 may shed some light on this issue. It is clearly shown that the negative correlation
between these two variables is driven by the sources of raw materials (i.e. coke and pitch
suppliers) which suggests that unmeasured changes in some properties of fine coke
particles (perhaps Blaine number) may have modified the pitch demand. It is also
interesting to note the positive slope between pitch % and fines % for most of the
individual blends, which is consistent with process knowledge. That is, more fines in the
formulation require more pitch for a given coke type because of the higher surface area of
the fines to be coated by the pitch. Finally, it is important to point out that this interpretation
could not be obtained from the SO-PLS model because the relationship between pitch %
and fines % (X1) is correlated with the raw materials (Z) and this information would be
removed by the block orthogonalization procedure.
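The contrast between the overall negative correlation and the positive slope within individual blends can be reproduced with a small synthetic example. The numbers below are entirely hypothetical (they are not plant data) and were constructed so that pitch rises with fines within each blend while blends with more fines sit at a lower pitch level; the helper `slope` is an ordinary least-squares slope.

```python
def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Synthetic blends: (fines %, pitch %) pairs, hypothetical values only.
blends = {
    "blend A": ([20.0, 21.0, 22.0], [15.0, 15.2, 15.4]),
    "blend B": ([24.0, 25.0, 26.0], [14.0, 14.2, 14.4]),
    "blend C": ([28.0, 29.0, 30.0], [13.0, 13.2, 13.4]),
}
within = {name: slope(fines, pitch) for name, (fines, pitch) in blends.items()}
all_fines = [v for fines, _ in blends.values() for v in fines]
all_pitch = [v for _, pitch in blends.values() for v in pitch]
pooled = slope(all_fines, all_pitch)

print("within-blend slopes:", within)
print("pooled slope:", pooled)
```

In this construction the pooled slope is negative although every within-blend slope is positive, which is why the pooled loading should not be read as a cause and effect relationship.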
Additional interesting information can be extracted from Figure 50. One example is the
relationship between the size distribution of the raw coke and the size distribution of the
classified particles shown by the ellipses labelled D. This means that a change in the raw
material particle size has an effect on the efficiency of the screening and classification of the
particles in the three coke fractions. An increase in coke particle size increases the amount
of coarse in the coarse fractions (e.g. RT4 and RT8 particles) but lowers the amount of
intermediate size particles (i.e. RT50 and RT100) in the intermediate fractions. Another
example is with the thermal oil temperature in the first mixer (MX1) and the power
consumption of the second mixer (MX2), both variables included in the X2 block. These
variables are correlated with the raw materials block (Z) most likely because of changes in
the paste viscosity caused by fluctuations in raw materials properties and the required
formulation adjustments.
Figure 51 – Amount of pitch used in the formulation as a function of the amount of coke
fines particles for different raw material blends (combinations of coke and pitch suppliers)
The Z and X1 block weights of the first two components (LV1-LV2) of the MB-PLS model
are shown in Figure 52 for comparison. The interpretation of these components is very
similar to that made for the first two components of the SMB-PLS model (Figure 50). Both
models capture raw material and formulation variability and the pitch% and fines% are still
important in both components. That was expected because the process variability is
mainly driven by raw materials and supplier changes in particular. The advantage of SMB-PLS
in this case is that it clearly shows that most of the variations in formulation (X1) are
associated with the raw materials (Z) because of the imposed pathway between the
blocks. The remainder of this section will focus on differences between MB-PLS and SMB-
PLS.
[Figure 51 plot: pitch % versus fines %, with markers for blends 1 to 10]
Figure 52 – Z and X1 block weights bi-plot for LV1 and LV2 of MB-PLS
The paste plant process data block (X2) contains both correlated and orthogonal
information with respect to Z and X1 as extracted by SMB-PLS. As shown in Figure 49, the
5 LVs computed for the Z block (i.e. LVs Z-1 to Z-5) capture 30.4% of the variance of X2
whereas the 6th LV (i.e. X2-1) explains an additional 6.9% of its variance. It was found that
the information extracted from X2 in LV5 of the MB-PLS model closely matches that of the
6th LV (X2-1) of SMB-PLS. The loading bi-plots are shown in Figure 53 for MB-PLS (a) and
SMB-PLS (b). The SMB-PLS component X2-1 explains additional variance of X2 relevant
for predicting Y but orthogonal to the Z and X1 blocks (i.e. not related with raw material
variations and formulation adjustments). This component captures the relationship
between the green anode height and its density (green anode density or GAD). At the
plant, changes in anode density are compensated by changing the anode weight (i.e.
amount of paste fed in the mold) in order to maintain the anode height as constant as
possible, while the other physical dimensions of the anodes are fixed by the mold area.
This operation strategy aims at reducing variability of the anode dimensions to facilitate
downstream operations. The relationship between anode height and green anode density
(negative correlation) is clearly visible in the loadings of the SMB-PLS model (Figure 53 b).
It also appears in the MB-PLS loadings (Figure 53 a) but several other variables also have
strong loading values because of correlations with variables of other blocks. This makes
the interpretation more difficult to make.
[Figure 52 plot: LV 1 versus LV 2; legend: Z, X1]
Figure 53 – Bi-plots of X2 block weights and Y loadings: a) LV 5 of MB-PLS and b) LV6
(X2-1) of SMB-PLS
The next comparison between MB-PLS and SMB-PLS is based on the last block (X3)
which is almost orthogonal to Z, X1 and X2 as discussed previously. Indeed, the variance
of X3 explained by the first 6 SMB-PLS components (i.e. LVs Z1 to Z5 and X2-1) is only
4.6%, but nearly 50% of the variance in that block is captured by the last two components.
It was found that LV4-LV5 from MB-PLS explain similar information to LV7-LV8 from
SMB-PLS, and therefore these pairs of components were selected for the comparison.
The X3 block scores and weights are shown in Figure 54 for both methods. The blue and
red markers represent the two sampling positions in the baking furnace. Since there is a
distribution of final temperature and heat-up rate gradients in the furnace, these positions
were chosen to sample the coldest and hottest anodes in the furnace.
The first observation made from the block weights (Figure 54 b and d), is that the baking
position (pit position) is the most important variable for both models. The position in the
furnace is a very important variable affecting the anode properties (Lauzon-Gauthier 2011)
due to the distribution of heat-up rate and final anode temperature between the anodes.
This is not affected by raw material variability and paste plant operating conditions and it
explains why this block still contains new information (i.e. for the SMB-PLS) after all the
other blocks have been modeled.
The second observation is the improved separation of the 2 classes of anodes based on
the pit position by SMB-PLS. This is due to the sequential orthogonalization of the blocks.
As the correlated information is removed block by block at each new component, only new
information not explained by the previous blocks is left to explain the variability in the Y data.
This improves the interpretation of the subsequent blocks by removing redundant
information. In the MB-PLS model, the LVs associated with most of the X3 block variance also
contain information from other blocks, which adds variability to the components and degrades
the interpretation ability.
Figure 54 – Baking block (X3) scores and loadings bi-plot: a) MB-PLS block scores, b) MB-
PLS block weights for LV4-LV5, c) SMB-PLS block scores and d) SMB-PLS block weights
for LV7-LV8 (X3-1 and X3-2). The blue and red markers indicate the anodes baked in the
coldest and hottest positions in the furnace
In terms of interpretation, both the MB-PLS and the SMB-PLS capture the correlation
between the baking temperature (i.e. pit position) and the LC (i.e. crystallinity) and the real
density of the anode. A higher final baking temperature increases the crystallinity of the
anode micro-structure (Keller & Sulger 2008). But the SMB-PLS (Figure 54 d)
interpretation for the mechanical properties (i.e. compressive strength and Young’s
modulus) and the electrical resistivity is much clearer than the MB-PLS (Figure 54 b). It is
known in the literature that the final baking temperature and heat-up gradient influence
the mechanical and electrical properties of the anodes (Fischer et al. 1993). The SMB-PLS
shows a much stronger covariance between these properties than MB-PLS. This is likely
because the information captured from X3 by MB-PLS is spread more evenly in all the
components of the model, whereas the influence of the baking step is captured mainly by
two components in the SMB-PLS model.
The last point of comparison between MB-PLS and SMB-PLS is based on information
mixing. This problem with MB-PLS was investigated by Westerhuis and Smilde
(Westerhuis & Smilde 2001). Basically, they have shown that when the Xb block is deflated
using the super-scores of the MB-PLS model, information from the other blocks is
introduced into the Xb block (hence the name information mixing), which affects the
interpretability of MB-PLS models. Westerhuis and Smilde proposed to deflate the Y data
only as a solution to this problem, but this modification is currently not implemented in
commercial multivariate data analysis software, such as ProMV™ (ProSensus Inc.).
Hence, the traditional MB-PLS super-score deflation approach will be used in the
comparative study.
To demonstrate that SMB-PLS does not suffer from information mixing, a strategy similar to
the one adopted by Westerhuis & Smilde (2001) is used. Basically, the raw material
properties matrix (Z) contains blocks of repeated data because the properties are only
available as weekly averages. Hence, all anodes produced in the same week have the
same raw material data. This also means that all these anodes should have the same Z
block score values (i.e. should overlap perfectly in a score plot) in the absence of information
mixing, and will have different values if mixing occurs. The comparison will be performed
using both the super-scores and block scores of MB-PLS and SMB-PLS. First, the super-scores
are used to demonstrate that MB-PLS captures information from all blocks at the
same time. Conversely, the block orthogonalization steps of the SMB-PLS algorithm force
the model to capture only the variability related to the block of interest. Second, the
block scores will be used to show the presence or absence of information mixing between
the blocks.
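The check described above, that anodes sharing the same weekly raw material data should have identical Z block scores, can be expressed numerically as well as visually. The sketch below is one possible way of doing so and is not the procedure used to produce the figures: it groups observations by a label (here an assumed production week) and reports the largest within-group range of the scores. The function name and the score values are hypothetical.

```python
from collections import defaultdict


def within_group_score_range(scores, groups):
    """Return, per group, the largest range of any score component.

    scores: list of score vectors (one per observation).
    groups: list of group labels (e.g. production week), same length.
    """
    by_group = defaultdict(list)
    for row, label in zip(scores, groups):
        by_group[label].append(row)
    ranges = {}
    for label, rows in by_group.items():
        # zip(*rows) iterates over score components (columns) within the group.
        per_component = [max(col) - min(col) for col in zip(*rows)]
        ranges[label] = max(per_component)
    return ranges


weeks = ["w1", "w1", "w2", "w2"]
# Hypothetical block scores: identical within a week (no mixing) ...
scores_no_mixing = [[0.5, -0.2], [0.5, -0.2], [-0.3, 0.4], [-0.3, 0.4]]
# ... and scattered within a week (mixing).
scores_mixing = [[0.5, -0.2], [0.7, -0.1], [-0.3, 0.4], [-0.1, 0.3]]
print(within_group_score_range(scores_no_mixing, weeks))
print(within_group_score_range(scores_mixing, weeks))
```

A range of zero for every week corresponds to perfectly overlapping points in the score plot; any non-zero range indicates that information other than Z has entered the block scores (or, as noted later, that imputed values differ within a week).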
Figure 55 – Comparison of the information mixing in MB-PLS and SMB-PLS models: a)
super scores (LV1-LV2) of MB-PLS, b) super scores (LV1-LV2) of SMB-PLS, c) Z scores
(LV2-LV3) of MB-PLS and d) Z scores (LV2-LV3) of SMB-PLS
The super-scores for the first two components of MB-PLS and SMB-PLS are presented in
Figure 55 a) and b). These components in both algorithms were shown to capture the
impact of raw material variations and associated process variations on anode properties
(Figures 50 and 52). Although the general trends in the super-scores are similar for both
algorithms, those of the SMB-PLS model (Figure 55 b) show less variability. This is
because LV1 and LV2 model the impact of Z on subsequent blocks and Y, and are not
corrupted by other types of information contained in the subsequent X blocks. The
observations having the same Z data overlap on top of each other (Figure 55 b), whereas
the super-score values of MB-PLS (Figure 55 a) are clearly different even if some of them
share the same Z data.
To illustrate the mixing of information during the deflation step, Figure 55 c) and d) show
the block scores of the second and third components of MB-PLS and SMB-PLS. These
components were selected since information mixing only appears after the first deflation
step. Once again, both models capture similar information with LV2 and LV3. The
information mixing is clearly visible in the block scores of MB-PLS (Figure 55 c), consistent
with the study by Westerhuis & Smilde (2001). No sign of information mixing appears with
SMB-PLS (Figure 55 d). The small differences in block score values for a few observations
are due to the missing data imputation procedure and not to information mixing. Indeed,
the iterative PCA approach used to estimate the few missing data could compute different
values for anodes produced in the same week.
5.5 Conclusion
This chapter describes an improved multi-block PLS algorithm for the analysis of complex
process data. The interpretation of PLS models (e.g. scores and loadings plots) of
industrial datasets can be difficult due to the large number of variables and the complex
correlation structure between the blocks (e.g. control actions). Fortunately, processes are
often a succession of smaller operations done in a meaningful and structured order and
the variables of the dataset can be blocked accordingly.
Multi-block PLS has several advantages over regular PLS owing to the ability to
scrutinise the model at a super level and a block level. Unfortunately, it does not help to
differentiate between correlated and orthogonal information contained in each component
of the model. Alternatively, the SO-PLS takes advantage of the sequential nature of
process data by removing correlated information in subsequent blocks. However, it does
not provide the possibility to interpret that information.
The objective of this chapter was to develop a new sequential multi-block PLS algorithm
called SMB-PLS to combine the advantages of the MB-PLS and SO-PLS to improve the
interpretation of complex industrial process data. The MB-PLS structure with the super and
block levels (i.e. two levels of scrutiny) was used as the basis of the algorithm. To avoid
misinterpretation of the results due to the correlation between the blocks, the
orthogonalization scheme used in the SO-PLS was incorporated into the MB-PLS
structure. This enables the correlated and orthogonal information to be interpreted separately
without loss of information. Also, no significant differences were observed in the
computational load between the three multi-block methods.
The performance of the new SMB-PLS was illustrated using two datasets. The first was a
simulated polymer film blowing process and the second was a real industrial dataset from
the carbon anode manufacturing process. The prediction performances of the new
algorithm were found to be similar to those of the MB-PLS and SO-PLS algorithms for both datasets.
However, the SMB-PLS has some limitations. First, it does not improve the predictive
ability compared to a regular PLS. Second, the method to choose the number of latent
variables has not been explored in depth. A sequential approach has been used in this
thesis, but a global selection is also possible. Finally, the sequential structure
imposed in SMB-PLS may not be suited for more complex and highly integrated processes
including recycle streams.
The simulated dataset contained two different case studies that were used to illustrate the
pathway orthogonalization properties of the SMB-PLS: one without correlation between
the raw material and process data blocks, and another containing both correlated
and orthogonal variability. Using the SMB-PLS, the correlated variations due to the
feedback control actions were captured by a different set of latent variables than the
orthogonal variability. This was not the case with the MB-PLS, in which both the correlated
and orthogonal information were spread across all components.
The anode manufacturing dataset was used to validate the new algorithm on a real-life
dataset. It was found that the information contained in each latent variable was different
from that obtained with the MB-PLS algorithm, for which it is not possible to differentiate correlated and
orthogonal information. In the SMB-PLS, the subsequent blocks only contain new
information and it was shown that removing the correlated variability in a sequential order
led to better interpretation of models. The raw material block (Z) components showed the
effect of raw material variability on the process variables while subsequent blocks
contained variability that was not due to the raw materials. Finally, it was shown that
there was no information mixing between the blocks when using the SMB-PLS algorithm.
Chapter 6 Paste image texture analysis
This chapter presents the first part of the development of the anode paste machine vision
methodology. It introduces the laboratory work performed on anode paste. Testing of the
image sensor with industrial paste will be presented in Chapter 7.
6.1 Introduction
The need for real-time quantitative measurements of the anode quality has been
discussed in detail in the introduction chapter (section 1.6) of this thesis. The most
important issue is with the increased variability of the anode raw materials (i.e. coke and
pitch). Even if good quality materials are still currently available on the market, these have
a higher cost. In order to reach a compromise between cost and quality, the anode
manufacturers blend materials from different suppliers, including some lower cost/quality
raw materials. Producing anodes with consistent quality attributes therefore requires the
manufacturers to track the properties of incoming coke and pitch materials and those of
the anodes themselves, and adjust the anode formulation accordingly using feedforward and/or
feedback control. However, the key raw material and anode properties are currently
measured in the laboratory using a limited number of samples and the results are typically
available after long time delays. Developing new real-time sensors for tracking the
properties of the materials at different stages of the production chain is therefore important
to help manufacturers mitigate the impact of increasing raw material variability. It was
decided to focus first on developing a sensor for the quality of the anode paste because it
is the material used to form the anode. Using images as a high frequency measurement to
compensate for the lack of real-time measurements of raw material quality and some
operating parameters such as the particle size distribution could enable quick feedback
control actions on formulation with little dead-time.
Anode paste quality is generally defined in terms of the material that yields the desired
baked anode properties. In this work, a high quality paste is defined as one made from the
right combination of coke aggregate size distribution and amount of pitch for a given coke
source such that baked anode density (BAD) is maximized. Changes in aggregate size
and coke properties affect the so-called pitch demand and eventually the baked anode
density if not corrected for by adjusting the amount of pitch in the formulation.
The hypothesis tested in this research was that changes in the coke aggregate mix
affecting pitch demand and the amount of pitch in the formulation modify the paste visual
appearance. Hence, a machine vision sensor could be used in real-time and non-
intrusively to indicate whether the amount of pitch in the formulation should be adjusted.
Despite an extensive literature review, no applications of machine vision to anode paste
images (or to any other kind of paste material) were found. Therefore, the review was
broadened to include applications to the anode itself and other similar materials.
Image analysis has been used before to characterize anode quality both qualitatively and
quantitatively. These applications were based on optical microscopy images of carefully
prepared and polished anode samples. Adams et al. (2002) found that the
optimum binder layer thickness correlated well with the optimum BAD and electrical
resistivity of the anodes. The binder thickness was measured using a combination of
thresholding and dilation techniques to segregate the binder and the coke particles in the
image. Rorvik et al. (2006) used thresholding on images gathered by
polarizing light microscopy. This method could segment the pitch, the coke particles and
the pores. It was also used to quantify the pitch thickness distribution and correlate this
measurement to some anode properties. Finally, Sadler (2012) qualitatively
analysed microscopy images of baked anode surfaces. He found that changing the
operating conditions in the paste plant had an effect on the visible micro-structure of the
anodes. This study provides supporting evidence that the visual appearance of the anode
changes with variations in processing conditions.
However, these methods cannot be used for real-time application on anode paste samples
for two reasons. First, these imaging microscopy techniques require sample preparation
and are limited to small sample sizes. Hence, they are time-consuming and may lack
representativeness with respect to the throughput of industrial production lines. Second,
thresholding methods perform well with polarizing light microscopy images because the
contrast is generally large. This is not the case with the low contrast paste images
collected using industrial cameras (no magnification) as shown in Figure 56 where the
macro-texture formed by the various components is all dark.
Figure 56 – Anode paste image
Like anodes, asphalt and concrete are two granular materials that also contain a
certain amount of binder. Thresholding and segmentation techniques have also been
applied to asphalt (Yue & Morin 1996; Bruno et al. 2012) and concrete mixes (Dequiedt et
al. 2001). Geometric measurements were made on the segmented particles to
characterize these granular materials. Dispersion was also measured from the concrete
images. Unfortunately, these image analysis applications are based on core samples of
cured material (i.e. sample preparation required) and not the paste material itself.
Internal research was conducted in the past at the Alcoa Technical Center (Pittsburgh,
USA) in order to investigate the possibility of estimating the amount of pitch in the paste
using images (Adams et al. 2007; Adams et al. 2009). The proposed approach using
statistics computed from the gray level intensity histogram of the images showed
promising results on paste samples prepared in the laboratory. Additional work was
needed to improve robustness with industrial paste and also to verify the sensitivity to
more parameters than the pitch ratio.
Image texture analysis methods seem more appropriate for tracking variations in the paste
related to coke aggregate properties (i.e. pitch demand and size) and the amount of
pitch. For example, changes in the aggregate size distribution should affect the degree of
fineness or coarseness of the paste. In addition, an under-pitched paste should look
rougher, and an over-pitched paste smoother. Texture methods should be sensitive to
these changes in the paste visual appearance because they extract information about the
spatial organization of the pixels within the image (i.e. relationships between the light
intensities of neighbouring pixels or patterns). Furthermore, multi-resolution textural
methods, such as wavelet texture analysis (WTA), have been shown to be more
robust to variations in lighting conditions (Bharati et al. 2004). Finally, image texture
analysis is increasingly used in the process industries for monitoring products for defects,
predicting overall product quality, and for feedback control of product quality, as reviewed
by Duchesne et al. (2012). Texture methods have proven to be useful in several applications that are
relevant for the paste imaging problem studied in this work. They were applied
to the mineral processing field for the characterization and control of the froth flotation
process (Liu et al. 2005; Liu & MacGregor 2008) and for classification of ore materials
(Tessier et al. 2007). They have been used for the detection and classification of surface
defects in many industrial applications such as manufacturing of steel sheets (Bharati et al.
2004), polymer films (Liu & MacGregor 2005; Gosselin et al. 2009), paper sheets (Reis &
Bauer 2009; Reis & Bauer 2010), artificial stone countertops (Liu & MacGregor 2006),
glass substrates for TFT-LCD screens (Yousefian-Jazi et al. 2014), semi-conductors
(Facco et al. 2009), textile fabrics (Zhang et al. 2007) and pharmaceutical tablets (García-
Muñoz & Carmody 2010). They were also used to measure the degree of homogeneity of
a binary mixture of polymer powders (Gosselin et al. 2008) and for automatic
characterization of nanofibers using SEM (scanning electron microscopy) (Facco et al.
2010).
The objective of this chapter is to develop a machine vision sensor for anode
paste characterization based on its surface texture (i.e. visual appearance), and to
demonstrate its performance on paste samples formulated in the laboratory. This sensor
should be sensitive to changes in paste formulation and pitch demand. The first is related
to the coke aggregate size distribution and pitch level and the second to the
processing conditions and raw material properties. The sensor should also provide some
indication of what the optimum amount of pitch is for a given coke source. Furthermore,
this study should contribute to a better understanding of the relationships between the
paste surface texture and its macro properties (i.e. formulation and pitch demand) in order
to set the stage for an application to industrial paste samples.
Two texture methods were selected for anode paste characterization, namely the gray
level co-occurrence matrix (GLCM) and the wavelet texture analysis (WTA). The former is
a statistical approach to texture analysis whereas the latter is a transform-based textural
method. These two are recognized as state-of-the-art techniques (Bharati et al. 2004) and
were selected because textural patterns in paste images (Figure 56) are stochastic in
nature rather than highly repetitive and clearly identifiable structures. Note that some
preliminary results of using DWT were already published (Lauzon-Gauthier et al. 2014) to
show the potential of using texture analysis on anode paste images. In that publication, the
ability of the sensor to detect the optimum pitch demand was not covered. This chapter
goes well beyond that work and completes it by comparing different wavelet types, and
exploring different combinations of image pre-processing techniques and textural features.
The best combination of methods, wavelet and features for the paste images application
as well as the relationships between the textural features and the paste properties were
determined using three sets of paste samples produced in the laboratory under different
formulations and experimental processing conditions.
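To make the statistical approach more concrete, the sketch below builds a gray level co-occurrence matrix for a single pixel offset and derives two classical descriptors from it. It is an illustration only: the function names, the horizontal one-pixel offset, the two descriptors (contrast and angular second moment) and the tiny test images are assumptions chosen for this example and are not necessarily the settings or features retained in this work.

```python
def glcm(image, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix for the pixel offset (dr, dc).

    image is a list of rows of integer gray levels in [0, levels - 1].
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def glcm_contrast(p):
    """Large when neighbouring pixels have very different gray levels."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def glcm_asm(p):
    """Angular second moment: large when few gray level pairs dominate."""
    return sum(v * v for row in p for v in row)


# Two hypothetical 4 x 4 images with 4 gray levels.
smooth = [[1, 1, 1, 1] for _ in range(4)]
rough = [[0, 3, 0, 3], [3, 0, 3, 0], [0, 3, 0, 3], [3, 0, 3, 0]]

p_smooth = glcm(smooth, levels=4)
p_rough = glcm(rough, levels=4)
```

The uniform image gives zero contrast, whereas the checkerboard-like image gives a high contrast, which is the kind of difference expected between smooth and rough textures.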
This chapter is organized as follows. First, the experimental details on the fabrication of
the paste samples are given in section 6.2. Then, section 6.3 describes the development
of the paste image texture analysis scheme including the choice of wavelets and textural
features. This is followed by the results section (6.4) where the interpretation of the
features is presented, after which conclusions are drawn.
6.2 Laboratory paste and anode experiments
Three different sets of experiments were performed in the laboratory. The first two were
made using a formulation very similar to those used in the industry, including coke in all
size fractions and butts in order to be as representative of the real paste as possible. The
third used a typical formulation for lab scale anode fabrication in which no butts and no coarse
coke particles are used, in order to reduce the variability caused by large particles given
the small size of the anode samples.
The goals of the laboratory experiments were to find the best combination of methods and
features to capture the visual information of the paste and to understand the relationships
between the features and the properties of the paste.
6.2.1 Preliminary design on paste formulation
This is the first experiment that was performed for the development of the machine vision
algorithm. The goal was to verify that the image texture analysis method was sensitive to
changes in the amount of coke fines in the paste. Since 75% of particles in the fines
fraction are smaller than the camera’s resolution (i.e. 1 pixel = 40.7 µm), they should not
influence the “visible” size distribution in the image. However, the fines have an impact on
the pitch demand of the paste, which should influence its surface texture. It is important to
note that the optimum pitch demand of the paste was not determined in this test.
Paste samples were prepared using five different amounts of fines (in grams) for
two different amounts of pitch. The other fractions were the coarse and intermediate coke
and the butts. The amounts of these fractions were fixed at 125 g, 100 g and 90 g, respectively.
Table 15 presents the details of the 10 paste formulations.
Table 15 – Formulations used in the first series of experiments aiming at varying the amounts of coke fines and pitch in the paste.
Since the total amount of paste was different for each paste formulation, the percentages
(i.e. relative mass) of each fraction are different for each experiment. The pitch % is
computed on a dry aggregate basis and is therefore a ratio.
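The percentages reported in Table 15 can be reproduced with a few lines of arithmetic. The sketch below is an illustration under one stated assumption: the pitch % is taken on a dry aggregate basis, as written above, while the fines % is taken relative to the total paste mass. The latter basis is inferred from the tabulated values and is not stated in the text. The function name is hypothetical.

```python
# Fixed fractions given in section 6.2.1 (grams).
COARSE_G, INTER_G, BUTTS_G = 125.0, 100.0, 90.0


def formulation_percentages(pitch_g, fines_g):
    """Return (pitch %, fines %) for one paste formulation."""
    dry_aggregate_g = COARSE_G + INTER_G + BUTTS_G + fines_g
    paste_g = dry_aggregate_g + pitch_g
    pitch_pct = 100.0 * pitch_g / dry_aggregate_g  # ratio, dry aggregate basis
    fines_pct = 100.0 * fines_g / paste_g          # assumed basis: paste mass
    return round(pitch_pct, 2), round(fines_pct, 2)


low_pitch_first = formulation_percentages(55.0, 50.0)   # P-L_F-1
high_pitch_last = formulation_percentages(65.0, 90.0)   # P-H_F-5
```

With these bases, the first low-pitch formulation gives 15.07 % pitch and 11.90 % fines, and the last high-pitch formulation gives 16.05 % pitch and 19.15 % fines, in agreement with the table.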
6.2.2 Detailed design on paste formulation
The aim of the second set of experiments performed in the laboratory was again to assess
the sensitivity of the paste image texture analysis but for a wider range of variations in the
paste formulation. The ratios of the various constituents, the size distribution of the dry
aggregate mix, the fineness of the coke fines fraction as well as some mixing parameters
were modified in order to introduce changes in pitch demand, size of the aggregate mix,
and ratio of pitch to dry aggregate. These variations should influence the visual
appearance of the paste samples. More details on how each parameter is expected to
change the paste appearance are provided later in this section. As in the first experiment,
the optimum pitch demand for each set of properties was not determined experimentally.
| Experiment | Number of replicates | Pitch (g) | Pitch (%) | Fines (g) | Fines (%) |
|---|---|---|---|---|---|
| P-L_F-1 | 1 | 55.0 | 15.07 | 50.0 | 11.90 |
| P-L_F-2 | 1 | 55.0 | 14.67 | 60.0 | 13.95 |
| P-L_F-3 | 1 | 55.0 | 14.29 | 70.0 | 15.91 |
| P-L_F-4 | 1 | 55.0 | 13.92 | 80.0 | 17.78 |
| P-L_F-5 | 1 | 55.0 | 13.58 | 90.0 | 19.57 |
| P-H_F-1 | 3 | 65.0 | 17.81 | 50.0 | 11.63 |
| P-H_F-2 | 3 | 65.0 | 17.33 | 60.0 | 13.64 |
| P-H_F-3 | 3 | 65.0 | 16.88 | 70.0 | 15.56 |
| P-H_F-4 | 3 | 65.0 | 16.46 | 80.0 | 17.39 |
| P-H_F-5 | 3 | 65.0 | 16.05 | 90.0 | 19.15 |
The tested parameters are listed in Table 16. To preserve confidentiality, the formulation of
the base mix (i.e. nominal paste formulation serving as a reference) is not revealed.
Therefore, the various changes made in this series of experiments are reported as
percentage of deviation with respect to the base mix, except for the nominal mixing time
and temperature which were 10 min and 178°C, respectively. The percentages for the
coarse, intermediate, fines and butts fractions are reported on a dry aggregate basis, but
the pitch % is based on the paste weight. Finally, the weight of each paste sample was
kept fixed at 450 g. Hence, the weights of each fraction were adjusted accordingly.
In this second series of experiments, 23 different paste formulations were prepared and
some of them were replicated for a total of 32 batches. Two images were acquired for
each paste sample for a total of 64 images.
The overall visual appearance of the paste in this experiment is described as wet or dry.
Since the optimum pitch demand is not known for any of the prepared samples, it is
described in relative terms. For a fixed pitch/dry aggregate ratio, a change in a given
parameter that caused the paste to look drier was considered to increase the pitch
demand, and vice versa for a wetter paste. If more pitch is left on the surface of the particles
for the same amount of pitch in the paste (wetter paste), it means that the aggregate mix
requires less pitch (i.e. lower pitch demand). The expected effects of each of the
parameters included in the experimental design on the paste visual appearance are
described below.
Table 16 – Changes in the paste formulation tested in the second set of experiments
| Description | Number of replicates | Changes from base mix |
|---|---|---|
| Base mix | 4 | --- |
| Decreased butts ratio | 2 | -10 % |
| Increased butts ratio | 2 | +10 % |
| Different Blaine number (fines fraction) | 2 | BN 2300 cm²/g |
| Different Blaine number (fines fraction) | 1 | BN 4000 cm²/g |
| Different Blaine number (fines fraction) | 2 | BN 6000 cm²/g |
| Decreased fines ratio in the aggregate mix | 1 | -4 % |
| Decreased fines ratio in the aggregate mix | 1 | -2 % |
| Increased fines ratio in the aggregate mix | 1 | +2 % |
| Increased fines ratio in the aggregate mix | 1 | +4 % |
| Decreased pitch ratio in the paste | 2 | -1.4 % |
| Increased pitch ratio in the paste | 2 | +1.6 % |
| Decreased coarse and intermediate frac. | 1 | Coarse -12.5 %, Inter -6 % |
| Decreased intermediate frac. | 1 | Inter -11 % |
| Increased coarse and intermediate frac. | 1 | Coarse +7.5 %, Inter +4 % |
| Increased intermediate frac. | 1 | Inter +9 % |
| Substitution of coarse frac. by shot coke | 1 | 20 % |
| Substitution of coarse frac. by shot coke | 1 | 40 % |
| Substitution of coarse frac. by shot coke | 1 | 60 % |
| Decreased mixing temperature | 1 | 158 °C |
| Increased mixing temperature | 1 | 188 °C |
| Decreased mixing time | 1 | -5 min |
| Increased mixing time | 1 | +5 min |

Five different parameters were manipulated from the base mix to influence the pitch
demand of the paste. First, a ±10% change was made to the butts ratio. The change in the
dry aggregate weight was compensated by adding/removing coke from the coarse fraction
(as in plant operating practice) since the butts fraction is mainly composed of large particles.
Crushed anode butt particles are less porous than fresh coke. Thus, less pitch is required
to wet them properly. A higher amount of butts fraction decreases the pitch demand and
should lead to a wetter paste (for the same amount of pitch).
The Blaine number (BN) (i.e. fineness) of the fines was also varied. Different coke fines
with specific BN were prepared in the laboratory and substituted for the industrial
fines fraction. The BN of the samples was measured using a Malvern laser diffraction
particle size analyzer. Ball mill fines with 2300, 4000 and 6000 BN were used. The BN
of the industrial fines was within the range of the laboratory fines. The industrial
fines also differ from the lab fines because they contain very fine dust particles collected
by the fume treatment systems at the plant. Hence, the particle size distribution of the
industrial fines is not entirely similar to that of ball mill fines even if
they have the same Blaine number. Finer particles (i.e. higher BN) should require
more pitch because of their higher specific surface area. The paste should therefore look
drier when using higher BN fines and wetter when using lower BN fines if the amount of
pitch remains unchanged.
Changes made to the fines ratio were compensated by adding/removing coarse and
intermediate coke particles and butts particles in equal amounts, as in the first
series of experiments. The tested changes in fines ratio were ±2% and ±4%. The fine coke particles
have a higher surface area than the coarser fractions and therefore require more pitch. A
finer formulation increases the pitch demand of the paste and the images should look
drier.
In some formulations, a proportion of the coarse coke fraction was replaced with shot coke
(e.g. 20, 40 and 60% of the coarse fraction). Shot coke has inferior mechanical properties,
lower thermal shock resistance and a lower level of open porosity for pitch penetration. It is
therefore not used in industrial anode formulations (Edwards et al. 2009). However, it was
used in this study to generate additional variations in pitch demand. Because shot coke is
typically less porous, the pitch demand should decrease with the addition of shot coke and
the paste should look wetter for the same amount of pitch. This effect is similar to the
addition of butts to the formulation.
The mixing conditions do not introduce variations in pitch demand per se, but affect the
pitch penetration in the pores of the particles and are likely to alter the paste visual
appearance in much the same way as a change in pitch demand. However, excessively long
mixing times can be detrimental to the paste and increase the pitch demand (Hulse 2000).
Over-mixing can break some of the coke particles and create more surface area that needs
to be wetted by the pitch. Mixing time was changed by ±5 minutes. Mixing temperature
was varied from 158 to 188 °C. When it is too low, the pitch is more viscous and should
not penetrate as much into the pores of the particles, leading to a wetter paste. This may be
detrimental to all anode properties. More detailed explanations of the effects of the
formulation and processing conditions on anode properties are available in (Belitskus
1978; Belitskus 1981; Belitskus 1993; Belitskus 2013; Belitskus & Danka 1988; McHenry
et al. 1998; Hulse 2000).
In order to deliberately change the wetness of the paste, the amount of pitch was also
changed (i.e. -1.4% and +1.6%). For a constant dry aggregate formulation, there is a
direct relationship between the amount of pitch in the paste and its degree of wetness.
Finally, the dry aggregate size distribution was manipulated by either substituting the
coarse and intermediate fractions, or the intermediate fraction only, by fine coke particles
(e.g. +7.5% coarse, +4% inter and -11.5% fines or -11% inter and +11% fines). Although
changes in the aggregate size distribution may also modify pitch demand due to changes
in overall porosity of the aggregate mix, it is expected that these variations will have a
different impact on paste visual appearance (i.e. its texture) compared with that of a
change in pitch demand. Hence, the two types of disturbances should be distinguishable
by the machine vision system.
6.2.3 Pitch optimization experiments
The last set of laboratory experiments was performed in order to assess the possibility of
quantitatively detecting the optimum pitch demand of the aggregate mix (i.e. optimal
amount of pitch to use in the paste formulation) using the machine vision approach. The
optimum pitch demand (OPD) is defined as the amount of pitch required to obtain the
maximum apparent baked anode density (BAD) on laboratory scale anodes.
To find the OPD, it is necessary to perform a pitch optimization experiment, a common
practice in the industry. It consists of changing the pitch ratio over a broad range of levels
for a given dry aggregate mix and fixed processing conditions, and measuring the BAD at
each pitch level. This requires collecting a sample from each paste formulation, forming it
as a laboratory scale anode, and baking it to obtain its BAD value. Due to the small
dimensions of the lab scale anodes, the size distribution of the particles was limited to a
maximum size of approximately 4 mm. Hence, the formulation of the aggregate mix
differs from that of the first two series of experiments (no butts and less of the coarse
coke fraction). The laboratory formulation presented in section 4.2.2 was used for this
experiment. Two different cokes were used (see section 4.2.2) to test the machine vision
system on materials having different properties. Coke A has a higher VBD than coke B.
Thus, the density of the anodes made with coke A will be higher than those made with
coke B. The range of pitch used for each coke and the number of replicated samples
produced are presented in Table 17.
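The pitch optimization procedure amounts to averaging the replicate BAD values at each pitch level and retaining the level with the highest mean. The sketch below illustrates this logic only; the function name and the BAD values are hypothetical and are not the measurements of this study.

```python
from statistics import mean


def optimum_pitch_demand(bad_by_pitch):
    """Return (pitch level, mean BAD) at the maximum mean BAD.

    bad_by_pitch maps a pitch level (%) to the list of replicate
    BAD values (g/cm3) measured at that level.
    """
    mean_bad = {pitch: mean(values) for pitch, values in bad_by_pitch.items()}
    opd = max(mean_bad, key=mean_bad.get)
    return opd, mean_bad[opd]


# Hypothetical replicate measurements, for illustration only.
example = {
    18.0: [1.540, 1.546],
    19.0: [1.558, 1.562],
    20.0: [1.570, 1.574],
    21.0: [1.563, 1.561],
}
opd, best_bad = optimum_pitch_demand(example)
```

When the means of neighbouring pitch levels differ by less than the replicate standard deviation, the location of the maximum is uncertain, which is one reason for replicating the samples.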
Table 17 – List of experiments for the laboratory pitch optimization
| Coke A: Pitch % | Coke A: Number of samples | Coke B: Pitch % | Coke B: Number of samples |
|---|---|---|---|
| 15.0 | 3 | 15.2 | 2 |
| 16.0 | 2 | 15.7 | 3 |
| 17.0 | 2 | 16.2 | 4 |
| 17.5 | 3 | 16.7 | 3 |
| 18.0 | 3 | 17.2 | 3 |
| 18.5 | 3 | 18.2 | 3 |
| 19.0 | 3 | 19.2 | 2 |
| 20.0 | 3 | 20.0 | 2 |
| 21.0 | 2 | 21.0 | 2 |
| 22.0 | 2 | 22.0 | 2 |
| 23.0 | 2 | 23.0 | 1 |
| 24.0 | 2 | 24.0 | 2 |
| 25.0 | 2 | | |
| 26.0 | 2 | | |

Note: the 25.0 and 26.0 rows contain a single pair of values in the source; their column assignment (coke A or coke B) cannot be recovered from the layout and they are shown in the first column pair.

The results of the pitch optimization procedure are presented in Figure 57. In this figure,
the solid lines represent the baked anode density or BAD (left axis) and the dashed lines
correspond to the green anode density or GAD (right axis). The mean value of all the
replicate samples at each pitch level is plotted in the figure rather than the individual values.
The ±1 standard deviation error bars are also plotted for the BAD.
Figure 57 – Baked and green anode density (BAD and GAD) for the pitch optimization
anodes using cokes from two different sources (A and B)
[Figure 57 axes: x-axis, Pitch % (ratio), 14.0 to 26.0; left axis, BAD; right axis, GAD. Series: BAD coke A, BAD coke B, GAD coke A, GAD coke B.]
Figure 57 shows that the OPD was found for both cokes. Coke A has an
optimum BAD of 1.578 g/cm³ at a pitch ratio of 20% (16.67% of the paste weight) and coke B has
a maximum BAD of 1.551 g/cm³ at a pitch ratio of 21% (17.36% of the paste weight). As
expected, coke A has a higher BAD and a lower optimum pitch demand than coke B due
to its higher VBD (Table 9). The average and maximum standard deviations of the
replicates for the BAD measurements were 0.005 g/cm³ and 0.012 g/cm³, respectively. This
is consistent with the variability of the laboratory anodes fabrication from previous work
performed on the same experimental set-up (Azari Dorcheh 2013). Finally, the GAD shows
no optimum. Adding more pitch will always increase the green density of the anode and
this is the reason why it is not a good indicator of the anode quality.
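The two ways of expressing the amount of pitch quoted above are linked by a simple conversion: a pitch ratio r (parts of pitch per 100 parts of dry aggregate) corresponds to 100 r / (100 + r) percent of the paste weight. The short sketch below, with a hypothetical function name, reproduces the two values quoted for cokes A and B.

```python
def pitch_ratio_to_paste_percent(ratio_pct):
    """Convert pitch per 100 parts of dry aggregate into % of paste weight."""
    return 100.0 * ratio_pct / (100.0 + ratio_pct)


coke_a_pct = round(pitch_ratio_to_paste_percent(20.0), 2)
coke_b_pct = round(pitch_ratio_to_paste_percent(21.0), 2)
```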
The OPD of the formulations used in this series of experiments was reached at a higher
pitch ratio than what is typically observed in the industry for a pitch of 16.5% QI. However,
this is explained by the fact that the laboratory formulation has no butts and no very coarse
coke particles so it is expected to have a higher pitch demand.
6.3 Selection of preprocessing operations and image textural
features
The aim of this section is to determine the optimal combination of image
preprocessing operations to apply to paste images and textural features to extract from
them in order to maximise the performance of the machine vision sensor. The latter is
defined in terms of its capacity to detect changes in the paste (e.g. pitch demand and
aggregate size) that ultimately will affect the baked anode properties. A sequential
approach is adopted to arrive at the best results. The impact of applying contrast
enhancement to the images is tested first using the Discrete Wavelet Transform (DWT) as
the initial texture method and the energy computed on detail sub-images as textural
descriptors (features). This choice was motivated by the fact that the DWT-energy
combination is commonly used in texture analysis (Bharati et al. 2004; Liu et al. 2005;
Duchesne et al. 2012) and preliminary results published by the author using this approach
showed promise (Lauzon-Gauthier et al. 2014). In the second step, the effect of selecting
different types of mother wavelets on the imaging sensor performance is investigated.
Finally, additional textural features are added in different combinations to determine the
best set of descriptors to use for anode paste image analysis. Note that the Gray Level
Co-occurrence Matrix (GLCM) is also considered at this stage as an alternative texture
method to DWT.
The dataset and criteria used to perform the comparison are presented first, followed by
the assessment of image preprocessing, type of mother wavelet and textural feature
extraction.
6.3.1 Dataset and criteria used for the comparative analysis
The dataset collected during the pitch optimization experiments (section 6.2.3) is used to
compare the various alternatives. Indeed, the baked anode density (i.e. Y data) was
measured in this series of experiments which allows for the estimation of PLS regression
models between the image textural features and the paste quality attribute. The prediction
performances as well as the interpretation offered by these models enable a quantitative
comparison between the different ways of preprocessing and extracting the information
from the paste images.
Baked anode density is a complex function of raw material properties (e.g. coke density,
pitch demand, size distribution and particle shape), formulation and process operation
(mixing, forming and baking). At the formulation stage, the main source of variations to
deal with is that caused by the coke properties due to frequent supplier changes. The goal
of the formulator is to achieve the maximum BAD for a given coke source rather than
obtaining a fixed target value because the optimal BAD is expected to change with coke
supply as shown in Figure 57. Thus, to assist the operators, the imaging sensor should
ideally be able to indicate whether the actual anode paste should yield the optimum BAD
(if subsequent process operations are performed adequately) or if the current coke blend
is under or over pitched. This raises the following question: does the paste visual
appearance (i.e. surface texture) look similar when the optimal pitch content is reached for
different coke sources? If that is indeed the case, one could establish a simple multivariate
statistical process control scheme on paste textural features to ensure that these fall within
the desired (optimal) region and adjust the pitch ratio accordingly when the paste texture
drifts outside this region. In order to address this question while searching for the best
combination of image preprocessing and analysis techniques, the BAD values for each
anode were transformed into deviation from the optimum density (∆BAD) using equation
6.1. This transformation was applied separately to each of the two cokes and pitch level.
$$\Delta \mathrm{BAD}_{\mathrm{pitch\%}} = \mathrm{BAD}_{\max} - \mathrm{BAD}_{\mathrm{pitch\%}} \qquad (6.1)$$
In equation 6.1 BADmax is the maximum BAD obtained for the full range of pitch level for a
given coke and BAD is the actual BAD at a given pitch level for the same coke. Using the
∆BAD measurement shown in Figure 58, it is possible to quantify the distance to the
optimum pitch demand for each paste sample and each type of coke. Of course, this
means that the optimum BAD values for different cokes are known at the time of the
calibration of the imaging sensor, but not when used for on-line monitoring. In the latter
case, the PLS model would predict the deviation from optimal BAD (if desired), or one
could simply look at the PLS scores to verify that they fall within the optimal region.
Figure 58 – ∆BAD of the lab formulation anodes
The PLS regression problems are formulated as follows. The image features calculated for
each case are collected in the regressor matrix X (N×K) where N and K correspond to the
number of paste samples in the dataset and number of features extracted from the
images. The response matrix Y (N×1) contains the ∆BAD measured for each paste sample
after forming and baking for each pitch level and each coke. Note that the data collected
for replicated samples (features of the paste images and ∆BAD values) were averaged
and then stored in the X and Y matrices.
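The regression problem formulated above can be sketched in code. The following is a minimal single-response PLS sketch based on the NIPALS algorithm, written in Python with NumPy. It is not the software used in this work: the function names (pls1_fit, pls1_scores, pls1_predict) and the autoscaling of X are illustrative assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with the NIPALS algorithm."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    # Autoscale X (assumed here) and mean-center y
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    x_std[x_std == 0] = 1.0
    y_mean = y.mean()
    E, f = (X - x_mean) / x_std, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                  # weights: covariance of X with y
        w /= np.linalg.norm(w)
        t = E @ w                    # scores of this latent variable
        p = E.T @ t / (t @ t)        # X loadings
        c = f @ t / (t @ t)          # y loading
        E = E - np.outer(t, p)       # deflate X
        f = f - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    R = W @ np.linalg.inv(P.T @ W)   # maps scaled X directly to scores
    return {"x_mean": x_mean, "x_std": x_std, "y_mean": y_mean, "R": R, "q": q}

def pls1_scores(model, X):
    """Scores (t1, t2, ...) of new samples, one row per sample."""
    Z = (np.asarray(X, dtype=float) - model["x_mean"]) / model["x_std"]
    return Z @ model["R"]

def pls1_predict(model, X):
    """Predicted response (here, the deviation from the optimum BAD)."""
    return pls1_scores(model, X) @ model["q"] + model["y_mean"]
```

With X of size N×K and Y of size N×1, pls1_scores returns the t1 and t2 values plotted in the score plots and pls1_predict returns the predicted ∆BAD.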
6.3.2 Choice of preprocessing
Three initial preprocessing steps were performed by default on each image. First the RGB
image is transformed into a grayscale image to obtain a univariate image. Then low pass
Gaussian filtering is applied to remove some camera CCD noise. Finally, the image is
cropped to exclude irrelevant data around the borders showing the aluminum container,
etc.
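The three default steps can be sketched as follows in Python with NumPy. This is an illustrative sketch only: the luminance weights, the filter width (sigma) and the crop margin (border) are assumed values, since the settings used in this work are not stated here.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1-D Gaussian kernel (sums to 1)."""
    if radius is None:
        radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def preprocess(rgb, sigma=1.0, border=0):
    """Grayscale conversion, Gaussian low-pass filtering, border crop."""
    rgb = np.asarray(rgb, dtype=float)
    # Luminance weights (ITU-R BT.601): an assumed RGB-to-gray mapping
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(gray, r, mode="reflect")
    # The Gaussian kernel is separable: filter along rows, then along columns
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    smooth = np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")
    if border > 0:
        # Drop a fixed margin, e.g. to exclude the container edges
        smooth = smooth[border:-border, border:-border]
    return smooth
```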
[Figure 58 axes: ∆BAD (-0.07 to 0) versus pitch % (ratio, 14 to 26); series: Coke A and Coke B]
Applying contrast enhancement to the images is the preprocessing option tested in this
section. Intensity saturation was chosen as the enhancement method. It was performed using
the imadjust function (MATLAB™), which adjusts the image gray level histogram to obtain 1%
of saturation at the minimum and maximum intensities of the image. When applied
independently to each image, it can also compensate for image-to-image lighting variations,
which adds robustness to the machine vision sensor. A downside of using contrast
enhancement is that the vision system may not be able to detect drifts (i.e. degradation) in
the lighting system or the imaging device in long term applications. This problem can be
addressed by monitoring the adjustments made by the contrast adjustment algorithm over time
in order to detect any drift.
The usefulness of the contrast enhancement was verified by comparing the performance of PLS
models of the features before and after adding this preprocessing step.
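The idea behind this saturation-based stretch can be illustrated with a short Python sketch. It is a NumPy approximation of the behaviour described above, not a port of imadjust: the function name stretch_contrast and the use of percentiles to locate the saturation limits are assumptions. The limits are returned with the image so that they can be logged and trended to detect lighting drift.

```python
import numpy as np

def stretch_contrast(gray, low_pct=1.0, high_pct=99.0):
    """Rescale intensities to [0, 1], saturating the darkest and brightest tails."""
    gray = np.asarray(gray, dtype=float)
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch
        return np.zeros_like(gray), (lo, hi)
    out = np.clip((gray - lo) / (hi - lo), 0.0, 1.0)
    return out, (lo, hi)
```

Because the limits are recomputed for each image, two images of the same texture taken under different gain and offset map to the same output, which is the robustness to lighting intensity discussed above.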
Two PLS models were computed using the energy of the DWT detail coefficients in X
(26×7) and ∆BAD in Y (26×1), with and without applying contrast enhancement. Note that
7 wavelet decomposition levels were calculated and the detail sub-images in all three
directions were averaged prior to computing the energy. The samples were split in 7
consecutive subsets for the cross-validation procedure (section 2.4). The number of
components was selected based on the smallest root mean squared error in prediction by
cross-validation (RMSEPCV) of the ∆BAD using equation 2.16.
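The construction of the 7 energy features can be sketched as follows in Python with NumPy. To keep the example self-contained, the Haar wavelet is used in place of the wavelet applied in this work, and the energy is taken as the unnormalized sum of squared coefficients; both are assumptions made for illustration.

```python
import numpy as np

def haar_dwt2(a):
    """One level of the orthonormal 2-D Haar DWT (even-sized input)."""
    s = np.sqrt(2.0)
    # Along each row: scaled sums (low-pass) and differences (high-pass)
    lo = (a[:, 0::2] + a[:, 1::2]) / s
    hi = (a[:, 0::2] - a[:, 1::2]) / s
    # Same operation along each column
    approx = (lo[0::2] + lo[1::2]) / s
    det_h = (lo[0::2] - lo[1::2]) / s   # horizontal details
    det_v = (hi[0::2] + hi[1::2]) / s   # vertical details
    det_d = (hi[0::2] - hi[1::2]) / s   # diagonal details
    return approx, (det_h, det_v, det_d)

def detail_energies(image, levels=7):
    """Energy of the direction-averaged detail sub-image at each level."""
    a = np.asarray(image, dtype=float)
    # Crop so that both dimensions can be halved `levels` times
    step = 2 ** levels
    a = a[: a.shape[0] // step * step, : a.shape[1] // step * step]
    feats = []
    for _ in range(levels):
        a, details = haar_dwt2(a)
        mean_detail = sum(details) / 3.0   # average the three directions
        feats.append(float(np.sum(mean_detail ** 2)))
    return np.array(feats)
```

Each image then contributes one row of 7 values to X, giving the 26×7 matrix described above.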
For interpretation, the variance explained (R2) for X and Y as well as the CVQ2 (equation
2.15) and RMSEPCV for Y are compared in Table 18. The RMSEPCV is an indication of
the model prediction error and it can be compared with the ∆BAD measurement error
standard deviation obtained from replicate samples, which is 0.019 g/cm3 in this case. The
RMSEPCV values are close to the measurement error, which suggests that the models are
adequate. Additionally, the score plots of the PLS models are also shown in Figure 59 for
interpretation purposes.
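The cross-validation statistics can be sketched as follows in Python with NumPy. Equations 2.15 and 2.16 are not reproduced in this section, so the common definitions are assumed here: RMSEPCV = sqrt(PRESS/N) and Q2 = 1 - PRESS/SS_total. The fit and predict arguments stand for any regression routine; an ordinary least squares stand-in is used so that the example runs on its own.

```python
import numpy as np

def contiguous_folds(n_samples, n_folds):
    """Split indices 0..n-1 into consecutive blocks of near-equal size."""
    return np.array_split(np.arange(n_samples), n_folds)

def cross_validate(X, y, fit, predict, n_folds=7):
    """Return (RMSEPCV, Q2) from predictions on left-out consecutive blocks."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    y_cv = np.empty_like(y)
    for test_idx in contiguous_folds(len(y), n_folds):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        model = fit(X[train], y[train])
        y_cv[test_idx] = predict(model, X[test_idx])
    press = np.sum((y - y_cv) ** 2)           # prediction error sum of squares
    rmsepcv = np.sqrt(press / len(y))
    q2 = 1.0 - press / np.sum((y - y.mean()) ** 2)
    return rmsepcv, q2

# Stand-in regression model (hypothetical, replaces the PLS routines)
def ols_fit(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def ols_predict(coef, X):
    return coef[0] + X @ coef[1:]
```

Repeating the call for 1, 2, 3, ... components and keeping the smallest RMSEPCV corresponds to the selection rule described above.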
Table 18 – Impact of adding contrast enhancement on PLS model statistics
Preprocessing | Number of LV | R2X (%) | R2Y (%) | R2Y LV1-2 (%) | CVQ2Y (%) | ∆BAD RMSEPCV (g/cm3)
No preprocessing | 4 | 99.69 | 72.77 | 33.92 | 36.69 | 0.016
Contrast enhancement | 4 | 98.94 | 84.19 | 77.29 | 72.42 | 0.010
Using contrast enhancement nearly doubles the cross-validated prediction ability for ∆BAD
(CVQ2Y increases from 36.69% to 72.42%). The estimated error standard deviation is also
reduced by 0.006 g/cm3. The explained variance of Y in calibration (R2Y) for the first two
LVs (LV1-2) is also presented in Table 18 since these components are used to compare the
interpretation of the models. Since the fit more than doubles for these 2 LVs with contrast
enhancement (from 33.92% to 77.29%), the interpretation is also expected to be much clearer
(Figure 59). The improvement in performance is due to the
elimination of the change in lighting intensity from one batch of experiments to the other
since these anodes were produced over a few weeks in the laboratory. Based on these
results, contrast enhancement was added as a standard preprocessing option for the
machine vision algorithm.
Figure 59 – Scores of the PLS models for the lab formulated anodes: a) no contrast
enhancement and b) with contrast enhancement
Figure 59 compares the impact of adding contrast enhancement on the interpretation of
both PLS models. The first 2 LVs are used since they capture most of the information in X
and Y (for the second model). These are the two orthogonal linear combinations of the
image textural features that are the most predictive of ∆BAD. The percentages shown in
the axis labels are the Y variance captured (R2) by each LV. Also, the markers are colored
according to their ∆BAD values (scale on the right hand side of the plot). Paste images
clustering close to each other in the score plots (i.e. similar score values) have similar
textural characteristics.
[Figure 59: score plots of t1 versus t2 for a) no contrast enhancement, t1 (10.84%) and t2 (23.08%), and b) contrast enhancement, t1 (36.71%) and t2 (43.81%). Markers are labeled by coke (A or B) and pitch ratio and colored by ∆BAD (-0.06 to 0 at the OPD); the under pitched and over pitched regions are annotated.]
Contrast enhancement has a significant beneficial impact on the ability of the imaging
sensor to detect the optimum BAD (hence the pitch demand of the coke), as revealed in Figure
59. Without contrast enhancement, the paste samples corresponding to the optimum BAD (i.e.
yellow markers) are spread across the LV space (Figure 59 a), but they cluster very clearly
in the north-east quadrant of the score plot when using contrast enhancement (Figure 59 b).
This preprocessing is therefore important for dark materials such as anode paste. It
enhances the textural information related to changes in pitch demand and makes the sensor
robust to irrelevant sources of variation such as lighting intensity.
6.3.3 Choice of wavelet
Wavelets from three distinct families and different support lengths were tested. Orthogonal
and biorthogonal wavelets were selected because they can be applied using the DWT
algorithm for fast computation and they also allow perfect reconstruction of the original
images from the detail coefficients and approximation sub-images (Wavelet Toolbox
Documentation 2015). Also, they roughly matched the image 1D signal (i.e. one line or one
column of the image). Hence, the symlets (sym), Daubechies (db) and biorthogonal (bior)
wavelets were selected. Also, for the sym and bior wavelets, different filter lengths
were used to verify whether the shape of the wavelet had an impact on the performance. The
same dataset was used as in section 6.3.2 with contrast enhancement. The statistics of
each PLS model are presented in Table 19.
Table 19 – Impact of wavelet type and filter length on PLS model statistics
Except for the symlet 24 (sym24) which has a lower performance in cross-validation (i.e.
Q2Y and RMSEPCV), the performance obtained with all types of wavelet and support
length is very similar. Hence, the original choice of the symlet 4 was kept for the
continuation of the work on the machine vision approach.
Wavelet | Number of LV | R2X (%) | R2Y (%) | R2Y LV1-2 (%) | CVQ2Y (%) | ∆BAD RMSEPCV (g/cm3)
sym4 | 4 | 98.94 | 84.19 | 77.29 | 72.42 | 0.010
sym14 | 4 | 98.47 | 84.60 | 78.35 | 74.45 | 0.010
sym24 | 4 | 98.44 | 84.47 | 78.45 | 63.20 | 0.012
db4 | 3 | 97.27 | 81.28 | 80.29 | 71.78 | 0.010
bior2.2 | 4 | 99.13 | 84.08 | 76.83 | 72.31 | 0.010
bior3.5 | 3 | 97.42 | 79.92 | 79.00 | 71.12 | 0.010
bior4.4 | 4 | 98.89 | 84.44 | 78.80 | 72.98 | 0.010
6.3.4 Selection of textural features
The final step to determine the image texture analysis methodology was to select which
textural features to use. Both the DWT and GLCM texture methods are used separately
and in combination to extract textural descriptors that are sensitive to relevant changes in
paste images (i.e. coke size distribution and pitch demand) and robust to irrelevant
sources of variations. The computed features are described in detail in section 3.3 of this
thesis. For the DWT method, the tested features are the energy (E), entropy (Ent),
standard deviation (Std), skewness (Skew) and kurtosis (Kurt). These were calculated
either on the DWT approximation sub-images (low-pass frequency information) obtained at
each scale or on the detail sub-images (high-pass information). The features computed
from GLCM are the angular second moment (ASM), entropy (Ent), contrast (Cont),
homogeneity (Hom) and correlation (Corr). These were calculated after applying GLCM
either directly on the preprocessed images or on the DWT detail sub-images. The
co-occurrence matrices were obtained at four angles (i.e. 0°, 45°, 90° and 135°) commonly
used in the literature (Haralick et al. 1973; Maillard 2003; Bharati et al. 2004) and multiple
distances (e.g. 1, 2, 3 and up to twelve in those references) to yield a multi-resolution
description of the paste texture.
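The GLCM computation can be sketched as follows in Python with NumPy, for three of the features named above (ASM, contrast and correlation). The definitions of the features are given in section 3.3, which is not reproduced here, so the standard Haralick-type formulas are assumed. The quantization helper, the symmetric normalization of the matrix and the averaging of the features over the four angles are also illustrative assumptions.

```python
import numpy as np

# Unit pixel offsets (dy, dx) for the four usual angles
ANGLE_OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def quantize(img, levels):
    """Map intensities to integer gray levels 0..levels-1."""
    img = np.asarray(img, dtype=float)
    span = img.max() - img.min()
    if span == 0:
        return np.zeros(img.shape, dtype=int)
    q = np.floor((img - img.min()) / span * levels).astype(int)
    return np.minimum(q, levels - 1)

def glcm(img, dy, dx, levels):
    """Symmetric, normalized co-occurrence matrix for the offset (dy, dx)."""
    img = np.asarray(img)
    h, w = img.shape
    r0, r1 = max(0, -dy), min(h, h - dy)
    c0, c1 = max(0, -dx), min(w, w - dx)
    ref = img[r0:r1, c0:c1].ravel()                      # reference pixels
    nbr = img[r0 + dy:r1 + dy, c0 + dx:c1 + dx].ravel()  # their neighbors
    P = np.zeros((levels, levels))
    np.add.at(P, (ref, nbr), 1)                          # count level pairs
    P = P + P.T                                          # both directions
    return P / P.sum()

def glcm_features(P):
    """ASM, contrast and correlation of a normalized GLCM."""
    i, j = np.indices(P.shape)
    asm = np.sum(P ** 2)
    contrast = np.sum((i - j) ** 2 * P)
    mu_i, mu_j = np.sum(i * P), np.sum(j * P)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * P))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * P))
    denom = sd_i * sd_j
    # Correlation is undefined for a single gray level; 1.0 is used by convention
    corr = np.sum((i - mu_i) * (j - mu_j) * P) / denom if denom > 0 else 1.0
    return asm, contrast, corr

def features_at_distance(img, d, levels):
    """Features at distance d, averaged over the four angles."""
    feats = [glcm_features(glcm(img, dy * d, dx * d, levels))
             for dy, dx in ANGLE_OFFSETS.values()]
    return np.mean(feats, axis=0)
```

Calling features_at_distance for several values of d on the quantized image (or on a quantized detail sub-image) gives the multi-resolution description mentioned above.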
No automatic feature selection methods were used in this work. Selecting only a subset of
features is a tradeoff between model complexity and monitoring ability. Model
interpretation is easier when fewer features are used, but the model loses generality for
process monitoring applications (e.g. it may miss new variations appearing in the images
that could have been captured by the features removed from the model). Whether or not to use
feature selection methods depends on the objective of the application. For these reasons, all
the detail images were used in all models even if image characteristics are often
confined to certain frequency bands in most texture analysis applications.
A set of nine PLS models was built between a selected subset (i.e. different
combinations) of features (X) and the ∆BAD (Y). The features were all computed on the
preprocessed images with contrast enhancement and the DWT was performed using the
wavelet sym4. For each model, the number of components was selected based on the
lowest ∆BAD RMSEPCV. The performance statistics are provided in Table 20 and the
interpretation of the models using the first two PLS scores (LVs 1 and 2) is presented in
Figure 60. Again, both were used for the feature selection.
Table 20 – Impact of different combinations of textural features on PLS model statistics
In terms of variance captured, all models perform well with R2X ranging from 79.2% to
99.9% and R2Y from 80.0% to 96.4%. The worst is model 5, with the DWT features computed on
the approximations only. Each approximation sub-image contains the low frequency
information from previous decomposition levels. It appears that this frequency content
degrades the prediction ability since it probably contains information unrelated to the
∆BAD variability (i.e. lighting variations and paste spreading patterns). The models based
on features computed from the DWT detail sub-images only (i.e. models 3 and 4) captured
more variance. This may be due to the fact that each decomposition level is orthogonal to the
others and contains unique textural information. The models based on the GLCMs (i.e.
models 8 and 9) computed directly on the images have a high fit on both X and Y matrices
and high predictive ability in cross-validation, but the lowest fit in the first two LVs only.
These two components are important because they capture the optimum pitch demand.
Finally, the best model seems to be number 7. All the features are based on the DWT
detail coefficients, but only energy, skewness and kurtosis and the GLCM ASM, contrast
and correlation are used. This model has one of the best predictive abilities, with the lowest
RMSEPCV (0.009 g/cm3, shared with model 9). The first two PLS components also capture the
greatest amount of variance of Y (80.52%) and this gives the best classification of the
optimum pitch demand from paste images as shown in Figure 60 d compared to the results
obtained with models 1, 4 and 9 (Figure 60 a-c).
Features | # | Number of features | Number of LV | R2X (%) | R2Y (%) | R2Y LV1-2 (%) | CVQ2Y (%) | ∆BAD RMSEPCV (g/cm3)
All | 1 | 155 | 3 | 83.41 | 83.05 | 79.64 | 69.32 | 0.011
DWT details and approximations (E, Ent, Std, Skew and Kurt) | 2 | 75 | 3 | 79.45 | 82.66 | 79.43 | 68.36 | 0.011
DWT details only (E, Ent, Std, Skew and Kurt) | 3 | 35 | 5 | 92.73 | 90.39 | 78.85 | 65.26 | 0.012
DWT details only (E, Skew and Kurt) | 4 | 21 | 6 | 96.04 | 92.19 | 77.36 | 67.06 | 0.012
DWT approximations only (E, Ent, Std, Skew and Kurt) | 5 | 40 | 3 | 79.21 | 79.99 | 75.97 | 67.71 | 0.011
DWT+GLCM details only (ASM, Cont and Corr) | 6 | 21 | 4 | 95.09 | 83.26 | 78.52 | 68.97 | 0.011
DWT details only (E, Skew and Kurt) and DWT+GLCM details only (ASM, Cont and Corr) | 7 | 42 | 7 | 96.77 | 96.39 | 80.52 | 76.88 | 0.009
GLCM on images (ASM, Ent, Cont, Hom and Corr) | 8 | 45 | 6 | 99.87 | 86.63 | 74.87 | 73.53 | 0.010
GLCM on images (ASM, Cont and Corr) | 9 | 27 | 6 | 99.85 | 88.00 | 74.86 | 77.22 | 0.009
The redundancy in the features was discussed in section 3.3.1 and in (Van de Wouwer et
al. 1999; Clausi 2002). For the DWT detail coefficient models (i.e. 3 and 4), removing the
redundant features entropy and std does not change the performance of the models but
the interpretation of the loadings is much simpler with fewer variables. In the case of the
GLCM features, model 9 with fewer features performs slightly better in prediction, with a
Q2Y that is 3.7 percentage points higher. Once again the loadings are easier to interpret. It
seems that for this machine vision application, the performance of the models is not affected
by feature redundancy, but since removing redundant features makes the interpretation
simpler, only non-redundant features were selected for the final application.
All the models in Figure 60 can detect the OPD based on the PLS model between the
image features and the ∆BAD. The black dashed lines in all plots show the direction of the
pitch % from low pitch to high pitch paste. The color map again shows the distance to the
optimum BAD (i.e. ∆BAD). All models can capture the differences between the under and
over pitched anodes, but the model with the best clustering of the OPD is presented in
Figure 60 d. It is the model combining the DWT and GLCM features on the DWT detail
coefficients (i.e. model 7 in Table 20).
Figure 60 – Score plots for the first two PLS components (LVs 1-2) of four models from
Table 20: a) model 1, b) model 4, c) model 9 and d) model 7
[Figure 60: four score plots of t1 versus t2 (panels a to d). Markers are labeled by coke (A or B) and pitch ratio and colored by ∆BAD (-0.06 to 0 at the OPD); the under pitched and over pitched regions are annotated. Axis labels shown: t1 (48.91%) and t2 (30.74%); t1 (40.62%) and t2 (36.74%); t1 (62.23%) and t2 (12.63%); t1 (36.71%) and t2 (43.81%).]
A schematic of the final choice of preprocessing, wavelet and features is presented in
Figure 61. After noise removal and contrast enhancement, 7 levels of DWT decomposition
are calculated using the symlet 4 (sym4) wavelet. The energy, skewness and kurtosis features
are calculated from the DWT detail sub-images. Finally, the GLCM ASM, contrast and
correlation features are calculated from the DWT detail coefficients at each decomposition
level for the number of distances and angles mentioned previously.
Figure 61 – Final image texture analysis procedure
6.4 Results
The detailed results for the three laboratory datasets are presented in this section. The
PLS models statistics, most informative scores as well as interpretation of the loadings are
provided.
6.4.1 Preliminary design on paste formulation
In this model, only the pitch and the fines weights in the paste were manipulated
while the coarse, intermediate and butts fraction weights remained constant. The amount
of pitch correlates directly with the wetness of the paste since adding more pitch
makes it look shinier. Adding fines should have the opposite effect, making the paste
drier by increasing the pitch demand due to the higher surface area of the fines.
[Figure 61 content: original image; pre-processing (RGB to grayscale, ROI, low pass filtering, contrast enhancement); DWT (sym4) detail coefficients at 7 levels (energy, skewness, kurtosis); GLCM (L(s) and average of θ) on the detail coefficients (ASM, contrast, correlation).]
Two PLS models were built on this dataset, one based on all individual paste samples (i.e.
replicates considered as separate observations in the data matrices) and a second where
the X and Y data for the replicated samples were averaged. The image features were
stored in X whereas the changes made to the formulation (i.e. fines and pitch %) were used as
Y data. The objective of building the PLS regression models was not so much to assess the
predictive ability for fines % and pitch %; rather, the Y data were used for supervised
clustering of the images to verify that variations in paste texture were indeed correlated with
changes in the formulation. The PLS model statistics are presented in Table 21. The
number of components was selected to minimize the RMSEPCV of both Y variables. The
standard deviations of the fines % and pitch % were 2.75 % and 0.98 % respectively. The
RMSEPCV of both variables is lower than the dataset variability, which means
that the model error is small.
Table 21 – PLS model statistics for changes in fines and pitch percentages in the paste formulation
The model built after averaging the replicate data performs well at capturing the texture
information (R2X) as well as the formulation variation (R2Y and Q2Y). The model built using
all samples (no replicate averaging) will be used to verify the repeatability of the
methodology and experimental procedure.
Model | Number of LV | R2X (%) | R2Y (%) | CVQ2Y (%) | Fines % RMSEPCV | Pitch % RMSEPCV
All samples | 4 | 90.65 | 87.17 | 64.74 | 2.32 | 0.38
Replicate averages | 3 | 86.15 | 98.06 | 89.33 | 1.27 | 0.31
[Figure 62 panels: a) score plot of t1 (58.16%) versus t2 (33.45%) with markers P-H_F-1 to P-H_F-5 and P-L_F-1 to P-L_F-5, annotated with high pitch, low pitch, wet paste and dry paste; b) w*q for LV1 (58.16%) and c) w*q for LV2 (33.45%), with bars grouped as E, Skew, Kurt, GLCM ASM, GLCM Cont, GLCM Corr and Y (Fines %, Pitch %) and numbered by DWT decomposition level (1 to 7).]
Figure 62 – Scores and loading weights of the PLS model (replicates averaged) for the
case where fines and pitch variations were introduced in the paste formulation: a) LV1-LV2
scores, b) weights and loadings of LV1 and c) weights and loadings of LV2
The interpretation of the PLS model built on averaged replicates is presented in Figure 62.
The variations introduced in the paste formulation clearly drive the paste textural features
in two main directions captured by the scores (Figure 62 a). The marker labels in the score
plot indicate both the level of pitch (P-L for low and P-H for high pitch) and the amount of
fines used in the formulation (F-1 the smallest and F-5 the highest). The first LV mainly
captures the variations in the amount of pitch, with the higher pitched anodes falling in the
positive t1 region and the lowest pitched anodes in the negative t1 range. LV1 also
captures some of the variations in fines in each of the two groups (low and high pitch). The
second component is associated with the changes in fines %. The amount of fines is
positively correlated with the t2 scores in this case. The loadings bi-plot shown in Figure 62
b) and c) can be used to understand how the textural features are influenced by the
changes in formulation. The color of the bars corresponds to each set of X features and Y
variables, and the numbers to the DWT decomposition level.
In the case of the first component LV1 (Figure 62 b), all loading weights for the energy and
the contrast (except in level 4) are positive. This indicates a positive correlation
between E and Cont and the amount of pitch, which means that this component
captures the shininess or reflectivity of the paste. A paste with more pitch on the surface is
more reflective and the total energy of that image (i.e. the sum of squares of all pixel
intensity values) increases. Since the DWT conserves the total energy of the image, this
increase in energy is visible in all the detail coefficients. The reflectivity of the paste
increases the specular reflection and so it increases the contrast captured by the GLCM on
the detail coefficients. Finally, the skewness and kurtosis decrease for levels 5-7 when the
amount of pitch increases. Skewness is a measure of the asymmetry of a distribution and
kurtosis is a measure of the heaviness of its tails. When both features decrease at the same
time, it is an indication that the values of the detail coefficients are more evenly
distributed and have a lower spread (i.e. narrower distribution). This indicates that the
paste texture is smoother at the lowest frequencies (i.e. larger details) when the pitch
increases.
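For reference, the two distribution features discussed here can be computed as in the following Python sketch (NumPy). The standardized third and fourth moments are assumed; the exact definitions used in this work are those of section 3.3, and whether an excess kurtosis (minus 3) is used there is not restated here.

```python
import numpy as np

def _standardize(x):
    """Flatten and scale values to zero mean and unit standard deviation."""
    x = np.asarray(x, dtype=float).ravel()
    sd = x.std()
    if sd == 0:
        raise ValueError("skewness and kurtosis are undefined for constant data")
    return (x - x.mean()) / sd

def skewness(x):
    """Third standardized moment: 0 for a symmetric distribution."""
    return float(np.mean(_standardize(x) ** 3))

def kurtosis(x):
    """Fourth standardized moment: about 3 for a normal distribution."""
    return float(np.mean(_standardize(x) ** 4))
```

Applied to the coefficients of one detail sub-image, these two functions give the Skew and Kurt features of that decomposition level.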
For the second component LV2 (Figure 62 c), the increase in fines also increases the
reflectivity of the paste because the energy features all have positive weights. However,
contrary to LV1, the skewness and kurtosis in decomposition levels 1-4 decrease with
increasing fines content, which suggests that less specular reflection is obtained since the
paste is more homogeneous in the high frequency levels. Furthermore, the ASM
decreases when the amount of fines decreases. This indicates that a finer paste has a
smoother appearance even if the reflectivity is increased.
The third LV of the PLS model is not presented in this thesis since it did not improve the
understanding of the features.
In summary, increases in both fines % and pitch % increase the paste reflectivity. The pitch
also increases the specular reflection, which makes the paste look rougher in the high
frequencies. However, the finest samples tend to appear
smoother due to less specular reflection in the high frequencies.
For this experiment, all the samples with the highest amount of pitch (P-H_F-#) were
prepared three times (replicated). Each mix was imaged only once for a total of three
images for each of the five fines fraction levels. No quantitative measurements other than
the paste images themselves were available to verify the repeatability of the paste
preparation method. It was postulated that the replicates should have similar paste textural
characteristics. The paste appearance is characterized by 42 image textural features and it
is not convenient to verify the repeatability of the fabrication and imaging procedure for
each feature separately. Since these are correlated, individual confidence limits on the 42
features can be misleading. However, the scores of the PLS model are orthogonal and
individual approximate uncertainty intervals can be computed on the scores of the paste
sample image features. To compute these uncertainty intervals, the PLS model built
using all the samples is used to obtain the score values for each sample, including the
replicated ones.
Figure 63 presents the one standard deviation intervals for the LV1 and LV2 scores values
calculated based on the replicated samples. In this figure, the markers represent the
average of the replicated score values and the error bars are set to one standard deviation
around the mean. All the samples, even those without replicates (i.e. low pitch samples)
were used to build the PLS model, but only the samples with replicates are shown in
Figure 63. Samples F-1, F-2, F-3 and F-4 can be discriminated completely while the error
bars of samples F-3 and F-5 slightly overlap along the second component. This indicates
that the machine vision sensor is sensitive to the variations in pitch and fines.
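The interval calculation can be sketched as follows in Python with NumPy. The function names are hypothetical, and the score values in the usage lines are invented for illustration only; they are not results from this work.

```python
import numpy as np

def replicate_intervals(scores, labels):
    """Mean and one standard deviation of the scores for each replicated sample."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    out = {}
    for name in np.unique(labels):
        grp = scores[labels == name]
        out[str(name)] = (grp.mean(axis=0), grp.std(axis=0, ddof=1))
    return out

def intervals_overlap(mean_a, sd_a, mean_b, sd_b):
    """True, per LV, where the one standard deviation intervals overlap."""
    return np.abs(mean_a - mean_b) <= (sd_a + sd_b)

# Illustrative (made-up) LV1 and LV2 scores for two samples, three replicates each
scores = [[0.0, 0.0], [0.2, 0.1], [-0.2, -0.1],
          [3.0, 0.0], [3.1, 0.2], [2.9, -0.2]]
labels = ["F-1"] * 3 + ["F-2"] * 3
stats = replicate_intervals(scores, labels)
overlap = intervals_overlap(*stats["F-1"], *stats["F-2"])
```

Two samples are discriminated along a given component when the corresponding entry of overlap is False.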
Figure 63 – Reproducibility of the imaging sensor in the case of the preliminary design on
formulation. The averaged LV1 and LV2 scores are shown for replicated samples along
with their one standard deviation error bars
6.4.2 Detailed design on paste formulation
In this series of experiments, the pitch demand of the paste was manipulated using more
parameters than just the amounts of pitch and fines. However, the greater number of
samples to prepare using the same lot of coke and butts aggregates forced the formulation
of small size paste samples (i.e. 450g). This was an important issue, particularly for the dry
aggregate fractions which were stored in large 20-25 kg buckets. Only 60g to 125g of each
constituent was needed for each paste sample. Thus, it was difficult to sample the
fractions from the bucket consistently and obtain a representative size distribution for each
dry aggregate fraction.
This problem was even more critical for the recycled butts fraction which contains a very
large distribution of particle sizes (i.e. from 2-3 cm to a few µm). To minimize sample to
sample variations, the full source sample was split into several smaller fractions of
approximately 100g using sample splitters. Even with careful manipulations, it was not
possible to obtain a constant size distribution in all split samples. This is illustrated in
Figure 64, where the size distributions for 5 split butts samples are shown. Variations in
the particle size distribution when preparing the paste samples were mainly due to the
coarser particle fractions. These inconsistencies in aggregate size distribution may affect
pitch demand unintentionally.
[Figure 63 axes: t1 versus t2; markers P-H_F-1 to P-H_F-5 with one standard deviation error bars]
Figure 64 – Butts size distribution span
As discussed in section 6.2.2, a certain number of paste samples were replicated. In
addition, two images were collected for each paste sample. This made it possible to assess
the reproducibility of the paste image itself separately from that of the entire experimental
procedure (i.e. sampling errors, etc.). Therefore, three PLS models were built using the
image features in X and changes to formulation in Y, but the data included in these
matrices depend on how the replicates were averaged. The first PLS model was based on
including all replicates of paste samples and images as rows in the data matrices (no
averaging at all). For the second model, the textural features of the two images collected
for each sample were averaged and for the third model, these features were also
averaged for each replicate sample (i.e. replicated formulation). The five variables included
in Y are the paste formulation percentages for each particle fraction (coarse,
intermediate, fines and butts) and the pitch.
[Figure: percentage of total sample (%) as a function of sieve size (mesh: Rt4, Rt10, Rt18, Rt30, Rt50, Pt500) for Samples 1 to 5]
Table 22 – PLS model statistics for the detailed design on paste formulation
Model                            Number of LV  R2X (%)  R2Y (%)  CVQ2Y (%)  Coarse % RMSEPCV  Inter % RMSEPCV  Fines % RMSEPCV  Butts % RMSEPCV  Pitch % RMSEPCV
All samples                      4             86.84    38.73    19.72      4.76              1.91             4.21             3.78             0.51
Replicated image averages        2             76.67    36.67    15.82      6.63              2.59             7.22             3.24             0.53
Replicated formulation averages  2             73.23    29.96    22.08      5.03              2.34             4.52             3.75             0.53
The statistics of the models are presented in Table 22. In this case, the number of
components was chosen to maximize the cross-validation Q2Y instead of minimizing the
RMSEPCV. This is because the RMSEPCV values are relatively high compared to the standard
deviations of the coarse %, inter %, fines %, butts % and pitch %, which are 4.56 %, 2.74 %,
4.59 %, 3.82 % and 0.52 % respectively. The captured variance (R2X) for the feature space (X)
is high while the numbers of LV are low (i.e. 4 for the all samples model and 2 for the
averages models). This is an indication that most of the changes performed in these
experiments (i.e. 8 types of variations) drive the paste image texture in only a few (i.e. 4
or 2) directions. This means that they have a similar effect on the paste visual appearance.
This was expected since the types of variations were carefully selected to excite the paste’s
pitch demand and particle size distribution in different manners. The prediction ability
(CVQ2) is low in this case. This is probably due to the wide particle size distribution of the
butts fraction. The Y data are also corrupted by errors, because the proportions in the
imaged samples may not be exactly those of the design conditions stored in Y. However, the
PLS models were not intended for predicting the formulation variables, but for interpreting
the image texture features based on the change in paste characteristics (i.e. pitch demand
and formulation). The supervised clustering of the image textural features as a function of
the change made to the formulation helps to remove some of the unwanted image variability,
as opposed to applying a PCA model on the features only.
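The comparison between the RMSEPCV values and the standard deviations can be reproduced from the numbers quoted above. The sketch divides each RMSEPCV of the all samples model (Table 22) by the corresponding standard deviation; the dictionary names are illustrative only.

```python
# RMSEPCV of the "All samples" model (Table 22) and the standard deviations quoted in the text.
rmsepcv = {"coarse": 4.76, "inter": 1.91, "fines": 4.21, "butts": 3.78, "pitch": 0.51}
std_dev = {"coarse": 4.56, "inter": 2.74, "fines": 4.59, "butts": 3.82, "pitch": 0.52}

# Ratio of the cross-validated prediction error to the spread of each formulation variable.
ratios = {name: rmsepcv[name] / std_dev[name] for name in rmsepcv}

for name, ratio in ratios.items():
    print(f"{name:>6}: RMSEPCV / standard deviation = {ratio:.2f}")
```

A ratio close to 1 means that the cross-validated error is about as large as the spread of the variable itself, which is consistent with the low CVQ2Y values reported for these models.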
The scores and loadings bi-plots for the second model (i.e. the replicated image averages
model) are presented in Figure 65. The scores for the X and Y spaces are presented in
Figure 65 a) and b). The light blue arrow indicates the direction in the score space
dominated by the pitch demand variations (e.g. changes in amount of pitch, fines and shot
coke). The light gray arrow indicates the direction capturing changes in the size distribution
of the aggregate mix (i.e. formulation). However, the formulation variations contribute to
both directions (Figure 65 a) since variations in the aggregate size also affect the pitch
demand. The interpretation of the textural features based on these two main directions is
not straightforward using the loadings bi-plots (Figure 65 c and d) since the axes (i.e. LVs)
are not aligned with them. That is, both pitch demand and formulation have contributions in
both components.
Figure 65 – Scores and loadings weights of the PLS model built on averaged replicated
samples data for the case of the detailed design on formulation: a) X scores on LV1 and
LV2, b) Y scores on LV1 and LV2, c) weights and loadings of LV1 and d) weights and
loadings of LV2
To improve the interpretation, contribution plots are used instead to highlight the changes
in the image features that distinguish two groups of observations. Contribution plots show
differences between observations more specifically than the loading plots. Groups
of observations showing variations due to changes in pitch ratio, amount of shot coke and
fines, and aggregate size are illustrated in the score plots shown in Figure 66 a), c), e) and
g). The corresponding contribution plots are presented next to each score plot (Figure 66
b, d, f and h). The contribution of each image textural feature is computed using equation
2.21. The contribution plots show the change in the features from group 1 (i.e. ellipse) to
group 2 (i.e. rectangle) in each score plot. The arrow also indicates the direction of the
change under study.
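As an illustration of how such a group-to-group contribution can be computed, the sketch below uses an assumed form: the difference between the mean feature values of the two groups, weighted by the magnitude of the w* weights over the retained components. Equation 2.21 is not reproduced in this section, so this form, the function names and the toy numbers are assumptions and may differ from the definition actually used.

```python
from math import sqrt
from statistics import fmean


def group_mean(rows):
    """Column-wise mean of a list of preprocessed feature rows."""
    return [fmean(col) for col in zip(*rows)]


def score_contributions(group1, group2, w_star):
    """Assumed contribution of each feature to the move from group 1 to group 2.

    w_star[j][a] is the w* weight of feature j on component a.
    The sign of each contribution is the sign of the change in the feature.
    """
    mean1, mean2 = group_mean(group1), group_mean(group2)
    contributions = []
    for j, weights in enumerate(w_star):
        magnitude = sqrt(sum(w * w for w in weights))
        contributions.append((mean2[j] - mean1[j]) * magnitude)
    return contributions


# Hypothetical toy example: 3 features, 2 components, 2 observations per group.
group1 = [[0.0, 1.0, 2.0], [0.0, 3.0, 2.0]]
group2 = [[1.0, 1.0, 2.0], [3.0, 1.0, 2.0]]
w_star = [[0.6, 0.8], [0.3, -0.4], [0.0, 1.0]]

contrib = score_contributions(group1, group2, w_star)
print([round(c, 3) for c in contrib])
```

With this form, a positive bar means that the feature is higher in group 2 than in group 1, which matches the way the contribution plots are read in the discussion that follows.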
[Figure 65 graphic: a) X scores, t1 (24.85%) vs. t2 (5.10%); b) Y scores, u1 (24.85%) vs. u2 (5.10%); c) w*q of LV1 (24.85%) and d) w*q of LV2 (5.10%) plotted against the variables. Sample groups: BL, Base, Butts, Fines, Mix_Temp, Mix_t, Pitch, SD, Shot. Feature groups: E, Skew, Kurt, GLCM ASM, GLCM Cont, GLCM Corr; Y: Coarse (%), Inter (%), Fines (%), Butts (%), Pitch (%). Arrows: Formulation, Pitch demand.]
Figure 66 – Interpretation of the PLS model built using averaged replicated samples data
for the case of the detailed design on formulation. Variations in the scores and associated
contribution plots: a) and b) increase in the pitch ratio, c) and d) shot coke addition, e) and
f) decrease in the fines ratio and g) and h) change from a coarser to a finer formulation
[Figure 66 graphic: a), c), e) and g) score plots, t1 (24.85%) vs. t2 (5.10%), with groups 1 and 2 marked; b), d), f) and h) w* contributions for LV1-2 plotted against the variables. Sample groups: Base, Fines, Pitch, SD, Shot. Feature groups: E, Skew, Kurt, GLCM ASM, GLCM Cont, GLCM Corr.]
The information extracted from the score plots shown in Figure 66 a), c) and e) focuses on
changes along the pitch demand direction, where the paste appearance evolves from a
drier to a wetter appearance. In Figure 66 a), the pitch content was increased from -1.4 to
+1.6% for the same dry aggregate, which drives the paste appearance from dry to wet. In
Figure 66 c), the addition of shot coke to the base mix also increases the wetness of the
paste. Both contribution plots (Figure 66 b and d) are very similar. The energy and contrast
features increase at the high and low frequencies (levels 1, 2, 6 and 7) while decreasing in
the middle frequencies. The accumulation of pitch on the surface of the particles increases
specular reflection and this appears as small details (i.e. high frequency) in the images. At
the same time, the skewness, kurtosis and correlation also increase in the high frequency
detail coefficients. The decrease in skewness and kurtosis in the large details corresponds
to a smoother texture at these frequencies. As the pitch demand decreases, more pitch
saturates the surface of the paste, which smooths the surface while creating high frequency
reflectivity. This is very similar to the behavior observed in the previous experiment
(section 6.4.1).
The change in the fines fraction is presented in Figure 66 e) and f). The contributions are
shown for an increase in the fines %. The effect of this change on the texture features is
similar to what was observed previously (Figure 62 c) in the preliminary design on
paste formulation. The fines addition tends to smooth the surface texture and decrease the
specular reflection (i.e. high frequency content). This is shown by the decrease of the
energy and contrast in the high frequency levels 1 and 2. Also, the detail images of the
higher fines pastes are more uniform since the kurtosis and skewness decrease compared
to the lower fines samples.
Finally, the variable contributions for the change from the coarser mixes (i.e. SD_+C+I and
SD_+I) to the paste with less intermediate size coke (i.e. SD_-I) are presented in Figure 66
g) and h). This contribution plot captures the effect of the particle size changes on the
paste textural features. The experiments made to change the dry aggregate size distribution
moved the paste sample appearance along a direction almost orthogonal to pitch demand. In
this case, increasing the fineness of the aggregates by removing particles in the intermediate
fraction leads to a smoother paste (i.e. lower skewness and kurtosis). It also concentrates
the information (i.e. energy) in the low frequency band and increases the contrast of the
low level decomposition details. These effects are similar to the fines fraction changes
presented in Figure 66 f). The main difference however is in the correlation features. In this
case, only the intermediate fraction was replaced by fines, instead of all other fractions
as was done previously. This seems to have more effect on the correlation features
than previously observed in Figure 66 b), d) and f), where only the correlation of
decomposition level 2 has a high contribution. In this case, removing only the intermediates
seems to make the image texture more regular and smooth since the correlation
increases more in the low frequency details.
Two additional observations can be made from this figure. In Figure 66 c), the spread of the
base mix is almost perpendicular to the pitch demand direction. This is an
indication of the variability of the particle size distribution of the dry aggregate mix from
sample to sample. In addition, in Figure 66 g), the position of the finer formulation paste
(i.e. SD_-C-I) in the bottom left corner indicates that the high content of fines affected
both the pitch demand and the size distribution of the paste.
Two types of replicates were available in these experiments to assess the reproducibility of
the results. Each paste sample was imaged twice to verify the repeatability of the imaging
system itself. In addition, some of the mixes were repeated twice and the base mix four
times to check the repeatability of the paste mixing methodology. Reproducibility is
assessed in the same way as for the preliminary set of experiments. A PLS model was built on
all the individual images available (no averaging at all). Then, the averages and standard
deviations of the scores were computed for the image replicates and the mix replicates for
both model components. The results are presented in Figure 67.
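The replicate statistics described above can be sketched as follows. The labels and score values are hypothetical, and the sample standard deviation is an assumption since the estimator is not specified here.

```python
from collections import defaultdict
from statistics import fmean, stdev


def replicate_summary(labels, scores):
    """Mean and sample standard deviation of the (t1, t2) scores of each replicated label."""
    groups = defaultdict(list)
    for label, score in zip(labels, scores):
        groups[label].append(score)
    summary = {}
    for label, members in groups.items():
        if len(members) < 2:
            continue  # a standard deviation needs at least two replicates
        columns = list(zip(*members))
        summary[label] = ([fmean(c) for c in columns], [stdev(c) for c in columns])
    return summary


# Hypothetical (t1, t2) scores from a model built on all individual observations.
labels = ["base", "base", "base", "base", "P_+1.6%", "P_+1.6%", "SD_-I"]
scores = [(-1.0, 0.5), (1.0, 1.5), (0.0, 1.0), (0.0, 1.0), (6.0, -2.0), (8.0, -2.0), (-4.0, 3.0)]

summary = replicate_summary(labels, scores)

# Compare each interval with the overall range of t1 values.
t1_values = [s[0] for s in scores]
t1_range = max(t1_values) - min(t1_values)
for label, (means, stds) in summary.items():
    print(f"{label}: t1 = {means[0]:.2f} +/- {stds[0]:.2f} ({stds[0] / t1_range:.0%} of the t1 range)")
```

Expressing each standard deviation as a fraction of the score range is one simple way to judge whether replicates can be distinguished from the designed changes.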
The approximate uncertainty intervals (±one standard deviation) for replicated images are
presented in Figure 67 a). The intervals for these replicates are small in comparison with
the range of score values. This indicates that the imaging system and feature extraction
produce consistent results for most of the paste samples.
The results for the mix replicates are shown in Figure 67 b). Only the score values for the
replicated paste mixes are shown in the plot. In this case, only the samples obtained by
changing the pitch ratio seem distinguishable from the other paste samples. This indicates
that there is a large variability in the sample preparation method. The principal cause of
this variation comes from the large particle size distribution of the coke and butts fractions
and the difficulty of ensuring uniform and consistent sampling of the aggregates. However,
the model can still be used for the interpretation of the relationship between the changes in
the paste and the image texture variations.
Figure 67 – Reproducibility of the imaging sensor in the detailed design on formulation.
The averaged LV1 and LV2 score values are shown for a) image replicates and b) mix
replicates along with their one-standard deviation error bars
6.4.3 Pitch optimization experiment anodes
This dataset was used to select the preprocessing, wavelet type and features for the
machine vision algorithm. The experiment was designed to find the optimum pitch demand
of each coke based on the same formulation by finding the amount of pitch necessary to
obtain the maximum BAD. The particularity of this dataset is that each paste sample was
pressed and baked so the baked anode density can be used as a quantitative
measurement of the anode quality.
For this dataset, only one image was captured for each sample. However, each pitch level
for both cokes was repeated at least twice and up to four times for certain samples.
The X matrix contains the image textural features and the ∆BAD is stored in Y. The PLS
model statistics and scores have been presented and discussed in section 6.3 and Figure
59 b), but the loading weights were not interpreted. The PLS model based on the
averaged features is used for interpretation whereas the model computed on all samples
enables a comparison of the repeated samples.
The statistics for both models are available in Table 23. In this case, the model
performances are very good. Both the X and Y captured variance (R2) are high. The
predictive ability (CVQ2) is very close to the R2Y with less than 4% difference for the
averaged replicates model. Also, the RMSEPCV of the models are low compared to the
standard deviation (0.019 g/cm3) of the measured ∆BAD.
[Figure 67 graphic: t1 (LV1) vs. t2 (LV2) score plots with error bars for a) image replicates and b) mix replicates. Sample groups: BL, Base, Butts, Fines, Mix_Temp, Mix_t, Pitch, SD, Shot.]
Table 23 – PLS model statistics for the pitch optimization experiments
Model               Number of LV  R2X (%)  R2Y (%)  R2Y LV1-2 (%)  CVQ2Y (%)  ∆BAD RMSEPCV (g/cm3)
Replicate averages  7             96.77    96.39    80.52          76.88      0.009
All samples         5             89.78    85.45    73.58          63.70      0.012
A comparison of the predicted and measured ∆BAD is presented in Figure 68. This figure
shows that all the predictions closely fit the measured values.
Figure 68 – Comparison of the predicted and measured ∆BAD for the replicated averages
model
[Figure 68 graphic: predicted ∆BAD vs. measured ∆BAD, both axes from -0.06 to 0; markers: Coke A, Coke B]
The scores and loadings bi-plots for the averaged replicates model are presented in Figure
69. The LV1 and LV2 scores are presented individually to improve the interpretation. For
the score plots in Figure 69 a) and c), the shape of the markers corresponds to the type of
coke, the color map to the ∆BAD values, and the labels indicate the combination of coke
type and pitch ratio. It is interesting to note that the pastes at the OPD for both cokes have
a similar visual appearance (i.e. texture features).
Figure 69 – Scores and loadings of the PLS model (averaged replicates) for the pitch
optimization experiments: a) LV1 scores, b) LV1 weights and loadings, c) LV2 scores and
d) LV2 weights and loadings
[Figure 69 graphic: a) t1 (36.71%) and c) t2 (43.81%) plotted for each sample, labelled by coke (A or B) and pitch %, coloured by ∆BAD (scale from -0.06 to 0), with the OPD marked; b) w*q of LV1 (36.71%) and d) w*q of LV2 (43.81%) plotted against the variables. Feature groups: E, Skew, Kurt, GLCM ASM, GLCM Cont, GLCM Corr; Y: delta BAD. Markers: Coke A, Coke B. Axis labels recovered: Samples, Pitch %, Variables.]
The LV1 scores and loading weights are shown in Figure 69 a) and b). This component
captures the variations introduced in the pitch ratio. The values of t1 clearly increase with
the pitch %. No optimum is captured by this component as the values of the scores do not
decrease when the paste is over-pitched. The interpretation of this component is similar to
that of the PLS models built in previous experiments where the pitch ratio was varied. The
energy increases proportionally with the pitch %, which corresponds to higher reflectivity.
The texture is rougher in the high frequencies compared to the low frequencies. Indeed, the
skewness and kurtosis of levels 1-3 increase with the pitch % while they decrease in the
lower frequency range (i.e. levels 5-7). Finally, the contrast features, which are a measure
of heterogeneity, also increase with the pitch %.
The second component, however, does capture the optimum BAD based on the image
textural features. The t2 values are negative when the anodes are under-pitched and
over-pitched while optimum pastes have high positive t2 values. For all features except the
correlation features, decomposition level 4 (i.e. resolution of +0.65 mm/-1.30 mm) has the
strongest weights. Pastes at the optimum pitch demand concentrate their energy in this
frequency band. The contrast on the GLCM matrix is also the highest at
this level. Finally, the kurtosis measuring the spread of the distribution of the detail
coefficients is the smallest for the OPD pastes. It seems that the textural details of about
1 mm in length (i.e. approximately 25 pixels) are the most sensitive to the OPD. The other
features are characterized by a contrast in the positive vs. negative weight values in the
different decomposition levels.
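The link between a decomposition level and a physical detail size can be checked with a few lines of arithmetic. The sketch assumes that 1 mm spans exactly 25 pixels (the text gives this as an approximation) and that level j covers details between 2^j and 2^(j+1) pixels, which is the usual dyadic interpretation; the function name is illustrative.

```python
# Assumption: about 25 pixels per millimetre, as quoted in the text.
MM_PER_PIXEL = 1.0 / 25.0


def detail_band_mm(level, mm_per_pixel=MM_PER_PIXEL):
    """Size band (min, max) in mm for a dyadic level covering 2**level to 2**(level + 1) pixels."""
    return (2 ** level * mm_per_pixel, 2 ** (level + 1) * mm_per_pixel)


for level in range(1, 8):
    low, high = detail_band_mm(level)
    print(f"level {level}: {low:.2f} mm to {high:.2f} mm")
```

Under these assumptions level 4 spans 16 to 32 pixels, i.e. 0.64 mm to 1.28 mm, which agrees with the +0.65 mm/-1.30 mm resolution quoted for that level to within the rounding of the pixel size.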
The scores of components 3 and 4 are shown in Figure 70. Although they do not show any
specific pattern related to the changes made in pitch percentage, the maximum density
anodes do cluster in the center of the score plot. Based on this observation, these LVs
may also be useful for detecting the optimum pitch demand. The loadings bi-plots are not
shown since their physical meaning is not straightforward. Components 5-7 are not shown
since they mainly capture variations that are not related to the pitch demand.
Figure 70 – Scores of the 3rd and 4th components of the PLS model built on the pitch
optimization dataset (averaged features)
The reproducibility of the results obtained in these experiments is now assessed using the
replicated samples. The PLS model trained using all individual samples was again used to
compute the average and standard deviations of the scores for replicated samples. The
results are presented in Figure 71 for the first two components (most highly correlated with
BAD). In this case, the uncertainties are small enough to allow the bell-shaped curve to be
clearly distinguished from experimental errors. The sensor therefore seems sensitive to
the variations in pitch demand of the paste.
[Figure 70 graphic: t3 (2.56%) vs. t4 (4.26%) score plot; points labelled by coke (A or B) and pitch %, coloured by ∆BAD (scale from -0.06 to 0), with the OPD marked; markers: Coke A, Coke B]
Figure 71 – Reproducibility of the imaging sensor in the pitch optimization experiments.
The scores of the first two components of the PLS model built on all samples are shown
along with one standard deviation error bars
[Figure 71 graphic: t1 (LV1) vs. t2 (LV2) score plot with error bars; points labelled by coke (A or B) and pitch %; markers: Coke A, Coke B]
Good results were obtained for detecting the optimum pitch demand for two different coke
sources. It is highly probable that this will also be the case for other coke sources. The
variation in porosity of different cokes should have a similar effect as a change in pitch
demand as this will change the amount of pitch needed to fill the pores. The OPD is
correlated to the thickness of the pitch layer on the coke particles (Adams et al. 2002).
Since this thickness should be similar for different coke sources, the sensor should be
robust to these raw material variations.
6.5 Conclusion
There is a need in the aluminium industry to develop new non-destructive on-line
measurements for the characterization of the anode production process. Anode producers
face an increase in variability of the anode raw materials (i.e. coke and pitch). To find a
good compromise between raw material costs and anode quality, materials are blended
from different suppliers, including some lower cost/quality raw materials. Producing
anodes with a consistent quality is not straightforward using the current quality control
strategy. The key raw material and anode properties are currently measured in the
laboratory on a limited number of samples and the results are typically available after long
delays due to the production time and also sampling and analysis delays.
Anode paste quality is defined in terms of the resulting baked anode properties. These are
influenced by the raw material properties, the formulation (such as the particle size
distribution and the pitch ratio) and also the process operating conditions. The hypothesis
tested in this thesis was that changes in the particle size, pitch ratio and mixing parameters
affect the paste visual appearance (i.e. its texture). Therefore, it is possible to
quantify the variations in paste quality using the right combination of image analysis
methods.
Image analysis methods are well suited for on-line measurement sensors of product
appearance. The objective of this chapter was to develop a machine vision sensor for the
anode paste characterization based on its image texture. The first aim was to demonstrate
its sensitivity to changes in paste formulation and pitch demand on paste samples formulated
in the laboratory. Second, the laboratory experiments were designed to gain an
understanding of the relationships between the paste surface texture (i.e. image features)
and its macro properties (i.e. formulation and pitch demand) to help the interpretation of
the sensor results on real industrial paste images.
The image analysis methodology used for the paste machine vision sensor is based on
contrast enhancement preprocessing, image texture analysis methods and feature
extraction. It was found that contrast enhancement preprocessing improved the
performance of the image analysis method and made it possible to detect an optimum in the
pitch optimization experiments. Finally, a combination of DWT and GLCM features computed
on the image DWT detail coefficients of seven levels of decomposition was found best suited
to capture the texture information contained in the images.
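To make the wavelet part of this feature set concrete, the sketch below computes the energy, skewness and kurtosis of the detail coefficients at each decomposition level. It is a simplified illustration: a Haar wavelet written in plain Python stands in for the wavelet actually selected (not named in this section), the three detail orientations are pooled, the preprocessing and GLCM features are omitted, and all function names are hypothetical.

```python
from statistics import fmean


def haar_step(image):
    """One level of an orthonormal 2-D Haar transform (image dimensions must be even).

    Returns the approximation and the [horizontal, vertical, diagonal] detail sub-images.
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((p + q + s + t) / 2.0)
            h_row.append((p + q - s - t) / 2.0)
            v_row.append((p - q + s - t) / 2.0)
            d_row.append((p - q - s + t) / 2.0)
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, [horiz, vert, diag]


def skewness_kurtosis(values):
    """Moment-based skewness and kurtosis; (0.0, 0.0) if the values are constant."""
    mean = fmean(values)
    m2 = fmean([(v - mean) ** 2 for v in values])
    if m2 == 0.0:
        return 0.0, 0.0
    m3 = fmean([(v - mean) ** 3 for v in values])
    m4 = fmean([(v - mean) ** 4 for v in values])
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def wavelet_features(image, levels):
    """Energy, skewness and kurtosis of the pooled detail coefficients at each level."""
    features, current = [], image
    for level in range(1, levels + 1):
        current, details = haar_step(current)
        coeffs = [v for sub in details for row in sub for v in row]
        skew, kurt = skewness_kurtosis(coeffs)
        features.append({"level": level, "energy": sum(v * v for v in coeffs),
                         "skewness": skew, "kurtosis": kurt})
    return features


# Toy 4 x 4 checkerboard: all the texture lives in the finest (level 1) details.
image = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
feats = wavelet_features(image, 2)
for f in feats:
    print(f)
```

Seven levels of decomposition would require image dimensions divisible by 2^7 = 128 with this simple implementation.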
To interpret the texture features, PLS models were used as a supervised classification
algorithm to force the latent variable space to explain the desired paste fabrication
variations by these features. It was found that the amount of pitch on the surface of the
particles influences the reflectivity of the paste (i.e. shininess) and that the specular
reflection was captured by the high frequency levels. Also, pastes with high pitch levels or
low pitch demand have a smooth texture in the low frequencies. Finally, finer formulation
mixes tend to smooth the surface of the paste in all frequencies compared to the coarser
pastes.
The main difficulty in comparing the paste mixes together was to obtain quantitative
information about the paste quality. This was done in the final laboratory experiment by
performing a pitch demand optimization based on the maximum anode BAD.
This experiment showed that it is possible to capture the OPD using the machine vision
algorithm. Also, the texture features of the paste at the OPD for both types of coke were
similar even if the pitch ratio and BAD values were not the same for these cokes.
Chapter 7 Industrial paste imaging
7.1 Introduction
The experiments performed in the laboratory were useful to develop the image texture
analysis methodology for the anode paste images. They were also useful for understanding the
relationships between the variations in paste visual appearance captured by the textural
features and the changes introduced in coke source, formulation and processing
conditions. The next step of this project was aimed at testing the machine vision algorithm
on samples collected directly from an industrial paste plant.
Collecting data and samples and tracking anodes from an industrial carbon plant is not
straightforward due to the size and complexity of the manufacturing process. For example,
Deschambault’s carbon plant produces approximately 3200 anodes every week. The
paste plant processes 32 tons per hour of anode paste. Also, the 2 baking furnaces
contain 32 sections of 6 pits with 16 anodes per pit. This means that there are more than
5000 anodes loaded in the baking furnaces at any given time. Finally, the plant also stores
an inventory of green and baked anodes.
Three major difficulties were faced during these tests. First, obtaining quantitative
measurements of the paste and/or anode quality is not straightforward in comparison with
laboratory experiments. For example, measuring the BAD of a particular anode requires
tracking the green anode block in the paste plant and in the baking furnace, and then drilling
a core sample from the block after it is unloaded from the furnace. Tracking the anode
block requires human resources and logistics and is often difficult to implement with high
accuracy. Second, the heat-up rate and final baking temperature of the anodes depend
upon the location where they were baked in the furnace. These two parameters are known
to have a strong impact on the anode properties (Fischer et al. 1993). Therefore, it is
necessary to account for the effect of the baking process on BAD in addition to raw
material properties and paste formulation. This was not necessary in the laboratory
experiments because the green anode samples were baked in a smaller furnace where
baking conditions were controlled and homogeneous (i.e. core samples were all baked
under very similar conditions). Third, the range of the variations that can be safely applied
during the operation of the plant is limited to avoid the production of defective anodes or
a breakdown of the plant. This limits the range over which it is possible to test the
sensitivity of the image texture analysis method.
The objective of this chapter is to test the robustness and the sensitivity of the machine
vision sensor developed in Chapter 6 on industrial paste samples using different datasets
obtained from sampling campaigns conducted at the Alcoa Deschambault (ADQ) smelter’s
carbon plant.
The robustness and sensitivity of the machine vision sensor are first studied under different
process operating conditions, including normal operation, plant start-up and pitch
optimization experiments (sections 7.3.1, 7.3.2, 7.3.3, respectively) by building PLS
regression models between the image textural features and the formulation variables,
similarly to what was done in the laboratory development phase. The aim was again to
perform supervised clustering of the paste image features based on the changes made to
the manipulated variables in the paste plant as part of the industrial sampling campaign.
In a second step (section 7.4), the textural features computed for each paste sample
were added to the larger database including raw material properties, formulation and
process conditions in order to establish relationships between the information extracted
from the images and the data collected in the different parts of the plant. This also
provides an assessment of the added value brought by using the machine vision sensor in
addition to the data already routinely available at the plant.
This chapter is organized as follows. First, the paste sampling, imaging procedure and
data synchronization are discussed in section 7.2. This is followed by the results section
(7.3) where the datasets and the interpretation of the features and the robustness of each
industrial experiment are discussed. The fusion of the sensor’s data to the plant datasets
using the new SMB-PLS algorithm is presented in section 7.4. Finally, some conclusions
are drawn.
7.2 Sampling and data synchronization
For all the experiments performed in the ADQ paste plant, the routine operating conditions
data were collected. This includes the raw material properties, the formulation ratios, the
particle size distribution of the dry aggregate mix and the processing conditions of the
mixing and anode forming units (e.g. temperatures, mixing energy, etc.). These data
needed to be synchronized to account for the residence time within each piece of
equipment and dead-times introduced by conveyor belts. The synchronization procedure
was described in previous work (Lauzon-Gauthier 2011). The only difference with the
procedure used in this thesis is that the basis for the synchronization is the paste sampling
time instead of the anode forming time (approximately 4 minutes difference between the
two events). The synchronization schedule was adjusted accordingly.
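The idea of aligning each process variable with the paste sampling time can be sketched as a lagged lookup. This is only an illustration under stated assumptions: the variable names, logged values and lags (in minutes) are hypothetical, and the actual procedure is the one described in the previous work cited above.

```python
from bisect import bisect_right


def value_at(times, values, t):
    """Most recent logged value at or before time t (times sorted ascending), else None."""
    i = bisect_right(times, t)
    return values[i - 1] if i else None


def synchronize(sample_time, logs, lags):
    """Build one data row by reading each variable at sample_time minus its assumed lag."""
    return {
        name: value_at(times, values, sample_time - lags[name])
        for name, (times, values) in logs.items()
    }


# Hypothetical logs: (timestamps in minutes, logged values).
logs = {
    "mixer_temperature": ([0, 5, 10, 15, 20], [168.0, 169.0, 171.0, 170.0, 172.0]),
    "dry_aggregate_fines": ([0, 10, 20], [0.1, -0.2, 0.3]),
}
# Hypothetical residence and dead times between each measurement point and the sampling point.
lags = {"mixer_temperature": 3, "dry_aggregate_fines": 12}

row = synchronize(20, logs, lags)
print(row)
```

Changing the basis of the synchronization, as was done here, amounts to shifting every lag by the same offset.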
In order to preserve the confidentiality of the industrial data, all the process data presented
in this chapter were mean-centered. The pre-processing was done for each set of
experiments independently since the range of variation was wide due to the large time gap
between each experiment. The mean-centering hides the absolute values of each
variable, but the variability is preserved so the variations in the data can be interpreted in
real engineering units.
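A minimal sketch of this pre-processing is given below, with hypothetical pitch values for two experiments. It shows that centering each experiment independently removes the absolute level while leaving the variance, and therefore the units of the variations, unchanged.

```python
from statistics import fmean, pvariance


def mean_center(values):
    """Subtract the mean so that only the deviations from the mean remain."""
    mean = fmean(values)
    return [v - mean for v in values]


# Hypothetical pitch % values for two sampling campaigns.
experiments = {
    "normal_operation": [15.2, 15.6, 15.4, 15.9, 15.4],
    "start_up": [16.1, 16.4, 15.8, 16.3],
}

# Each experiment is centered on its own mean.
centered = {name: mean_center(values) for name, values in experiments.items()}

for name in experiments:
    print(name, round(pvariance(experiments[name]), 4), round(pvariance(centered[name]), 4))
```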
Finally, at each sampling time, three aluminium containers were filled with anode paste.
The three paste samples were assumed to be true replicates since they were collected
within a very short period of time. It was thus possible to capture three different images for
every sampling time (i.e. one from each container). The paste samples were grabbed
manually from the conveyor belt and it was not straightforward to sample the whole flow of
paste with the sampling bucket. Most of the results presented in this chapter were
obtained after averaging the features of the three images of replicate paste samples. This
decision was made to average out the differences due to the manual sampling variability. The
uncertainties due to sampling will also be quantified using the standard deviations of the
scores of the replicate samples.
7.3 Datasets and results
This chapter is organized differently compared to the previous chapter (Chapter 6). The
datasets and the results are presented in the same section since the interpretation of the
variations in the formulation is important to the understanding of the image features.
However, similarly to the laboratory development stage (section 6.4), two PLS models are
built on each dataset consisting of the image features (X) and the formulation data (Y).
One model is computed from all the individual images to obtain an estimate of the
uncertainties based on the replicated samples and the second model is built after
averaging the X and Y data for the replicated samples. The PLS scores and loadings are
interpreted using the second model (averaged replicate data) for the sake of simplicity.
For all the PLS models, the number of components was selected to minimize the
RMSEPCV (equation 2.16) of all formulation variables. For the PCA models, the CVQ2 of
the X dataset was used instead. The cross-validation procedure was implemented by
splitting the dataset into 7 consecutive blocks.
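The cross-validation scheme can be illustrated as follows. The sketch splits the observations into 7 consecutive blocks and pools the squared errors of the held-out blocks into a single RMSEPCV for any model supplied as a function. The near-equal block sizes, the function names and the placeholder mean model are assumptions made for the example; the PLS fit itself and equation 2.16 are not reproduced here.

```python
from math import sqrt


def consecutive_blocks(n_samples, n_blocks=7):
    """Split indices 0..n_samples-1 into contiguous blocks of near-equal size."""
    base, extra = divmod(n_samples, n_blocks)
    blocks, start = [], 0
    for b in range(n_blocks):
        stop = start + base + (1 if b < extra else 0)
        blocks.append(list(range(start, stop)))
        start = stop
    return blocks


def rmsep_cv(x, y, fit_predict, n_blocks=7):
    """Hold out each block in turn and pool the squared prediction errors."""
    squared_errors = []
    for block in consecutive_blocks(len(y), n_blocks):
        held_out = set(block)
        train = [i for i in range(len(y)) if i not in held_out]
        predictions = fit_predict(
            [x[i] for i in train], [y[i] for i in train], [x[i] for i in block]
        )
        squared_errors.extend((p - y[i]) ** 2 for p, i in zip(predictions, block))
    return sqrt(sum(squared_errors) / len(squared_errors))


def mean_model(train_x, train_y, test_x):
    """Placeholder model that predicts the training mean for every held-out row."""
    mean = sum(train_y) / len(train_y)
    return [mean] * len(test_x)


# Hypothetical data: 14 observations of one feature and one response.
x = [[float(i)] for i in range(14)]
y = [float(i % 2) for i in range(14)]
print([len(b) for b in consecutive_blocks(118)])
print(round(rmsep_cv(x, y, mean_model), 3))
```

With 118 observations, this split gives six blocks of 17 and one block of 16. Selecting the number of components then amounts to repeating the computation for each candidate number and keeping the one with the lowest RMSEPCV.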
7.3.1 Normal operation
The first industrial dataset was collected during normal operation of the paste plant without
any designed experiment on the processing conditions. It was performed to test the
sampling procedure and imaging equipment at the site. In total, anode paste was collected
at 118 sampling times in triplicate and subsequently imaged (354 paste images). The
samples were gathered at regular time intervals during normal plant operation for six
different days (numbers in Figure 72) over the course of 2 weeks.
The formulation variables are presented in Figure 72. The fines percentage is not shown
for better clarity of the figure, but its variance is 0.078% around the average value. It was
not manipulated during these six days and was affected by common cause variations only.
Figure 72 – Formulation variables for the normal operation industrial dataset: a) dry
aggregate % and b) pitch ratio
[Figure 72: a) aggregate formulation (Coarse %, Inter. %, Butts %) versus paste samples; b)
pitch (%) versus paste samples; the six sampling days are numbered 1 to 6 and ellipses #1
and #2 are marked.]
Due to operational constraints, the ratios of coarse and butts (Figure 72 a) must be
adjusted daily and this was the main source of variability contained in this dataset. The
butts ratio varies in the opposite direction to the coarse ratio because the butts particles
are always substituted by coarse coke particles in the dry aggregate formulation. Furthermore,
the pitch percentage in the paste (Figure 72 b) was adjusted according to the changes made
to the butts/coarse fractions since the pitch demand of the aggregate mix is lower when
the butts ratio increases (i.e. pitch ratio decreases with butts content).
The correlation coefficients between the different dry aggregate fractions and the pitch %
are presented in Table 24. This confirms the strong negative correlation (-0.93) between
the butts % and coarse % and the negative correlation (-0.75) between the butts % and
the pitch %.
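As a reminder of how such coefficients are obtained, the sketch below computes a Pearson correlation coefficient between two formulation variables. The numerical values are made up for illustration only (they are not plant data), and the function name is hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Made-up deviations from the average formulation (illustration only).
# Butts are substituted one-for-one by coarse coke, so the two move in
# opposite directions.
butts = [-2.0, -1.0, 0.0, 1.0, 2.0]
coarse = [2.0, 1.0, 0.0, -1.0, -2.0]
pitch = [0.3, 0.1, 0.1, -0.2, -0.3]

print(round(pearson(butts, coarse), 2))
print(round(pearson(butts, pitch), 2))
```

With an exact one-for-one substitution the butts/coarse coefficient is -1; the slightly weaker value in the plant data reflects the other fractions also changing.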
Table 24 – Correlation coefficients between the paste formulation variables for the normal operation data

            Coarse %   Inter %   Fines %   Butts %   Pitch %
Coarse %        1.00     -0.49      0.02     -0.93      0.59
Inter %                   1.00     -0.19      0.26     -0.16
Fines %                             1.00      0.09     -0.19
Butts %                                       1.00     -0.75
Pitch %                                                 1.00

This dataset contains day-to-day variations in the paste formulation mostly caused by
changes in the anode butts and coarse coke particle ratios. The pitch ratio was also
adjusted when changes in the dry aggregate mix modified the pitch demand. Hence, in this
case, the pitch % is correlated with the dry aggregate formulation since it was not
manipulated independently. No designed experiments were implemented in this sampling
campaign.
The PLS models statistics are presented in Table 25. Again here, it was not intended to
predict the paste formulation variables but to cluster the image features based on changes
made to formulation. Thus the prediction ability will not be discussed. The models use 82%
and 78% of the variance contained in the image features dataset (X) to explain 40% and
28% of the variance in the formulation variables (Y) for the models built on averaged
replicate data and all images (no averaging), respectively. Lower explained variance of Y
was expected in this case because of the lower signal-to-noise ratio (i.e. smaller range of
variations and no design of experiments). Nevertheless, it will be shown that the image
features do correlate with formulation data. For comparison, the standard deviations of the
formulation variables were 1.95 %, 0.39 %, 0.27 %, 1.70 % and 0.25 % for the coarse %,
intermediate %, fines %, butts % and pitch %. Estimated model errors (RMSEPCV) are
lower for coarse, butts and pitch % but larger or similar for intermediate % and fines %.
Table 25 – Statistics of the PLS models built on normal operation data

Statistic            Replicate averages   All samples
Number of LV                          3             3
R2X (%)                           81.72         78.32
R2Y (%)                           40.55         28.31
CVQ2Y (%)                         20.05         11.43
Coarse % RMSEPCV                   1.64          1.82
Inter % RMSEPCV                    0.43          0.44
Fines % RMSEPCV                    0.30          0.19
Butts % RMSEPCV                    1.19          1.44
Pitch % RMSEPCV                    0.24          0.25

The scores and loadings of the PLS model built on averaged replicate data are shown in
Figures 73 to 75. The main sources of variations captured by each LV are discussed first,
followed by the model interpretation based on the image features loading weights. These
are used to interpret the effect of changes in formulation on the paste visual appearance.
Figure 73 – First component’s scores (a) and loadings (b) of the PLS model (averaged
replicate data) built on normal operation data of the ADQ paste plant
[Figure 73: a) t LV1 (20.03%) versus observations, days 1 to 6 and ellipse #1 marked; b) w*q
LV1 (20.03%) for the image features (E, Skew, Kurt, GLCM ASM, GLCM Cont, GLCM Corr at
decomposition levels 1 to 7) and the formulation variables (Coarse %, Fines %, Inter. %,
Butts %, Pitch %).]
The first PLS model component mainly captures the large changes made in pitch ratio
between the different days of operation as indicated by the ellipse labelled #1 in Figure 72
b) and Figure 73 a). The Y loadings (Figure 73 b) also indicate that this component
captures the variations in coarse and butts % which occurred simultaneously. As the pitch
increases, the energy and contrast of the high frequency levels (1-4) increase while they
decrease in the low frequency range (5-7). This is an indication that the paste becomes
smoother for large details, but with more specular reflections due to the higher pitch
content in the fine details. This is consistent with the observations made from the
laboratory paste experiments.
Figure 74 – Second component’s scores (a) and loadings (b) of the PLS model (averaged
replicate data) built on normal operation data of the ADQ paste plant
[Figure 74: a) t LV2 (16.24%) versus observations, days 1 to 6 marked; b) w*q LV2 (16.24%)
for the image features (E, Skew, Kurt, GLCM ASM, GLCM Cont, GLCM Corr at decomposition
levels 1 to 7) and the formulation variables (Coarse %, Fines %, Inter. %, Butts %, Pitch %).]
The second component (Figure 74 a) seems to focus on the local variations within each
experimental day. These local variations in pitch are indicated by the black arrows in
Figure 72 b) and Figure 74 a). Based on the LV2 Y loadings (Figure 74 b), this component
also captures information from the coarse/butts and pitch variations. However, since it is
orthogonal to LV1 it explains a different combination of events. The energy features, which
are positive in almost all levels, seem to explain the variations in pitch since they are
positively correlated with this variable (i.e. pitch increases the reflectivity of the paste).
Figure 75 – Third component’s scores (a) and loadings (b) of the PLS model (averaged
replicate data) built on normal operation data of the ADQ paste plant
[Figure 75: a) t LV3 (4.28%) versus observations, days 1 to 6 and ellipse #2 marked; b) w*q
LV3 (4.28%) versus variables, for the image features (E, Skew, Kurt, GLCM ASM, GLCM Cont,
GLCM Corr at decomposition levels 1 to 7) and the formulation variables (Coarse %, Fines %,
Inter. %, Butts %, Pitch %).]
Finally, the third component (Figure 75 a) mainly captures the changes in the intermediate
coke fraction as indicated by the ellipse labelled #2 in Figure 72 a) and Figure 75 a). The
particle size for this coke fraction is +0.15 mm/-1.40 mm (Table 5). The loadings for this
component (Figure 75 b) show that the decomposition levels 4 and 5 have the highest
weights in all the features. The intermediate particle size range approximately corresponds
to the details captured by these decomposition levels, which are approximately
+0.65 mm/-1.30 mm and +1.30 mm/-2.60 mm respectively. Since the variations in the intermediate
fraction had a very small correlation (Table 24) with the other formulation variables, these
results suggest that the sensor is sensitive to changes in size distribution of the dry
aggregate mix used to manufacture the industrial paste.
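The two size ranges quoted above double from level 4 to level 5, as expected for a dyadic wavelet decomposition. The sketch below extrapolates that doubling to the other levels. It is only an approximation under that assumption, anchored on the level 4 range given in the text; the actual sensitivity ranges of each level are those tabulated in the thesis, and the names used here are hypothetical.

```python
REFERENCE_LEVEL = 4
REFERENCE_BAND_MM = (0.65, 1.30)  # level 4 range quoted in the text


def level_band_mm(level):
    """Approximate detail size range (mm) of a decomposition level.

    Assumption: the range doubles with each level (dyadic scales).
    """
    scale = 2.0 ** (level - REFERENCE_LEVEL)
    return (REFERENCE_BAND_MM[0] * scale, REFERENCE_BAND_MM[1] * scale)


for level in range(1, 8):
    low, high = level_band_mm(level)
    print(f"level {level}: +{low:.2f} mm / -{high:.2f} mm")
```

Under this assumption, level 5 reproduces the +1.30 mm/-2.60 mm range given in the text, and the coarsest level (7) covers details of roughly 5 to 10 mm.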
The uncertainties in the scores of the PLS model built using all samples (no averaging)
are presented in Figure 76. The standard deviations of the scores were calculated based
on the three images collected at each sampling time. Therefore, these represent the
uncertainties in the surface appearance (texture) of the paste at a given time. The trends
in the scores shown in Figure 76 are clearly significant. For example, the variations in LV1
within each day lie within the error bars, but the difference between the score values of
the two days with the largest pitch difference (days 3 and 4) is larger than the standard
deviation (i.e. the error bars for these days do not overlap). This is also the case for the
variations captured by LV2 (within day variability for pitch and coarse coke fraction) and
LV3 (intermediate fraction variations). The fact that the uncertainties seem more important
in this case compared to the laboratory development phase was expected since the images
were collected from an industrial process where the environment is not as controlled as in
the laboratory. Furthermore, the signal-to-noise ratio is lower due to normal operation of
the plant (mostly common cause variations).
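The error bar comparison used above can be sketched as follows. The score values are made up for illustration, the function names are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption since the estimator is not specified here.

```python
import statistics


def summarize(replicate_scores):
    """Mean and sample standard deviation of the replicate scores at one sampling time."""
    return statistics.mean(replicate_scores), statistics.stdev(replicate_scores)


def error_bars_overlap(replicates_a, replicates_b):
    """True when the one-standard-deviation error bars of two sampling times overlap."""
    mean_a, sd_a = summarize(replicates_a)
    mean_b, sd_b = summarize(replicates_b)
    # The intervals [mean - sd, mean + sd] overlap when the gap between the
    # means does not exceed the sum of the two standard deviations.
    return abs(mean_a - mean_b) <= sd_a + sd_b


# Made-up LV1 scores for triplicate images (illustration only).
high_pitch_day = [6.1, 7.4, 6.8]
low_pitch_day = [-4.9, -3.6, -4.2]
same_day_a = [1.0, 2.5, 1.6]
same_day_b = [1.9, 0.8, 2.4]

print(error_bars_overlap(high_pitch_day, low_pitch_day))  # bars far apart
print(error_bars_overlap(same_day_a, same_day_b))         # bars overlap
```

Non-overlapping one-standard-deviation bars are a visual screening criterion rather than a formal significance test.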
Figure 76 – Uncertainties in the scores of the PLS model built on normal operation data: a)
LV1, b) LV2 and c) LV3. One standard deviation error bars on the scores are shown.
[Figure 76: a) t LV1, b) t LV2 and c) t LV3 versus samples; days 1 to 6 marked.]
7.3.2 Paste plant start-up
The second dataset was collected during the start-up of the paste plant. After the start-up
procedure is completed (lasts a few hours), the paste plant is run in a very stable
steady-state operation until the next shut-down. This test was performed to verify the
sensitivity of the machine vision sensor to changes in the paste during transient operation.
In total, paste samples were collected at 20 sampling times during start-up in triplicate
and were subsequently imaged (60 paste images).
The first paste sample was collected as soon as there was enough paste flowing through
the mixers onto the conveyor where sampling was performed. The time elapsed since the
first sample was collected is provided in Table 26. The sampling period was set to 4-5
minutes for the first hour (sampling period #1) when the paste quality improves the most
due to the equipment heating up and the improvement in mixing. This was the fastest
manual sampling rate that could be implemented using the current set-up. After one hour,
the sampling period was extended to 15 minutes for 1.5 hours (sampling period #2). At that
point the paste quality (i.e. density) is not yet in steady-state, but the changes are slower.
Finally, four additional samples were collected after a 1.5 hour delay (i.e. after the start-up
period, sampling period #3).
Table 26 – Sample number and elapsed time since the first start-up sample

Sample #   Elapsed time since start-up (h)
 1         0.00
 2         0.07
 3         0.13
 4         0.22
 5         0.27
 6         0.35
 7         0.42
 8         0.62
 9         0.72
10         0.87
11         0.97
12         1.23
13         1.48
14         1.75
15         2.05
16         2.32
17         3.80
18         4.33
19         4.67
20         4.98

As an example of the transient paste quality, the GAD is presented in Figure 77 a). There
is a 0.02 g/cm3 increase in the first 30 minutes of the start-up. The numbers 1 to 3
correspond to the sampling periods and are used in Figure 77 and Figure 78.
For the paste plant start-up dataset, only the image features data matrix (X) could be used
to analyze the changes in the image texture as the start-up progressed towards normal
operation. Indeed, it was not possible to apply the data synchronization procedure for the
first few samples because the residence times within the process units and the dead-times
between them were changing during the start-up as opposed to steady state operation
(e.g. the load of paste in the mixers increases to its normal level during start-up).
Therefore, no Y data was available in this case and so PCA was applied to model the feature
matrix X instead of PLS. Three components were found significant by cross-validation. This
test was not performed to validate the precision of the sensor but to verify if it was
sensitive to transient operation. For this reason, only the statistics of the model built on
the replicate averages (i.e. three images per sample) are presented in Table 27. In this
case, 3 components explain 73% (in cross-validation) of the variance contained in the 42
image feature variables.
Table 27 – Statistics of the PCA model built on the paste plant start-up data

Model                Number of LV   R2X (%)   CVQ2X (%)
Replicate averages              3     86.21       73.18

Figure 77 b) shows a time series of the three components’ scores. The variations in the
score values can be jointly interpreted with the GAD from Figure 77 a). The transient
changes in the paste appearance and quality are captured by the imaging sensor. As
previously discussed, the paste quality changes rapidly in the first 30 minutes of the
start-up. The GAD increases rapidly (Figure 77 a). This is also visible in the increase of the
score values for LV1 and LV2 (Figure 77 b). Then, the GAD and all three LVs have a
period of higher variability (i.e. period #2). Finally, as the process reaches its steady-state,
the GAD stabilises at the same time as LV1 and LV3.
Figure 77 – Time series of a 5h plant start-up period: a) GAD and b) scores of LV1, LV2
and LV3
[Figure 77: a) GAD (g/cm3) versus sampling time; b) scores of LV1, LV2 and LV3 versus
sampling time; sampling periods marked i, ii and iii.]
Based on visual assessment of the score trends, the scores of LV1 and LV3 capture most
of the variations related to the evolution of the paste image features during the start-up
process. The score space of these components (Figure 78 a) shows that early in the
start-up (samples 1-6), the image textural features project in the negative t1 region identified
by ellipse #1 and their variance between the samples is high. Then, the paste texture features
transition towards the positive t1 region (square #2). Note that the variance between
the samples, especially along t3, is also smaller. The LV1 loading plot presented in Figure 78
b) is used to interpret the changes in paste appearance as the start-up progressed
(dominant component). Positive t1 values are associated with lower energy in the high
frequency levels, higher energy in the low frequency levels (5, 6 and 7) and also higher
GLCM correlation features. This is an indication that the paste appearance becomes
smoother as the start-up progresses and the process becomes more stable (i.e. the mixing
is more homogeneous). The last four samples (17-20) are characterized by positive t1 and t3
values. These samples were taken a few hours after the start-up was completed (normal
operation). The loadings for LV3 are not presented since the features could not be easily
interpreted.
Figure 78 – Scores and loadings of the PCA model built on the industrial paste start-up
data (averaged image replicates): a) LV1 and LV3 score plot and b) LV1 loadings plot
[Figure 78: a) t LV3 (13.05%) versus t LV1 (48.08%), samples numbered 1 to 20 and regions
marked i, ii and iii; b) p LV1 (48.08%) versus variables (E, Skew, Kurt, GLCM ASM, GLCM
Cont, GLCM Corr at decomposition levels 1 to 7).]
The motivation for using the imaging sensor during paste plant start-up is to monitor the
transient process operation and eventually to assess if the paste textural features are in
agreement with those obtained in normal operation. This could provide a real-time
indication that the paste plant has reached steady-state and is ready for anode production.
To achieve this, a region in the score space corresponding to paste textural features
obtained in normal operation could be established. Then, it would suffice to verify whether
the current values of the scores fall within the region or not.
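One possible way to implement such a check is sketched below. It assumes, for illustration only, an axis-aligned elliptical region centred on the mean of the normal operation scores, with semi-axes equal to k standard deviations along each component. The value k = 3, the function names and the score values are all hypothetical; a proper limit (e.g. based on Hotelling's T2) would have to be calibrated on plant data.

```python
import statistics


def fit_region(normal_scores):
    """Per-component (mean, standard deviation) of normal operation scores.

    normal_scores: list of tuples, one tuple of scores (t1, t3, ...) per sample.
    """
    columns = list(zip(*normal_scores))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]


def in_region(scores, region, k=3.0):
    """True when the scores fall inside the ellipse with semi-axes k * sd."""
    distance2 = sum(((t - m) / (k * s)) ** 2 for t, (m, s) in zip(scores, region))
    return distance2 <= 1.0


# Made-up (t1, t3) scores from normal operation (illustration only).
normal = [(4.0, 1.0), (5.0, 2.0), (6.0, 1.5), (5.5, 0.5), (4.5, 2.0)]
region = fit_region(normal)

print(in_region((-8.0, 5.0), region))  # early start-up sample, outside
print(in_region((5.0, 1.4), region))   # close to the normal operation centre
```

An axis-aligned ellipse is a natural first choice here because PCA scores are uncorrelated on the data used to fit the model; scores of new samples projected onto that model need not be, which is one reason the limit would need validation.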
7.3.3 Industrial pitch optimization experiments
The last dataset was collected during five different experiments where the pitch ratio was
varied. These five experiments were performed on five different days during an eight-month
period in 2013-2014 in order to validate the sensitivity of the machine vision sensor to
changes in pitch and dry aggregate formulation. Indeed, changes in aggregate formulation
occurred during the experiments as part of a standard plant operating policy consisting of
substituting butts by coarse coke particles (and vice-versa) to meet constraints on butts
inventory. Some fluctuations in the percentage of intermediate and fines coke fractions
also occurred in between the different sampling campaigns. The variations in aggregate
formulation and pitch ratio that took place in this set of experiments are shown in Figure
79. The fines fraction is usually kept at a constant ratio in normal operation. However, a
step change of -2% was implemented in the experiment labelled E (see ellipses #1). The
fines were substituted by coarse coke particles to maximise the change in the particle size
distribution of the dry aggregate. In total, paste samples were collected in triplicate at 184
sampling times and were subsequently imaged (552 paste images).
Figure 79 – Changes in the formulation variables for the industrial dataset where pitch ratio
was varied. The five sampling campaigns are indicated by letters A-E.
[Figure 79: a) aggregate formulation (Coarse %, Fines %, Inter. %, Butts %) versus paste
samples; b) pitch (%) versus paste samples; campaigns A to E and ellipses #1 marked.]
In each experiment (A-E), the pitch percentage was manipulated by small increments of
0.1 % to 0.4 % around the nominal set-point. The latter was selected by the operators
based on the standard plant pitch adjustment policy. The measured pitch % for each
sample is presented in Figure 79 b). The changes made to pitch % set-points for each
experiment are presented in Table 28.
Table 28 – Changes implemented on pitch % set-point in the industrial pitch variations dataset

Experiment       # of pitch levels   Pitch levels
A                                5   -0.2/+0.2
B                                4   -0.1/+0.2
C                                4   -0.2/+0.2
D                                5   -0.3/+0.3
E (high fines)                   5   -0.4/+0.3
E (low fines)                    3   -0.2/+0.3

The overall range of variations in all the experiments was less than 1%. This was deemed
the maximum range allowed for maintaining safe process operation. In experiment E, two
designs of experiments were implemented on pitch ratio for two different fines/coarse
ratios during the same day. This was done to force a change in the pitch demand of the
paste for the same raw material and operating conditions, which is difficult to achieve
when the samples are collected on different days.
Different average pitch levels were maintained during each day of the sampling campaign
either because the operators adjusted the pitch ratio to changes in raw materials and in the
dry aggregate formulation, or because it was changed on purpose to meet the objectives of
the experimental program. However, within each day of experimentation, only the pitch ratio
was manipulated. Therefore, this dataset contains variations in pitch that are both
correlated with and orthogonal to the dry aggregate formulation.
Table 29 presents the correlation between the paste formulation variables. The correlation
between the coarse and butts fractions is again high (i.e. -0.93), but it is lower between the
pitch and butts (-0.52 instead of -0.75) compared to the normal operation dataset. This is
explained by the independent pitch variations implemented in each experiment. It was
extremely difficult to achieve a high signal-to-noise ratio for the changes made on pitch %
due, on one hand, to the limited range of pitch variations and, on the other hand, to
fluctuations in processing conditions, raw material quality and dry aggregate formulation,
which may all have an effect on the pitch demand of the aggregate and on the paste visual
appearance. Nevertheless, this dataset contains more excitation (controlled or not) in
comparison with the normal operation dataset.
Table 29 – Correlation coefficients between the paste formulation variables for the experiments on pitch ratio

            Coarse %   Inter %   Fines %   Butts %   Pitch %
Coarse %        1.00     -0.01      0.10     -0.93      0.37
Inter %                   1.00      0.64      0.24     -0.30
Fines %                             1.00     -0.06     -0.11
Butts %                                       1.00     -0.52
Pitch %                                                 1.00

Some properties of the coke and pitch used during the sampling campaigns are presented
in Table 30. The coke apparent density (VBD) and real density as well as the pitch QI and
viscosities at 160°C and 180°C are provided in this table. The coke values consist of
weighted averages of the properties of each coke used in the blend (coke from at least two
sources are typically blended at the ADQ plant). The coke and pitch materials for
experiments A, B, and C come from the same suppliers, but different lots (i.e. ships and/or
trains). The pitch used in experiments D and E was from a different supplier than that of
experiments A, B and C. Finally, one of the cokes in the blends used during experiments D
and E is different for each of these blends.
Table 30 – Coke and pitch properties for each pitch variation experiment

                                         Experiments
Properties                               A        B        C        D        E
Coke VBD (-30/+50 mesh) (g/cm³)       0.90     0.89     0.90     0.88     0.91
Coke real density (g/cm³)             2.06     2.05     2.05     2.08     2.07
Coke blend                               1        1        1        2        3
Pitch QI (%)                           7.4      6.4      6.4     17.2     13.5
Pitch viscosity 160°C (cP)          1890.0   1730.0   1730.0   2080.9   1470.0
Pitch viscosity 180°C (cP)           525.0    452.0    452.0    641.9    442.0
Pitch supplier                           1        1        1        2        2

For three experiments (i.e. C, E low fines and E high fines), four anode core samples were
collected from green anodes produced at each pitch level. This was done in an effort to
quantitatively measure the paste quality and to find the real optimum pitch demand based
on the BAD. This is similar to what was performed with the laboratory anodes, with the
exception that a much smaller range of pitch variations was used in the industrial
experiments (i.e. less than 1% versus 11% for the lab experiments). These core samples
were baked in the same small scale furnace used in the laboratory experiments after
which BAD was measured. The baked core samples from experiments E low fines (E_LF)
and E high fines (E_HF) were also sent to the ADQ laboratory to measure the BAD,
electrical resistivity, compressive strength, CO2 reactivity and Young’s modulus. Hence,
two BAD measurements were available for experiments E_HF and E_LF. The top 10 cm of
the anode core was kept at the University and a first BAD measurement (labelled Top)
was performed. The remaining length of each core sample (approximately 25 cm) was
sent to the ADQ lab. The second BAD measurements (labelled Lab) were made by
ADQ on a 13 cm long sub-sample.
The BAD data are presented in Figure 80 a) whereas the other properties measured in the
ADQ laboratory are presented in Figure 80 b) to e). The markers are the average of the
measurements for all core samples for each pitch level and the error bars represent
±1 standard deviation. All the results have been mean-centered to protect the confidentiality
of the results. In this figure, LF and HF correspond to the low fines and high fines
experiments E and Top and Lab correspond to the two different measurements on the
same core sample.
A few observations can be made from the BAD measurements presented in Figure 80 a).
First, in all three experiments, the BAD is positively correlated with the pitch ratio. This
means that the paste was on the dry side of the pitch optimum curves (Figure 4) for each
experiment. Secondly, there is a difference between the BAD measured from the top (Top)
part of the core sample and that of the laboratory (Lab) sample. This may be explained by
a greater surface porosity on the top sample and by measurement variability. Finally, the
optimum BAD could not be measured for any of the three experiments. For both
experiments C and E_LF, the BAD increases with the pitch %, therefore the optimum pitch
demand was not reached. For the E_HF dataset, the BAD also increases with pitch ratio
except for the last pitch level (+0.3%). However, the BAD values obtained at +0.2% and
+0.3% of pitch fall within the 1 std error bars of each other. A few additional pitch levels
(i.e. +0.4% and +0.5%) would have been necessary to validate that BAD effectively started
decreasing from the +0.3% pitch level.
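Figure 80 – Baked anode core properties: a) BAD for experiments C and E, b) electrical
resistivity, c) compressive strength, d) CO2 reactivity residue (CRR) and e) Young’s
modulus
[Figure 80: all panels are plotted against delta pitch (%); a) BAD (g/cm3); series C,
E_LF_Top, E_LF_Lab, E_HF_Top and E_HF_Lab.]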
The main hurdle encountered during these experiments was the impossibility of obtaining
the BAD of the anodes in real-time. Thus it was not possible to know exactly when to stop
the experiment to obtain a full pitch optimisation curve where both under pitched and over
pitched anodes are fabricated. This is one of the reasons why it is important to develop a
method for measuring the OPD in real-time.
Unfortunately, the OPD could not be determined due to the lack of over-pitched anodes.
However, the laboratory measurements of the four anode properties (Figure 80 b-e) on the
E datasets can be used to validate that the BAD and the other properties correlate well
with each other. For these anodes, the maximum CO2 reactivity residue (CRR) and the
minimum electrical resistivity are both obtained for the anodes with the maximum BAD. The
Young’s modulus is an indication of the thermal shock resistance. It is a measure of
inelasticity of the anodes and therefore must be minimized in order to minimize the strain
due to thermal shock in the potroom. The relationship between the BAD and Young’s
modulus is not as clear as with CRR and electrical resistivity. Furthermore, it seems that
pitch ratio had no effect on compressive strength for the range of pitch % tested in these
two experiments. Based on these results, it can be concluded that the quality of the
anodes increased with pitch ratio and the use of BAD to assess the changes in pitch
demand was deemed valid for these experiments. However, the implemented range of
variations was too small to confirm what the optimum pitch levels were for each individual
anode.
PLS regression models were computed between the image features (X) and the paste
formulation variables (Y). Again, the model is not intended to be used for predicting
formulation. Indeed, two pastes with the same pitch level, but formulated using a different
coke will very likely have different textural characteristics, and therefore poor prediction
performance for the Y data is expected from such a model. The objective of using PLS is
rather to extract a small number of orthogonal combinations of textural features that are
correlated with the various changes made to paste formulation variables and that allow
detecting what the optimal pitch demand is for a given coke source. Unfortunately, the
pitch demand of the paste could not be measured in any of the industrial experiments.
The PLS models statistics are presented in Table 31. Once again, the models capture
most of the variance in the features. Thus, the PLS scores and loading weights can be
used for the interpretation of the relationship between the paste image texture and the
variations in formulation. The averaged replicates (i.e. the three images per sampling time)
model has a very good percentage of variance explained in fit (73%) and in prediction
(58%, shown in Figure 81) for pitch ratio. This is a good performance for a model built from
industrial data. For comparison, the standard deviations of the formulation variables were
1.76 %, 0.24 %, 0.67 %, 1.94 % and 0.45 % for the coarse %, intermediate %, fines %,
butts % and pitch %. In this case, the RMSEPCV are lower than the dataset variability for
all variables except the intermediate % and fines %.
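The link between the cross-validation error, the variable's standard deviation and Q2 can be illustrated with the sketch below. It assumes the usual root-mean-square form for RMSEPCV and the usual definition Q2 = 1 - PRESS/SStot; equation 2.16 remains the reference for the exact form used in this work. The data and the function names are made up for illustration.

```python
import math


def press(measured, predicted_cv):
    """Sum of squared cross-validation prediction errors."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted_cv))


def rmsepcv(measured, predicted_cv):
    """Root-mean-square error of prediction in cross-validation (assumed form)."""
    return math.sqrt(press(measured, predicted_cv) / len(measured))


def q2(measured, predicted_cv):
    """Fraction of the variance of y predicted in cross-validation."""
    mean_y = sum(measured) / len(measured)
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - press(measured, predicted_cv) / ss_tot


# Made-up mean-centered pitch % values and their cross-validated predictions.
measured = [-0.4, -0.2, 0.0, 0.2, 0.4]
predicted = [-0.3, -0.1, 0.1, 0.1, 0.2]

print(round(rmsepcv(measured, predicted), 3))
print(round(q2(measured, predicted), 3))
```

With these definitions, Q2 is positive exactly when the RMSEPCV is smaller than the (population) standard deviation of the variable, which is why comparing the two is informative: an error equal to the standard deviation is no better than always predicting the mean.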
Table 31 – Statistics of the PLS model for the design of experiments on pitch ratio

Statistic            Replicate averages   All samples
Number of LV                          4             4
R2X (%)                           89.88         84.87
R2Y (%)                           48.86         42.42
CVQ2Y (%)                         27.66         24.39
Pitch % R2 (%)                    72.66         62.12
Pitch % Q2 (%)                    58.20         48.30
Coarse % RMSEPCV                   1.49          1.51
Inter % RMSEPCV                    0.29          0.28
Fines % RMSEPCV                    0.83          0.80
Butts % RMSEPCV                    1.54          1.57
Pitch % RMSEPCV                    0.30          0.33

The predicted versus measured pitch ratio (in fit) obtained from the averaged replicates
model is presented in Figure 81. This confirms the sensitivity of the paste image texture to
changes in pitch ratio.
Figure 81 – Predicted versus measured pitch ratio obtained using the PLS model built on
data collected during the design of experiments on pitch ratio (averaged replicates)
[Figure 81: predicted pitch % versus measured pitch %; experiments A to E identified.]
The PLS scores and loadings bi-plots for the averaged replicates data are presented in
Figure 82, Figure 83 and Figure 84. Only the first 3 components are discussed since the
last component does not improve the interpretation. This component seems to focus on
some low frequency information contained in level 7 that may be due to the manual
flattening of the paste in the aluminium containers. The variations captured by each
component can be interpreted from the Y loadings (Figure 82 b, Figure 83 b and Figure 84
b) and the scatter plots of the scores against paste formulation variables (Figure 82 c and
d and Figure 83 c).
The first component (Figure 82) captures two phenomena. First, it models the feedback
control strategy where pitch ratio is adjusted to attenuate the changes in pitch demand
introduced by fluctuations in the coarse coke and anode butts fractions as discussed
previously (i.e. limitations on butts inventory). The variations in formulation are illustrated in
Figure 79 a). The scatter plots of the LV1 scores against the percentage of the coarse
fraction (Figure 82 c and d) also support this interpretation. Second, this component also
captures the designed experiments pitch variations. The variations in LV1 (Figure 82 a)
clearly match the overall trends in pitch percentage (Figure 79 b). The span of score
values (LV1) for each coarse % set-point in Figure 82 c) is due to the pitch variations
within each experiment. The linear variation of the scores as a function of the pitch %
(Figure 82 d) is also an indication of the sensitivity of this component to the pitch variations
due to the change in formulation and to the designed experiments.
Figure 82 – Scores and loadings of the PLS model (averaged replicates) component 1 for
the designed experiments on pitch ratio: a) LV1 scores, b) LV1 weights and loadings, c)
scatter plots of LV1 scores and coarse % and d) scatter plots of LV1 scores and pitch %
[Figure 82: a) t LV1 (32.12%) versus observations, experiments A to E marked; b) w*q LV1
(32.12%) versus variables, for the image features (E, Skew, Kurt, GLCM ASM, GLCM Cont,
GLCM Corr at decomposition levels 1 to 7) and the formulation variables (Coarse %, Fines %,
Inter. %, Butts %, Pitch %); c) t LV1 versus coarse %; d) t LV1 versus pitch %.]
It is very difficult to interpret the texture features loadings from Figure 82 b). The first
model component captures both the variations in coarse/butts % and the associated pitch %
adjustment and also the designed pitch % experiments. First, as pitch and coarse
percentages are increased, the energy decreases in almost all decomposition levels
except for a small increase in levels 1 and 2. From previous results, an increase in pitch
should increase the reflectivity of the paste and the energy of all the detail coefficients.
However, the decrease in butts and increase in coarse % also increases the pitch demand
and should have the opposite effect. As discussed using Figure 80, the samples in
experiments C and E were on the dry side of the optimum pitch demand so these energy
loadings suggest that the changes in formulation were not compensated completely by an
increase in pitch %. Second, in this particular dataset, except for experiment E, the
increase in coarse % was a response to the decrease in butts %. It was not compensated
by the fines fraction as was previously done in the laboratory experiments. The main
difference in the particle size distribution between the coarse and the butts fractions is
the +3/8 inch particles. Only the butts fraction contains this particle size range (Table 5),
which falls in the level 7 detail coefficient sensitivity (Table 13). Figure 82 b) shows that
skewness and kurtosis increase in levels 1-5, but decrease in levels 6-7 which are more
sensitive to the particle size range of the butts fraction.
An interesting point to note in Figure 82 b) is that the samples from experiment D (solid
black arrow) do not cluster in the same region as all the other samples (dashed black
arrow). This indicates that the textural features captured by the LV1 component are
different for the same pitch % compared to the other paste mixes. However, both clusters
showed a positive correlation between pitch % and t1, which means that the same combination
of features captures the pitch % but the pitch baseline is not the same. Variations in raw
materials have an effect on the dry aggregate pitch demand. Two paste samples
manufactured using different coke blends but with the same pitch % may have a different
appearance. The shift for the experiment D samples shown in Figure 82 b) illustrates
that trying to predict the pitch content of the paste using the images, as was
attempted in the past, can lead to erroneous results. Such models will not be robust to
changes in raw materials. It is more important to determine which combinations of image
textural features are sensitive to pitch demand and to build models that can
predict the extent of the deviation from the OPD instead of trying to predict the absolute value of
the pitch ratio.
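The difference between the two prediction targets can be illustrated with a minimal sketch. The blend names and OPD values below are hypothetical and were chosen only for illustration; they are not measurements from this work.

```python
# Hypothetical optimum pitch demand (OPD) per coke blend, in pitch %.
# These values are illustrative assumptions, not data from the thesis.
OPD_BY_BLEND = {"blend_1": 15.0, "blend_2": 16.0}


def deviation_from_opd(pitch_pct, blend):
    """Signed distance to the optimum; negative means the dry (under-pitched) side."""
    return pitch_pct - OPD_BY_BLEND[blend]


# Two pastes with the same absolute pitch % but different raw materials.
samples = [("blend_1", 15.0), ("blend_2", 15.0)]
deviations = [deviation_from_opd(pitch, blend) for blend, pitch in samples]
```

Under these assumed values, the first paste sits at its optimum while the second is one percentage point on the dry side, even though both share the same pitch %. A model targeting the deviation would separate them; a model targeting the absolute pitch % would not.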
The second component also captures some of the variations in pitch ratio as shown by the
black dashed arrow in Figure 83 c). However, these variations are orthogonal to those
explained by LV1 (i.e. both components are orthogonal). In addition, the Y loadings
provided in Figure 83 b) show that pitch ratio in LV2 is negatively correlated with all the dry
aggregate formulation variables. This suggests that LV2 mainly captures the variations in
pitch ratio that are independent from the changes in dry aggregate formulation.
Figure 83 – Scores and loadings of the PLS model (averaged replicates) component 2 for
the designed experiments on pitch ratio: a) LV2 scores, b) LV2 weights and loadings and
c) scatter plots of LV2 scores and pitch %
Furthermore, the trends in the LV2 scores (Figure 83 a) seem related to the variations
introduced in pitch ratio as part of the experimental design (Table 28). This is particularly
clear for the first two days (red and green dots) where an optimum in LV2 clearly appears.
Whether or not these correspond to the optimum pitch demand for the dry aggregates
used in the formulation during the first two days cannot, however, be confirmed. In this
case, the pitch ratio is negatively correlated to the reflectivity of the paste since all energy
features decrease in opposition to the pitch increase. This seems opposite to what was
previously observed for the laboratory paste samples. However, this dataset also contains
changes in raw materials blends which were not explored in previous case studies. It can
be seen in Figure 83 c) that the main correlation between the pitch and the scores of LV2
is due to the difference between each cluster of experiments. However, the local variations
(i.e. black solid arrow) are almost orthogonal to the main variations. In each of these local
variations, the pitch is negatively correlated to the LV2 scores. Hence the energy
increases with the pitch % in the local variations within each different experiment. Changes
in raw materials may have also affected paste appearance.
[Figure 83 panels: a) t LV2 (10.78%) vs. observations, experiments A to E; b) w*q LV2 (10.78%) vs. variables (E, Skew, Kurt, GLCM ASM, GLCM Cont and GLCM Corr at levels 1 to 7; Y variables Coarse %, Fines %, Inter. %, Butts % and Pitch %); c) t LV2 vs. Pitch %]
The third component focuses on changes made to the fines % and pitch ratio as indicated
by the Y loadings of LV3 (Figure 84 b). The ellipse labeled #1 in Figure 79 a) and Figure
84 a) illustrate the step change implemented in fines % in experiment E and the impact on
LV3, respectively. For this component, the loading weights of the energy features in the
high frequency levels are negative and so is the fines % loading. This indicates a positive
relationship between fines content and the energy features information, that is, the energy
increases in the high frequency bands when more fines are added to the paste. This is an
indication that the reflectivity of the paste increases in the high frequencies due to the
reflectivity of the fines, as was observed in the laboratory experiments. At the same time, as
the pitch and fines decrease, the skewness and correlation in all levels and the kurtosis in the
high frequency levels increase, which is an indication of more homogeneity in the paste
texture.
Figure 84 – Scores and loadings of the PLS model (averaged replicates) component 3 for
the designed experiments on pitch ratio: a) LV3 scores and b) LV3 weights and loadings
Finally, the uncertainties in the scores of the PLS model are presented in Figure 85. The
one standard deviation error bars around the mean score values of the replicated samples
are used to assess reproducibility. As was observed with the normal operation dataset
(section 7.3.1), the standard deviation of the replicates is smaller than the variability captured by the
component for each LV of the model.
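The reproducibility check described above can be sketched as follows. This is a minimal illustration with hypothetical score values; the pooled standard deviation is one reasonable way to summarize replicate scatter and is an assumption, not necessarily the calculation behind Figure 85.

```python
from statistics import mean, stdev


def replicate_summary(scores_by_sample):
    """Return (pooled replicate std, std of the sample means) for one LV.

    scores_by_sample: list of lists, one inner list of replicate scores per sample.
    """
    means = [mean(reps) for reps in scores_by_sample]
    # Pool the within-sample variances, weighting each by its degrees of freedom.
    dof = sum(len(reps) - 1 for reps in scores_by_sample)
    pooled_var = sum((len(reps) - 1) * stdev(reps) ** 2
                     for reps in scores_by_sample) / dof
    return pooled_var ** 0.5, stdev(means)


# Hypothetical LV scores: three paste samples, three replicate images each.
scores = [[-4.1, -3.9, -4.0], [0.2, -0.1, -0.1], [3.8, 4.1, 4.1]]
within, between = replicate_summary(scores)
reproducible = within < between
```

With these assumed numbers the replicate scatter is far smaller than the spread of the sample means, which is the condition the error bars are meant to verify.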
[Figure 84 panels: a) t LV3 (1.81%) vs. observations, experiments A to E, with ellipse #1; b) w*q LV3 (1.81%) vs. variables (E, Skew, Kurt, GLCM ASM, GLCM Cont and GLCM Corr at levels 1 to 7; Y variables Coarse %, Fines %, Inter. %, Butts % and Pitch %)]
Figure 85 – Uncertainties in the scores of the PLS model (all samples) for the data obtained
during the design of experiments on pitch ratio: a) LV1, b) LV2 and c) LV3. One standard
deviation error bars of the scores are shown in the figure.
7.4 Joint modelling of image features and paste plant data using
SMB-PLS
Previous work by Lauzon-Gauthier et al. (Lauzon-Gauthier 2011; Lauzon-Gauthier et al.
2012) demonstrated that PLS modelling can be used to predict baked anode properties
at the end of the baking cycle from raw material, paste plant and baking furnace data. An
update of this work is also presented in Appendix A. The proposed machine vision sensor
provides 42 new measurements (i.e. image features) that can be fused with the data
collected from the plant instrumentation in order to verify if the characterization of the
paste and the prediction of anode properties can be improved further.
The new Sequential Multi-block PLS algorithm (SMB-PLS) presented in Chapter 5 was
shown to improve interpretability of large industrial datasets consisting of multiple data
blocks. The new data block containing the paste image textural features adds to the
complexity of the already available data structure. Thus, the use of SMB-PLS is even more
important now that new blocks of data become available.
This section presents a joint analysis of raw material properties, paste plant data and
paste image features using an SMB-PLS model. Both the industrial datasets collected
during normal operation (7.3.1) and the experimentation on pitch ratio (7.3.3) were used to
build the model. The green anode density (GAD) was used as the single Y variable since it
is the only on-line measurement available for anode quality. The baking furnace data were
not included in the model since no baked anode properties were available in datasets
selected for building the SMB-PLS model.
Figure 86 – Data blocks and variables used in the SMB-PLS model for predicting GAD
The multi-block structure of the dataset used to build the SMB-PLS model is presented in
Figure 86. The number of variables within each block is provided in Table 32. The raw
materials block (Z) contains the coke density and particle size distribution and the pitch
physical properties. The impurities in the butts do not have an effect on the green density
and were not included in this model. The formulation block (X1) contains the paste
formulation as well as the particle size distributions of the dry aggregate and of some key
dry aggregate fractions. The X2 block discussed previously in Chapter 5 was split in two blocks
containing the paste mixing conditions (X2) and the forming variables (X4). The image
features (X3) were inserted in the dataset before the forming variables since the sampling
is done prior to the compaction. The prediction dataset (Y) consists of the GAD.
[Figure 86 diagram: Z, raw materials (coke density, coke size distribution, pitch physical properties); X1, classification of materials (aggregate size distribution, shift based; paste formulation); X2, paste mixing (temperatures, mixing power, etc.); X3, image features (DWT detail coefficients features); X4, forming (bellows pressure, anode height); Y, anode block (GAD)]
The statistics of the SMB-PLS model are presented in Table 32. The important information
from these statistics is not the high predictive ability of the GAD for these historical data,
but the very small (i.e. 3%) difference between the fit of the model (R2Y) and the prediction
ability (Q2Y). The number of components for each block was selected sequentially based
on the lowest RMSEP and 1% improvement of the Q2Y as described in Chapter 5. The
training dataset contains 2/3 of the observations (anodes) selected randomly from the
original dataset. The rest were used as the validation dataset.
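The validation logic described in this paragraph can be sketched as follows. This is a minimal illustration, not the procedure of Chapter 5: the function names are hypothetical, only the 1% Q2Y gain rule is implemented (the RMSEP criterion is left out), and Q2Y is assumed to be the same statistic as R2Y evaluated on the held-out observations.

```python
import random


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; R2Y on training data, Q2Y on held-out data."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def split_two_thirds(n_obs, seed=0):
    """Randomly assign 2/3 of the observation indices to training, the rest to validation."""
    idx = list(range(n_obs))
    random.Random(seed).shuffle(idx)
    cut = (2 * n_obs) // 3
    return idx[:cut], idx[cut:]


def select_n_components(q2_by_n, min_gain=0.01):
    """Keep adding components while each one raises Q2Y by at least min_gain."""
    n_keep, best = 0, 0.0
    for n, q2 in enumerate(q2_by_n, start=1):
        if q2 - best < min_gain:
            break
        n_keep, best = n, q2
    return n_keep


train_idx, valid_idx = split_two_thirds(150)
# Hypothetical cumulative Q2Y values after 1, 2, 3 and 4 components of one block.
n_lv = select_n_components([0.46, 0.50, 0.505, 0.60])
# Gap between fit and prediction for the "Total Y" row of Table 32.
gap = 89.19 - 86.05
```

Under this rule a block whose first component improves Q2Y by less than 1% receives zero components, which is the situation reported for the image features block in Table 32.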
Table 32 – Statistics of the GAD SMB-PLS model
Block     Number of variables   Number of LV   R2 (%)   Q2 (%)
Z         15                    5              96.66    96.33
X1        18                    3              86.47    83.18
X2        17                    2              83.93    81.82
X3        42                    0              76.96    73.63
X4        2                     1              83.93    79.40
Total X   94                    11             85.59    82.59
Total Y   1                     11             89.19    86.05
The main result shown in Table 32 is that the paste image features do not add information
for the prediction of the GAD since no component is required from this block to improve
the prediction of Y. This was expected since it was shown in previous work (Lauzon-Gauthier
et al. 2012) that GAD is well predicted by using only the routinely collected raw
material and paste plant data. However, had baked anode properties been
available in these experiments (e.g. anode electrical resistivity, baked anode density and
mechanical properties), the image features would likely have provided additional
information. Nevertheless, the fact that 77% of the variance in the features block (X3) falls
in the space of the three previous blocks (Z, X1 and X2) deserves further discussion. The
high degree of correlation between the image features and the raw materials and
formulation blocks is very important for two reasons. First, it validates the sensitivity of
the machine vision sensor to the raw material and formulation variations. Second, the
raw material properties (Z) are only available as weekly averages and the aggregate size
distributions (X1) are measured one to three times per 12 h shift. Since the paste image
features are highly correlated with those infrequent measurements, the
images may reveal changes in raw materials and aggregate size distribution in real
time, which is a major advantage of the proposed sensor. Hence, the machine
vision system could be used to monitor and control the paste plant and compensate for
infrequently measured variables.
The amount of variance of each block and GAD explained by the SMB-PLS model as well
as the relative contribution of the regressor blocks in each component are shown in Figure
87. The first observation is that most of the blocks contain some information about
subsequent blocks. This indicates that the blocks are correlated with each other. A useful
feature of the SMB-PLS algorithm is that it allows one to quickly quantify the amount of new
information each data block adds to the model. This cannot be obtained with traditional
multi-block PLS methods.
Figure 87 – Relative contribution (bars) of each regressor block in the SMB-PLS model.
The explained variance of each regressor block R2X (black lines) and of the Y block R2Y
(gray line) are also shown
Furthermore, it is possible that the image features would add information to a multivariate
statistical model of the paste plant data (i.e. raw material and operating conditions) if the
baked anode properties at the optimum pitch demand were used as the predicted variable
instead of the GAD. The laboratory results of the pitch demand experiment with the
laboratory formulation (section 6.4.3) showed sensitivity of the machine vision algorithm to
optimum pitch demand.
[Figure 87 axes: x, LV (Z-1 to Z-5, X1-1 to X1-3, X2-1, X2-2 and X4-1); y, relative weights, block R2X and R2Y (0 to 1); legend: Z, X1, X2, X3, X4, R2 Z, R2 X1, R2 X2, R2 X3, R2 X4 and R2 Y]
To complement the above discussion, some interpretation of the SMB-PLS model is now
provided using plots of the loading weights for the different data blocks. In particular, some
of the correlations between the image features block (X3) and the raw materials and
process blocks (Z, X1 and X2) are analyzed. The loading plots are shown in Figure 88 to
Figure 90.
Figure 88 – Loading weights of the raw material properties (Z) in component LV Z-1
The loading weights of the first LV of the raw material block are presented in Figure 88.
This component (Z-1) explains 46% of the variance of the GAD using only 15% of the
variance of the raw materials block. Note that the variance explained from the subsequent
blocks in this component is correlated with the information extracted from the first block in
the sequence (Z). This explains why only the Z block loadings are interpreted here. Coke
density and coke size distribution seem to be the main drivers for changes in GAD as
indicated by the loading weights (Figure 88).
[Figure 88 axes: y, w Z block LV Z-1 (46.42%); x, variables (Coke real dens, Coke 28/48 app dens, Coke RT 4, Coke RT 8, Coke RT 14, Coke RT 30, Coke RT 50, Coke RT 100, Coke RT 200, Pitch SP, Pitch TS, Pitch Beta, Pitch QI, Pitch CV and Pitch dist)]
Figure 89 – Block weights of LV Z-2: a) raw material (Z) and b) image features (X3)
The loadings of the second raw material block component (Z-2) are presented in Figure
89. This component captures 4 % of the variance in GAD, and 24 % and 17 % of the
variance in the raw material and the image features blocks, respectively. The main drivers
in this latent variable are a combination of coke particle size distribution and, most
importantly, pitch properties as shown in Figure 89 a). The paste’s pitch demand is
positively correlated with the Pitch QI (Hulse 2000). For these experiments, this change in
QI and its effect on the paste are captured by the higher energy in the high frequency bands
(1-3) and lower energy in the lower frequencies (4-7) (Figure 89 b). Also, the skewness,
which is an indication of inhomogeneity, increases.
[Figure 89 panels: a) w Z block LV Z-2 (3.66%) vs. variables (Coke real dens, Coke 28/48 app dens, Coke RT 4, Coke RT 8, Coke RT 14, Coke RT 30, Coke RT 50, Coke RT 100, Coke RT 200, Pitch SP, Pitch TS, Pitch Beta, Pitch QI, Pitch CV and Pitch dist); b) w X3 block LV Z-2 (3.66%) vs. variables (E, Skew, Kurt, GLCM ASM, GLCM Cont and GLCM Corr at levels 1 to 7)]
Figure 90 – Block weights for LV X1-2: a) formulation (X1) and b) image features (X3)
[Figure 90 panels: a) w X1 block LV X1-2 (4.27%) vs. variables (Paste (tph), Coarse %, Fines %, Inter. %, Butts %, Pitch %, Green recyc %, Fines rot valve speed, Butts Rt3/8+Rt4, Coarse Rt4, Coarse Rt8, Inter Rt50+Rt100, Fines Pt200, Agg Rt3/8, Agg Rt4@Rt30, Agg Rt50+Rt100, Agg Rt200+Pt200 and Agg Pt200); b) w X3 block LV X1-2 (4.27%) vs. variables (E, Skew, Kurt, GLCM ASM, GLCM Cont and GLCM Corr at levels 1 to 7)]
Finally, the loadings of the second component of the formulation block (LV X1-2) are
presented in Figure 90. The X1-2 component captures an additional 4% of the variance in
the GAD, and 6% and 2% of the variance of the formulation and the image features
blocks, respectively. This LV focuses mostly on changes in the dry aggregate particle size
distribution. The loading of the coarse fraction is negative while those of the butts % and
pitch % are positive (Figure 90 a). This relationship between the three variables has been
well explained in this chapter and is again captured by the SMB-PLS model. The positive
correlation between the butts largest sizes (Rt 3/8 (+3/8) and Rt4 (-3/8/+4)) and the
amount of large particles (Rt 3/8) in the dry aggregate is also captured by this LV. This
component clearly models the increase in the coarseness of the anode paste and it is
correlated with an increase in the energy of the low frequency detail coefficients number 5
to 7 (Figure 90 b), which correspond to a detail size of 1.3 mm to 10.4 mm (i.e. the coarsest
fraction of the anode paste).
7.5 Conclusions
It was easier to develop the machine vision algorithm with laboratory paste and anodes
due to the more controlled and larger span of formulation changes possible. The objective
of this thesis, however, is to apply this new measurement method to a real life industrial
anode manufacturing process. Therefore, it was necessary to test this sensor on real
industrial paste variations.
Industrial paste contains day to day variations in formulation, changes in raw materials and
other normal process variability. The first objective of this chapter was to validate the
sensitivity and robustness of the machine vision algorithm. The understanding of the
texture features gained from the laboratory experiments presented in Chapter 6 was also used
to interpret the texture variations of the industrial paste images. A second objective was to
verify if the paste image texture could add new information or complement existing
measurements. A multivariate statistical model of the paste plant using the SMB-PLS
algorithm developed in Chapter 5 was also built after fusion of plant and imaging sensor
data.
Three different datasets containing changes in paste formulation, different raw materials,
pitch design of experiments as well as a plant start-up transient operation were used to
test the sensitivity of the machine vision sensor to the industrial variability. The sensor was
found to be sensitive to normal operation formulation variability. It could capture the daily
butts/coarse % variations and the associated change in pitch level due to the effect of the
dry aggregate formulation on pitch demand. It could also capture independent variations of
the intermediate %. The variations of pitch and formulation within each day of the
experiment could also be observed with the image features. In addition, the transient state
of the paste during the start-up was tracked by the image features, which suggests that
the sensor is sensitive to fast variations of the paste appearance.
Then, the sensitivity to the variations in pitch was tested with six pitch optimization
experiments. In this case, the dry aggregate formulation variations as well as the pitch
changes could be detected using the image features. The interpretation of the texture
features using the PLS model weights is not as straightforward as with the laboratory
samples. However, the energy was still found to be sensitive to the pitch % in the paste and the
skewness and kurtosis to changes in formulation.
In an effort to obtain a quantitative measurement of the optimum pitch demand, the BAD of
anode core samples was measured on a few anodes spanning the range of tested pitch
levels for three experiments. Green core samples of anodes were baked under controlled
conditions to obtain a representative measurement of the BAD. Unfortunately, all the anodes
measured were on the dry side of the optimum pitch demand and the range of pitch
variations was not large enough to reach the optimum. Additional work is needed to verify
if the machine vision sensor is sensitive to the optimum pitch demand with the industrial paste.
Furthermore, the repeatability of the image features was studied using replicates of the
paste samples. It was found that the range of variability from the replicates is smaller than
the process variations measured by each component of the statistical models for both the
normal operation and pitch variations experiments.
Finally, an SMB-PLS model combining the raw material, paste plant operating conditions and
image features for the prediction of the GAD was presented. This was used to
verify the correlation of the machine vision sensor with the raw material and formulation
variables. The results have shown that the sensor could compensate for the lack of
real-time raw material properties and particle size distribution data.
Chapter 8 Conclusions and recommendations
The most important aspect of the anode quality regarding the performance of the
aluminium reduction cells is its consistency. This has become a challenge in the last
few years with the increasing variability of incoming raw material to the carbon plant due to
the degrading crude oil quality and the increasing frequency of supplier changes and
blending of different cokes. The lack of real-time quality monitoring and control of the
green and baked anodes, as well as the lack of sensors for the most important raw
material and process parameters (e.g. variations in particle size distribution, OPD, etc.)
impairs the plant's ability to compensate for this increased variability.
The general objective of this thesis was to address some of the issues related to the lack
of real-time control of the anode quality and the lack of fast and relevant
measurements to cope with raw materials and process variability.
Based on multiple industrial machine vision applications, a non-destructive image texture
analysis algorithm was developed to track changes in the anode paste texture (visual
appearance) and eventually relate this information to the anode paste quality. The anode
paste was the focus of this machine vision application because it enables the fast
sampling of large proportions of paste compared to the use of images from formed anode
surfaces. Also, sampling the paste minimizes the delays if the measurements are to be
used in a feedback control strategy.
A new sequential multi-block PLS algorithm was also developed to improve the
interpretation of existing empirical latent variable models of the baked anode quality, with
the prospect of adding new real-time measurements from the carbon plant to this model
in the future.
This chapter discusses the main conclusions and contributions from the specific objectives
and presents some future perspectives on the real-time quality control of the carbon plant.
8.2 Development of the machine vision sensor
The first specific objective was the development of a machine vision sensor using
laboratory scale paste and pressed anodes. This new sensor needed to be sensitive to
changes in formulation and in the pitch demand of the paste. The laboratory scale paste
was used since it enables more control of the paste formulation and a wider range of
variations. These datasets were also used to understand the relationship between the
variations in the paste and the image texture features.
The anode paste appearance is very dark and has low contrast variations. In addition, the
relevant changes in the paste formulation affecting the paste surface appearance were
found to modify the size of the objects in the paste (finer/coarser), its degree of roughness
(rougher/smoother) and homogeneity. Hence, the use of image texture features seemed
the most appropriate approach to characterize the paste visual appearance variability.
Combinations of preprocessing options, wavelet types, GLCM parameters and textural features
were tested using a dataset of images from a pitch optimization experiment on two
different cokes. The best preprocessing option was found to be contrast
enhancement of the images by adjusting the saturation of the extreme values of the pixel gray
level intensity distribution. This removed some of the sample-to-sample illumination
variations. It also improved the differentiation of the paste samples
based on the optimum pitch demand. A set of six features computed from seven DWT
decomposition levels (i.e. 42 features in total) was found to provide optimum prediction
and clustering of the anodes at the optimum pitch demand (OPD). The optimum pastes
were characterized by the concentration of the textural information in a particular
frequency band (i.e. decomposition level #4) and resulted in similar texture feature
combinations for both types of coke even if the optimum BAD and pitch % were not the
same.
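The layout of the 42-element feature vector can be sketched as follows. The feature labels are those used in the loading plots of Chapter 7; the naming scheme, the normalization of the energy and the use of non-excess kurtosis are assumptions made for illustration, and the three GLCM features are only named here, not computed.

```python
from itertools import product
from statistics import fmean

FEATURES = ["E", "Skew", "Kurt", "GLCM ASM", "GLCM Cont", "GLCM Corr"]
LEVELS = range(1, 8)  # seven DWT decomposition levels

# One column name per (feature, level) pair, e.g. "Skew_4" (hypothetical naming).
feature_names = [f"{name}_{level}" for name, level in product(FEATURES, LEVELS)]


def energy(coeffs):
    """Mean squared detail coefficient (assumed normalization)."""
    return fmean(c * c for c in coeffs)


def central_moment(coeffs, order):
    """Central moment of the given order about the mean of the coefficients."""
    mu = fmean(coeffs)
    return fmean((c - mu) ** order for c in coeffs)


def skewness(coeffs):
    """Third standardized moment; zero for a symmetric distribution."""
    return central_moment(coeffs, 3) / central_moment(coeffs, 2) ** 1.5


def kurtosis(coeffs):
    """Fourth standardized moment (not excess kurtosis)."""
    return central_moment(coeffs, 4) / central_moment(coeffs, 2) ** 2
```

In such a layout, each statistic would be evaluated on the detail coefficients of one decomposition level at a time, so that a loading on, say, the level 4 energy can be read directly from its position in the vector.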
The pitch demand dataset was used to develop the algorithm, but the other two datasets
were used to understand the relationship between the features and the paste variations.
The first dataset included variations in pitch ratio and in fines ratio. An increase in either
pitch or fines caused the paste to be more reflective. However, the pitch increased the
inhomogeneity of the paste in the high frequency details due to the increased specular
reflection of the pitch on the surface of the particles, whereas the fines had the opposite effect of
smoothing the paste image texture. Both phenomena could be captured by different latent
variables (components) of the PLS regression models between the image features and
formulation variables. The other dataset was designed to incorporate additional sources of
pitch demand variations such as changes in formulation, mixing temperature, fines
fineness, etc. The interpretation of these results was consistent with the first dataset. An
increase in pitch had the same effect as a decrease in pitch demand (e.g. addition of low
porosity shot coke) and resulted in less homogeneity in the high frequencies.
The ability to detect the OPD for two different types of coke is a major contribution to the
characterization of the paste quality since this is the first time it has been reported. The
understanding of the relationship between the process variations and the change in the
image texture is also an important contribution. In addition, the features used in the paste
quality sensor had already been reported in the literature, but this particular combination of
features had not been reported before. Finally, no reliable machine vision application on any
type of paste had been reported prior to this work.
8.3 Sensitivity and robustness to industrial paste
The final objective of the project was to develop a sensor for real-time paste quality
monitoring and control. It is not enough to show that it is sensitive to laboratory paste
quality. It was thus important to test the sensitivity and robustness of the machine vision
sensor on real industrial samples.
Three different datasets containing normal operation variations, plant start-up and
designed pitch variations were collected from the Alcoa Deschambault smelter’s paste
plant. It was shown that the sensor could capture the pitch demand variations due to the
change in formulation in butts and coarse fraction and its effect on the pitch % (i.e. less
homogeneity). Independent pitch variations introduced by the formulation could also be
captured separately by a different combination of features (i.e. smoother paste). The
sensor could also differentiate variations in the fines % and intermediate fractions using
different and orthogonal latent variables (i.e. they were driven by different combinations of
features).
The selection of the BAD to determine the optimum pitch demand of the dry aggregates
was validated using laboratory analysis based on four anode properties measured on
baked green core samples collected during two of the pitch variation experiments.
Finally, the major contribution from this work is the validation that it is possible to quantify
the variability in the paste plant from various process parameters using images of the
paste.
8.4 SMB-PLS algorithm
The development of the new sequential multi-block algorithm is a major fundamental
contribution of this thesis. The idea of this new method arose from the difficulty of interpreting
PLS models containing many variables (e.g. 100) from different process units for process
monitoring and troubleshooting. Multi-block PLS methods already exist in the literature and
some of them were developed more than 20 years ago in order to simplify the
interpretation of complex multi-block data sets. However, they have some drawbacks. The
mixing of information between the blocks and the absence of explicit consideration of the
correlations between the data blocks can both lead to misleading interpretations. In addition,
some multi-block methods simply remove the between-block correlated variations, which
results in a loss of information for interpretation purposes.
The most important improvement of the new algorithm is that correlated information
between a given block and subsequent ones is captured in the same latent variable space,
as opposed to the orthogonal information, which is captured by other components. When used
for the investigation of a process dataset, this feature highlights the effect of raw materials
on downstream process operating conditions and the effect of control loops between
variables from different blocks. Also, each new block in this sequential approach only
contains new information, so these components focus the interpretation on the most
important parameters without interference from previous blocks. Another key feature is the
possibility to select a different number of components for each block, which is particularly
useful when the blocks have very different statistical ranks.
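The idea of separating the correlated and the orthogonal information of a downstream block can be illustrated with a generic projection step. This is only a sketch of the standard least-squares deflation used in latent variable methods, with hypothetical data; it is not the SMB-PLS algorithm of Chapter 5, which also defines how the scores themselves are computed.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def deflate_block(block, t):
    """Split each column of `block` into the part along score t and the residual.

    block: list of columns (each a list of observations); t: score vector.
    Returns (explained, residual) with the same layout as `block`.
    """
    tt = dot(t, t)
    explained, residual = [], []
    for col in block:
        p = dot(t, col) / tt  # least-squares loading of this column on t
        explained.append([p * ti for ti in t])
        residual.append([x - p * ti for x, ti in zip(col, t)])
    return explained, residual


t_z = [-1.0, 0.0, 1.0]                       # hypothetical score from an upstream block
x1 = [[-2.0, 0.0, 2.0], [1.0, -2.0, 1.0]]    # two downstream variables (columns)
explained, residual = deflate_block(x1, t_z)
```

In this toy case the first downstream variable is entirely explained by the upstream score, while the second is orthogonal to it and passes unchanged into the residual, which would then be the only part left to be modelled by later components.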
The modeling performance and interpretation of the new SMB-PLS were illustrated using
two datasets. First, the simulated polymer film blowing process dataset contained two
different case studies, one without and the other with correlation between the blocks, that
were used to illustrate the pathway orthogonalization properties of the SMB-PLS. As
opposed to the MB-PLS algorithm, the correlated variations due to the feedback control
actions were captured by a different set of latent variables than the orthogonal variability.
Second, the anode manufacturing dataset was used to validate the new algorithm on a
real life dataset. It was again proven that the distribution of information contained in each
latent variable was different than with the MB-PLS algorithm in which it is not possible to
differentiate correlated and orthogonal information. Improvements in the interpretation
were also illustrated using the scores and block weights of the different algorithms using
the anode dataset.
Finally, the SMB-PLS algorithm was used to validate the correlation between the raw
material properties, paste plant operating conditions and the paste image features. The
dataset was used to predict the GAD. It was found that the image feature variations
correlated with the GAD were also correlated with raw material and process conditions,
validating the sensitivity of the sensor to paste plant variability. It was also used to
demonstrate the possible use of the SMB-PLS model when quantitative measurements of
the optimum pitch demand and baking furnace data become available.
8.5 Recommendations
8.5.1 Multivariate monitoring and control
The importance and use of multi-block PLS models will increase in the future because
databases continuously gain in complexity and size. For prediction only,
the usual PLS models are still the most effective and simple tools to use. However, for
troubleshooting and understanding, the multi-block models become more useful. As the
number of real-time measurements increases with the development of new
non-destructive sensors, access to good and reliable multi-block
methods will become important.
The focus of the presented multi-block results has been on interpretation, but many
more aspects of this algorithm remain to be tested:
• Fault detection ability compared to normal PLS and MB-PLS methods
• Using the orthogonalization for better selection of multivariate specification on raw
material properties and process operating conditions
• Implementing block based monitoring or control schemes in the latent variable
space
• Use SMB-PLS for other types of problems such as batch process analysis and
monitoring, where each batch phase could be regarded as a different block.
The effect of batch trajectory deviations in a specific block on the rest
of that batch could then be distinguished from the within-phase variations.
• Comparing different approaches for the selection of the number of latent variables.
A sequential approach was used in this thesis, but a more global approach could
be used by testing all possible combinations of the number of components for each
block.
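The global approach mentioned in the last item can be sketched as an exhaustive grid over the number of components per block. This is only an illustration: the block names, the maximum numbers of components and the scoring function `toy_cv_error` are hypothetical placeholders, and allowing a block to contribute zero components is an assumption. In practice the score would be a cross-validated prediction error obtained from the fitted SMB-PLS model.

```python
from itertools import product


def candidate_component_grids(max_components):
    """List every combination of component counts, one count per block.

    max_components maps a (hypothetical) block name to the largest number
    of components to try for that block. A block may receive 0 components,
    but the combination where every block has 0 is skipped because it
    would not define a model.
    """
    names = list(max_components)
    ranges = [range(0, max_components[name] + 1) for name in names]
    grids = []
    for combo in product(*ranges):
        if sum(combo) > 0:
            grids.append(dict(zip(names, combo)))
    return grids


def select_global(max_components, cv_error):
    """Return the combination that minimises a user-supplied CV error."""
    return min(candidate_component_grids(max_components), key=cv_error)


# Hypothetical blocks and upper bounds, for illustration only.
blocks = {"raw_materials": 3, "paste_plant": 2, "image_features": 2}
grids = candidate_component_grids(blocks)


def toy_cv_error(combo):
    """Stand-in for a cross-validated error; smallest at an arbitrary target."""
    target = {"raw_materials": 2, "paste_plant": 1, "image_features": 1}
    return sum((combo[name] - target[name]) ** 2 for name in combo)


best = select_global(blocks, toy_cv_error)
print(len(grids), best)
```

The number of candidate models grows as the product of the per-block ranges, which is the practical cost of the global search compared with the sequential selection used in this thesis.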
For a monitoring strategy in the carbon plant, latent variable models should be
implemented to detect major changes in raw material properties, in the coke fractions and
dry aggregate particle size distribution, and in the combination of baked anode properties.
The current practice is to use univariate statistical process control (SPC) to detect
abnormal situations, which is time consuming and inefficient due to the large number of
variables.
An additional research area that still has to be studied is the use of latent variable models in
optimization and control strategies. For example, a PLS model between all the paste plant
data (X) and the image features (Y) could be used to compute the combination
of changes to make in X to compensate for a deviation of the measured features from the
target Y scores. This could be a good application of the SMB-PLS model because X
would be a mix of variables that cannot be manipulated and others that can be changed.
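The compensation idea above can be sketched with the usual PLS relations. This is an illustration under stated assumptions, not a procedure taken from this work: $W^{*}$, $P$ and $Q$ denote the weight, X-loading and Y-loading matrices of a fitted PLS model with $A$ components, $Q$ is assumed to have full column rank, and every X variable is treated as adjustable. Handling the variables that cannot be manipulated would require additional constraints that are not shown here.

```latex
\begin{align}
t &= W^{*\mathsf{T}} x, \qquad \hat{y} = Q\,t \\
\Delta y &= y_{\text{target}} - y_{\text{measured}} \\
\Delta t &= \left(Q^{\mathsf{T}} Q\right)^{-1} Q^{\mathsf{T}}\,\Delta y \\
\Delta x &= P\,\Delta t
\end{align}
```

Because $W^{*\mathsf{T}} P = I$ for a PLS model, the move $\Delta x$ stays in the model plane and produces exactly the score change $\Delta t$.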
8.5.2 Real-time paste quality measurement
The laboratory results have shown the possibility to measure the optimum pitch demand
for two different cokes. The sensor could not only differentiate between under-pitched
and over-pitched anodes, but the combination of image features was also similar for both
cokes at the optimum pitch demand. This indicates that there is an opportunity to use the
sensor in a feedback control system to adjust the amount of pitch in the paste in real time
in response to raw material variability. Additional laboratory work is needed to test whether the
paste texture features are similar at the OPD for a different formulation as well. If so, it
would enable control of the pitch % in response to changes in formulation. It is suggested
that plant trials be conducted to find the optimum pitch level based on a full pitch
optimization procedure. This is the only way to verify that the laboratory results (i.e.
detection of the OPD) can also be repeated in the industrial paste plant.
For industrial implementation, it is necessary to develop an automatic paste sampling and
imaging device. Manual sampling could be sufficient for an off-line industrial use, but an
automatic method with a fast sampling rate for each anode produced can enable the use
of the machine vision sensor for real-time monitoring or control. Then the machine vision
control scheme could be used for many applications:
1. Off-line monitoring during raw material changes
2. Off-line monitoring of design of experiment variation in pitch, formulation and
process operating conditions
3. On-line monitoring of target texture
4. On-line control of the pitch ratio in the paste
5. On-line multivariate control of both the formulation and pitch % in response to raw
material variations
The success of long-term implementation of the sensor will depend more on the
robustness of the automatic sampling and imaging procedure than on the analysis
algorithm itself. Representative sampling and consistent, uniform illumination are some
of the challenges of industrial imaging applications. For anode paste machine vision,
the fumes emitted from the hot paste could be an issue for the illumination and image
quality because volatile compounds may stick to the light and lens surfaces.
However, devices using compressed air already exist to keep the lens surface clean. Also,
if the images cannot be obtained directly from an existing conveyor, the
automatic paste sampler will need to be designed to be self-cleaning to avoid paste clogging.
Finally, the industrial samples collected show that the paste plant is operated on the dry
side of the optimum pitch demand. An effort should be made to quantify the cost of
consistently using anodes that are below the optimum. This would help build a case study
for the implementation of this machine vision sensor in real-time in the process. The
control of this optimum pitch demand is probably one of the most important tools needed
to face the current paste plant variability challenges.
Bibliography
Adams, A.N. et al., 2009. Personal Communication.
Adams, A.N., Coleman, D.E. & Blake, R.A., 2007. Personal Communication.
Adams, A.N., Mathews, J.P. & Schobert, H.H., 2002. The Use of Image Analysis for the Optimization of Pre-Baked Anode Formulation. In Light Metals 2002. TMS, pp. 547–552.
Antonini, M. et al., 1992. Image coding using wavelet transform. IEEE Transactions on Image Processing, 1(2), pp.205–220.
Azari Dorcheh, K., 2013. Investigation of the materials and paste relationships to improve forming process and anode quality. Ph.D. Thesis. Université Laval.
Azari, K. et al., 2013. Mixing variables for prebaked anodes used in aluminum production. Powder Technology, 235, pp.341–348.
Baron, J.T., McKinney, S.A. & Wombles, R.H., 2009. Coal tar pitch – past, present, and future. In Light Metals 2009. TMS, pp. 935–939.
Belitskus, D., 1993. An Evaluation of Relative Effects of Coke, Formulation, and Baking Factors on Aluminum Reduction Cell Anode Performance. Light Metals 1993, pp.677–681.
Belitskus, D., 1981. Effect of carbon recycle materials on properties of bench scale prebaked anodes for aluminum smelting. Metallurgical Transactions B, 12, pp.135–139.
Belitskus, D., 1978. Effects of Coke and Formulation Variables on Fracture of Bench Scale Prebaked Anodes for Aluminum Smelting. Metallurgical Transactions B, 9, pp.705–710.
Belitskus, D., 2013. Effects of Mixing Variables and Mold Temperature on Prebaked Anode Quality. In A. Tomsett & J. Johnson, eds. Essential Readings in Light Metals. John Wiley & Sons, Inc., pp. 328–332.
Belitskus, D. & Danka, D.J., 1988. The effects of petroleum coke properties on carbon anode quality. JOM, 40(11), pp.28–29.
Bharati, M.H., Liu, J.J. & MacGregor, J.F., 2004. Image texture analysis: methods and comparisons. Chemometrics and intelligent laboratory systems, 72(1), pp.57–71.
Bruno, L., Parla, G. & Celauro, C., 2012. Image analysis for detecting aggregate gradation in asphalt mixture from planar images. Construction and Building Materials, 28(1), pp.21–30.
Burnham, A.J., MacGregor, J.F. & Viveros, R., 1999. Latent variable multivariate regression modeling. Chemometrics and Intelligent Laboratory Systems, 48(2), pp.167–180.
Burnham, A.J., Viveros, R. & MacGregor, J.F., 1996. Frameworks for latent variable multivariate regression. Journal of Chemometrics, 10(1), pp.31–45.
Charmier, F., Martin, O. & Gariepy, R., 2015. Development of the AP Technology Through Time. JOM, 67(2), pp.336–341.
Chen, S.S., Keller, J.M. & Crownover, R.M., 1993. On the calculation of fractal features from images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 15(10), pp.1087–1090.
Chui, C.K., 1992. An Introduction to Wavelets, Academic Press.
Clausi, D.A., 2002. An analysis of co-occurrence texture statistics as a function of grey level quantization. Canadian Journal of Remote Sensing, 28(1), pp.45–62.
Cross, G.R. & Jain, A.K., 1983. Markov Random Field Texture Models. Pattern Analysis and Machine Intelligence, IEEE Transactions on, PAMI-5(1), pp.25–39.
Dayal, B.S. & MacGregor, J.F., 1997. Improved PLS algorithms. Journal of Chemometrics, 11(1), pp.73–85.
Debnath, L. & Shah, F.A., 2015. Wavelet Transforms and Their Applications, Boston, MA: Birkhäuser Boston.
Dequiedt, A.-S. et al., 2001. Study of phase dispersion in concrete by image analysis. Cement and Concrete Composites, 23(2–3), pp.215–226.
Duchesne, C., 2000. Improvement of processes and product quality through multivariate data analysis. Ph.D. Thesis. McMaster University.
Duchesne, C., 2010. Multivariate Image Analysis in Mineral Processing. In D. Sbárbaro & R. del Villar, eds. Advanced Control and Supervision of Mineral Processing Plants. Advances in Industrial Control. Springer London, pp. 85–142.
Duchesne, C., Liu, J.J. & MacGregor, J.F., 2012. Multivariate image analysis in the process industries: A review. Chemometrics and Intelligent Laboratory Systems, 117, pp.116–128.
Duchesne, C. & MacGregor, J.F., 2004. Establishing Multivariate Specification Regions for Incoming Materials. Journal of Quality Technology, 36(1), pp.78–94.
Duchesne, C. & MacGregor, J.F., 2001. Jackknife and bootstrap methods in the identification of dynamic models. Journal of Process Control, 11(5), pp.553–564.
Edwards, L. et al., 2012. Evolution of anode grade coke quality. In Light Metals 2012. TMS.
Edwards, L. et al., 2009. Use of shot coke as an anode raw material. In Light Metals 2009. TMS, pp. 985–990.
Eilertsen, J.L. et al., 1996. An automatic image analysis of coke texture. Carbon, 34(3), pp.375–385.
Eriksson, L. et al., 2001. Multi- and megavariate data analysis: principles and applications.
Facco, P. et al., 2010. Automatic characterization of nanofiber assemblies by image texture analysis. Chemometrics and Intelligent Laboratory Systems, 103(1), pp.66–75.
Facco, P. et al., 2009. Monitoring roughness and edge shape on semiconductors through multiresolution and multivariate image analysis. AIChE Journal, 55(5), pp.1147–1160.
Fischer, W.K. et al., 1995. Anodes for the aluminium industry, R & D Carbon Ltd.
Fischer, W.K. et al., 1993. Baking Parameters and the Resulting Anode Quality. In Light Metals 1993. TMS, pp. 683–694.
Fischer, W.K. & Perruchoud, R.C., 1985. Influence of Coke Calcining Parameters on Petroleum Coke Quality. In Light Metals 1985. TMS, pp. 811–826.
Fischer, W.K. & Perruchoud, R.C., 1991. Interdependence Between Properties of Anode Butts and Quality of Prebaked Anodes. In Light Metals 1991. TMS, pp. 721–724.
Galloway, M.M., 1975. Texture analysis using gray level run lengths. Computer Graphics and Image Processing, 4(2), pp.172–179.
García-Muñoz, S. & Carmody, A., 2010. Multivariate wavelet texture analysis for pharmaceutical solid product characterization. International Journal of Pharmaceutics, 398(1-2), pp.97–106.
Geladi, P. & Kowalski, B.R., 1986. Partial least-squares regression: a tutorial. Analytica Chimica Acta, 185, pp.1–17.
Gonzalez, R.C. & Woods, R.E., 2008. Digital Image Processing, Prentice Hall.
Gosselin, R. et al., 2009. Potential of Hyperspectral Imaging for Quality Control of Polymer Blend Films. Industrial & Engineering Chemistry Research, 48(6), pp.3033–3042.
Gosselin, R., Duchesne, C. & Rodrigue, D., 2008. On the characterization of polymer powders mixing dynamics by texture analysis. Powder Technology, 183(2), pp.177–188.
Grahn, H. & Geladi, P., 2007. Techniques and applications of hyperspectral image analysis, Chichester England; Hoboken NJ: J. Wiley.
Grégoire, F., Gosselin, L. & Alamdari, H., 2013. Sensitivity of Carbon Anode Baking Model Outputs to Kinetic Parameters Describing Pitch Pyrolysis. Industrial & Engineering Chemistry Research, 52(12), pp.4465–4474.
Grjotheim, K. & Kvande, H., 1993. Introduction to Aluminium Electrolysis: Understanding the Hall-Héroult Process 2nd ed., Düsseldorf, Germany: Aluminium-Verlag.
Hanafi, M. et al., 2006. Common components and specific weight analysis and multiple co-inertia analysis applied to the coupling of several measurement techniques. Journal of Chemometrics, 20(5), pp.172–183.
Haralick, R.M., Shanmugam, K. & Dinstein, I., 1973. Textural Features for Image Classification. IEEE Transactions on Systems, Man and Cybernetics, 3(6), pp.610–621.
Hassani, S. et al., 2012. Model validation and error estimation in multi-block partial least squares regression. Chemometrics and Intelligent Laboratory Systems, 117, pp.42–53.
Höskuldsson, A., 2008. Multi-block and path modelling procedures. Journal of Chemometrics, 22(11-12), pp.571–579.
Höskuldsson, A., 2014. Path regression models and process control optimisation. Journal of Chemometrics, 28(4), pp.235–248.
Höskuldsson, A., 1988. PLS regression methods. Journal of Chemometrics, 2(3), pp.211–228.
Höskuldsson, A. & Svinning, K., 2006. Modelling of multi-block data. Journal of Chemometrics, 20(8-10), pp.376–385.
Hulse, K.L., 2000. Anode manufacture: raw materials, formulation and processing parameters, Sierre, Switzerland: R&D Carbon Ltd.
Jentoftsen, T. et al., 2009. Correlation between anode properties and cell performance. In Light Metals. pp. 301–304.
Jones, S.S., 1986. Anode-Carbon Usage in the Aluminum Industry. In J. D. Bacha, J. W. Newman, & J. L. White, eds. Petroleum-Derived Carbons. Washington, DC: American Chemical Society, pp. 234–250.
Jørgensen, K. et al., 2004. A comparison of methods for analysing regression models with both spectral and designed variables. Journal of Chemometrics, 18(10), pp.451–464.
Jørgensen, K., Mevik, B.-H. & Næs, T., 2007. Combining designed experiments with several blocks of spectroscopic data. Chemometrics and Intelligent Laboratory Systems, 88(2), pp.154–166.
Jørgensen, K. & Næs, T., 2008. The use of LS–PLS for improved understanding, monitoring and prediction of cheese processing. Chemometrics and Intelligent Laboratory Systems, 93(1), pp.11–19.
Keller, F. & Sulger, P.O., 2008. Anode Baking: Baking of Anodes for the Aluminum Industry 2nd ed., Sierre, Switzerland: R&D Carbon Ltd.
Kohonen, J. et al., 2008. Multi-block methods in multivariate process control. Journal of Chemometrics, 22(3-4), pp.281–287.
Kourti, T., 2005. Application of latent variable methods to process control and multivariate statistical process control in industry. International Journal of Adaptive Control and Signal Processing, 19(4), pp.213–246.
Kourti, T. & MacGregor, J.F., 1995. Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemometrics and Intelligent Laboratory Systems, 28(1), pp.3–21.
Kourti, T., Nomikos, P. & MacGregor, J.F., 1995. Analysis, monitoring and fault diagnosis of batch processes using multiblock and multiway PLS. Journal of Process Control, 5(4), pp.277–284.
Lauzon-Gauthier, J., 2011. Multivariate latent variable modelling of the pre-baked anode manufacturing process used in aluminum smelting. M.Sc. Thesis. Québec city, Canada: Laval University.
Lauzon-Gauthier, J., Duchesne, C. & Tessier, J., 2012. A Database Approach for Predicting and Monitoring Baked Anode Properties. JOM Journal of the Minerals, Metals and Materials Society, 64(11), pp.1334–1342.
Lauzon-Gauthier, J., Duchesne, C. & Tessier, J., 2013. Diagnosing Changes in Baked Anode Properties using a Multivariate Data-Driven Approach. In B. A. Sadler, ed. Light Metals 2013. John Wiley & Sons, Inc., pp. 1219–1223.
Lauzon-Gauthier, J., Duchesne, C. & Tessier, J., 2014. Texture Analysis of Anode Paste Images. In J. Grandfield, ed. Light Metals 2014. John Wiley & Sons, Inc., pp. 1123–1126.
Liu, J., 2005. Machine Vision for Process Industries: Monitoring, Control, and Optimization of Visual Quality of Processes and Products. Hamilton, Ont., Canada: McMaster University.
Liu, J.J. et al., 2005. Flotation froth monitoring using multiresolutional multivariate image analysis. Minerals Engineering, 18(1), pp.65–76.
Liu, J.J. & Han, C., 2011. Wavelet texture analysis in process industries. Korean Journal of Chemical Engineering, 28(9), pp.1814–1823.
Liu, J.J. & MacGregor, J.F., 2006. Estimation and monitoring of product aesthetics: application to manufacturing of “engineered stone” countertops. Machine Vision and Applications, 16(6), pp.374–383.
Liu, J.J. & MacGregor, J.F., 2008. Froth-based modeling and control of flotation processes. Minerals Engineering, 21(9), pp.642–651.
Liu, J.J. & MacGregor, J.F., 2005. Modeling and Optimization of Product Appearance: Application to Injection-Molded Plastic Panels. Industrial & Engineering Chemistry Research, 44(13), pp.4687–4696.
Liu, J.J. & MacGregor, J.F., 2007. On the extraction of spectral and spatial information from images. Chemometrics and intelligent laboratory systems, 85(1), pp.119–130.
Livens, S. et al., 1997. Wavelets for texture analysis, an overview. In Sixth International Conference on Image Processing and Its Applications, 1997. pp. 581–585 vol.2.
MacGregor, J.F. et al., 1994. Process monitoring and diagnosis by multiblock PLS methods. AIChE Journal, 40(5), pp.826–838.
MacGregor, J.F. & Kourti, T., 1995. Statistical process control of multivariate processes. Control Engineering Practice, 3(3), pp.403–414.
Maillard, P., 2003. Comparing Texture Analysis Methods through Classification. Photogrammetric Engineering & Remote Sensing, 69(4), pp.357–367.
Mallat, S.G., 1989. A theory for multiresolution signal decomposition: the wavelet representation. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 11(7), pp.674–693.
Mannweiler, U., Fischer, W.K. & Perruchoud, R.C., 2009. Carbon products: A major concern to aluminum smelters. In Light Metals 2009. TMS, pp. 909–911.
Mannweiler, U. & Keller, F., 1994. The design of a new anode technology for the aluminium industry. JOM, (46), pp.15–21.
Martens, H., 2001. Reliable and relevant modelling of real world data: a personal account of the development of PLS Regression. Chemometrics and Intelligent Laboratory Systems, 58(2), pp.85–95.
Materka, A., Strzelecki, M. & others, 1998. Texture analysis methods–a review. Technical university of lodz, institute of electronics, COST B11 report, Brussels, pp.9–11.
McHenry, E.R., Baron, J.T. & Krupinski, K.C., 1998. Development of Anode Binder Pitch Laboratory Characterization Methods. LIGHT METALS-WARRENDALE-, pp.769–778.
Menichelli, E. et al., 2014. SO-PLS as an exploratory tool for path modelling. Food Quality and Preference, 36, pp.122–134.
Næs, T. et al., 2011. Path modelling by sequential PLS regression. Journal of Chemometrics, 25(1), pp.28–40.
Nomikos, P. & MacGregor, J.F., 1995. Multivariate SPC Charts for Monitoring Batch Processes. Technometrics, 37(1), pp.41–59.
Prasad, L. & Iyengar, S.S., 1997. Wavelet Analysis with Applications to Image Processing, CRC Press.
Prats-Montalbán, J.M. et al., 2009. Prediction of skin quality properties by different Multivariate Image Analysis methodologies. Chemometrics and Intelligent Laboratory Systems, 96(1), pp.6–13.
Prats-Montalbán, J.M., de Juan, A. & Ferrer, A., 2011. Multivariate image analysis: A review with applications. Chemometrics and Intelligent Laboratory Systems, 107(1), pp.1–23.
Reis, M.S. & Bauer, A., 2010. Image-based classification of paper surface quality using wavelet texture analysis. Computers & Chemical Engineering, 34(12), pp.2014–2021.
Reis, M.S. & Bauer, A., 2009. Wavelet texture analysis of on-line acquired images for paper formation assessment and monitoring. Chemometrics and Intelligent Laboratory Systems, 95(2), pp.129–137.
Rioul, O. & Vetterli, M., 1991. Wavelets and signal processing. IEEE signal processing magazine, pp.14–38.
Rorvik, S., Ratvik, A.P. & Foosnaes, T., 2006. Characterisation of green anode materials by image analysis. Light metals, pp.553–558.
Sadler, B.A., 2012. Diagnosing Anode Quality Problems Using Optical Macroscopy. In Light Metals 2012. TMS, pp. 1289–1292.
Sarkar, T.K. et al., 1998. A tutorial on wavelets from an electrical engineering perspective. I. Discrete wavelet techniques. IEEE Antennas and Propagation Magazine, 40(5), pp.49–68.
Scheunders, P. et al., 1997. Wavelet-based Texture Analysis. Int. Journal of Computer Science and Information Management, Special issue on Image Processing (IJCSIM), 1.
Selvan, S. & Ramakrishnan, S., 2007. SVD-Based Modeling for Image Texture Classification Using Wavelet Transformation. IEEE Transactions on Image Processing, 16(11), pp.2688–2696.
Sinclair, K.A. & Sadler, B.A., 2006. Improving carbon plant operations through the better use of data. In Light Metals 2006. TMS, pp. 577–582.
Sinclair, K.A. & Sadler, B.A., 2009. Which strategy to use when sampling anodes for coring and analysis? Start with how the data will be used. In Light Metals 2009. TMS, pp. 1037–1041.
Smilde, A.K., Westerhuis, J.A. & de Jong, S., 2003. A framework for sequential multiblock component methods. Journal of Chemometrics, 17(6), pp.323–337.
Soh, L.-K. & Tsatsoulis, C., 1999. Texture analysis of SAR sea ice imagery using gray level co-occurrence matrices. IEEE Transactions on Geoscience and Remote Sensing, 37(2), pp.780–795.
Sonka, M., Hlavac, V. & Boyle, R., 2008. Image processing, analysis, and machine vision 3rd ed., Thomson Learning.
Srinivasan, G.N. & Shobha, G., 2008. Statistical texture analysis. In Proceedings of world academy of science, engineering and technology. pp. 1264–1269.
Stark, H.-G., 2005. Wavelets and signal processing: an application-based introduction, Berlin ; New York: Springer.
Sun, C. & Wee, W.G., 1983. Neighboring gray level dependence matrix for texture classification. Computer Vision, Graphics, and Image Processing, 23(3), pp.341–352.
Tabereaux, A., 2000. Prebake cell technology: A global review. JOM, 52(2), pp.23–29.
Tessier, J. et al., 2008. Estimation of alumina content of anode cover materials using multivariate image analysis techniques. Chemical Engineering Science, 63(5), pp.1370–1380.
Tessier, J., Duchesne, C., Tarcy, G.P., et al., 2011. Multivariate Analysis and Monitoring of the Performance of Aluminum Reduction Cells. Industrial & Engineering Chemistry Research, 51(3), pp.1311–1323.
Tessier, J., Duchesne, C. & Bartolacci, G., 2007. A machine vision approach to on-line estimation of run-of-mine ore composition on conveyor belts. Minerals Engineering, 20(12), pp.1129–1144.
Tessier, J., Duchesne, C. & Tarcy, G.P., 2011. Multiblock Monitoring of Aluminum Reduction Cells Performance. In S. J. Lindsay, ed. Light Metals 2011. John Wiley & Sons, Inc., pp. 407–412.
Usevitch, B.E., 2001. A tutorial on modern lossy wavelet image compression: foundations of JPEG 2000. Signal Processing Magazine, IEEE, 18(5), pp.22–35.
Valle, S., Li, W. & Qin, S.J., 1999. Selection of the Number of Principal Components: The Variance of the Reconstruction Error Criterion with a Comparison to Other Methods. Ind. Eng. Chem. Res., 38(11), pp.4389–4401.
Van de Wouwer, G., Scheunders, P. & Van Dyck, D., 1999. Statistical texture characterization from discrete wavelet representations. IEEE Transactions on Image Processing, 8(4), pp.592–598.
Vitchus, B., Cannova, F. & Childs, H., 2013. Calcined Coke from Crude Oil to Customer Silo. In A. Tomsett & J. Johnson, eds. Essential Readings in Light Metals. John Wiley & Sons, Inc., pp. 1–10.
Wangen, L.E. & Kowalski, B.R., 1989. A multiblock partial least squares algorithm for investigating complex chemical systems. Journal of Chemometrics, 3(1), pp.3–20.
Wavelet Toolbox Documentation, 2015. Wavelet Toolbox Documentation. Mathworks.com. Available at: http://www.mathworks.com/help/wavelet/index.html [Accessed June 28, 2015].
Westerhuis, J.A. & Coenegracht, P.M.J., 1997. Multivariate modelling of the pharmaceutical two-step process of wet granulation and tableting with multiblock partial least squares. Journal of Chemometrics, 11(5), pp.379–392.
Westerhuis, J.A., Gurden, S.P. & Smilde, A.K., 2000. Generalized contribution plots in multivariate statistical process monitoring. Chemometrics and Intelligent Laboratory Systems, 51(1), pp.95–114.
Westerhuis, J.A., Kourti, T. & MacGregor, J.F., 1998. Analysis of multiblock and hierarchical PCA and PLS models. Journal of Chemometrics, 12(5), pp.301–321.
Westerhuis, J.A. & Smilde, A.K., 2001. Deflation in multiblock PLS. Journal of Chemometrics, 15(5), pp.485–493.
Wise, B.M. & Gallagher, N.B., 1996. The process chemometrics approach to process monitoring and fault detection. Journal of Process Control, 6(6), pp.329–348.
Wold, S., 1995. Chemometrics; what do we mean with it, and what do we want from it? Chemometrics and Intelligent Laboratory Systems, 30(1), pp.109–115.
Wold, S., 1978. Cross-Validatory Estimation of the Number of Components in Factor and Principal Components Models. Technometrics, 20(4), pp.397–405.
Wold, S., Trygg, J., et al., 2001. Some recent developments in PLS modeling. Chemometrics and Intelligent Laboratory Systems, 58(2), pp.131–150.
Wold, S., Esbensen, K. & Geladi, P., 1987. Principal component analysis. Chemometrics and Intelligent Laboratory Systems, 2(1-3), pp.37–52.
Wold, S., Sjöström, M. & Eriksson, L., 2001. PLS-regression: a basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 58(2), pp.109–130.
Yousefian-Jazi, A. et al., 2014. Decision support in machine vision system for monitoring of TFT-LCD glass substrates manufacturing. Journal of Process Control, 24(6), pp.1015–1023.
Yue, Z.Q. & Morin, I., 1996. Digital image processing for aggregate orientation in asphalt concrete mixtures. Canadian Journal of Civil Engineering, 23(2), pp.480–489.
Zhang, J., Wang, X. & Palmer, S., 2007. Objective Grading of Fabric Pilling with Wavelet Texture Analysis. Textile Research Journal, 77(11), pp.871–879.
Appendix A Update of the anode properties prediction model
Previous work presented a PLS model used for the prediction of baked anode properties
(Lauzon-Gauthier 2011; Lauzon-Gauthier et al. 2012) and for the investigation of process
deviations (Lauzon-Gauthier et al. 2013). Updated results are presented in this appendix to
discuss the robustness of this monitoring approach with a long-term (i.e. six years)
industrial dataset.
The model used as a comparison basis was computed from the dataset presented in JOM
(Lauzon-Gauthier et al. 2012). This dataset contained 708 and 375 anodes in the training
and validation sets respectively and spans the period from February 2009 to December
2011. Since then, 965 new anodes have been collected up to July 2014. For this PLS
model, 88 X variables are used instead of the 92 in the reported results. One coke
property and one process variable stopped being measured. Also, the anode height is
almost a direct measurement of the GAD, which is in the Y dataset, because the other
dimensions of the anode are fixed. It was decided to remove this variable to verify the
ability to predict the GAD without it. The variables in the Y dataset are the same as in the
physical properties model in JOM.
A PLS model was computed on the training set (2/3 of the original dataset) and the
number of LVs was chosen by cross-validation on 10 random subsets of observations. Nine
components were chosen because this minimized the RMSEPCV of most variables. The
validation set (1/3 of the original dataset) was used to compute the prediction
performance. Table 33 presents the fit (R2Y) and the prediction performance in
cross-validation (Q2Y CV), for the validation set (Q2Y Pred original data) and for new data
(Q2Y Pred new data), for each variable and overall.
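The statistics reported above can be illustrated with a short sketch. It assumes the common definitions Q2 = 1 − PRESS/SStot (expressed in %) and RMSEP = sqrt(PRESS/n), with the total sum of squares taken about a reference mean supplied by the caller; the exact conventions used for the tables in this appendix may differ. The function names, the seed and the toy numbers are illustrative only.

```python
import math
import random


def random_folds(n_obs, n_folds=10, seed=0):
    """Split observation indices into n_folds random, non-overlapping subsets."""
    idx = list(range(n_obs))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def q2_percent(y_true, y_pred, y_ref_mean):
    """Q2 (or R2 when y_pred are fitted values), in percent.

    PRESS is the sum of squared prediction errors; the total sum of
    squares is taken about y_ref_mean (an assumed convention).
    """
    press = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_ref_mean) ** 2 for yt in y_true)
    return 100.0 * (1.0 - press / ss_tot)


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction."""
    press = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    return math.sqrt(press / len(y_true))


# 708 is the training-set size quoted in the text; the fold split is random.
folds = random_folds(708, n_folds=10, seed=0)

# Toy values, not thesis data.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
q2_value = q2_percent(y_true, y_pred, y_ref_mean=2.5)
rmsep_value = rmsep(y_true, y_pred)
print(round(q2_value, 2), round(rmsep_value, 4))
```

With these definitions, a Q2 value close to zero means the model predicts no better than the reference mean.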
Table 33 – Performance statistics of the original dataset PLS model in cross-validation, prediction of the validation set and prediction of new data

Variable               R2Y (%)   Q2Y CV (%)   Q2Y Pred original data (%)   Q2Y Pred new data (%)
GAD Green app dens     59,03     55,29        46,90                        4,66
Green weight           68,30     65,01        61,29                        0,04
Baked weight (mean)    81,63     79,53        83,18                        3,70
Thermal cond           25,61     20,54        21,46                        0,89
BAD Baked app dens     39,60     32,95        30,56                        9,21
Real dens              41,93     37,74        44,91                        16,76
Comp strength          25,66     16,93        22,68                        7,24
Lc                     54,85     50,22        57,80                        29,54
Youngs mod             29,50     21,88        17,29                        11,60
Elect resis            46,93     42,37        55,27                        14,87
total                  47,30     42,25        44,13                        9,85

The performance of the model in prediction has been discussed in the previous work, but it
is important to observe that Q2Y on the new dataset is not adequate. Plots of the residual and
Hotelling’s T2 (Figure 91) can be used to diagnose these prediction issues. The
computation of the two statistics and of the 95% control limits (i.e. the red dashed lines) is
presented in Kourti & MacGregor (1995).
Figure 91 – Model residuals: a) Hotelling’s T2 and b) prediction residual
(Plot not reproduced. Horizontal axis: anode cores, 0 to 900, covering 2012 to 2014. Vertical axes: Hotelling's T2 in panel a) and residual in panel b). Observations A, B and C are marked.)
Figure 91 a) presents the T2 statistic for the projection of the new observations onto the
original PLS model. Except for the excursions indicated by the red arrows, the projection in
the latent variable space is normal for most observations.
Figure 91 b) presents the model residuals for the prediction of the new anode properties
(Y) from the original model. Up to observation 300 (November 2013) the residual is higher
than the limit, but it is still acceptable. The most important variations in the carbon plant
come from the raw materials. If every new batch of coke or pitch has properties different
from what was previously used in the plant, the model will always be less robust to new
data and will need periodic updating. After observation 300 the residual is very large, with
some spikes indicated by the red arrows. The fact that the observations marked with these
arrows have both a large residual and a large T2 indicates gross outliers in the data that need
investigation.
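The two monitoring statistics discussed above can be sketched as follows. This is a minimal illustration assuming the usual definitions: Hotelling's T2 as the sum of squared scores scaled by the training-set score variances, and the residual as the sum of squared prediction errors over the Y variables. The control limits are taken as given inputs; their computation is described in Kourti & MacGregor (1995) and is not reproduced here. All names and numbers are hypothetical.

```python
def hotelling_t2(scores, score_variances):
    """T2 of one observation: sum over components of t_a^2 / s_a^2."""
    return sum(t * t / s2 for t, s2 in zip(scores, score_variances))


def squared_prediction_error(y_obs, y_pred):
    """Residual of one observation: sum of squared prediction errors."""
    return sum((yo - yp) ** 2 for yo, yp in zip(y_obs, y_pred))


def diagnose(t2, spe, t2_limit, spe_limit):
    """Label an observation from its two statistics and their limits.

    Both statistics above their limits is treated here as a possible
    gross outlier, following the interpretation given in the text.
    """
    if t2 > t2_limit and spe > spe_limit:
        return "possible gross outlier"
    if spe > spe_limit:
        return "poorly predicted"
    if t2 > t2_limit:
        return "unusual within the model plane"
    return "normal"


# Toy values, not thesis data.
t2_value = hotelling_t2(scores=[2.0, -1.0], score_variances=[4.0, 0.25])
spe_value = squared_prediction_error([1.0, 0.0, -1.0], [0.5, 0.0, 1.0])
label = diagnose(t2_value, spe_value, t2_limit=4.0, spe_limit=3.0)
print(t2_value, spe_value, label)
```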
It is possible to compute contribution plots (Kourti & MacGregor 1995) of the residual for
each observation. These plots (Figure 92) indicate the combination of X variables that are
associated with the lack of prediction for a particular observation. Three observations
indicated by letters in Figure 91 were investigated. Observation A belongs to
the small-residual period of the first 300 new observations. Its contribution plot is
presented in Figure 92 a). The main contributors are coke and pitch properties that
differ from the historical dataset. The fine feeder’s rotating valve speed is also a
contributor since it was rebuilt in 2013 and has been operated at a different set point since then.
The lack of robustness to raw material variations has been discussed earlier, but it is
possible that this issue will diminish with enough historical data spanning a wide range of
possible property combinations. The contribution plot for observation B is presented in
Figure 92 b). The same fine feeder valve speed is a contributor, but this time the major
contributors are the particle size distributions of the coke and butts fractions and of the dry
aggregate. This is due to the tightening of the size distribution range of the coke and butts
fractions. The span of the new operating parameters was not contained in the original
dataset, which caused the deviation in prediction. Only one observation was used to illustrate
this case, but it is consistent for most observations after #300. The last observation to be
investigated is one of the gross outliers (point C). Its contribution plot is presented in
Figure 92 c). Only one variable contributes to this lack of prediction: a coke
property measured in the laboratory. The value entered in the database was 10 times
higher than its normal range. This is probably due to a manual entry error and was not
196
detected using the normal weekly average SPC monitoring of this variable. If it was
detected in a timely manner (i.e. as soon as the data are available) it could have been
retested of checked with the recorded results.
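The residual contributions discussed above can be illustrated with a minimal sketch. It assumes a single-response PLS model fitted with NIPALS on autoscaled synthetic data, whereas the model in this work has several responses and uses plant data. The function names (`fit_pls1`, `residual_contributions`), the block structure of the simulated X variables and the size of the simulated entry error are all hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model (NIPALS) on already scaled X and y."""
    Xd, yd = X.copy(), y.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xd.T @ yd
        w /= np.linalg.norm(w)          # X weights
        t = Xd @ w                      # scores
        tt = t @ t
        p = Xd.T @ t / tt               # X loadings
        c = yd @ t / tt                 # y loading
        Xd -= np.outer(t, p)            # deflate X
        yd -= c * t                     # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P = np.column_stack(W), np.column_stack(P)
    R = W @ np.linalg.inv(P.T @ W)      # maps a scaled x directly to its scores
    return R, P, np.array(q)

def residual_contributions(x_scaled, R, P):
    """Signed X residual of one observation, one value per variable."""
    t = x_scaled @ R
    return x_scaled - t @ P.T

# Synthetic data (assumption): 10 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
n = 200
T = rng.normal(size=(n, 2))
L = np.zeros((2, 10))
L[0, :6] = 1.0                          # variables 0-5 follow factor 1
L[1, 6:] = 1.0                          # variables 6-9 follow factor 2
X_raw = T @ L + 0.1 * rng.normal(size=(n, 10))
y_raw = T[:, 0] + T[:, 1] + 0.05 * rng.normal(size=n)

mu, sd = X_raw.mean(axis=0), X_raw.std(axis=0, ddof=1)
X = (X_raw - mu) / sd
y = (y_raw - y_raw.mean()) / y_raw.std(ddof=1)
R, P, q = fit_pls1(X, y, n_components=2)

# Simulate a gross entry error on variable 8 of an otherwise normal sample.
x_new = X_raw[0].copy()
x_new[8] += 10.0
contrib = residual_contributions((x_new - mu) / sd, R, P)
spe_new = float(np.sum(contrib ** 2))   # squared prediction error of the sample
worst = int(np.argmax(np.abs(contrib)))
print("largest residual contribution: variable", worst)
```

Plotting `contrib` as a bar chart against the variable index gives a plot of the same kind as Figure 92. In this sketch a single bar dominates, which is the signature described for point C.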
Figure 92 – Residual contribution: a) Observation A, b) Observation B and c) Observation
C of Figure 91
Due to changes in raw materials and process operating conditions, the model was no longer
adequate and needed to be recomputed. All the available observations were split into
training (2/3) and validation (1/3) sets. The fit and prediction statistics of this new
model are presented in Table 34.
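As a hedged illustration of the split and of the fit and prediction statistics, the sketch below divides synthetic observations into 2/3 training and 1/3 validation sets and computes an explained-variance percentage on each. Several choices are assumptions: the split is random (the text does not state how the observations were assigned), ordinary least squares is used as a plain stand-in for the PLS model, and the statistic is referenced to the training mean. The names `split_train_validation` and `explained_pct` are hypothetical.

```python
import numpy as np

def split_train_validation(n_obs, train_fraction=2 / 3, seed=0):
    """Return index arrays for a random training/validation split."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_obs)
    n_train = int(round(train_fraction * n_obs))
    return order[:n_train], order[n_train:]

def explained_pct(y_true, y_pred, y_ref_mean):
    """100 * (1 - residual sum of squares / total sum of squares)."""
    press = np.sum((y_true - y_pred) ** 2)
    total = np.sum((y_true - y_ref_mean) ** 2)
    return 100.0 * (1.0 - press / total)

# Synthetic data (assumption): 300 observations, 4 regressors, 1 response.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=300)

train_idx, valid_idx = split_train_validation(len(y))

# Stand-in regression (NOT PLS): least squares with an intercept.
A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
A_valid = np.column_stack([np.ones(len(valid_idx)), X[valid_idx]])
coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)

y_mean = y[train_idx].mean()
r2y = explained_pct(y[train_idx], A_train @ coef, y_mean)       # fit
q2y_pred = explained_pct(y[valid_idx], A_valid @ coef, y_mean)  # prediction
print(f"R2Y = {r2y:.1f} %, Q2Y Pred = {q2y_pred:.1f} %")
```

Replacing the least squares step with a PLS fit would give statistics of the kind reported as R2Y and Q2Y Pred. The cross-validated Q2Y would additionally require repeating the fit on subsets of the training set.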
[Figure 92: three bar-chart panels, a), b) and c); x-axis: Variables; y-axis: Residual
Contribution. Annotations on the panels: Coke and pitch properties, Fine feeder valve,
Particles size distribution (i, ii), Coke impurity.]
Table 34 – Performance statistics of the new PLS model in cross-validation and prediction of the validation set
After the computation of a new model including the anodes up to July 2014, the prediction
performance is acceptable and similar to that of the original model.
This is a good example of monitoring the performance of a PLS model over time. Being
able to detect changes in the correlation structure of the regressor (X) dataset is one of
the most powerful features of the PLS algorithm.
Finally, this model should be used in real time. In this case, that means every time a new
set of laboratory measurements is available. It can be used to monitor multiple aspects of
the process at the same time: for example, to verify whether a particular combination of
raw materials has been used previously, to check for gross manual entry errors in the
database, and to monitor the process operating conditions. All of these tasks can be
accomplished using only three plots: the model residuals, T2 and a contribution plot.
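The monitoring logic described above can be sketched as follows. This is a minimal illustration under stated assumptions: the scores and X residuals are taken as already computed by the model, the control limits are empirical 99th percentiles of the training values (not necessarily the limits used in this work), and the interpretation labels are a simplified reading of the residual and T2 plots. All function names are hypothetical.

```python
import numpy as np

def hotelling_t2(scores, score_var):
    """Hotelling T2: squared scores scaled by their training variances."""
    return np.sum(scores ** 2 / score_var, axis=-1)

def spe(residuals):
    """Squared prediction error: sum of squared X residuals."""
    return np.sum(residuals ** 2, axis=-1)

def classify(t2_value, spe_value, t2_limit, spe_limit):
    """Simplified interpretation of one new observation."""
    high_t2 = t2_value > t2_limit
    high_spe = spe_value > spe_limit
    if high_t2 and high_spe:
        return "gross outlier"
    if high_spe:
        return "correlation structure not seen in training data"
    if high_t2:
        return "unusual operating point within the model plane"
    return "in control"

# Synthetic training scores and residuals (assumption, for illustration).
rng = np.random.default_rng(2)
train_scores = rng.normal(size=(500, 2)) * np.array([2.0, 1.0])
train_resid = 0.1 * rng.normal(size=(500, 10))

score_var = train_scores.var(axis=0, ddof=1)
t2_limit = float(np.percentile(hotelling_t2(train_scores, score_var), 99))
spe_limit = float(np.percentile(spe(train_resid), 99))

# Three hypothetical new observations.
normal_t, normal_e = np.array([0.1, 0.1]), np.zeros(10)
spike_e = np.zeros(10)
spike_e[3] = 5.0                        # one variable far from the model
far_t = np.array([20.0, 0.0])           # extreme score

label_ok = classify(hotelling_t2(normal_t, score_var), spe(normal_e),
                    t2_limit, spe_limit)
label_spe = classify(hotelling_t2(normal_t, score_var), spe(spike_e),
                     t2_limit, spe_limit)
label_both = classify(hotelling_t2(far_t, score_var), spe(spike_e),
                      t2_limit, spe_limit)
print(label_ok, "|", label_spe, "|", label_both)
```

An observation flagged by either statistic would then be examined with its contribution plot to identify the variables responsible.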
Variable                 R2Y (%)    Q2Y CV (%)    Q2Y Pred (%)
GAD Green app dens       41,06      36,07         39,70
Green weight             37,94      31,02         40,18
Baked weight (mean)      73,60      71,99         73,45
Thermal cond             26,12      23,14         22,47
BAD Baked app dens       32,82      29,93         27,57
Real dens                42,47      40,27         44,88
Comp strength            22,66      18,50         20,24
Lc                       53,62      50,25         57,53
Youngs mod               26,25      22,85         22,05
Elect resis              37,16      33,57         44,66
Total                    48,12      35,76         39,27