
UNIVERSIDAD POLITÉCNICA DE MADRID

ESCUELA TÉCNICA SUPERIOR DE INGENIEROS DE TELECOMUNICACIÓN

TESIS DOCTORAL

CORRECTION OF THE COLOUR VARIATIONS UNDER UNCONTROLLED LIGHTING CONDITIONS

Juan Torres Arjona

Ingeniero de Telecomunicación

2014

Juan Torres Arjona: Correction of the Colour Variations under Uncontrolled Lighting Conditions, © 2014

Supervisor: Prof. José Manuel Menéndez García

DEPARTAMENTO DE SEÑALES, SISTEMAS Y RADIOCOMUNICACIONES

ESCUELA TÉCNICA SUPERIOR DE INGENIEROS DE TELECOMUNICACIÓN

CORRECTION OF THE COLOUR VARIATIONS UNDER UNCONTROLLED LIGHTING CONDITIONS

Juan Torres Arjona

Ingeniero de Telecomunicación

2014

Supervisor:

Prof. José Manuel Menéndez García


Department: Departamento de Señales, Sistemas y Radiocomunicaciones

Escuela Técnica Superior de Ingenieros de Telecomunicación

Universidad Politécnica de Madrid

PhD Thesis: CORRECTION OF THE COLOUR VARIATIONS UNDER UNCONTROLLED LIGHTING CONDITIONS

Author: Juan Torres Arjona

Ingeniero de Telecomunicación (UPM)

Supervisor: Prof. José Manuel Menéndez García

Doctor Ingeniero de Telecomunicación (UPM)

Year: 2014

Committee appointed by the Rector of Universidad Politécnica de Madrid on . . . . . . . . . . . . . . . . . . .

Committee: Prof. Guillermo Cisneros Pérez

Universidad Politécnica de Madrid

Prof. Eusebio Bernabéu Martínez

Universidad Complutense de Madrid

Prof. Luis Salgado Álvarez de Sotomayor

Universidad Politécnica de Madrid

Dr. Francisco Javier de la Portilla Muelas

Consejo Superior de Investigaciones Científicas

Dr. Plinio Moreno López

Universidade de Lisboa

Dr. Luis Magdalena Layos

European Centre for Soft Computing

Dr. Marcos Nieto Doncel

Vicomtech-IK4

After the defence of the PhD Thesis on . . . . . . . . . . . . . . . . . . . , at the Escuela Técnica Superior de Ingenieros de Telecomunicación, the committee agrees to grant the following grade:

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

PRESIDENT MEMBERS

SECRETARY


ABSTRACT

This thesis discusses the correction methods used to compensate for the variations of lighting conditions in colour image and video applications. These variations are such that Computer Vision algorithms that use colour features to describe objects mostly fail. Three research questions are formulated that define the framework of the thesis.

The first question addresses the similarities of the photometric behaviour between images of dissimilar adjacent surfaces. Based on the analysis of the image formation model in dynamic situations, this thesis proposes a model that predicts the colour variations of a region of an image from the variations of the surrounding regions. This proposed model is called the Quotient Relational Model of Regions. This model is valid when the light sources illuminate all of the surfaces included in the model; these surfaces are placed close to each other, have similar orientations, and are primarily Lambertian. Under certain circumstances, a linear combination is established between the photometric responses of the regions. No previous work proposing such a relational model was found in the scientific literature.

The second question examines whether those similarities could be used to correct the unknown photometric variations of an unknown region from the known adjacent regions. A method is proposed, called Linear Correction Mapping, which is capable of providing an affirmative answer under the circumstances previously characterised. A training stage is required to determine the parameters of the model. The method for single-camera scenarios is extended to cover non-overlapping multi-camera architectures. To this end, only several image samples of the same object acquired by all of the cameras are required. Furthermore, both the light variations and the changes in the camera exposure settings are covered by the correction mapping.

Every image correction method is unsuccessful when the image of the object to be corrected is overexposed or its signal-to-noise ratio is very low. Thus, the third question refers to the control of the acquisition process to obtain an optimal exposure under uncontrolled light conditions. A Camera Exposure Control method is proposed that is capable of holding a suitable exposure provided that the light variations can be collected within the dynamic range of the camera.

Each one of the proposed methods was evaluated individually. The methodology of the experiments consisted of first selecting some scenarios that cover the representative situations for which the methods are theoretically valid. Linear Correction Mapping was validated using three object re-identification applications (vehicles, faces and persons) based on the object colour distributions. Camera Exposure Control was tested in an outdoor parking scenario. In addition, several performance indicators were defined to objectively compare the results with other relevant state-of-the-art correction and auto-exposure methods. The results of the evaluation demonstrated that the proposed methods outperform the compared ones in most situations.

Based on the obtained results, the answers to the above-described research questions are affirmative in limited circumstances; that is, the hypotheses of the forecasting, the correction based on it, and the auto-exposure are feasible in the situations identified in the thesis, although they cannot be guaranteed in general. Furthermore, the presented work raises new questions and scientific challenges, which are highlighted as future research work.


RESUMEN

This thesis deals with correction methods that compensate for the variation of the lighting conditions in colour image and video applications. These variations often cause the failure of those Computer Vision algorithms that use colour features to describe objects. Three research questions are formulated that define the framework of this thesis.

The first question addresses the similarities between images of adjacent surfaces in terms of their photometric behaviour. Based on the analysis of the image formation model in dynamic situations, this thesis proposes a model capable of predicting the colour variations of a region of a given image from the variations of the neighbouring regions. This model is called the Quotient Relational Model of Regions. It is valid when the light sources illuminate all of the surfaces included in it; when these surfaces are close to each other and have similar orientations; and when they are mostly Lambertian. Under certain circumstances, the photometric response of a region can be related to the rest through a linear combination. No previous work proposing this kind of relational model could be found in the scientific literature.

The second question goes one step further and asks whether these similarities can be used to correct unknown photometric variations in an equally unknown region, from known adjacent regions. To this end, a method called Linear Correction Mapping is proposed, capable of giving an affirmative answer to this question under the circumstances characterised previously. A prior training stage is required to compute the parameters of the model. The method, which initially works for a single camera, is extended to work in architectures with several cameras without overlap between their fields of view. For this, only several image samples of the same object captured by all the cameras are needed. Moreover, this method takes into account both the lighting variations and the changes in the exposure settings of the cameras.

All image correction methods fail when the image of the object to be corrected is overexposed or when its signal-to-noise ratio is very low. Thus, the third question asks whether a control of the acquisition process can be established that obtains an optimal exposure when the lighting conditions are not controlled. Accordingly, a method called Camera Exposure Control is proposed, capable of maintaining a suitable exposure as long as the lighting variations can be collected within the dynamic range of the camera.

The proposed methods were evaluated individually. The methodology followed in the experiments consisted of, first, selecting some scenarios covering representative situations where the methods were theoretically valid. Linear Correction Mapping was validated in three object re-identification applications (vehicles, faces and persons) that used the colour distribution of the objects as features. Camera Exposure Control, in turn, was tested in an outdoor parking area. In addition, several indicators were defined that allowed the results of the proposed methods to be objectively compared with other relevant correction and auto-exposure methods referenced in the state of the art. The results of the evaluation demonstrated that the proposed methods improve upon the compared ones in most situations.

Based on the obtained results, it can be said that the answers to the posed research questions are affirmative, although under limited circumstances. This means that the hypotheses put forward regarding the prediction, the correction based on it, and the auto-exposure are feasible in the situations identified throughout the thesis but, nevertheless, cannot be guaranteed to hold in general. Furthermore, some new questions and scientific challenges arising from the work presented in this thesis are pointed out as future research work.


To my cousin Carlos

"Only he who knows is free, and freer still is he who knows more. Do not proclaim the freedom to fly; give wings instead."

"True science teaches, above all, to doubt and to be ignorant."

— Miguel de Unamuno (1864–1936)


It's time to fly up to the sky. Against the wind, rise on fire.

I see that everything wasn't wrong. I'm back again, nothing was lost.

Breathing new life, embrace my fate. Look around and believe again.

— Angelus Apatrida (Reborn)

AGRADECIMIENTOS

After a dark period, full of neglect and doubt, I have finally reached my own personal rebirth; my return to the light. But it has not been my doing alone, since in this life there are very few things one can do by oneself. That is why I would like to thank all the people who have been by my side during all this time. For almost ten years, my life has revolved practically around this Thesis, and until now I have met with more hardships than joys. That is why I cannot restrict myself to professional acknowledgements, because this work has been very personal and many are those who have lent me a hand, each in their own way. But you will have to forgive me, because there have been so many that it will be difficult to express my gratitude to all the people who deserve it.

First of all, I would not have got here if it were not for José Manuel. It has been many years now, since we began to see beyond the visible around the turn of the century. During this time, I have learned a great deal from you; you have helped me to grow professionally and you have made it possible for me to finish. Thank you!

I also have to thank the rest of the people with whom I started this adventure called GaTV: Fede, Carlos Alberto, David, Nuria, Iago and Usoa, who, even though you are no longer in the group, are still one of us to me. It has cost us a great deal to get this far, although not everyone appreciates it as they should. Together we learned what a research group is, and we have fought hard to keep it going. Thanks to that, I have had the good fortune to develop as a professional and to carry this Thesis forward.

I also cannot fail to be grateful to the people of the Intelligent Imaging group at TNO (The Netherlands), led by Ronald Kersten, where I spent my stay abroad and who treated me wonderfully. To a great extent they are responsible for my having finished the Thesis. Especially Klamer and Henri: you made sure that I rediscovered my enthusiasm for research. From you I learned once again to think critically. You also showed me how important it is to have fun with this work. Auke, even though we met not so long ago and belong to different worlds, we very quickly became good friends. You make things go smoothly; your ever-present smile is contagious and cheers everyone up. Stay as you are! I also have Ninghang in mind; my internship was very friendly thanks to you.

One of the things that drove me to pursue a doctorate was how much I enjoyed, and how satisfied I felt during, my final degree project. I finally saw that with the training I had received during my degree it was possible to do useful things, which was what I had studied for. But to a large extent it was also thanks to the people of the Instituto del Patrimonio Cultural de España (then Instituto del Patrimonio Histórico Español) with whom we collaborated: Marian, Miriam, Maca, Tomás and Araceli. This has led me to earn a good friendship with them. But it is with Araceli that I have kept the most special relationship. You tried to warn me how hard and poorly rewarded doing a doctorate was. In spite of that, you have always supported and helped me. You are a bit like a mother to me and I have learned a great deal thanks to you, things I have been able to apply to my work here, but also to my personal life.

I cannot forget Carol. Although our paths do not share the same destination, you have been a great travelling companion, who endured with patience all the difficulties that this adventure entailed.

My cousins from Albacete also deserve my gratitude: Gabri, Jorge, Paquito, Fran, Wences, . . . You have always trusted my abilities and, even though I have not yet landed a job at NASA, you respect me. I cannot forget Ruth; we have known each other for many years and, despite the distance, I have always felt your support, advice and understanding.

Life sometimes gives you pleasant and unexpected surprises. Ángel has been one of those rare cases. We connected from the start. Being bikers, sharing that blessed passion and way of life, was only an initial reason to get to know each other better. We quickly realised that we share many points of view on life and understand each other perfectly. Besides, you and Lorena have a heart too big to fit in the whole of Villanueva. Since we met, a good few years ago now, you have supported and encouraged me. You have been a shoulder to cry on and have given me wings in my lowest moments. And I do not know whether it is because we bikers are special people, but something similar has happened to me with Morgan. You know we still have to celebrate this with a GP and some good rides.

However far one goes, one cannot forget where one comes from. My origins are very humble and from the mountains, and that forges a certain character of which I feel proud. Fortunately, I still have many ties that bind me to them. I cannot forget my oldest friends, with whom I grew up and whom I know I have at my side when I need them: Miguel and Samu. I also have to thank Raúl for all these years together; he has always appreciated, supported and loved me. And of course, I owe a great deal to Elisa, a sister to me, who granted me one of the things I am most proud of: being godfather to her son Álvaro. Your support has always been unconditional and you have helped me get back on my feet many times. Thank you!

To my own people, Kike, José, Ainhoa, Marga, Lucía and Silvia, who have been there at every moment. You appreciate me just as I am, for better and for worse, and you knew that in the end I would finish this. Now all that remains is to share it with you, wherever you may be. Because you have not failed me and I miss every moment we have spent together. You cannot imagine how grateful I am. I have no words.

And the last acknowledgement is for my family: uncles and aunts, cousins, parents and sister. I owe you everything and I am who I am thanks to you. Dad, Mum, with time I have come to appreciate how valuable the education you gave me is. I also have to thank you for the values you passed on to me: you have always shown me that with work, honesty, humility and sacrifice one can achieve almost anything. Nina, although you are the little sister, you have always protected me and tried to keep me on the right path; I suppose because you have never trusted my good judgement much. But deep down you have always supported me, in each and every decision I have made, even when you did not understand them. And finally I have to ask your forgiveness. Unfairly, you are the ones who have suffered me the most during this time. Yet you have been able to put up with my mood swings and often my selfishness, and I know it has not been easy at all. I love you.

To all of you: I carry you in my heart and I will be eternally grateful.

Juan


CONTENTS

i dissertation 1

1 introduction 3

1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

1.3 Research questions and dissertation outline . . . . . . . . . . . 6

2 image formation models 9

2.1 Light reflectance models . . . . . . . . . . . . . . . . . . . . . . 10

2.2 The photometric perspective of a camera . . . . . . . . . . . . 12

2.2.1 Light control module . . . . . . . . . . . . . . . . . . . . 15

2.2.2 Sensor chip . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.2.3 Digital signal processor . . . . . . . . . . . . . . . . . . 18

2.2.4 Models used in radiometric calibration . . . . . . . . . 19

2.3 Colour understanding . . . . . . . . . . . . . . . . . . . . . . . 20

2.4 Dynamic model . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3 prediction of the intensity variation of a region 25

3.1 Quotients relational model of regions . . . . . . . . . . . . . . 25

3.1.1 Single region and single light source . . . . . . . . . . 28

3.1.2 Photometric independent regions . . . . . . . . . . . . 30

3.1.3 Intercorrelated regions . . . . . . . . . . . . . . . . . . . 30

3.2 Problem statement . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.3 Solution methods . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.3.1 Outliers management . . . . . . . . . . . . . . . . . . . 35

3.3.2 Multicollinearity . . . . . . . . . . . . . . . . . . . . . . 35

3.3.3 Positive regressors . . . . . . . . . . . . . . . . . . . . . 36

3.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

3.4.1 Statistics definition . . . . . . . . . . . . . . . . . . . . . 37

3.4.2 Evaluation strategy . . . . . . . . . . . . . . . . . . . . . 39

3.4.3 Terrace dataset . . . . . . . . . . . . . . . . . . . . . . . 40

3.4.4 MUCT dataset . . . . . . . . . . . . . . . . . . . . . . . 55

3.4.5 Parking dataset . . . . . . . . . . . . . . . . . . . . . . . 56

3.4.6 MCDL dataset . . . . . . . . . . . . . . . . . . . . . . . . 72

3.4.7 Results discussion . . . . . . . . . . . . . . . . . . . . . 96

3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

4 photometric correction 99

4.1 State of the art . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

4.2 Proposed method in single cameras . . . . . . . . . . . . . . . 103

4.2.1 Scenes modelling and training mode . . . . . . . . . . 104

4.2.2 Runtime mode . . . . . . . . . . . . . . . . . . . . . . . 105

4.3 Proposed method in non-overlapping cameras . . . . . . . . . 106

4.3.1 Training mode . . . . . . . . . . . . . . . . . . . . . . . 106

4.3.2 Runtime mode . . . . . . . . . . . . . . . . . . . . . . . 107

4.4 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

4.4.1 Performance indicators . . . . . . . . . . . . . . . . . . 108

4.4.2 Evaluation strategy . . . . . . . . . . . . . . . . . . . . . 109

4.4.3 Implementation of the reference methods . . . . . . . . 111

4.4.4 Vehicle re-identification in outdoor parking . . . . . . 112

4.4.5 Face re-identification . . . . . . . . . . . . . . . . . . . . 114

4.4.6 People re-identification in corridors . . . . . . . . . . . 115


4.4.7 Results discussion . . . . . . . . . . . . . . . . . . . . . 123

4.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

5 exposure control during video acquisition 127

5.1 State of the art . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

5.1.1 The measurement of the light . . . . . . . . . . . . . . . 129

5.1.2 The processing of the indicators . . . . . . . . . . . . . 130

5.1.3 The actuation of the camera . . . . . . . . . . . . . . . . 131

5.2 Proposed method . . . . . . . . . . . . . . . . . . . . . . . . . . 133

5.2.1 Control variables . . . . . . . . . . . . . . . . . . . . . . 133

5.2.2 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 135

5.2.3 Actuation . . . . . . . . . . . . . . . . . . . . . . . . . . 139

5.3 Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141

5.3.1 Performance indicators . . . . . . . . . . . . . . . . . . 141

5.3.2 Evaluation strategy . . . . . . . . . . . . . . . . . . . . . 142

5.3.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143

5.3.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 145

5.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

6 conclusions 151

6.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

6.1.1 Dynamic Image Formation Model . . . . . . . . . . . . 151

6.1.2 Quotient Relational Model of Regions . . . . . . . . . . 152

6.1.3 Linear Correction Mapping . . . . . . . . . . . . . . . . 152

6.1.4 Camera Exposure Control . . . . . . . . . . . . . . . . . 153

6.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

ii appendices 157

a notation conventions 159

b changes in the albedo within a flat surface 163

c image datasets description 165

c.1 Terrace dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

c.2 Parking dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

c.3 MCDL dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168

c.4 MUCT dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

d extended results 173

d.1 Terrace dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

d.2 Parking dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

d.3 MCDL dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

bibliography 227

LIST OF FIGURES

Figure 1.1 The Machine output display from the TV series Person of Interest . . . 3
Figure 1.2 Example of pictures obtained under different photometric conditions . . . 4
Figure 2.1 Example of a HDR processing . . . 9
Figure 2.2 The IFM schema . . . 10
Figure 2.3 Reflection diagram over a Lambertian-specular surface . . . 11
Figure 2.4 Principle of the pinhole camera . . . 13
Figure 2.5 The camera pipeline . . . 14
Figure 2.6 The diagram of the modules of the Marlin colour camera . . . 15
Figure 2.7 The lens pipeline . . . 15
Figure 2.8 The sensor chip pipeline . . . 17
Figure 2.9 The DSP pipeline . . . 18
Figure 2.10 The colour processing unit pipeline . . . 21
Figure 2.11 The colour temperature chart . . . 22
Figure 2.12 A colour gamut example . . . 23
Figure 3.1 Image formation process with two regions . . . 26
Figure 3.2 Diagram of the albedo in two points of a flat surface . . . 26
Figure 3.3 CDF of the G_res,i/G_ii distribution . . . 29
Figure 3.4 The G_res,i/G_ii function . . . 31
Figure 3.5 Terrace dataset samples . . . 41
Figure 3.6 Distribution of qf vs qbi per band using RAW pictures (Terrace dataset) . . . 42
Figure 3.7 qf vs qf for RAW images of Terrace dataset. LS regression . . . 43
Figure 3.8 Residuals histogram per band of the RAW pictures of Terrace dataset . . . 46
Figure 3.9 Distribution of qf vs qbi per band using JPEG pictures (Terrace dataset) . . . 47
Figure 3.10 Residuals histogram per band of the JPEG pictures of Terrace dataset . . . 50
Figure 3.11 Distribution of qf vs qbi per band using the γ-JPEG pictures (Terrace dataset) . . . 51
Figure 3.12 qf vs qf for the γ-JPEG images of Terrace dataset . . . 52
Figure 3.13 Residuals histogram per band of the γ-JPEG pictures of Terrace dataset . . . 54
Figure 3.14 Face samples of MUCT database . . . 55
Figure 3.15 Regions of MUCT database . . . 55
Figure 3.16 Distribution of qf vs qb1 per band of MUCT dataset . . . 57
Figure 3.17 qf vs qf for MUCT dataset . . . 59
Figure 3.18 Residuals histogram of MUCT dataset . . . 60
Figure 3.19 Regions and locations of the Parking scene . . . 61
Figure 3.20 Distribution of qf vs qb1 per band of the location #1 of Parking dataset . . . 62
Figure 3.21 qf vs qf for location #1 of Parking dataset . . . 63
Figure 3.22 Residuals histogram of location #1 of Parking dataset. LSM . . . 66
Figure 3.23 Residuals histogram of location #1 of Parking dataset. LSM-R . . . 67
Figure 3.24 Distribution of qf vs qb1 per band of the location #2 of Parking dataset . . . 68
Figure 3.25 Distribution of qf vs qb1 per band of the location #3 of Parking dataset . . . 70
Figure 3.26 qf vs qf for location #3 of Parking dataset . . . 71
Figure 3.27 MCDL dataset samples. Scene #1 . . . 73
Figure 3.28 MCDL dataset samples. Scene #2 . . . 73
Figure 3.29 Regions and locations of MCDL dataset . . . 74
Figure 3.30 Distribution of qf vs qbi per band (Camera #1 - Location #1 in MCDL dataset) . . . 75
Figure 3.31 qf vs qf for camera #1 and location #1 of MCDL dataset . . . 77
Figure 3.30 qf vs qf for camera #1 and location #1 of MCDL dataset . . . 78
Figure 3.31 Residuals histogram per band of camera #1 and location #1 of MCDL dataset . . . 80
Figure 3.32 Distribution of qf vs qbi per band (Camera #1 - Location #2 in MCDL dataset) . . . 82
Figure 3.33 qf vs qf for camera #1 and location #2 of MCDL dataset . . . 83
Figure 3.34 Residuals histogram per band of camera #1 and location #3 of MCDL dataset . . . 86
Figure 3.35 Distribution of qf vs qbi per band (Camera #2 - Location #1 in MCDL dataset) . . . 88
Figure 3.36 qf vs qf for camera #2 and location #1 of MCDL dataset . . . 89
Figure 3.37 qf vs qf for camera #2 and location #2 of MCDL dataset . . . 91
Figure 3.38 Residuals histogram per band of camera #2 and location #2 of MCDL dataset . . . 93
Figure 3.39 Distribution of qf vs qbi per band (Camera #2 - Location #3 in MCDL dataset) . . . 95
Figure 4.1 Types of correction methods . . . 100
Figure 4.2 Indoor surveillance scene modelling . . . 105
Figure 4.3 Example of intraclass and interclass normalised Cumulative Histograms . . . 110
Figure 4.4 Diagram of the evaluation strategy . . . 111
Figure 4.5 ROC curve for Parking scene . . . 113
Figure 4.6 People samples of MUCT dataset . . . 114
Figure 4.7 ROC curve for MUCT dataset . . . 116
Figure 4.8 People samples in MCDL dataset . . . 117
Figure 4.9 Example of person correction in MCDL dataset . . . 118
Figure 4.10 Region setup used by Liu et al.'s algorithm . . . 118
Figure 4.11 Example of forced segmentation errors for the training . . . 119
Figure 4.12 MCDL dataset. ROC curve for camera #1 . . . 119
Figure 4.13 MCDL dataset. ROC curve for camera #2 . . . 121
Figure 4.14 MCDL dataset. ROC curve for both cameras . . . 123
Figure 5.1 Comparison between an old and a modern camera . . . 127
Figure 5.2 Robert Cornelius' self-portrait (1839) . . . 128
Figure 5.3 The auto-exposure algorithms workflow . . . 129
Figure 5.4 Variations of light metering patterns . . . 130
Figure 5.5 Inverted T pattern for light metering . . . 130
Figure 5.6 Programmed exposure modes in Photography . . . 132
Figure 5.7 Example of AE control algorithm . . . 132
Figure 5.8 Indicators in intensities distribution . . . 134
Figure 5.9 The CEC flowchart . . . 136
Figure 5.10 Estimate of the variation of the exposure function in B vs Ef curve . . . 138
Figure 5.11 The timeline of the CEC evaluation . . . 142
Figure 5.12 Visual comparison of AE methods . . . 144
Figure 5.13 Comparison of the evolution of the brightness and contrast on a sunny day with clouds alternation . . . 145
Figure 5.14 Comparison of the evolution of the WSL and BSL on a sunny day with clouds alternation . . . 147
Figure 5.15 Comparison of the evolution of the brightness and contrast during unstable weather conditions . . . 148
Figure 5.16 Comparison of the evolution of the WSL and BSL during unstable weather conditions . . . 149
Figure B.1 Diagram of the albedo in two points of a flat surface . . . 163
Figure C.1 Terrace dataset samples . . . 165
Figure C.2 Regions of the Terrace scene . . . 166
Figure C.3 Parking dataset samples . . . 167
Figure C.4 Regions and locations of the Parking scene . . . 167
Figure C.5 MCDL dataset samples. Scene #1 . . . 168
Figure C.6 MCDL dataset samples. Scene #2 . . . 168
Figure C.7 People samples in MCDL dataset . . . 170
Figure C.8 Regions and locations of MCDL dataset . . . 171
Figure C.9 Face samples of the MUCT database . . . 172
Figure C.10 Regions of the MUCT database . . . 172
Figure D.1 qf vs qf for RAW images of Terrace dataset . . . 174
Figure D.3 Residuals histogram per band of the RAW pictures of Terrace dataset (ext.) . . . 176
Figure D.4 qf vs qf for JPEG images of Terrace dataset . . . 177
Figure D.6 Residuals histogram per band of the JPEG pictures of the Terrace dataset (ext.) . . . 179
Figure D.7 qf vs qf for γ-JPEG images of Terrace dataset . . . 180
Figure D.8 Residuals histogram per band of the γ-JPEG pictures of Terrace dataset (ext.) . . . 181
Figure D.9 qf vs qf for location #2 of Parking dataset . . . 185
Figure D.10 Residuals histogram of location #2 of Parking dataset. LSM . . . 186
Figure D.11 Residuals histogram of location #2 of Parking dataset. LSM-R . . . 187
Figure D.12 Residuals histogram of location #3 of Parking dataset. LSM . . . 188
Figure D.13 Residuals histogram of location #3 of Parking dataset. LSM-R . . . 189
Figure D.14 Residuals histogram per band of camera #1 and location #1 of MCDL dataset (ext.) . . . 191
Figure D.15 qf vs qf for camera #1 and location #2 of MCDL dataset (ext.) . . . 192
Figure D.16 Residuals histogram per band of camera #1 and location #2 of MCDL dataset . . . 193
Figure D.18 Distribution of qf vs qbi per band (Camera #1 - Location #3 in MCDL dataset) . . . 195
Figure D.19 qf vs qf for camera #1 and location #3 of MCDL dataset . . . 196
Figure D.21 Residuals histogram per band of camera #1 and location #3 of MCDL dataset (ext.) . . . 198
Figure D.22 qf vs qf for camera #2 and location #1 of MCDL dataset (ext.) . . . 199
Figure D.23 Residuals histogram per band of camera #2 and location #1 of MCDL dataset . . . 200
Figure D.25 Distribution of qf vs qbi per band (Camera #2 - Location #2 in MCDL dataset) . . . 202
Figure D.26 qf vs qf for camera #2 and location #2 of MCDL dataset (ext.) . . . 203
Figure D.27 Residuals histogram per band of camera #2 and location #2 of MCDL dataset (ext.) . . . 204
Figure D.28 qf vs qf for camera #2 and location #3 of MCDL dataset . . . 205
Figure D.30 Residuals histogram per band of camera #2 and location #3 of MCDL dataset . . . 207

LIST OF TABLES

Table 3.1 Correlation between BG regions in Terrace dataset . . . 41
Table 3.2 Regressors estimates for RAW pictures of Terrace dataset . . . 43
Table 3.3 Regressions statistics for the RAW pictures of the Terrace scene. Red band . . . 44
Table 3.4 Regressors estimates for JPEG pictures of Terrace dataset . . . 48
Table 3.5 Regressions statistics for the JPEG pictures of the Terrace scene. Red band . . . 48
Table 3.6 Regressors estimates for γ-JPEG pictures of Terrace dataset . . . 49
Table 3.7 Regressions statistics for the γ-JPEG pictures of the Terrace scene. Red band . . . 53
Table 3.8 Mean and variance of the normalised MSE result of the LSM over γ . . . 53
Table 3.9 Regressors estimates of the MUCT dataset . . . 56
Table 3.10 Regressions statistics for MUCT dataset . . . 58
Table 3.11 Regressors estimates for location #1 of Parking dataset . . . 61
Table 3.12 Regressions statistics for location #1 of Parking scene . . . 64
Table 3.13 Regressions statistics for location #2 of Parking scene . . . 69
Table 3.14 Regressions statistics for location #3 of Parking scene . . . 72
Table 3.15 Correlation between BG regions in MCDL dataset (Camera #1 - Location #1). Green band . . . 76
Table 3.16 Regressors estimates for location #1 of camera #1 of MCDL dataset. Green band . . . 76
Table 3.17 Regressions statistics for location #1 of camera #1 of MCDL dataset. Green band . . . 79
Table 3.18 Correlation between BG regions in MCDL dataset (Camera #1 - Location #2). Green band . . . 81
Table 3.19 Regressions statistics for location #2 of camera #1 of MCDL dataset. Green band . . . 84
Table 3.20 Correlation between BG regions in MCDL dataset (Camera #1 - Location #3). Green band . . . 84
Table 3.21 Regressors estimates for location #3 of camera #1 of MCDL dataset. Green band . . . 85
Table 3.22 Regressions statistics for location #3 of camera #1 of MCDL dataset. Green band . . . 85
Table 3.23 Correlation between BG regions in MCDL dataset (Camera #2 - Location #1). Green band . . . 87
Table 3.24 Regressors estimates for location #1 of camera #2 of MCDL dataset. Green band . . . 87
Table 3.25 Regressions statistics for location #1 of camera #2 of MCDL dataset. Green band . . . 90
Table 3.26 Correlation between BG regions in MCDL dataset (Camera #2 - Location #2). Green band . . . 90
Table 3.27 Regressors estimates for location #2 of camera #2 of MCDL dataset. Green band . . . 92
Table 3.28 Regressions statistics for location #2 of camera #2 of MCDL dataset. Green band . . . 92
Table 3.29 Correlation between BG regions in MCDL dataset (Camera #2 - Location #3). Green band . . . 94
Table 3.30 Regressions statistics for location #3 of camera #2 of MCDL dataset. Green band . . . 94
Table 4.1 Error rate of correction in Parking scene . . . 112
Table 4.2 SaROC for Parking scene . . . 113
Table 4.3 Error rate of correction in MUCT dataset . . . 115
Table 4.4 SaROC for MUCT dataset . . . 115
Table 4.5 Error rate of cameras correction in MCDL dataset . . . 120
Table 4.6 SaROC for MCDL dataset . . . 122
Table 5.1 Tuning parameters for the CEC algorithm . . . 135
Table 5.2 The values for the CEC settings used in the evaluation . . . 143
Table 5.3 Mean and standard deviation of the control variables . . . 146
Table 5.4 Rate time of unsuitable exposure and average distance to maximum contrast . . . 146
Table A.1 Mathematical syntax . . . 160
Table A.2 Physical magnitudes . . . 161
Table A.3 Symbol definitions and models . . . 162
Table D.1 Regressions statistics for the RAW pictures of the Terrace scenario. Green band . . . 182
Table D.2 Regressions statistics for the RAW pictures of the Terrace scenario. Blue band . . . 182
Table D.3 Regressions statistics for the JPEG pictures of the Terrace scenario. Green band . . . 183
Table D.4 Regressions statistics for the JPEG pictures of the Terrace scenario. Blue band . . . 183
Table D.5 Regressions statistics for the γ-JPEG pictures of the Terrace scenario. Green band . . . 183
Table D.6 Regressions statistics for the γ-JPEG pictures of the Terrace scenario. Blue band . . . 184
Table D.7 Estimated regressors for location #2 of Parking dataset . . . 190
Table D.8 Estimated regressors for location #3 of Parking dataset . . . 190
Table D.9 Correlation between BG regions in MCDL dataset (Camera #1 - Location #1). Red band . . . 209
Table D.10 Correlation between BG regions in MCDL dataset (Camera #1 - Location #1). Blue band . . . 209
Table D.11 Estimated regressors for location #1 of camera #1 of MCDL dataset. Red band . . . 209
Table D.28 Correlation between BG regions in MCDL dataset (Camera #2 - Location #1). Red band . . . 209
Table D.30 Estimated regressors for location #1 of camera #2 of MCDL dataset. Red band . . . 209
Table D.12 Estimated regressors for location #1 of camera #1 of MCDL dataset. Blue band . . . 210
Table D.13 Regressions statistics for location #1 of camera #1 of MCDL dataset. Red band . . . 210
Table D.14 Regressions statistics for location #1 of camera #1 of MCDL dataset. Blue band . . . 211
Table D.15 Correlation between BG regions in MCDL dataset (Camera #1 - Location #2). Red band . . . 211
Table D.16 Correlation between BG regions in MCDL dataset (Camera #1 - Location #2). Blue band . . . 212
Table D.17 Estimated regressors for location #2 of camera #1 of MCDL dataset. Red band . . . 212
Table D.18 Regressors estimates for location #2 of camera #1 of MCDL dataset. Green band . . . 212
Table D.19 Estimated regressors for location #2 of camera #1 of MCDL dataset. Blue band . . . 213
Table D.20 Regressions statistics for location #2 of camera #1 of MCDL dataset. Red band . . . 213
Table D.21 Regressions statistics for location #2 of camera #1 of MCDL dataset. Blue band . . . 214
Table D.22 Correlation between BG regions in MCDL dataset (Camera #1 - Location #3). Red band . . . 214
Table D.23 Correlation between BG regions in MCDL dataset (Camera #1 - Location #3). Blue band . . . 214
Table D.24 Estimated regressors for location #3 of camera #1 of MCDL dataset. Red band . . . 215
Table D.25 Estimated regressors for location #3 of camera #1 of MCDL dataset. Blue band . . . 215
Table D.26 Regressions statistics for location #3 of camera #1 of MCDL dataset. Red band . . . 215
Table D.32 Regressions statistics for location #1 of camera #2 of MCDL dataset. Red band . . . 216
Table D.33 Regressions statistics for location #1 of camera #2 of MCDL dataset. Blue band . . . 217
Table D.34 Correlation between BG regions in MCDL dataset (Camera #2 - Location #2). Red band . . . 217
Table D.35 Correlation between BG regions in MCDL dataset (Camera #2 - Location #2). Blue band . . . 217
Table D.36 Estimated regressors for location #2 of camera #2 of MCDL dataset. Red band . . . 218
Table D.37 Estimated regressors for location #2 of camera #2 of MCDL dataset. Red band . . . 218
Table D.38 Regressions statistics for location #2 of camera #2 of MCDL dataset. Red band . . . 219
Table D.39 Regressions statistics for location #2 of camera #2 of MCDL dataset. Blue band . . . 220
Table D.40 Correlation between BG regions in MCDL dataset (Camera #2 - Location #3). Red band . . . 220
Table D.41 Correlation between BG regions in MCDL dataset (Camera #2 - Location #3). Blue band . . . 220
Table D.42 Estimated regressors for location #3 of camera #2 of MCDL dataset. Red band . . . 221
Table D.43 Regressors estimates for location #3 of camera #2 of MCDL dataset. Green band . . . 221
Table D.44 Estimated regressors for location #3 of camera #2 of MCDL dataset. Blue band . . . 221
Table D.45 Regressions statistics for location #3 of camera #2 of MCDL dataset. Red band . . . 222
Table D.46 Regressions statistics for location #3 of camera #2 of MCDL dataset. Blue band . . . 223

LIST OF ALGORITHMS

Algorithm 5.1 Camera Exposure Control . . . 137
Algorithm 5.2 Additional functions of Camera Exposure Control . . . 138
Algorithm 5.3 Exposure variation functions of Camera Exposure Control . . . 140


ACRONYMS

ADC Analog Digital Converter

AE Automatic Exposure

AVI Audio Video Interleave

AWB Automatic White Balance

BG background

BRDF Bidirectional Reflectance Distribution Function

BSL Black Saturation Level

BTF Brightness Transfer Function

CCD Charge-Coupled Device

CDF Cumulative Distribution Function

CEC Camera Exposure Control

CEF Conditional Expectation Function

CFA Colour Filter Array

CHD Cumulative Histogram Distribution

CIE Commission Internationale de l’Eclairage

CMF Colour Matching Function

CMOS Complementary Metal Oxide Semiconductor

CRF Camera Response Function

CRT Cathode Ray Tube

DIFM Dynamic Image Formation Model

DMT Diagonal-Matrix Transform

DSLR Digital Single Lens Reflex

DSP Digital Signal Processor

EMD Earth Mover’s Distance

EMoR Empirical Model of Response

EV Exposure Value

FG foreground

FoV Field of View

GigaE Gigabit Ethernet

HDR High Dynamic Range


HSV Hue, Saturation, Value

HVS Human Visual System

ICCM Inter Camera Correction Mapping

ICCR Inter Camera Colour Response

ICM Illumination Correction Mapping

IFM Image Formation Model

IIC Inverse–Intensity Chromaticity

JPEG Joint Photographic Experts Group

LCM Linear Correction Mapping

LS Least Squares

LSM-R LS RR Method

LuT Look-up Table

MCDL Multi–Camera Dynamic Light

MEC Minimal Error Criterion

MERC Mean value of the Explanatory regions and Residuals Covariance

MKV Matroska Multimedia Container

MPEG Moving Picture Experts Group

MR Mean value of the Residuals

MRC Mean value of the Residuals Covariance

MSE Mean Squared Error

NLS Non-Linear Least Squares

NR Noise range

OER Overexposure range

PLS Partial Least Squares

PCA Principal Components Analysis

QRMR Quotient Relational Model of Regions

RANSAC RANdom SAmple Consensus

RGB Red, Green and Blue

ROC Receiver Operator Characteristic

RoI Region of Interest

RR Robust Regression

RRF Radiometric Response Function

SaROC Surface above the Receiver Operator Characteristic


SIFT Scale-Invariant Features Transform

SLR Single Lens Reflex

SNR Signal Noise Ratio

SoA State of the Art

ST Saturation Tolerance

SVR Standard deviation of the Variances of the Residuals

USB Universal Serial Bus

WB White Balance

WSL White Saturation Level


Part I

DISSERTATION


1 INTRODUCTION

Computer Vision systems obtain information from images. These images are generated by cameras, which are theoretically able to provide information similar to that perceived by the human eye¹. While cameras have existed for centuries, it was not until the 1990s that Computer Vision algorithms arose, due to the rapid evolution of the capabilities of computers. Currently, the number of cameras is growing at a very high rate due to their lower cost and their powerful potential applications. Therefore, the possible uses of cameras are countless: surveillance; traffic and environment monitoring; entertainment and commercial information systems; and so on. As precursors of future technologies, TV series and movies highlight a large number of good examples of such future applications of cameras (Figure 1.1). Nevertheless, science fiction is actually far from reality in this field.

Figure 1.1: The Machine output display from the TV series Person of Interest. The Machine is a mass surveillance computer system capable of extracting and inferring any type of information from any surveillance device (based on Pedia of Interest²).

On the one hand, Computer Vision systems are not usually linked to each other; therefore, the information required to enable a holistic approach is sometimes missing. Although camera networks are broadly deployed around the world for a diverse set of purposes [99] and all types of images and videos are shared on the Internet, access to the sources of content is usually limited. Nevertheless, the problem is likely not scientific [48], but commercial, social, or ethical in nature. Commercial interests and privacy issues set boundaries that must be respected. On the other hand, artificial intelligence algorithms intend to infer information in the manner that human beings do; nevertheless, such algorithms still require a large amount of development [33].

The great diversity of possibilities when acquiring an image of objects generates dissimilarities in their appearance. Such dissimilarities involve the camera, the scenario, poses, occlusions, illumination, and so on. This thesis addresses the problem of the variations that are produced in the images of the same objects caused by differences in the photometric³ conditions.

¹ Considering only cameras operating on the visible range.
² Pedia of Interest, Machine Point of View photos (consulted 08/2014): http://personofinterest.wikia.com/ Copyright of the original photographer or company, used under Fair Use as covered by Wikia's licensing.

1.1 motivation

Figure 1.2: Example of pictures obtained under different photometric conditions by modifying the camera exposure settings. (a) Image taken using the correct exposure and colour. (b) Image taken with a change in the illuminant of the scene, which produces colour variations. (c) and (d) show a change in the exposure; the effect is similar to a decrease and an increase, respectively, in the intensity of the light source.

The appearance of an object within an image is determined by the way that the camera captures the interaction between light and the object. Changes in the photometric conditions affect the appearance of the object (Figure 1.2). This change in appearance is aligned with the observation of Finlayson et al. [30]: "in [ . . . ] tasks, such as object recognition, to digital photography, it is important that the colours recorded by a device are constant across a change in the scene illumination. As an illustration, consider using the colour of an object as a cue in a recognition task. Clearly, if such an approach is to be successful, then this colour must be stable across illumination change. Of course it is not stable, since changing the colour of the illumination changes the colour of the light reflected from an object." Thus, the photometric conditions influence the first stage of any Computer Vision system: the acquisition stage. If the appearance of the same object in multiple images is very different and this difference is not properly corrected, then the performance of the remaining stages (such as segmentation, classification, recognition, etc.) cannot be guaranteed.

³ Photometry is the science of the measurement of the light visible to the Human Visual System (HVS). Although some cameras capture information from non-visible wavelengths, we only account for images captured in the visible spectrum. Nevertheless, throughout this thesis photometric as well as radiometric magnitudes are used. For more information regarding the differences between Radiometry and Photometry see [53, Chapter 2].


The possible photometric variations are due to changes in:

• The light sources: their intensity, number and type of illuminants.

• The position, orientation and pose of the objects, camera and light sources.

• The shape of the objects.

• The photometric response of the camera: exposure settings or the camera properties.

Thus, maintaining the same appearance of objects in several acquisitions is quite difficult to achieve.

Two types of approaches are used to correct the effects of photometric variations: i) absolute methods, and ii) relative methods.

Absolute methods do not require any reference from previously captured images. They base the correction on the image itself and on some information about the scenario. Nevertheless, the effectiveness of such methods is often limited because it is difficult to define a general method that functions in any situation. In addition, absolute methods are sometimes based on calibration stages in which some knowledge of the illuminants is required; otherwise, the calibration is difficult to implement in a real environment.

Relative methods use a suitable photometric condition as a reference, which is established as a target. Next, the variation of the image related to this reference is estimated and then properly corrected. The complexity of these methods comes from the definition of the suitable reference and the measurement of the variation.
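As a concrete illustration of the relative family, the following sketch estimates one gain per colour band between a reference image and a new image of the same static scene, and applies it as a diagonal (von Kries-style) correction. This is a minimal sketch, not the method proposed in this thesis; the function names and the use of a plain channel-mean ratio are assumptions made for the example.

import numpy as np

def diagonal_gain(reference, current):
    """Estimate one multiplicative gain per colour band (von Kries-style).

    Both inputs are float arrays of shape (H, W, 3) picturing the same
    static scene under two photometric conditions.
    """
    eps = 1e-6  # guard against division by zero on dark bands
    ref_means = reference.reshape(-1, 3).mean(axis=0)
    cur_means = current.reshape(-1, 3).mean(axis=0)
    return ref_means / (cur_means + eps)

def correct(image, gain):
    """Apply the per-band gain and clip back to the valid intensity range."""
    return np.clip(image * gain, 0.0, 1.0)

Applied to Figure 1.2, such a mapping would scale each band of (c) or (d) so that their channel means match those of (a); its obvious limitation is that a single diagonal gain cannot describe spatially varying or strongly non-linear photometric changes.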

1.2 objectives

The objectives of this thesis are described in the title of this dissertation: Correction of the Colour Variations under Uncontrolled Lighting Conditions. The main goal of this thesis is to study and propose image correction methods. Each correction method consists of replacing an undesirable photometric behaviour with a suitable one. Thus, we seek photometrically invariant responses of the captured objects. These methods address colour variations, which involve changes in the spectral response of the lights and the surfaces that produce variations in the pixel intensities. Furthermore, these methods must operate under uncontrolled lighting conditions. In the scope of this thesis, uncontrolled lighting conditions mean that several issues related to the lighting of the scene are unknown. More specifically, no previous knowledge of the type of light sources and their intensity is expected. The same lack of previous knowledge applies to the colour distributions of the surfaces and their type. Furthermore, the position and orientation of the elements of the scene are also unknown. The exceptions are the non-moving regions of the Field of View (FoV) of the camera, which are analysed to tune up the methods. As a result, the correction applies to scenes taken by static cameras, or at least by cameras that keep the same FoV.

Both the absolute and the relative correction methods are considered. In all cases, simple calibration stages and fast, low-cost computational algorithms are achieved.


1.3 research questions and dissertation outline

To accomplish the stated objectives, two main problems arise:

1. The variation of the photometric responses of the objects cannot be computed because the responses that should be corrected are unknown.

2. The desired information about the objects could be irrecoverable from their image intensity if the camera sensor receives either too little or too much light.

Fortunately, during the acquisition process of static cameras, a large region of the captured image most likely does not change; therefore, its intensity variation can be calculated. The existence of this region depends mainly on the scenario and the application. For example, in surveillance applications, in which a large area of coverage of the cameras is usually required, large zones of the FoV where nothing moves are likely to be found. In addition, it seems reasonable to think that, under certain circumstances, the light variation of objects illuminated by the same light sources is similar. Therefore, the non-moving regions can be used to estimate the photometric variations of the unknown target objects. Furthermore, current cameras often have programming interfaces that allow for control of the acquisition process. Thus, these interfaces can be used to avoid undesirable captured images.

The previous reasoning raises the following research questions:

a) Can we forecast the changes in the intensity of a region using the variations of the surrounding regions?

b) Can we use the previous forecasting results to properly correct the photometric variations of unknown objects in a camera? Can we extend the correction to a multi-camera architecture?

c) Can we ensure a well-exposed image capture by controlling its acquisition process? If so, under which conditions?

To the best of our knowledge, the first two issues have not been proposed before. Although the third issue has been tackled by other authors, who proposed automatic methods that control the camera exposure and the colour management, it remains an unsolved question.

The structure of this dissertation is established to answer the previous questions. First, Chapter 2 defines an Image Formation Model (IFM), which is used in the remainder of the dissertation. To achieve this definition, a complete study of the image acquisition pipeline is performed.

The next three chapters answer the three research questions. Each chapter starts with a problem statement. A review of the State of the Art (SoA) methods related to this problem follows. Going beyond the current work, we establish a proposal that addresses the problem. This proposal is later evaluated and, based on the generated results, analysed.

Chapter 3 addresses the photometric relationships between adjacent regions and proposes a method that forecasts the variation of the intensity of a region using the adjacent ones. This method works under certain circumstances; the boundaries that are used are also provided.

In Chapter 4, we propose an algorithm based on the previous analysis that is capable of correcting the photometric variations in unknown objects in a non-overlapping multi-camera architecture.


Chapter 5 tackles the photometric correction from a different perspective. The algorithms analysed in this chapter control the acquisition process to hold an optimal exposure, avoiding under- and over-exposed captures.

Finally, the last chapter (Chapter 6) establishes some conclusions that provide a reasoned response to the research questions while identifying the scientific contributions of this dissertation. In addition, this chapter identifies the future work that would extend these contributions; the research lines identified in this thesis are outside the scope of the original objectives.


2 IMAGE FORMATION MODELS

(a) Darker image. (b) Lighter image. (c) Human-like perception.

Figure 2.1: The human eye and the cameras observe the world differently. Camera sensors are more limited as regards photometric sensitivity. (a) and (b) are images of the same scene obtained under different exposure settings. (c) is an image similar to the image perceived by a human. Under-exposed pixels in (a) and over-exposed pixels in (b) are correctly exposed in (c) using an image processing technique called HDR.

The images that humans perceive as a source of information about the world depend on how the light interacts with it. Cameras are devices that capture these interactions and transform the incoming light into electronic signals. These signals, accordingly processed and presented, represent a limited version of what the eyes can see1. This limitation reduces the suitable photometric conditions under which a picture is well-exposed, although an eye can correctly perceive the scene (Figure 2.1). Computer Vision algorithms are also less robust under light variations than the human brain. For those algorithms that use several samples of the same object (tracking, 3D reconstruction, recognition, re-identification, mosaicing, etc.), light changes are still a problem to be solved.

Before designing the Computer Vision algorithms that handle light variations, it is important to study the image formation process (Figure 2.2). To form a picture of a scene from the visible spectrum, at least one light source is required2. The emitted light hits the objects and, depending on the properties of their surfaces, is reflected, refracted or scattered, undergoing different alterations. The produced alterations are determined by the surfaces and the scene geometry. These interactions are described via multiple models, usually called light reflectance models. They are introduced in Section 2.1.

1 For our purpose, this limitation is mainly related to the dynamic range. In the HVS, the dynamic range is several orders of magnitude greater than in any digital camera. Nevertheless, in aspects like sensitivity to spectral bands, digital cameras provide information that the HVS cannot perceive [105, 109]. Using the camera obscura fundamentals, cameras are even able to extract information not directly seen in the scene [103].

2 Except for some kinds of fluorescent surfaces that are not addressed in this thesis.


Figure 2.2: The image formation process. The radiation coming from light source S is reflected by the object O, creating an image of it that is captured by the camera C, which forms a digital image B.

When the light of a scene reaches a digital camera, it goes through a lens and is transformed into an electronic signal in the camera sensor. The camera implements several processing algorithms before the final digital image is formed. Section 2.2 addresses this process. Knowing it enables us to understand how the light variations affect the intensity values of the image.

Colour is a visual perception that deserves special attention. As with any perception, colour is subject to multiple interpretations. Indeed, a part of the visual Psychophysics science [26] is in charge of studying the relations between the physical measurements of the stimulus and the sensations that they produce. A colour image includes more information than a grey-scale image. The processing of this extra information and its subjective nature increase the complexity of the IFMs. Section 2.3 provides some insight on colour understanding.

Considering the goal of the correction of the light variations, in Section 2.4 we define a dynamic IFM that accounts for the variations produced in the digital image when the light conditions change.

2.1 light reflectance models

The knowledge of the illumination and the design of the lighting is determinant for the success of any image capture process. In this thesis, the relevant issue is that any light source emits an electromagnetic radiation composed of photons that interact with the matter. This radiation is defined by its irradiance E [53, Section 2.3], the power of electromagnetic radiation received per unit area. The magnitude of the irradiance generally varies with wavelength. When the irradiance hits a surface, another radiation is emitted into a given solid angle. This radiation is called radiance L. The magnitude of the radiance generally varies with wavelength and depends on the irradiance and the surface properties. The function that quantitatively describes the ratio between the incident irradiance and the reflected radiance of a surface is called the Bidirectional Reflectance Distribution Function (BRDF). The BRDF is a function of the incident and the reflected directions.

There are three types of surfaces depending on the types of reflections of the light:

1. Lambertian: reflects the light uniformly in all directions. The outgoing light is called diffuse or scattered.

2. Specular: reflects the light in one direction following the laws of reflection.

3. Fluorescent: emits photons by an external radiation that excites its ions, molecules, or atoms.

Most real-world surfaces are a combination of the Lambertian and specular types. Fluorescent surfaces are not common and are not addressed in this thesis. Refracted light is also produced when the light goes through the surface. This effect is typical of changes of the transmission medium and is not addressed in this thesis either.

Figure 2.3: Radiation components and angles in the reflection phenomenon over a combination of Lambertian and specular surface.

Lambertian surfaces take their name from Lambert [62], who formulated Lambert's cosine law in 1760. The law states that the outgoing light from a diffuse surface is directly proportional to the irradiance Es at the input (Figure 2.2) and the cosine of the angle θd (also called albedo or foreshortening factor) between the surface normal no and the light source direction ns (Figure 2.3). This law may be expressed in terms of the radiance Ld as:

Ld = Es ρo cos θd = Es ρo (ns · no)    (2.1)

where ρo is the BRDF, constant for all directions for these surfaces. The BRDF is also called the surface reflection coefficient or photometric response. In 1975, Phong [82] introduced an additive diffuse term to this equation, caused by the ambient light created by the inter-reflections. This term is called Phong's shading. Nevertheless, unless the irradiance is low, the value of this term is often negligible.

The knowledge about specular reflections is even older than that about diffuse reflections. Hero of Alexandria (AD 10–70) [44] set the basis of the laws of reflection. This type of reflection is also called highlight. There are several models for the specular reflection; Zhang et al. [121] provided a survey of the most representative ones. Phong's model is one of the most popular in the literature. This model represents the specular reflection as powers of the cosine of the angle θe between the viewing direction nv and the specular direction ne (Figure 2.3):

Le = Es ρo (cos θe)^kep    (2.2)

where θe = cos⁻¹(nv · ne). The specular direction can be expressed as ne = 2 (no · ns) no − ns. As regards kep, more specular surfaces yield larger exponents.

Torrance and Sparrow [104] (1967) presented a more accurate model based on the idea of surfaces composed of randomly distributed mirrors, which can be modelled via a Gaussian function:

Le = Es ρo e^(−(ket θe)²)    (2.3)

where θe is the same as in Equation 2.2 and ket has a similar interpretation to kep.

For common surfaces, composed of both diffuse and specular properties, the combined radiance is the weighted sum of both components:

L = Kd Ld + (1 − Kd) Le    (2.4)

with 0 ⩽ Kd ⩽ 1 depending on the surface. Considering Equation 2.3, Equation 2.4 yields:

L = Es ρo (Kd cos θd + (1 − Kd) e^(−(ket θe)²))    (2.5)
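As an illustration of Equation 2.5, the following minimal Python sketch evaluates this combined reflectance model for a hypothetical surface; the values of Es, ρo, Kd and ket are arbitrary and only serve to show how the diffuse and specular terms are mixed.

```python
import numpy as np

def combined_radiance(E_s, rho_o, K_d, k_et, theta_d, theta_e):
    """Combined Lambertian + specular radiance (Equation 2.5).

    theta_d: angle between surface normal and light direction (albedo term).
    theta_e: angle between viewing and specular directions.
    """
    diffuse = np.cos(theta_d)                  # Lambert's cosine law (Eq. 2.1)
    specular = np.exp(-(k_et * theta_e) ** 2)  # Torrance-Sparrow lobe (Eq. 2.3)
    return E_s * rho_o * (K_d * diffuse + (1 - K_d) * specular)

# Hypothetical values: a mostly diffuse surface (K_d = 0.8)
L = combined_radiance(E_s=100.0, rho_o=0.5, K_d=0.8,
                      k_et=5.0, theta_d=np.pi / 6, theta_e=np.pi / 20)
print(L)
```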

In Computer Vision, the specular reflections are undesirable because they often produce locally over-exposed pixels in the camera sensor even though most of the pixels are well-exposed. As Section 2.2 explains, these reflectance models become invalid for over-exposed pixels.

In 1998, Wolff et al. [117] added other types of reflections produced in rough specular surfaces. They thereby improved the classic reflectance models by separating the diffuse models for rough and smooth surfaces. The difference between them is basically that the power of the rough surface reflection is not equally distributed in every direction; rather, it can be seen as a lobe oriented in the specular direction, having less magnitude than the specular component.

Other approaches use these models for rendering purposes, e. g., via multiplexing the components of several light sources [91]. In Section 2.4, we use similar concepts to collect several light contributions at the same time.

The study of the lighting and the reflectance models is mostly important in Computer Graphics. However, this introduction is sufficient for this thesis because we are more interested in the light that reaches a camera than in how the interactions between the light and the surfaces can be simulated. Further reading and discussion can be found in [53, Chapter 2] and [98, Chapter 2].

2.2 the photometric perspective of a camera

The camera fundamentals have been known since the ancient Chinese [77] and Greeks [12]. The pinhole camera (or camera obscura) that they used is still a reference model in Computer Vision nowadays. The pinhole camera model assumes that the light rays come through a small aperture and are projected on a screen, forming an inverted image of the outside (Figure 2.4).


Figure 2.4: Principle of the pinhole camera. The light from the real world comes through a small hole and forms an inverted image (based on Wikipedia3).

The human eye follows the same operation as the pinhole camera. Stockham [97] (1972) studied this parallelism between the representation of images within a camera and the HVS. More realistic models account for the light tracing more complex paths inside the lens. These models are mainly relevant in Photogrammetry, which requires reliable measurements of the distances within the images. Because our interest is only in the light intensities, we use the simpler pinhole camera principle.

Until Computer Vision arose, cameras and image processing techniques were designed to get suitable representations of the real world for the HVS. The applications capable of extracting information from the images, beyond the visual perception, may change the suitable way that the images are formed and processed. The evolution of the surveillance cameras is an example of this change. At the beginning, these cameras were used for monitoring remote locations under the supervision of human beings. Thus, when a surveillance camera network was designed, the visual perception of the highest number of locations was often the priority. As a consequence, these cameras implemented compression algorithms, which degraded the quality of the images but increased the number of monitored locations and minimised the bandwidth requirements. These cameras also had low-cost sensors that provided understandable but noisy images and whose dynamic range was very limited. When Computer Vision advancements enabled the extraction of valuable information for surveillance tasks, such as the presence of intruders or the detection of abnormal behaviours, these cameras became inappropriate. Thus, better sensors and low-rate compression algorithms were built into the cameras. This change implies that the photometric requirements, and also the adopted IFMs, depend on the camera and the image processing technique. In this thesis, we make assumptions that are not valid for every situation. Thus, we clarify the validity of each established assumption.

Depending on their functionality and usability, several types of cameras can be identified. Besides the mentioned surveillance cameras, there are: professional digital cameras for Computer Vision, TV cameras, Digital Single Lens Reflex (DSLR) cameras, compact cameras, camcorders, webcams, and so on. Further reading can be found in [52, Chapter 8].

3 Wikipedia, Pinhole camera (consulted 01/2014): http://en.wikipedia.org/wiki/Pinhole_camera


Cameras provide images or videos; nowadays, most cameras provide both. In this thesis, a video is a set of temporally consecutive images.

The photometric camera response is commonly known as the Camera Response Function (CRF). Radiometric calibration is the Computer Vision area whose goal is to estimate the CRF (also known as the Radiometric Response Function (RRF)) that maps the irradiance at the camera input to pixel intensities. We revisit the models that are used by the techniques that compute the CRF in Section 2.2.4. The techniques themselves are analysed in Section 4.1. Before that, we explain the architecture of a camera and the physical phenomena produced inside.

Figure 2.5: The camera pipeline.

In Figure 2.5, we depict a diagram of the general camera pipeline based on the models developed by Healey and Kondepudy [43], Jacobson et al. [52], Tsin et al. [111] and Szeliski [98]. Not every camera has all the modules, nor are the modules always presented in this order, but every possible module usually fits this model. The model is composed of the following parts:

light control. The part composed of the lens and the exposure control mechanism (Section 2.2.1).

sensor chip. In charge of collecting the photons and transforming them into a digital signal (Section 2.2.2).

digital signal processor (dsp). It adjusts the digital signal to the desired image in terms of colour, quality, and so on (Section 2.2.3).

In Figure 2.6, the camera pipeline of a commercial camera for Computer Vision (model Marlin AVT) is shown as a real example. The diagram practically includes every module of Figure 2.5 except the lens, which is a separate component, and the compression module, which does not exist in this type of camera. The shading correction module corrects the fixed pattern noise (Section 2.2.2). Modules 4, 11–13 correspond to colour operations (Section 2.3). The diagram also includes some modules that are particular implementations (modules 7–10). For a complete explanation of each module, see [1].


Figure 2.6: The diagram of the modules of the Marlin colour camera (source [1]).

2.2.1 Light control module

Figure 2.7: The lens pipeline. GD: Geometric Distortion; VIG: Vignetting.

The first element is the lens. This element guides the radiance from the real world Lo to the photoplane. The radiance is attenuated by a factor ηlens (the optical transmittance). Geometric distortion is introduced in the lens due to physical effects and manufacturing imperfections [106]. Because the geometric distortion does not produce any photometric aberration, it is not addressed in this thesis. On the contrary, the photometric behaviour of the camera is changed by chromatic aberrations (Section 2.3) and vignetting effects. Vignetting consists in the radiance falling off at the edges of the sensor. Besides the lens, other causes also produce vignetting, but they are not addressed in this thesis. The optical transmittance is therefore considered constant over the whole sensor. For further information on these causes and on techniques to correct the vignetting, see [98, Chapters 2 & 10] and [55, 10, 38, 56].

The incoming light is managed via the exposure control. Basically, the exposure control consists in opening and closing the iris diaphragm (or aperture stop) during the exposure time T. This control has a similar function to the iris in the human eye and is usually adjustable to several aperture sizes. The aperture is usually defined via the f-number or relative aperture N, which is expressed as:

N = f / D    (2.6)

f focal length
D aperture diameter
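As a worked example of Equation 2.6, a hypothetical lens with focal length f = 50 mm and aperture diameter D = 25 mm gives N = 50/25 = 2, usually written f/2.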

2.2.2 Sensor chip

Figure 2.8 shows the sensor chip pipeline. The irradiance at the central point of the sensor is provided by integrating the radiance from all angles within the solid angle Ωc subtended by the lens aperture:

Esensor = ηlens ∫Ωc Lo(Ωc) cos θc dΩc    (2.7)

θc angle with respect to the sensor plane normal nc (Figure 2.2)

If the vignetting is neglected, the irradiance expression is equal for the whole sensor. For Lambertian surfaces, Lo(Ωc) does not depend on the solid angle. For specular surfaces, the irradiance depends on the specular ne and viewing nv directions (Equations 2.2–2.3); regardless, this term is a constant contribution to the irradiance. In both cases, Equation 2.7 can be rewritten as [53, Chapter 4]:

Esensor = ηlens π Lo / (1 + N²)    (2.8)

Assuming a pinhole camera and no vignetting, there is therefore a linear relationship between the radiance of a surface and the irradiance at the camera sensor.

The sensor of present cameras is composed of a photosensor array, which provides an electronic signal as a function of the number of photons collected. The photosensor consists of light-sensitive elements called pixels, which are arranged in one or two dimensions4. Nowadays, the two main types of sensors are the Charge-Coupled Devices (CCDs) and the Complementary Metal Oxide Semiconductors (CMOSs). We do not go into the details of these technologies because the output signal has the same expression for both types.

4 Called linear or area sensors, respectively.


Figure 2.8: The sensor chip pipeline. FPN: Fixed Pattern Noise; DCN: Dark Current Noise; QN: Quantization Noise.

In Section 2.3, some dissimilarities in the colour sensors are described. For each pixel i, the number of generated electrons Ii is:

Ii = T Pd ∫λ Esensor,i(λ) Q(λ) dλ    (2.9)

T exposure time
Pd pixel size
Esensor,i(λ) spectral irradiance at sensor pixel i5
Q(λ) camera sensor sensitivity

Equation 2.9 says that the number of electrons in each pixel is the integration over wavelength of the irradiance arriving at the pixel, filtered by the camera sensor sensitivity. Q(λ) defines the response of the sensor transforming the energy of the incident light into electrons as a function of the wavelength. Due to imperfections in the manufacturing process, not every pixel has an equal response. This yields a fixed pattern noise that, due to its stationarity, is easy to characterise and thus eliminate. Then, assuming that vignetting is neglected, the value of Q(λ) is constant for all pixels.

In this model we assume that: i) the irradiance is homogeneous over each pixel area, and ii) the number of electrons in each pixel is independent of the remaining pixels. The first assumption is valid if the pixel size is sufficiently small, which is usually the case. The second assumption does not account for the blooming effect produced in older CCDs when the light intensity saturates the storage capacity of one pixel, overflowing to its neighbours. In this case, this would also violate the linear relationship between the number of collected electrons and the irradiance for the saturated pixel. Henceforth, we neglect these pixels and their neighbours.

Healey and Kondepudy [43], and Tsin et al. [111] modelled the influence of the noise sources in the image formation process:

Inoisy,i = Ii + Ndc + Ns    (2.10)

Ndc accounts for the dark current noise due to the random generation of electrons and holes in semiconductors. This noise is proportional to the exposure time T and also increases with the temperature. Ns is the shot noise due to the quantum nature of light. The shot noise follows a zero-mean Poisson distribution whose variance depends on Ii.

5 Conceptually, the irradiance and the spectral irradiance are similar, but the adjective involves a spectral distribution of the magnitude.


After the photosensor, there is a signal adaptation. This adaptation usually consists of an amplification (gain control g) plus an offset (bright control roff), which amounts to an affine transformation:

Sgain,i = g (Inoisy,i + Nr + roff)    (2.11)

The amplifier of this step generates read noise Nr, which is independent of Ii and has zero mean.

Up to this step, the generated signal is analogue. The Analog Digital Converter (ADC) converts the analogue signal to a digital signal by quantizing its magnitude. According to [43], under reasonable assumptions, this process can be modelled as the addition of a quantisation noise source Nq, which has a zero-mean uniform probability distribution. The higher the number of bits used to represent the digital image, the less influence this noise has. Considering that the signal is limited, the overload noise is neglected.

Finally, from Equations 2.9–2.11, the sensor chip module provides a digital image whose pixels have the value:

Schip,i = Sgain,i + Nq = g (T Pd ∫λ Esensor,i(λ) Q(λ) dλ + Ndc + Ns + Nr + roff) + Nq    (2.12)

In 2007, Withagen et al. [115], based on experimental evaluation, concluded that the contributions of the dark current, read noise and quantisation noise can be neglected for larger intensity values compared to the shot noise. Thus, Equation 2.12 can be simplified to:

Schip,i = g (T Pd ∫λ Esensor,i(λ) Q(λ) dλ + Ns + roff)    (2.13)
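A minimal Python sketch of Equation 2.13 follows, simulating the digital value of one pixel; all parameter values are hypothetical, and the shot noise is drawn from a Poisson distribution on the collected electrons, as described above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sensor_value(E_times_Q, T, P_d, g, r_off):
    """Simplified sensor chip output for one pixel (Equation 2.13).

    E_times_Q stands for the integral of the spectral irradiance filtered
    by the sensor sensitivity (electrons per second per unit area).
    """
    electrons = T * P_d * E_times_Q   # ideal collected electrons (Eq. 2.9)
    noisy = rng.poisson(electrons)    # shot noise: Poisson in the count
    return g * (noisy + r_off)        # gain and offset (Eq. 2.11)

# Hypothetical settings: doubling the exposure time roughly doubles the signal
print(sensor_value(E_times_Q=5000.0, T=0.01, P_d=1.0, g=0.1, r_off=2.0))
print(sensor_value(E_times_Q=5000.0, T=0.02, P_d=1.0, g=0.1, r_off=2.0))
```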

At this point, the camera has created a two-dimensional raw digital image Braw, which can be stored, transmitted or processed for improvement:

Braw(u, v) = {Schip,i}, ∀i ∈ Z : 1 ⩽ i ⩽ M    (2.14)

u, v spatial location in the digital image
M number of sensor pixels

2.2.3 Digital signal processor

Figure 2.9: The DSP pipeline.

Except in some specific cases, such as professional photographers who desire complete control of the digital processing using appropriate external software, the DSP processes the raw image internally. The processes implemented in this module are usually complex and depend on the manufacturer. Figure 2.9 shows the most common modules. For colour cameras, the first module is usually a colour processing unit. This module is described in detail in Section 2.3.

The next module is a general digital processing unit in charge of adapting the image to distinct requirements. For example, most cameras implement a gamma correction6. This correction consists of a power function γ originally used to adapt the image to the response of the old Cathodic Ray Tube (CRT) monitors (Equation 2.15). This non-linearity usually produces deviations with respect to the simpler linear IFMs. Again, perceptual and signal processing objectives are opposed. Fortunately, several techniques [27, 38, 50] detect and correct this non-linearity.

Bgamma(u, v) = Braw(u, v)^γ    (2.15)

Another example is the Look-up Table (LuT), an array that translates each input pixel value through any programmable transfer function. The LuT can therefore implement the correction of non-linearities rapidly because it does not perform input/output operations but an array indexing operation.
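As an illustration of Equation 2.15 and of the LuT mechanism just described, the following Python sketch precomputes an 8-bit gamma table and applies it by pure array indexing; the value γ = 1/2.2 is just the common choice mentioned later in Section 2.4.

```python
import numpy as np

gamma = 1 / 2.2
# Precomputed 8-bit look-up table for the power function of Equation 2.15.
lut = np.array([round(255 * (v / 255) ** gamma) for v in range(256)],
               dtype=np.uint8)

def apply_gamma(raw_image):
    """Apply the gamma correction via array indexing (no per-pixel arithmetic)."""
    return lut[raw_image]

# Usage with a synthetic 8-bit raw image
raw = np.random.default_rng(0).integers(0, 256, size=(4, 4), dtype=np.uint8)
print(apply_gamma(raw))
```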

The next module is the compression module, in charge of reducing the size of the images for file storage. There are multiple techniques, lossy and lossless. The most common compression formats are Joint Photographic Experts Group (JPEG) for images and the Moving Picture Experts Group (MPEG) family for videos [94]. As regards Computer Vision, the main disadvantage of these techniques is that lossy algorithms degrade the quality of the image and can invalidate the adopted IFM.

Finally, the image is stored in an internal memory or transmitted to an external device. The generated images are wrapped according to how they are going to be managed. For example, if the images should be provided as video files, they are encapsulated in a video format (Matroska Multimedia Container (MKV), Audio Video Interleave (AVI), and so on). The images may also be transmitted via a standard communication protocol (Camera Link, IEEE 1394, Gigabit Ethernet (GigaE), Universal Serial Bus (USB), and so on [24]).

2.2.4 Models used in radiometric calibration

In the previous sections, particular forms of the CRF are explained. The target problem of obtaining this function is formulated as:

B(u, v) = crf(Ecam(u, v))    (2.16)

Ecam = Esensor/ηlens irradiance at the camera input

Grossberg and Nayar [40] (2004) formalised the space of CRFs by determining the constraints that every function must fulfil. Thus, they formulated the following hypotheses.

Hypothesis 2.1. The CRF is the same for every pixel.

Hypothesis 2.2. The range of the pixel intensities is limited by two values.

Hypothesis 2.3. The CRF is monotonic.

6 Sometimes, this module is analogue and is integrated in the sensor chip.


Assuming these hypotheses, the irradiance values are then estimated via the inverse function of the CRF, Ψ:

Ecam(u, v) = crf⁻¹(B(u, v)) = Ψ(B(u, v))    (2.17)

Our interest is not to recover the irradiance but to establish photometric relations between objects from their images (Chapter 3); hence, we base our techniques on this knowledge.

Multiple types of functions model the CRF. The simplest model consists in considering it linear, assuming that only linear processing is performed in the camera pipeline. Although this may be enough for raw images and is assumed by multiple approaches (such as several shape-from-shading techniques [23]), it is not true in general, especially due to the DSP module. Before Grossberg and Nayar, the first approaches used gamma models (B = α + Ecam^γ) (Mann and Picard [73] (1995)) or non-parametric models (Debevec and Malik [20] (1997)). Mitsunaga and Nayar [76] (1999) realised that the response of several real cameras may be described as a high-order polynomial. Following these models, Tsin et al. [111] used a Taylor series to approximate the function. Based on their own statements, Grossberg and Nayar [40] defined an Empirical Model of Response (EMoR) based on Principal Components Analysis (PCA), which generalises the models of Mann and Picard and Debevec and Malik.

Most CRFs refer to monochromatic cameras, but they are often easy to extend to colour ones.
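To make the polynomial view of the CRF concrete, the following sketch fits a high-order polynomial to hypothetical (irradiance, intensity) samples with ordinary least squares; it only illustrates the kind of model Mitsunaga and Nayar propose, not their actual calibration algorithm, which estimates the coefficients from differently exposed images.

```python
import numpy as np

# Hypothetical calibration samples: normalised irradiance and the pixel
# intensities a camera produced for them (here synthesised with a gamma curve).
E = np.linspace(0.01, 1.0, 50)
B = E ** (1 / 2.2) + np.random.default_rng(0).normal(0, 0.005, E.size)

# Fit a 4th-order polynomial B = sum_n c_n * E^n as a stand-in for the CRF.
coeffs = np.polyfit(E, B, deg=4)
crf = np.poly1d(coeffs)

print(crf(0.5))  # predicted intensity for a mid-level irradiance
```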

2.3 colour understanding

Colour is a perception based on the spectral response to the visible light, its emission, and how it is reflected or refracted by the objects. This thesis concerns the effects of the colour variations related to the digital image formation. The goal is to understand how colour information can benefit Computer Vision techniques.

Nothing changes about the descriptions in Section 2.1: the radiance and the irradiance have a spectral distribution. The colour perception is based on how the camera captures and processes this distribution. The human eye is considered sensitive to wavelengths between 360nm and 830nm [93, Chapter 1], although this range is approximate, given that it is not equal for everybody. According to trichromacy, based on the three kinds of cones of the eye, "it is possible to produce a colour match for a given stimulus [ . . . ] by using only combinations of light from three light sources" [93, Section 1.4]. The complete visible spectrum can therefore be represented by tristimulus values using Colour Matching Functions (CMFs). Most colour systems use the primary colours Red, Green and Blue (RGB)7.

The lens introduces a chromatic aberration due to the index of refraction of the glasses, which depends on the wavelength [98, Section 2.2.3]. Nevertheless, this effect can be neglected for most modern lenses because manufacturers introduce mechanisms that reduce it. Before the sensor, the camera usually has a photopic filter that adapts the incoming light to the human eye response.

Multiple technologies can collect the colour information in the sensor. The most widespread is to use a single image sensor overlaid with a mosaic pattern of colours, called a Colour Filter Array (CFA).

7 Nevertheless, the responses of the three types of human cones are not correlated with the RGB bands.


Each pixel is sensitive to only one colour spectral band (RGB). Equation 2.9 is transformed into:

Ii,k = T Pd ∫λ Esensor,i(λ) Qk(λ) dλ    (2.18)

Ii,k number of electrons in pixel i of colour k
Qk(λ) camera sensor sensitivity of colour k

Figure 2.10: The colour processing unit pipeline.

This technology requires demosaicing algorithms, which interpolate the missing colour values using those of the adjacent pixels. This technique is implemented in the colour processing unit (Figure 2.10 and module 11 in Figure 2.6). The most important technology is based on the Bayer pattern [6], in which there is a predominance of green-sensitive pixels due to the greater responsivity of the cones at this wavelength. Novel techniques take advantage of spatial as much as spectral information [65]. In this line, Gao et al. [34] (2012) defined a general sparsity-based framework and solved the demosaicing issue using PCA.
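As an illustration of the interpolation step, a minimal bilinear demosaicing sketch for a hypothetical RGGB Bayer mosaic follows; it is a didactic baseline only, since real colour processing units use more elaborate, edge-aware interpolation.

```python
import numpy as np
from scipy.ndimage import convolve

def bilinear_demosaic(raw):
    """Bilinear demosaicing of an RGGB Bayer mosaic (H x W float array).

    Each output band is obtained by averaging the available samples of
    that colour around every pixel (normalised convolution).
    """
    h, w = raw.shape
    # Binary masks marking which sensor pixel carries which colour.
    r_mask = np.zeros((h, w)); r_mask[0::2, 0::2] = 1
    b_mask = np.zeros((h, w)); b_mask[1::2, 1::2] = 1
    g_mask = 1 - r_mask - b_mask

    # Kernels averaging the nearest same-colour neighbours.
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 4.0
    k_g = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]]) / 4.0

    r = convolve(raw * r_mask, k_rb)
    g = convolve(raw * g_mask, k_g)
    b = convolve(raw * b_mask, k_rb)
    return np.dstack([r, g, b])
```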

The next module of the colour processing unit is usually an edge sharpening algorithm to compensate for blurring effects produced by the lens, the filters and the CCD, and to provide a sharper image [93, Section 12.7.2]. From the photometric point of view, this module is irrelevant.

When the colour temperature of the light source is not white (around 5000K), white surfaces are not observed as having this colour. To address this deviation, a White Balance (WB) is required to adjust the RGB values to the right white reference. Otherwise, surfaces would appear blueish, yellowish or reddish (Figure 2.11). Although the colour balance may be performed via optic filters or the gain of the analogue amplifier, it is common to use a digital process. Because the white balance operation involves a linear transformation of the colour components, it may occur that the same surface seen under different WB settings (or different light sources) produces different, but linearly dependent, colour images. The colour constancy methods "estimate the chromaticity of the light source and then correct the image to a canonical illumination using the diagonal model" [36]. This diagonal model, introduced by Finlayson et al. [29] (1994) as a Diagonal-Matrix Transform (DMT) D, based on the von Kries model, consists of a linear transformation of image Bin into Bout for each of the colour bands:

Bout = D Bin ⇒ [Bout,r  Bout,g  Bout,b]ᵀ = diag(dr, dg, db) · [Bin,r  Bin,g  Bin,b]ᵀ    (2.19)
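A minimal sketch of Equation 2.19 in Python follows, assuming the diagonal gains are estimated with the grey-world assumption (the average scene colour is achromatic); colour constancy methods differ mainly in how these three gains are obtained.

```python
import numpy as np

def grey_world_white_balance(img):
    """Von Kries-style diagonal white balance (Equation 2.19).

    img is an H x W x 3 float RGB image in [0, 1]. The diagonal gains
    dr, dg, db are derived from the per-band averages (grey-world).
    """
    means = img.reshape(-1, 3).mean(axis=0)  # per-band averages
    d = means.mean() / means                 # diagonal gains dr, dg, db
    return np.clip(img * d, 0.0, 1.0)        # apply D and keep the range
```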

Apart from the RGB CMF, defined in 1931 by the Commission Internationale de l'Eclairage (CIE), multiple standards define other colour metrics using different primaries. The CMFs are also known as colour spaces. CIE XYZ is also an extended colour space that, unlike RGB, has no negative values. Sometimes, it is desirable to separate luminance from chrominance; other colour spaces, such as CIE LAB or Hue, Saturation, Value (HSV), have this ability. The election of the colour space depends on the objective of the application. These colour space transformations may be implemented in the colour processing unit.

Figure 2.11: The colour temperature chart. A white light source has a temperature of approximately 5000K. Greater temperatures yield blueish images. Lesser temperatures yield yellowish or reddish images (based on Fredrik Klingenberg's blog8).

Not every visible colour can be obtained, nor can every camera provide the same colours. The range of colours that can be represented is called the gamut (Figure 2.12). Thus, the gamut of a camera is always a subset of the visible gamut. Gamut mapping algorithms are also performed in this module.

The colour mapping algorithms within the camera usually perform complex non-linear transformations that do not fit the radiometric calibration models introduced in Section 2.2.4. This effect especially happens with the most saturated colours. Kim et al. [56] solved this by using the following camera model based on three transformations to be estimated:

B = g(Dwb Ts E)    (2.20)

8 fredrkl, Color theory continue, white balance (consulted 02/2014): http://www.fredrkl.com/blog/color-theory-continue-white-balance

9 Wikipedia, Color gamut (consulted 02/2014): http://en.wikipedia.org/wiki/Color_gamut


Figure 2.12: The CIE 1931 colour space chromaticity diagram comparing the visible gamut with sRGB's, derived from RGB's, and colour temperature (based on Wikipedia9).

B digital image
E irradiance at the camera input
Dwb white balance diagonal transformation matrix
Ts colour space transformation matrix
g(·) colour gamut function

Furthermore, Arandjelovic [2] found that the gamma value of the camera models may depend on the wavelength. In this case, three gammas should be accounted for.

Further reading on colour matters can be found in [93] and [26].

2.4 dynamic model

To accomplish the study of the influence of the light changes on the images, let us consider a dynamic scenario in which the light sources' positions and intensities, as well as the objects' positions, change along time. In this scenario, there are NS light sources having spectral irradiances Ei(λ). Using Equation 2.5, the spectral radiance Lo(λ) of a surface O is:

Lo(λ, t) = ρo(λ) ∑i=1^NS Ei(λ, t) (kd cos θd,i(t) + (1 − kd) e^(−(ket θe,i(t))²))    (2.21)

In Equation 2.21, the surface properties are assumed to remain unaltered along time. On the contrary, the light power as well as the geometric relations between object, light sources and camera (Section 2.1) may change. This is indicated via a temporal dependence t.

To neglect the ambient light, we assume that the intensity of the light sources is considerable.

Assuming that the photometric variations of the light sources are only due to the intensity of the radiation, we define a time-varying illumination gain E and a constant illumination term EC(λ), such that E(λ, t) = EC(λ) E(t). We also define a varying function G that depends on the surface properties and the geometric relations mentioned above:

G(t) = kd cos θd(t) + (1 − kd) e^(−(ket θe(t))²)    (2.22)

This yields:

Lo(λ, t) = ρo(λ) ∑i=1^NS ECi(λ) Ei(t) Gi(t)    (2.23)

Assuming there is neither cross-channel processing nor vignetting10, and that the sensor is spectrally sharpened [29], considering the whole sensor, Equation 2.18 simplifies to:

Ik = T Pd Esensor,k Qk    (2.24)

Using Equation 2.8 and Equation 2.13, we define a function H that models the internal parameters of the camera as:

Hk(t) = ηlens π gk(t) T(t) Pd / (1 + N(t)²)    (2.25)

In Equation 2.25, the gain term g is colour-band dependent, to account for the WB. Thus, using Equations 2.23–2.25, we write the image generated in the sensor chip as:

Schip,k(t) = Hk(t) ρo,k (∑i=1^NS ECi,k Ei(t) Go,i(t) + Ns + roff(t))    (2.26)

Three types of changes may occur:

1. Power of emitted light; changing E.

2. Motion of camera, object or light source; changing G.

3. Changes of the camera settings; changing H (or roff ).

These situations are considered in the remainder of the thesis.

Considering the gamma correction (Section 2.2.3), we define the final digital image B as:

B = F(Schip)    (2.27)

Schip = [Schip,r Schip,g Schip,b]ᵀ
F(a) = a^γ

γ is a term that is not time-varying. We chose a γ-function because it is a valid solution for the CRF (Section 2.2.4). Nevertheless, F can be redefined to account for other processes in the camera, such as the Bayer interpolation, vignetting, and so on, or to model other CRFs. In case γ cannot be recovered, there is a blind method [27] that estimates its value. Working with JPEG images, it can be assumed that γ = 1/2.2 [98, Section 10.1].

The Dynamic Image Formation Model (DIFM) defined by Equation 2.27 holds when the sensor pixels are not over- or under-exposed. Moreover, if the pixel intensity is low, the noise could mask the signal, invalidating this model. Thus, we consider low and high pixel intensities to be outliers.
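The following Python sketch ties Equations 2.26 and 2.27 together for one colour band of a single region; every number is hypothetical, a single light source is assumed for brevity, and a small Gaussian term stands in for the shot noise Ns only to mirror the structure of the model.

```python
import numpy as np

def difm_pixel(H_k, rho_k, E_C, E_t, G_t, r_off, gamma=1 / 2.2, sigma_n=1e-3):
    """Digital value of one colour band under the DIFM (Eqs. 2.26-2.27).

    With one light source, the sum over NS sources reduces to the
    single product E_C * E_t * G_t.
    """
    noise = np.random.default_rng(0).normal(0.0, sigma_n)
    s_chip = H_k * rho_k * (E_C * E_t * G_t + noise + r_off)  # Eq. 2.26
    return s_chip ** gamma                                    # Eq. 2.27

# Doubling the illumination gain E(t) raises the digital value non-linearly
print(difm_pixel(H_k=0.05, rho_k=0.6, E_C=1.0, E_t=1.0, G_t=0.7, r_off=0.01))
print(difm_pixel(H_k=0.05, rho_k=0.6, E_C=1.0, E_t=2.0, G_t=0.7, r_off=0.01))
```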

In the following chapter, we use this dynamic model to determine a photometric relation between multiple regions of a scene under light variations.

10 Vignetting correction is outside of the scope of this thesis. In case it is a problem, there are rapid and simple methods, such as Zheng et al. [123]'s, that correct it.


3 PREDICTION OF THE INTENSITY VARIATION OF A REGION

The purpose of this chapter is to elaborate further on the first question outlined in Section 1.3. Thus, we use the dynamic model developed in the previous chapter to propose a linear predictive model that relates the photometric variation of an objective region to the variations measured in nearby regions. This approach is capable of forecasting the change of the photometric response of a target when the light conditions alter, with no information about either the light sources or the target. Note also that we do not model absolute photometric responses; this would involve prior knowledge of the surface properties.

We also establish the circumstances under which the proposed model is feasible, through a theoretical analysis plus an assessment with four datasets of real images.

In the context of this dissertation, a region is a surface, or part of it, that belongs to a unique entity. Hence, a surface may contain several regions, e. g., a wall can be split into multiple regions; and a region may be composed of several connected objects, such as a vehicle, a person or a floor with different tiles.

In Section 3.1, the relation between the intensity variations of regions within the same scene is theoretically examined. To the best of our knowledge, no previous research work has been done on this topic; thus, no SoA is described in this chapter. We formulate the model in Section 3.2. Its solution is planned as a multiple linear regression analysis. This analysis must face several issues that are gathered in Section 3.3. In Section 3.4, we implement several experiments with our own and public datasets. A set of statistical indicators and an evaluation strategy were built to assess the feasibility of the proposed model. According to that, we outline and discuss the obtained results. Finally, some conclusions regarding the proposed model, based on the experiments, are remarked in Section 3.5.

3.1 quotients relational model of regions

Let us consider the scene shown in Figure 3.1. A light source LS illuminates two regions, O1 and O2. A camera C captures the light coming from both regions and creates a picture that contains the corresponding formed images, B1 and B2, respectively. To generalise the analysis, the number of regions is expanded to R + 1 and the number of light sources to NS. It is assumed that:

• Regions are flat surfaces.

• Regions do not occlude the light sources' radiation to each other.

• Images Bi(t), ∀i ∈ Z : i = 1, · · · , R + 1, are not occluded.

• Interreflections and shadows are neglected.

Considering a dynamic scene and one colour band1, at time t, images Schip,i(t), ∀i ∈ Z : i = 1, · · · , R + 1, are generated in the sensor (Equation 2.26).

1 One colour band is considered for simplicity hereafter. The reasoning is similar for multiple bands.


Figure 3.1: Image formation process with two regions.

Time trf is selected as the reference to determine the variations of the photometric responses.

Figure 3.2: Diagram of the albedo at two points of a flat surface illuminated by the same light source.

Actually, not all the pixels of the image of the same region have the same value. Apart from the vignetting effects (neglected in this thesis, as indicated in Chapter 2), the albedo depends on the position within the surface. Given a point P1 of the region O (Figure 3.2), and assuming that the light source is placed at Ps, let us define the vector rs1 ≡ PsP1 ≡ (rs11, rs12, rs13). For the point P1 + ∆r within the same region, the new albedo is (Appendix B):

cos θ∆ = √( (∑i=1^3 rs1i²) / (∑i=1^3 (rs1i ± ∆ri)²) ) cos θ1    (3.1)

Nevertheless, if the distance from the region to the light source is much greater than the distance between points, the albedo remains constant. If this is true for every point of the region, then:

‖rs1‖ ≫ ‖∆r‖, ∀∆r ∈ O ⇒ cos θ∆ ≈ cos θ1    (3.2)

In the remainder of this section, it is assumed that Equation 3.2 is fulfilled; in other words, the light sources are far away. In this regard, let us formulate the expression that relates the response region to the explanatory regions.

Definition 3.1. QrfSi(t) is the quotient of the image generated in the sensor Schip(t) at time t due to region Oi by that generated at time trf:

QrfSi(t) ≐ Si(t) / Si(trf) ∈ R    (3.3)

This function is also called the photometric variation function of region Oi.

Using Equation 2.26, and neglecting Ns and roff, Equation 3.3 leads to:

QrfSi(t) = (H(t) ρi ∑j=1^NS ECj Ej(t) Gij(t)) / (H(trf) ρi ∑j=1^NS ECj Ej(trf) Gij(trf))    (3.4)

Defining2

ki = 1 / (H(trf) ∑j=1^NS ECj Ej(trf) Gij(trf)) = ρi / Si(trf)    (3.5)

which is constant along time, Equation 3.4 is grouped into:

QrfSi(t) = ∑j=1^NS ki H(t) Gij(t) ECj Ej(t)    (3.6)

Regarding the digital image, QrfBi, knowing the γ-function (Equation 2.27), can be computed from QrfSi:

QrfBi(t) = Bi(t) / Bi(trf) = Si(t)^γ / Si(trf)^γ = (QrfSi(t))^γ    (3.7)
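As a small illustration of Definition 3.1 and Equation 3.7, the sketch below computes the measured variation function QrfB of one region from a sequence of grey-scale frames of a static FoV; the names frames and mask are hypothetical, and the region is summarised by the mean intensity of its pixels, a simplification assumed here for brevity.

```python
import numpy as np

def quotient_variation(frames, mask, ref_idx=0, eps=1e-6):
    """Photometric variation function QrfB (Equations 3.3 and 3.7).

    frames: sequence of grey-scale images of the same static FoV.
    mask:   boolean array selecting the pixels of one region.
    ref_idx: index of the reference time trf.
    """
    levels = np.array([f[mask].mean() for f in frames])
    return levels / max(levels[ref_idx], eps)  # QrfB(t) for every frame
```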

Using the E definition (Section 2.4), and omitting the temporal term for simplicity, Equation 3.6 is expressed in vector form as:

QrfSi = H piᵀ · e,  with piᵀ = ki [Gi1 · · · GiNS] and e = [E1 · · · ENS]ᵀ    (3.8)

Equation 3.8 means that the photometric variations of any region can be determined by a sum of the products of the camera internal parameters, the irradiances of the light sources, and their geometric relations. Nevertheless, they do not depend on the surface properties (ρi). Thus, it seems reasonable to think that, for a given region Ores (defined as the response region), QrfSres can be inferred from the variations of other regions (defined as the explanatory regions) QrfSi, ∀i = 1, · · · , R + 1 : i ≠ res.

Hypothesis 3.1. The photometric variation function of a region Ores (response region) perceived by a camera can be estimated by a linear combination of the variation functions of other explanatory regions Oi observed by the same camera:

QrfSres = ∑i=1^R wi QrfSi,  wi ∈ R, ∀i ∈ Z : i = 1, · · · , R    (3.9)

2 The reference image is selected such that it has a value different from zero. Thus, the denominator is never zero.


In this thesis, Equation 3.9 is called the Quotient Relational Model of Regions (QRMR).
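Since the solution is later planned as a multiple linear regression (Section 3.2), a minimal sketch of how the weights w of Equation 3.9 could be estimated by ordinary least squares follows; the matrix Q_expl and the vector q_res are hypothetical names for variation functions measured as in the previous sketch.

```python
import numpy as np

def fit_qrmr_weights(Q_expl, q_res):
    """Least-squares estimate of the QRMR weights w (Equation 3.9).

    Q_expl: T x R matrix whose columns are the measured variation
            functions QrfSi of the R explanatory regions over T frames.
    q_res:  length-T variation function of the response region.
    """
    w, *_ = np.linalg.lstsq(Q_expl, q_res, rcond=None)
    return w

# Usage: predicted response variation for new frames
# q_hat = Q_expl_new @ w
```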

To better understand the relation between regions, a deeper analysis of w = [w1 · · · wR]ᵀ is necessary. From Equation 3.6 and Equation 3.9, it holds:

QrfSres = ∑i=1^R ∑j=1^NS wi ki H Gij Ej    (3.10)

Considering multiple colour bands, because the QRMR is related to the photometric response of the surfaces (included in the ki term), which is different for each band, a model per colour band is necessary.

Equation 3.10 is also expressed in matrix form:

QrfSres = wᵀ · H P · e    (3.11)

where Pᵀ = [p1 · · · pR]. Using Equation 3.8 and Equation 3.11, the following relation holds:

H presᵀ · e = wᵀ · H P · e ⇒ presᵀ · e = wᵀ · P · e    (3.12)

Equation 3.12 states that the irradiance intensities coming from each light source, combined with the geometric relations among the light source, region and camera, are linearly dependent upon the combination of the same intensities and the equal geometric relations of the other regions. w is the weights vector that connects the contribution of each piᵀ · e to presᵀ · e.

Although w is not always constant along time, a suitable estimate can be computed by regression (Section 3.2). In the following subsections, Equation 3.12 is used in three scenarios with the aim of illustrating the validity of Hypothesis 3.1. The first scenario is the simplest one: it consists of one region to be compared with and a single light source. In the second one, each of the regions is illuminated by one light source and each light source illuminates one region. In the third one, the responses of the regions are correlated with each other.

3.1.1 Single region and single light source

Definition 3.2. Let us consider a scenario where R = 1 and NS = 1.

In this case, e = E1 and P = k1 G1. Using Equation 3.8, Equation 3.12 holds:

kres Gres E1 = w1 k1 G1 E1    (3.13)

Solving Equation 3.13 for w1 and using Equation 2.22, it holds:

w1 = (kres Gres) / (k1 G1) = (kres / k1) · (kdres cos θdres + (1 − kdres) e^(−(ketres θeres)²)) / (kd1 cos θd1 + (1 − kd1) e^(−(ket1 θe1)²))    (3.14)

From Figure 2.3, for θd1 = π/2 the incoming light is parallel to the surface; thus, no reflection is produced and the relation between both regions is meaningless. In this case, the value of w1 could be infinite for Lambertian surfaces, where kd1 ≈ 1.


Theoretically, to confirm Hypothesis 3.1, w1 must be constant. Neither the camera internal parameters nor the irradiance intensity modify its value. However, when the light source or any object moves, the values of the cosines of Equation 3.14 change. Nevertheless, in certain circumstances, the variation of w1 may be considered imperceptible or, at least, limited. For example, Figure 3.3 shows the Cumulative Distribution Function (CDF) of Gres,i/Gii for two Lambertian surfaces, as a function of the albedos between 0 and π/2, assuming a uniform distribution of them3. In this scene: Gres,i = Gres, Gii = G1, cos θdres,i = cos θdres, and cos θdii = cos θd1.

1. Almost the 50% of the distribution is limited between 0 and 1.

2. Almost the 80% of the distribution is limited between 0 and 2.

This means that, with no prior knowledge of the albedos, the likelihoodthat the suitable w1 lies between

[0, kres/k1

]is close to 50% and between[

0, 2 kres/k1]

is close to 80%. Thus, when the objects move, the variation ofw1 is not constant but primarily restricted.

0 2 4 6 8 100

0.2

0.4

0.6

0.8

1

Gres,i/Gii

Norm

aliz

ed fre

quency

Cumulative Histogram

Figure 3.3: CDF of Gres,iGii

consideringθdres,i , θdii =

[0, π

2

]. Red circle corresponds to

Gres,iGii

= 1. In this example, Lambertian surfaces are considered (i. e., kdres =

kdi = 1).

To shed further light on this analysis, in Figure 3.4a, Gres,iGii

is displayedas a function of θdres,i and θdii . Furthermore, in Figure 3.4b the sum of the

partial derivatives of Gres,iGii

is shown. The large values close to θdii = π/2 (light

3 Attending to Figure 2.3, by definition, the albedo is defined in this range.

Page 58: Correction of the Colour Variations under Uncontrolled Lighting Conditions

30 prediction of the intensity variation of a region

source parallel to the region 1), has no physical mean, thus, these values areconsidered outliers. Besides, a small and concentrated area under this linehas a value greater than 5. In Figure 3.4a, the area under the line definedby the points (0, 3π/8) and (π/2,π/2) is blueish; in other words, limitedbetween

[0, 2

]. A variation in this range can be considered to be almost

constant.Regarding the gradient graph (Figure 3.4b), most of the distribution is

dark blue. It means that the sum of the partial derivatives is zero. In par-ticular, it happens in the area under the line θdres = θd1 . Thus, when thealbedos change within this area, w1 may be considered constant and theHypothesis 3.1, true.

Whereas, the area over the line defined by the points (0, 3π/8) and(π/2,π/2) is reddish, which is larger than ∼ 0.015. In that case, Hypothe-sis 3.1 may be not valid.

Nevertheless, this is a theoretical analysis and these types of considera-tions have no sense if no real experiments demonstrate its validity. Therefore,the selected values of likelihood and gradient are indicatives and cannot beunderstood as references. In Section 3.4 we further assess this model usingreal images.

3.1.2 Photometric independent regions

Definition 3.3. Let us consider an scenario where R = NS and each explanatoryregion only depends on one light source: Gij = 0, ∀i, j ∈ Z : i, j = 1, · · · ,R, i =j.

In this scenario each explanatory region is selected such as it receivesradiation from one light source and each light source only radiates to oneexplanatory region plus the response region. Although this is an unrealisticscene, it is very illustrative. In this case, P = diag

(ki Gii

). From this assump-

tion and Equation 3.8, Equation 3.12 holds:

kres [Gres,1 · · · Gres,NS] · [E1 · · · ENS]ᵀ = wᵀ · [k1 G11 E1 · · · kR GRR ER]ᵀ    (3.15)

Hence, expanding Equation 3.15, it yields:

kres ∑i=1^R Gres,i Ei = ∑i=1^R wi ki Gii Ei    (3.16)

The solution to Equation 3.16 that is independent of Ei is the following:

wi = (kres Gres,i) / (ki Gii), ∀i ∈ Z : i = 1, · · · , R    (3.17)

The similarity with Equation 3.14 is obvious. Thus, the same considerations as those made in the previous section apply in this case for each one of the explanatory regions. Indeed, a scenario that consists of a single light source and a single explanatory region is a particular case of this one.

3.1.3 Intercorrelated regions

Definition 3.4. Let us consider a scenario where the R explanatory regions are intercorrelated. In other words, the photometric variation functions of the R explanatory regions have a linear relationship between them [41, Chapter 10]:

∑i=1^R Λi QrfSi + Z = 0    (3.18)


Figure 3.4: (a) Gres,i/Gii function. The x-axis represents θdres,i and the y-axis, θdii. The line θdii = θdres,i is shown in red as reference. In this example, Lambertian surfaces are considered (i. e., kdres = kdi = 1). Each value is represented by a different colour, with the colormap scale on the right. (b) is the gradient of (a), considered as the sum of both partial derivatives.

Λi constants such that not all of them are zero simultaneously
Z stochastic term

In multilinear regression models, this relation is known as multicollinearity. When Z = 0, the multicollinearity is perfect.

The previous definition describes the case when the regions are close enough, their orientations are similar, and the same light sources illuminate them.


Their photometric variation functions are therefore comparable, and it is easy to determine a linear relationship between them. Certainly, this is the type of relation that Hypothesis 3.1 claims between the response and the explanatory regions when the latter are not independent. We have not yet stated the dissimilarities between the explanatory and the response regions. Originally, both types of region are alike, although in Chapter 4 the response regions refer to foreground objects and the explanatory regions to the background surfaces. Within this chapter, a response region may be an explanatory region and vice versa; the choice depends on the region we define as the objective. Thus, multicollinearity is likely to be found between the explanatory regions. Nevertheless, the multicollinearity may diminish due to occlusions between regions. These occlusions cause some components of the vector $e$ to be zero for the occluded regions. Therefore, those regions contain less information than the non-occluded regions. Consequently, occluded and non-occluded regions are partially independent.

Gujarati and Porter [41, Chapter 10] handle the implications of multicollinearity in linear regression models. Although this phenomenon is generally an undesired relation between the explanatory variables, it does not mean that the QRMR is invalid. Nevertheless, the implications of selecting collinear regions should be revisited; they are analysed in more detail in the remainder of this chapter.
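The following minimal sketch (assuming numpy and an explanatory data matrix with one column per region; the function name is illustrative, not from the thesis) shows two standard diagnostics for detecting such collinearity before fitting:

```python
# A sketch of two standard multicollinearity diagnostics; names illustrative.
import numpy as np

def collinearity_report(Qb: np.ndarray) -> None:
    """Qb: P x R matrix of explanatory quotients (one column per region)."""
    corr = np.corrcoef(Qb, rowvar=False)           # pairwise Pearson coefficients
    off_diag = np.abs(corr - np.eye(corr.shape[0]))
    print("max |off-diagonal correlation|:", off_diag.max())
    print("condition number of Qb:", np.linalg.cond(Qb))  # large -> collinear
```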

To achieve the prediction of the variation of the response region using the QRMR, several considerations regarding the explanatory regions should be taken into account based on the previous analysis:

light sources influence. The explanatory regions should receive radiation from the same light sources as the response region. If the response region is radiated by a light source $S_i$ but none of the explanatory regions receives light from it, the $E_i$ component cannot be recovered. Furthermore, prior knowledge about the positions of the regions is expected.

type of surfaces. The spectral response of the surfaces should have components in every colour band. Otherwise, the photometric variation function is null in that band and no QRMR can be determined.

orientations. The normals of the surfaces should be similar to those of the response regions. This similarity guarantees that the relation remains constant (Figure 3.4b). In addition, if the regions belong to moving objects, the ranges of values of their surface normals must be considered. Thus, prior knowledge about the possible shapes and orientations of the regions is expected.

positions. The relative distance between explanatory and response regions should be much less than the distance between the regions and the light sources. If this is fulfilled, the condition in Equation 3.2 is satisfied.

number. To avoid multicollinearity, the number of regions should be as low as possible. There is a tradeoff between avoiding multicollinearity effects and collecting all the possible light variations. In any case, if a certain number of regions can explain the behaviour of a response region, there is no reason to include more.

Once the conditions under which the QRMR may be used have been analysed, let us state a regression problem that enables the prediction of the variation of a region using this model.


3.2 problem statement

Let us formulate the prediction problem using a dataset of images of the same scene taken over time and containing the explanatory and the response regions. Given a time reference, the quotients functions are defined as $Q^f(t) = Q_{rf_{S_{res}}}(t)$ for the response region and $Q^b_i(t) = Q_{rf_{S_i}}(t)$ for the i-th explanatory region. The t-th samples are denoted as $Q^f(t = t_t) = q^f_t$ and $Q^b_i(t = t_t) = q^b_{t,i}$.

Under these definitions, the prediction of the photometric variation function of a particular region may be formulated by using a Conditional Expectation Function (CEF) [41, Chapter 2] such as:

$$E(Q^f \mid q^b_{t,1}, \cdots, q^b_{t,R}) = f(q^b_{t,1}, \cdots, q^b_{t,R}) \qquad (3.19)$$

where $E(Q^f \mid q^b_{t,1}, \cdots, q^b_{t,R})$ is the conditional mean of the response region given a set of sample values of the explanatory regions. This mean is a function of the latter regions. Following Hypothesis 3.1, Equation 3.19 is rewritten as:

$$E(Q^f \mid q^b_{t,1}, \cdots, q^b_{t,R}) = \sum_{i=0}^{R} \beta_i\, q^b_{t,i} \qquad (3.20)$$

where $q^b_{t,0} = 1,\ \forall t \in \mathbb{Z}$, and $\beta = [\beta_0 \cdots \beta_R]^T \in \mathbb{R}^{R+1}$ is a vector of regression coefficients with unknown but fixed parameters.

Note that we introduce an additional term $\beta_0$, or intercept term, which models the camera noise, the bright term, and the deviations from the assumptions established in Section 3.1, such as interreflections, the presence of shadows, or radiation from light sources that do not influence the explanatory regions.

Regarding the expected value of $Q^f$, a deviation error is defined for each sample:

$$\varepsilon_t = q^f_t - E(Q^f \mid q^b_{t,1}, \cdots, q^b_{t,R}) \qquad (3.21)$$

This is also known as the residual of the sample. For P samples (with $P > R$), and using Equation 3.20, the previous equation may be written in matrix form:

$$\varepsilon = q^f - Q^b \cdot \beta \qquad (3.22)$$

where $q^b_{t,i}$ is the element in row t and column i of $Q^b \in \mathbb{R}^{P \times (R+1)}$ (whose first column is all ones), $q^f = [q^f_1 \cdots q^f_P]^T \in \mathbb{R}^P$, and $\varepsilon = [\varepsilon_1 \cdots \varepsilon_P]^T \in \mathbb{R}^P$.

The goal is to estimate a regression coefficient vector that minimises the squared 2-norm of the deviation error vector:

$$\hat{\beta} = \arg\min_{\beta} \left( \| q^f - Q^b \cdot \beta \|^2 \right) \qquad (3.23)$$

The estimated quotients of the response region are therefore obtained from the explanatory regions as:

$$\hat{q}^f_t = q^b_t \cdot \hat{\beta} \qquad (3.24)$$

where $q^b_t = [1\ q^b_{t,1} \cdots q^b_{t,R}] \in \mathbb{R}^{R+1}$.


When the estimate is extended to every colour band, Equation 3.23 becomes:

$$\hat{\beta}_k = \arg\min_{\beta_k} \left( \| q^f_k - Q^b_k \cdot \beta_k \|^2 \right) \qquad (3.25)$$

Throughout this chapter, the Gaussian assumptions of linear regression models are mostly adopted [41, Section 7.1].

Once the problem has been posed, several methods to solve Equation 3.25 are discussed in the following section.

3.3 solution methods

Multiple optimisation approaches are capable of estimating a parametric distribution that solves Equation 3.23. Throughout this section, we focus on techniques based on Least Squares (LS) methods [118]. Although linear programming [19] or convex optimisation [9], which generalises the linear methods, may provide more accurate or faster estimates, they were not used in this thesis. Neither accuracy nor computational time is the target here, but rather to characterise the validity of Hypothesis 3.1. In this regard, data quality is more important than the optimisation technique involved. The methods introduced in this section take advantage of prior knowledge of the nature of the photometric variation functions. Thus, three strategies were explored:

1. Avoid the outliers due to shadows, occlusions, noise and non-linear effects (e. g., overexposure). Robust Regression (RR) methods [51] can address this kind of data.

2. Correct or minimise the multicollinearity effects. Linear methods such as Partial Least Squares (PLS) [25] or ridge regression [118, Section 7.3] can diminish the consequences of collinear data.

3. Enforce the regressors to be positive. Because a direct linear relationship between region quotients is expected, it seems reasonable to think that none of the regressors contributes negatively to the response. This constraint does not affect the intercept term, which may be positive or negative. A Non-linear Least Squares (NLS) method [92] can accommodate this constraint in a way similar to an LS technique.

From the LS point of view, the analytical solution to Equation 3.23 is obtained by setting its derivative equal to zero. Hence:

$$\hat{\beta} = \left( Q^{bT} Q^b \right)^{-1} Q^{bT} q^f \qquad (3.26)$$

If the distribution of the deviation term $\varepsilon$ is normal, Equation 3.26 returns the maximum likelihood estimate. Nevertheless, the presence of outliers can produce heavy-tailed error distributions, which yields inadequate estimates.
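A minimal sketch of the LS estimate follows (solving with numpy's least-squares routine instead of the explicit inverse of Equation 3.26 is a numerical-stability choice for illustration, not the thesis implementation):

```python
# Closed-form LS estimate of Equation 3.26, solved stably with lstsq.
import numpy as np

def ls_estimate(Qb: np.ndarray, qf: np.ndarray) -> np.ndarray:
    """Qb: P x (R+1) matrix whose first column is ones; qf: P-vector."""
    beta, *_ = np.linalg.lstsq(Qb, qf, rcond=None)   # minimises ||qf - Qb @ beta||^2
    return beta
```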

If the distributions of the predictors are multicollinear, the explanatory data matrix $Q^b$ can be almost singular, and the inversion may not be computationally feasible. Even if a solution is determined, the estimate may be inaccurate and very sensitive to small variations of the predictor data. In addition, a high correlation among the explanatory data may imply that the information related to the intensity variation is redundant, leading to overfitting. Let us shed further light on these issues in the following subsections.


3.3.1 Outliers management

The easiest approach to outlier removal is to filter the outliers before the data matrix is built. Some outlier values are easily identified and filtered in this way. For example, quotients that are quasi-infinite or zero imply that either the reference or the current related region is close to zero due to a total occlusion. These values can be rejected effortlessly. Nevertheless, other outliers produced by shadows, partial occlusions or interreflections are more complicated to recognise.

Fortunately, linear RR methods detect and remove unusual data that do not fit the linear relationship. Other methods address outliers as well, such as RANdom SAmple Consensus (RANSAC) [31], and their application in Computer Vision is well known [124]. However, those methods were not used in this thesis because the target was not to determine an optimal regressor estimation technique but to demonstrate the validity of the QRMR model. The most common general method of robust regression is M-estimation, or maximum likelihood estimation [51, Chapter 3]. Considering the linear problem of Equation 3.22, an objective function $\rho(\varepsilon)$ is introduced:

$$\rho(\varepsilon) = \rho\left( q^f - Q^b \cdot \beta \right) \qquad (3.27)$$

Let $\Psi = (\partial/\partial\varepsilon)\, \rho(\varepsilon)$; we define the weight function z as follows:

$$z(\varepsilon) = \frac{\Psi}{\varepsilon} \qquad (3.28)$$

We also define a weight matrix $Z = \mathrm{diag}(z(\varepsilon_t))$. Similarly to the LS solution, the analytic solution to Equation 3.27 is the following:

$$\hat{\beta} = \left( Q^{bT} Z\, Q^b \right)^{-1} Q^{bT} Z\, q^f \qquad (3.29)$$

The estimate depends on the weights, which depend on the residuals, which in turn depend on the regressor estimates. To solve this, an iteratively reweighted LS method is performed. The initial estimate is the solution to the common LS method. At each iteration, the residuals and their associated weights are computed from the previous iteration and a new regressor vector is obtained. The algorithm is repeated until the regressor vector converges or until some maximum iteration limit is reached.

The weight function should be designed such that it ensures that the residuals in the heavy tail are filtered. Several functions satisfy this criterion4. One of the most commonly used, adopted in this thesis, is the bisquare weight function, which has the following expression:

$$z(\varepsilon) = \begin{cases} \left[ 1 - \left( \varepsilon/k \right)^2 \right]^2 & |\varepsilon| \leq k \\ 0 & |\varepsilon| > k \end{cases} \qquad (3.30)$$

where k is a tuning constant5.
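A sketch of the iteratively reweighted LS scheme with the bisquare weights of Equation 3.30 is given below; scaling the residuals by a robust spread estimate (MAD) before applying k = 4.685 follows standard practice and is an assumption, not the thesis code:

```python
# Iteratively reweighted LS (Equation 3.29) with bisquare weights (Equation 3.30).
import numpy as np

def bisquare(res: np.ndarray, k: float = 4.685) -> np.ndarray:
    s = np.median(np.abs(res)) / 0.6745 + 1e-12     # robust scale (MAD), assumed
    u = res / (k * s)
    w = (1.0 - u ** 2) ** 2                         # weight for |res| <= k*s
    w[np.abs(u) > 1.0] = 0.0                        # zero weight beyond the cutoff
    return w

def irls(Qb: np.ndarray, qf: np.ndarray, iters: int = 50, tol: float = 1e-8):
    beta, *_ = np.linalg.lstsq(Qb, qf, rcond=None)  # LS initialisation
    for _ in range(iters):
        sw = np.sqrt(bisquare(qf - Qb @ beta))      # sqrt-weights per sample
        new, *_ = np.linalg.lstsq(sw[:, None] * Qb, sw * qf, rcond=None)
        done = np.linalg.norm(new - beta) < tol     # convergence check
        beta = new
        if done:
            break
    return beta
```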

3.3.2 Multicollinearity

Besides a convenient selection of the explanatory regions (e. g., by using clus-ter appearance methods [58]), some techniques may be applied to addressthe multicollinearity.

4 Note that the LS approach is achieved for $\rho(\varepsilon) = \varepsilon^2$.
5 According to standard approaches, in the experiments of this thesis a tuning constant of 4.685 was used.


Ridge regression [47] is one of the LS approaches that handles multicollinearity. The analytical solution takes the following expression:

$$\hat{\beta} = \left( Q^{bT} Q^b + \lambda I \right)^{-1} Q^{bT} q^f \qquad (3.31)$$

where $\lambda \in \mathbb{R}$, $\lambda > 0$, and $I$ is the identity matrix. Ridge regression involves a study of the suitable $\lambda$ parameter. A shrinkage method, such as the lasso [102], helps to manually select a suitable parameter. To avoid non-automatic algorithms, we did not do this in the experiments; instead, we used a PCA approach [118, Section 7.3.2] that can be directly connected with ridge regression.

Other PCA-related techniques that tackle multicollinearity are PLS methods [25]. They estimate the regressors by projecting the explanatory and the response data onto a new space and then performing an LS regression.
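A sketch of both estimators follows: the ridge solution of Equation 3.31 and a principal-component regression keeping the components that explain 90 % of the variance (matching the PCAM configuration described later in Section 3.4.2). The implementation details are assumptions for illustration:

```python
# Ridge regression (Equation 3.31) and a PCA-based regression sketch.
import numpy as np

def ridge(Qb: np.ndarray, qf: np.ndarray, lam: float) -> np.ndarray:
    """Qb here excludes the column of ones; lam > 0 is the ridge parameter."""
    n = Qb.shape[1]
    return np.linalg.solve(Qb.T @ Qb + lam * np.eye(n), Qb.T @ qf)

def pcr(Qb: np.ndarray, qf: np.ndarray, var_kept: float = 0.90) -> np.ndarray:
    """LS over the principal components that keep var_kept of the variance."""
    mu = Qb.mean(axis=0)
    U, s, Vt = np.linalg.svd(Qb - mu, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(s ** 2) / np.sum(s ** 2), var_kept)) + 1
    T = (Qb - mu) @ Vt[:k].T                        # scores on first k components
    gamma, *_ = np.linalg.lstsq(T, qf - qf.mean(), rcond=None)
    beta = Vt[:k].T @ gamma                         # back to original coordinates
    beta0 = qf.mean() - mu @ beta                   # recovered intercept term
    return np.concatenate(([beta0], beta))
```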

3.3.3 Positive regressors

The positivity constraint on the regressors may be accommodated by using an NLS [92] approach in which the parameters to be solved are the square roots of the regressors. Similarly to Equation 3.23, let us define the following NLS problem using an auxiliary parameter vector:

$$\hat{u} = \arg\min_{u} \left( \| q^f - Q^b \cdot [u_0\ u_1^2 \cdots u_R^2]^T \| \right) \qquad (3.32)$$

The regressors are therefore obtained by the following expression:

$$\hat{\beta}_i = \begin{cases} \hat{u}_0 & i = 0 \\ \hat{u}_i^2 & i = 1, \cdots, R \end{cases} \qquad (3.33)$$

Note that the linear relation remains unaltered, although the name of the NLS technique may suggest the contrary.

Equation 3.32 is mostly solved by iterative algorithms because analytic solutions are not always available. For further reference, consult Seber and Wild [92].
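A sketch of Equations 3.32-3.33 using SciPy's generic least-squares solver follows (the choice of scipy.optimize.least_squares is an assumption for illustration, not the solver used in the thesis):

```python
# Positive-regressors NLS via the squared parametrisation of Equation 3.33.
import numpy as np
from scipy.optimize import least_squares

def nls_positive(Qb: np.ndarray, qf: np.ndarray) -> np.ndarray:
    """Qb: P x (R+1) with a leading column of ones; beta_i >= 0 for i >= 1."""
    def residuals(u: np.ndarray) -> np.ndarray:
        beta = np.concatenate(([u[0]], u[1:] ** 2))  # Equation 3.33
        return qf - Qb @ beta
    u0 = np.ones(Qb.shape[1])                        # arbitrary starting point
    u = least_squares(residuals, u0).x               # iterative solver
    return np.concatenate(([u[0]], u[1:] ** 2))
```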

In the next section, the presented approaches are compared with each other in four real cases to better understand the situations in which the estimation is reliable, as well as its limitations.

3.4 experiments

Throughout this section, a series of experiments carried out to evaluate Hypothesis 3.1 is presented. They used four datasets of real scenes: i) Terrace, ii) MUCT, iii) Parking, and iv) Multi–Camera Dynamic Light (MCDL). Each dataset contained multiple images of the same site taken by four different non-moving cameras and covering plenty of photometric variations (number, type and position of the light sources; position and orientation of the regions; and camera settings). While the Terrace, MUCT and Parking datasets contained one site each, the MCDL dataset contained two. The Terrace dataset allowed us to analyse a simple outdoor scene, with flat Lambertian surfaces and non-moving foregrounds (FGs). The MUCT dataset [75] is a public database used for face recognition in which the faces were acquired under multiple indoor light conditions; besides, each picture contains a background (BG) region large enough to estimate the QRMR. Up to five camera views were used during the acquisition.


Thus, although the position of the faces in the image remained unaltered, the normals of the BG and FG surfaces changed, offering a challenging scene. Furthermore, it was a feasible scene because the BG is close to the FG. The Parking dataset was a challenging outdoor scene because the QRMR assumptions were not likely obeyed: over- and under-exposed pictures were acquired; multiple cast shadows were found; and the FG surfaces were not Lambertian and primarily produced highlights. The MCDL dataset provided a complete evaluation scene. It contained indoor and outdoor light variations, multiple views of the same FGs and multiple types of Lambertian surfaces.

In each scene, we selected the explanatory regions from the BG, that is, the non-moving regions: those that remain in the same position and orientation throughout the dataset acquisition. The response regions were selected from the FG, that is, those pixels that belong to moving or relevant objects. The full description of the datasets can be found in Appendix C. As a premise in this thesis, no prior knowledge about the light sources or their positions was used by the algorithms.

The Terrace dataset contained both RAW images and JPEG-compressed ones (with qualities in the range 95–100 % and compression rates in the range 2:1–3.7:1) taken under the same conditions. We used them to study the differences between images as the image sensor produces them (RAW) and after some digital processing (JPEG), which often involves non-linear transformations (Chapter 2).

In the following subsection, the indicators used for the assessment of the results are defined. Section 3.4.2 explains the implemented strategy. Afterwards, the most significant results are shown and analysed for the four datasets6. Finally, a discussion of the results is given in Section 3.4.7.

3.4.1 Statistics definition

To evaluate the performance of the regressions and compare the results of the different techniques, some statistics are used. The most significant ones are defined hereafter, together with some auxiliary terms that help to obtain them.

Definition 3.5. The number of degrees of freedom df of a multiple linear regression is the number of observations P minus the number of regressors R + 1:

$$df = P - R - 1 \qquad (3.34)$$

Definition 3.6. In a multiple linear regression, the variance $\sigma^2$ (or Mean Squared Error (MSE)) is the squared norm of the residuals divided by the number of degrees of freedom:

$$\sigma^2 = \frac{\|\hat{\varepsilon}\|^2}{df} = \frac{\left( q^f - Q^b \hat{\beta} \right)^T \left( q^f - Q^b \hat{\beta} \right)}{df} \qquad (3.35)$$

Definition 3.7. In a multiple linear regression, the coefficient of determination $R^2$ is defined as follows:

$$R^2 = 1 - \frac{\|\hat{\varepsilon}\|^2}{\mathrm{var}(q^f)} = 1 - \frac{\left( q^f - Q^b \hat{\beta} \right)^T \left( q^f - Q^b \hat{\beta} \right)}{\left( q^f - \bar{q}^f \right)^T \left( q^f - \bar{q}^f \right)} \qquad (3.36)$$

6 In Appendix D extended results of the experiments are presented for consultation.


$R^2$ is a parameter that measures the proportion of the total variation of the response vector explained by the explanatory data matrix [41, Section 3.5]. In other words, $R^2$ measures how well the estimate fits the data: the closer this coefficient is to one, the better.

Definition 3.8. In a multiple linear regression, given the variance–covariance matrix of the regressors $\mathrm{cov}(\hat{\beta})$,

$$\mathrm{cov}(\hat{\beta}) = \sigma^2 \left( Q^{bT} Q^b \right)^{-1} \qquad (3.37)$$

the standard error of the i-th regressor, $se_i$, is the square root of the i-th element of the diagonal of $\mathrm{cov}(\hat{\beta})$:

$$se_i = \sqrt{\mathrm{diag}_i\left( \mathrm{cov}(\hat{\beta}) \right)} \qquad (3.38)$$

The standard error is an indicator of the precision of the regressor.

Definition 3.9. In a multiple linear regression, the t-statistic of regressor $\hat{\beta}_i$ is defined as follows:

$$t(\hat{\beta}_i) = \frac{\hat{\beta}_i}{se_i} \qquad (3.39)$$

This statistic is often used in hypothesis testing to accept or reject a certain hypothesis about the regressors. In this context, t also indicates the influence of each regressor on the estimate: the larger the value, the more influence it has. Furthermore, the combination of insignificant t values and a large $R^2$ evidences multicollinearity.

Definition 3.10. Given the explanatory data of a multiple linear regression, its correlation matrix is defined by the elements ij, which are the Pearson coefficients between the explanatory data from regions i and j, that is, the i and j columns of $Q^b$, respectively.

The Pearson coefficient $r_{ij}$ is computed as follows:

$$r_{ij} = \frac{\mathrm{cov}\left( q^b_i, q^b_j \right)}{\sigma_{q^b_i}\, \sigma_{q^b_j}} = \frac{E\left[ \left( q^b_i - \mu_{q^b_i} \right) \left( q^b_j - \mu_{q^b_j} \right) \right]}{\sigma_{q^b_i}\, \sigma_{q^b_j}} \qquad (3.40)$$

where $q^b_i$ is column i of $Q^b$; $\mathrm{cov}(q^b_i, q^b_j)$ is the covariance between $q^b_i$ and $q^b_j$; $\sigma_{q^b_i}$ is the standard deviation of $q^b_i$; $\mu_{q^b_i}$ is the mean of $q^b_i$; and $E[\cdot]$ denotes expectation.

The correlation matrix is symmetric and its diagonal values are ones. The values of the matrix elements fall within $[-1, 1]$. If the value is 1, there is a direct linear relationship between the regions; if the value is $-1$, there is an inverse linear relationship; and if the value is 0, there is no linear relationship between the regions.

Large coefficients in the non-diagonal elements are also characteristic evidence of multicollinearity.
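The statistics of Definitions 3.5-3.10 can be computed compactly; the following sketch (variable names are illustrative) assumes an intercept column in the explanatory matrix:

```python
# Statistics of Definitions 3.5-3.10 for a fitted multiple linear regression.
import numpy as np

def regression_stats(Qb: np.ndarray, qf: np.ndarray, beta: np.ndarray) -> dict:
    P, n = Qb.shape                                   # n = R + 1 (with intercept)
    res = qf - Qb @ beta                              # residuals
    df = P - n                                        # Definition 3.5
    mse = (res @ res) / df                            # Definition 3.6
    r2 = 1.0 - (res @ res) / np.sum((qf - qf.mean()) ** 2)   # Definition 3.7
    cov_b = mse * np.linalg.inv(Qb.T @ Qb)            # Equation 3.37
    se = np.sqrt(np.diag(cov_b))                      # Equation 3.38
    t = beta / se                                     # Equation 3.39
    corr = np.corrcoef(Qb[:, 1:], rowvar=False)       # Definition 3.10
    return {"df": df, "MSE": mse, "R2": r2, "se": se, "t": t, "corr": corr}
```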


3.4.2 Evaluation strategy

The experiments were carried out using the images of the four datasets. We selected the images depending on the experiment, but all of the suitable images were used in each one. Once the images were selected, the first step was to select the reference image. One reference was selected for each FG region, and it was the same for the rest of the regions. The selection of the reference image can vary but, generally, we searched for a well exposed image. We considered an image well exposed when it has neither over- nor under-exposed pixels and its average value is around 25 % of the maximum possible value (255 at 8-bit integer pixel resolution). Then, the quotients for every region were computed.

At that point, we had tuples of quotients $(q^f_t, q^b_{t,1}, \cdots, q^b_{t,R})$ for each colour band. As the regression was done for each band separately (Equation 3.25), the same procedure was repeated for each one. We filtered out those tuples that contained extreme outliers, that is, quotients close to zero or to infinity.

Finally, we ran the several regression algorithms of Section 3.3. The applied methods were the following:

lsm: LS method.

lsm-r: Linear RR method using the bisquare weight function.

nlsm: NLS method as described in Section 3.3.3.

pcam: LS method over the PCA of the data observation matrix, choosing the components that include 90 % of the variance.

plsm: PLS method as described in Section 3.3.2.

Not every method was applied to every dataset. For instance, in the Parking dataset there was only one explanatory region; thus, the methods that address multicollinearity were not computed in this case because, presumably and as we confirmed in isolated tests, they do not improve on the others.

To assess the ability to infer the behaviour of the response regions in any scene using any of the algorithms, an analysis of the resulting error terms was done.

Besides some tables including the statistics defined in Section 3.4.1 and the regressor estimates, in certain circumstances we produced several graphs:

• The observed values of the response region $q^f$ versus the observed values $q^b_i$ of each explanatory region.

• The fitted values of the response region $\hat{q}^f$ versus the observed values $q^f$.

• The histogram of the residuals $\hat{\varepsilon}$.

To fulfil the Gaussian assumptions (Section 3.2), the disturbance error isexpected to have:

1. Zero mean value:

$$E(\varepsilon_t \mid q^b_{t,1}, \cdots, q^b_{t,R}) = 0, \quad \forall t \qquad (3.41)$$


2. Homoscedasticity, or equal variance:

$$\mathrm{var}(\varepsilon_t \mid q^b_{t,1}, \cdots, q^b_{t,R}) = \sigma^2, \quad \forall t \qquad (3.42)$$

3. No serial correlation:

$$\mathrm{cov}(\varepsilon_t, \varepsilon_{t'} \mid q^b_{t,1}, \cdots, q^b_{t,R}, q^b_{t',1}, \cdots, q^b_{t',R}) = 0, \quad \forall t \neq t' \qquad (3.43)$$

4. Zero covariance with the explanatory regions:

$$\mathrm{cov}(\varepsilon_t, q^b_{t,r}) = 0, \quad \forall t, r \qquad (3.44)$$

These properties should ideally hold and guide the evaluation. In a real situation, not all of the properties are always achieved even though the model and the observed data are good enough, as Gujarati and Porter [41, Section 3.2] point out. Furthermore, the $\varepsilon_t$'s cannot be computed because they would have to be obtained from the real regressors $\beta$ and not from their estimated values $\hat{\beta}$; thus, only statistics related to $\hat{\varepsilon}_t$ could be calculated. To analyse assumption #1, the Mean value of the Residuals (MR) is provided. To analyse assumptions #2–#4, the observed data used for each experiment were split into five random sets emulating five different observation sets. Then, we provide three parameters, defined as follows (a sketch of these diagnostics is given after the list):

1. The Standard deviation of the Variances of the Residuals (SVR) among the five sets. The closer to zero, the more likely the regression is homoscedastic.

2. The Mean value of the Residuals Covariance (MRC) among the five sets. The closer to zero, the less serial correlation between the residuals.

3. The Mean value of the Explanatory regions and Residuals Covariance (MERC) among the five sets and all the explanatory regions. The closer to zero, the less covariance between the residuals and the explanatory regions.
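A minimal sketch of these diagnostics (the random five-way split follows the text; the lag-1 covariance as a serial-correlation proxy is an assumption):

```python
# MR, SVR, MRC and MERC residual diagnostics over five random sets.
import numpy as np

def residual_diagnostics(Qb: np.ndarray, res: np.ndarray,
                         n_sets: int = 5, seed: int = 0) -> dict:
    rng = np.random.default_rng(seed)
    sets = np.array_split(rng.permutation(res.size), n_sets)
    variances = [np.var(res[s]) for s in sets]
    # lag-1 covariance of the residuals within each set (serial correlation)
    mrc = np.mean([np.cov(res[s][:-1], res[s][1:])[0, 1] for s in sets])
    # covariance between residuals and each explanatory column, averaged
    merc = np.mean([np.cov(res[s], Qb[s, j])[0, 1]
                    for s in sets for j in range(1, Qb.shape[1])])
    return {"MR": res.mean(), "SVR": np.std(variances), "MRC": mrc, "MERC": merc}
```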

In the following sections the most representative results are shown. Sometimes regions are denoted as Ri, where i is the region identifier. Other results that do not provide remarkable values are given in Appendix D; thus, we avoid insignificant information in the main thread of the dissertation while offering extended results that can be consulted if required.

3.4.3 Terrace dataset

Some samples of this scene can be seen in Figure 3.5. This was the most controlled scene of the four. Every region was static, that is, the surface normals remained unaltered in all of the pictures. Furthermore, there were no occlusions. As the objects were static, three pictures were taken in each capture; thus, the zero-mean camera noise was reduced by taking the average value of the three shots.

Besides comparing the regression performance of the proposed methods, we also compared the results between the RAW pictures and their related compressed versions. The aim was to analyse whether the processed images provided worse performance than the unprocessed ones. Note that the camera processing often involves non-linear transformations that may invalidate our hypothesis (Section 2.2.3). One of these non-linear transformations is the gamma correction.


Figure 3.5: Terrace dataset samples.

Following the recommendation of Szeliski [98, Section 10.1.1], we assumed that this gamma is 1/2.2 for images normalised to 1 in each colour band, and we applied the regression analyses to images corrected by this value ($\gamma' = 2.2$). We call these corrected pictures γ-JPEG images.
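A minimal sketch of this correction (assuming 8-bit inputs; the helper is illustrative, not the thesis code):

```python
# Gamma correction used to obtain the γ-JPEG set: normalise and raise to 2.2.
import numpy as np

def gamma_correct(img_uint8: np.ndarray, gamma_prime: float = 2.2) -> np.ndarray:
    img = img_uint8.astype(np.float64) / 255.0   # normalise each band to 1
    return img ** gamma_prime                    # undo the assumed DSP gamma of 1/2.2
```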

Table 3.1 depicts the correlation values between the two explanatory regions for each kind of image and colour band. The first impression was that the correlations were very high in all cases. These values were reasonable because the only difference between the photometric variation functions of the regions is the albedo.

           Red      Green    Blue
RAW        0.9998   0.9996   0.9994
JPEG       0.9997   0.9993   0.9992
γ-JPEG     0.9995   0.9987   0.9984

Table 3.1: Correlation between qb1 and qb2 by image format and colour band for the Terrace dataset.

Regarding the colour bands, the red band was the most correlated, followed by the green one. Nevertheless, the values were close (0.11 % maximum deviation).

The RAW set provided the highest value, followed by the JPEG set. Nevertheless, the difference was not meaningful (0.10 % maximum deviation).

Below, the analysis of the regression results for the three kinds of pictures is presented.

3.4.3.1 RAW

Figure 3.6 depicts the distribution of the FG versus the BG quotients for the RAW pictures. The linear relationship was very clear despite a few outliers that could be identified in the red band close to $q^f = 2.75$.

Two other issues were noticed:


Figure 3.6: Distribution of $q^f$ vs $q^b_1$ (left graphs) and $q^b_2$ (right graphs) for the three colour bands, (a) Red, (b) Green and (c) Blue, using RAW pictures of the Terrace dataset.


1. Most of the quotients in the red band were greater than one, whereas a large share of the quotients in the green and blue bands was less than one.

2. The range of the quotients was larger in the red band.

The reason seemed to be the selected reference image. The intensity in the red band of the reference regions was among the smallest of the whole dataset, whereas the intensity in the other bands was closer to the highest values of the dataset.

Figure 3.7: $\hat{q}^f$ vs $q^f$ for the RAW images of the Terrace dataset using the LS method. The three colour bands are shown. The graphs for the rest of the regression methods are almost identical; they can be found in Figure D.1. The black line is the identity line.

Figure 3.7 depicts the distribution of the estimated values of the response region versus the observed ones using the LS method. Note that the regression was almost perfect; just a few red values were not aligned with the identity line. They were the outliers detected in Figure 3.6a. We noticed that the rest of the regression methods yielded the same visual result (Figure D.1).

             LSM      LSM-R    NLSM     PCAM     PLSM
Red    β0    0.0173   0.0152   0.0173   0.0124   0.0124
       β1    0.1635   0.4923   0.1635   0.4830   0.4830
       β2    0.8099   0.4884   0.8099   0.5031   0.5031
Green  β0    0.0047   0.0021   0.0047   0.0050   0.0050
       β1    0.4902   0.5297   0.4902   0.4742   0.4742
       β2    0.4949   0.4562   0.4949   0.5097   0.5097
Blue   β0   -0.0000  -0.0001   0.0000   0.0014   0.0014
       β1    0.6041   0.5848   0.6040   0.4738   0.4739
       β2    0.3909   0.4076   0.3908   0.5115   0.5115

Table 3.2: Regressors estimates for the RAW pictures of the Terrace dataset.


In Table 3.2 the regressor estimates are shown for all of the methods and colour bands. In each estimate, the sum of the regressors was close to one. This means that the intensity variations of the explanatory regions and the response region were similar, although the surface orientations were different for the three regions. The likely cause was that the main light source (the sun) was far away, its position changed, and the radiation caused by interreflections was high, yielding similar distributions of the photometric variation functions. Furthermore, the intercept term was very low, especially in the blue band. Note that this term mostly models the noise; thus, this low value was probably caused by the high Signal-to-Noise Ratio (SNR) of the dataset (also because each image sample was the average of three shots).

Furthermore, the weighted contribution of each regressor seemed random. That is, in some estimates $\hat{\beta}_1$ was greater than $\hat{\beta}_2$ and in some others the opposite occurred, without any obvious reason. In the green band, however, they were balanced (close to 0.5) for all methods. The green band was more faithful to the real data due to the Bayer interpolation (Section 2.3), but this did not seem related to the similar values of the regressors, which was probably only a coincidence. According to the quotient distributions, both explanatory regions seemed to have the same relation to the response region. This suggests that there were likely multiple estimates yielding a total cost close to the minimal value, and this set of regressor values seemed to be restricted to those whose sum was close to one.

Regarding the regression methods, two issues could be highlighted:

1. PCAM and PLSM yielded identical results.

2. The NLSM and LSM estimates were practically equal. This always happened when the LSM estimates were positive, because the solution provided by NLSM is analytically the same in such cases. The slight differences were consequences of the numerical imprecision of the software tools.

                 LSM      LSM-R    NLSM     PCAM     PLSM
MSE  (×10−6)    1039.9   1198.1   1039.9   1082.4   1082.4
MR   (×10−6)    0.0      8180.8   0.0      -0.0     -0.0
SVR  (×10−9)    1061.9   1470.2   1061.9   1259.1   1259.1
MRC  (×10−6)    29.1     -15.0    29.1     1.3      1.3
MERC (×10−6)    101.2    4340.7   101.2    -19.4    -19.4
R²              0.9991   0.9990   0.9991   0.9991   0.9991
t(β0)           14.772   12.050   14.772   10.376   10.377
t(β1)           5.996    16.823   5.996    17.367   17.366
t(β2)           30.994   17.382   30.994   18.841   18.842

Table 3.3: Regression statistics for the RAW pictures of the Terrace scene. Red band.

Table 3.3 depicts the statistics for the red band. They were slightly worse than those of the green and blue bands; however, the discussion applies equally to all of them. The statistics of those bands can be found in Table D.1 and Table D.2.


For the same reasons as previously exposed, the statistics of LSM and NLSM (and of PCAM and PLSM) were the same.

In general, R2 values were high; and MSE, MR, SVR, MRC and MERC wereclose to zero. Therefore, the fit was very reliable and the Gaussian assump-tions were fulfilled in all cases.

Note that, unlike in LSM (and NLSM), the t values were high, even for the intercept term. Although the estimates were close to zero, their t's were high enough to ensure that this term is significant, which reinforced the idea of high SNRs. The contrary behaviour of LSM occurred because $\hat{\beta}_2$ was greater than $\hat{\beta}_1$ in this regression, while in the remaining methods they were similar.

The higher error statistics of LSM-R could be explained by observing the residuals histograms in Figure 3.8. The maximum values of the distributions were outliers. This value was greater in the green and blue bands in LSM-R because, through the LS iterations, the outliers were discarded to better fit the inliers. If the tail related to the outliers is neglected in these bands, LSM-R gathered the residuals in a smaller range (between −0.02 and 0.04); therefore, LSM-R outperformed the others.

In the red band, the distributions were very similar; they were concentrated in one bin (more than 70 %).

Furthermore, if the tail related to the outliers is neglected, all of the distributions showed a slight positive skewness and no bias.

In the next section, the results using the JPEG pictures are shown.

3.4.3.2 JPEG

Figure 3.9 depicts the distribution of the FG versus the BG quotients for the JPEG pictures. Similar conclusions to the RAW image analysis were extracted regarding linearity and outliers. In contrast, the distributions were closer together in each band. The reason seemed to be the white balance and other algorithms implemented by the camera DSP, which aim for equally distributed bands and average intensities (Section 2.2.3).

The distribution of the fitted values of the response region versus the observed ones did not provide significant information; thus, it is only shown in Figure D.4 for each method.

Table 3.4 shows the regressor estimates for the LSM, LSM-R and PLSM methods and all colour bands. The NLSM and PCAM estimates are not provided because they are equal to those of LSM and PLSM, respectively. No noticeable difference was found with respect to the RAW estimates. A greater value of the intercept term, caused by the compression algorithm, might have been expected; nevertheless, note that the noise was reduced by taking the average value of three images taken simultaneously, and the region quotients were computed taking the average value of the whole region.

Furthermore, unlike in the previous case, the green band was not balanced for the LSM-R method. This strengthened the idea that the balanced situation obtained with the RAW dataset was a coincidence.

Table 3.5 depicts the statistics of the red band. Apparently, the results were better than those produced by the RAW dataset (Table 3.3). Nevertheless, comparing all of the bands in both datasets (Table D.1–Table D.4), we could not say that the regressions fitted one of the datasets better than the other. On the other hand, similar conclusions were extracted; that is, the fit was very reliable, the Gaussian assumptions were fulfilled in all cases, and LSM-R yielded slightly worse statistics because the method did not fit the outliers.

7 The NLSM and PCAM residuals histograms were equal to those of LSM and PLSM. They are shown in Figure D.3.


Figure 3.8: Residuals histogram per band ((a) LSM, (b) LSM-R, (c) PLSM) of the RAW pictures of the Terrace dataset.


Figure 3.9: Distribution of $q^f$ vs $q^b_1$ (left graphs) and $q^b_2$ (right graphs) for the three colour bands, (a) Red, (b) Green and (c) Blue, using JPEG pictures of the Terrace dataset.


             LSM      LSM-R    PLSM
Red    β0    0.0019  -0.0037  -0.0018
       β1    0.3705   0.8090   0.4908
       β2    0.6260   0.1955   0.5103
Green  β0    0.0012  -0.0140  -0.0000
       β1    0.4592   0.8795   0.4900
       β2    0.5432   0.1369   0.5138
Blue   β0    0.0116  -0.0164   0.0098
       β1    0.4426   0.8742   0.4866
       β2    0.5536   0.1444   0.5118

Table 3.4: Regressors estimates for the JPEG pictures of the Terrace dataset.

                 LSM      LSM-R    PLSM
MSE  (×10−6)    241.1    422.9    246.5
MR   (×10−6)    0.0      7700.7   -0.0
SVR  (×10−9)    1.6      42.7     3.6
MRC  (×10−6)    0.8      10.5     3.52
MERC (×10−6)    587.8    4064.1   583.1
R²              0.9996   0.9993   0.9996
t(β0)           3.171    -4.729   -3.009
t(β1)           35.278   58.160   46.224
t(β2)           61.954   14.611   49.950

Table 3.5: Regression statistics for the JPEG pictures of the Terrace scene. Red band.



The only perceptible dissimilarity involved the t values of the intercept term. With the JPEG dataset these values were much lower than with the RAW dataset. There was also a higher dissimilarity between the t values of $\hat{\beta}_0$ and $\hat{\beta}_{1,2}$.

Regarding the histograms of the residuals (Figure 3.10), compared with those generated using the RAW dataset, the skewness was more evident and a slight negative bias was observed in almost all cases. Furthermore, the distributions were wider and the value of their modes was smaller (less than 50 %).

The main dissimilarities were: i) the distributions of the three bands were more similar, and ii) the presence of a tail on the right. This tail made it more difficult to distinguish inliers from outliers. For example, observing Figure 3.10b, it was difficult to guess whether the bins greater than 0.02 are inliers or outliers. In any case, if this tail was neglected, LSM-R outperformed the other methods.

In general, an attenuation effect was observed between the RAW and the JPEG pictures, caused by the processes implemented by the camera DSP. In view of these results, we concluded that fewer data fitted the regression model in the case of JPEG compared to RAW.

In the next section, the results using the gamma corrected pictures areshown.

3.4.3.3 Gamma corrected

Figure 3.11 depicts the distributions of the FG versus the BG quotients after all the JPEG pictures of the dataset were raised to the 2.2-th power to correct the gamma transformation of the camera DSP. Compared with the JPEG distributions, the maximum value of the data increased by about a factor of 4. There also seemed to be more outliers, probably due to the data magnification.

             LSM      LSM-R    PLSM
Red    β0    0.0449   0.0000   0.0248
       β1    0.2530   0.9501   0.4784
       β2    0.7249   0.0603   0.5157
Green  β0    0.0514  -0.0081   0.0466
       β1    0.4360   0.9501   0.4752
       β2    0.5569   0.0741   0.5211
Blue   β0    0.0447  -0.0093   0.0539
       β1    0.5375   0.9015   0.4709
       β2    0.4610   0.1241   0.5212

Table 3.6: Regressors estimates for the γ-JPEG pictures of the Terrace dataset.

Table 3.6 shows the regressor estimates. Again, the sum of the regressors was close to 1. Also note that the intercept term for LSM-R remained close to zero, but it increased by almost an order of magnitude for the other methods. Regarding the explanatory regions, region #1 had a greater weight (almost 9 times) in the three bands for LSM-R, while the weights were balanced for LSM and PLSM; the regression of the red band using LSM was an exception.


Figure 3.10: Residuals histogram per band ((a) LSM, (b) LSM-R, (c) PLSM) of the JPEG pictures of the Terrace dataset.


Figure 3.11: Distribution of $q^f$ vs $q^b_1$ (left graphs) and $q^b_2$ (right graphs) for the three colour bands, (a) Red, (b) Green and (c) Blue, using the γ-JPEG pictures of the Terrace dataset.



Figure 3.12: $\hat{q}^f$ vs $q^f$ for the γ-JPEG images of the Terrace dataset using (a) LSM and (b) LSM-R. The three colour bands are shown. The graphs for the rest of the regression methods can be found in Figure D.7. The black line is the identity line.

The reliability of the data fit can be observed in Figure 3.12. These graphs show that the outliers, i.e., those samples that deviate from the identity line, were not fitted by LSM-R. Thus, we expected LSM-R to fit the inliers better than the other methods, although this could not be noticed from this figure.

The statistics related to the red band are shown in Table 3.7. In general, MSE, SVR, MRC and MERC were larger compared with the non-corrected data (up to two orders of magnitude)10. Since these statistics depend on the squared value of the residuals, and the range of the quotients was larger, this seemed reasonable.

8 The results for the PLSM are not included because they were very similar to those observed for the LSM.
9 The statistics corresponding to the green and blue bands can be consulted in Table D.5 and Table D.6, respectively.
10 Note that the MSE, SVR and MERC multipliers in the table are three orders of magnitude larger.


                 LSM      LSM-R     PLSM
MSE  (×10−3)    10.141   25.333    10.822
MR   (×10−6)    -0.0     6217.2    0.0
SVR  (×10−6)    5.4      81.8      15.25
MRC  (×10−6)    344.8    -1831.2   201.2
MERC (×10−3)    16.1     159.9     13.9
R²              0.9992   0.9981    0.9992
t(β0)           19.864   0.004     10.627
t(β1)           22.388   53.199    40.987
t(β2)           69.132   3.636     47.607

Table 3.7: Regression statistics for the γ-JPEG pictures of the Terrace scene. Red band.

Nevertheless, the $R^2$ values were slightly lower; thus, the previous regressions seemed more reliable than this one.

Figure 3.13 depicts the histograms of the residuals per band and method. The negative bias, the positive skewness and the tail on the right remained similar.

Again, LSM-R concentrated most of the residuals close to the mode. Nevertheless, the magnitude of the errors was up to one order of magnitude larger than for the RAW and JPEG pictures, while the observed data only grew by up to a factor of four. Therefore, the γ-JPEG pictures fitted worse than the uncorrected ones.

Colour band    Mean (×10−5)    Variance (×10−11)
Red            3.58            11.95
Green          3.14            1.74
Blue           2.68            2.09

Table 3.8: Statistics of the normalised MSE obtained by running LSM over gamma-corrected images within an interval of γ ∈ [0.2, 1.0] for the three colour bands in the Terrace dataset.

It may be that the selected gamma was not the most suitable value. To analyse the dependence of the regression performance on gamma, we ran an LSM over several values of γ. For each value of gamma, we normalised the MSE of the regression by dividing it by the maximum value of the response data, to account for the change in the range of the data when they are raised to a power. Then, we estimated the mean and the variance of all of the normalised MSEs. The results per band are shown in Table 3.8. The low variance indicated that, on average, the regression did not depend on the gamma used to correct the JPEG images.
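A sketch of this sweep (the grid over [0.2, 1.0] follows the text; the inverse-gamma correction im ** (1/γ) and the build_quotients helper are assumptions reusing the earlier sketches):

```python
# Gamma sweep behind Table 3.8: normalised MSE of an LS fit per gamma value.
import numpy as np

def gamma_sweep(images, build_quotients, gammas=np.linspace(0.2, 1.0, 9)):
    norm_mses = []
    for g in gammas:
        corrected = [im ** (1.0 / g) for im in images]   # candidate correction
        Qb, qf = build_quotients(corrected)              # assumed helper
        beta, *_ = np.linalg.lstsq(Qb, qf, rcond=None)   # plain LSM fit
        res = qf - Qb @ beta
        df = len(qf) - Qb.shape[1]
        norm_mses.append((res @ res / df) / qf.max())    # MSE / max response
    return np.mean(norm_mses), np.var(norm_mses)
```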

Next, the results of the regressions performed on the MUCT dataset are presented and discussed.


Figure 3.13: Residuals histogram per band ((a) LSM, (b) LSM-R, (c) PLSM) of the γ-JPEG pictures of the Terrace dataset.


Figure 3.14: Face samples of the MUCT database, based on the MUCT webpage11.

3.4.4 MUCT dataset

Figure 3.14 shows some samples from the MUCT Face Database. It provides a set of face images in an indoor environment under several light conditions, taken from multiple camera views. The variation of the view involved a slight change in the surface normals. The BG region was composed of the first 115 lines of the images (Figure 3.15). The BG region was very close to the FG, although the light source was also close to the scene. This position would violate the far-source assumption (Equation 3.2); however, it did not seem to affect the results, as observed below.

Figure 3.15: Regions of the MUCT database. The BG region is within the red border rectangle. The FG region contains the face.

In this experiment, we did not evaluate PCAM and PLSM because only one explanatory region was used; therefore, collinearity could not occur. The NLSM results are not presented because the $\hat{\beta}_1$'s were positive, so it performed exactly as LSM.

11 MUCT Details, The MUCT Face Database (consulted 08/2014): http://www.milbo.org/muct/muct-details.html


After the results of the previous experiment, gamma correction was not applied to the pictures.

58 pictures of 10 persons were used under up to 3 different light settings, and 20 evenly spaced percentiles per picture were computed. Figure 3.16 depicts the quotient distributions per colour band. The linear trend was clear, although weaker than in the previous scene. Furthermore, the number of outliers was higher. As regards the colour band differences, the red band distribution was the most concentrated, whereas the blue band distribution was the most disperse, yielding more variability of the data in the blue band. Three point clouds were identified, being clearer in the green and blue bands. These clouds were likely caused by the discrete light settings.

LSM and LSM-R computed comparable regressors (Table 3.9). Similarly to the Terrace scene, the sum of the regressors was around 1; in other words, when no change occurred in the BG, the forecast was that the FG did not change either. In addition, the intercept term was greater than in the previous scene but still low. This result was coherent with the fact that these pictures were noisier than the former ones.

             LSM      LSM-R
Red    β0    0.1505   0.0790
       β1    0.8530   0.9265
Green  β0   -0.0846  -0.1124
       β1    1.0843   1.1053
Blue   β0   -0.1995  -0.2127
       β1    1.2031   1.2059

Table 3.9: Regressors estimates for the MUCT dataset.

The estimates of both methods were almost equal (Figure 3.17). Most of them lay around the identity line, although a high number of outliers were also observed. These were mainly located under the identity line ($\hat{q}^f < q^f$), but they were not noticeable in the histograms of the residuals (Figure 3.18).

Both the histograms of the residuals and the indicators (Table 3.10) show that the estimates were adequate: residuals centred around zero; low MSEs; high $R^2$'s (except in the red band) and t's; and Gaussian assumptions fulfilled. As regards the residuals, noticeable dissimilarities between colour bands were not observed. The similar performance of both methods led us to conclude that LSM-R did not detect many outliers in this dataset, although the slightly larger MRs and MERCs indicated that some were rejected.

Next, the results of the regressions performed over the Parking dataset are presented and discussed.

3.4.5 Parking dataset

The results for the Parking dataset (Section D.2) are divided according to the position of the FG, which can be located in three areas of the image (Figure 3.19).

Similarly to Section 3.4.4, only LSM and LSM-R were evaluated. All the images of the dataset were used. Note that there were up to 43 FGs (vehicles) per picture.


Figure 3.16: Distribution of $q^f$ vs $q^b_1$ for the three colour bands, (a) Red, (b) Green and (c) Blue, of the MUCT dataset.


                        LSM      LSM-R
Red
  MSE  (×10−3)         25.766   26.099
  MR   (×10−3)         0.0      -3.1
  SVR  (×10−6)         27.9     29.2
  MRC  (×10−6)         -646.8   -549.2
  MERC (×10−3)         -0.0     -2.2
  R²                   0.6288   0.6240
  t(β0) (×10³)         0.242    0.126
  t(β1) (×10³)         1.430    1.544
Green
  MSE  (×10−3)         25.660   25.744
  MR   (×10−3)         0.0      5.9
  SVR  (×10−6)         47.7     48.4
  MRC  (×10−6)         -199.1   -139.3
  MERC (×10−3)         -0.1     -1.3
  R²                   0.8394   0.8389
  t(β0) (×10³)         -0.179   -0.237
  t(β1) (×10³)         2.512    2.557
Blue
  MSE  (×10−3)         35.265   35.371
  MR   (×10−3)         0.0      10.2
  SVR  (×10−6)         89.1     89.6
  MRC  (×10−6)         -269.5   -259.4
  MERC (×10−3)         -0.2     -0.4
  R²                   0.8810   0.8807
  t(β0) (×10³)         -0.430   -0.457
  t(β1) (×10³)         2.991    2.993

Table 3.10: Regression statistics for the MUCT dataset.


Figure 3.17: $\hat{q}^f$ vs $q^f$ for the MUCT dataset using (a) LSM and (b) LSM-R. The three colour bands are shown. The black line is the identity line.

3.4.5.1 Location #1

Figure 3.20 depicts the distribution of the FG versus the BG quotients. A linear trend in the three bands was noticeable. Nevertheless, the point cloud was very disperse around the linearity. Furthermore, multiple straight lines escaped from the main trend:

• Multiple lines were defined by the form $q^b_1 = k$, with $k \in [0.5, 1]$ constant. While in the red and green bands at least four of these lines were identified, in the blue band there was only one. That is, there were some values of $q^b_1$ for which the intensity of the BG region diminished from the reference value while the FG variations could take any value. Thus, for these $q^b_1$'s no relation between BG and FG was established. Note that the line $q^b_1 = 1$ existed in the three bands; this is the case when the intensity of the BG region did not change.

• Also in the range $q^b_1 \in [0.5, 1]$, a line with a higher slope was noticeable.


Figure 3.18: Residuals histogram per band ((a) LSM, (b) LSM-R) of the MUCT dataset.


Figure 3.19: Regions and locations of the Parking scene. The BG region is the area within the red lines. FG regions belong to one of three different locations: locations #1, #2 and #3 are the areas at the top, in the middle and at the bottom, respectively.

• Also in this range, in the red and green bands there were several point clouds that could be approximated by straight lines with slopes close to zero. In the range $q^b_1 \in [1, 1.5]$, a zero-slope line could also fit other point clouds. That is, sometimes, while the BG quotients varied along these ranges, the FG quotients remained almost unaltered.

The shadows produced by the trees located behind the parking lot on the upper side (Figure C.3) could primarily explain this behaviour, whereas the bottom side of the BG did not suffer from these shadows. This led to odd photometric relations between the regions.

Although a linear relationship was hard to fit for several points, considered outliers, we analysed how the linear regression may explain the behaviour of the inliers.

             LSM      LSM-R
Red    β0    0.3923   0.3943
       β1    0.6604   0.6647
Green  β0    0.2211   0.1907
       β1    0.8183   0.8535
Blue   β0    0.2384   0.2062
       β1    0.7947   0.8285

Table 3.11: Regressors estimates for location #1 of the Parking dataset.

In Table 3.11 the regressor estimates for the three colour bands are shown for LSM and LSM-R. In both cases the slope was less than 1; i. e., for any variation of the BG, a smaller one was produced in the FG. Also note that, when $q^b_1 = 1$, $\hat{q}^f \approx 1$ in all the regressions. Thus, the FG remained unaltered when the BG did.

Figure 3.21 depicts the distribution of the fitted response data versus the observed data. Note that the three bands mostly showed a similar behaviour. In general, $\hat{q}^f$ rose along $q^f$ near the identity line. Nevertheless, a huge number of outliers lay far from it, for instance, the point clouds on the bottom-right side that corresponded to the lines $q^b_1 = k$ (k constant) in Figure 3.20.


Figure 3.20: Distribution of $q^f$ vs $q^b_1$ for the three colour bands, (a) Red, (b) Green and (c) Blue, of location #1 of the Parking dataset.


Figure 3.21: $\hat{q}^f$ vs $q^f$ for location #1 of the Parking dataset using (a) LSM and (b) LSM-R. The three colour bands are shown. The black line is the identity line.

Several points in the region $q^f \in [0.25, 1.5]$ over the identity line generated huge errors.

Comparing the methods, the LSM-R estimates were closer to the identity line than those given by LSM. Only the values greater than 2.5 in the red band were better fitted by LSM than by LSM-R. Actually, as Figure 3.20a shows, a straight line parallel to the main trend can be identified for $q^f > 2.5$; thus, LSM tended to fit those values more than LSM-R did.

In Table 3.12, the statistics of the regressions for the RGB bands are shown. The results were not comparable with the previous datasets. Considering the range of the quotients, both regressions produced large MSEs. As we expected, LSM-R produced slightly larger MSEs than LSM.

Furthermore, MR and MERC were at least three orders of magnitude larger in LSM-R than in LSM. The absolute values of MRC in LSM were more than four times larger in the green band, and more than two times larger in the blue band, compared with LSM-R; that is, there was less correlation of the residuals in LSM-R than in LSM. In the red band, however, this value was more than three times larger in LSM. This result was likely caused by the estimates at the upper values, as commented before.


                     LSM       LSM-R

Red
  MSE (×10⁻³)        93.654    93.734
  MR (×10⁻³)         -0.0      -8.2
  SVR (×10⁻⁶)        188.2     189.7
  MRC (×10⁻⁶)        689.9     694.4
  MERC (×10⁻³)       0.1       -1.6
  R2                 0.7775    0.7773
  t(β0) (×10³)       2.63      2.64
  t(β1) (×10³)       7.40      7.44

Green
  MSE (×10⁻³)        72.877    73.382
  MR (×10⁻³)         0.0       -14.0
  SVR (×10⁻⁶)        156.2     162.5
  MRC (×10⁻⁶)        139.8     149.6
  MERC (×10⁻³)       0.0       -4.4
  R2                 0.6950    0.6929
  t(β0) (×10³)       1.19      1.03
  t(β1) (×10³)       5.97      6.21

Blue
  MSE (×10⁻³)        53.653    53.991
  MR (×10⁻³)         0.0       -9.7
  SVR (×10⁻⁶)        84.5      88.1
  MRC (×10⁻⁶)        177.5     182.6
  MERC (×10⁻³)       0.0       -3.6
  R2                 0.7144    0.7126
  t(β0) (×10³)       1.42      1.23
  t(β1) (×10³)       6.26      6.51

Table 3.12: Regression statistics for location #1 of the Parking scene.


Nevertheless, all of these values were sufficiently low to confirm the Gaussian assumptions.

R2 values were less than 0.5 in all cases, with the blue band values being the worst. Therefore, the reliability of the estimates was weak. Nevertheless, the t statistics were unexpectedly high, providing very precise regressors; this was caused by the very low standard errors. The combination of low R2 with high t suggested that the linear regression was properly estimated while a huge amount of response data could not be explained by the QRMR.
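The interplay between R2, standard errors and t statistics can be made concrete with a small diagnostics routine. The following sketch uses textbook definitions; the SVR, MRC and MERC computations shown here (residual variance, lag-1 residual correlation, and mean residual–regressor covariance) are assumptions about what those indicators denote, and the exact normalisations defined in Section 3.4.2 may differ.

    import numpy as np

    def regression_diagnostics(X, y, beta):
        """Textbook OLS diagnostics; indicator names follow the text,
        but their exact definitions here are plausible readings only."""
        n, p = X.shape
        r = y - X @ beta                               # residuals
        mse = np.mean(r ** 2)                          # MSE
        mr = np.mean(r)                                # mean residual (bias)
        svr = np.var(r)                                # one reading of SVR
        mrc = np.mean(r[:-1] * r[1:])                  # serial-correlation proxy
        merc = np.mean([np.cov(r, X[:, j])[0, 1] for j in range(1, p)])
        r2 = 1.0 - (r @ r) / np.sum((y - y.mean()) ** 2)
        sigma2 = (r @ r) / (n - p)
        se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
        return dict(MSE=mse, MR=mr, SVR=svr, MRC=mrc, MERC=merc,
                    R2=r2, t=beta / se)                # t = estimate / std. error

    # Example with a synthetic single-regressor design matrix [1, qb1].
    rng = np.random.default_rng(1)
    X = np.column_stack([np.ones(400), rng.uniform(0.5, 4.0, 400)])
    y = X @ np.array([0.39, 0.66]) + rng.normal(0.0, 0.3, 400)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(regression_diagnostics(X, y, beta))

Note that t grows with the sample size through the standard error, which is how low R2 and very high t can coexist on large datasets, as observed above.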

Figure 3.22 and Figure 3.23 depict the histograms of the residuals per band for LSM and LSM-R, respectively. Most of the errors were gathered between [−0.5, 0.5], forming an identifiable main distribution. Although the main distribution had no bias, positive residuals produced long tails; these tails corresponded to the outliers above the identity line in Figure 3.21. In addition, the histograms of the residuals show a multimodal behaviour, more evident in the red band, caused by possible secondary straight lines. Comparing both regression methods, the main distribution was narrower in LSM-R than in LSM. This fact is especially noticeable in the green and blue bands. Thus, as we expected, the error of the inlier estimates was smaller in LSM-R than in LSM.
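A sketch of how such normalised histograms and the associated bias and tail indicators can be computed follows; the bin range, bin count and 0.5 tail threshold are assumptions chosen to mirror the plots.

    import numpy as np

    def residual_histogram(residuals, lo=-2.0, hi=2.0, bins=52):
        """Normalised residual histogram plus simple bias/tail checks."""
        counts, edges = np.histogram(residuals, bins=bins, range=(lo, hi))
        freq = counts / max(counts.sum(), 1)            # normalised frequency
        bias = float(np.mean(residuals))                # offset of the main mode
        tail = float(np.mean(np.abs(residuals) > 0.5))  # outlier tail mass
        return freq, edges, bias, tail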

3.4.5.2 Location #2

Location #2 belonged to the middle part of the scene and was surrounded by the BG. Furthermore, the objects that might occlude the sunlight (trees and buildings) were farther away than in the other locations. Thus, a priori, this location should provide the best fit.

Figure 3.24 depicts the distribution of the FG versus the BG quotients. There were multiple similarities with Figure 3.20, including the linear trend and the vertical lines close to qb1 = 1. However, in this location, at least three different and almost parallel lines were noticeable. This issue was clearer in the green and blue bands, whereas these lines were more confused and even showed different slopes in the red band.

Regarding the regressor estimates (Table D.7) and the plot of the fitted response data versus the observed ones (Figure D.9), nothing different from the previous location was inferred.

In Table 3.13, the statistics of the regressions are shown per band. Compared with the previous location, the MSE was reduced by up to a factor of six (in the blue band), and R2 exceeded 0.5, even reaching 0.6689 in the blue band using LSM-R. Thus, the QRMR was better fulfilled in this location.

The differences between the LSM and LSM-R residual histograms (Figure D.10 and Figure D.11) were barely noticeable (apart from a larger mode in LSM-R), and the distributions were narrower than in location #1. Furthermore, instead of multimodal distributions, a slight asymmetry of the main distribution and spurious peaks were perceived.

3.4.5.3 Location #3

In Figure 3.25, the distribution of qf versus qb1 is shown. The linear trend was evident. Unlike in the other locations, the vertical lines did not appear. Nevertheless, there were several point clouds that might fit straight lines; those with a slope close to zero also appeared in the other locations.

Regarding the regressor estimates (Table D.8), nothing remarkable was added.


Figure 3.22: Residuals histogram per band for location #1 of the Parking dataset, with LSM applied. The three colour bands are shown in separate graphs: (a) Red band; (b) Green band; (c) Blue band.


Figure 3.23: Residuals histogram per band for location #1 of the Parking dataset, with LSM-R applied. The three colour bands are shown in separate graphs: (a) Red band; (b) Green band; (c) Blue band.


Figure 3.24: Distribution of qf vs qb1 for the three bands of location #2 of the Parking dataset. Panels: (a) Red band; (b) Green band; (c) Blue band.


                     LSM       LSM-R

Red
  MSE (×10⁻³)        86.343    88.894
  MR (×10⁻³)         -0.0      40.0
  SVR (×10⁻⁶)        150.9     152.1
  MRC (×10⁻⁶)        -536.6    -423.5
  MERC (×10⁻³)       -0.0      -7.0
  R2                 0.5798    0.5673
  t(β0) (×10³)       2.19      1.35
  t(β1) (×10³)       6.47      6.94

Green
  MSE (×10⁻³)        45.404    46.547
  MR (×10⁻³)         -0.0      24.3
  SVR (×10⁻⁶)        39.0      41.9
  MRC (×10⁻⁶)        -425.3    -342.8
  MERC (×10⁻³)       -0.0      -3.9
  R2                 0.6227    0.6132
  t(β0) (×10³)       1.77      0.99
  t(β1) (×10³)       7.07      7.58

Blue
  MSE (×10⁻³)        30.764    31.251
  MR (×10⁻³)         0.0       15.0
  SVR (×10⁻⁶)        24.6      25.7
  MRC (×10⁻⁶)        -154.3    -95.5
  MERC (×10⁻³)       -0.0      -2.5
  R2                 0.6726    0.6674
  t(β0) (×10³)       1.79      1.16
  t(β1) (×10³)       7.89      8.33

Table 3.13: Regression statistics for location #2 of the Parking scene.


Figure 3.25: Distribution of qf vs qb1 for the three bands of location #3 of the Parking dataset. Panels: (a) Red band; (b) Green band; (c) Blue band.


Figure 3.26: q̂f vs qf for location #3 of the Parking dataset using (a) LSM and (b) LSM-R. The three colour bands are shown; the black line is the identity line.


The q̂f versus qf graphs (Figure 3.26) show a better fit than in the rest of the locations, as we expected from the relation between the response and explanatory data. Nevertheless, the presence of a huge number of outliers was still an inconvenience.

                     LSM       LSM-R

Red
  MSE (×10⁻³)        46.880    47.974
  MR (×10⁻³)         0.0       23.7
  SVR (×10⁻⁶)        4.8       3.9
  MRC (×10⁻⁶)        -266.6    -307.7
  MERC (×10⁻³)       -0.1      -5.7
  R2                 0.7702    0.7649
  t(β0) (×10³)       1.13      7.66
  t(β1) (×10³)       4.64      4.85

Green
  MSE (×10⁻³)        36.245    39.685
  MR (×10⁻³)         -0.0      45.3
  SVR (×10⁻⁶)        1.6       1.1
  MRC (×10⁻⁶)        -106.6    -149.5
  MERC (×10⁻³)       -0.1      -7.6
  R2                 0.7205    0.6940
  t(β0) (×10³)       1.44      0.73
  t(β1) (×10³)       4.07      4.36

Blue
  MSE (×10⁻³)        35.309    38.941
  MR (×10⁻³)         -0.0      49.2
  SVR (×10⁻⁶)        0.9       1.1
  MRC (×10⁻⁶)        23.3      -38.1
  MERC (×10⁻³)       -0.1      -6.8
  R2                 0.6873    0.6551
  t(β0) (×10³)       1.55      0.84
  t(β1) (×10³)       3.75      4.02

Table 3.14: Regression statistics for location #3 of the Parking scene.

The statistics for this location (Table 3.14) generally improved on those obtained in location #2. All the errors diminished, and only the t statistics of β1 were slightly lower.

The comments made for the histograms of the residuals in location #2 are still valid here (Figure D.12 and Figure D.13).

Next, the results extracted from the last dataset are presented.

3.4.6 MCDL dataset

The MCDL dataset (Section C.3) was composed of pictures taken in two scenes (Figure 3.27 and Figure 3.28). The link between these scenes is that they include the same people (FG regions). The results are divided by scene, denoted as camera #1 and camera #2, respectively, and by the location within each scene (Figure 3.29).


(a) Open doors. (b) Closed doors.

Figure 3.27: MCDL dataset samples. Scene #1.

(a) Light variation #1. (b) Light variation #2.

(c) Switching on the lights.

Figure 3.28: MCDL dataset samples. Scene #2.


We defined six explanatory regions for each location, following a similar layout. Methods that tackle collinearity (PCAM and PLSM) and its effects were also evaluated.

Some of the obtained graphs and tables follow a similar pattern and contribute analogous information. Thus, in this section, some of the results are shown only for the green band. Section D.3 gathers those related to the red and blue bands, besides several residuals histograms and q̂f vs qf plots, for further consultation.

All of the images of the dataset were used.


(a) Scene #1.

(b) Scene #2.

Figure 3.29: Regions and locations of the MCDL dataset. The subscripts indicate the location and the superscripts the region. The labels of location #1 of scene #2 are omitted. There are two kinds of surfaces: i) 1–3, made of white paint, and ii) 4–6, made of ceramic tiles. Both might be considered Lambertian. Their normals are virtually perpendicular.

3.4.6.1 Camera #1 - Location #1

Table 3.15 shows the correlation matrix of the explanatory data in this location for the green band. Except for R2–R3, R2–R4 and R3–R4, the correlations were low. We expected regions that are close and have similar orientations to have high correlations, but that was not the main trend here. Therefore, collinearity likely existed but did not much influence the results.

R6 produced anomalous correlation values, since negative correlations were not expected. Because its location was farther from the indoor lights and could not be reached by the outdoor light (Figure 3.29a), it was possible that the remaining regions increased their intensity whereas R6 did not, which resulted in negative correlations.
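Such a correlation matrix can be obtained directly from the per-frame quotients of the explanatory regions; a minimal sketch with synthetic data:

    import numpy as np

    # qb holds one column per explanatory BG region (its quotients over
    # the image sequence); np.corrcoef expects variables in rows.
    def region_correlations(qb):
        return np.corrcoef(qb.T)

    # Example with synthetic quotients for six regions over 200 frames.
    rng = np.random.default_rng(2)
    qb = rng.uniform(0.9, 1.1, size=(200, 6))
    print(np.round(region_correlations(qb), 4))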

Figure 3.30 depicts several plots that relate the response data to the data of each explanatory region per band. At first sight, we noticed that the data were mostly concentrated in very narrow vertical point clouds.


Figure 3.30: Distribution of qf vs qbi (1 ≤ i ≤ 6) for the three bands of location #1 of camera #1 of the MCDL dataset. Panels: (a) Red band; (b) Green band; (c) Blue band.


       R1        R2        R3        R4        R5        R6
R1     1         0.6800    0.6331    0.5081    0.0785   -0.0407
R2     0.6800    1         0.9628    0.8649    0.4266   -0.0475
R3     0.6331    0.9628    1         0.8652    0.4055   -0.0017
R4     0.5081    0.8649    0.8652    1         0.6513    0.2307
R5     0.0785    0.4266    0.4055    0.6513    1        -0.0922
R6    -0.0407   -0.0475   -0.0017    0.2307   -0.0922    1

Table 3.15: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #1 of camera #1 for the MCDL dataset. Green band.

Especially in BG regions #1 and #2, these clouds were less than 0.05 wide. In other words, the intensity barely changed in these regions while the response region quotients varied significantly (approximately between 0.7 and 1.1). To a lesser extent, the remaining BG regions showed a similar behaviour.

       LSM       LSM-R     NLSM      PCAM      PLSM
β0    -0.3841   -1.2174   -0.1950   -0.8753   -0.4445
β1    -0.9741   -3.8462    0.0000    0.0500   -0.8469
β2     2.8400    9.9104    0.3079    0.3008    0.9229
β3    -0.3601   -3.8358    0.2859    0.3418    0.4317
β4    -1.7204   -2.2843    0.1925    0.3560    0.1002
β5     0.8712    1.2389    0.3606    0.6154    0.4518
β6     0.7265    1.0343    0.0508    0.2120    0.3805

Table 3.16: Regressor estimates for the green band for location #1 of camera #1 of the MCDL dataset.

In Table 3.16, the regressor estimates for the green band are shown. They differed greatly from one method to another. Depending on the method, some regions prevailed over the others, although R2 stood out over the rest in LSM, LSM-R and PLSM.

Another relevant issue was the presence of negative regressors in all of the estimates. Because these negative values appeared for the same regions, they could be due to partial shadows or interreflections. On the contrary, PCAM and NLSM, which forced all the regressors of the explanatory regions to be positive, provided lower estimates than the rest. As we see later, this led to poorly significant regressors.
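For reference, the following sketch shows one way to obtain NLSM-, PCAM- and PLSM-style estimates with standard tools. The three-component counts are assumptions, and plain PCA regression does not by itself enforce the positivity discussed above, so these only approximate the variants used here.

    import numpy as np
    from scipy.optimize import nnls
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LinearRegression
    from sklearn.cross_decomposition import PLSRegression

    def fit_nlsm(X, y):
        """Non-negative LS on the region regressors; the intercept is left
        free by splitting it into +/- columns before calling nnls."""
        A = np.column_stack([np.ones(len(y)), -np.ones(len(y)), X])
        c, _ = nnls(A, y)
        return np.concatenate([[c[0] - c[1]], c[2:]])   # [b0, b1..b6]

    def fit_pcam(X, y, n_components=3):
        """PCA regression: OLS on the leading components, mapped back
        to the original regressors (no positivity constraint here)."""
        pca = PCA(n_components=n_components).fit(X)
        ols = LinearRegression().fit(pca.transform(X), y)
        b = pca.components_.T @ ols.coef_
        return np.concatenate([[ols.intercept_ - pca.mean_ @ b], b])

    def fit_plsm(X, y, n_components=3):
        """Partial least squares regression (PLS1)."""
        pls = PLSRegression(n_components=n_components).fit(X, y)
        b = pls.coef_.ravel()
        return np.concatenate([[y.mean() - X.mean(axis=0) @ b], b])

    # Synthetic example with six BG-region quotients over 300 frames.
    rng = np.random.default_rng(3)
    X = rng.uniform(0.9, 1.2, size=(300, 6))
    y = 0.1 + X @ np.array([0.2, 0.3, 0.1, 0.15, 0.1, 0.05]) \
        + rng.normal(0.0, 0.01, 300)
    for name, fit in (("NLSM", fit_nlsm), ("PCAM", fit_pcam), ("PLSM", fit_plsm)):
        print(name, np.round(fit(X, y), 3))

Projecting onto a few components, as PCAM and PLSM do, is precisely what trades some fit quality for robustness against the collinear regressors discussed above.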

The goodness of these estimates can be assessed from Figure 3.31. The results were far from perfect. The best approximation to the identity line came from LSM-R (Figure 3.31b), where a linear trend could barely be recognised, whereas the remaining methods showed a nearly random distribution of the fitted responses.

The poor performance was confirmed by the low values of R2 presented in Table 3.17. In addition, the negative value for LSM-R meant that the norm of the residuals was greater than the variance of the observed responses; in other words, our model fitted worse than a constant assumption equal to the mean of the observed responses.


Figure 3.31: q̂f vs qf for camera #1 and location #1 of the MCDL dataset. The three colour bands are shown; the black line is the identity line. Panels: (a) LSM; (b) LSM-R; (c) NLSM.


Figure 3.31 (continued): q̂f vs qf for camera #1 and location #1 of the MCDL dataset. Panels: (d) PCAM; (e) PLSM.


                LSM       LSM-R     NLSM      PCAM      PLSM
MSE (×10⁻³)     0.343     2.771     0.578     0.515     0.430
MR (×10⁻³)     -0.0      -8.9      -0.1       0.0       0.0
SVR (×10⁻⁶)     0.1       24.3      0.1       0.1       0.0
MRC (×10⁻⁶)    -42.2      72.1     -68.4     -79.5     -68.4
MERC (×10⁻³)    0.0      -0.3       0.1       0.0       0.0
R2              0.7317   -1.1659    0.5486    0.5978    0.6637
t(β0)          -7.66     -8.55     -3.00     -14.27    -7.92
t(β1)          -19.26    -26.76     0.00      0.81     -14.95
t(β2)           26.83     32.96     2.24      2.32      7.79
t(β3)          -4.25     -15.93     2.60      3.29      4.55
t(β4)          -12.91    -6.04      1.11      2.18      0.67
t(β5)           22.64     11.33     7.22      13.06     10.48
t(β6)           18.04     9.04      0.97      4.30      8.44

Table 3.17: Regression statistics for location #1 of camera #1 of the MCDL scene. Green band.

Nevertheless, the reason could be the high error of the outliers.

Although the r-squared values did not adequately explain the observed data, the MSEs were suitable and the t values were mostly sufficiently high to indicate that the regressors were significant. Especially in LSM-R, which provided the greatest t's, this suggested that the estimate could be suitable even though some response data could not be explained by the model. In this regard, the larger error statistics of LSM-R were explained by the outliers.

According to these statistics, the worst performance came from NLSM, which produced larger errors and t's close to zero. Therefore, the assumption of a positive linear relationship between response and explanatory regions lost value. In contrast, the best performance came from LSM.

Regarding the Gaussian assumptions, the zero-mean residuals and the homoscedasticity assumptions were fulfilled (low values of MR and SVR). The assumption of no serial correlation between residuals was also fulfilled, although the MRC values were larger than in the previous datasets. Nevertheless, the covariance with the explanatory regions could not be assumed to be zero¹², although it was still low. This means that the residuals contained significant information regarding the explanatory regions that was not included in the fitted response, and it could be difficult to isolate the influence of the explanatory regions and the residuals in the response. However, this is not critical for our purpose.

Figure 3.31 depicts the histograms of the residuals of LSM, LSM-R and PCAM. We observed that PLSM produced a histogram similar to LSM, and NLSM one similar to PCAM (Figure D.14). At first glance, no histogram was biased, except that of LSM-R. Furthermore, a slight positive skewness was present in all the histograms.

Although the best error statistics (Table 3.17) came from LSM, most of the residuals of LSM-R were gathered in the bin closest to zero.

12 Note that the exponent of MERC is −3, whereas in the previous datasets it was −6.


Figure 3.31: Residuals histogram per band of camera #1 and location #1 of the MCDL dataset¹³. Panels: (a) LSM; (b) LSM-R; (c) PCAM.


Thus, the errors of LSM-R were mostly smaller, but the fit of the outliers was worse.

In the following, we analyse the similarities and differences in the rest of the locations.

3.4.6.2 Camera #1 - Location #2

       R1        R2        R3        R4        R5        R6
R1     1         0.9695    0.2578    0.9118    0.6249   -0.8306
R2     0.9695    1         0.1877    0.9642    0.7566   -0.8824
R3     0.2578    0.1877    1         0.3024   -0.1517   -0.4613
R4     0.9118    0.9642    0.3024    1         0.7740   -0.9373
R5     0.6249    0.7566   -0.1517    0.7740    1        -0.6681
R6    -0.8306   -0.8824   -0.4613   -0.9373   -0.6681    1

Table 3.18: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #2 of camera #1 for the MCDL dataset. Green band.

The layout of the BG regions was similar in each scene. Nevertheless, compared with Table 3.15, the correlation matrix in location #2 (Table 3.18) had a different pattern. Only R6 seemed to have a similar behaviour, given the presence of negative correlations. However, in location #2, all of the correlations with R6 were negative and even higher in magnitude than in location #1. Likewise, R2–R4 maintained a high correlation, but this time R1–R2 had the highest linear dependence. These differences suggested that, although the layout of the regions was similar, the photometric variation functions strongly depended on the location.

In general, R2 and R4 held high correlations with the rest of the regions.

Regarding the relation between the response data and the explanatory regions, Figure 3.32 depicts narrower vertical distributions than in the previous location. In some cases, these distributions fitted two vertical lines, one of which was qbi = 1 (except in R3). In this scene, the predominant light variation occurred when the doors opened, so the BG region quotients took only two values. One of these values was 1 because it corresponded to the reference state, whereas the FG quotients took a wider range of values, probably caused by the change of orientation and location of the bodies.

The case of R5 meant that the intensity barely changed. All the plots include the point (1, 1) because it corresponds to the reference image. In addition, in region R3, the reference had an anomalous value, larger than in the rest of the dataset.

Regarding the regressor estimates (Table D.18), no clearly predominant region was found in any estimate.

Figure 3.33 depicts the fitted response versus the observed response for LSM, LSM-R and PCAM. The linearity of the data could be slightly intuited in LSM-R, whereas the PCAM plot depicts a constant behaviour and, therefore, an erroneous estimate.

Table 3.19 indicates that the fit in this location performed like the previous one regarding r-squared and the zero covariance with the explanatory regions.

13 We found a residual of value close to 6 in the three bands of LSM-R that was filtered out. It was likely the reason for the negative value of R2 and the larger error statistics in Table 3.17.


Figure 3.32: Distribution of qf vs qbi (1 ≤ i ≤ 6) for the three bands of location #2 of camera #1 of the MCDL dataset. Panels: (a) Red band; (b) Green band; (c) Blue band.


Figure 3.33: q̂f vs qf for camera #1 and location #2 of the MCDL dataset. The three colour bands are shown; the black line is the identity line. Panels: (a) LSM; (b) LSM-R; (c) PCAM.


                LSM       LSM-R     NLSM      PCAM      PLSM
MSE (×10⁻³)     0.674     2.608     1.839     1.867     0.861
MR (×10⁻³)     -0.0       6.9      -0.0      -0.0       0.0
SVR (×10⁻⁶)     1.4       14.2      1.9       2.2       1.3
MRC (×10⁻⁶)     43.3      131.6     298.8     277.7     59.4
MERC (×10⁻³)   -0.0       1.3      -0.2      -0.2      -0.0
R2              0.7678    0.1019    0.3667    0.3570    0.7036
t(β0)           28.83     12.90     0.99      6.29      23.77
t(β1)           6.65      0.90      0.09      1.45     -8.58
t(β2)          -12.70    -4.61      1.00      0.50     -1.02
t(β3)          -9.12     -23.14     4.73      1.51     -7.75
t(β4)          -7.63     -0.88      2.27      0.65      0.97
t(β5)           20.10     8.48      3.47      0.24      2.75
t(β6)          -43.25    -19.94     0.00     -3.40     -28.86

Table 3.19: Regression statistics for location #2 of camera #1 of the MCDL dataset. Green band.

Nevertheless, the MSE was slightly worse, except in LSM-R, owing to the smaller error of the outliers.

Regarding the histogram of the residuals (Figure D.16), the conclusions were comparable to those obtained in the previous location.

3.4.6.3 Camera #1 - Location #3

       R1        R2        R3        R4        R5        R6
R1     1         0.5767    0.5569    0.5228    0.5254   -0.4937
R2     0.5767    1         0.9848    0.9765    0.9675   -0.9628
R3     0.5569    0.9848    1         0.9594    0.9397   -0.9154
R4     0.5228    0.9765    0.9594    1         0.9687   -0.9603
R5     0.5254    0.9675    0.9397    0.9687    1        -0.9619
R6    -0.4937   -0.9628   -0.9154   -0.9603   -0.9619    1

Table 3.20: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #3 of camera #1 for the MCDL dataset. Green band.

In location #3, Table 3.20 shows high correlations between the regions, except for R1. Thus, multicollinearity effects were very likely.

The distribution of the quotients (Figure D.18) was similar to that obtained previously. Here, R1 was the region that fitted the vertical line qb1 = 1. In this case, the radiation in this region was mostly influenced by a light bulb that was always on (Figure C.5 and Figure 3.29a), which explained the invariance of the intensity in this region.

The regressor estimates (Table 3.21) gave one region (R2) or two regions (R3) predominance over the rest only in LSM and LSM-R, respectively. Nevertheless, PCAM and PLSM generated intercept terms much higher than the rest.


       LSM       LSM-R     NLSM      PCAM      PLSM
β0     0.9563    0.4452    0.1184    0.9018    0.8980
β1     0.5152    0.6890    0.2473    0.0066    0.0074
β2    -1.4433    2.0422    0.0992    0.0430    0.0422
β3     0.3964   -2.3050    0.0000    0.0447    0.0438
β4     0.3295   -0.4084    0.2037    0.0612    0.0621
β5     0.3997    0.6347    0.2835    0.0859    0.0886
β6    -0.1539   -0.0977    0.0478   -0.1437   -0.1423

Table 3.21: Regressor estimates for the green band for location #3 of camera #1 of the MCDL dataset.

Therefore, these two methods generated low regressors, which did not properly fit the data (Figure D.19). The relation between the fitted response and the observed response was comparable to that observed in location #2.

                LSM       LSM-R     NLSM      PCAM      PLSM
MSE (×10⁻³)     1.182     1.978     1.268     1.296     1.295
MR (×10⁻³)     -0.0       1.6       0.0      -0.0      -0.0
SVR (×10⁻⁶)     0.5       2.4       0.5       0.3       0.3
MRC (×10⁻⁶)     347.7     674.8     235.7     258.3     258.4
MERC (×10⁻³)   -0.1       0.1      -0.1      -0.0      -0.0
R2              0.6057    0.3404    0.5773    0.5679    0.5683
t(β0)           8.01      2.88      0.96      7.21      7.18
t(β1)           6.51      6.73      3.02      0.08      0.09
t(β2)          -6.17      6.75      0.41      0.18      0.17
t(β3)           2.84     -12.77     0.00      0.31      0.30
t(β4)           5.19     -4.97      3.10      0.92      0.93
t(β5)           10.23     12.56     7.01      2.10      2.17
t(β6)          -4.79     -2.35      1.44     -4.28     -4.24

Table 3.22: Regression statistics for location #3 of camera #1 of the MCDL dataset. Green band.

The statistics of the regressions (Table 3.22) show the poor performance of LSM-R: greater MSE, MR, SVR and MRC, and very low R2. Compared with the previous location, R2 was more than 0.1 greater in LSM-R, NLSM and PCAM, and more than 0.1 lower in LSM and PLSM. Also note that the t values of β0, β2, β3 and β6 were much greater than the rest in the mentioned methods, which strengthened the evidence of multicollinearity; even PCAM and PLSM produced this unevenness.

Regarding the Gaussian assumptions, although MRC was slightly larger compared with location #2, the values were still low enough to fulfil them.

According to the statistics, LSM arose as the best method. This was confirmed by the residuals histograms (Figure 3.34), although the differences with LSM-R and PLSM were minor.


Figure 3.34: Residuals histogram per band of camera #1 and location #3 of the MCDL dataset. Panels: (a) LSM; (b) LSM-R; (c) PLSM.


3.4.6.4 Camera #2 - Location #1

       R1        R2        R3        R4        R5        R6
R1     1        -0.2634   -0.4596   -0.1682   -0.1043    0.8309
R2    -0.2634    1         0.9555    0.8164   -0.1400    0.1086
R3    -0.4596    0.9555    1         0.8511   -0.0970   -0.1109
R4    -0.1682    0.8164    0.8511    1        -0.1163   -0.0300
R5    -0.1043   -0.1400   -0.0970   -0.1163    1         0.0283
R6     0.8309    0.1086   -0.1109   -0.0300    0.0283    1

Table 3.23: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #1 of camera #2 for the MCDL dataset. Green band.

Table 3.23 shows the correlation matrix of the explanatory data in location #1 of camera #2. High correlations occurred between R1–R6, R2–R3, R2–R4 and R3–R4; thus, multicollinearity effects were expected. Furthermore, regions #1 and #5 had a weak inverse correlation with the others, perhaps because these regions were not much influenced by the outdoor light whereas the others were (Figure C.8b). This was also concluded from the distribution of the quotients (Figure 3.35).

Except for region #5, in the three bands the regions showed a wider vertical point cloud where the indoor and outdoor light changes were mixed, whereas region #5 produced a random distribution.

Similarly to scene #1, a linear trend other than the vertical one was not identified in any region. Nevertheless, as we have seen before, this was not a sufficient condition to reject the multiple linear relationship.

       LSM       LSM-R     NLSM      PCAM      PLSM
β0     0.0012    0.1482   -0.0096    0.1592    0.2529
β1    -0.4426   -0.5942    0.0030   -0.0744   -0.1362
β2     0.9443    1.4700    0.2997    0.2499    0.2680
β3    -0.1335   -0.5881    0.3791    0.3506    0.3650
β4     0.2029    0.1346    0.2202    0.3283    0.2728
β5    -0.0516   -0.0798    0.0000   -0.0360   -0.0372
β6     0.4809    0.5093    0.0984    0.0204    0.0166

Table 3.24: Regressor estimates for the green band for location #1 of camera #2 of the MCDL dataset.

The regressor estimates (Table 3.24) were mainly controlled by region #2 in LSM and LSM-R. The latter generated regressors similar to the former; thus, LSM-R did not classify many outliers in this experiment.

LSM-R still produced the best fit (Figure 3.36). The error was small for qf < 1 for every method, but for larger values, the fitted response drifted away from the observed response.

Table 3.25 shows the regression statistics. The MSEs were low and the Gaussian assumptions were fulfilled, as in scene #1. As regards the r-squared values, they were higher than 0.7 for LSM, LSM-R and PLSM, which means that the reliability of the fit was high.


Figure 3.35: Distribution of qf vs qbi (1 ≤ i ≤ 6) for the three bands of location #1 of camera #2 of the MCDL dataset. Panels: (a) Red band; (b) Green band; (c) Blue band.


Figure 3.36: q̂f vs qf for camera #2 and location #1 of the MCDL dataset. The three colour bands are shown; the black line is the identity line. Panels: (a) LSM; (b) LSM-R; (c) PLSM.


                LSM       LSM-R     NLSM      PCAM      PLSM
MSE (×10⁻³)     2.015     2.204     2.438     2.450     2.398
MR (×10⁻³)     -0.0      -0.5      -0.5       0.0       0.0
SVR (×10⁻⁶)     1.1       2.6       1.2       0.6       0.4
MRC (×10⁻⁶)     49.4      59.3      6.7       60.6      132.0
MERC (×10⁻³)   -0.0       0.4      -0.2      -0.0       0.0
R2              0.7483    0.7247    0.6954    0.6939    0.7004
t(β0)           0.04      5.03     -0.31      5.12      8.22
t(β1)          -6.99     -8.97      0.04     -1.06     -1.97
t(β2)           10.10     15.04     2.92      2.43      2.63
t(β3)          -1.38     -5.80      3.56      3.28      3.45
t(β4)           4.22      2.68      4.17      6.20      5.20
t(β5)          -7.84     -11.61     0.00     -4.96     -5.19
t(β6)           5.80      5.88      1.08      0.22      0.18

Table 3.25: Regression statistics for location #1 of camera #2 of the MCDL dataset. Green band.

The t statistic of the regressor of R2 was significantly higher than the rest in LSM and LSM-R, which highlighted the significance of this region but also evidenced multicollinearity. Fortunately, this time PCAM and PLSM seemed to correct this effect, because their t statistics were more or less balanced. The residuals histograms (Figure D.23) were very similar for all methods. At first sight, they were unbiased; therefore, the regressions were suitable.

3.4.6.5 Camera #2 - Location #2

       R1        R2        R3        R4        R5        R6
R1     1         0.6878    0.6921    0.7697    0.4716    0.9720
R2     0.6878    1         0.9981    0.9819    0.5666    0.5956
R3     0.6921    0.9981    1         0.9826    0.5891    0.6046
R4     0.7697    0.9819    0.9826    1         0.5943    0.7008
R5     0.4716    0.5666    0.5891    0.5943    1         0.5022
R6     0.9720    0.5956    0.6046    0.7008    0.5022    1

Table 3.26: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #2 of camera #2 for the MCDL dataset. Green band.

In location #2, the anti-correlation disappeared (Table 3.26). Nevertheless, the large correlations remained; only the correlations with R5 stayed below 0.6. In this location, a linear trend was identified between qf and qbi, except for region #5, which continued showing a random distribution (Figure D.25).

This random behaviour did not provide suitable information about the FG, as Table 3.27 revealed. Thus, the values of β5 were almost irrelevant compared with the remaining values. In this location, although β3 was larger in absolute value than the others for LSM and LSM-R, the ranges of the values of the other estimates were comparable.


Figure 3.37: q̂f vs qf for camera #2 and location #2 of the MCDL dataset. The three colour bands are shown; the black line is the identity line. Panels: (a) LSM; (b) LSM-R; (c) PLSM.


       LSM       LSM-R     NLSM      PCAM      PLSM
β0     0.3987    0.2921    0.0776    0.2310    0.2967
β1    -0.8946   -0.7804    0.1348    0.0835    0.0342
β2    -0.2910    0.2775    0.2398    0.2499    0.2706
β3     1.6186    1.0699    0.2393    0.2653    0.2946
β4    -0.6149   -0.5500    0.1796    0.2379    0.2248
β5    -0.1847   -0.1415    0.0000   -0.1335   -0.1369
β6     0.9666    0.8324    0.1228    0.0594    0.0114

Table 3.27: Regressor estimates for the green band for location #2 of camera #2 of the MCDL dataset.

Figure 3.37 depicts the performance of the LSM, LSM-R and PLSM estimates. Although the errors were considerable, the LSM-R estimates stayed closer to the identity line along the whole range of data than the rest.

                LSM       LSM-R     NLSM      PCAM      PLSM
MSE (×10⁻³)     0.441     0.485     0.836     0.643     0.597
MR (×10⁻³)     -0.0       2.5       0.0       0.0       0.0
SVR (×10⁻⁶)     0.2       0.3       0.6       0.2       0.2
MRC (×10⁻⁶)     57.7      54.2      104.3     68.0      44.7
MERC (×10⁻³)   -0.0      -0.2      -0.4      -0.1      -0.1
R2              0.9014    0.8918    0.8133    0.8563    0.8666
t(β0)           19.62     13.72     2.77      9.42      12.56
t(β1)          -12.13    -10.10     1.33      0.94      0.40
t(β2)          -2.42      2.21      1.45      1.73      1.94
t(β3)           15.72     9.92      1.69      2.13      2.46
t(β4)          -11.71    -9.99      2.48      3.75      3.68
t(β5)          -24.55    -17.95     0.00     -14.70    -15.65
t(β6)           13.04     10.71     1.20      0.66      0.13

Table 3.28: Regression statistics for location #2 of camera #2 of the MCDL dataset. Green band.

The statistics of the regressions (Table 3.28) showed an improvement compared with location #1 in every respect for LSM, LSM-R and PLSM:

1. All the statistics related to errors diminished, except MRC.

2. R2 was larger than 0.80.

3. The t statistics were large and balanced, except the one related to region #5, as expected.

Therefore, the Gaussian assumptions were fulfilled and multicollinearity could be neglected.

Regarding the histograms of the residuals (Figure 3.38), they were unbiased.


Figure 3.38: Residuals histogram per band of camera #2 and location #2 of the MCDL dataset. Panels: (a) LSM; (b) PLSM.


3.4.6.6 Camera #2 - Location #3

       R1        R2        R3        R4        R5        R6
R1     1         0.9770    0.9624    0.8320    0.8821    0.8857
R2     0.9770    1         0.9689    0.8201    0.8213    0.7963
R3     0.9624    0.9689    1         0.8081    0.8055    0.8013
R4     0.8320    0.8201    0.8081    1         0.7586    0.7048
R5     0.8821    0.8213    0.8055    0.7586    1         0.9086
R6     0.8857    0.7963    0.8013    0.7048    0.9086    1

Table 3.29: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #3 of camera #2 for the MCDL dataset. Green band.

The regions of location #3 were highly correlated (Table 3.29). The lowest value was 0.7048, between R4 and R6; the rest were larger than 0.75. Unlike in the previous locations, region #5 was also highly correlated with the others.

Regarding the distribution of the quotients (Figure 3.39), they were more dispersed than in the previous locations, but the linear trend was clear even in R5.

The values of the regressors remained similar to those in location #2 (Table D.43). The fitted responses versus the observed responses were also similar to those observed in location #2 (Figure D.28).

                LSM       LSM-R     NLSM      PCAM      PLSM
MSE (×10⁻³)     1.231     1.248     2.091     2.469     2.065
MR (×10⁻³)     -0.0       0.0      -0.1      -0.0       0.0
SVR (×10⁻⁶)     1.1       1.1       1.2       1.5       1.3
MRC (×10⁻⁶)    -173.3    -162.9    -12.6      25.1     -25.2
MERC (×10⁻³)    0.1       0.0      -0.0      -0.0      -0.0
R2              0.8832    0.8816    0.8016    0.7657    0.8040
t(β0)          -18.85    -15.63    -5.56     -5.07     -8.47
t(β1)           26.88     25.82     2.23      1.16      2.97
t(β2)           2.15      3.08      5.18      2.51      6.35
t(β3)          -27.38    -26.86     0.00      5.22     -2.45
t(β4)          -3.97     -3.11      3.22      2.48     -6.25
t(β5)           10.67     13.65     7.74      6.06      12.01
t(β6)          -18.28    -20.70     0.30      2.01      0.98

Table 3.30: Regression statistics for location #3 of camera #2 of the MCDL dataset. Green band.

Table 3.30 shows the regression statistics. The MSE was lower than in location #1 but higher than in location #2. The R2 values were slightly lower than in location #2 and still larger than in location #1.

In these methods, the behaviour of the t statistics was similar to that of location #2; thus, multicollinearity was present.

Comparing the residuals histograms (Figure D.30) with the previous ones, nothing else was noteworthy.


Figure 3.39: Distribution of qf vs qbi (1 ≤ i ≤ 6) for the three bands of location #3 of camera #2 of the MCDL dataset. Panels: (a) Red band; (b) Green band; (c) Blue band.


In the following, a general discussion considering all the experiments is presented.

3.4.7 Results discussion

On the whole, the regressions fulfilled the Gaussian assumptions according to the indicators defined in Section 3.4.2. Thus, the estimates were considered valid solutions to the linear approach. Nevertheless, the experiments did not guarantee that the QRMR we propose is valid for the tested datasets. Multiple situations (non-Lambertian surfaces, shadows, occlusions, surfaces with multiple normals, and interreflections) produced deviations from the linear relationship between FG and BG regions. Despite this huge number of outliers, linear trends were usually identified in the data.

In the Terrace dataset, the results with RAW images outperformed those with processed images considering the distributions of the residuals. Furthermore, the analysis of the results using gamma-corrected images did not outperform those with non-corrected images; thus, there was no evidence that gamma correction is worthwhile. Regardless, the assessment demonstrated that Hypothesis 3.1 was accomplished in simple scenes in which neither occlusions nor moving objects were present. This held even though non-linear algorithms were used, although we must also consider that the quality of the images of this dataset (resolution, camera sensor, lens, compression level and dynamic range) was high.

The experiments performed with the MUCT dataset showed that Hypothesis 3.1 was fulfilled when the premises of no occlusions and no interreflections held. The experiments also demonstrated that the QRMR may still be valid even though the light source is close to the regions and the surfaces are not flat.

The evaluation of the Parking dataset highlighted that large shadows, dissimilarities in the surface orientations, and FG regions composed of materials with different photometric properties produce anomalous point clouds in the qf versus qb plots. Some of these clouds highlighted situations in which the intensity of the FG changed while the intensity of the BG remained invariant; others fitted straight lines different from the main trend. Despite these phenomena, we were capable of establishing a suitable linear regression for this scene.

The assessment of the MCDL dataset emphasised that a suitable QRMR with multiple regions might be determined even in photometrically complex scenes if occlusions are avoided. Nevertheless, we think that collecting more data with different photometric conditions would provide an improved fit.

Not all regression methods performed equally. According to the statistics, LSM outperformed the rest in most of the indicators and scenes. NLSM did not provide the best results; consequently, we discard the initial assumption of a positive photometric relation between regions in complex scenes. Furthermore, PCAM and PLSM did not show total effectiveness against multicollinearity. Determining better regression methods that tackle multicollinearity remains an open issue; but whether multicollinearity is actually a problem is something we should analyse when the regression is used for forecasting the FG variations. This issue is addressed in Chapter 4. In spite of the better performance of LSM, in some scenes that were very prone to outliers (such as the Parking dataset), LSM-R became the best alternative. This was demonstrated by the residuals of the inliers, which were gathered in narrower histogram distributions.


Furthermore, the fitted response was closer to the identity line in the q̂f versus qf plots than in LSM.

Regarding the dissimilarities between colour bands, we demonstrated that the optimal regressors differ with the colour. However, the behaviour pattern of the bands was mostly comparable. Only small differences in the red band of the RAW pictures were perceptible. Furthermore, we observed almost unnoticeable differences in the fit of the data of the blue band. Based on our experiments, the goodness of the estimates does not depend on the colour band.

Although we found large values of R2, this does not mean that the prediction is suitable for new data. This is something that we also test in the photometric correction experiments (Chapter 4).

To conclude, the QRMR was generally solid for a large part of the data. However, very small variations of the BG regions (qb ≈ 1) are usually problematic. Thus, our model seems to work better when the change in the radiation on the BG regions is considerable.

3.5 conclusions

Throughout this chapter, the photometric relation between regions under light changes has been examined. We established Hypothesis 3.1, which models this relation as linear, and used it to formulate a multiple linear regression framework that implements the model under certain assumptions (Section 3.3). When these assumptions were fulfilled, as in the Terrace and MUCT datasets, the hypothesis held. Otherwise, we found that the errors of the estimates can indeed be large.

Moreover, the theoretical analysis indicated that the QRMR cannot be valid under every circumstance. On the one hand, the results confirmed that the provided solution to the regression problem is the maximum likelihood estimate of w for the dataset used. Nevertheless, the actual value of this vector is not constant. Fortunately, we identified these situations and verified that the estimate is feasible for most of the analysed cases.

Furthermore, of the three strategies identified in Section 3.3, we demonstrated that the positive regressors strategy (Section 3.3.3) is not necessary, and that multicollinearity, although it exists, does not seem to be a problem. Moreover, methods that reject the outliers proved helpful.

In the following chapter, we use the QRMR to hold a constant photometric response of the FG objects in several situations in which the light changes and the objects are unknown. We also evaluate the regression with test images that are not included in the estimation.

4 PHOTOMETRIC CORRECTION IN NON-OVERLAPPING MULTI-CAMERA ARCHITECTURES

The representation¹ of objects in images varies when the photometric conditions change. As we identified with the dynamic IFM (Section 2.4), these variations are motivated by changes of: i) the light source(s); ii) the geometry of the objects and the scene; and iii) the camera.

In several Computer Vision approaches such as object recognition, tracking, stereo reconstruction, and so on, it is important that the colours recorded by a device are constant across a change in these conditions (Section 1.1). The aim of this chapter is to shed further light on the nature of those techniques that are capable of holding a colour understanding of the objects even when those situations occur. Furthermore, we propose a blind correction method that does not require prior knowledge of either the light sources, the cameras, or the target objects.

Given a colour image Bo(t0) of the surface of a target object O at time t0, we state that any photometric variation can be corrected such that any other image Bo(t), ∀t ≠ t0, is comparable to the previous image, thereby making it easier to retrieve object O. That is:

B̂o(t) = Γ(Bo(t))  |  B̂o(t) = Bo(t0) + ε,  ∀t ∈ ℝ,  ε → 0   (4.1)

where Γ(·) is a correction function ℝ³ → ℝ³ that transforms the intensity values into photometrically invariant values B̂o. The optimal correction method is the one that makes ε = 0 for any O and any photometric variation.
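In the relative setting developed later (Section 4.2), Γ(·) reduces to a per-band gain once the intensity quotient of the FG with respect to the reference is predicted. A minimal sketch, assuming q_hat is the 3-vector of predicted FG quotients (one per colour band):

    import numpy as np

    def correct_region(B, q_hat):
        """Relative per-band correction of an RGB region B (H x W x 3):
        dividing each band by its predicted intensity quotient maps the
        region back towards its reference appearance."""
        out = B.astype(np.float64) / q_hat[None, None, :]
        return np.clip(out, 0.0, 255.0)

    # Example: a region 10% brighter than the reference in every band.
    B = np.full((4, 4, 3), 110.0)
    print(correct_region(B, np.array([1.1, 1.1, 1.1]))[0, 0])  # -> [100. 100. 100.]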

We consider conditions with multiple illumination sources, each with a different colour and a space-time-variant intensity. In addition, we consider that the images are captured by multiple cameras with no overlap between their FoVs. These cameras usually have different CRFs and no common point to compare with, which makes it harder to determine Γ(·) for images of unknown objects taken by several cameras.

The remainder of the chapter is organised as follows. In Section 4.1, the existing approaches related to colour correction are described. The section includes methods both for standalone cameras and for multi-camera architectures. Later, in Section 4.2, we propose a correction method based on a trained QRMR (Chapter 3) that relies on observed variations of the BG to correct the unknown FG objects in a single camera. In Section 4.3, we extend the method to deal with non-overlapping cameras. The entire correction is therefore composed of a double mapping: the first mapping corrects the local illumination variations within a camera, while the second mapping compensates the photometric changes between images acquired by different cameras. The method is evaluated in Section 4.4 in three experiments. The assessment is done using the Parking, MUCT and MCDL datasets. The experiments consist of i) correcting all the FG regions using the proposed method, ii) performing a re-identification algorithm using the corrected images and other SoA methods to compare with, and iii) evaluating the distances between the objects one-to-one by extracting some performance indicators. The conditions in which the re-identification using our approach outperforms the others are

1 This chapter is mainly based on [110].


also analysed. Furthermore, the robustness of the method when segmentation errors occur is also evaluated in the multi-camera approach. Finally, in Section 4.5, some conclusions are drawn.

4.1 state of the art

Figure 4.1: Types of correction methods. Photometric correction methods divide into in-camera approaches (lens control, sensor control, and DSP) and post-processing approaches (standalone methods and application-dependant methods). The italics in the original diagram indicate that the algorithms running within the DSP can also be implemented as a standalone method.

Most of the techniques that handle the photometric variations in images fall into the colour constancy algorithms already mentioned in Section 2.3. Two types of approaches cope with it, depending on the required knowledge about the illumination: i) absolute, and ii) relative approaches. The former determine the scene illumination, whereby the BRDF of the surfaces can be recovered; the absolute techniques usually have high computational cost and require calibration stages. The latter estimate the illumination variation with respect to a reference situation; the relative techniques usually require less prior information from the scene, but they cannot determine the photometric response of the surfaces. Although we go into depth on both approaches, our proposal is based on the latter, because the estimation of the BRDF is not necessary for this thesis and low-cost algorithms are preferred.

The possibilities to achieve invariance in the colour are also multiple. Figure 4.1 depicts a diagram that includes the types of solutions that address this issue. Taking advantage of the processing capabilities of current cameras, several operations can be implemented within them. As we saw in Chapter 2, the elements of the camera affect the photometric behaviour of the images. Thus, some algorithms control these elements to compensate the illumination variations; these elements are mainly the aperture of the lens, the exposure time of the shutter, and the gain of the sensor amplifier. Nevertheless, with multiple light sources, each with a different spatial effect, such approaches are unable to correct effectively at every position within an image because they do not allow for location-dependent corrections. Regardless, the control algorithms are investigated in Chapter 5 for another purpose. The other element that can modify the photometric response is the DSP; the algorithms performed within the DSP can also run in an external device.

This external device is usually a computer that performs post-processing on the images provided by the camera. Two strategies can be distinguished: i) standalone, and ii) application-dependant approaches.


The latter assume that the actual Computer Vision application handles the changing imaging conditions. For example, in tracking applications, illumination-independent descriptors [112] may be defined as trackers [69]. Nevertheless, because this thesis does not focus on the final Computer Vision application, these methods are not addressed here.

The former correct the images using image processing after acquisition but before applying the final algorithm. One of the advantages of the standalone approaches is that they usually work for every subsequent algorithm. Likewise, these methods may be used on their own or in combination with the previously explained methods. The algorithms are mainly based on knowledge about image formation and surface reflectance models (Chapter 3). Their aim is to correlate the image intensities with the surface reflectance. Nevertheless, this is an underconstrained problem; to determine a proper solution, several assumptions must be made, which makes the methods very restricted. Finlayson et al. [29] (1994) established the basis of the first approaches to illumination estimation by defining the DMT. Afterwards, in 2001, these authors proposed [30] a novel algorithm and a probabilistic framework for illumination estimation. This framework could also describe well-known methods, such as grey world and gamut mapping. To face the underconstraint, the solution required knowledge of a set of possible illuminants plus a calibration stage. For further reading, Gijsenij et al. [35] provide a whole survey of these algorithms.

When the correction methods consist of simple computations, such as Automatic White Balance (AWB) techniques [16, 49], they are implemented within the DSP [96] (Section 2.3). The most accurate methods have a high computational cost, such as the wavelet approaches that Cao et al. [13] (2012) and Baradarani et al. [4] (2013) proposed to estimate an illuminant-invariant image for facial recognition. Furthermore, even though several algorithms attempted to deal with real images, their accuracy [5] -- or their application to complex scenes [114, 49, 87] -- was not sufficiently demonstrated.

Other research lines insisted on modelling the CRF (Section 2.2.2). Most of these approaches required several images of the same scene acquired under different exposures. Using these models, several correction algorithms may be applied by comparing current pixel intensities to reference or previous values. For example, Withagen et al. [116] (2010) developed a simple and fast correction method by presenting different ways to estimate a global intensity correction factor to correct the input images. Sayed and Delva [90] (2011) complemented this method; they added a local correction using the mean and standard deviation variation, values that conventional object recognition applications use to normalise the image intensity to maximise the discriminative power [11]. Nevertheless, this calculation was very sensitive to noise and introduced blurring artifacts in the image. None of the commented methods estimating the CRF considered colour images. Inspired by the work of Grossberg and Nayar [40] (2004), Parameswaran et al. [78] (2010) implemented an illumination compensation method based on order consistencies that used colour information of textured background areas of the image. This method mainly compensated global illumination changes by estimating a costly illumination transfer function for each image; when the illumination change was local, several functions were estimated, which increased the computational cost even more. Recently, Lee et al. [64] (2013) estimated the CRF using the colour mixtures observed "around image edges where two different radiances are mixed together due to the limited spatial resolution


of the image array or temporal motion blur", establishing an affine transformation between irradiance and image intensity values; whereas Lin et al. [67] (2011) used non-linear methods for this estimation. Lin et al. demonstrated that non-linearities were introduced in the internal colour transformations of the camera, caused by colour saturation produced by the limited gamut mapping. In the same line, Arandjelovic [2] (2012) accounted for the gamma correction of the camera. Although these non-linearities actually exist, their effects can be neglected in some applications (Section 3.4.7).

Besides colour intensities, additional information was used in other correction algorithms. Algorithms based on the retinex theory [63] exploited the fact that the low spatial-frequency content of an image is mostly caused by illumination changes. These methods are mainly useful for obtaining the direction of the predominant illuminant [16]. In addition, based on the ray projection theory, Zhang and Chu [122] (2011) established an affine transformation to compensate the illumination changes.

The classical colour constancy algorithms and CRFs were the basis for the first designs of methods that correct photometric differences in non-overlapped multi-camera architectures, especially focused on person re-identification. Basically, the Inter Camera Colour Response (ICCR) estimations, initially proposed by Porikli [83] (2003), were based on the assumption that the appearance of an object under distinct illumination conditions was the same when the respective cumulative histograms were equal. A first approach by Javed et al. [54] (2005) used a supervised method performing a probabilistic PCA on the normalised bin histograms of the objects to obtain the Brightness Transfer Functions (BTFs) between cameras. Several works derived from Javed et al.'s. They mainly focused on: making the method more robust against illumination changes by using several frames of the same person for the estimate, as in the work of Madden et al. [72] (2007); defining improved BTFs [86, 22]; or implementing non-supervised methods [37, 95]. Newer approaches also included spatio-temporal relationships among cameras [14, 15, 70] or silhouettes, as Aziz et al. [3] (2011) proposed. However, ICCR techniques required complex calibration stages or, in the non-supervised methods, a deeper knowledge of the moving objects. This learning process involved the inclusion of additional algorithms, such as segmentation and tracking algorithms, adding possible error sources and also increasing the complexity of the system. Furthermore, the ICCR and its variations only allowed correcting the images in a global way. Although other algorithms applied local correction [18], they failed to meet real-time constraints because the computational requirements of each correction estimate, which included covariance matrices, were high.

Sometimes an estimate of the colour variation or of the illuminant is not required. Several methods performed an image enhancement only for visualisation purposes or for obtaining the optimal exposure values. These methods were usually fast and easy to implement because they did not require any calibration and operated at the pixel level. Contrast enhancement, histogram equalisation, HDR and Automatic Exposure (AE) algorithms belong to this category. These methods can be implemented either within the camera or in an external device. They are analysed in Chapter 5.

The proposed approach consists of a double linear mapping that corrects both local colour dissimilarities and inter-camera ones. The correction is based on the BG information and a set of estimators that are computed in a learning phase using regression methods (Chapter 3). Once the regressors are estimated, re-calibration is not required. The method is fast and accounts for


every type of photometric variation. It is described in the following two sections.

4.2 proposed method in single cameras

As mentioned in the introduction, the aim of the proposed correction method is to determine a correction function Γ(·). Based on the DIFM (Equation 2.27), and assuming² γ = 1, the image of object O at instant $t_{rf}$ can be written as³:

$$B_o(u, t_{rf}) = H(t_{rf})\, \rho_o \left( \sum_{i}^{N_S} E_{C_i}\, E_i(t_{rf})\, G_{o,i}(u, t_{rf}) + N_s + r_{off}(t_{rf}) \right) \tag{4.2}$$

A variable u is added to account for the spatial variations of the photometric distribution. For any instant t, the image is:

$$B_o(u, t) = H(t)\, \rho_o \left( \sum_{i}^{N_S} E_{C_i}\, E_i(t)\, G_{o,i}(u, t) + N_s + r_{off}(t) \right) \tag{4.3}$$

Equation 4.3 can be expressed as a linear function of $B_o(u, t_{rf})$ as:

$$B_o(u, t) = C^{sc}(u, t)\, B_o(u, t_{rf}) \tag{4.4}$$

in which sc refers to single camera and:

$$C^{sc}(u, t) = \frac{H(t) \left( \sum_{i}^{N_S} E_{C_i}\, E_i(t)\, G_{o,i}(u, t) + N_s + r_{off}(t) \right)}{H(t_{rf}) \left( \sum_{i}^{N_S} E_{C_i}\, E_i(t_{rf})\, G_{o,i}(u, t_{rf}) + N_s + r_{off}(t_{rf}) \right)} \tag{4.5}$$

A corrected image $\hat{B}_o$ is written as:

$$\hat{B}_o(u, t) = \frac{B_o(u, t)}{C^{sc}(u, t)} \tag{4.6}$$

Note that, if $t_{rf}$ is considered as reference, Equation 4.5 corresponds to the photometric variation function of the object O to which a spatial variable is added (Definition 3.1 and Equation 3.4). In this case, from Equation 4.4, the correction function for each band is:

$$\Gamma^{sc}\left(B_o(u, t)\right) = \frac{B_o(u, t)}{Q^{rf}S_o(u, t)} = \frac{B_o(u, t)}{Q^{rf}B_o(u, t)} \tag{4.7}$$

in which the superscript sc refers to a single camera correction function and $C^{sc} = Q^{rf}S_o(u, t)$.

Thus, the slope of the correction function is the inverse of the photometric variation function. If this function is determined, the original image is recovered. Unfortunately, the target objects are unexpected, i.e., the object being observed at any moment is unknown. Therefore, the quotient cannot be computed. To solve this situation, the QRMR described in Chapter 3 is applied, using the known BG as explanatory regions. With this model, Equation 3.20 and Equation 4.7, we define the following correction function for multi-band images.

2 In Chapter 3 we show that this assumption is valid in the scope of this thesis.
3 One colour band is considered hereafter for convenience and simplicity. The extension to all bands is straightforward.


Definition 4.1. Given a scene composed of a target object O and a set of R BG regions, the Illumination Correction Mapping (ICM) $a^{sc}_k \in \mathbb{R}$ for band k is a function that compensates the photometric variations of O observed by a single camera, relative to a reference time $t_{rf}$:

$$a^{sc}_k(u, t) \doteq \frac{1}{Q^{rf}B^{o}_k(u, t)} = \frac{1}{\sum_{i=0}^{R} \beta_{i,k}(u)\, Q^{rf}B^{b_i}_k(u, t)} \tag{4.8}$$

where $\beta_{i,k}(u) \in \mathbb{R},\ \forall i \in \mathbb{Z} : i = 0, \cdots, R$.

Note that the regressors are now a function of the position. This issue is addressed in Section 4.2.1.

Using the ICM is similar to using colour-invariant intensities because it compensates any photometric change. The ICM relies on BG information, although some variations of the FG are required to learn the QRMR. Thus, a training phase is required. Both the training and the runtime phases are described in the following subsections. Besides the training, Section 4.2.1 also describes how the regions and spatial relations are defined.

4.2.1 Scene modelling and training mode

The QRMR is learned by using some FG objects seen in different locations and illumination conditions. We model the 3D world geometry in the acquired image by splitting the scene into a set of N predefined locations, n = 1, · · · , N. Thus, each position u is mapped to a location n. For each location, we define the BG regions. The selection procedure depends on the scene: geometry, objects, lights and motion. Four examples are described in the datasets used in this thesis (Appendix C).

One regression is required for each location. Hence, the functions of regressors (Definition 4.1) become a set of parameters in $\mathbb{R}^3$:

$$\beta_{i,k}(u) \equiv \beta_{i,n,k}, \quad \forall n \in \mathbb{Z} : n = 1, \cdots, N \tag{4.9}$$

The regression estimates are done according to the considerations and proposals made in Chapter 3. In general, a representative number of photometric changes must be collected for all of the locations. To calculate the photometric variation functions, the $Q^{rf}B^{b_i}_k(u, t)$ of the i-th BG region and the $Q^{rf}B^{f}_k(u, t)$ of the FG region for each band k, time t and location u, with respect to a reference time $t_{rf}$, are computed as the average value of the region intensities:

$$Q^{rf}B^{b_i}_k(u, t) = \frac{B^{b_i}_k(u, t)}{B^{b_i}_k(u, t_{rf})} \tag{4.10}$$

$$Q^{rf}B^{f}_k(u, t) = \frac{B^{f}_k(u, t)}{B^{f}_k(u, t_{rf})} \tag{4.11}$$

In this regard, the number of training samples P must be much greater than the number of regressors (P ≫ (1 + R) × N × 3). When this condition is not fulfilled, or several samples are linearly dependent, the percentiles of the intensities of the regions are used instead of their average values.
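A minimal sketch of these quotient computations for one region and one band, covering both the average-based case of Equations 4.10 and 4.11 and the percentile fallback just described; the function name and argument layout are illustrative, not part of the original text:

```python
import numpy as np

def region_quotients(region_now, region_ref, use_percentiles=False, n_p=20):
    """Photometric variation quotients of one region in one band
    (Equations 4.10 and 4.11). By default the quotient of the average
    intensities is returned; when the training samples are scarce or
    linearly dependent, n_p evenly spaced percentiles are used instead,
    yielding one quotient per percentile.
    """
    if not use_percentiles:
        return region_now.mean() / region_ref.mean()
    ps = np.linspace(0.0, 100.0, n_p)
    return np.percentile(region_now, ps) / np.percentile(region_ref, ps)
```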

The location mapping also depends on the scene; moreover, the distance between the actual location of the FG and the BG regions must be accounted for.


Figure 4.2: Indoor surveillance scene modelling. The $x_{l_i}$ denote different locations along the x-axis. The vertical lines $r_i$ cut these locations perpendicularly and are used for computing the distance of each pixel to the corresponding $x_{l_i}$.

For example, the light incidence in the Parking scene (Section C.2) is practically homogeneous over each location; thus, the correction for each location does not depend on the precise position u of the FG. Nevertheless, in those scenes in which the illumination distribution is more complex, certain algorithms can improve the reliability of the correction. For example, for the MCDL dataset (Section C.3) we proposed an interpolation function that accounts for the position of each pixel along an x-axis defined in the walking direction on the ground plane (Figure 4.2). Thus, a correspondence u → x is easily obtained. Let $a^{sc}_k(x_{l_n})$ be the ICM in predefined location $x_{l_n}$; the ICM in the whole scene is constructed by:

$$a^{sc}_k(x) = \begin{cases} a^{sc}_k(x_{l_1}) & \text{if } x \leq x_{l_1} \\[4pt] \dfrac{x - x_{l_n}}{x_{l_{n-1}} - x_{l_n}}\, a^{sc}_k(x_{l_{n-1}}) + \dfrac{x - x_{l_{n-1}}}{x_{l_n} - x_{l_{n-1}}}\, a^{sc}_k(x_{l_n}) & \text{if } x_{l_{n-1}} < x \leq x_{l_n} \\[4pt] a^{sc}_k(x_{l_N}) & \text{if } x > x_{l_N} \end{cases} \tag{4.12}$$

If this correction method is combined with a 3D geometry recovery technique, the correction would most likely improve further.
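A sketch of Equation 4.12, assuming the predefined locations are sorted in ascending order along the walking axis; NumPy's `interp` reproduces both the linear interpolation of the middle branch and the clamping of the two outer branches, and the names here are illustrative:

```python
import numpy as np

def interpolate_icm(x, x_locations, a_sc_values):
    """Piecewise linear interpolation of the per-location ICM slopes
    (Equation 4.12) for one band. `x_locations` holds the predefined
    positions x_{l_1} ... x_{l_N} and `a_sc_values` the ICM learned at
    each of them; `x` may be a scalar or an array of pixel positions.
    """
    # np.interp clamps to the first/last value outside the range,
    # matching the first and third branches of Equation 4.12.
    return np.interp(x, np.asarray(x_locations, float),
                     np.asarray(a_sc_values, float))
```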

4.2.2 Runtime mode

In the runtime mode, the ICM is applied. To accomplish this correction, just the average values of the BG regions are computed (Equation 4.10). Thus, the corrected images are obtained from the following equation:

$$\hat{B}_{o,k}(u, t) = a^{sc}_k(u, t)\, B_{o,k}(u, t) \tag{4.13}$$

If the regression estimate is reliable, then the correction satisfies:

$$\hat{B}_{o,k}(u, t) = B_{o,k}(u, t_{rf}) + \varepsilon, \quad \forall t \in \mathbb{R},\ \varepsilon \to 0 \tag{4.14}$$
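The runtime computation can be sketched as follows; the handling of $\beta_0$ as an intercept whose quotient is fixed to 1 is our assumption, as are the function and argument names:

```python
import numpy as np

def icm_correct(band, bg_now, bg_ref, betas):
    """Runtime ICM correction of one band at one location (Equations
    4.8, 4.10 and 4.13). `bg_now` and `bg_ref` are the average
    intensities of the R BG regions at time t and at the reference time
    t_rf; `betas` are the (1 + R) QRMR regressors of the location.
    """
    q_bg = np.concatenate(([1.0], np.asarray(bg_now) / np.asarray(bg_ref)))
    q_fg = float(np.dot(betas, q_bg))   # estimated FG quotient (Eq. 4.8 denominator)
    a_sc = 1.0 / q_fg                   # ICM slope, Equation 4.8
    return a_sc * band                  # corrected band, Equation 4.13
```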


Besides the ICM, the following section extends the correction method to the case of multiple non-overlapping cameras.

4.3 proposed method in non-overlapping cameras

The DIFM is also used for images captured by different cameras. Like Equation 4.4, if we consider two images $B_{o,c_{rf}}(u, t_{rf})$ and $B_{o,c}(u, t)$ of the same object O taken by two cameras (distinguished by the subscripts $c_{rf}$ and c) at instants $t_{rf}$ and t, $B_{o,c}(u, t)$ is expressed as a function of $B_{o,c_{rf}}(u, t_{rf})$ as:

$$B_{o,c}(u, t) = C^{mc}_c(u, t)\, B_{o,c_{rf}}(u, t_{rf}) \tag{4.15}$$

in which mc refers to multi-camera and:

$$C^{mc}_c(u, t) = \frac{H_c(t) \left( \sum_{i}^{N_S} E_{C_i}\, E_i(t)\, G_{o,i,c}(u, t) + N_{s,c} + r_{off,c}(t) \right)}{H_{c_{rf}}(t_{rf}) \left( \sum_{i}^{N_S} E_{C_i}\, E_i(t_{rf})\, G_{o,i,c_{rf}}(u, t_{rf}) + N_{s,c_{rf}} + r_{off,c_{rf}}(t_{rf}) \right)} \tag{4.16}$$

This time, the terms related to the camera properties and their spatial relation with the scene add conceptual complexity to the expression. Nevertheless, the procedure to address the correction problem is similar to the single camera scenario. Thus, we propose a linear correction model to compensate the photometric differences between cameras.

Definition 4.2. Given a scene composed of a target object O and a set of C cameras, the Inter Camera Correction Mapping (ICCM) for band k is a composition of two terms $a^{mc}_{k,c}$ and $b^{mc}_{k,c}$ that compensates the photometric variations of O observed by camera c, relative to a reference camera $c_{rf}$:

$$\hat{B}_{o,k,c}(u, t) = a^{mc}_{k,c}\, B_{o,k,c}(u, t) + b^{mc}_{k,c} \tag{4.17}$$

in which $a^{mc}_{k,c}, b^{mc}_{k,c} \in \mathbb{R},\ \forall c \in \mathbb{Z} : c = 1, \cdots, C$.

$b^{mc}$ is added to partially compensate for model simplifications such as camera noise, intensity offsets or camera non-linearities. The details of the estimation of $a^{mc}_{k,c}$ and $b^{mc}_{k,c}$ are explained in Section 4.3.1.

To further improve the correction, we propose a combination of the ICM and the ICCM.

Definition 4.3. Given a scene composed of a target object O, a set of C cameras and $R_c$ regions in the FoV of camera c, $\forall c \in \mathbb{Z} : c = 1, \cdots, C$, the Linear Correction Mapping (LCM) for band k that compensates the photometric variations of O relative to a reference time $t_{rf}$ and camera $c_{rf}$ is:

$$\hat{B}_{o,k,c}(u, t) = a^{mc}_{k,c}\, a^{sc}_k(u, t)\, B_{o,k,c}(u, t) + b^{mc}_{k,c} \tag{4.18}$$

in which Definition 4.1 and Definition 4.2 have been followed.
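A minimal sketch of Equation 4.18, assuming the ICM slope and the ICCM pair have already been estimated; the function name is illustrative:

```python
def lcm_correct(band, a_sc, a_mc, b_mc):
    """Linear Correction Mapping (Equation 4.18) for one band of camera
    c: the per-location ICM slope (Definition 4.1) is composed with the
    inter-camera gain/offset pair (Definition 4.2).
    """
    return a_mc * a_sc * band + b_mc
```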

The following subsections explain the training and the runtime modes of the LCM, respectively.

4.3.1 Training mode

Similarly to the training for obtaining the ICM parameters (Section 4.2.1), the ICCM parameters are also estimated by linear regression. The observed


data are several intensity values of the FG objects in the different cameras, which are approximated to the reference values. Considering that the number of samples should be high, multiple percentiles extracted from the cumulative object histograms, normalised between 0 and 1, are used.

The deviation error ε for sample w and camera c is:

$$\varepsilon_w = H^f_{w,k,c_{rf}} - \left( a^{mc}_{k,c}\, H^f_{w,k,c} + b^{mc}_{k,c} \right) \tag{4.19}$$

where $H^f_{w,k,c_{rf}}$ is the w-th sample of the FG normalised cumulative histogram in band k and camera $c_{rf}$, and $H^f_{w,k,c}$ is the w-th sample of the FG normalised cumulative histogram in band k and camera c.

For M samples, with M = h × τ² (h being the number of histogram values used, and τ the number of calibration objects), the ICCM for camera c and band k is obtained by minimising the sum of the squared 2-norms of the deviation errors of the M samples between the histogram values of the reference camera $c_{rf}$ and those of the estimated camera c:

$$\left\{ a^{mc}_{k,c},\, b^{mc}_{k,c} \right\} = \underset{a^{mc}_{k,c},\, b^{mc}_{k,c}}{\arg\min} \sum_{w}^{M} \left\| H^f_{w,k,c_{rf}} - \left( a^{mc}_{k,c}\, H^f_{w,k,c} + b^{mc}_{k,c} \right) \right\|_2^2 \tag{4.20}$$

The methods to solve Equation 4.20 are explained in Section 3.3.
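For illustration, Equation 4.20 can be solved with ordinary least squares as sketched below; the thesis uses the robust LSM-R variant instead, and the function name is illustrative:

```python
import numpy as np

def fit_iccm(h_ref, h_c):
    """Least-squares ICCM estimate (Equation 4.20) for one band.
    `h_ref` and `h_c` are the M matched percentile samples of the FG
    normalised cumulative histograms in the reference camera and in
    camera c, respectively.
    """
    A = np.column_stack([h_c, np.ones_like(h_c)])   # design matrix [H, 1]
    sol, *_ = np.linalg.lstsq(A, h_ref, rcond=None)
    return float(sol[0]), float(sol[1])             # a_mc, b_mc
```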

4.3.2 Runtime mode

In this case, the corrected images are obtained by means of Equation 4.18. The terms of the ICCM are fixed; thus, only the calculations described in Section 4.2.2 are implemented.

Similarly to the single camera case, if both regression estimates are reliable, the correction satisfies:

$$\hat{B}_{o,k,c}(u, t) = B_{o,k,c_{rf}}(u, t_{rf}) + \varepsilon, \quad \forall t \in \mathbb{R},\ \varepsilon \to 0 \tag{4.21}$$

In the next section, the LCM is compared with other colour correction algorithms in three real scenes to assess the performance of the proposed method and detect its strengths and weaknesses.

4.4 experiments

The experiments to assess the LCM were based on a re-identification application. They consisted in measuring, given two samples of an object, the probability that they belong to the same or to different objects by only using their colour distributions. Furthermore, the scenes had very different photometric conditions. These samples were corrected by the proposed algorithm as well as by other SoA colour correction techniques for comparison. The aim of the experiment was not the re-identification algorithm itself but the improvement of the similitude (or dissimilitude) between images of the same (or different) object.

Spatial pixel location and camera network topology were not considered. Thus, the comparison with common approaches such as the use of colour invariants [61] or covariance matrices [88] was out of scope. Nevertheless,


we were aware that feature vectors that include colour plus other information, such as shape or texture, would improve the re-identification rates. Further reading on the person re-identification SoA can be found in the work of Bedagkar-Gala and Shah [7] (2014).

The next subsections are organised as follows. Section 4.4.1 identifies a set of indicators that provided an objective measure of the correction performance. Section 4.4.2 explains how the experiments were planned and run. Section 4.4.3 shows the details of the implementation of the reference colour correction algorithms that were used for comparison: an AE algorithm [68], an AWB [100], and a combination of both. The results over the three scenes are presented later. The first scene consisted in the re-identification of vehicles in a parking (Section 4.4.4) using the Parking dataset (Section C.2). The second scene was composed of pictures of faces from the MUCT dataset (Section C.4) of multiple people, taken in an indoor scene under several illumination conditions. A single camera was used in these two experiments. The third scene (Section 4.4.6) used the MCDL dataset (Section C.3). In this case, the target objects were persons and two non-overlapping cameras were used. Although several datasets on the internet are related to people re-identification in a multi-camera network⁴ and some others also include light changes⁵,⁶, we were not able to find any dataset that fitted all our requirements (above all, background information and cameras' auto-settings off); therefore, we created our own dataset. All of the scenes were challenging. On the one hand, the vehicles were composed mainly of non-Lambertian surfaces and the colour distributions of different vehicles may be very similar. On the other hand, the deformable shape of a person involves different colour distributions and normal orientations for the same person. This is considered in the discussion of the results given in Section 4.4.7.

4.4.1 Performance indicators

For these experiments, we assumed that the colour distributions of the target objects are multimodal and contain a structure invariant to the illumination and the camera. This structure was called the object signature and it was related to the surface reflectance distributions. This signature was extracted from the histogram of the image. Thus, it was independent from the spatial distances between pixels and, therefore, from image size, object shape and, partially, pose. The histogram was composed of multiple bins that split the intensity range into equally distributed intervals. In the experiments, 128 bins were computed.

The aim of the correction was to increase the discriminative power between the images of different objects while keeping the structures of the same object as similar as possible. Thus, the selection of the metric had to address this issue. Ideally, given a metric normalised between zero and one, the distance between images of the same object (intraclass distance) is zero and the distance between images of different objects (interclass distance) is close to one.

The classical algorithms for person re-identification [54, 85, 22] usually used the Bhattacharyya distance [8], which indicated the amount of overlap

4 QMUL, underGround Re-IDentification (GRID) dataset (consulted 05/2014). http://www.eecs.qmul.ac.uk/~ccloy/downloads_qmul_underground_reid.html

5 VIPeR, Viewpoint Invariant Pedestrian Recognition (consulted 05/2014). http://vision.soe.ucsc.edu/node/178

6 Person Re-ID (PRID) 2011 dataset (consulted 05/2014). http://lrs.icg.tugraz.at/datasets/prid/index.php


between two statistical samples or populations. Another widely used distance was the chi-squared [21]. This distance considers that the difference between large bins is less important than the difference between small bins.

Nevertheless, both distances were bin-to-bin measures, that is, they only compared each bin of one histogram with its correspondence in the other. This type of measure fails due to quantisation, shape and pose differences, besides other types of noise. Cross-bin relationships are robust against these problems. The Earth Mover's Distance (EMD) [89] was one of the most widespread in image matching. The EMD provides a minimum cost solution to transform one image distribution into another. In these experiments, the Euclidean EMD was computed for each colour band and then the average value was used as distance.
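Since the signatures are one-dimensional histograms, the EMD with a Euclidean ground distance coincides with the first Wasserstein distance, so the per-band computation can be sketched as follows; this is a sketch, not the exact implementation used in the experiments, and the names are illustrative:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def emd_distance(sig_a, sig_b, bins=128):
    """Average per-band Euclidean EMD between two colour signatures.
    `sig_a` and `sig_b` are (3, bins) histograms; the bin centres in
    [0, 1] act as the support of each band's distribution.
    """
    centres = (np.arange(bins) + 0.5) / bins
    per_band = [wasserstein_distance(centres, centres,
                                     u_weights=sig_a[k], v_weights=sig_b[k])
                for k in range(3)]
    return float(np.mean(per_band))
```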

For each experiment, a set of intraclass and interclass distances was gathered. The performance of the system was evaluated with an error measure based on the Cumulative Histogram Distribution (CHD) of the computed EMDs normalised between zero and one. The CHD of the intraclass distances was denoted as $CHD_{ii}$ and the CHD of the interclass ones as $CHD_{ij}$. Then, some classification rates were defined.

Definition 4.4. Given a threshold value $n_{th} \in [0, +\infty)$ and a set of intraclass distances, the false negative rate is:

$$FN(n_{th}) = 1 - CHD_{ii}(n_{th}) \tag{4.22}$$

Definition 4.5. Given a threshold value $n_{th} \in [0, +\infty)$ and a set of interclass distances, the false positive rate is:

$$FP(n_{th}) = CHD_{ij}(n_{th}) \tag{4.23}$$

Both false rates are graphically explained in Figure 4.3.

Definition 4.6. Given the false positive and false negative rate functions of a set of histogram distances, the Minimal Error Criterium (MEC) for distance r is:

$$MEC(r) = FP(r) + FN(r), \quad \forall r \in \mathbb{R}^+ \tag{4.24}$$

In the experiments, we searched for the threshold $n_{th}$ for which the MEC was minimal:

$$n_{th} = \arg\min_r MEC(r) \tag{4.25}$$

To analyse the distribution of the false rates, the Receiver Operator Characteristic (ROC) curve [101, Section 5.5] and the Surface above the Receiver Operator Characteristic (SaROC) were also provided, for which the true positive rate is $TP(r) = 1 - FN(r)$.
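The indicators can be sketched directly from the raw distance sets using the empirical CHDs, as follows; the function and variable names are illustrative:

```python
import numpy as np

def minimal_error(intra, inter):
    """Threshold search of Equations 4.22 to 4.25. `intra` and `inter`
    are the sets of intraclass and interclass EMDs; returns the
    threshold n_th that minimises MEC = FP + FN and the minimal MEC.
    """
    intra, inter = np.sort(intra), np.sort(inter)
    candidates = np.concatenate([intra, inter])
    # Empirical CHDs evaluated at every candidate threshold.
    fn = 1.0 - np.searchsorted(intra, candidates, side='right') / intra.size
    fp = np.searchsorted(inter, candidates, side='right') / inter.size
    mec = fp + fn                       # Equation 4.24
    best = int(np.argmin(mec))          # Equation 4.25
    return candidates[best], mec[best]
```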

4.4.2 Evaluation strategy

Figure 4.4 depicts a diagram of the evaluation steps. For each of the scenes, the first step consisted of splitting the datasets into two different sets of samples: i) training and ii) test.

The training stage was almost equal to the evaluation strategy followed in Chapter 3. The only difference was the addition of the ICCM regressor estimate in the case of the multi-camera scene. Both linear regressions were done using the LS RR Method (LSM-R) (Section 3.4.2) due to its suitable performance and handling of the outliers (Section 3.4).



Figure 4.3: Example of intraclass $CHD_{ii}$ and interclass $CHD_{ij}$ normalised Cumulative Histograms. If the distance between two objects is less than $n_{th}$, the objects are classified as equal. Otherwise, the objects are classified as different. The false positive rate FP and the false negative rate FN for the threshold distance $n_{th}$ are the heights of the magenta and black lines (≈ 0.32 and ≈ 0.18, respectively).

In each scene, all of the samples were compared with each other. Thus, if the number of test images or target objects of the test dataset was huge, a random selection was performed first. This was the case of the Parking scene, which was composed of several hours of footage and more than 70 vehicles. The next three steps of the test stage were equal to the first three of the training stage. After the quotients were created, the LCM was performed based on the trained parameters. In the case of the Parking scene and the MUCT dataset, in which a single camera captured all of the pictures, only the ICM was used. In parallel, the reference colour correction methods (Section 4.4.3) were applied to the test images. Once all of the test samples were corrected, two comparisons were implemented for each one. On the one hand, each sample was compared with the remaining samples of the same object (intraclass comparisons) in every location. On the other hand, it was compared with all of the samples of different objects (interclass comparisons) in every location. Afterwards, the distances were gathered in these two categories and their CHDs were computed (Section 4.4.1). Finally, the performance indicators were determined.

The FGs in the MCDL dataset are moving objects that were manually segmented. To demonstrate the performance of the method with less optimal segmentation, an experiment was implemented by eroding and dilating the manually segmented outlines. The results are included for comparison.
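A sketch of how such perturbed outlines can be produced from a binary FG mask; the choice of morphology and the stopping rule on the area fraction are our assumptions, since the text does not detail them:

```python
import numpy as np
from scipy import ndimage

def perturb_mask(mask, area_change):
    """Emulate segmentation errors by eroding (area_change < 0) or
    dilating (area_change > 0) a binary mask until its area has varied
    by roughly the requested fraction, e.g. -0.3 or +0.6.
    """
    out = np.asarray(mask, dtype=bool)
    target = out.sum() * (1.0 + area_change)
    if area_change > 0:
        while out.sum() < target and not out.all():
            out = ndimage.binary_dilation(out)
    else:
        while out.sum() > target and out.any():
            out = ndimage.binary_erosion(out)
    return out
```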

Before presenting the results, the reference correction methods are detailed.



Figure 4.4: Diagram of the evaluation strategy for the correction method.

4.4.3 Implementation of the reference methods

The reference methods were chosen because they were standalone approaches, did not require calibration stages and their computational cost was low. The first reference method was based on the algorithm proposed by Liu et al. [68] (2010) and addressed the intensity variations. To allow a fair comparison and use the same input images for all of the methods, instead of modifying the camera gain and exposure time, the AE algorithm implementation consisted of estimating a correction coefficient similar to the response that the camera would have had by changing its parameters, based on the brightness of the image. This algorithm required setting up three parameters: i) the reference average value to be reached, ii) the number of regions, and iii) the weights of each region. These parameters were collected in the correction term $a^{ae}$ following this expression:

$$\hat{B}_{o,k}(u, t) = a^{ae}(t)\, B_{o,k}(u, t) \tag{4.26}$$

in which $a^{ae} = \frac{B_{rf}}{B_w}$. $B_{rf}$ is the reference value and $B_w$ is related to the current brightness of the image. We decided to calculate $B_{rf}$ by estimating the average value of the illumination reference images of the test set. Since


$B_w$ and $B_{rf}$ were grey images, both were computed as the average of the three bands. Liu et al. did not specify the method to set up the number of regions or the weight values. These choices depended on the scene and its response to the light changes. Both parameters were considered in the term $B_w$. This term was the average value of the whole image using n weighted regions.
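As an illustration of Equation 4.26, the correction coefficient can be computed from weighted region averages as sketched below; the split into equal-width vertical strips mirrors the region setup of Figure 4.10 but is our assumption, as are the names:

```python
import numpy as np

def ae_factor(image, weights, b_ref):
    """Correction coefficient a_ae = B_rf / B_w (Equation 4.26).
    `image` is an (H, W, 3) frame, `weights` one weight per region
    (here n equal-width vertical strips) and `b_ref` the reference
    brightness B_rf.
    """
    grey = image.mean(axis=2)                       # average of the three bands
    strips = np.array_split(grey, len(weights), axis=1)
    b_w = sum(w * s.mean() for w, s in zip(weights, strips)) / sum(weights)
    return b_ref / b_w
```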

Regarding the second method, Tan et al. [100] (2004) proposed a novel Inverse-Intensity Chromaticity (IIC) space that they used to estimate the illumination chromaticity of an image; thus, it addressed the colour variations. This estimate was used to correct the images to a neutral illumination. The algorithm operated for Lambertian and specular surfaces, obtained accurate and robust results, and did not require strong constraints on the illumination. In these experiments, we used the implementation that the authors made available online⁷.

4.4.4 Vehicle re-identification in outdoor parking

The single camera method was tested in a dedicated experiment using the Parking dataset (Section C.2). The dataset was acquired in two time periods of two days each, which included different weather conditions. The images acquired during the first period were used for the training phase, and the remainder for the test phase. Thus, different lighting conditions and FG objects were guaranteed. Vehicle outlines were obtained by fixing a region in each parking slot.

A total of 1779 frames were used in training, including up to 74 different vehicles. For testing, 30 vehicles were randomly chosen, using a uniform distribution, out of a total of 90. For each vehicle, 10 frames were also randomly chosen.

Regarding the AE parameters, the reference value was calculated by estimating the average value of the illumination reference image of the test set. Because the vehicles were located in an open area in which the light influence was similar over the entire image, the number of regions was set to one.

Method    Threshold    MEC    Improvement (%)

Uncorrected 0.05 0.30 –

AE [68] 0.05 0.16 46

IIC [100] 0.04 0.29 3

AE+IIC 0.04 0.17 44

LCM 0.09 0.28 9

Table 4.1: Error rate of correction in the Parking scene. For the minimum MEC, the estimated threshold, the MEC, and the improvement compared with the images with no correction are shown. Each row is: uncorrected images, using the AE algorithm, the IIC, the AE plus IIC, and our algorithm (LCM).

The minimum MEC for every method is shown in Table 4.1. Our algorithm outperformed the original images by 9 %, as well as the IIC. Nevertheless, the AE correction improvement was much larger than that of our approach.

7 Consulted 05/2014. http://php-robbytan.rhcloud.com/code/iic.zip


Method SaROC Improvement (%)

Uncorrected 0.13 –

AE [68] 0.05 62

IIC [100] 0.13 -0

AE+IIC 0.05 61

LCM 0.07 43

Table 4.2: SaROC for the Parking scene. The improvement compared with the uncorrected case is also provided. The row description is similar to Table 4.1.

The ROC curve (Figure 4.5) also indicated an improvement with respect to the uncorrected case and the IIC. The SaROC (Table 4.2) was improved by 43 % compared with the original images. For FP < 0.15 and FP > 0.6, our algorithm showed a behaviour similar to AE. Although the difference was small (a surface difference of 2 % between LCM and AE), its ROC curve always remained below AE's. One of the possible reasons for this difference was the complexity of the FG surfaces. Our algorithm based the correction on the surfaces while AE did not. Thus, the mix of metal surfaces with the glass of the windows did not fit our assumptions. Although a noteworthy improvement was achieved, the errors generated in the beta estimates (Section 3.4.5) probably prevented the LCM from outperforming the AE method.


Figure 4.5: ROC curve for the Parking scene. The true positive rate against the false positive rate is shown for the five analysed cases. The red solid line belongs to the uncorrected case, the black solid line to the AE [68] performance, the black dashed line to the IIC [100], the black dotted line to AE plus IIC, and the blue solid line to our algorithm (LCM).


4.4.5 Face re-identification

MUCT (Section C.4) was selected to evaluate the LCM in an indoor scene using people's faces as target objects. Up to five camera views were used for the acquisition. Nevertheless, only the ICM was applied because the position of the faces remained unaltered. This generated a dissimilarity with the other datasets, in which both the FG and the BG moved with respect to the camera.

The selected people were split into training and test subsets. Three persons were selected for the former and seven for the latter. Up to six samples of faces with different light settings were used per person. A total of 24 samples were used for the training and 34 for the test. The first images of each subset were used as reference. The first 115 lines of the image were used as the BG region (Figure 3.15). Because the number of training images was not large, 20 evenly spaced percentile values of the images were used for the quotient computation instead of the average intensity value per band.

Figure 4.6: People samples of the MUCT dataset (ids 001 and 401). (a) Original picture (001). (b) Corrected picture. (c) Original picture (401). (d) Corrected picture.

Figure 4.6 depicts two samples of two persons of the test subset. On the left, the original images are shown; on the right, the corrected images using the LCM. The colour distributions of the corrected images were more similar because the correction is based on a common reference image.

The MECs of the evaluated methods are shown in Table 4.3. The improvement of the LCM was noteworthy (49 %), whereas the IIC algorithm did not outperform the uncorrected case. The second best option was the combination of AE plus IIC, but it was 10 % below our proposal.

The ROCs (Figure 4.7) and SaROCs (Table 4.4) confirmed the noticeable performance of our proposal. The SaROC improvement was 13 % more than that of AE+IIC, and the ROC curve rose above the remaining methods for FP > 0.1. Only in a


Method    Threshold    MEC    Improvement (%)

Uncorrected 0.04 0.55 –

AE [68] 0.02 0.48 14

IIC [100] 0.03 0.57 -3

AE+IIC 0.02 0.34 39

LCM 0.06 0.28 49

Table 4.3: Error rate of correction in the MUCT dataset. For the minimum MEC, the estimated threshold, the MEC, and the improvement compared with the images with no correction are shown. Each row is: uncorrected images, using the AE algorithm, the IIC, the AE plus IIC, and our algorithm (LCM).

short range, approximately between 0.01 and 0.1, could the LCM not improve on the others.

Method SaROC Improvement (%)

Uncorrected 0.32 –

AE [68] 0.16 50

IIC [100] 0.34 -6

AE+IIC 0.14 55

LCM 0.10 68

Table 4.4: SaROC for the MUCT dataset. The improvement compared with the uncorrected case is also provided. The row description is similar to Table 4.3.

The next experiment evaluates the multi-camera correction approach.

4.4.6 People re-identification in corridors

The multi-camera method was tested in a dedicated experiment using the MCDL dataset (Section C.3): with multiple cameras, multiple people, multiple locations and changing lighting conditions.

In each camera view, we defined three locations (Figure 3.29), used one person for the training phase and eight other persons for the test phase. All of the persons were captured walking in two directions. Each of the training and test samples was recorded under different outdoor light conditions. In camera #2, the average background intensity during the acquisition of the training samples ranged between 102 and 123 out of 255, while the average background intensity during testing ranged between 97 and 112 out of 255. The test persons each wore different shirts whose colours were well-distributed along the RGB space. An example picture of each subject is shown in Figure 4.8. People outlines were manually segmented to obtain FG information, used during training and evaluation. The shadows were also manually segmented and discarded.

A total of 96 frames were used for training and 192 frames for testing. The 96 training frames had 8 repetitions for each light condition, location and camera (8 frames × 2 light conditions × 3 locations × 2 cameras). The 192 test frames had 2 repetitions for each person, light condition, location



Figure 4.7: ROC curve for MUCT. The true positive rate against the false positive rate is shown for the five analysed cases. The red solid line belongs to the uncorrected case, the black solid line to the AE [68] performance, the black dashed line to the IIC [100], the black dotted line to AE plus IIC, and the blue solid line to our algorithm (LCM).

and camera (8 persons × 2 directions × 2 light conditions × 3 locations × 2 cameras). Because the number of training images was not large, 20 evenly spaced percentile values of the images were used for the quotient computation instead of the average intensity value per band.

An example of a corrected image is shown in Figure 4.9. Both the single camera and the multi-camera corrections were closer to the images taken at the reference instant.

Regarding the AE parameters, the reference value was calculated by estimating the average value of the illumination reference images of the test set. We decided to split the scene into three regions, as shown in Figure 4.10. Regarding the weights, we chose 0.15, 0.7 and 0.15 from left to right for camera #1. We gave a higher weight to the central region because people were placed there. In a similar way, the chosen weights for camera #2 were 0.2, 0.7 and 0.1. In this case, the region on the left had a greater weight than the region on the right because the light coming from outside was reflected on the wall on the left. These weights were chosen empirically after testing several variations, being the ones that performed best.

To evaluate the robustness of our method against segmentation errors in the foreground extraction, an additional experiment was implemented. The size of each foreground object was randomly extended or reduced: by a negative value, which involves an area reduction, or by a positive value, which involves an area enlargement. Two extra datasets were created (Figure 4.11): in the first, half of the samples, chosen at random, had an area size variation of −30 % and the remainder of 60 %; in the second, the variations were larger, −90 % and 200 %, respectively.

Note that the scope of this thesis is colour correction. For colour correction, the foreground segmentation was only used during the training mode,


Figure 4.8: People samples in MCDL dataset.

whereas the correction during runtime was based on background intensities only. Foreground segmentation was only used during runtime for the evaluation. Thus, to determine the sensitivity of the colour correction method, only the segmentation during training mode was modified by erosions and dilations, and not the segmentation used in the evaluation within the EMD matching.

Table 4.5a shows the error results of the single camera experiment for camera #1 for the minimum MEC. The table shows an error reduction for our algorithm compared to the uncorrected case (MEC was 0.66 and 0.41, respectively), which implied an improvement of 38 %. The IIC algorithm did not operate as well as the LCM, and AE functioned even worse than the uncorrected case. The ROC curve (Figure 4.12) depicted that the improvement of our algorithm was very substantial, while the IIC and AE algorithms hardly outperformed the original images. This fact was confirmed by the improvements of the SaROC (Table 4.6a). When the dataset containing small segmentation errors was used, the results of the proposed algorithm still outperformed the remaining methods (MEC and SaROC improvements of 23 % against 4 % and 5 % for the IIC, respectively). Nevertheless, as expected, for the dataset with the largest segmentation errors, all of the indicators demonstrated a deterioration compared with the original images (MEC and SaROC deterioration of 35 % and 71 %, respectively).

The error results in camera #2 (Table 4.5b) confirmed that the LCM outperformed the others. Note that the IIC did not improve the uncorrected case. Although the results of AE were positive and noteworthy (MEC was 0.38 and the improvement 35 %), the LCM error was equal. The analysis can be extended to the ROC curve (Figure 4.13) and its SaROC (Table 4.6b). The LCM curve brought a large improvement compared with the original images, which included a


Figure 4.9: Example of person correction in the MCDL dataset. (a) Original person to be corrected. Corrected persons for the intraclass (b) and interclass (c) results. Reference images taken in the intraclass (d) and interclass (e) experiments. Images (b) and (c) are more similar to images (d) and (e) in colour distribution, respectively.

Figure 4.10: Region setup used by the AE algorithm [68]. (a) Camera #1. (b) Camera #2.


Figure 4.11: Example of forced segmentation errors for the training: (a) 60 % dilation, (b) 30 % erosion, (c) 200 % dilation, (d) 90 % erosion. The area size of the original dataset has been enlarged and reduced by dilation and erosion operations, respectively.


Figure 4.12: MCDL dataset. ROC curve for camera #1. The true positive rate against the false positive rate is shown for the seven analysed cases. The red solid line belongs to the uncorrected case, the black solid line to the AE [68] performance, the black dashed line to the IIC [100], the black dotted line to AE plus IIC, the blue solid line to our algorithm using a manual segmentation, the blue dashed line to our algorithm with segmentation errors of −30 and +60 % of area variation, and the blue dotted line to our algorithm with segmentation errors of −90 and +200 %.


Method    Threshold    MEC    Improvement (%)

Uncorrected 0.04 0.66 –

AE [68] 0.04 0.69 -4

IIC [100] 0.02 0.64 4

AE+IIC 0.02 0.67 -1

LCM 0.02 0.41 38

LCM (-30,+60) 0.02 0.51 23

LCM (-90,+200) 0.03 0.89 -35

(a) Camera #1 correction

Method    Threshold    MEC    Improvement (%)

Uncorrected 0.04 0.59 –

AE [68] 0.02 0.38 35

IIC [100] 0.03 0.66 -12

AE+IIC 0.02 0.53 10

LCM 0.02 0.38 35

LCM (-30,+60) 0.02 0.37 38

LCM (-90,+200) 0.04 0.66 -12

(b) Camera #2 correction

Method    Threshold    MEC    Improvement (%)

Uncorrected 0.04 0.69 –

AE [68] 0.03 0.60 13

IIC [100] 0.02 0.71 -3

AE+IIC 0.02 0.64 6

LCM 0.03 0.47 31

LCM (-30,+60) 0.03 0.55 19

LCM (-90,+200) 0.04 0.79 -15

(c) Correction between both cameras.

Table 4.5: Error rate of camera correction in the MCDL dataset. For the minimum MEC, the estimated threshold, the MEC, and the improvement compared with the images with no correction are shown. Each row is: uncorrected images, using the auto-exposure algorithm (AE), the IIC, auto-exposure plus IIC (AE+IIC), and our algorithm using three different segmentations for the foreground: i) manual (LCM), ii) modifying the area size of the manual segmentations by −30 % and 60 % (LCM (−30,+60)), and iii) area variations of −90 % and 200 % (LCM (−90,+200)).


45 % lower SaROC. For FP > 0.2, the AE and LCM curves had similar profiles and their SaROCs differed slightly: 0.10 and 0.12, respectively. When the dataset containing small segmentation errors was used, the results of the proposed algorithm were even closer to the AE performance: equal MEC (0.38) and SaROC (0.10). Nevertheless, for the dataset with the largest segmentation errors, all of the indicators demonstrated a deterioration compared with the original images (MEC and SaROC deterioration of 12 % and 36 %, respectively).


Figure 4.13: MCDL dataset. ROC curve for camera #2. The figure description is similar to Figure 4.12.

In the case of the multi-camera correction method, the error results are shown in Table 4.5c. Compared to the intraclass results, the MEC was slightly worse than the corresponding mean of the MECs of camera #1 and camera #2. The deterioration of the LCM was probably caused by different illumination and shadow geometries at both cameras. Still, our algorithm significantly outperformed the others. As Table 4.5c shows, the MEC improvement of the IIC correction was negative, and the AE+IIC algorithm was fairly close to the uncorrected case (a difference of 6 %) due to the poor performance in camera #1. The interclass results show that the LCM obtained a large improvement of 31 %, which was noteworthy and more than twice the improvement of the AE algorithm. Figure 4.14 and Table 4.6c confirm the improvement of the LCM: although the AE curve was above the uncorrected curve for almost every false positive rate value, the LCM curve was even closer to 1 and, hence, much better than AE's. Furthermore, the larger difference for FP > 0.2 between the LCM and the rest implied that it performed better when the light variation was large, because the further towards the right of the ROC, the larger the EMDs dealt with, and the EMDs were related to the intensity differences between images. Regarding the segmentation errors, the previously commented trend was maintained in this case: with the small segmentation errors, better results than with the remaining algorithms were still generated, while the large errors deteriorated even the experiment with the original images.


Method    SaROC    Improvement (%)

Uncorrected 0.27 –

AE [68] 0.28 -5

IIC [100] 0.25 5

AE+IIC 0.26 3

LCM 0.15 44

LCM (-30,+60) 0.21 23

LCM (-90,+200) 0.46 -71

(a) Camera #1 correction

Method    SaROC    Improvement (%)

Uncorrected 0.21 –

AE [68] 0.10 52

IIC [100] 0.26 -21

AE+IIC 0.19 10

LCM 0.12 45

LCM (-30,+60) 0.10 54

LCM (-90,+200) 0.29 -36

(b) Camera #2 correction

Method    SaROC    Improvement (%)

Uncorrected 0.27 –

AE [68] 0.23 15

IIC [100] 0.29 -6

AE+IIC 0.25 8

LCM 0.17 39

LCM (-30,+60) 0.20 25

LCM (-90,+200) 0.39 -44

(c) Correction between both cameras.

Table 4.6: SaROC for the MCDL dataset. The improvement compared with the uncorrected case is also provided. The row description is similar to Table 4.5.



Figure 4.14: MCDL dataset. ROC curve for both cameras. The figure description is similar to Figure 4.12.

4.4.7 Results discussion

To correctly understand the results obtained from the experiments presented above, we must identify the implications of the analysis tools that we have used. That is, in the spirit of the uncertainty principle of Heisenberg [45] (1927), these tools influence the error computations. In this context, the metric used to measure the distances between images, as well as the re-identification application, are means of measuring the reliability of the proposed correction method. It may be the case that another metric or application provides better results. Both the metric and the application were chosen under the assumption that they were suitable tools to confirm our proposal. Nevertheless, in certain cases such as the Parking scene, the colour distributions of the vehicles could be almost equal for similar paint colours. This was the case of greyish paints, which, in addition, are very popular options. Thus, the isolated use of the colour distributions of the objects for the re-identification did not provide suitable rates in this case. These issues should be considered when discussing the obtained results.

Regarding the Parking scene, the results showed an improvement of the LCM method compared with the uncorrected approach. Our method also outperformed the AWB technique. The IIC demonstrated not being useful in this scene, although variations in the colour spectrum of the light source occurred. Nevertheless, the LCM correction was not sufficient to exceed the AE algorithm. In Section 3.4.7, we concluded that multiple point clouds did not fit the main linear trend in the BG–FG relation. Variations of the FG intensity were even produced when the BG remained invariant. These phenomena (likely produced by big shadows, dissimilarities in the surface orientations and complex FG surfaces) introduced errors in the FG–BG relation that influenced the reliability of the correction, both in the training and in the runtime.


Fortunately, the results were different with the MUCT dataset. The indoor light variations and close regions, without interreflections or occlusions, provided a suitable environment for the LCM. Thus, the results demonstrated that, in those circumstances, the LCM outperformed the reference colour correction algorithms.

The results were also satisfactory with the MCDL dataset. In camera #1, the LCM outperformed the others. This was probably because our method rejected the over-exposed regions of the BG, making the algorithm more robust against highlights. Furthermore, the BG–FG relationship was more reliable than the transformation values calculated by the AE and IIC algorithms, as the BG–FG relationship was trained for local rather than global illumination changes. In the case of camera #2, the most important source of light was the sun, and the indoor lights had less influence. Although the indoor lights were switched on/off, the influence of the outdoor light variation produced a higher intensity and colour variation. Thus, it seems that the IIC was not able to estimate the correct main illuminant in every picture. In addition, the indoor lights did not overexpose the image acquisition as happened in camera #1. We can say that the multiple light sources and over-exposed BG regions in camera #1 made a more complex scene compared to camera #2. This made the photometric variations in the FG objects of camera #2 simpler. Thus, it seems reasonable that the errors in the uncorrected images were slightly smaller than in camera #1, as Table 4.5 and Table 4.6 show. The results regarding the FG segmentation errors demonstrated that our method maintained similar error rates when the size variation was reduced by 30 % and enlarged by 60 %. In this case, the LCM still outperformed the AE, IIC and AE+IIC methods. The deterioration of the algorithm became noticeable when the segmentation errors were always large (in the case of −90 % and 200 %).

In view of all of these results, the proposed method was capable of correcting photometric variations in objects even in complex illumination scenes, on condition that BG areas were close enough to the FG objects and the camera operated in its linear region. In this context, complex scenes may be those in which a multitude of reflections on Lambertian surfaces are generated, at least one illuminant is changed and the intensity of the lights varies significantly. In addition, for scenes in which the main light source came from indoor lights and a huge number of interreflections were produced, or non-overlapping cameras acquired the images, the LCM outperformed the AE and AWB algorithms.

4.5 conclusions

In this chapter, we describe a computationally efficient method to correct photometric variations in single camera and non-overlapping camera scenes. For each image, these variations are estimated by simple vector operations and the correction is based on a double linear mapping. One of the mappings is based on the QRMR (Chapter 3) while the other is based on ICCR methods. This is linked with the fact that, contrary to other colour constancy algorithms, the illuminants do not have to be estimated. Therefore, the scenes are easy to calibrate, and the use of canonical, known illuminants or colour checker patterns is not required.

Our algorithm performs much better than uncorrected images and a SoA AE, besides a representative colour correction algorithm (IIC). Furthermore, the algorithm is able to operate properly in complex environments under several illuminants because, unlike most interclass and other correction


methods, local photometric changes are considered, and the estimate of the BG–FG relation makes it easy to adapt to changes in any object using BG information. The proposed algorithm is also robust against FG segmentation errors. Besides, it corrects changes in the settings of the camera as long as the change in the camera response is linear.

To conclude, the main contribution of this chapter is an algorithm that, in a non-overlapping multi-camera system in which the scenes' illumination is dissimilar (indoor and outdoor), estimates a correction that helps to maintain the colour appearance of objects across such scenes.

Up to this point, one of the working hypotheses regarding the IFM is that the images are correctly exposed. In the experiments, we rejected the under- and over-exposed images. In the following chapter, we go into detail regarding those techniques that control the camera elements to obtain a suitable exposure. In this line, we propose a control algorithm that can be combined with the LCM.


5 EXPOSURE CONTROL DURING VIDEO ACQUISITION

The image correction algorithms¹ assume that the images are neither over- nor under-exposed. Nevertheless, this is not the common case in real systems. Some approaches filter out the pixels that suffer from this effect. The drawback is that significant information is likely rejected. This issue can be addressed by controlling the image acquisition process using an AE algorithm. Both the control of the camera and the quality of the images have evolved since the first camera models (Figure 5.1). Nevertheless, sufficient effort has not yet been devoted to obtaining robust and reliable AE algorithms.


Figure 5.1: One of the first camera obscuras produced (a) (based on Wikipedia²) compared with a modern SLR camera (b) (based on Nikon Europe B.V.³).

In this chapter, we establish a real-time AE method to guarantee that video cameras in uncontrolled light conditions achieve two objectives: i) take advantage of their whole dynamic range, while ii) providing neither under- nor over-exposed images. We assume that the whole dynamic range is reached when the highest contrast for a given scene is obtained.

The remainder of the chapter is organised as follows. Section 5.1 provides an overview of the SoA of the AE techniques. In this section, we identify the main parts that compose this type of technique and explain the existing alternatives. The proposed method is detailed in Section 5.2. This method provides a solution to each of the components of the AE techniques. Some experiments were implemented in which the proposed method was compared with a relevant SoA technique for assessment. These experiments are described in Section 5.3, including their results and a discussion. Finally, in Section 5.4, the main contributions of the chapter are emphasised.

1 The content of this chapter is partially presented in [108].
2 Wikipedia, Daguerreotype (consulted 09/2014). http://en.wikipedia.org/wiki/Daguerreotype
3 Nikon, D7000 (consulted 09/2014). http://www.nikon.es/es_ES/product/digital-cameras/slr/consumer/d7000
4 Wikipedia, History of photography (consulted 09/2014). http://en.wikipedia.org/wiki/History_of_photography


Figure 5.2: Robert Cornelius' self-portrait (1839). One of the first pictures ever made, by one of the pioneers of Photography (based on Wikipedia⁴).

5.1 state of the art

Unlike the initial camera obscura designs and experiments in antiquity (Section 2.2), the first practical cameras were developed in the early 1800s [46] (Figure 5.2), when the light that went through the pinhole could be impressed on a light-sensitive material. Those cameras never had lenses and the exposure time was controlled roughly by a cover. No tool existed to determine a suitable exposure. Besides, the exposure depended mostly on the type of plate and its processing. When the first exposure meters appeared and the effects of the different chemical processes had been studied in depth, the photographer could manually control the exposure of the taken pictures. In the 1960s, commercial cameras already incorporated AE systems. The right exposure was determined with the help of a built-in exposure meter⁵. When the capabilities of the chipsets started growing exponentially, the AE

algorithms improved the quality of the taken images. Most of the AE meth-ods were designed –and often patented [57, 80, 84]– by the camera manu-facturers that implemented different approaches for each stage. Kondo et al.[57] (1992) patented one of the first automatic process of capture, which alsoincluded an auto focus besides an AWB technique6.

Still cameras are not the only type of camera that implements AE techniques; video cameras also have these methods. Nevertheless, their constraints are different. Whereas the former require a suitable exposure before acquiring the image, the latter may use an adaptive algorithm that reaches the correct settings over several captures, because the acquisition is continuous. This dissimilarity means that two different types of approach tackle the problem. In this chapter, we consider both types. In addition, the absolute and the relative correction methods (Section 4.1) are considered henceforth.

Every AE technique is composed of three steps: i) measuring the amount of light that produces the image, ii) processing such measurements, and iii) actuating those camera parameters that are capable of modifying the exposure. The remainder of the section sheds further light on the nature of these steps.

5 Although common digital still cameras also have these photocells [59], the purpose of this chapter is to provide a control algorithm that only uses the information extracted from the camera sensor. Therefore, the study provided in this section does not consider this or other auxiliary elements.

6 This chapter does not address AWB algorithms because they were already described in Section 4.1.


Figure 5.3: The AE algorithm workflow: the light is measured, the measurements are processed, and the camera settings are controlled accordingly.

5.1.1 The measurement of the light

In some situations, not every area of the image has the same relevance; thus, the light coming from some zones must be emphasised relative to the rest. The simplest systems used the intensity of one single point (spot metering) or a set of pre-defined ones (multi-zone metering). Figure 5.4 depicts some other patterns implemented in still cameras. They assume that the central zone is the most relevant because the objects to be photographed are mostly located within it. Along the same line, Park and Bijalwan [80] (2014) –based on the work of Liang et al. [66]– patented a technique that used two regions –BG and FG– whose significance was weighted adaptively, while the central zone was also emphasised (Figure 5.5). Nevertheless, the importance of the central zone is relative even in Photography, in which the rule of thirds [74] encourages shifting the FG object away from the centre. Liu et al. [68] (2010) went far beyond the classic approach. Their division into regions that depended on the scene provided a better knowledge of the behaviour of the light, making the correction more accurate (Section 4.4.3). This algorithm provided a flexibility that outperformed the classical approaches. Besides, in surveillance systems the monitored objects are likely located in any region of the scene. Price et al. [84] (2012) proposed a particular approach for mobile videoconferencing applications; their technique automatically estimated the region where the person is located and used it for the control algorithm. Some of the algorithms that manage the image uniformly used its average value as the indicator representing the quantity of light and compared it with a reference [60, 79]. Instead of the average of the image, Fowler [32] (2005) counted the pixels that exceed some thresholds. The histogram of the image represents the distribution of the light; thus, it delivers valuable information about the lights in the image and involves low-cost computations [107]. As regards the selection of regions of interest, Yang et al. [119] (2008) developed an algorithm that searched for the histogram peaks and, under the premise that they correspond to regions of less interest, used them to discard the associated regions in the exposure estimation. Some processing techniques using the histogram are mentioned in the next subsection.

Figure 5.4: Examples of light metering patterns implemented in still cameras (based on [52]). Lines are isolines that have the same significance. The central zone is always the most important.

Figure 5.5: Inverted T pattern for light metering used in [80].

5.1.2 The processing of the indicators

After the indicators are extracted from the captured image, several techniques have been proposed for the processing step to provide a suitable correction. Regarding the video solutions, most of these techniques consisted of comparing the measured values with either reference values or thresholds [32, 60, 68, 79, 80, 84]. Liang et al. [66] (2007) went further and proposed an iterative algorithm that used the logarithm of the average value to determine the correction. The absolute approaches optimised some general indicators. One of the most significant was the proposal of Wang [113] (2012), which maximised the information of the scene by determining the exposure that maximised the entropy. This method provided remarkable results in static scenes. Its drawback is that it failed in scenes with dynamic content, where the entropy value changes suddenly, because convergence to the maximum value is not easily achieved. Huang et al. [50] (2013) performed a gamma correction transformation based on the cumulative histogram to maximise the contrast. The algorithm operated on images besides video; however, it did not consider over-exposed situations.

Unlike in the video techniques, the correction cannot be iterative for still images. Besides Huang et al.'s algorithm, the processing implemented in still cameras is connected to the Exposure Values (EVs) [52]. Depending on the scene (daylight, night, indoor, outdoor, and so on) or the kind of photograph (landscape, party, portrait, sports, and so on), some prefixed tables provide the suitable camera settings that ensure the correct EV (Figure 5.6).

5.1.3 The actuation of the camera

Current cameras have three mechanisms to control the exposure (Section 2.2): i) the exposure time T, ii) the relative aperture N, and iii) the electronic gain g. In digital still cameras, the electronic gain is related to the ISO speed [52, Chapter 18]. Thus, the actuation consists of converting the correction factors computed in the processing step into a combination of these three factors. The scientific community has focused on the processing algorithms rather than on exploring these aspects, likely due to the great variability of the constraints in real systems. In the previous subsection, we mentioned that in Photography the combination of these parameters depends on the scene and the type of situation. In low light conditions and with moving objects, larger apertures are preferred. Nevertheless, this setting delivers short depths of field; therefore, unfocused objects are likely to appear in the acquired images [52, Chapter 4]. Furthermore, the three parameters are not accessible in every camera. For example, the cameras used in Machine Vision applications barely have control of the aperture because this parameter is set manually. These issues were considered in the design of the actuation algorithms.

Figure 5.6 depicts an example of the selection of the settings of T and N in still cameras depending on the desired EV. Five modes are represented in the example. The intersections of the EV lines with the mode lines set the aperture and exposure time for each mode.

In Figure 5.7, a classic example of AE control in video cameras is shown. This algorithm was based on holding the average value of the image (called the video level) as large as possible without saturating the image. The algorithm defined a 0 dB light level, at which the iris was fully open (lowest N) and the gain was set to 0 dB. To provide the highest SNR, the gain is best set to the lowest value. When the light level decreases, the gain increases until its highest value is reached; if the light level continues to fall, the video level also decreases proportionally. When the light level increases, the aperture is reduced while the video level is maintained. In this example, in the normal operation of the algorithm, the exposure time is constant. Nevertheless, the author proposed reducing the exposure time –or using neutral density filters– to avoid over-exposure.


Figure 5.6: Programmed exposure modes. "The EV graph shows five possible program profiles over an exposure range from EV 1 to 20. These are D (depth), W (wide), N (normal), T (telephoto) and A (action)." (based on [52])

Figure 5.7: Example of an AE control algorithm (based on [71]), plotting the video level (%) against the light level (dB) over the range of automatic operation (iris control and gain control). The values of the x-axis are irrelevant.


5.2 proposed method

We propose the Camera Exposure Control (CEC) algorithm to cope with the two objectives indicated at the beginning of this chapter. The analysed algorithms (Section 5.1) cannot properly reach these objectives, although some of the described techniques are partially used in this algorithm. On the one hand, spot metering or the average value of the image as an indicator can guarantee neither the highest contrast nor the avoidance of over-exposure. On the other hand, the techniques based on other statistics or on the histogram [50, 113] fail in moving scenes or violate the over-exposure constraint. The CEC is based on histogram statistics and achieves the established targets. The details are explained below following the three steps identified in Figure 5.3, which correspond to: i) defining some control variables, ii) processing an algorithm, and iii) setting up some actuators of the camera.

5.2.1 Control variables

The control variables are those in charge of measuring the light intensity in the captured images. All of them are computed from the histogram.

Given an image B of M pixels and LEV digital levels, whose histogram distribution is h, we define the following control variables.

Definition 5.1. The bright Br of the image B is the average value of its pixel intensities:

$$ Br = \frac{1}{M} \sum_{i=0}^{LEV-1} i\, h(i) \qquad (5.1) $$

Definition 5.2. The contrast Cr of the image B is the standard deviation of its pixel intensities:

$$ Cr = \sqrt{\frac{1}{M} \sum_{i=1}^{M} \bigl(B(i) - Br\bigr)^2} \qquad (5.2) $$

The bright indicates the darkness (low value) or the lightness (high value) of the image. The contrast provides an indicator of the variability of the intensity: considering the image of an object, the larger the contrast, the more details are perceived from the object.

The remaining variables are computed from the CHD H, which is obtained straightforwardly from the histogram:

$$ H(r) = \frac{1}{M} \sum_{i=0}^{r} h(i), \quad r \in [0, LEV-1] \qquad (5.3) $$

Definition 5.3. Given a range of intensity values SR ∈ [0, LEV − 1], the WSL of the image B for this range is the rate between the number of pixels that fall in the SR largest intensities and the total number of pixels.

Since H is normalised between 0 and 1, this is equivalent to:

$$ WSL = 1 - H(LEV - 1 - SR) \qquad (5.4) $$

For the WSL, SR is called OER, and it sets the lower bound for the intensity values from which the pixels are over-exposed.


Figure 5.8: Indicators on the intensity distributions, shown over 256 bins: (a) histogram distribution; (b) cumulative histogram distribution. The correspondence of bright, WSL, idwsl and idbsl is shown in both distributions; NR, OER and ST are displayed on (b). Black lines indicate the bright or contrast value; blue lines, idwsl and idbsl; green lines enclose the under- and over-exposure regions; and red lines enclose the part of the distribution that fills the ST.


Definition 5.4. Given a range of intensity values SR ∈ [0, LEV − 1], the Black Saturation Level (BSL) of the image B for this range is the rate between the number of pixels that fall in the SR lowest intensities and the total number of pixels.

This is equivalent to:

$$ BSL = H(SR) \qquad (5.5) $$

For the BSL, SR is called NR, and it sets the upper bound for the intensity values below which the pixels are under-exposed. Most of these pixels can be considered noise.

These two variables quantify the degree of over-exposure and under-exposure of the image depending on the parameters OER and NR, respectively. The closer to one, the further from a suitable exposure; on the contrary, a value of zero means that the exposure is adequate.

Definition 5.5. Given a saturation tolerance ST ∈ [0, 1], the over-exposure index idwsl is the minimum intensity from which the histogram accumulates the ST rate of pixels:

$$ id_{wsl} = \arg\min_{r} \,\{\, H(r) \geq 1 - ST \,\} \qquad (5.6) $$

Definition 5.6. Given a saturation tolerance ST ∈ [0, 1], the under-exposure index idbsl is the minimum intensity at which the histogram accumulates the ST rate of pixels:

$$ id_{bsl} = \arg\min_{r} \,\{\, H(r) \geq ST \,\} \qquad (5.7) $$

Both indexes provide information regarding the proximity to over-exposure (idwsl) or under-exposure (idbsl): the larger idwsl is, the closer the image is to over-exposure; the smaller idbsl is, the closer it is to under-exposure.

All of these variables are shown graphically in Figure 5.8. They characterise the image exposure and are used as input to the CEC algorithm.
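To make the definitions concrete, the following minimal NumPy sketch computes the six control variables from the histogram of an 8-bit grey-scale image. It is an illustration under our own naming, not the implementation used in the experiments, and it follows the convention that the WSL is the fraction of pixels above the OER boundary (Equation 5.4):

    import numpy as np

    def control_variables(img, oer=10, nr=10, st=0.01, lev=256):
        """Control variables of Definitions 5.1-5.6 for an 8-bit image."""
        m = img.size
        h, _ = np.histogram(img, bins=lev, range=(0, lev))
        bins = np.arange(lev)

        br = (bins * h).sum() / m                         # bright (Eq. 5.1)
        cr = np.sqrt((((bins - br) ** 2) * h).sum() / m)  # contrast (Eq. 5.2)

        H = np.cumsum(h) / m                              # CHD (Eq. 5.3)
        wsl = 1.0 - H[lev - 1 - oer]                      # over-exposure rate (Eq. 5.4)
        bsl = H[nr]                                       # under-exposure rate (Eq. 5.5)

        id_wsl = int(np.argmax(H >= 1.0 - st))            # over-exposure index (Eq. 5.6)
        id_bsl = int(np.argmax(H >= st))                  # under-exposure index (Eq. 5.7)
        return br, cr, wsl, bsl, id_wsl, id_bsl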

5.2.2 Algorithm

  symbol    meaning
  ST        Saturation tolerance
  Cmax      Maximum contrast
  OER       Over-exposure range
  NR        Noise range
  ∆EVmax    Maximum rate to increase Ef
  ∆EVdec    Default rate to decrease Ef

Table 5.1: Tuning parameters for the CEC algorithm.

Algorithm 5.1 shows the pseudocode of the CEC algorithm and Figure 5.9 its flowchart. The algorithm works only if there is enough radiance to generate an image; that is, it does not work under very dark lighting conditions. Furthermore, a Region of Interest (RoI) is defined. This is helpful if, for example, the FoV contains part of the sky, which is irrelevant for most Computer Vision tasks and would disturb the light metering.


Figure 5.9: The CEC flowchart for each captured image. t indicates the current time and t + 1 the next iteration. The content of the blocks is detailed in Algorithms 5.1–5.3.

The algorithm performs this workflow for each captured image. We define some parameters (Table 5.1) that have to be instantiated in a real implementation; these parameters are explained in the remainder of this section. First, unlike most approaches, a background subtraction [42] is carried out if the scene contains significant motion. Considering the FG intensity in these scenes may lead to instabilities in the control algorithm: for example, if the average intensity of a large part of the FG changed suddenly from darkness to lightness, the control variables would measure a change in the illumination that has not actually been produced.

For the next step, let us define a function that relates the camera parameters that modify the exposure. The function is called the exposure function Ef –ef in the pseudocode– and is given by:

$$ E_f = \frac{g \cdot T}{1 + N^2} \qquad (5.8) $$

This function is linear in the gain and the exposure time and inversely proportional to the square of the relative aperture. Note that Ef resembles Equation 2.25. Thus, assuming a linear CRF, changes made in this function produce proportional changes in the digital image for the same input radiance.
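As a minimal sketch (a helper of our own naming, not from the thesis code), Equation 5.8 translates directly into a one-line function: doubling either g or T doubles Ef, whereas closing the iris (a larger N) reduces it quadratically.

    def exposure_function(g, t, n):
        """Exposure function Ef (Eq. 5.8)."""
        return g * t / (1.0 + n ** 2)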


Algorithm 5.1 CEC.

Require: L ≫ Lmin ▷ Ensure enough radiance
 1: selectROI
 2: while frame do
 3:    bg ← extractBackground(frame, bg) ▷ Optional
 4:    ef ← g · T/(1 + N²) ▷ Exposure function
 5:    c, wsl, bsl, idwsl, idbsl ← calcIndicators(bg)
 6:    stat ← checkCurrentExposure(c, wsl, bsl)
 7:    if stat ≠ SUITABLE then
 8:       g, T, N ← calcNewActuators(idwsl, wsl, idbsl, stat, ef)
 9:       applyChanges2Cam(g, T, N)
10:    end if
11:    frame ← currentFrame
12: end while

Once Ef is computed using the current camera settings, the control variables (Section 5.2.1) are updated and the exposure of the image is checked. When the scene radiance distribution has a larger dynamic range than the camera, both objectives cannot be reached simultaneously: maximising the contrast would involve working outside the linear range of the CRF. Thus, a trade-off between the objectives has to be designed. To accomplish this, we establish three targets sorted by priority:

1. To avoid the brightest area of the histogram (OER).

2. To avoid the noisy area of the histogram (NR).

3. To maximise the contrast.

The most important issue is to avoid over-exposure because, in these situations, intensity information is lost. In contrast, an under-exposed image has low SNR, but the noise may be removed. We define three states depending on the contrast, the WSL and the BSL (Algorithm 5.2):

1. Unsuitable: in case the WSL is greater than the saturation tolerance.

2. Quasi-suitable: in case the BSL is greater than the saturation tolerance or the maximum contrast Cmax is not reached.

3. Suitable: in case the exposure is adequate.

In case the state is not suitable, an actuation on the camera parameters is carried out (line 8, Algorithm 5.1 and line 12, Algorithm 5.2). This actuation consists of: i) decreasing the exposure if the state is unsuitable, or ii) increasing the exposure if the state is quasi-suitable.
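A minimal Python sketch of this three-state classification, mirroring checkCurrentExposure in Algorithm 5.2 (the default threshold values are the ones later used in the evaluation, Table 5.2):

    from enum import Enum

    class Exposure(Enum):
        UNSUITABLE = 0       # over-exposed: decrease the exposure
        QUASI_SUITABLE = 1   # under-exposed or low contrast: increase it
        SUITABLE = 2         # no actuation needed

    def check_current_exposure(c, wsl, bsl, st=0.01, c_max=100.0):
        """Three-state exposure check of Algorithm 5.2."""
        if wsl > st:
            return Exposure.UNSUITABLE
        if bsl > st or c < c_max:
            return Exposure.QUASI_SUITABLE
        return Exposure.SUITABLE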

We design two different algorithms for the exposure variation. Both are based on estimating the new value of the exposure function (efobj) that will yield a suitable state in the next capture. The increase of the exposure uses idwsl to determine efobj. The calculation (line 4, Algorithm 5.3) is based on a bisection model [17], which simplifies the CRF and assumes a linear, noise-free relationship between Ef and the digital image. Figure 5.10 illustrates the bisection calculation and the drawbacks of these assumptions. The aim of this transformation is to estimate efobj such that idwsl moves from its current value to the OER boundary. Therefore, the pixels having a value of idwsl will have the highest intensity but will not be over-exposed, and ST will not be exceeded.


Algorithm 5.2 Additional functions of CEC.

 1: function checkCurrentExposure(c, wsl, bsl)
 2:    if wsl > ST then
 3:       status ← UNSUITABLE
 4:    else if bsl > ST ∨ c < Cmax then
 5:       status ← QUASI-SUITABLE
 6:    else
 7:       status ← SUITABLE
 8:    end if
 9:    return status
10: end function

11: function calcNewActuators(idwsl, wsl, idbsl, status, ef)
12:    if status = QUASI-SUITABLE then
13:       g, T, N ← incExposure(idwsl, ef)
14:    else ▷ UNSUITABLE
15:       g, T, N ← decExposure(idbsl, wsl, ef)
16:    end if
17:    return g, T, N
18: end function

Figure 5.10: Estimate of the variation of the exposure function on the B vs. Ef curve: (a) high region; (b) low region. For an initial point (e0, b0), a linear transfer function is determined (black line). The grey dotted line indicates the CRF for Ef = e0 as a function of the incoming radiance L. b1 is the target intensity; together with the transfer function, it yields the new value of the exposure function, e1. Due to the differences between the transfer function and the real CRF, an error is introduced. Grey dots indicate the real digital value for e1 if the CRF did not change with Ef. Although this assumption is false, it serves to illustrate the magnitude of the error. Graphs (a) and (b) show that this error depends on the region; a smaller error is expected for higher values.

Using the figure, for b0 = idwsl, e0 = ef, b1 = LEV − 1 − OER and e1 = efobj, the expression of line 4 (Algorithm 5.3) is obtained. The figure also gives an idea of the errors that can be produced relative to the expected performance. The variation of the exposure function is used to compensate the changes in the incoming radiance. So, the dotted grey graph can be seen as an example of a non-linear and noisy CRF, assuming that the x-axis corresponds to the input radiance and the exposure function is fixed. In the case of a light image, the difference between the expected intensity (green dot) and the value that cuts the CRF (grey dot) is smaller than in the case of a dark image. In darker situations, the transformation would over-expose the image. Due to this issue, a maximum variation of ∆EVmax is set on the slope of the transformation (line 6, Algorithm 5.3).
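As a hypothetical numeric illustration of the increase step (the values are ours, not taken from the experiments): with LEV = 256 and OER = 10, the target boundary is LEV − 1 − OER = 245. If the current over-exposure index is idwsl = 200, line 4 of Algorithm 5.3 yields efobj = (245/200) · ef ≈ 1.23 ef; the ∆EVmax clamp of line 6 would only apply if this ratio exceeded the limit, e.g., for a very dark image whose idwsl is close to zero.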

The decrease function sets efobj according to the idbsl and WSL values. In case most of the image is over-exposed (line 13, Algorithm 5.3), a reset to a default value ef0 is carried out. Otherwise, the decrease rate follows a relation proportional to the WSL: the more over-exposure, the greater the reduction of ef. The limits of this rate are given by:

$$ \frac{ef_{obj}}{ef} = \begin{cases} \dfrac{1}{(1-ST)\,\Delta EV_{dec}} & \text{if } wsl = 0 \\[6pt] \dfrac{1}{\Delta EV_{max}} & \text{if } wsl = 1 \end{cases} $$

∆EVdec is a parameter that adjusts the speed at which the algorithm can recover a suitable exposure from an over-exposure. If this parameter is high, the algorithm quickly leaves the OER, but it could generate too dark images; therefore, oscillation is likely. If the parameter is low, the reduction of the over-exposure may be slow.
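As a hypothetical worked example of the decrease step, using the settings later adopted in the evaluation (ST = 0.01, ∆EVdec = 1/0.9, ∆EVmax = 24): a measured wsl = 0.05 gives efobj/ef = [(1 − 0.05) · 0.9 + (0.05 − 0.01)/24]/(1 − 0.01) ≈ 0.87, i.e., the exposure function is reduced by about 13 %; a heavily over-exposed image with wsl = 0.5 would instead give a factor of ≈ 0.48, roughly halving the exposure.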

The last step consists of applying the new values of the actuators to the camera; this step is explained in the next subsection. Then, the next image is processed, returning to the beginning of the loop.

5.2.3 Actuation

The CEC uses the three mechanisms identified in Section 5.1.3. This step consists of transforming efobj into the three parameters that control the lens aperture, the exposure time and the electronic gain. We establish two priorities: i) to hold the lens as open as possible, and ii) to set the gain as low as possible. This alternative is expected to operate properly in surveillance applications, in which the object plane⁷ is usually far from the camera. Thus, the objects are likely to be in focus if the focus is at infinity, even if the lens aperture is large. Furthermore, the blurring caused by fast movements may be avoided because the exposure time can be kept low, and the low gain avoids unnecessary noise. Although these premises work for surveillance scenes, other variations can easily be implemented if the application constraints or the scene change.

The transformation is a piecewise function split into three parts depending on the parameter under control. Thus, in increasing order of Ef: for the lowest values, the aperture is controlled up to the largest lens opening while the exposure time and the gain are set to their lowest values; then, the exposure time is controlled up to its highest value; finally, when the lens is totally open and the exposure time is at its maximum, the gain is increased. Details of the transformation are shown in the function exposure2Actuator of Algorithm 5.3.

7 The object plane is the one that contains the objects to be captured.


Algorithm 5.3 Exposure variation functions of CEC.

 1: function incExposure(idwsl, ef)
 2:    if idwsl < LEV − 1 − OER then
 3:       if (LEV − 1 − OER)/idwsl < ∆EVmax then ▷ Limit the increment up to a max
 4:          efobj ← ((LEV − 1 − OER)/idwsl) · ef
 5:       else
 6:          efobj ← ∆EVmax · ef
 7:       end if
 8:    end if
 9:    g, T, N ← exposure2Actuator(efobj)
10:    return g, T, N
11: end function

12: function decExposure(idbsl, wsl, ef)
13:    if idbsl > LEV − 1 − OER then ▷ Total over-exposure
14:       efobj ← ef0 ▷ Default value
15:    else
16:       efobj ← [(1 − wsl)/∆EVdec + (wsl − ST)/∆EVmax]/(1 − ST) · ef
17:    end if
18:    g, T, N ← exposure2Actuator(efobj)
19:    return g, T, N
20: end function

21: function exposure2Actuator(ef)
22:    if ef < Tmin/(1 + Nmin²) then ▷ Aperture control
23:       g ← 1
24:       T ← Tmin
25:       N ← √((Tmin/ef) − 1)
26:    else if ef < Tmax/(1 + Nmin²) then ▷ Shutter control
27:       g ← 1
28:       T ← ef · (1 + Nmin²)
29:       N ← Nmin
30:    else ▷ Gain control
31:       g ← ef · (1 + Nmin²)/Tmax
32:       T ← Tmax
33:       N ← Nmin
34:    end if
35:    return g, T, N
36: end function
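The following Python transcription of exposure2Actuator is a sketch under the same priorities (iris first, then shutter, then gain); t_min, t_max and n_min stand for the camera limits Tmin, Tmax and Nmin, and the optional gain clamp is our addition, not part of Algorithm 5.3:

    def exposure_to_actuators(ef, t_min, t_max, n_min, g_max=None):
        """Map a target exposure function value to (gain, time, aperture),
        following the piecewise scheme of Algorithm 5.3."""
        if ef < t_min / (1.0 + n_min ** 2):
            # Aperture control: unity gain, shortest time, close the iris.
            g, t = 1.0, t_min
            n = ((t_min / ef) - 1.0) ** 0.5
        elif ef < t_max / (1.0 + n_min ** 2):
            # Shutter control: iris fully open, unity gain, adjust the time.
            g, n = 1.0, n_min
            t = ef * (1.0 + n_min ** 2)
        else:
            # Gain control: iris open and time at its maximum, raise the gain.
            t, n = t_max, n_min
            g = ef * (1.0 + n_min ** 2) / t_max
            if g_max is not None:
                g = min(g, g_max)  # hypothetical clamp, not in the original
        return g, t, n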


5.3 experiments

To evaluate the performance of the CEC, we conducted an experiment comparing its operation in an outdoor surveillance scene with another SoA AE algorithm. The scene was the one used to create the Parking dataset (Section C.2). The details are explained in Section 5.3.2. Beforehand, Section 5.3.1 identifies a set of indicators that serve to measure the goodness of the algorithm according to the initial requirements. The results are shown in Section 5.3.3: we provide tables and graphs that depict the indicators, besides some image samples for visual assessment. Attending to the obtained results, some considerations are exposed in Section 5.3.4.

5.3.1 Performance indicators

We defined some indicators to evaluate whether the CEC accomplished the objectives and whether it outperformed the other AE algorithm.

First, the behaviour of the control variables was analysed via the mean and standard deviation of the images during the time that the algorithms were operating. The mean provided an idea of the proximity to a particular objective, whereas the standard deviation was related to the stability of the algorithm with regard to each parameter, that is, the capability of holding a constant value along time. The lower the standard deviation, the better.

Furthermore, we defined three parameters that cope with the three priorities indicated in Section 5.2.2.

Definition 5.7. The over-exposure time rate OEt is the time that the WSL is larger than 3 %⁸ divided by the total time:

$$ OE_t = \frac{T_{wsl}}{T_T} \qquad (5.9) $$

where Twsl is the time such that WSL > 3 %, and TT is the total time.

Definition 5.8. The under-exposure time rate UEt is the time that the BSL is larger than 3 % divided by the total time:

$$ UE_t = \frac{T_{bsl}}{T_T} \qquad (5.10) $$

where Tbsl is the time such that BSL > 3 %, and TT is the total time.

Definition 5.9. The distance to maximum contrast dC in a time interval is the mean of the Euclidean distances of the contrast to the established maximum value Cmax:

$$ d_C = \frac{1}{T_T} \sum_{t=0}^{T_T} \left| Cr(t) - C_{max} \right| \qquad (5.11) $$

where TT is the total time.

For these three parameters, the lower the value, the better. The following details the conducted experiments.
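A minimal sketch of these indicators, assuming the control variables were sampled at equally spaced instants so that time rates reduce to sample rates (the function names are ours):

    import numpy as np

    def performance_indicators(wsl, bsl, cr, c_max=100.0, thr=0.03):
        """OEt, UEt and dC (Definitions 5.7-5.9) over sampled sequences."""
        wsl, bsl, cr = np.asarray(wsl), np.asarray(bsl), np.asarray(cr)
        oe_t = float(np.mean(wsl > thr))          # over-exposure time rate (Eq. 5.9)
        ue_t = float(np.mean(bsl > thr))          # under-exposure time rate (Eq. 5.10)
        d_c = float(np.mean(np.abs(cr - c_max)))  # distance to max contrast (Eq. 5.11)
        return oe_t, ue_t, d_c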


Figure 5.11: The timeline of the CEC evaluation, alternating between processing the CEC and evaluating it against the previous SoA settings. Along the grey (wait) parts, the algorithm is not operating but waiting for the change in the camera parameters.

5.3.2 Evaluation strategy

The experiments were conducted in the scene described in Section C.2 using the same camera. The camera operated 24 hours a day for 7 days. Along that time, the weather conditions were diverse, from very sunny hours to rainy hours with dark light. Except for the last hours of the evening, in which the scene was illuminated by lampposts, the only light source was the sun. The camera had a night mode enabled; nevertheless, the lampposts were turned off at 22:00 h, and not enough light reached the camera afterwards, so the data collected during that time were not considered in the evaluation. The camera was controlled by POST [28] request methods. Due to the possible network latency and the lack of acknowledgement from the camera, we defined a guard time of 3 s between each control iteration with the camera. The reason for this guard time was to wait long enough to ensure that the parameters we sent were really set within the camera. We assumed that no light variation is produced in the scene along the guard time.

The AE algorithm was based on the work of Liu et al. [68]. The entire FoV was used as a unique region. As the reference bright Brobj, we set 1/3 of the maximum intensity value. This is a common value used in Photography, being a trade-off between having sufficient light and avoiding over-exposure. The algorithm set the Ef that reached the objective bright based on its current value:

$$ ef_{t+1} = \frac{Br_{obj}\, ef_t}{Br_t} \qquad (5.12) $$

where the subscript t indicates time. The computation of the actuators was the same as in the CEC.
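For illustration, one iteration of this baseline controller reduces to a single proportional update (a sketch with our own naming, assuming 8-bit intensities so that Brobj = 255/3):

    def baseline_ae_step(ef, br, br_obj=255.0 / 3.0):
        """Reference AE update (Eq. 5.12): scale Ef towards the target bright."""
        return br_obj * ef / br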

The execution of the experiment consisted of alternating between both algorithms, following the timeline shown in Figure 5.11. First, the camera settings determined via the CEC were sent to the camera; the wait interval was the guard time. Once the guard time finished, the indicators were calculated and the CEC algorithm determined the next settings. After a new guard time, the new indicators were calculated and stored for the evaluation. At the end, the camera settings obtained via the SoA AE were sent to the camera, and the process was repeated with this algorithm. This process was, in turn, repeated until the end of the experiment. During it, the values shown in Table 5.2 were used. In addition, background subtraction was not necessary because the traffic in the parking lot was light; therefore, the FG did not have a large influence on the indicators.

Having described the evaluation strategy, its results are detailed next.

8 We consider this value a trade-off between the existence of highlights due to specular surfaces, which do not severely affect Computer Vision algorithms, and a high portion of over-exposed pixels that may induce unmanageable errors. Nevertheless, other values may be used in different scenarios.


  symbol    value
  ST        1 %
  Cmax      100
  OER       10
  ∆EVmax    24
  ∆EVdec    1/0.9

Table 5.2: The values of the CEC settings used in the evaluation.

5.3.3 Results

From the seven days of running, two significant days were selected. The weather on day one was mostly sunny with some cloudy intervals, which generated sudden light changes. On day two, the weather suffered large variations: rain, clouds and sun alternated during the daylight.

Figure 5.12 shows several examples of images taken during the operation of both algorithms in the most significant hours and conditions. Except at night, the behaviour of the algorithms was very different. At dawn and dusk, the hours in which the sun incidence changes fastest, our algorithm generated brighter images. Our algorithm also outperformed the alternative when there were harsh shadows: due to the high contrast of this condition, the alternative held the average value at the expense of saturating part of the image. When it rained, the behaviour was similar; the increase of the specular reflectance of the surfaces due to the water also generated over-exposure in the alternative method.

The plots of the evolution of the control variables (Figures 5.13–5.16) endorse these signs. These graphs depict the evolution of the variables from dawn (t = 0) until night. At t ≈ 5 × 10⁴ the night mode of the camera was enabled; thus, an edge can be observed at the right of all the plots. As a general observation, there was a rotation regarding which algorithm presented the highest value of the bright and the contrast. At the beginning, corresponding to the morning, the SoA AE presented a higher value; at the central time these values were similar, and the CEC overtook the SoA AE in the evening. It seems that the SoA AE was not able to hold an optimal contrast at dawn. The analysis of the WSL plots (Figure 5.14a and Figure 5.16a) provides more details regarding the behaviour of the algorithms. The time interval in which the bright and contrast of the SoA AE were higher corresponded to an over-exposure, in which the WSL even exceeded 5 %; this interval was related to the examples of Figure 5.12g and Figure 5.12h. The higher contrast in the SoA AE was explained by harsh shadows. Our algorithm sacrificed the contrast with the aim of avoiding the over-exposure. The WSL plots also demonstrate that the CEC held the target (1 %) until the last hours of the evening. As regards the BSL, the profile of the plots was the inverse of the WSL.

One of the dissimilarities between the plots of the two days was the increase of the oscillations of all the variables for both algorithms on the second day. This was explained by the instability of the weather conditions.

Table 5.3 shows the mean and the standard deviation of these plots. Although the weather conditions were very different on both days, the dissimilarities of the means were almost insignificant. Nevertheless, the increase of the standard deviation of both algorithms from day one to day two is noticeable. As previously commented, this increase was due to the high variation of the weather conditions on day two.


Figure 5.12: Visual comparison of the AE methods: (a) SoA at dawn; (b) CEC at dawn; (c) SoA at dusk; (d) CEC at dusk; (e) SoA at night; (f) CEC at night; (g) SoA with harsh shadows; (h) CEC with harsh shadows; (i) SoA in rain; (j) CEC in rain.


Figure 5.13: Comparison of the evolution of the (a) bright and (b) contrast for the CEC and SoA algorithms on a sunny day with alternating clouds.

Except for the mean of the contrast on day one and the standard deviation of the bright, our algorithm outperformed the alternative. The standard deviations were lower in the remaining cases, which indicates greater stability. This was especially noticeable in the contrast. The reduction in the WSL was also noteworthy, which was consistent with the design of the algorithm.

Regarding the under- and over-exposure time rates (Table 5.4), the CEC outperformed the SoA AE algorithm (by up to 15 times for OEt). The distance to the maximum contrast was lower for the SoA AE, but the difference is almost insignificant.

5.3.4 Discussion

The results showed that the CEC algorithm provided an outstanding exposure during the most difficult light conditions of the experimental environment. These conditions were the light at dusk, at dawn and under harsh shadows. This fact was demonstrated via visual assessment, the evolution of the control variables and the performance indicators (Section 5.3.1).


(a) Sunny with clouds alternation

            SoA             CEC
         µ       σ       µ       σ
  WSL    1.37    1.62    0.87    0.36
  BSL    0.84    1.02    0.77    0.94
  Br     89.17   13.94   94.85   20.02
  Ct     45.14   13.47   43.74   7.44

(b) Unstable weather conditions

            SoA             CEC
         µ       σ       µ       σ
  WSL    1.34    1.70    0.88    0.66
  BSL    1.60    7.61    1.09    1.07
  Br     84.71   16.30   92.81   22.38
  Ct     40.61   11.87   42.11   9.42

Table 5.3: Mean and standard deviation of the control variables over two days for the CEC and the classic AE methods.

(a) Sunny with clouds alternation

         SoA     CEC
  OEt    8.67    0.55
  UEt    5.94    3.54
  dC     54.86   56.26

(b) Unstable weather conditions

         SoA     CEC
  OEt    16.85   1.53
  UEt    3.15    2.90
  dC     59.39   57.89

Table 5.4: Time rate of unsuitable exposure and average distance to the maximum contrast over two days for the CEC and the classic AE methods.


Figure 5.14: Comparison of the evolution of the (a) WSL and (b) BSL for the CEC and SoA algorithms on a sunny day with alternating clouds.

Furthermore, although the mean of the contrast was similar for both algorithms, the CEC proved to be more stable, as it did for the remaining variables. Finally, it was demonstrated that the use of the average intensity as the only control variable is not enough to cover a wide range of lighting conditions.


Figure 5.15: Comparison of the evolution of the (a) bright and (b) contrast for the CEC and SoA algorithms during unstable weather conditions. Rainy, cloudy and sunny weather alternated during the experiment.


Figure 5.16: Comparison of the evolution of the (a) WSL and (b) BSL for the CEC and SoA algorithms during unstable weather conditions. Rainy, cloudy and sunny weather alternated during the experiment.


5.4 conclusions

In this chapter, we highlight the necessity of ensuring a correct exposure in Computer Vision applications. Working at the boundaries of the range of the camera can generate misleading information due to either camera noise or pixel saturation. We define the AE workflow and analyse several techniques for each step, which can be combined. We detect a lack of information regarding the actuation algorithms, likely due to the difficulty of establishing a general-purpose algorithm.

Besides the known bright and contrast, the contributions of the proposed method comprise the definition of variables that characterise the proximity to the unsuitable exposure operating zones (idwsl and idbsl) and that provide a measurable value of the degree of under- and over-exposure (BSL and WSL, respectively). They allow determining quasi-optimal camera actuators that ensure a fast convergence to a suitable exposure.

The CEC algorithm has proven to be robust in the scenario of the experiments. Thus, the CEC performs properly under uneven light conditions, and it is capable of avoiding wrong exposures, especially when there are harsh shadows, where other algorithms fail. However, its reliability has not been demonstrated under special weather conditions such as snow –which increases the specularity of the surfaces– or fog –where most Computer Vision algorithms do not operate properly–. Furthermore, we do not know its performance in other kinds of scene, such as indoors. Fortunately, we expect the algorithm to be sufficiently flexible to function in other scenes by tuning the CEC settings (Table 5.1).

To conclude, the proposed algorithm provides a fast and stable AE method that maintains an optimal exposure under the premises of highest contrast and low under- and over-exposure times, at least in outdoor video surveillance applications.


6 CONCLUSIONS

This thesis makes substantial contributions to the correction of colour variations in images under uncontrolled lighting conditions. Although significant prior efforts have been made in this field, the results are extended towards completely solving the problems that lighting variations cause in existing Computer Vision systems. This chapter highlights the contributions that assist in answering the research questions exposed in Chapter 1.

Future research lines may be derived from this thesis. These research lines are presented at the end of the chapter.

6.1 contributions

From the results obtained in this thesis, the following contributions can be established. These contributions are classified according to the chapters of this dissertation.

6.1.1 Dynamic Image Formation Model

We define our own image formation model for dynamic scenarios (DIFM, Chapter 2), in which the photometric conditions change. The proposal is not completely novel; rather, it involves a combination of SoA approaches, so the scientific contribution is not significant. Nevertheless, this model is necessary for the development of the proposed methods, and no other existing formulation satisfies our requirements. The DIFM is complete, simple, feasible and modular:

complete. The DIFM accounts for a great number of photometric conditions and scenarios: Lambertian and specular surfaces, multiple light sources, and multi-camera environments. The supported cameras are those that work in the visible wavelengths, including those that do not have a linear response, under the assumptions that the sensor sensitivity is independent of the colour band (no Bayer pattern) and that the images are neither under- nor over-exposed. The dynamics refer to changes along time of the positions and intensities of the light sources, as well as of the positions of the objects and the camera settings.

simple. The DIFM relates the intensity variations of the light sources to the pixel intensity variations by means of a composition of two functions: a linear function plus a gamma function. As we demonstrated, the gamma function may be neglected or easily characterised, simplifying its application.

feasible. The DIFM is based on established, proven models.

modular. The DIFM considers the entire image formation process, from the light creation to the digital image, as a composition of blocks. The process that models each block can be modified without affecting the rest of the blocks; e. g., if a camera sensor had a different behaviour, only its corresponding module would be changed.


The model does not consider those elements that are not related to illumination, such as geometric distortion. It also neglects vignetting for simplicity; however, the inclusion of vignetting is straightforward. Furthermore, the model requires sufficient intensity so that the ambient light can be neglected. Fortunately, these assumptions are reasonable for the most common Computer Vision applications.

6.1.2 Quotient Relational Model of Regions

Using the DIFM, we demonstrate that, under certain circumstances and restrictions, a linear relationship describes the photometric variations between regions illuminated by the same light sources (QRMR, Chapter 3). This fact simplifies the algorithms that can exploit these relations, because a large number of mathematical methods are based on linearity. One of the possible applications is the prediction of the intensity variation of a region from the measured variations of the surrounding regions, which can be achieved via LS methods. Therefore, the response to the first research question (Can we forecast the changes in the intensity of a region using the variations of the surrounded regions?) is affirmative; however, the response is limited to some specific boundaries. Based on a theoretical analysis (Section 3.1), the established hypothesis is valid if: i) the light sources that illuminate the response region also illuminate at least one explanatory region; ii) the responses of the surfaces exist in all of the colour bands; iii) the normals of the surfaces of the response region are similar to any of the normals of the explanatory surfaces; and iv) the distances between the light sources and the regions are much greater than those between regions. Furthermore, the experiments (Section 3.4) demonstrate that the existence of occlusions, shadows, inter-reflections, very specular surfaces, and camera sensors operating close to their limits may invalidate the forecasting as well. In real cases, these effects can hardly be avoided. Fortunately, if the effects are not strong, the proposed solution can handle them by means of outlier removal techniques, such as RR methods, for the QRMR estimate. Furthermore, highly correlated explanatory regions are also present in some scenes. The selection of suitable explanatory regions that model the response region behaviour is important to avoid this multi-collinearity situation, while still collecting a sufficient number of photometric variations. However, avoiding multi-collinearity is not a high-priority issue, because the experiments demonstrated that this effect barely influences the results, at least in the evaluated dataset. The experiments also demonstrated that the goodness of the estimates and the corrections does not depend on the colour band.

6.1.3 Linear Correction Mapping

The results obtained in Chapter 4 help to answer the second and third research questions (Can we use the previous forecasting results to properly correct the photometric variations of unknown objects in a camera? Can we extend the correction to a multi-camera architecture?). Several terms of these questions should be addressed in detail.

The QRMR is capable of forecasting the intensity variations produced in unknown objects in a camera from the information of the surrounding BG regions. To determine the correct parameters of the model, we require a training process. This training is done by using multiple samples that include the highest number of photometric variations of all of the selected regions. Using this knowledge, the proposed correction method (LCM) compensates the images for the photometric variations in multiple scenarios; therefore, the LCM holds a similar appearance of the objects under such changes. The method is able to correct those changes that have been trained. Nevertheless, the experiments demonstrate that, although the correction is likely not optimal in non-trained situations, it is sufficient to outperform two other SoA algorithms on the MCDL dataset. Regardless, the higher the number of photometric variations included in the training, the better the method is able to correct the objects.

The second research question involves the phrase "properly correct". Nevertheless, this adjective is subjective; thus, the answer depends on what is understood by properly. Ideally, properly means that the correction returns an image equal to a reference image. Nevertheless, this situation is unrealistic because of the dissimilarities in the perspectives and poses of the obtained images. Thus, the indicator of goodness used is that the distance between images of the same object must be as low as possible, while the distance between images of different objects is as high as possible. In this thesis, we define a distance based on the EMD of the image histograms.

Considering the previous comments, the answer to the third question is affirmative, because the LCM performs a correction mapping between non-overlapping cameras. The only requirement is that several samples of the same object in all of the cameras are available for the training.

Once the training is complete, information regarding the target objects is not required for the scene. However, the method requires previous knowledge regarding the motion of the target objects, in addition to the position and orientation of the BG, to make a suitable selection of the explanatory regions; i. e., the selected regions must satisfy the identified constraints of the QRMR (related to the surface normals, the distance to the light sources, and so on). Also, not every scene is appropriate for applying the LCM. For example, crowded scenes do not enable the discovery of the explanatory regions; regions composed of very specular surfaces generate too many outliers; and low-quality image sensors produce noisy images or non-linear behaviours.

6.1.4 Camera Exposure Control

The last set of research questions (Can we ensure a well-exposed image capture by controlling its acquisition process? If so, in which conditions?) is implicitly answered in Chapter 5. Before providing the response, the concept of a well-exposed image must be clarified. Most of the exposure-metering systems are based on measuring the digital intensity values of spots or regions to compare them with a target value that is within the dynamic range of the camera. Nevertheless, the scientific community, camera manufacturers and related professionals do not agree on a standard definition. We establish that an image is well exposed when it is neither underexposed, i. e., the quantity of dark noise is negligible compared with the quantity of light collected in the image, nor overexposed, i. e., the incoming light is collected without exceeding the maximum pixel intensity, and its histogram is as wide as possible for the captured scene. This definition mostly ensures that Computer Vision applications will not malfunction due to illumination. Nevertheless, this combination is not always obtained, due to the lighting conditions or the camera quality. Based on these premises, the optimal exposure is reached when the camera is working within its dynamic range and the contrast of the image is at its maximum. To reach this state, we define a set of indicators capable of providing quantitative information regarding the exposure in addition to the average value used in common approaches.

The CEC is designed to reach the optimal exposure by actuating the shutter, iris and electronic gain of the camera. The highest priority of the CEC is to avoid overexposure, because the information contained in overexposed pixels is lost. In addition, the CEC enlarges the SNR by avoiding low intensity values while maximising the contrast, providing a stable luminosity.

A standard evaluation of AE algorithms does not exist. Thus, we define several performance indicators (Section 5.3.1) and an evaluation procedure (Section 5.3.2) that allow for the comparison of any technique for video acquisition with static cameras. In this evaluation, the proposed technique outperforms a widely used algorithm, especially in unfavourable lighting conditions, where other algorithms have their drawbacks. The results of the experiments allow for answering the research question affirmatively.

Regarding the conditions, we demonstrate that the method is reliable in the scenario in which it was tested, namely, a surveillance parking lot where dissimilar lighting conditions are produced: several alternating weather conditions, including sunshine, cloud coverage and rain, and harsh shadows. The extension to any other outdoor scenario is expected to be straightforward. Nevertheless, the conditions in which these types of algorithms cannot work include those that exceed the dynamic range of the camera, due to an excessively low irradiance, an excessively high one, too many specular surfaces, or low-quality cameras.

6.2 future work

The results and conclusions of this thesis may lead to potential research lines:

automatic selection of regions. The performance of the LCM depends on a preliminary manual analysis in which the suitable selection of the explanatory regions is done. This analysis involves approximate knowledge of the possible positions of the regions and the light sources. Thus, an interesting field of research would be a segmentation algorithm that defines the optimal explanatory regions. The starting point would be the clustering of the zones of the image that have a similar photometric response. For example, Koppal and Narasimhan [58] observed that "the continuity (smoothness) of an appearance profile yields information about the derivatives of the BRDF of the scene point with regards to source direction." Moreover, Paruchuri et al. [81] use a k-means clustering method to extract the RoIs via their intensity variation. Furthermore, this segmentation algorithm could also extract the silhouettes of the FG, which are currently extracted manually.

extension of the difm. Although the proposed DIFM is sufficient for the purpose of this thesis, an extension could solve some of its constraints and therefore make the model more universal. This would include the automatic characterisation of non-linear processing within the camera [27, 38, 50] and an adequate compensation. The DIFM would also incorporate a complete noise model valid for low-quality camera sensors and underexposed situations. Furthermore, a model variation that considers the uneven spectral response of the camera bands may be investigated to delve into more realistic sensors.


comparison with illuminant invariant methods. The proposed correction method is a valuable tool to improve the performance of Computer Vision applications. However, some of these applications handle the illumination variations by means of, e. g., illuminant invariant descriptors, such as the Scale-Invariant Feature Transform (SIFT) [69]. It would be interesting to evaluate the goodness of the LCM in these types of applications.

analysis of other optimisation methods. In Chapter 3, besides the LS methods, other techniques are mentioned, such as RANSAC, Linear Programming or Convex Optimisation. These methods were not evaluated because doing so is beyond the scope of this thesis and the obtained results are satisfactory; however, these methods could be helpful in certain scenarios. One of the possible techniques would be a multilevel regression method [39]. In this case, the photometric relation between regions would be seen as a hierarchical statistical problem, in which light sources and explanatory regions would form two related levels. These types of models are not necessary in simple scenarios with a few light sources and regions. Nevertheless, they seem helpful in complex scenarios where, for example, a light source does not radiate to every explanatory region. Using these models, light sources would be linked only to those regions that receive their radiation, and the explanatory regions would be cross-classified because they could receive radiation from several light sources. This method could produce more reliable estimates of the response region variations.

parameters autoconfiguration for the cec. In Section 5.4, we indicate that this method is flexible enough to function in situations other than the evaluated ones. To cope with new scenarios, the algorithm parameters (Table 5.1) should be changed. It would be interesting to design a method capable of determining the optimal parameters based on the type of scene. In this line, Yuan and Sun [120] proposed a region classification based on graph-based segmentation and on joining the connected areas into the same exposure zone. Similarly, the use of the entropy could be a good indicator of the complexity of the light distribution in certain scenes. Both approaches could be the starting points for extending the capabilities of the CEC.


Part II

APPENDICES


A NOTATION CONVENTIONS

This appendix lists the notation conventions followed in this dissertation. Regarding the mathematical syntax, Table A.1 shows the symbols used. The symbols of the physical magnitudes are listed in Table A.2. Finally, Table A.3 gathers other symbols used in the expressions.


syntax                definition

∗                     convolution
≡                     is equal to
≈                     is approximately equal to
≐                     is defined as
∝                     is proportional to
≫                     much larger than
a                     scalar, random variable
|a|                   absolute value of a
a, β                  column vector, random vector, unidimensional signal process
ai                    i-th element of vector a
ai                    i-th vector in a sequence
aij                   (i, j)-th element of matrix A
ā                     average of a
ã                     median of a
â, â, Â               estimate of a, a, A
‖a‖2, ‖a‖             2-norm of a
A                     matrix
AT                    matrix transpose
A−1                   matrix inverse
|A|                   matrix determinant
n̂                     unit vector
diag(ai)              diagonal matrix whose (i, i)-th element is ai
Qr a ≐ a/ar           quotient difference of a by ar
δ                     Dirac delta
δ                     small offset or perturbation
δ/δx                  derivative with respect to x
∆                     ratio
N(µ, σ2)              Gaussian (or Normal) distribution
µ                     mean
σ                     standard deviation
σ2                    variance
var(·)                variance
cov(·)                covariance
E[·]                  expectation
arg minx(·)           the value of x that minimizes the expression
β                     regression coefficients vector
εi                    deviation error or residual for the i-th sample of a regression

Table A.1: Mathematical syntax.


symbol      [units]             meaning

A           [mm]                Aperture diameter
B                               Digital image
crf(·)                          Camera response function
dP          [m²]                Pixel size
Φ           [W]                 Radiant flux
E           [W·m−2]             Irradiance
E(λ)        [W·m−3]             Spectral irradiance
EV                              Exposure value
f           [mm]                Focal length
G           [s·e−1]             In-camera processing function
I                               Amount of electrons generated by a sensor pixel
L           [W·m−2·sr−1]        Radiance
L(λ)        [W·m−3·sr−1]        Spectral radiance
M           [pixels]            Sensor size
N                               Relative aperture, f-number
Ndc         [e]                 Dark current noise
Nq                              Quantization noise
Nr          [e]                 Read noise
Ns          [e]                 Shot noise
Q           [e/J]               Camera sensor sensitivity
S                               Image generated at the image sensor
T           [s]                 Exposure time
ηlens                           Optical transmittance
λ           [m]                 Wavelength
Ω           [sr]                Solid angle
φ           [W]                 Radiant flux
Ψ(·)                            Inverse function of the CRF
ρ                               Surface reflection coefficient, photometric response of a surface
γC                              Foreshortening factor

Table A.2: Physical magnitudes and conversion factors.


symbol              definition

A                   Area
b                   Refers to background (used as superscript)
C                   Number of cameras
C                   Camera
c                   Refers to camera (used as subscript)
d                   Distance
e                   Error
e                   Refers to specular (used as subscript)
Ef                  Exposure function
f                   Refers to foreground (used as superscript)
h                   Histogram
H                   Cumulative histogram
i, j                General indices
ic                  Refers to inter-camera
il                  Illumination state index
k                   Colour band index
LS                  Light source
n                   Location index
N                   Number of locations
NS                  Number of light sources
O                   Object or surface
o                   Refers to object or surface (used as subscript)
P                   Number of samples in a temporal process
q                   Values referred to a quotient
s                   Refers to light source (used as subscript)
sc                  Refers to single camera
rf                  Reference
r, rg               Region index
R                   Region
RG                  Number of regions
t                   Time
u ≡ (u, v)          Spatial location in a digital image
x ≡ (x1, x2, x3)    Spatial location in the real world
Γ                   Correction function
κ                   Correction factor
θ                   Angle

Table A.3: Symbol definitions and models.


B CHANGES IN THE ALBEDO WITHIN A FLAT SURFACE

Figure B.1: Diagram of the albedo at two points of a flat surface illuminated by the same light source.

In this appendix, the change in the albedo within a flat surface due to a light source is studied. As the IFM depends on the angle between the surface normal and the direction towards the light source, each point belonging to the same surface produces a different pixel intensity. Nevertheless, these variations can be neglected, as explained below.

Figure B.1 shows the diagram of the scene to be analysed. Given a light source located at Ps, the albedo at the point P1 of a flat surface whose normal is no is the following:

\cos\theta_1 = \mathbf{n}_o \cdot \mathbf{n}_{s1}    (B.1)

where ns1 is the unit vector of rs1 ≡ \vec{P_s P_1} ≡ (rs1,1, rs1,2, rs1,3) between Ps and P1. It is written as:

\mathbf{r}_{s1} \equiv \|\mathbf{r}_{s1}\|\,\mathbf{n}_{s1}, \qquad \|\mathbf{r}_{s1}\| = \sqrt{\sum_{i=1}^{3} r_{s1,i}^2}    (B.2)

For a point P2 of the same surface separated by ∆r (rs2 = rs1 + ∆r), the albedo is:

\cos\theta_\Delta = \mathbf{n}_o \cdot \mathbf{n}_{s2}    (B.3)

where ns2 is the unit vector of rs2 between Ps and P2, and it is written as:

\mathbf{r}_{s2} \equiv \|\mathbf{r}_{s2}\|\,\mathbf{n}_{s2}, \qquad \|\mathbf{r}_{s2}\| = \sqrt{\sum_{i=1}^{3} r_{s2,i}^2} = \sqrt{\sum_{i=1}^{3} (r_{s1,i} + \Delta r_i)^2}    (B.4)

The albedo of P2 can be rewritten as follows:

\cos\theta_\Delta = \mathbf{n}_o \cdot \mathbf{n}_{s2}
                  = \mathbf{n}_o \cdot \frac{\|\mathbf{r}_{s1}\|\,\mathbf{n}_{s1} + \Delta\mathbf{r}}{\|\mathbf{r}_{s2}\|}
                  = \frac{\|\mathbf{r}_{s1}\|}{\|\mathbf{r}_{s2}\|}\,\mathbf{n}_o \cdot \mathbf{n}_{s1} + \frac{\mathbf{n}_o \cdot \Delta\mathbf{r}}{\|\mathbf{r}_{s2}\|}    (B.5)

163

Page 192: Correction of the Colour Variations under Uncontrolled Lighting Conditions

164 changes in the albedo within a flat surface

Since no is perpendicular to ∆r by definition, no · ∆r = 0. Finally, the relation between the albedos at both points is the following:

\cos\theta_\Delta = \sqrt{\frac{\sum_{i=1}^{3} r_{s1,i}^2}{\sum_{i=1}^{3} (r_{s1,i} \pm \Delta r_i)^2}}\,\cos\theta_1    (B.6)

The symbol ± means that ‖rs2‖ can be greater or smaller than ‖rs1‖ depending on the location of the three points involved: if P2 is closer to Ps than P1, the sign is negative; otherwise, if P1 is closer, the sign is positive.

Note that, if the distance between the points of the surface is much smaller than the distance to the light source, the albedos can be approximated as equal:

\|\mathbf{r}_{s1}\| \gg \|\Delta\mathbf{r}\| \;\Rightarrow\; \cos\theta_\Delta \approx \cos\theta_1    (B.7)
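As a quick numerical check of (B.7), the following sketch evaluates both cosines for an assumed geometry (a source 5 m above the surface and two points 5 cm apart; the positions are hypothetical). The vectors are taken from the surface points towards the source, a sign convention that does not affect the ratio between the two cosines.

import numpy as np

n_o = np.array([0.0, 0.0, 1.0])          # surface normal
P_s = np.array([1.0, 1.0, 5.0])          # light source position (m)
P_1 = np.array([0.0, 0.0, 0.0])          # first surface point
P_2 = P_1 + np.array([0.05, 0.0, 0.0])   # second point, ‖∆r‖ = 5 cm in-plane

def cos_theta(P: np.ndarray) -> float:
    r = P_s - P                          # from the surface point to the source
    return float(n_o @ (r / np.linalg.norm(r)))

c1, c2 = cos_theta(P_1), cos_theta(P_2)
print(c1, c2, abs(c2 - c1) / c1)         # relative difference below 0.2 %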


C IMAGE DATASETS DESCRIPTION

Throughout this thesis, several experiments requiring image datasets were performed. The general requirements of these datasets involved varying lighting conditions and several regions that could be selected either from the BG or the FG. For that purpose, three datasets were created: i) Terrace, ii) Parking, and iii) MCDL¹. In addition, the MUCT² face database [75] was also used.

In this regard, we defined a scene as the contents of the FoV of a static camera. Thus, datasets i), ii) and MUCT have one scene each, whereas iii) has two scenes. Within each scene, several locations were defined according to the geometry of the scene and the needs of the experiment.

In this appendix, these image datasets are described.

c.1 terrace dataset

Figure C.1: Terrace dataset samples.

This dataset is composed of pictures of an outdoor scene in Madrid taken on several days in spring and autumn. Figure C.1 shows some sample pictures under different illumination.

Figure C.2 depicts the BG regions. The FG region is the red container on the ground, which is static. The object of the FG region is made of plastic and its surface may be considered Lambertian.

The light sources involved are the sunlight (on sunny, cloudy and rainy days) and a 400 W halogen light bulb, which was placed approximately 5 m from the FoV.

1 MCDL is publicly available at: https://www.researchgate.net/publication/264462014_MCDL_Dataset

2 MUCT stands for “Milborrow / University of Cape Town”


The camera used was an SLR Nikon D80 with a lens of 18–70 mm focal length and f/3.5–4.5 aperture. The camera settings were changed to cover as many situations as possible; hence, the exposure time, aperture, ISO, and white balance were altered during the acquisition.

Figure C.2: The regions of the Terrace scene are within the red border rectangles. Region #1 is on the left and region #2 on the right. Both surfaces are made of the same material (white paint), considered Lambertian. Their normals are virtually perpendicular. The FG region is the red container at the bottom.

The capture process was as follows. The camera took pictures of the scene with fixed settings in several sessions from sunrise until dusk. During each session, one of the camera settings was changed and the capture was repeated until images with all the defined settings combinations were recorded. To also account for the halogen light source, one of the photography sessions was done without sunlight. Those settings combinations that produced dark or burnt images were discarded.

As the objects are static, three pictures were taken in each capture. Thus, the zero-mean camera noise could be reduced by taking the average value of the three pictures. For each capture, both the RGB RAW image and the JPEG were stored. The total number of pictures for this dataset is 258. The image size is 3904 × 2616 for the RAW images, and 3872 × 2592 for the JPEG images.
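A minimal sketch of this averaging step is shown below; the file names and the 8-bit inputs are assumptions made for the example.

import numpy as np
from PIL import Image

# Averaging three captures of a static scene attenuates zero-mean sensor
# noise by a factor of roughly sqrt(3).
frames = [np.asarray(Image.open("capture_%d.jpg" % i), dtype=np.float32)
          for i in range(3)]
denoised = np.mean(frames, axis=0)       # per-pixel average of the captures
Image.fromarray(denoised.astype(np.uint8)).save("capture_avg.jpg")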

c.2 parking dataset

This dataset is composed of pictures of an outdoor parking lot in Madrid. The parking lot belongs to the Escuela Técnica Superior de Ingenieros de Telecomunicación³ of the Universidad Politécnica de Madrid⁴. Figure C.3 shows several sample pictures under different illumination. The pictures were recorded every 15 s during the daylight hours of two consecutive days in May and two more in June. The single light source is the sunlight, and the weather varied between sunny and cloudy.

Figure C.4 depicts the BG regions and the defined locations, called parking zones. Due to the homogeneity of the ground surface and its single orientation, there is a unique BG region, contained within the red lines. It is made of mostly homogeneous asphalt that is considered Lambertian. Although there are also some stains and cracks, they are not noticeable in the acquired images; thus, they did not affect the analyses. The FGs are the vehicles parked in the locations indicated by the blue boxes. We applied the algorithm proposed by Gálvez-del Postigo et al. [42] to filter out the moving vehicles and their shadows and to detect when a vehicle parked in or left any parking zone.

3 http://www.etsit.upm.es
4 http://www.upm.es


Figure C.3: Parking dataset samples.

Figure C.4: Regions and locations of the Parking scene. The BG region is the area within the red lines. FG regions belong to one of three different locations: locations #1, #2 and #3 are the areas at the top, in the middle and at the bottom, respectively.

With this information, we created a ground truth where each vehicle is identified with an ID and a position within the image for every frame. Regarding the surfaces, there are mainly two classes: i) painted metal, and ii) glass. Both have specular components (Chapter 2) and likely produce highlights. They have several orientations, but the most relevant normal is the one parallel to the ground, i. e., to the BG region.

The pictures were acquired by an AXIS P3346 network colour camera with the auto-exposure and auto-white balance options disabled. The image size is 2048 × 1536 and the pictures were stored in RGB JPEG. The total number of pictures for this dataset is 4144.

c.3 mcdl dataset

(a) Open doors. (b) Closed doors.

Figure C.5: MCDL dataset samples. Scene #1.

(a) Light variation #1. (b) Light variation #2.

(c) Switching on the lights.

Figure C.6: MCDL dataset samples. Scene #2.

This dataset is composed of two scenes in a multi-camera surveillance environment within the facilities of TNO⁵ in The Hague, The Netherlands. Figure C.5 and Figure C.6 show several sample pictures of these scenes.

5 http://www.tno.nl


Although both are corridors, their illuminations are very different. In this regard, two main differences may be distinguished. First, scene #1 has several doors on both sides of the corridor, whereas scene #2 has none. Second, scene #2 has a big window along the whole corridor, while in scene #1 the outdoor light comes through the open doors. Two light sources illuminate these scenes: i) sunlight, and ii) incandescent light bulbs. Scene #1 is mainly influenced by the light bulbs, whereas scene #2 is mainly affected by the sunlight.

The images were recorded during the daylight hours of a cloudy day in September. The illumination changed via opening and closing doors (in scene #1), via switching the indoor lights on and off (in scene #2), and via outdoor illumination variations due to changing cloud coverage. For both cameras, the variation of the average value of the pictures was greater than 10 %. Light changes caused by flickering effects could be neglected.

The FG regions are persons. Everybody was captured walking in two directions in both scenes. We took captures of nine persons who wore different shirts whose colours were well distributed across the RGB space. All the surfaces were Lambertian. Due to the nature of people's shape and motion, the distribution of the surface normals is almost random; in other words, we could not determine any value a priori. An example picture of each subject is shown in Figure C.7. People outlines were manually segmented to obtain the regions. The people's shadows were also manually segmented and discarded.

Three locations were defined for each scene. They can be seen in Figure C.8. We created a ground truth in which each frame was labelled with the person and their location. Each location was split into six BG regions.

The pictures were acquired by two The Imaging Source DFx 31BF03 FireWire colour cameras with the auto-exposure and auto-white balance options disabled. The image size is 1024 × 768 and the pictures were stored in RGB JPEG. The total number of pictures for this dataset is 288.


Figure C.7: People samples in the MCDL dataset.


(a) Scene #1.

(b) Scene #2.

Figure C.8: Regions and locations of the MCDL dataset. The subscripts indicate the location and the superscripts the region. The labels of location #1 of scene #2 are omitted. There are two kinds of surfaces: i) 1–3, made of white paint, and ii) 4–6, made of ceramic tiles. Both may be considered Lambertian. Their normals are virtually perpendicular.


c.4 muct dataset

This dataset [75] is composed of 3755 faces of 624 people. Up to 5 camera views and 10 different lighting setups were used (Figure C.9)⁶.

Figure C.9: Face samples of the MUCT database extracted from MUCT webpage.

The FG regions are faces. Due to their nature, the distribution of the surface normals is almost random. The BG is the upper part of the picture (Figure C.10), which is a flat coloured wall. This region is close to the faces and its normal depends on the camera view.

Figure C.10: The BG region is within the red border rectangle. The surface is considered Lambertian and its normal depends on the camera view. The FG region contains the face.

In this dissertation, a subset of the whole dataset was used. Up to 6 sample faces of 10 persons of different gender, age, lighting conditions, and camera view were selected. The total number of pictures is 58. The image size is 480 × 640 and the pictures were stored in RGB JPEG.

6 For further details, see: MUCT Details, The MUCT Face Database (consulted 08/2014), http://www.milbo.org/muct/muct-details.html


D EXTENDED RESULTS

This appendix includes tables and graphs of results that complement those within the dissertation. The statistical indicators are defined in Section 3.4.1, the regression methods in Section 3.4.2, and the datasets in Appendix C.


d.1 terrace dataset

(a) LSM-R. (b) NLSM-R.

Figure D.1: q̂f vs qf for RAW images of the Terrace dataset. The three colour bands are shown.

(a) PCAM. (b) PLSM.

Figure D.2: (cont.) q̂f vs qf for RAW images of the Terrace dataset. The three colour bands are shown.

(a) NLSM-R. (b) PCAM.

Figure D.3: Residuals histogram per band of the RAW pictures of the Terrace dataset (ext.).

(a) LSM. (b) LSM-R. (c) NLSM-R.

Figure D.4: q̂f vs qf for JPEG images of the Terrace dataset. The three colour bands are shown.

(a) PCAM. (b) PLSM.

Figure D.5: (cont.) q̂f vs qf for JPEG images of the Terrace dataset. The three colour bands are shown.

(a) NLSM-R. (b) PCAM.

Figure D.6: Residuals histogram per band of the JPEG pictures of the Terrace dataset (ext.).

(a) NLSM-R. (b) PCAM. (c) PLSM.

Figure D.7: q̂f vs qf for γ-JPEG images of the Terrace dataset (ext.). The three colour bands are shown.

(a) NLSM-R. (b) PCAM.

Figure D.8: Residuals histogram per band of the γ-JPEG pictures of the Terrace dataset (ext.).

                 LSM      LSM-R     NLSM     PCAM     PLSM

MSE (×10−6)     329.2     350.9    329.2    329.3    329.3
MR (×10−6)        0.0    4266.7     -0.0     -0.0     -0.0
SVR (×10−9)      73.3      73.0     73.3     72.5     72.5
MRC (×10−6)      -4.4      -9.6     -4.4     -0.3     -0.3
MERC (×10−6)    159.5     769.2    159.5    161.2    161.2
R²             0.9993    0.9992   0.9993   0.9993   0.9993
t (β0)          7.328     3.135    7.328    7.724    7.724
t (β1)         29.217    30.575   29.217   28.259   28.259
t (β2)         31.705    28.305   31.705   32.655   32.655

Table D.1: Regression statistics for the RAW pictures of the Terrace scenario. Green band.

                 LSM      LSM-R     NLSM     PCAM     PLSM

MSE (×10−6)      60.8      64.4     60.8     65.5     65.5
MR (×10−6)        0.0    1731.3      0.0     -0.0     -0.0
SVR (×10−6)       1.4       1.6      1.4      2.1      2.1
MRC (×10−6)      -3.0      -3.2     -3.0     -0.3     -0.3
MERC (×10−6)     76.5     256.1     76.5     86.5     86.5
R²             0.9997    0.9997   0.9997   0.9997   0.9997
t (β0)         -1.035    -3.590   -1.035    5.278    5.276
t (β1)         74.979    70.535   74.979   56.652   56.658
t (β2)         52.371    53.071   52.371   66.019   66.016

Table D.2: Regression statistics for the RAW pictures of the Terrace scenario. Blue band.

                 LSM      LSM-R     PLSM

MSE (×10−6)     329.5     554.7    330.3
MR (×10−6)        0.0    7853.6     -0.0
SVR (×10−9)      14.4      95.7     16.9
MRC (×10−6)      11.6      18.8     12.5
MERC (×10−6)    840.4    2909.7    838.9
R²             0.9995    0.9991   0.9995
t (β0)          1.846   -16.091   -0.001
t (β1)         54.980    81.161   58.601
t (β2)         68.179    13.243   64.417

Table D.3: Regression statistics for the JPEG pictures of the Terrace scenario. Green band.

                 LSM      LSM-R     PLSM

MSE (×10−6)     374.0     625.5    375.9
MR (×10−6)       -0.0    7744.7     -0.0
SVR (×10−9)      34.4     174.7     43.1
MRC (×10−6)      28.3      34.2     29.3
MERC (×10−6)    875.1     299.5    865.4
R²             0.9994    0.9990   0.9994
t (β0)         17.260   -18.736   14.477
t (β1)         56.100    85.672   61.511
t (β2)         73.794    14.884   68.045

Table D.4: Regression statistics for the JPEG pictures of the Terrace scenario. Blue band.

                 LSM      LSM-R     PLSM

MSE (×10−3)    10.066    25.269   10.116
MR (×10−6)       -0.0   57377.0     -0.0
SVR (×10−6)      19.3     108.4    -19.3
MRC (×10−6)    -807.0   -1990.7   -826.4
MERC (×10−3)      7.5     120.8      6.8
R²             0.9993    0.9982   0.9993
t (β0)         23.947    -2.369   21.652
t (β1)         60.331    82.981   65.597
t (β2)         84.473     7.097   78.848

Table D.5: Regression statistics for the γ-JPEG pictures of the Terrace scenario. Green band.

                 LSM      LSM-R     PLSM

MSE (×10−3)     7.035    15.780    7.233
MR (×10−6)       -0.0   38695.0      0.0
SVR (×10−6)      10.9     160.5     11.2
MRC (×10−6)    -750.4    -904.8   -756.8
MERC (×10−3)      0.0      72.0     1.54
R²             0.9995    0.9990   0.9995
t (β0)         26.062    -3.616   30.962
t (β1)        104.310   116.809   90.126
t (β2)         99.015    17.797  110.391

Table D.6: Regression statistics for the γ-JPEG pictures of the Terrace scenario. Blue band.

d.2 parking dataset

(a) LSM. (b) LSM-R.

Figure D.9: q̂f vs qf for location #2 of the Parking dataset using (a) LSM and (b) LSM-R. The three colour bands are shown. The black line is the identity line.

(a) Red band. (b) Green band. (c) Blue band.

Figure D.10: Residuals histogram per band of location #2 of the Parking dataset. The three colour bands are shown in separate graphs. LSM applied.

(a) Red band. (b) Green band. (c) Blue band.

Figure D.11: Residuals histogram per band of location #2 of the Parking dataset. The three colour bands are shown in separate graphs. LSM-R applied.

(a) Red band. (b) Green band. (c) Blue band.

Figure D.12: Residuals histogram per band of location #3 of the Parking dataset. The three colour bands are shown in separate graphs. LSM applied.

(a) Red band. (b) Green band. (c) Blue band.

Figure D.13: Residuals histogram per band of location #3 of the Parking dataset. The three colour bands are shown in separate graphs. LSM-R applied.

                 LSM     LSM-R

Red     β0    0.3022    0.1888
        β1    0.7640    0.8323
Green   β0    0.2269    0.1285
        β1    0.8210    0.8915
Blue    β0    0.2056    0.1344
        β1    0.8264    0.8796

Table D.7: Estimated regressors for location #2 of the Parking dataset.

                 LSM     LSM-R

Red     β0    0.2370    0.1619
        β1    0.8085    0.8554
Green   β0    0.3039    0.1607
        β1    0.7497    0.8409
Blue    β0    0.3406    0.1944
        β1    0.7217    0.8118

Table D.8: Estimated regressors for location #3 of the Parking dataset.

d.3 mcdl dataset

(a) NLSM. (b) PLSM.

Figure D.14: Residuals histogram per band of camera #1 and location #1 of the MCDL dataset (ext.).

(a) NLSM. (b) PLSM.

Figure D.15: q̂f vs qf for camera #1 and location #2 of the MCDL dataset (ext.). The three colour bands are shown. The black line is the identity line.

(a) LSM. (b) LSM-R. (c) NLSM.

Figure D.16: Residuals histogram per band of camera #1 and location #2 of the MCDL dataset.

(a) PCAM. (b) PLSM.

Figure D.17: (cont.) Residuals histogram per band of camera #1 and location #2 of the MCDL dataset.

(a) Red band. (b) Green band. (c) Blue band.

Figure D.18: Distribution of qf vs qbi (1 ≤ i ≤ 6) for the three bands of location #3 of camera #1 of the MCDL dataset.

(a) LSM. (b) LSM-R. (c) PLSM.

Figure D.19: q̂f vs qf for camera #1 and location #3 of the MCDL dataset. The three colour bands are shown. The black line is the identity line.

(a) NLSM. (b) PCAM.

Figure D.20: (cont.) q̂f vs qf for camera #1 and location #3 of the MCDL dataset. The three colour bands are shown. The black line is the identity line.

(a) NLSM. (b) PCAM.

Figure D.21: Residuals histogram per band of camera #1 and location #3 of the MCDL dataset (ext.).

(a) NLSM. (b) PCAM.

Figure D.22: q̂f vs qf for camera #2 and location #1 of the MCDL dataset (ext.). The three colour bands are shown. The black line is the identity line.

(a) LSM. (b) LSM-R. (c) PLSM.

Figure D.23: Residuals histogram per band of camera #2 and location #1 of the MCDL dataset.

(a) NLSM. (b) PCAM.

Figure D.24: (cont.) Residuals histogram per band of camera #2 and location #1 of the MCDL dataset (ext.).

(a) Red band. (b) Green band. (c) Blue band.

Figure D.25: Distribution of qf vs qbi (1 ≤ i ≤ 6) for the three bands of location #2 of camera #2 of the MCDL dataset.

(a) NLSM. (b) PCAM.

Figure D.26: q̂f vs qf for camera #2 and location #2 of the MCDL dataset (ext.). The three colour bands are shown. The black line is the identity line.

(a) LSM-R. (b) NLSM. (c) PCAM.

Figure D.27: Residuals histogram per band of camera #2 and location #2 of the MCDL dataset (ext.).

(a) LSM. (b) LSM-R. (c) NLSM.

Figure D.28: q̂f vs qf for camera #2 and location #3 of the MCDL dataset. The three colour bands are shown. The black line is the identity line.

(a) PCAM. (b) PLSM.

Figure D.29: (cont.) q̂f vs qf for camera #2 and location #3 of the MCDL dataset. The three colour bands are shown. The black line is the identity line.

(a) LSM. (b) LSM-R. (c) NLSM.

Figure D.30: Residuals histogram per band of camera #2 and location #3 of the MCDL dataset.

(a) PCAM. (b) PLSM.

Figure D.31: (cont.) Residuals histogram per band of camera #2 and location #3 of the MCDL dataset.

         R1        R2        R3        R4        R5        R6

R1        1    0.6791    0.6282    0.5738    0.1658   -0.2137
R2   0.6791         1    0.9666    0.8798    0.4610   -0.1361
R3   0.6282    0.9666         1    0.8728    0.4338   -0.1028
R4   0.5738    0.8798    0.8728         1    0.6840    0.1368
R5   0.1658    0.4610    0.4338    0.6840         1   -0.0754
R6  -0.2137   -0.1361   -0.1028    0.1368   -0.0754         1

Table D.9: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #1 of camera #1 for the MCDL dataset. Red band.

         R1        R2        R3        R4        R5        R6

R1        1    0.5647    0.5066    0.3007    0.0524   -0.1148
R2   0.5647         1    0.9618    0.7737    0.4925    0.2920
R3   0.5066    0.9618         1    0.8541    0.4508    0.3657
R4   0.3007    0.7737    0.8541         1    0.5252    0.6699
R5   0.0524    0.4925    0.4508    0.5252         1    0.0657
R6  -0.1148    0.2920    0.3657    0.6699    0.0657         1

Table D.10: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #1 of camera #1 for the MCDL dataset. Blue band.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0      -0.5762   -0.6253   -0.2188   -1.0360   -0.4494
β1      -0.6366   -1.5406    0.0000    0.0008   -0.6786
β2       2.4582    3.9889    0.3026    0.2944    0.6941
β3      -0.2209   -1.2922    0.2923    0.3245    0.3636
β4      -1.8043   -0.9659    0.1857    0.3857    0.1168
β5       1.1567    0.9975    0.3923    0.7837    0.8274
β6       0.6229    0.4377    0.0495    0.2479    0.1258

Table D.11: Estimated regressors for the red band of location #1 of camera #1 of the MCDL dataset.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0      -0.1256   -0.3662   -0.1627   -0.3015    0.1572
β1      -0.1610   -0.1681    0.0000    0.0683   -0.4818
β2       1.5686    1.4646    0.4462    0.2873    1.4603
β3      -0.0777   -0.1747    0.2603    0.3050    0.0599
β4      -1.4089   -1.1821    0.0020    0.1793   -1.0540
β5       0.6227    0.7358    0.4181    0.4896    0.5035
β6       0.5819    0.6907    0.0372   -0.0265    0.3526

Table D.12: Estimated regressors for the blue band of location #1 of camera #1 of the MCDL dataset.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    0.332     0.484     0.581     0.472     0.408
MR (×10−3)      -0.0      -1.6      -0.2      -0.0      -0.0
SVR (×10−6)      0.1       0.1       0.1       0.0       0.0
MRC (×10−6)    -36.0      -4.2     -63.9     -70.9     -48.9
MERC (×10−3)     0.0      -0.1       0.1       0.0       0.0
R²            0.7564    0.6448    0.5729    0.6530    0.7006
t (β0)        -11.49    -10.32     -3.29    -17.31     -8.08
t (β1)        -11.88    -23.81      0.00      0.01    -11.42
t (β2)         24.09     32.38      2.24      2.42      6.14
t (β3)         -2.58    -12.50      2.58      3.18      3.83
t (β4)        -11.50     -5.10      0.89      2.06      0.67
t (β5)         24.14     17.24      6.18     13.71     15.58
t (β6)         16.39      9.54      0.98      5.47      2.99

Table D.13: Regression statistics for location #1 of camera #1 of the MCDL dataset. Red band.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    0.426     0.454     0.553     0.586     0.445
MR (×10−3)      -0.0      -2.0      -0.0      -0.0       0.0
SVR (×10−6)      0.0       0.1       0.0       0.1       0.0
MRC (×10−6)    -40.4     -55.3     -64.1     -67.8     -47.5
MERC (×10−3)     0.0      -0.0       0.0       0.0       0.0
R²            0.6457    0.6228    0.5407    0.5135    0.6300
t (β0)         -2.41     -6.81     -2.74     -4.94      2.95
t (β1)         -3.27     -3.31      0.00      1.18     -9.58
t (β2)         13.43     12.16      3.36      2.10     12.24
t (β3)         -0.62     -1.35      1.82      2.07      0.47
t (β4)        -10.22     -8.31      0.01      1.11     -7.48
t (β5)         16.09     18.43      9.49     10.80     12.73
t (β6)         10.11     11.62      0.57     -0.39      5.99

Table D.14: Regression statistics for location #1 of camera #1 of the MCDL dataset. Blue band.

         R1        R2        R3        R4        R5        R6

R1        1    0.9591    0.1929    0.8891    0.5507   -0.8217
R2   0.9591         1    0.1340    0.9523    0.7114   -0.8888
R3   0.1929    0.1340         1    0.2692   -0.2248   -0.4354
R4   0.8891    0.9523    0.2692         1    0.7200   -0.9323
R5   0.5507    0.7114   -0.2248    0.7200         1   -0.5978
R6  -0.8217   -0.8888   -0.4354   -0.9323   -0.5978         1

Table D.15: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #2 of camera #1 for the MCDL dataset. Red band.

         R1        R2        R3        R4        R5        R6

R1        1    0.9452    0.3740    0.9241    0.7083   -0.8110
R2   0.9452         1    0.2453    0.9829    0.8605   -0.8651
R3   0.3740    0.2453         1    0.3398   -0.0594   -0.5026
R4   0.9241    0.9829    0.3398         1    0.8404   -0.8950
R5   0.7083    0.8605   -0.0594    0.8404         1   -0.6849
R6  -0.8110   -0.8651   -0.5026   -0.8950   -0.6849         1

Table D.16: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #2 of camera #1 for the MCDL dataset. Blue band.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0       2.1547    2.4788    0.1236    0.8781    2.0240
β1       0.3667   -0.4654    0.0000    0.1017    0.1016
β2      -1.4304    0.0000    0.1938    0.0740   -0.6778
β3      -0.0367    0.3830    0.0240   -0.0004   -0.0206
β4      -0.7486   -1.6635    0.2608    0.0592   -1.0535
β5       1.5655    1.1572    0.4031    0.0235    1.4380
β6      -0.8702   -0.8901    0.0000   -0.1294   -0.8107

Table D.17: Estimated regressors for the red band of location #2 of camera #1 of the MCDL dataset.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0       2.2532    1.9824    0.1278    0.8183    2.0993
β1       0.3220    0.0852    0.0069    0.1170   -0.4691
β2      -1.3518   -0.9647    0.1753    0.0884   -0.1221
β3      -0.0413   -0.2059    0.0353    0.0114   -0.0396
β4      -0.5488   -0.1240    0.2702    0.0776    0.0787
β5       1.3739    1.1396    0.3914    0.0278    0.2126
β6      -1.0065   -0.9126    0.0000   -0.1316   -0.7587

Table D.18: Estimated regressors for the green band of location #2 of camera #1 of the MCDL dataset.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0       1.6665    0.8467    0.0647    0.5499    2.2886
β1      -0.6966   -0.8054    0.0636    0.1875   -0.7607
β2       0.5208    1.3245    0.2398    0.1412    0.0899
β3       0.0530    0.0159    0.0727    0.0490    0.0122
β4      -0.5042   -0.8208    0.2598    0.1722    0.2511
β5       1.1949    1.4715    0.3128    0.0606    0.3331
β6      -1.2337   -1.0324    0.0000   -0.1470   -1.2134

Table D.19: Estimated regressors for the blue band of location #2 of camera #1 of the MCDL dataset.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    0.609    11.405     1.353     1.399     0.631
MR (×10−3)      -0.0     -16.6      -0.4      -0.0       0.0
SVR (×10−6)      1.0     452.2       1.1       1.2       0.9
MRC (×10−6)     20.1     196.5     197.3     172.8       4.1
MERC (×10−3)     0.0      -2.9      -0.2      -0.1       0.0
R²            0.7049   -4.5256    0.3443    0.3222    0.6943
t (β0)         26.89      7.15      1.04      7.23     24.82
t (β1)          8.26     -2.42      0.00      1.51      2.25
t (β2)        -14.03      0.00      1.28      0.48     -6.53
t (β3)         -8.47     20.42      3.72     -0.05     -4.67
t (β4)        -11.34     -5.82      2.65      0.59    -15.67
t (β5)         23.90      4.08      4.13      0.24     21.57
t (β6)        -36.83     -8.71      0.00     -3.62    -33.71

Table D.20: Regression statistics for location #2 of camera #1 of the MCDL dataset. Red band.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    1.080     1.574     3.373     3.400     1.189
MR (×10−3)      -0.0       1.0      -0.3      -0.0       0.0
SVR (×10−6)      2.1       2.1       4.9       5.9       2.0
MRC (×10−6)    151.9     253.8     629.2     617.2     194.5
MERC (×10−3)    -0.1       0.4      -0.3      -0.2      -0.0
R²            0.8251    0.7452    0.4539    0.4496    0.8075
t (β0)         20.04      8.43      0.44      3.73     26.23
t (β1)        -15.61    -14.95      0.81      2.37    -16.25
t (β2)          3.65      7.69      0.95      0.56      0.60
t (β3)          9.40      2.33      7.29      4.89      2.07
t (β4)         -6.19     -8.34      1.80      1.19      2.94
t (β5)         14.56     14.85      2.16      0.42      3.87
t (β6)        -39.71    -27.53      0.00     -2.67    -37.23

Table D.21: Regression statistics for location #2 of camera #1 of the MCDL dataset. Blue band.

         R1        R2        R3        R4        R5        R6

R1        1    0.3101    0.2941    0.2186    0.2163   -0.1701
R2   0.3101         1    0.9516    0.9606    0.9498   -0.9356
R3   0.2941    0.9516         1    0.8942    0.8601   -0.8069
R4   0.2186    0.9606    0.8942         1    0.9607   -0.9480
R5   0.2163    0.9498    0.8601    0.9607         1   -0.9567
R6  -0.1701   -0.9356   -0.8069   -0.9480   -0.9567         1

Table D.22: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #3 of camera #1 for the MCDL dataset. Red band.

         R1        R2        R3        R4        R5        R6

R1        1    0.9359    0.9264    0.9378    0.9271   -0.9537
R2   0.9359         1    0.9955    0.9886    0.9775   -0.9727
R3   0.9264    0.9955         1    0.9870    0.9700   -0.9548
R4   0.9378    0.9886    0.9870         1    0.9769   -0.9629
R5   0.9271    0.9775    0.9700    0.9769         1   -0.9645
R6  -0.9537   -0.9727   -0.9548   -0.9629   -0.9645         1

Table D.23: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #3 of camera #1 for the MCDL dataset. Blue band.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0       0.9013    1.2146    0.1019    0.9577    0.2651
β1       0.5468    0.1076    0.2374    0.0026    0.1061
β2      -1.9268    2.6916    0.0429    0.0280    0.0380
β3       0.8413   -2.3566    0.0000    0.0231    0.0921
β4       0.4137   -0.8157    0.2731    0.0488    0.1915
β5       0.3371    0.3387    0.2808    0.0738    0.2677
β6      -0.1135   -0.1802    0.0636   -0.1346    0.0393

Table D.24: Estimated regressors for the red band of location #3 of camera #1 of the MCDL dataset.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0       0.6802    1.7811    0.1074    0.7146    0.7155
β1      -0.1323   -1.3006    0.2461    0.0203    0.0202
β2       0.1859    1.1253    0.1081    0.0801    0.0791
β3      -0.3855   -1.1696    0.0000    0.1031    0.1009
β4       0.2690   -0.1285    0.1352    0.0880    0.0880
β5       0.4388    0.8535    0.3186    0.1068    0.1094
β6      -0.0561   -0.1612    0.0846   -0.1128   -0.1131

Table D.25: Estimated regressors for the blue band of location #3 of camera #1 of the MCDL dataset.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    1.168     2.550     1.229     1.321     1.241
MR (×10−3)      -0.0      -5.7       0.0      -0.0       0.0
SVR (×10−6)      0.3       7.5       0.4       0.3       0.3
MRC (×10−6)    316.4     532.9     269.7     292.1     265.2
MERC (×10−3)    -0.0       0.2      -0.1       0.0      -0.0
R²            0.5265   -0.0336    0.5019    0.4647    0.4970
t (β0)          7.89      7.20      0.87      7.89      2.25
t (β1)          8.46      1.13      3.58      0.04      1.59
t (β2)         -8.05      7.61      0.17      0.11      0.15
t (β3)          5.76    -10.92      0.00      0.15      0.61
t (β4)          6.67     -8.90      4.29      0.74      3.00
t (β5)          8.67      5.90      7.04      1.78      6.68
t (β6)         -3.74     -4.02      2.04     -4.17      1.26

Table D.26: Regression statistics for location #3 of camera #1 of the MCDL dataset. Red band.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    1.034     1.540     1.109     1.153     1.151
MR (×10−3)      -0.0       7.7      -0.0       0.0      -0.0
SVR (×10−6)      0.4       1.5       0.2       0.2       0.2
MRC (×10−6)    259.4     604.7     126.5     143.5     144.4
MERC (×10−3)    -0.1       0.3      -0.2      -0.1      -0.1
R²            0.7645    0.6491    0.7473    0.7374    0.7377
t (β0)          3.97      8.52      0.61      3.95      3.96
t (β1)         -1.18     -9.47      2.11      0.17      0.17
t (β2)          1.22      6.06      0.69      0.50      0.49
t (β3)         -4.14    -10.28      0.00      1.05      1.03
t (β4)          4.66     -1.82      2.26      1.45      1.45
t (β5)         13.23     21.09      9.28      3.05      3.13
t (β6)         -1.41     -3.32      2.05     -2.68     -2.69

Table D.27: Regression statistics for location #3 of camera #1 of the MCDL dataset. Blue band.

         R1        R2        R3        R4        R5        R6

R1        1    0.0452   -0.4640   -0.0095    0.0562    0.9572
R2   0.0452         1    0.8334    0.7654   -0.1525    0.1811
R3  -0.4640    0.8334         1    0.7686   -0.1665   -0.3407
R4  -0.0095    0.7654    0.7686         1   -0.1272   -0.0009
R5   0.0562   -0.1525   -0.1665   -0.1272         1    0.1449
R6   0.9572    0.1811   -0.3407   -0.0009    0.1449         1

Table D.28: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #1 of camera #2 for the MCDL dataset. Red band.

         R1        R2        R3        R4        R5        R6

R1        1    0.1767    0.1154    0.2276   -0.2727    0.5172
R2   0.1767         1    0.9794    0.8519   -0.0661    0.7094
R3   0.1154    0.9794         1    0.8951   -0.0450    0.6574
R4   0.2276    0.8519    0.8951         1   -0.0719    0.5083
R5  -0.2727   -0.0661   -0.0450   -0.0719         1    0.0444
R6   0.5172    0.7094    0.6574    0.5083    0.0444         1

Table D.29: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #1 of camera #2 for the MCDL dataset. Blue band.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0       0.1306    0.3188    0.0341    0.3438    0.1226
β1      -0.3554    0.1581    0.0000   -0.0942   -0.2264
β2       1.1507    0.3870    0.2733    0.2086    0.5834
β3      -0.0449    1.0807    0.3916    0.3019    0.5111
β4      -0.0976   -0.7304    0.2164    0.2896   -0.2256
β5      -0.0448    0.0308    0.0005   -0.0359   -0.0451
β6       0.2624   -0.2451    0.0651   -0.0166    0.2807

Table D.30: Estimated regressors for the red band of location #1 of camera #2 of the MCDL dataset.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0      -0.4868   -0.0778   -0.0723   -0.1423   -0.0208
β1       0.0233   -0.3214    0.0983    0.0327    0.0457
β2       1.1436    1.6885    0.2570    0.2832    0.2744
β3      -0.3122   -1.0309    0.2992    0.3719    0.3473
β4       0.3336    0.4825    0.2636    0.3673    0.3323
β5      -0.0206   -0.1141    0.0000   -0.0230   -0.1017
β6       0.3211    0.3731    0.1470    0.1128    0.1200

Table D.31: Estimated regressors for the blue band of location #1 of camera #2 of the MCDL dataset.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    1.777     2.383     3.043     2.802     1.838
MR (×10−3)       0.0       3.8      -0.1       0.0      -0.0
SVR (×10−6)      0.7       3.6       2.2       0.6       0.9
MRC (×10−6)    -27.6       8.8    -108.2     172.1      -8.6
MERC (×10−3)     0.1       0.2      -0.8       0.0       0.0
R²            0.7493    0.6639    0.5708    0.6049    0.7408
t (β0)          6.05     12.75      1.21     12.68      5.58
t (β1)         -6.89      2.65      0.00     -1.45     -4.31
t (β2)         14.76      4.29      2.68      2.13      7.36
t (β3)         -0.52     10.74      3.44      2.77      5.79
t (β4)         -2.32    -15.02      3.94      5.49     -5.28
t (β5)         -7.41      4.41      0.06     -4.73     -7.34
t (β6)          3.76     -3.03      0.71     -0.19      3.96

Table D.32: Regression statistics for location #1 of camera #2 of the MCDL dataset. Red band.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    2.692     3.267     3.071     2.909     3.152
MR (×10−3)       0.0      -6.0      -0.9      -0.0      -0.0
SVR (×10−6)      2.4       7.1       2.3       1.2       2.5
MRC (×10−6)   -121.6     -90.8       9.1     -17.0    -120.2
MERC (×10−3)    -0.0       0.9       0.3      -0.0       0.5
R²            0.7114    0.6497    0.6707    0.6881    0.6620
t (β0)        -10.73     -1.56     -1.49     -3.02     -0.42
t (β1)          0.37     -4.68      1.47      0.50      0.68
t (β2)          9.69     12.98      2.04      2.31      2.15
t (β3)         -2.81     -8.42      2.52      3.22      2.89
t (β4)          7.41      9.73      5.48      7.85      6.82
t (β5)         -2.79    -14.02      0.00     -3.00    -12.73
t (β6)          4.27      4.50      1.83      1.44      1.47

Table D.33: Regression statistics for location #1 of camera #2 of the MCDL dataset. Blue band.

         R1        R2        R3        R4        R5        R6

R1        1    0.1430    0.0619    0.2226    0.2188    0.9825
R2   0.1430         1    0.9946    0.9845    0.4535    0.0987
R3   0.0619    0.9946         1    0.9757    0.4644    0.0235
R4   0.2226    0.9845    0.9757         1    0.5045    0.1923
R5   0.2188    0.4535    0.4644    0.5045         1    0.2805
R6   0.9825    0.0987    0.0235    0.1923    0.2805         1

Table D.34: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #2 of camera #2 for the MCDL dataset. Red band.

         R1        R2        R3        R4        R5        R6

R1        1    0.9639    0.9733    0.9673    0.6499    0.9732
R2   0.9639         1    0.9956    0.9665    0.6234    0.9050
R3   0.9733    0.9956         1    0.9787    0.6477    0.9266
R4   0.9673    0.9665    0.9787         1    0.6238    0.9519
R5   0.6499    0.6234    0.6477    0.6238         1    0.6541
R6   0.9732    0.9050    0.9266    0.9519    0.6541         1

Table D.35: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #2 of camera #2 for the MCDL dataset. Blue band.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0       0.3167    0.3476    0.1033    0.3642    0.4105
β1      -0.4368   -0.1885    0.1126    0.0608    0.0265
β2      -0.7196   -1.3948    0.2272    0.2292    0.2276
β3       1.6170    1.7371    0.2434    0.2489    0.2609
β4      -0.3964    0.1304    0.1908    0.2121    0.2012
β5      -0.2211   -0.1659    0.0000   -0.1487   -0.1387
β6       0.8389    0.5342    0.1179    0.0316    0.0124

Table D.36: Estimated regressors for the red band of location #2 of camera #2 of the MCDL dataset.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0      -0.0451   -0.2826    0.0079    0.0217    0.0233
β1      -0.4438   -0.0055    0.1823    0.1572    0.1625
β2       0.5143   -0.1030    0.2346    0.2707    0.2990
β3       1.2981    1.8421    0.2248    0.2779    0.2974
β4      -1.0708   -1.2669    0.1813    0.2795    0.2407
β5      -0.1993   -0.2481    0.0000   -0.1547   -0.1595
β6       0.9453    1.0640    0.1637    0.1442    0.1335

Table D.37: Estimated regressors for the blue band of location #2 of camera #2 of the MCDL dataset.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    0.371     0.444     0.798     0.502     0.512
MR (×10−3)      -0.0      -2.1      -0.1       0.0       0.0
SVR (×10−6)      0.1       0.2       0.7       0.2       0.3
MRC (×10−6)     32.6      17.8      27.3     -31.4     -44.9
MERC (×10−3)    -0.0      -0.2      -0.6      -0.0       0.0
R²            0.9029    0.8838    0.7910    0.8686    0.8660
t (β0)         27.37     27.46      6.08     27.06     30.19
t (β1)         -8.39     -3.31      1.47      1.00      0.43
t (β2)         -7.20    -12.76      1.55      1.97      1.94
t (β3)         19.26     18.91      1.98      2.55      2.64
t (β4)         -9.37      2.82      3.07      4.31      4.05
t (β5)        -31.63    -21.70      0.00    -18.29    -16.88
t (β6)         15.18      8.83      1.45      0.49      0.19

Table D.38: Regression statistics for location #2 of camera #2 of the MCDL dataset. Red band.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    0.399     0.470     0.835     0.686     0.663
MR (×10−3)      -0.0      -0.6      -0.4       0.0       0.0
SVR (×10−6)      0.1       0.2       0.8       0.2       0.2
MRC (×10−6)     31.8      58.1     119.3     120.4     112.3
MERC (×10−3)     0.0      -0.2      -0.1       0.0       0.0
R²            0.9221    0.9084    0.8370    0.8662    0.8707
t (β0)         -1.97    -11.35      0.24      0.72      0.79
t (β1)         -4.73     -0.05      1.34      1.28      1.34
t (β2)          5.75     -1.06      1.81      2.31      2.59
t (β3)         12.71     16.63      1.52      2.08      2.26
t (β4)        -25.03    -27.31      2.93      4.99      4.37
t (β5)        -30.06    -34.50      0.00    -17.81    -18.68
t (β6)         13.22     13.72      1.58      1.54      1.45

Table D.39: Regression statistics for location #2 of camera #2 of the MCDL dataset. Blue band.

         R1        R2        R3        R4        R5        R6

R1        1    0.9261    0.9136    0.7650    0.8752    0.8131
R2   0.9261         1    0.9828    0.8129    0.7710    0.5470
R3   0.9136    0.9828         1    0.8103    0.7522    0.5328
R4   0.7650    0.8129    0.8103         1    0.7155    0.4708
R5   0.8752    0.7710    0.7522    0.7155         1    0.7967
R6   0.8131    0.5470    0.5328    0.4708    0.7967         1

Table D.40: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #3 of camera #2 for the MCDL dataset. Red band.

         R1        R2        R3        R4        R5        R6

R1        1    0.9746    0.9379    0.8480    0.8426    0.9229
R2   0.9746         1    0.9388    0.8116    0.7917    0.8771
R3   0.9379    0.9388         1    0.7946    0.7865    0.8784
R4   0.8480    0.8116    0.7946         1    0.7556    0.8140
R5   0.8426    0.7917    0.7865    0.7556         1    0.9319
R6   0.9229    0.8771    0.8784    0.8140    0.9319         1

Table D.41: Correlation between BG regions Ri (1 ≤ i ≤ 6) in location #3 of camera #2 for the MCDL dataset. Blue band.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0      -0.1545   -0.1173   -0.0485   -0.0152    0.1023
β1       2.5679   -0.2166    0.1501    0.1177    0.1337
β2       0.9083    2.8145    0.3764    0.2668    0.4724
β3      -1.0949   -1.5911    0.2257    0.3140    0.2666
β4       0.0522   -0.0926    0.1440    0.2296    0.1022
β5       0.2833    0.3367    0.1408    0.0815    0.0955
β6      -1.5651   -0.1336    0.0000   -0.0046   -0.1781

Table D.42: Estimated regressors for the red band of location #3 of camera #2 of the MCDL dataset.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0      -0.4379   -0.3654   -0.1682   -0.1667   -0.2547
β1       3.0943    2.9928    0.3352    0.1894    0.4430
β2       0.1428    0.2055    0.4476    0.2355    0.5451
β3      -1.0004   -0.9880    0.0000    0.2702   -0.1159
β4      -0.0674   -0.0531    0.0713    0.0596   -0.1372
β5       0.2995    0.3858    0.2832    0.2409    0.4368
β6      -1.0329   -1.1775    0.0218    0.1606    0.0720

Table D.43: Estimated regressors for the green band of location #3 of camera #2 of the MCDL dataset.

            LSM      LSM-R      NLSM      PCAM      PLSM

β0      -0.5399   -0.5141   -0.1844   -0.1987   -0.3148
β1       2.1239    2.3761    0.3319    0.1970    0.4467
β2      -0.0884   -0.2003    0.2722    0.2276    0.3336
β3      -0.7866   -0.8923    0.0000    0.2680   -0.1533
β4      -0.1790   -0.1461    0.0542   -0.0411   -0.1676
β5       0.2277    0.2379    0.3256    0.3405    0.5307
β6       0.2386    0.1387    0.1917    0.2003    0.3173

Table D.44: Estimated regressors for the blue band of location #3 of camera #2 of the MCDL dataset.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    1.421     2.032     2.332     2.477     2.262
MR (×10−3)      -0.0      -3.3      -0.1       0.0       0.0
SVR (×10−6)      0.8       3.9       1.7       2.0       2.4
MRC (×10−6)    -27.8      83.3     172.1     149.0     281.7
MERC (×10−3)    -0.0      -0.2      -0.2      -0.2      -0.1
R²            0.8729    0.8183    0.7915    0.7784    0.7977
t (β0)         -7.53     -4.79     -1.85     -0.56      3.95
t (β1)         15.62     -1.10      0.71      0.54      0.64
t (β2)         11.81     30.61      3.82      2.63      4.87
t (β3)        -23.86    -29.00      3.84      5.18      4.61
t (β4)          2.91     -4.32      6.28      9.71      4.52
t (β5)         10.64     10.57      4.13      2.32      2.84
t (β6)        -17.47     -1.25      0.00     -0.04     -1.58

Table D.45: Regression statistics for location #3 of camera #2 of the MCDL dataset. Red band.

                 LSM     LSM-R      NLSM      PCAM      PLSM

MSE (×10−3)    1.497     1.550     2.285     2.584     2.125
MR (×10−3)      -0.0      -5.0      -0.2      -0.0      -0.0
SVR (×10−6)      1.9       1.8       1.7       2.4       2.5
MRC (×10−6)   -239.5    -257.0    -151.0     -66.0    -190.0
MERC (×10−3)     0.1       0.2       0.2       0.0       0.0
R²            0.8400    0.8343    0.7558    0.7238    0.7729
t (β0)        -21.85    -20.45     -6.04     -6.12    -10.70
t (β1)         24.08     26.48      3.05      1.70      4.25
t (β2)         -1.39     -3.10      3.47      2.73      4.42
t (β3)        -21.87    -24.39      0.00      5.67     -3.58
t (β4)         -9.10     -7.30      2.23     -1.59     -7.15
t (β5)          7.00      7.19      8.11      7.97     13.70
t (β6)          3.40      1.94      2.21      2.17      3.79

Table D.46: Regression statistics for location #3 of camera #2 of the MCDL dataset. Blue band.

BIBLIOGRAPHY

[1] AVT Marlin Technical Manual. Allied Vision Technologies GmbH, v2.0.0 edition, March 2006. (Cited on pages 14 and 15.)

[2] O. Arandjelovic. Colour invariants under a non-linear photometric camera model and their application to face recognition from video. Pattern Recognition, 45(7):2499–2509, 2012. (Cited on pages 23 and 102.)

[3] K. E. Aziz, D. Merad, and B. Fertil. People re-identification across multiple non-overlapping cameras system by appearance classification and silhouette part segmentation. In IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS), 2011, pages 303–308, September 2011. (Cited on page 102.)

[4] A. Baradarani, Q. M. Jonathan Wu, and M. Ahmadi. An efficient illumination invariant face recognition framework via illumination enhancement and DD-DTWT filtering. Pattern Recognition, 46(1):57–72, 2013. (Cited on page 101.)

[5] K. Barnard, L. Martin, and B. V. Funt. Colour by Correlation in a Three-Dimensional Colour Space. In European Conference on Computer Vision (ECCV), 2000, pages 375–389, 2000. (Cited on page 101.)

[6] B. E. Bayer. Color imaging array, 1976. US Patent 3,971,065. (Cited on page 21.)

[7] A. Bedagkar-Gala and S. K. Shah. A survey of approaches and trends in person re-identification. Image and Vision Computing, 32(4):270–286, 2014. (Cited on page 108.)

[8] A. Bhattacharyya. On a measure of divergence between two statistical populations defined by their probability distributions. Bulletin of the Calcutta Mathematical Society, 35:99–109, 1943. (Cited on page 108.)

[9] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, Cambridge, UK / New York, 2004. (Cited on page 34.)

[10] M. Brady and G. E. Legge. Camera calibration for natural image studies and vision research. Journal of the Optical Society of America A, 26(1):30–42, 2009. (Cited on page 16.)

[11] G. J. Burghouts and J. M. Geusebroek. Performance Evaluation of Local Colour Invariants. Computer Vision and Image Understanding, 113:48–62, 2009. (Cited on page 101.)

[12] J. Campbell. Film and Cinema Spectatorship: Melodrama and Mimesis. Polity, 2005. (Cited on page 12.)

[13] X. Cao, W. Shen, L. G. Yu, Y. L. Wang, J. Y. Yang, and Z. W. Zhang. Illumination invariant extraction for face recognition using neighboring wavelet coefficients. Pattern Recognition, 45(4):1299–1305, 2012. (Cited on page 101.)

[14] K. W. Chen, C. C. Lai, P. J. Lee, C. S. Chen, and Y. P. Hung. Adaptive Learning for Target Tracking and True Linking Discovering Across Multiple Non-Overlapping Cameras. IEEE Transactions on Multimedia, 13(4):625–638, August 2011. (Cited on page 102.)

[15] X. Chen, K. Huang, and T. Tan. Object tracking across non-overlapping views by learning inter-camera transfer models. Pattern Recognition, 47(3):1126–1137, 2014. (Cited on page 102.)

[16] D. Cheng, D. K. Prasad, and M. S. Brown. Illuminant estimation for color constancy: why spatial-domain methods work and the role of the color distribution. Journal of the Optical Society of America A, 31(5):1049–1058, May 2014. (Cited on pages 101 and 102.)

[17] M. Cho, S. Lee, and B. D. Nam. Fast auto-exposure algorithm based on numerical analysis. Volume 3650, pages 93–99, 1999. (Cited on page 137.)

[18] A. Colombo, J. Orwell, and S. Velastin. Colour Constancy Techniques for Re-Recognition of Pedestrians from Multiple Surveillance Cameras. In Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications (M2SFA2), 2008, Marseille, France, 2008. (Cited on page 102.)

[19] G. B. Dantzig and M. N. Thapa. Linear Programming: 1: Introduction, volume 1 of Springer Series in Operations Research and Financial Engineering. Springer, 1997. (Cited on page 34.)

[20] P. Debevec and J. Malik. Recovering high dynamic range radiance maps from photographs. In ACM SIGGRAPH 1997 classes, pages 369–378, New York, NY, USA, 1997. (Cited on page 20.)

[21] Y. Dodge. The Concise Encyclopedia of Statistics. Springer, 2009. (Cited on page 109.)

[22] T. D’Orazio, P. L. Mazzeo, and P. Spagnolo. Color Brightness Transfer Function evaluation for non overlapping multi camera tracking. In ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC), 2009, volume 2, pages 1–6, September 2009. (Cited on pages 102 and 108.)

[23] J.-D. Durou, M. Falcone, and M. Sagona. Numerical methods for shape-from-shading: A new survey with benchmarks. Computer Vision and Image Understanding, 109(1):22–43, 2008. (Cited on page 20.)

[24] European Machine Vision Association (EMVA). Introduction to machine vision standards, January 2014. URL http://www.emva.org/cms/index.php?idcat=24&lang=1. (Cited on page 19.)

[25] V. Esposito Vinzi, W. W. Chin, J. Henseler, and H. Wang, editors. Handbook of Partial Least Squares. Springer Handbooks of Computational Statistics. Springer, 2013. (Cited on pages 34 and 36.)

[26] M. Fairchild. Color Appearance Models. J. Wiley, Chichester, West Sussex, England / Hoboken, NJ, 2005. (Cited on pages 10 and 23.)

[27] H. Farid. Blind inverse gamma correction. IEEE Transactions on Image Processing, 10(10):1428–1433, October 2001. (Cited on pages 19, 24, and 154.)

[28] R. Fielding and J. Reschke. Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. RFC 7231, June 2014. URL http://tools.ietf.org/html/rfc7231. (Cited on page 142.)

[29] G. D. Finlayson, M. S. Drew, and B. V. Funt. Spectral sharpening: sensor transformations for improved color constancy. Journal of the Optical Society of America A, 11(5):1553–1563, May 1994. (Cited on pages 21, 24, and 101.)

[30] G. D. Finlayson, S. D. Hordley, and P. M. Hubel. Color by correlation: A simple, unifying framework for color constancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23:1209–1221, 2001. (Cited on pages 4 and 101.)

[31] M. A. Fischler and R. C. Bolles. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, June 1981. (Cited on page 35.)

[32] K. R. Fowler. Transient response for automatic gain control with multiple intensity thresholds for image-intensified camera. IEEE Transactions on Instrumentation and Measurement, 54(5):1926–1933, October 2005. (Cited on pages 129 and 130.)

[33] K. Frankish and W. M. Ramsey. The Cambridge Handbook of Artificial Intelligence. Cambridge University Press, 2014. ISBN 0521871425. (Cited on page 3.)

[34] D. Gao, X. Wu, G. Shi, and L. Zhang. Color demosaicking with an image formation model and adaptive PCA. Journal of Visual Communication and Image Representation, 23(7):1019–1030, 2012. (Cited on page 21.)

[35] A. Gijsenij, T. Gevers, and J. van de Weijer. Computational Color Constancy: Survey and Experiments. IEEE Transactions on Image Processing, 20(9):2475–2489, September 2011. (Cited on page 101.)

[36] A. Gijsenij, R. Lu, and T. Gevers. Color Constancy for Multiple Light Sources. IEEE Transactions on Image Processing, 21(2):697–707, February 2012. (Cited on page 21.)

[37] A. Gilbert and R. Bowden. Incremental, scalable tracking of objects inter camera. Computer Vision and Image Understanding, 111(1):43–58, July 2008. (Cited on page 102.)

[38] D. Goldman. Vignette and Exposure Calibration and Compensation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(12):2276–2288, 2010. (Cited on pages 16, 19, and 154.)

[39] H. Goldstein. Multilevel Statistical Models. Wiley Series in Probability and Statistics. Wiley, 2010. (Cited on page 155.)

[40] M. D. Grossberg and S. K. Nayar. Modeling the space of camera response functions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(10):1272–1282, October 2004. (Cited on pages 19, 20, and 101.)

[41] D. Gujarati and D. Porter. Basic Econometrics. McGraw-Hill/Irwin, 5th edition, October 2008. (Cited on pages 30, 32, 33, 34, 38, and 40.)

[42] C. Gálvez-del Postigo, J. Torres, and J. M. Menéndez. Vacant parking area estimation through background subtraction and transience map analysis. IET Intelligent Transport Systems, page (Under review), 2014. (Cited on pages 136 and 166.)

[43] G. E. Healey and R. Kondepudy. Radiometric CCD camera calibration and noise estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(3):267–276, March 1994. (Cited on pages 14, 17, and 18.)

[44] Sir T. Heath. A History of Greek Mathematics, Volume II: From Aristarchus to Diophantus (Dover Books on Mathematics). Dover Publications, 1981. (Cited on page 11.)

[45] W. Heisenberg. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschrift für Physik, 43(3–4):172–198, 1927. (Cited on page 123.)

[46] R. Hirsch. Seizing the Light: A Social History of Photography. McGraw-Hill Humanities/Social Sciences/Languages, 2008. (Cited on page 128.)

[47] A. E. Hoerl and R. W. Kennard. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1):55–67, February 1970. (Cited on page 36.)

[48] J. Holler, V. Tsiatsis, C. Mulligan, S. Avesand, S. Karnouskos, and D. Boyle. From Machine-to-Machine to the Internet of Things: Introduction to a New Age of Intelligence. Academic Press, 2014. (Cited on page 3.)

[49] E. Hsu, T. Mertens, S. Paris, S. Avidan, and F. Durand. Light mixture estimation for spatially varying white balance. In ACM SIGGRAPH 2008 papers, pages 70:1–70:7, New York, NY, USA, 2008. (Cited on page 101.)

[50] S. C. Huang, F. C. Cheng, and Y. S. Chiu. Efficient contrast enhancement using adaptive gamma correction with weighting distribution. IEEE Transactions on Image Processing, 22(3):1032–1041, March 2013. (Cited on pages 19, 131, 133, and 154.)

[51] P. J. Huber. Robust Statistics. Wiley Series in Probability and Statistics. Wiley-Interscience, 1981. (Cited on pages 34 and 35.)

[52] R. E. Jacobson, N. Axford, S. Ray, and G. G. Attridge. Manual of Photography: Photographic and Digital Imaging. Butterworth-Heinemann, Newton, MA, USA, 9th edition, 2000. (Cited on pages 13, 14, 130, 131, and 132.)

[53] B. Jähne and H. Haußecker, editors. Computer Vision and Applications: A Guide for Students and Practitioners. Academic Press, Inc., Orlando, FL, USA, 2000. (Cited on pages 4, 10, 12, and 16.)

[54] O. Javed, K. Shafique, and M. Shah. Appearance modeling for tracking in multiple non-overlapping cameras. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2005, volume 2, pages 26–33, June 2005. (Cited on pages 102 and 108.)

[55] S. J. Kim, J. M. Frahm, and M. Pollefeys. Radiometric calibration with illumination change for outdoor scene analysis. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008, pages 1–8, June 2008. (Cited on page 16.)

[56] S. J. Kim, H. T. Lin, Z. Lu, S. Susstrunk, S. Lin, and M. S. Brown. A new in-camera imaging model for color computer vision and its application. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(12):2289–2302, December 2012. (Cited on pages 16 and 22.)

[57] T. Kondo, A. Kikuchi, T. Kohashi, F. Kato, and K. Hirota. Digital color video camera with auto-focus, auto-exposure and auto-white balance, and an auto exposure system therefor which compensates for abnormal lighting, March 3 1992. US Patent 5,093,716. (Cited on page 128.)

[58] S. J. Koppal and S. G. Narasimhan. Clustering appearance for scene analysis. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006, pages 1323–1330, 2006. (Cited on pages 35 and 154.)

[59] R. L. Kremens, N. Sampat, S. Venkataraman, and T. Yeh. System implications of implementing auto-exposure on consumer digital cameras. Volume 3650, pages 100–107, 1999. (Cited on page 128.)

[60] T. Kuno, H. Sugiura, and N. Matoba. A new automatic exposure system for digital still cameras. IEEE Transactions on Consumer Electronics, 44(1):192–199, February 1998. (Cited on pages 129 and 130.)

[61] I. Kviatkovsky, A. Adam, and E. Rivlin. Color invariants for person reidentification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(7):1622–1634, 2013. (Cited on page 107.)

[62] J. H. Lambert. Photometria, sive De mensura et gradibus luminis, colorum et umbrae. Number v. 1–2 in Harvard University. Eberhard Klett: Augsburg, Germany, 1760. (Cited on page 11.)

[63] E. H. Land. The retinex theory of color vision. Scientific American, 237(6):108–129, 1977. (Cited on page 102.)

[64] J. Y. Lee, Y. Matsushita, B. Shi, I. S. Kweon, and K. Ikeuchi. Radiometric calibration by rank minimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1):144–156, January 2013. (Cited on page 101.)

[65] X. Li, B. Gunturk, and L. Zhang. Image demosaicing: A systematic survey. In SPIE Visual Communications and Image Processing, 2008, volume 6822, page 15, 2008. (Cited on page 21.)

[66] J. Liang, Y. Qin, and Z. Hong. An auto-exposure algorithm for detecting high contrast lighting conditions. In International Conference on ASIC (ASICON), 2007, pages 725–728, October 2007. (Cited on pages 129 and 130.)

[67] H. T. Lin, S. J. Kim, S. Susstrunk, and M. S. Brown. Revisiting radiometric calibration for color computer vision. In IEEE International Conference on Computer Vision (ICCV), 2011, pages 129–136, 2011. (Cited on page 102.)

[68] J. Liu, D. Ren, J. Zou, Y. Wu, and S. Li. Study of automatic exposure algorithm based on HD IP camera. In International Conference on Advanced Intelligence and Awareness Internet (AIAI), 2010, pages 265–268, October 2010. (Cited on pages xx, 108, 111, 112, 113, 115, 116, 118, 119, 120, 122, 129, 130, and 142.)

[69] D. G. Lowe. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision, 60(2):91–110, 2004. (Cited on pages 101 and 155.)

[70] C. C. Loy, T. Xiang, and S. Gong. Multi-camera activity correlation analysis. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pages 1988–1995, 2009. (Cited on page 102.)

[71] A. Luther. Video Camera Technology. Artech House, Boston, 1998. (Cited on page 132.)

[72] C. Madden, E. Cheng, and M. Piccardi. Tracking people across disjoint camera views by an illumination-tolerant appearance representation. Machine Vision and Applications, 18(3):233–247, 2007. (Cited on page 102.)

[73] S. Mann and R. W. Picard. On Being ‘undigital’ With Digital Cameras: Extending Dynamic Range By Combining Differently Exposed Pictures. In Proceedings of IS&T, pages 442–448, 1995. (Cited on page 20.)

[74] S. Meech. Contemporary Quilts: Design, Surface and Stitch. Batsford, 2007. (Cited on page 129.)

[75] S. Milborrow, J. Morkel, and F. Nicolls. The MUCT Landmarked Face Database. Pattern Recognition Association of South Africa, 2010. (Cited on pages 36, 165, and 172.)

[76] T. Mitsunaga and S. K. Nayar. Radiometric self calibration. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1999, volume 1, page 380, 1999. (Cited on page 20.)

[77] J. Needham. Science and Civilization in China: Volume 4, Physics and Physical Technology, Part 1, Physics. Taipei: Caves Books Ltd, 1986. (Cited on page 12.)

[78] V. Parameswaran, M. Singh, and V. Ramesh. Illumination compensation based change detection using order consistency. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010, pages 1982–1989, June 2010. (Cited on page 101.)

[79] S. Park, G. Kim, and J. Jeon. The method of auto exposure control for low-end digital camera. In International Conference on Advanced Communication Technology (ICACT), 2009, volume 03, pages 1712–1714, February 2009. (Cited on pages 129 and 130.)

[80] S. J. Park and D. C. Bijalwan. Method of controlling adaptive auto exposure based on adaptive region weight, May 20 2014. US Patent 8,730,353. (Cited on pages 128, 129, and 130.)

[81] J. K. Paruchuri, E. P. Sathiyamoorthy, S. C. S. Cheung, and C. H. Chen. Spatially adaptive illumination modeling for background subtraction. In IEEE International Conference on Computer Vision (ICCV) Workshops, 2011, pages 1745–1752, 2011. (Cited on page 154.)

[82] B. T. Phong. Illumination for Computer Generated Pictures. Communications of the ACM, 18(6):311–317, 1975. (Cited on pages 11 and 12.)

[83] F. Porikli. Inter-camera color calibration by correlation model function. In International Conference on Image Processing (ICIP), 2003, volume 3, pages 133–136, September 2003. (Cited on page 102.)

[84] D. S. Price, X. Zhou, and H. J. Wu. Auto exposure techniques for variable lighting conditions, December 4 2012. US Patent 8,325,890. (Cited on pages 128, 129, and 130.)

[85] B. Prosser, S. Gong, and T. Xiang. Multi-camera Matching under Illumination Change Over Time. In European Conference on Computer Vision (ECCV) Workshop on Multi-camera and Multi-modal Sensor Fusion Algorithms and Applications, 2008. (Cited on page 108.)

[86] B. Prosser, S. Gong, and T. Xiang. Multi-camera Matching Using Bi-Directional Cumulative Brightness Transfer Functions. In British Machine Vision Conference, 2008. (Cited on page 102.)

[87] C. Riess, E. Eibenberger, and E. Angelopoulou. Illuminant color estimation for real-world mixed-illuminant scenes. In IEEE International Conference on Computer Vision (ICCV) Workshops, 2011, pages 782–789, 2011. (Cited on page 101.)

[88] A. Romero, M. Gouiffès, and L. Lacassagne. Covariance descriptor multiple object tracking and re-identification with colorspace evaluation. In J.-I. Park and J. Kim, editors, Asian Conference on Computer Vision (ACCV) Workshops, 2012, volume 7729, pages 400–411, 2013. (Cited on page 107.)

[89] Y. Rubner, C. Tomasi, and L. J. Guibas. The Earth Mover’s Distance as a Metric for Image Retrieval. International Journal of Computer Vision, 40(2):99–121, November 2000. (Cited on page 109.)

[90] M. S. Sayed and J. G. R. Delva. An Efficient Intensity Correction Algorithm for High Definition Video Surveillance Applications. IEEE Transactions on Circuits and Systems for Video Technology, 21(11):1622–1630, November 2011. (Cited on page 101.)

[91] Y. Y. Schechner, S. K. Nayar, and P. N. Belhumeur. Multiplexing for Optimal Lighting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(8):1339–1354, August 2007. (Cited on page 12.)

[92] G. A. F. Seber and C. J. Wild. Nonlinear Regression. Wiley Series in Probability and Statistics. Wiley-Interscience, 2003. (Cited on pages 34 and 36.)

[93] G. Sharma, editor. Digital Color Imaging Handbook. CRC Press, Boca Raton, FL, 2003. (Cited on pages 20, 21, and 23.)

[94] Y. Q. Shi and H. Sun. Image and Video Compression for Multimedia Engineering: Fundamentals, Algorithms, and Standards. Image Processing Series. CRC Press, 2nd edition, 2008. (Cited on page 19.)

[95] C. Siebler, K. Bernardin, and R. Stiefelhagen. Adaptive color transformation for person re-identification in camera networks. In ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC), 2010, pages 199–205, 2010. (Cited on page 102.)

[96] K. E. Spaulding, R. M. Vogel, and J. R. Szczepanski. Method and apparatus for color-correcting multi-channel signals of a digital camera, 1998. US Patent 5,805,213. (Cited on page 101.)

[97] T. G. Stockham, Jr. Image processing in the context of a visual model. Proceedings of the IEEE, 60(7):828–842, July 1972. (Cited on page 13.)

[98] R. Szeliski. Computer Vision: Algorithms and Applications. Texts in Computer Science. Springer, 1st edition, October 2010. (Cited on pages 12, 14, 16, 20, 24, and 41.)

[99] N. Sánchez, J. Alfonso, J. Torres, and J. M. Menéndez. ITS-based cooperative services development framework for improving safety of vulnerable road users. IET Intelligent Transport Systems, 7(2):236–243, June 2013. (Cited on page 3.)

[100] R. T. Tan, K. Nishino, and K. Ikeuchi. Color constancy through inverse-intensity chromaticity space. Journal of the Optical Society of America A, 21(3):321–334, 2004. (Cited on pages 108, 112, 113, 115, 116, 119, 120, and 122.)

[101] S. Theodoridis and K. Koutroumbas. Pattern Recognition. Academic Press, Burlington, MA / London, 2009. (Cited on page 109.)

[102] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, 58:267–288, 1994. (Cited on page 36.)

[103] A. Torralba. Accidental pinhole and pinspeck cameras: Revealing the scene outside the picture. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012, pages 374–381, 2012. (Cited on page 9.)

[104] K. E. Torrance and E. M. Sparrow. Theory for off-specular reflection from roughened surfaces. Journal of the Optical Society of America A, 57(9):1105–1114, September 1967. (Cited on page 12.)

[105] J. Torres. VARIM: A Full Automatic System for Creating a High Resolution Reflectographic Mosaic, pages 71–78. Culture figurative a confronto tra Fiandre e Italia dal XV al XVII secolo. Silvana Editoriale, Milano, Italy, 2008. (Cited on page 9.)

[106] J. Torres and J. M. Menéndez. A practical algorithm to correct geometrical distortion of image acquisition cameras. In International Conference on Image Processing (ICIP), 2004, volume 4, pages 2451–2454, October 2004. (Cited on page 16.)

[107] J. Torres and J. M. Menéndez. An adaptive real-time method for controlling the luminosity in digital video acquisition. In IASTED International Conference on Visualization, Imaging and Image Processing, pages 133–137, September 2005. (Cited on page 130.)

[108] J. Torres and J. M. Menéndez. Optimal camera exposure for video surveillance systems by predictive control of shutter speed, aperture, and gain. In IS&T/SPIE Electronic Imaging, 2015, page (Pending publication), 2015. (Cited on page 127.)

[109] J. Torres, C. Vega, T. Antelo, J. M. Menéndez, M. del Egido, M. Bueso, and A. Posse. Formation of Hyperspectral Near-Infrared Images from Artworks, pages 3–15. Cultural Heritage and Archaeological Issues in Materials Science. Cambridge University Press, 2012. (Cited on page 9.)

[110] J. Torres, K. Schutte, H. Bouma, and J. M. Menéndez. Linear color correction for multiple illumination changes and non-overlapping cameras. IET Image Processing, 2014. doi: 10.1049/iet-ipr.2014.0149. (Cited on page 99.)

[111] Y. Tsin, V. Ramesh, and T. Kanade. Statistical calibration of CCD imaging process. In IEEE International Conference on Computer Vision (ICCV), 2001, volume 1, pages 480–487, 2001. (Cited on pages 14, 17, and 20.)

[112] K. E. A. van de Sande, T. Gevers, and C. G. M. Snoek. Evaluating Color Descriptors for Object and Scene Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9):1582–1596, 2010. (Cited on page 101.)

[113] G. Wang. Active entropy camera. Machine Vision and Applications, pages 1–11, 2012. (Cited on pages 130 and 133.)

[114] H. Wannous, Y. Lucas, S. Treuillet, A. Mansouri, and Y. Voisin. Improving color correction across camera and illumination changes by contextual sample selection. Journal of Electronic Imaging, 21(2):1–14, June 2012. (Cited on page 101.)

[115] P. J. Withagen, F. C. A. Groen, and K. Schutte. CCD Color Camera Characterization for Image Measurements. IEEE Transactions on Instrumentation and Measurement, 56(1):199–203, February 2007. (Cited on page 18.)

[116] P. J. Withagen, K. Schutte, and F. C. A. Groen. Global Intensity Correction in Dynamic Scenes. International Journal of Computer Vision, 86(1):33–47, 2010. (Cited on page 101.)

[117] L. B. Wolff, S. K. Nayar, and M. Oren. Improved diffuse reflection models for computer vision. International Journal of Computer Vision, 30(1):55–71, October 1998. (Cited on page 12.)

[118] X. Yan. Linear Regression Analysis: Theory and Computing. World Scientific Publishing Company, 2009. (Cited on pages 34 and 36.)

[119] H. Yang, Y. Chang, J. Wang, and J. Huo. A new automatic exposure algorithm for video cameras using luminance histogram. Frontiers of Optoelectronics in China, 1(3–4):285–291, 2008. (Cited on page 130.)

[120] L. Yuan and J. Sun. Automatic exposure correction of consumer photographs. In A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, and C. Schmid, editors, European Conference on Computer Vision (ECCV), 2012, volume 7575 of Lecture Notes in Computer Science, pages 771–785. Springer Berlin Heidelberg, 2012. (Cited on page 155.)

[121] R. Zhang, P. S. Tsai, J. E. Cryer, and M. Shah. Shape-from-shading: a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(8):690–706, August 1999. (Cited on page 11.)

[122] Y. Zhang and H. Chu. Ray Projection for Recovering Projective Transformations and Illumination Changes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(3):446–458, March 2011. (Cited on page 102.)

[123] Y. Zheng, S. Lin, C. Kambhamettu, J. Yu, and S. B. Kang. Single-Image Vignetting Correction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(12):2243–2256, December 2009. (Cited on page 24.)

[124] M. Zuliani, C. S. Kenney, and B. S. Manjunath. The MultiRANSAC Algorithm and its Application to Detect Planar Homographies. In IEEE International Conference on Image Processing (ICIP), 2005, September 2005. (Cited on page 35.)


P U B L I C A T I O N S

Some of the ideas in this thesis have been derived from the following publications:

journals

[J1] T. Antelo, M. del Egido, A. Gabaldon, C. Vega, and J. Torres. El proyecto VARIM: Visión Artificial aplicada a la Reflectografía de Infrarrojos Mecanizada, pages 86–99. Innovación tecnológica en conservación y restauración del Patrimonio. Tecnología y Conservación del Patrimonio Arqueológico I. Universidad Autónoma de Madrid, Madrid, Spain, 2006.

[J2] N. Sánchez, J. Alfonso, J. Torres, and J. M. Menéndez. ITS-based cooperative services development framework for improving safety of vulnerable road users. IET Intelligent Transport Systems, 7(2):236–243, June 2013.

[J3] J. Torres. VARIM: A Full Automatic System for Creating a High Resolution Reflectographic Mosaic, pages 71–78. Culture figurative a confronto tra Fiandre e Italia dal XV al XVII secolo. Silvana Editoriale, Milano, Italy, 2008.

[J4] J. Torres, A. Posse, J. M. Menéndez, A. Gabaldon, C. Vega, T. Antelo, M. del Egido, and M. Bueso. VARIM, A Useful System for Acquiring and Composing Images in Painting Analysis Techniques, pages 27–42. Number 4 in e_conservation. The online magazine. e_conservation, April 2008. URL http://www.e-conservationline.com/content/view/410/100/1/0.

[J5] J. Torres, A. Posse, and J. M. Menéndez. Descripción del sistema VARIM: captación y composición automática del mosaico reflectográfico, pages 89–98. Number 8 in Bienes Culturales. Revista del Patrimonio Cultural de España. Ministerio de Cultura, 2008.

[J6] J. Torres, D. Vázquez, T. Antelo, J. M. Menéndez, A. Posse, A. Álvarez, J. Muñoz, C. Vega, and M. del Egido. Acquisition and formation of multispectral images of paintings. Óptica Pura y Aplicada, 45:201–207, June 2012.

[J7] J. Torres, C. Vega, T. Antelo, J. M. Menéndez, M. del Egido, M. Bueso, and A. Posse. Formation of Hyperspectral Near-Infrared Images from Artworks, pages 3–15. Cultural Heritage and Archaeological Issues in Materials Science. Cambridge University Press, 2012.

[J8] J. Torres, K. Schutte, H. Bouma, and J. M. Menéndez. Linear color correction for multiple illumination changes and non-overlapping cameras. IET Image Processing, 2014. doi: 10.1049/iet-ipr.2014.0149.

conferences

[C1] J. Alfonso, J. Torres, and J. M. Menéndez. Towards a user goals-based multilayered ITS architecture. In 8th ITS European Congress, May 2011.

[C2] A. B. Mejía, E. Rey, J. Alfonso, N. Sánchez, and J. Torres. Sistema de Optimización de Rutas para los Servicios de Emergencia. In XII Congreso Español de Sistemas Inteligentes de Transporte, April 2012.

[C3] A. B. Mejía, J. Torres, J. Alfonso, J. M. Menéndez, and L. Merle. Gestión de la Movilidad en OASIS. Integración de Servicios para la Gestión de Tráfico en una Arquitectura de Comunicaciones V2I I2V. In XII Congreso Español de Sistemas Inteligentes de Transporte, April 2012.

[C4] A. Pecharromán, N. Sanchez, J. Torres, and J. M. Menéndez. Real-Time Incidents Detection in the Highways of the Future. In 15th Portuguese Conference on Artificial Intelligence, EPIA, pages 108–121, October 2011.

[C5] A. Posse and J. Torres. Unión de imágenes reflectográficas basada en medidas de orden con aumento de intensidad y en selección de puntos por estructura. In XXII Simposio Nacional de la Unión Científica Internacional de Radio (URSI), September 2007.

[C6] A. Posse, J. Torres, and J. M. Menéndez. Matching points in poor edge information images. In IEEE International Conference on Image Processing (ICIP), 2009, pages 197–200, November 2009.

[C7] J. Torres and J. M. Menéndez. Virtual reality devices in driving simulators: state of the art and ongoing developments at U.P.M. In Proc. of the Workshop On The Application Of New Technologies To Driver Training, Humanist project, January 2005.

[C8] J. Torres and J. M. Menéndez. An adaptive real-time method for controlling the luminosity in digital video acquisition. In IASTED International Conference on Visualization, Imaging and Image Processing, pages 133–137, September 2005.

[C9] J. Torres and J. M. Menéndez. Optimal camera exposure for video surveillance systems by predictive control of shutter speed, aperture, and gain. In IS&T/SPIE Electronic Imaging 2015, page (Pending publication), 2015.

[C10] J. Torres and J. M. Menéndez. A practical algorithm to correct geometrical distortion of image acquisition cameras. In IEEE International Conference on Image Processing (ICIP), 2004, volume 4, pages 2451–2454, October 2004.

[C11] J. Torres and J. M. Menéndez. Algoritmo para la corrección de la luminosidad de una escena durante la adquisición de vídeo digital. In XX Simposio Nacional de la Unión Científica Internacional de Radio (URSI), September 2005.

[C12] J. Torres and J. M. Menéndez. Mejora de la composición del mosaico reflectográfico mediante métodos de corrección geométrica. In II Congreso del Grupo Español del IIC, pages 83–88, November 2005.

[C13] J. Torres, J. M. Menéndez, R. Garcia, and C. Lanza. Prototipo para la gestión de flotas de vehículos utilizando información procedente de tacógrafos digitales. In V Congreso Español – I Iberoamericano de Sistemas Inteligentes de Transporte, November 2005.

[C14] J. Torres, N. Sanchez, and J. M. Menéndez. Método de detección y segmentación de movimiento en tiempo real mediante sustracción de fondo adaptativa. In XXII Simposio Nacional de la Unión Científica Internacional de Radio (URSI), September 2007.

[C15] J. Torres, A. Posse, J. M. Menéndez, A. Gabaldon, C. Vega, T. Antelo, M. del Egido, and M. Bueso. VARIM: A computer vision system for the automatic creation of high resolution reflectographic mosaics. In 50th International Symposium ELMAR, 2008, volume 2, pages 487–490, September 2008.

[C16] J. Torres, D. Vázquez, T. Antelo, J. M. Menéndez, A. Posse, A. Álvarez, J. Muñoz, C. Vega, and M. del Egido. Adquisición y Formación de Imágenes Multiespectrales de Obras Pictóricas. In 7ª Reunión Española de Optoelectrónica, OPTOEL’11, June 2011.

[C17] J. Torres, C. Vega, M. del Egido, A. Posse, J. M. Menéndez, T. Antelo, and M. Bueso. Formation of Hyperspectral Near-Infrared Images from Artworks. In XX International Materials Research Conference 2011, August 2011.

[C18] C. Vega, J. Torres, T. Antelo, M. del Egido, J. M. Menéndez, and M. Bueso. VARIM 2.0: Non invasive NIR hyperspectral imaging for analysis of cultural beings. In International Congress on Science and Technology for the Conservation of Cultural Heritage, October 2012.

invited conferences

[I1] A. Gabaldon, C. Vega, T. Antelo, and J. Torres. VARIM: Visión artificial aplicada a la reflectografía de infrarrojos mecanizada. In Jornada Científica: Innovación Tecnológica en conservación y Restauración del Patrimonio. IV Semana de la ciencia, November 2004.

[I2] J. M. Menéndez and J. Torres. El tacógrafo digital desde un punto de vista tecnológico. In Seminario sobre Tacógrafo Digital y Sistemas de Pesaje Dinámico, November 2005.

[I3] J. M. Menéndez and J. Torres. Nuevos conceptos de arquitectura y comunicaciones en las autopistas. In Cursos de Verano de la UPM, July 2011.

[I4] J. M. Menéndez and J. Torres. Nuevos aspectos tecnológicos a considerar en la ciudad del futuro. In ICEX Centro de estudios económicos y comerciales, editor, Seminario sobre competitividad e innovación urbana, Ministerio de Economía y Competitividad, Madrid, October 2013.

[I5] J. M. Menéndez, J. Torres, M. Lopez, and M. Badillo. Inteligencia en autopistas. In Jornada sobre la I+D para la Autopista del Futuro: OASIS, November 2009.

[I6] E. Rey, A. B. Mejía, N. Sánchez, J. Alfonso, J. Torres, and J. M. Menéndez. Route optimization system for road emergency services. In European Conference on Human Centred Design for Intelligent Transport Systems, 2012.

[I7] J. Torres. Proyecto VARIM: Descripción técnica de la aplicación informática. In Jornada Científica: Innovación Tecnológica en conservación y Restauración del Patrimonio. IV Semana de la ciencia, November 2004.

[I8] J. Torres. VARIM: A full automatic system for creating a high resolution reflectographic mosaic. In Nord/Sud. Ricezioni fiamminghe al di qua delle Alpi, October 2007.

[I9] J. Torres. Técnicas de procesado de imagen y visión artificial en entornos artísticos. In La Ciencia del Arte II, June 2008.

[I10] J. Torres and N. Sanchez. Aplicaciones de visión artificial en diversos entornos. In Jornada Científica: Hands on Image Processing 2007 – Robotiker-Tecnalia, November 2007.

[I11] J. Torres and N. Sanchez. New approaches on computer vision applied to security applications. In Jornada Científica: Hands on Image Processing 2008 – Robotiker-Tecnalia, October 2008.

reports

[R1] E. Rey, J. Torres, J. Alfonso, N. Sánchez, and J. M. Menéndez. Gestión de mejora de la movilidad a partir de servicios cooperativos. Technical Report 5, Plataforma Tecnológica de la Carretera, 2012.

[R2] D. Sastre, J. Torres, and J. M. Menéndez. Sistemas de adquisición de información de tráfico: Estado actual y futuro. Technical Report 1, Plataforma Tecnológica de la Carretera, 2011.

[R3] J. Torres and J. M. Menéndez. Técnicas avanzadas de fusión de información de fuentes heterogéneas para la extracción de información de movilidad en carreteras. Technical Report 1, Plataforma Tecnológica de la Carretera, 2013.


colophon

This document was typeset using the typographical look-and-feel classicthesis developed by André Miede. The style was inspired by Robert Bringhurst’s seminal book on typography “The Elements of Typographic Style”. classicthesis is available for both LaTeX and LyX:

http://code.google.com/p/classicthesis/

Happy users of classicthesis usually send a real postcard to the author; a collection of postcards received so far is featured here:

http://postcards.miede.de/

Final Version as of November 23, 2014 (classicthesis version 3.0.0).