Transformation of Input Space using Statistical Moments : EA-Based Approach



Ahmed Kattan: Um Al Qura University, Saudi Arabia
Michael Kampouridis: University of Kent, UK
Yew-Soon Ong: Nanyang Technological University, Singapore
Khalid Mehamdi: Um Al Qura University, Saudi Arabia

The problem

Standard regression models are presented with n samples from an input space, in the form of observational data (x_i, y_i), i = 1, ..., n. Each x_i denotes a k-dimensional input vector of design variables and y_i is the response.

When k ≫ n, high variance and over-fitting become a major concern.

The problem

[Diagram: High-dimensional regression problem → Regression model → Poor approximation]

Solutions

The curse of dimensionality is tackled by:
- Reducing the number of dimensions by selecting important features (e.g., PCA, FDA, etc.)
- Transforming the input space (e.g., GP, FFX, etc.)

The majority of work on this topic has been done for classification problems. The idea of transforming the input space to reduce the number of design variables in regression problems, and thereby improve generalisation, is relatively little explored thus far.

Contributions of this work

- A novel evolutionary approach to transform the high-dimensional input space of regression models using only statistical moments.
- An analysis to understand the impact of different statistical moments on the evolved transformation procedure.
- A dramatic improvement in the generalisation of Linear Regression (LR), making it competitive with other state-of-the-art regression models.

The proposed transformation

(x_i, y_i) → (z_i, y_i)

[Diagram: x_0, x_1, ..., x_k → Transformation → z_0, z_1, ..., z_n]

We transform the input vector x into a vector called z. The vector z is smaller than x and easier for standard regression models to approximate.

The proposed transformation

We used a standard Genetic Algorithm (GA).
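The mapping from x to z can be illustrated with a minimal sketch. It assumes that each component of z is one statistical moment computed over a chosen subset of the input variables; the `MOMENTS` pool, the `(moment, indices)` encoding and the `transform` function are hypothetical names used for illustration, not taken from the slides.

```python
import statistics

def average_deviation(values):
    """Mean absolute deviation of the values from their mean."""
    centre = statistics.fmean(values)
    return statistics.fmean([abs(v - centre) for v in values])

# Hypothetical pool of moments (assumption: the authors' full pool is not listed here).
MOMENTS = {
    "mean": statistics.fmean,
    "min": min,
    "max": max,
    "average_deviation": average_deviation,
}

def transform(x, individual):
    """Map input vector x to z: one z component per (moment, indices) pair."""
    return [MOMENTS[name]([x[i] for i in indices]) for name, indices in individual]

# A 5-dimensional input is reduced to a 3-dimensional z.
individual = [("mean", (0, 2, 3)), ("max", (1, 4)), ("average_deviation", (0, 1, 2, 3, 4))]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
z = transform(x, individual)
print(len(x), "->", len(z), z)
```

Under this encoding the length of z equals the number of moments in the individual, so shorter individuals give a lower-dimensional transformed space.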

Genetic Algorithm

Population representation

Genetic Algorithm

Search operators

Crossover, in which two individuals exchange statistical moments and their parameters at random.

[Figure: two parent individuals, each a sequence of operators op0, op1, op2, ..., opg, where every operator carries its own list of parameters (a0, a2, a3, ...)]
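A crossover of this kind can be sketched as follows. The sketch assumes an individual is a list of `(moment, parameters)` pairs and that the exchange is done position by position with a fixed probability; the slides only say the exchange is random, so `swap_prob` and the position-wise scheme are assumptions.

```python
import random

def crossover(parent_a, parent_b, rng, swap_prob=0.5):
    """Return two children; each aligned position is swapped with probability swap_prob."""
    child_a, child_b = list(parent_a), list(parent_b)
    for i in range(min(len(child_a), len(child_b))):
        if rng.random() < swap_prob:
            # A moment travels together with its parameters.
            child_a[i], child_b[i] = child_b[i], child_a[i]
    return child_a, child_b

rng = random.Random(0)
parent_a = [("mean", (0, 1)), ("max", (2,)), ("min", (3, 4))]
parent_b = [("min", (5,)), ("mean", (6, 7)), ("max", (8,))]
child_a, child_b = crossover(parent_a, parent_b, rng)
print(child_a)
print(child_b)
```

Because whole pairs are swapped, no moment is ever separated from its own parameters.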

Genetic Algorithm

Search operators

Aggressive mutation operator, which replaces a randomly selected statistical moment and its parameters with another moment selected at random from the pool of statistical moments.

[Figure: op0 and its parameters are replaced by a new op0 with new parameters (a4, a3, a9); the remaining operators op1, op2, ..., opg are unchanged]
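The aggressive mutation could look like the sketch below. It assumes the same `(moment, parameters)` encoding, that parameters are indices of input variables, and that a replacement moment receives freshly drawn parameters; `POOL`, `random_gene` and `max_params` are hypothetical.

```python
import random

# Hypothetical pool of moment names (assumption: the real pool is larger).
POOL = ["mean", "min", "max", "average_deviation", "geometric_mean"]

def random_gene(rng, k, max_params=5):
    """Draw a random moment with a random set of distinct input-variable indices."""
    name = rng.choice(POOL)
    n_params = rng.randint(1, min(max_params, k))
    return (name, tuple(rng.sample(range(k), n_params)))

def aggressive_mutation(individual, rng, k):
    """Replace one randomly chosen moment, together with its parameters."""
    mutant = list(individual)
    position = rng.randrange(len(mutant))
    mutant[position] = random_gene(rng, k)
    return mutant

rng = random.Random(1)
individual = [("mean", (0, 1)), ("max", (2,)), ("min", (3, 4))]
mutant = aggressive_mutation(individual, rng, k=10)
print(mutant)
```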

Genetic Algorithm

Search operators

Smooth mutation operator, where a parameter of a randomly selected statistical moment is mutated into a new parameter.

[Figure: a single parameter of one operator is replaced by a new parameter (a4); all operators op0, op1, op2, ..., opg stay in place]
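For contrast with the aggressive operator, a smooth mutation might be sketched as below. It again assumes that parameters are indices of input variables in the range 0..k-1, which the slides do not state explicitly; `smooth_mutation` is a hypothetical name.

```python
import random

def smooth_mutation(individual, rng, k):
    """Change a single parameter of one randomly chosen moment; the moment itself is kept."""
    mutant = list(individual)
    position = rng.randrange(len(mutant))
    name, params = mutant[position]
    params = list(params)
    params[rng.randrange(len(params))] = rng.randrange(k)  # new input-variable index
    mutant[position] = (name, tuple(params))
    return mutant

rng = random.Random(3)
individual = [("mean", (0, 1)), ("max", (2,)), ("min", (3, 4))]
mutant = smooth_mutation(individual, rng, k=10)
print(mutant)
```

The change is small by construction: the sequence of moments and the number of parameters per moment are both preserved.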

Genetic Algorithm

Fitness measure

In order to measure the effectiveness of GA individuals, we need to find out whether they will improve the performance of regression models.

This is a challenging problem because the fitness measure needs to be aware of the generalisation level induced by the transformed space.

We used the average prediction error of Linear Regression (LR) as the fitness measure for the GA.

LR is a very simple algorithm that considers the family of linear hypotheses, which in standard notation take the form h(x) = w_0 + w_1 x_1 + ... + w_k x_k.

Genetic Algorithm

Fitness measure: why LR?

- LR is known to give accurate predictions if the sample inputs are linearly aligned with their corresponding outputs.
- LR is known to perform better when the number of dimensions is limited.

Hence, given these features, LR can push the GA's evolutionary process to linearly align the transformed inputs with their outputs and to minimise the dimensionality of the new space.

Genetic Algorithm

Fitness measure

The GA aims to minimise a fitness function based on the average prediction error of LR on the transformed inputs.

Genetic Algorithm

Training

- Two disjoint sets: training and validation.
- LR uses a two-fold cross-validation approach.
- The best individual in each generation is further tested with the validation set.
- We select the individual that yields the best performance on the validation set across the run.
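The two-fold evaluation of LR on the transformed inputs might be sketched as follows. This is an illustration under stated assumptions, not the authors' fitness function: the error is taken to be the mean absolute error, the folds are the first and second halves of the training set, and LR is fitted with the normal equations (which requires linearly independent columns). `fit_lr`, `predict` and `two_fold_cv_error` are hypothetical names.

```python
def fit_lr(Z, y):
    """Ordinary least squares via the normal equations and Gauss-Jordan elimination."""
    rows = [[1.0] + list(z) for z in Z]  # leading 1.0 gives the intercept term
    m = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(m)]
           + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(m)]
    for c in range(m):
        pivot = max(range(c, m), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(m):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

def predict(w, z):
    return w[0] + sum(wi * zi for wi, zi in zip(w[1:], z))

def two_fold_cv_error(Z, y):
    """Average absolute prediction error of LR over two folds (assumed error measure)."""
    half = len(Z) // 2
    first, second = slice(0, half), slice(half, None)
    errors = []
    for train, test in ((first, second), (second, first)):
        w = fit_lr(Z[train], y[train])
        errors += [abs(predict(w, z) - t) for z, t in zip(Z[test], y[test])]
    return sum(errors) / len(errors)

Z = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
linear_error = two_fold_cv_error(Z, [2.0 * z[0] + 1.0 for z in Z])
curved_error = two_fold_cv_error(Z, [z[0] ** 2 for z in Z])
print(linear_error, curved_error)
```

A transformed space whose inputs line up linearly with the outputs scores close to zero, while a curved relationship is penalised, which is the pressure the slides attribute to using LR as the fitness measure.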

Empirical tests

We tested the effects of the transformation procedure on LR and compared the results against five regression models (some also combined with PCA), namely:
- RBFN
- RBFN + PCA
- Kriging
- Kriging + PCA
- LR
- LR + PCA
- Piecewise LR
- Genetic Programming
- Genetic Programming + PCA

Empirical tests

We tested 5 benchmark functions:
- F1 = Rastrigin function
- F2 = Schwefel function
- F3 = Michalewicz function
- F4 = Sphere function
- F5 = Dixon & Price function
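For reference, the five benchmarks can be written down using their common textbook definitions. The slides do not say which variants, constants or input domains were used, so the forms below (including the 418.9829 offset in Schwefel and the steepness m = 10 in Michalewicz) are assumptions.

```python
import math

def rastrigin(x):
    return 10 * len(x) + sum(v * v - 10 * math.cos(2 * math.pi * v) for v in x)

def schwefel(x):
    return 418.9829 * len(x) - sum(v * math.sin(math.sqrt(abs(v))) for v in x)

def michalewicz(x, m=10):
    # The index i in the formula is 1-based.
    return -sum(math.sin(v) * math.sin(i * v * v / math.pi) ** (2 * m)
                for i, v in enumerate(x, start=1))

def sphere(x):
    return sum(v * v for v in x)

def dixon_price(x):
    # (x_1 - 1)^2 + sum over i = 2..d of i * (2 * x_i^2 - x_{i-1})^2, with 1-based indices.
    return (x[0] - 1) ** 2 + sum(i * (2 * x[i - 1] ** 2 - x[i - 2]) ** 2
                                 for i in range(2, len(x) + 1))

point = [0.5] * 100  # a 100-variable input, matching the smallest setting in the tests
for f in (rastrigin, schwefel, michalewicz, sphere, dixon_price):
    print(f.__name__, f(point))
```

Each function accepts an input of any length, so the same definitions cover the 100, 500 and 1000 variable settings.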

Empirical tests

For each test function, we trained all regression models to approximate the given function when the number of variables is:
- 100 variables
- 500 variables
- 1000 variables

Empirical tests

Approximation quality: Sphere function for 2 variables

[Figure: LR approximation of the Sphere function]

[Figure: LR approximation of the Sphere function after input transformation]

Learn from evolution

Because the evolved transformation procedures perform well, it would be interesting to understand what the evolved solutions actually do. When looking at the evolved sequences of moments, one quickly realises that they are not easy to understand. Therefore, we visually project the statistical moments onto heat maps according to their contribution to good solutions for each problem.

If a statistical moment is absolutely essential to produce a good transformation procedure, then it must be present in individuals with good fitness. Other, less essential statistical moments may be present in both good and inferior individuals, so their fitness will be closer to the average fitness over all individuals. In order to rank the importance of statistical moments, we distribute the fitness value of each individual equally over all moments present in that individual.
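The ranking step can be sketched as follows. The equal split of each individual's fitness over its moments follows the description above; averaging the shares per moment across the population is an assumption about how the values are aggregated before plotting, and `moment_scores` is a hypothetical name.

```python
from collections import defaultdict

def moment_scores(population):
    """population: list of (individual, fitness); individual: list of (moment, params)."""
    totals, counts = defaultdict(float), defaultdict(int)
    for individual, fitness in population:
        share = fitness / len(individual)  # equal split over the moments present
        for name, _ in individual:
            totals[name] += share
            counts[name] += 1
    # Assumed aggregation: mean share per appearance of each moment.
    return {name: totals[name] / counts[name] for name in totals}

population = [
    ([("min", (0,)), ("max", (1,))], 2.0),
    ([("min", (0,)), ("copy", (2,))], 6.0),
]
scores = moment_scores(population)
print(scores)
```

Since the fitness is an error to be minimised, a moment that appears mostly in good individuals ends up with a low score, while a moment spread across good and poor individuals drifts towards the population average.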

Learn from evolution

It is clear from the heat maps that each problem has its unique characteristics. Interestingly, there is a consensus among all maps that the following operators do not contribute to the construction of good transformation procedures: copy copy intercept.

Learn from evolution

Also, all maps agree that the following are important across all problems:
- Average Deviation
- Geometric Mean
- Min
- Max

We still do not have a full understanding of the effect of these moments on the transformed space. In future research we will focus on this aspect.

Conclusions

In this work we presented:
- A novel evolutionary approach to transform the high-dimensional input space of regression models using only statistical moments.
- An analysis to understand the impact of different statistical moments on the evolved transformation procedure.
- A dramatic improvement in LR's generalisation, making it competitive with other state-of-the-art regression models.

We hope our results will inspire other researchers to build a deeper understanding of how simple statistical moments contribute to good transformations.

Thank you for paying attention!
