Understanding of complex data using Computational Intelligence methods

Włodzisław Duch
Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland
http://www.phys.uni.torun.pl/~duch


Page 1: Understanding of complex data  using Computational Intelligence methods

Understanding of complex data using Computational Intelligence methods

Włodzisław Duch

Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland
http://www.phys.uni.torun.pl/~duch

Page 2: Understanding of complex data  using Computational Intelligence methods

What am I going to say

• Data and CI
• What we hope for.
• Forms of understanding.
• Visualization.
• Prototypes.
• Logical rules.
• Some knowledge discovered.
• Expert system for psychometry.
• Conclusions, or why am I saying this?

Page 3: Understanding of complex data  using Computational Intelligence methods

Types of Data

• Data was precious! Now it is overwhelming ...
• Statistical data – clean, numerical, controlled experiments, vector space model.
• Relational data – marketing, finances.
• Textual data – Web, NLP, search.
• Complex structures – chemistry, economics.
• Sequence data – bioinformatics.
• Multimedia data – images, video.
• Signals – dynamic data, biosignals.
• AI data – logical problems, games, behavior …

Page 4: Understanding of complex data  using Computational Intelligence methods

Computational Intelligence

[Diagram relating Data, Knowledge, Computational Intelligence and Artificial Intelligence. Component fields shown: expert systems, fuzzy logic, pattern recognition, machine learning, probabilistic methods, multivariate statistics, visualization, evolutionary algorithms, neural networks, soft computing.]

Page 5: Understanding of complex data  using Computational Intelligence methods

Turning data into knowledge

What should CI methods do?

• Provide descriptive and predictive non-parametric models of data.
• Classify, approximate, associate, correlate and complete patterns.
• Discover new categories and interesting patterns.
• Help to visualize multi-dimensional relationships among data samples.
• Allow the data to be understood in some way.
• Help to model brains!

Page 6: Understanding of complex data  using Computational Intelligence methods

Forms of useful knowledge

AI/Machine Learning camp: Neural nets are black boxes. Unacceptable! Symbolic rules forever.

But ... knowledge accessible to humans is in:

• symbols,
• similarity to prototypes,
• images, visual representations.

What type of explanation is satisfactory?
Interesting question for cognitive scientists.

Different answers in different fields.

Page 7: Understanding of complex data  using Computational Intelligence methods

Data understanding

Types of explanation:

• visualization-based: maps, diagrams, relations ...
• exemplar-based: prototypes and similarity;
• logic-based: symbols and rules.

• Humans remember examples of each category and refer to such examples – as similarity-based or nearest-neighbors methods do.
• Humans create prototypes out of many examples – as Gaussian classifiers, RBF networks, neurofuzzy systems do.
• Logical rules are the highest form of summarization of knowledge.

Page 8: Understanding of complex data  using Computational Intelligence methods

Visualization: dendrograms

All projections (cuboids) on 2D subspaces are identical, dendrograms do not show the structure.

Normal and malignant lymphocytes.

Page 9: Understanding of complex data  using Computational Intelligence methods

Visualization: 2D projections

All projections (cuboids) on 2D subspaces are identical, dendrograms do not show the structure.

3-bit parity + all 5-bit combinations.

Page 10: Understanding of complex data  using Computational Intelligence methods

Visualization: MDS mapping

Results of pure MDS mapping + centers of hierarchical clusters connected.

3-bit parity + all 5-bit combinations.

Page 11: Understanding of complex data  using Computational Intelligence methods

Visualization: 3D projections

Only age is continuous, other values are binary.

Fine Needle Aspirate of Breast Lesions, red=malignant, green=benign.
A.J. Walker, S.S. Cross, R.F. Harrison, Lancet 1999, 394, 1518-1521

Page 12: Understanding of complex data  using Computational Intelligence methods

Visualization: MDS mappings

Try to preserve all distances in a 2D nonlinear mapping.

MDS for large sets using LVQ + relative mapping: Antoine Naud + WD, this conference.
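
As a rough illustration of what preserving all distances means, the sketch below computes a raw stress value: the sum of squared differences between pairwise distances in the original space and in the 2D map. The function name `raw_stress` and the toy points are assumptions made for this example only; practical MDS procedures minimize a (usually normalized) stress of this kind iteratively.

```python
import math

def raw_stress(original, mapped):
    """Sum of squared differences between pairwise distances
    in the original space and in the low-dimensional map."""
    n = len(original)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_orig = math.dist(original[i], original[j])
            d_map = math.dist(mapped[i], mapped[j])
            total += (d_orig - d_map) ** 2
    return total

# Hypothetical toy data: three 3D points lying in the z = 0 plane.
points_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
perfect_map = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]      # distances kept
collapsed_map = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]    # distances lost

print(raw_stress(points_3d, perfect_map))
print(raw_stress(points_3d, collapsed_map))
```

A map with lower stress distorts fewer distances, so it is the better 2D picture of the same data.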

Page 13: Understanding of complex data  using Computational Intelligence methods

Prototype-based rules

C-rules (crisp) are a special case of F-rules (fuzzy rules).
F-rules are a special case of P-rules (prototype rules).
P-rules have the form:

IF P = arg min_R D(X,R) THEN Class(X)=Class(P)

D(X,R) is a dissimilarity (distance) function, determining decision borders around prototype P.

P-rules are easy to interpret!
IF X=You are most similar to P=Superman THEN You are in the Super-league.
IF X=You are most similar to P=Weakling THEN You are in the Failed-league.

“Similar” may involve different features or a different D(X,P).
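
A P-rule can be sketched in a few lines: find the prototype with the smallest dissimilarity to X and return its class. The prototypes, feature values and the helper names `euclidean` and `p_rule` below are hypothetical, chosen only to mirror the Superman/Weakling example; any other distance function could be passed in.

```python
import math

def euclidean(x, p):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, p)))

def p_rule(x, prototypes, distance=euclidean):
    """IF P = arg min_R D(X,R) THEN Class(X) = Class(P)."""
    best_vector, best_label = min(prototypes, key=lambda pl: distance(x, pl[0]))
    return best_label

# Hypothetical prototypes: (feature vector, class label).
prototypes = [
    ((9.0, 9.0), "Super-league"),   # P = Superman
    ((1.0, 1.0), "Failed-league"),  # P = Weakling
]

print(p_rule((8.0, 7.5), prototypes))
print(p_rule((2.0, 1.0), prototypes))
```

Swapping the `distance` argument changes the shape of the decision borders without changing the form of the rule.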

Page 14: Understanding of complex data  using Computational Intelligence methods

P-rules

Euclidean distance leads to Gaussian fuzzy membership functions + product as T-norm:

$D^2(X,P) = \sum_i d(X_i,P_i) = \sum_i W_i (X_i - P_i)^2$

$\mu_P(X) = e^{-D^2(X,P)} = e^{-\sum_i d(X_i,P_i)} = \prod_i e^{-W_i (X_i - P_i)^2} = \prod_i \mu_i(X_i,P_i)$

Manhattan function => $\mu(X;P) = \exp\{-|X-P|\}$

Various distance functions lead to different MF.
Ex. data-dependent distance functions, for symbolic data:

$D_{VDM}(X,Y) = \sum_j \sum_i \left| p(C_j|X_i) - p(C_j|Y_i) \right|$

$D_{PDF}(X,Y) = \sum_j \sum_i \left| p(X_i|C_j) - p(Y_i|C_j) \right|$
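
The link between the weighted Euclidean distance and Gaussian membership functions combined by a product can be checked numerically. This is a minimal sketch; the function names and the sample vectors are assumptions made for illustration.

```python
import math

def weighted_sq_distance(x, p, w):
    """D^2(X,P) = sum_i W_i (X_i - P_i)^2."""
    return sum(wi * (xi - pi) ** 2 for xi, pi, wi in zip(x, p, w))

def membership_from_distance(x, p, w):
    """mu_P(X) = exp(-D^2(X,P))."""
    return math.exp(-weighted_sq_distance(x, p, w))

def membership_as_product(x, p, w):
    """Product (T-norm) of one-dimensional Gaussian memberships."""
    result = 1.0
    for xi, pi, wi in zip(x, p, w):
        result *= math.exp(-wi * (xi - pi) ** 2)
    return result

# Hypothetical sample: input X, prototype P, feature weights W.
x, p, w = (1.0, 2.5, 0.3), (0.5, 2.0, 1.0), (1.0, 0.5, 2.0)
print(membership_from_distance(x, p, w))
print(membership_as_product(x, p, w))
```

Both functions give the same value up to rounding, which is why a prototype rule with Euclidean distance can be read as a fuzzy rule with Gaussian membership functions.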

Page 15: Understanding of complex data  using Computational Intelligence methods

Crisp P-rules

New distance functions from info theory => interesting MF.
Membership functions => new distance function, with local D(X,R) for each cluster.

Crisp logic rules: use the $L_\infty$ norm:

$D_\infty(X,P) = \|X-P\|_\infty = \max_i W_i |X_i - P_i|$

$D_\infty(X,P) = \mathrm{const}$ => rectangular contours.

$L_\infty$ (Chebyshev) distance with thresholds $\theta_P$:

IF $D_\infty(X,P) \le \theta_P$ THEN C(X)=C(P)

is equivalent to a conjunctive crisp rule

IF $X_1 \in [P_1 - \theta_P/W_1,\ P_1 + \theta_P/W_1] \wedge \dots \wedge X_N \in [P_N - \theta_P/W_N,\ P_N + \theta_P/W_N]$ THEN C(X)=C(P)
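
The equivalence between the thresholded Chebyshev distance and a conjunction of interval conditions can be checked with a short sketch. The prototype, weights and threshold are hypothetical values, the weights are assumed to be strictly positive, and the function names are illustrative.

```python
def chebyshev(x, p, w):
    """Weighted L-infinity distance: max_i W_i |X_i - P_i|."""
    return max(wi * abs(xi - pi) for xi, pi, wi in zip(x, p, w))

def distance_rule(x, p, w, theta):
    """Prototype form: D(X,P) <= theta."""
    return chebyshev(x, p, w) <= theta

def interval_rule(x, p, w, theta):
    """Crisp form: every X_i lies in [P_i - theta/W_i, P_i + theta/W_i]."""
    return all(pi - theta / wi <= xi <= pi + theta / wi
               for xi, pi, wi in zip(x, p, w))

# Hypothetical prototype, positive weights and threshold.
p, w, theta = (2.0, 5.0), (1.0, 0.5), 1.0   # intervals [1, 3] and [3, 7]

for x in [(2.5, 6.0), (3.5, 5.0), (2.0, 7.5)]:
    print(x, distance_rule(x, p, w, theta), interval_rule(x, p, w, theta))
```

The two forms agree on every point, so a crisp rule is just a prototype with a hyper-rectangular neighborhood.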

Page 16: Understanding of complex data  using Computational Intelligence methods

Decision borders

Euclidean distance from 3 prototypes, one per class.

Minkowski distance (exponent 20) from 3 prototypes.

Contours D(P,X)=const and decision borders D(P,X)=D(Q,X).

Page 17: Understanding of complex data  using Computational Intelligence methods

P-rules for Wine

Manhattan distance: 6 prototypes kept, 4 errors, f2 removed.

Many other solutions.

$L_\infty$ distance (crisp rules): 15 prototypes kept, 5 errors, f2, f8, f10 removed.

Euclidean distance: 11 prototypes kept, 7 errors.

Page 18: Understanding of complex data  using Computational Intelligence methods

Complex objects

The vector space concept is not sufficient for complex objects.
A common set of features is meaningless.

AI: complex objects, states, problem descriptions.
General approach: it is sufficient to evaluate similarity D(O_i,O_j).

Compare O_i, O_j: define a transformation

$T\,O_i = \prod_k \hat{T}_k\, O_i = O_j$

Elementary operators $\hat{T}_k$, e.g. substring substitutions.
Many transformations T connecting a pair of objects O_i and O_j exist.
Cost of a transformation = sum of the costs of its elementary operators.
Similarity: lowest transformation cost.
Bioinformatics: sophisticated similarity functions for sequences.
Dynamic programming finds similarities.
Use adaptive costs and a general framework for SBL methods.
See Marczak et al. (this conference).
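
For strings, the lowest transformation cost can be found by dynamic programming. The sketch below uses the classic edit-distance recurrence with single-character substitution, insertion and deletion as the elementary operators; it is a stand-in chosen for illustration, not the similarity function used in the talk. The cost parameters `sub_cost` and `indel_cost` are hypothetical knobs standing in for adaptive costs.

```python
def transformation_cost(a, b, sub_cost=1.0, indel_cost=1.0):
    """Lowest total cost of elementary operators turning string a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    cost = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * indel_cost          # delete all of a[:i]
    for j in range(1, cols):
        cost[0][j] = j * indel_cost          # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            cost[i][j] = min(
                cost[i - 1][j] + indel_cost,                      # deletion
                cost[i][j - 1] + indel_cost,                      # insertion
                cost[i - 1][j - 1] + (0.0 if same else sub_cost)  # substitution
            )
    return cost[-1][-1]

print(transformation_cost("acgt", "aggt"))
print(transformation_cost("acgt", "aggt", sub_cost=3.0))
```

Raising `sub_cost` makes the cheapest path switch from one substitution to a deletion plus an insertion, which shows how the choice of costs shapes the resulting similarity.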

Page 19: Understanding of complex data  using Computational Intelligence methods

Promoters

DNA strings, 57 nucleotides, 53 + and 53 - samples
tactagcaatacgcttgcgttcggtggttaagtatgtataatgcgcgggcttgtcgt

Euclidean distance, symbolic s =a, c, t, g replaced by x=1, 2, 3, 4

PDF distance, symbolic s=a, c, t, g replaced by p(s|+)

Page 20: Understanding of complex data  using Computational Intelligence methods

Logical rules

Crisp logic rules: for continuous x use linguistic variables (predicate functions).

$s_k(x) \equiv \mathrm{True}\,[X_k \le x \le X'_k]$, for example:
small(x) = True{x | x < 1}
medium(x) = True{x | x ∈ [1,2]}
large(x) = True{x | x > 2}

Linguistic variables are used in crisp (propositional, Boolean) logic rules:

IF small-height(X) AND has-hat(X) AND has-beard(X)
THEN (X is a Brownie)
ELSE IF ... ELSE ...
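
Linguistic variables are simply predicates over intervals, and a crisp rule is a conjunction of such predicates. A minimal sketch follows; the thresholds come from the example above, while the function `classify`, its arguments and the fallback label are hypothetical.

```python
def small(x):
    return x < 1

def medium(x):
    return 1 <= x <= 2

def large(x):
    return x > 2

def classify(height, has_hat, has_beard):
    """IF small-height AND has-hat AND has-beard THEN Brownie ELSE ..."""
    if small(height) and has_hat and has_beard:
        return "Brownie"
    return "unknown"   # placeholder for the remaining ELSE IF branches

print(classify(0.5, True, True))
print(classify(1.5, True, True))
```

Each value of x satisfies exactly one of the three predicates, so they partition the feature axis into non-overlapping intervals.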

Page 21: Understanding of complex data  using Computational Intelligence methods

Crisp logic decisions

Crisp logic is based on rectangular membership functions:
True/False values jump from 0 to 1.

Step functions are used for partitioning of the feature space.

Very simple hyper-rectangular decision borders.

Severe limitation on the expressive power of crisp logical rules!

Page 22: Understanding of complex data  using Computational Intelligence methods

DT decision borders

Decision trees lead to specific decision borders.
SSV tree on Wine data, proline + flavanoids content.

Page 23: Understanding of complex data  using Computational Intelligence methods

Logical rules - advantages

Logical rules, if simple enough, are preferable.

• Rules may expose limitations of black box solutions.
• Only relevant features are used in rules.
• Rules may sometimes be more accurate than NN and other CI methods.
• Overfitting is easy to control, rules usually have a small number of parameters.
• Rules forever !?

A logical rule about logical rules is:
IF the number of rules is relatively small
AND the accuracy is sufficiently high
THEN rules may be an optimal choice.

Page 24: Understanding of complex data  using Computational Intelligence methods

Logical rules - limitations

Logical rules are preferred but ...

• Only one class is predicted, p(C_i|X,M) = 0 or 1; a black-and-white picture may be inappropriate in many applications.
• Discontinuous cost functions allow only non-gradient optimization.
• Sets of rules are unstable: a small change in the dataset leads to a large change in the structure of complex sets of rules.
• Reliable crisp rules may reject some cases as unclassified.
• Interpretation of crisp rules may be misleading.
• Fuzzy rules are not so comprehensible.

Page 25: Understanding of complex data  using Computational Intelligence methods

Rules - choices

Simplicity vs. accuracy.
Confidence vs. rejection rate.

Confusion matrix p(true | predicted), where r marks rejected cases and $p_+$, $p_-$ are the row sums:

$\begin{pmatrix} p_{++} & p_{+-} & p_{+r} \\ p_{-+} & p_{--} & p_{-r} \end{pmatrix}$

Accuracy (overall): $A(M) = p_{++} + p_{--}$
Error rate: $L(M) = p_{+-} + p_{-+}$
Rejection rate: $R(M) = p_{+r} + p_{-r} = 1 - L(M) - A(M)$
Sensitivity: $S_+(M) = p_{+|+} = p_{++} / p_+$
Specificity: $S_-(M) = p_{-|-} = p_{--} / p_-$

$p_{++}$ is a hit; $p_{-+}$ is a false alarm; $p_{+-}$ is a miss.
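
These quantities follow directly from the joint probabilities of (true class, predicted class), with rejection treated as a third prediction. The sketch below assumes the first index is the true class and uses made-up probabilities that sum to 1; the function name `rule_metrics` is illustrative.

```python
def rule_metrics(p):
    """p[(true, predicted)] are joint probabilities; predicted may be 'r' (rejected)."""
    p_plus = p[("+", "+")] + p[("+", "-")] + p[("+", "r")]    # row sum p_+
    p_minus = p[("-", "+")] + p[("-", "-")] + p[("-", "r")]   # row sum p_-
    return {
        "accuracy": p[("+", "+")] + p[("-", "-")],
        "error": p[("+", "-")] + p[("-", "+")],
        "rejection": p[("+", "r")] + p[("-", "r")],
        "sensitivity": p[("+", "+")] / p_plus,
        "specificity": p[("-", "-")] / p_minus,
    }

# Hypothetical confusion matrix (entries sum to 1).
p = {
    ("+", "+"): 0.40, ("+", "-"): 0.05, ("+", "r"): 0.05,
    ("-", "+"): 0.04, ("-", "-"): 0.40, ("-", "r"): 0.06,
}

for name, value in rule_metrics(p).items():
    print(f"{name}: {value:.2f}")
```

Accuracy, error rate and rejection rate always add up to 1, so raising the rejection rate is one way to trade coverage for confidence.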

Page 26: Understanding of complex data  using Computational Intelligence methods

Neural networks and rules

[Figure: neural network with inputs Sex, Age, Smoking, ECG: ST Elevation, Pain Intensity and Pain Duration, connected through input weights and output weights to the output Myocardial Infarction ~ p(MI|X), shown with the example value 0.7.]

Page 27: Understanding of complex data  using Computational Intelligence methods

Knowledge from networks

Simplify networks: force most weights to 0, quantize remaining parameters, be constructive!

• Regularization: mathematical technique improving predictive abilities of the network.
• Result: MLP2LN neural networks that are equivalent to logical rules.

Page 28: Understanding of complex data  using Computational Intelligence methods

MLP2LN

Converts MLP neural networks into a network performing logical operations (LN).

[Figure: network layers]
• Input layer
• Linguistic units: windows, filters
• Aggregation: better features
• Rule units: threshold logic
• Output: one node per class.

Page 29: Understanding of complex data  using Computational Intelligence methods

Learning dynamics

Decision regions shown every 200 training epochs in x3, x4 coordinates; borders are optimally placed with wide margins.

Page 30: Understanding of complex data  using Computational Intelligence methods

Neurofuzzy systems

Feature Space Mapping (FSM) neurofuzzy system.
Neural adaptation, estimation of probability density distribution (PDF) using a single hidden layer network (RBF-like) with nodes realizing separable functions:

$G(X;P) = \prod_{i=1}^{N} G_i(X_i;P_i)$

Fuzzy: crisp μ(x) (no/yes) replaced by a degree μ(x). Triangular, trapezoidal, Gaussian ... MF.

M.f-s in many dimensions:

Page 31: Understanding of complex data  using Computational Intelligence methods

GhostMiner Philosophy

GhostMiner, data mining tools from our lab.

• There is no free lunch – provide different types of tools for knowledge discovery: decision tree, neural, neurofuzzy, similarity-based, committees.
• Provide tools for visualization of data.
• Support the process of knowledge discovery/model building and evaluating, organizing it into projects.
• Separate the process of model building and knowledge discovery from model use => GhostMiner Developer & GhostMiner Analyzer

Page 32: Understanding of complex data  using Computational Intelligence methods

Recurrence of breast cancer

Data from: Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia.

286 cases, 201 no recurrence (70.3%), 85 recurrence cases (29.7%)

no-recurrence-events, 40-49, premeno, 25-29, 0-2, ?, 2, left, right_low, yes

9 nominal features: age (9 bins), menopause, tumor-size (12 bins), nodes involved (13 bins), node-caps, degree-malignant (1,2,3), breast, breast quad, radiation.

Page 33: Understanding of complex data  using Computational Intelligence methods

Recurrence of breast cancer

Data from: Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia.

Many systems used, 65-78% accuracy reported.

Single rule:
IF (nodes-involved ∉ [0,2] ∧ degree-malignant = 3) THEN recurrence, ELSE no-recurrence

76.2% accuracy, only trivial knowledge in the data: highly malignant breast cancer involving many nodes is likely to strike back.

Page 34: Understanding of complex data  using Computational Intelligence methods

Recurrence - comparison.

Method                    10xCV accuracy (%)
MLP2LN, 1 rule            76.2
SSV DT, stable rules      75.7 ± 1.0
k-NN, k=10, Canberra      74.1 ± 1.2
MLP+backprop.             73.5 ± 9.4 (Zarndt)
CART DT                   71.4 ± 5.0 (Zarndt)
FSM, Gaussian nodes       71.7 ± 6.8
Naive Bayes               69.3 ± 10.0 (Zarndt)
Other decision trees      < 70.0

Page 35: Understanding of complex data  using Computational Intelligence methods

Breast cancer diagnosis.

Data from University of Wisconsin Hospital, Madison, collected by Dr. W.H. Wolberg.

699 cases, 9 features quantized from 1 to 10: clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, mitoses.

Task: distinguish benign from malignant cases.

Page 36: Understanding of complex data  using Computational Intelligence methods

Breast cancer rules.

Data from University of Wisconsin Hospital, Madison, collected by Dr. W.H. Wolberg.

Simplest rule from MLP2LN, large regularization:

If uniformity of cell size 3 Then benign Else malignant
Sensitivity=0.97, Specificity=0.85

More complex NN solutions, from 10CV estimate:
Sensitivity=0.98, Specificity=0.94

Page 37: Understanding of complex data  using Computational Intelligence methods

Breast cancer comparison.

Method                       10xCV accuracy (%)
k-NN, k=3, Manh              97.0 ± 2.1 (GM)
FSM, neurofuzzy              96.9 ± 1.4 (GM)
Fisher LDA                   96.8
MLP+backprop.                96.7 (Ster, Dobnikar)
LVQ                          96.6 (Ster, Dobnikar)
IncNet (neural)              96.4 ± 2.1 (GM)
Naive Bayes                  96.4
SSV DT, 3 crisp rules        96.0 ± 2.9 (GM)
LDA (linear discriminant)    96.0
Various decision trees       93.5-95.6

Page 38: Understanding of complex data  using Computational Intelligence methods

Melanoma skin cancer

Collected in the Outpatient Center of Dermatology in Rzeszów, Poland.

Four types of Melanoma: benign, blue, suspicious, or malignant.

250 cases, with almost equal class distribution.
Each record in the database has 13 attributes: asymmetry, border, color (6), diversity (5).
TDS (Total Dermatoscopy Score) - single index.
Goal: hardware scanner for preliminary diagnosis.

Page 39: Understanding of complex data  using Computational Intelligence methods

Melanoma results

Method                        Rules   Training %   Test %
MLP2LN, crisp rules           4       98.0 all     100
SSV Tree, crisp rules         4       97.5±0.3     100
FSM, rectangular f.           7       95.5±1.0     100
knn + prototype selection     13      97.5±0.0     100
FSM, Gaussian f.              15      93.7±1.0     95±3.6
knn k=1, Manh, 2 features     --      97.4±0.3     100
LERS, rough rules             21      --           96.2

Page 40: Understanding of complex data  using Computational Intelligence methods

Antibiotic activity of pyrimidine compounds.

Pyrimidines: which compound has stronger antibiotic activity?

Common template, substitutions added at 3 positions, R3, R4 and R5.

27 features taken into account: polarity, size, hydrogen-bond donor or acceptor, pi-donor or acceptor, polarizability, sigma effect.

Pairs of chemicals, 54 features, are compared: which one has higher activity?

2788 cases, 5-fold crossvalidation tests.

Page 41: Understanding of complex data  using Computational Intelligence methods

Antibiotic activity - results.

Pyrimidines: which compound has stronger antibiotic activity?

Mean Spearman's rank correlation coefficient used: r_s

Method                    Rank correlation
FSM, 41 Gaussian rules    0.77±0.03
Golem (ILP)               0.68
Linear regression         0.65
CART (decision tree)      0.50
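
For reference, Spearman's rank correlation is the Pearson correlation computed on ranks. The sketch below is a plain-Python version with average ranks for ties; the function names and the two short score lists are hypothetical and unrelated to the pyrimidine data.

```python
import math

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(a, b):
    """Pearson correlation of the rank vectors of a and b."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical true activities and model scores for five compounds.
true_activity = [1.0, 2.0, 3.0, 4.0, 5.0]
model_score = [2.0, 1.0, 4.0, 3.0, 5.0]
print(spearman(true_activity, model_score))
```

Only the ordering of the scores matters, which suits a task that asks which of two compounds is more active rather than by how much.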

Page 42: Understanding of complex data  using Computational Intelligence methods

Thyroid screening.

Garavan Institute, Sydney, Australia.

15 binary, 6 continuous features.

Training: 93+191+3488
Validate: 73+177+3178

Determine important clinical factors.
Calculate prob. of each diagnosis.

[Figure: neural network with inputs (clinical findings: age, sex, …; TSH, T4U, T3, TT4, TBG), hidden units, and final diagnoses: Normal, Hyperthyroid, Hypothyroid.]

Page 43: Understanding of complex data  using Computational Intelligence methods

Thyroid – some results.

Accuracy of diagnoses obtained with different systems.

Method Rules/Features Training % Test %

MLP2LN optimized 4/6 99.9 99.36

CART/SSV Decision Trees 3/5 99.8 99.33

Best Backprop MLP -/21 100 98.5

Naïve Bayes -/- 97.0 96.1

k-nearest neighbors -/- - 93.8

Page 44: Understanding of complex data  using Computational Intelligence methods

Psychometry

MMPI (Minnesota Multiphasic Personality Inventory) psychometric test.
Printed forms are scanned or a computerized version of the test is used.

• Raw data: 550 questions, e.g.: I am getting tired quickly: Yes - Don’t know - No

• Results are combined into 10 clinical scales and 4 validity scales using fixed coefficients.

• Each scale measures tendencies towards hypochondria, schizophrenia, psychopathic deviations, depression, hysteria, paranoia etc.

Page 45: Understanding of complex data  using Computational Intelligence methods

Psychometry

• There is no simple correlation between single values and final diagnosis.
• Results are displayed in the form of a histogram, called ‘a psychogram’. Interpretation depends on the experience and skill of an expert, and takes into account correlations between peaks.

Goal: an expert system providing evaluation and interpretation of MMPI tests at an expert level.

Problem: agreement between experts only 70% of the time; alternative diagnosis and personality changes over time are important.

Page 46: Understanding of complex data  using Computational Intelligence methods

Psychometric data

1600 cases for women, same number for men.
27 classes: norm, psychopathic, schizophrenia, paranoia, neurosis, mania, simulation, alcoholism, drug addiction, criminal tendencies, abnormal behavior due to ...

Extraction of logical rules: 14 scales = features.

Define linguistic variables and use FSM, MLP2LN, SSV - giving about 2-3 rules/class.

Page 47: Understanding of complex data  using Computational Intelligence methods

Psychometric data

10-CV for FSM is 82-85%, for C4.5 is 79-84%.
Input uncertainty +G_x around 1.5% (best ROC) improves FSM results to 90-92%.

Method   Data   N. rules   Accuracy   +G_x %
C 4.5    ♀      55         93.0       93.7
C 4.5    ♂      61         92.5       93.1
FSM      ♀      69         95.4       97.6
FSM      ♂      98         95.9       96.9

Page 48: Understanding of complex data  using Computational Intelligence methods

Psychometric Expert

Probabilities for different classes.
For greater uncertainties more classes are predicted.

Fitting the rules to the conditions: typically 3-5 conditions per rule, Gaussian distributions around measured values that fall into the rule interval are shown in green.

Verbal interpretation of each case, rule and scale dependent.
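
One simple way to picture how input uncertainty turns a crisp condition into a probability: place a Gaussian around the measured value and integrate it over the rule interval. The sketch below does this with the normal CDF; multiplying the per-condition probabilities assumes independent scales, which is a simplification made here and not a claim about the actual system. All scale values, the standard deviation and the intervals are hypothetical.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_probability(x, sigma, low, high):
    """Probability that a Gaussian centred at x falls inside [low, high]."""
    return normal_cdf((high - x) / sigma) - normal_cdf((low - x) / sigma)

def rule_probability(values, sigma, intervals):
    """Product of per-condition probabilities (independence assumed)."""
    prob = 1.0
    for x, (low, high) in zip(values, intervals):
        prob *= interval_probability(x, sigma, low, high)
    return prob

# Hypothetical rule with two conditions on two scales.
intervals = [(55.0, 65.0), (60.0, 80.0)]
print(rule_probability([60.0, 70.0], 1.5, intervals))   # well inside both intervals
print(rule_probability([55.0, 70.0], 1.5, intervals))   # first value on a border
```

A value sitting exactly on an interval border gives a probability near one half for that condition, so with larger uncertainty several competing rules, and therefore several classes, receive non-negligible probability.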

Page 49: Understanding of complex data  using Computational Intelligence methods

Visualization

Probability of classes versus input uncertainty.

Detailed input probabilities around the measured values vs. change in the single scale; changes over time define the ‘patient's trajectory’.

Interactive multidimensional scaling: zooming on the new case to inspect its similarity to other cases.

Page 50: Understanding of complex data  using Computational Intelligence methods

Conclusions

Data understanding is a challenging problem.

• Classification rules are frequently only the first step and may not be the best solution.
• Visualization is always helpful.
• P-rules may be competitive if complex decision borders are required, providing different types of rules.
• Understanding of complex objects is possible, although difficult, using adaptive costs and distance as least expensive transformations (action principles in physics).
• Why am I saying all this? Because we have hopes for great applications!

Page 51: Understanding of complex data  using Computational Intelligence methods

Challenges

• Discovery of theories rather than data models
• Integration with image/signal analysis
• Integration with reasoning in complex domains
• Combining expert systems with neural networks

……..

Fully automatic universal data analysis systems: press the button and wait for the truth …

We are slowly getting there.
More & more computational intelligence tools (including our own) are available.