Algoval: Evaluation Server Past, Present and Future Simon Lucas Computer Science Dept Essex University 25 January, 2002

Algoval: Evaluation Server Past, Present and Future

Simon Lucas

Computer Science Dept

Essex University

25 January, 2002

Architecture Evolution

• Version 1: Centralised evaluation of Java submissions (Spring 2000)

• Version 2: Distributed evaluation using Java RMI (Summer 2001)

• Version 3: Distributed evaluation using XML over HTTP (Spring 2002)

Competitions

• Post-Office Sponsored OCR Competition (Autumn 2000)

• IEEE Congress on Evolutionary Computation 2001

• IEEE WCCI 2002

• ICDAR 2003

• Wide range of contests: OCR, Sequence Recognition, Object Recognition

Sample Results

Statistics

Details

More Details

Parameterised Algorithms

• Note that league table entries can include the parameters that were used to configure the algorithm

• This allows developers to observe the results of different parameter settings on the performance measures

• E.g.: problems.seqrec.SNTupleRecognizer?n=4&gap=11&eps=0.01
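
The entry format above resembles a URL query string. A minimal sketch of how such an entry might be split into a class name and its parameters, assuming the first `?` introduces the parameter list and the remaining pairs are separated by `&` (a stray `?` between pairs is tolerated too). The function name `parse_entry` is hypothetical and not part of Algoval:

```python
import re

def parse_entry(entry: str) -> tuple[str, dict[str, str]]:
    """Split a league-table entry into (algorithm name, parameter dict).

    Assumed format (not confirmed by the Algoval source):
    <qualified class name>?<key>=<value>&<key>=<value>...
    """
    name, _, query = entry.partition("?")
    params = {}
    # Accept '&' or '?' as the separator between key=value pairs.
    for pair in re.split(r"[&?]", query):
        if pair:
            key, _, value = pair.partition("=")
            params[key] = value
    return name, params

name, params = parse_entry("problems.seqrec.SNTupleRecognizer?n=4&gap=11&eps=0.01")
```

Parameter values are kept as strings here; converting them to numbers would be left to the algorithm being configured.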

Centralised

• The system restricted submissions to Java, for security reasons
  – Java programs can be run within a highly restrictive security manager

• Does not scale well under heavy load

• Many researchers unwilling to convert their algorithm implementations to Java

Centralised II

• Can measure every aspect of an algorithm's performance
  – Speed
  – Memory requirements (static, dynamic)

• All algorithms compete on a level playing field

• Very difficult for an algorithm to cheat
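
The idea of measuring speed and memory on the evaluator's own machine can be sketched as follows. This is an illustration only: the centralised system ran Java submissions, whereas this sketch uses Python, and the helper name `measure` is an assumption. Note that `tracemalloc` only sees allocations made through Python's allocator, so it is a rough stand-in for "dynamic memory".

```python
import time
import tracemalloc

def measure(algorithm, data):
    """Run algorithm(data) and return (result, seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = algorithm(data)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak

# Toy "algorithm": sorting a reversed list.
result, seconds, peak_bytes = measure(sorted, list(range(1000, 0, -1)))
```

Because every submission runs through the same wrapper on the same hardware, the timings are comparable across algorithms.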

Distributed

• Researchers can test their algorithms against others without submitting their code

• Results on new datasets can be generated immediately for all clients that are connected to the evaluation server

• Results are generated by the same evaluation method. 

• Hence meaningful comparisons can be made between different algorithms.
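
A minimal sketch of why a shared evaluation method makes results comparable: the server keeps the ground truth, clients return only their predictions, and one scoring function ranks everybody. The function names, the submission names and the choice of error rate as the measure are assumptions for illustration, not Algoval's actual measures.

```python
def error_rate(predicted, truth):
    """Fraction of test patterns that were labelled incorrectly."""
    if not truth or len(predicted) != len(truth):
        raise ValueError("predictions must match a non-empty test set")
    wrong = sum(p != t for p, t in zip(predicted, truth))
    return wrong / len(truth)

def league_table(submissions, truth):
    """Rank submissions (name -> predictions) by error rate, best first."""
    scores = {name: error_rate(pred, truth) for name, pred in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Ground truth stays on the server; clients only send predictions.
truth = ["a", "b", "c", "d"]
table = league_table(
    {"alg_one": ["a", "b", "c", "c"], "alg_two": ["a", "x", "y", "d"]},
    truth,
)
```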

Distributed (RMI)

• Based on Java’s Remote Method Invocation (RMI)

• Works okay, but client programs still need to access a Java Virtual Machine

• BUT: the algorithms can now be implemented in any language

• However: there may still be some work converting the Java data structures to the native language

Distributed II

• Since most computation is done on the clients' machines, it scales well.

• Researchers can implement their algorithms in any language they choose - it just has to talk to the evaluation proxy on their machine.

• When submitting an algorithm it is also possible to specify URLs for the author and the algorithm

• Visitors to the web-site can view league tables then follow links to the algorithm and its implementer.

Distributed (RMI)

UML Sequence

Remote Participation

• Developers download a kit

• Interface their algorithm to the spec.

• Run a command-line batch file to invoke their algorithm on a specified problem
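
The three steps above might look roughly like this from a developer's point of view. Everything here is hypothetical: the real developer kit was Java-based and its interface is not shown in these slides, so `Recognizer`, `MajorityRecognizer` and the `problem` argument are invented for illustration.

```python
import argparse
from abc import ABC, abstractmethod

class Recognizer(ABC):
    """Hypothetical spec that a participant's algorithm must satisfy."""

    @abstractmethod
    def train(self, patterns, labels):
        """Fit the algorithm to labelled training patterns."""

    @abstractmethod
    def classify(self, pattern):
        """Return a label for one unseen pattern."""

class MajorityRecognizer(Recognizer):
    """Toy algorithm: always answers with the most common training label."""

    def train(self, patterns, labels):
        self.label = max(set(labels), key=labels.count)

    def classify(self, pattern):
        return self.label

def build_parser():
    """Command-line front end, standing in for the kit's batch file."""
    parser = argparse.ArgumentParser(description="Run an algorithm on a named problem")
    parser.add_argument("problem", help="problem identifier (hypothetical, e.g. 'ocr')")
    return parser

args = build_parser().parse_args(["ocr"])
recognizer = MajorityRecognizer()
recognizer.train([1, 2, 3], ["x", "y", "x"])
```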

Features of RMI

• Handles Object Serialization

• Hence: problem specifications can easily include complex data structures

• Fragile! Changes to the Java classes may require developers to download a new developer kit

• Does not work well through firewalls

• HTTP Tunnelling can solve some problems, but has limitations (e.g. no callbacks)

<future>XML Version</future>

• While Java RMI is platform independent (any platform with a JVM), XML is language independent

• XML version is HTTP based

• No known problems with firewalls

XML Version

• Each client (algorithm under test)
  – parses XML objects (e.g. datasets)
  – sends back XML objects (e.g. pattern classifications) to the server
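
A minimal sketch of the client side of this exchange, using Python's standard `xml.etree.ElementTree`. The element and attribute names (`dataset`, `pattern`, `classifications`, `classification`, `id`, `label`) are invented for illustration; the slides do not give the actual Algoval schema, and the classifier is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset document as it might arrive from the server.
DATASET_XML = """
<dataset>
  <pattern id="p1">0 0 1 1</pattern>
  <pattern id="p2">1 1 1 0</pattern>
</dataset>
"""

def classify(features):
    """Placeholder classifier: 'dense' if more than half the features are set."""
    return "dense" if sum(features) * 2 > len(features) else "sparse"

def respond(dataset_xml: str) -> str:
    """Parse a dataset document and return a classifications document."""
    dataset = ET.fromstring(dataset_xml.strip())
    response = ET.Element("classifications")
    for pattern in dataset.iter("pattern"):
        features = [int(v) for v in pattern.text.split()]
        ET.SubElement(response, "classification",
                      id=pattern.get("id"), label=classify(features))
    return ET.tostring(response, encoding="unicode")

reply = respond(DATASET_XML)
```

Only the XML crosses the wire, so the same exchange could be implemented in any language with an XML parser and an HTTP client.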

Pattern recognition servers

• Reside at particular URLs

• Can be trained on specified or supplied datasets

• Can respond to recognition requests

Example Request

• Recognize this word:

• Given the dictionary at:
  – http://ace.essex.ac.uk/viadocs/dic/pygenera.txt

• And the OCR training set at:
  – http://ace.essex.ac.uk/algoval/ocr/viadocs1.xml

• Respond with your 10 best word hypotheses
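
Such a request could be expressed as a small XML document. The sketch below is an assumption about what that might look like: the element names (`recognize`, `word`, `dictionary`, `trainingSet`) and the `nBest` attribute are hypothetical, and the word image is represented by a placeholder reference. Only the two URLs come from the example above.

```python
import xml.etree.ElementTree as ET

def build_request(image_ref, dictionary_url, training_url, n_best=10):
    """Build a word-recognition request document (hypothetical schema)."""
    request = ET.Element("recognize", nBest=str(n_best))
    ET.SubElement(request, "word", src=image_ref)
    ET.SubElement(request, "dictionary", href=dictionary_url)
    ET.SubElement(request, "trainingSet", href=training_url)
    return ET.tostring(request, encoding="unicode")

request_xml = build_request(
    "word-image-placeholder",  # stands in for the word image to recognize
    "http://ace.essex.ac.uk/viadocs/dic/pygenera.txt",
    "http://ace.essex.ac.uk/algoval/ocr/viadocs1.xml",
)
```

Passing the dictionary and training set by URL keeps the request small and lets the recognition server fetch, and possibly cache, the resources itself.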

Example Response

                          

1. MELISSOBLAPTES
2. ENDOMMMASIS
3. HETEROGRAPHIS
4. TRICHOBAPTES
5. HETEROCHROSIS
6. PHLOEOGRAPTIS
7. HETEROCNEPHES
8. DRESCOMPOSIS
9. MESOGRAPHE
10. DIPSOCHARES
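
The slides do not say how a recognizer arrives at its ranked hypotheses. As one illustrative possibility (a swapped-in technique, not the author's method), a noisy OCR reading can be matched against the dictionary by Levenshtein edit distance and the closest words returned. The function names and the tiny word list are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]

def n_best(observed: str, dictionary: list[str], n: int = 10) -> list[str]:
    """Return the n dictionary words closest to the observed string."""
    ranked = sorted(dictionary, key=lambda w: (edit_distance(observed, w), w))
    return ranked[:n]

# Toy dictionary; the observed string has a zero in place of the letter O.
words = ["MELISSOBLAPTES", "HETEROGRAPHIS", "TRICHOBAPTES", "MESOGRAPHE"]
hypotheses = n_best("MELISS0BLAPTES", words, n=3)
```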

Issues

• How general to make problem specs
  – Could set up separate problems for OCR and face recognition, or a single problem called ImageRecognition

• How does the software effort scale?

Software Scalability

• Suppose we have:
  – A algorithms implemented in L languages
  – D datasets
  – P problems
  – E algorithm evaluators

• How will our software effort scale with respect to these numbers?
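
One way to frame the question, offered as an assumption rather than anything stated in the slides: if every language needed a hand-written wrapper for every problem, the number of wrappers would grow as L × P, whereas a shared language-neutral format would need roughly one binding per language plus one schema per problem, i.e. L + P. The function names and the example numbers are illustrative only.

```python
def wrappers_pairwise(languages: int, problems: int) -> int:
    """One bespoke wrapper per (language, problem) pair."""
    return languages * problems

def wrappers_shared_format(languages: int, problems: int) -> int:
    """One binding per language plus one schema per problem."""
    return languages + problems

# Illustrative numbers only: 5 languages, 8 problems.
pairwise = wrappers_pairwise(5, 8)
shared = wrappers_shared_format(5, 8)
```

With these numbers the pairwise approach needs 40 wrappers and the shared format 13; the gap widens as either count grows.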

Scalability (contd.)

• Consider server and clients

• More effort at the server can mean less effort for clients

• For example, language specific interfaces and wrappers can be defined

• This makes participation in a particular language much less effort

• This could be done on demand

Summary

• Independent, automatic algorithm evaluation

• Makes sound scientific and economic sense

• Existing system works but has some limitations

• Future XML-based system will overcome these

• Then need to get people using this

• Future contests will help

• Industry support will benefit both academic research and commercial exploitation