80
Blobby Land Gordon J. Pace Chris Porter Christian Colombo March 2010

Blobby Land Gordon J. Pace Chris Porter Christian Colombo March 2010

Embed Size (px)

Citation preview

Blobby Land

Gordon J. Pace Chris Porter

Christian Colombo

March 2010

Who killed Mr Blobby?

Mathematics and (partly) Computer Science are really all about one question: What can we say about objects which can be described in a finite manner?

Compression is usually done by trying to find regularities in the source: Finding concise rules which generate part of the

data bigger than the rule. Find common subparts and describe them once

Mr Blobby is innocent

Applications of compression are numerous: Smaller data for transmission and storage e.g. fit

a whole movie at high resolution on a single DVD. Less resources if the description we are

compressing is of a real life process: e.g. process chain optimisation.

Allow you to win the 2010 ICTSA Programming Challenge

Mr Blobby woz ’ere

Given: an image with a 3-bit depth (8 colours) a number n

To produce: not more than n blobs of paint approximating the

original image in less than one minute. using as little as possible paint

From Mr Ducky…

… to Mr Blobby

… to Mr Blobby

In 16 blobs

… to Mr Blobby

… and no, I did not produce this

automatically

Some Interesting Notes

The problem of finding an optimal solution is hard: There’s an infinite search space (allowing fractions, and

blobs with centres outside the grid) But could be reduced to a finite one if we ignore the less

paint is better constraint.

Someone asked “is it NP-complete?” It is in NP (so certainly not worse than NPC), and seems to be in NPC, but am not sure.

A brute force approach is out of the question

Here’s an evaluation image (given with 5, 50, 100, 1000 blobs)

Here’s what one team produced (1000 blobs)

and another team (1000 blobs)

and yet another (1000 blobs?!)

The Teams

The Teams: Industry-Based

1. CasaSoft: Karl Cassar, Mark Cassar2. Exigy Ltd: Christian Tabone, Clint Spiteri, Bernard Cachia,

Jean-Paul Sultana3. Ixaris: JP Ebejer, Jimmy Borg, Brian Vella, Brian Attard Bason4. Gigi tal-MITA: Daniel D'Agostino5. CCBill EU: Emerson Farrugia, Christopher Torpuland, Briun,

Wallace Wadge, Edward Mallia6. Holistic: Ruben Gatt, Matthew Falzon, Shaun Camilleri, Luke

Rocco7. Atlas Insurance: Karl Fenech

The Student Teams

8. Iš Vilniaus: Michael Debono, Andrew Borg Cardona9. papa charlie: Ian Agius, James Scicluna10. WM\{James}: Ingram Bondin, Kevin Falzon, Andrew Gauci11. daemon: Kurt Farrugia, Stefan Lia12. AJaZ: Andrew J Said, Julian Zammit13. C^D: Claire Dimech, Daniela Cauchi14. y.a.t.n.: Darren Demicoli 15. TDGS: Kyle Pullicino, Adrian Duca, Kurt Portelli, Paul Felice16. B3ANi3S: David Borg, Duncan Borg, Thorsten Callus

The Teams and their Solutions

01: CasaSoft Implemented using C# (.Net Framework 3.5) Based on a genetic algorithm. Reduced search space by specifying a boundary in which blobs

can be placed and using bounded accuracy.

Each generation of the genetic algorithm results in the best found blob according to the population which gives the best score. The genetic algorithm continues if there is still time available.

The population size and the number of epochs per generation are calculated as a function of the bitmap size.

An optimization to automatically calculate the largest square that can fit in the circle, enabling easy flagging of pixels to be painted.

The fitness function does not to check for similar pixels, so as not to slow the speed of the algorithm.

An Evaluation Problem (50, 100, 1000 blobs)

And their solution (50 blobs)

02: Holistic

Implementation language C#

Used a basic Genetic Algorithm to approximate the result.

No fine tuning and optimizations were done on this algorithm.

Alas, all problems timed out

03: Ixaris Approach:

Based on genetic algorithms (where genes are blobs) Initial population based on each colour layer being considered as one blob (and 25

random mutants generated on this) Genetic Operators – split (gives us two blobs), swap blobs, forceful remove, remove (if

the blob has little effect), tweak radius, tweak centre position. Changing probability distribution for the above Multi-Threaded – reproduction of chromosomes done in parallel because of inherent

independence from each other Issues 

Loads of magic variables (e.g. children count, max fit population, noise filter threshold, initial population, convergence limit, probability distributions for genetic operators)

Interesting note: how to find good approximations for these (in a few hours)? (Neural Networks, Approximate Bayesian Computing for parameter estimation seem too onerous)

As it very so often happens, last minute improvement fixes actually broke the solution (image reflected along x-axis – damn cartesian coordinate system)

Don’t laugh.  (Fixed at 8:20pm – too late).

An Evaluation Problem (5, 50, 100 blobs)

And their solution (50 blobs)

04: Gigi tal-MITA Implementation language: C

Using a floodfill approach, identify connected areas of the same colour. Count the pixels in each area, order them in descending order of number of pixels, and

calculate the centroid of each group. Fit circles into the image:

Start at the centroid with a circle of maximum radius for that area. Decrease the radius until it has a pretty damn high level of inclusion (defined by a

threshold variable) Inclusion is good if it includes that particular colour or even any other - as long as it's

not white (background colour) Blob the circle with the best fit

Repeat these steps, renumbering pixels which still have not been blobbed.

On Sunday morning I realised I did not have enough time to complete the algorithm as I would have wanted to. Realising that submitting an average solution is better than submitting nothing at all, I went for a compromise - I simply used a circle of average radius. Although this leaves room for a good deal if inaccurate pixels especially if the area is non-circular, overall it has given me an accuracy of 50-69%, which I guess is reasonable for an approximate solution.

An Evaluation Problem (5, 50, 100 blobs)

Their Solution (100 blobs)

05: Pr0nsters Inc. CCBill EU Our solution has the following main stages:

Edge detection Voronoi generation Clustering/filtering

Start by detecting colour edges. Next, using the edges, generate a Voronoi tessellation diagram to create a

number of polygons giving the locations where cells meet. These are transformed into circles. Clustering/filtering mechanism:

Take the most frequent colour and use that as our base blob to cover the image. Order the circles by radius Calculate circle overlaps - anything more than a fixed percentage gets dropped

(changing the threshold until the number of blobs is reduced as required). Stuff we didn’t manage to complete in time:

A better clustering mechanism by use of the DBScan algorithm to detect “noise” circles.  This was implemented but was too late to be merged in the main program

An Evaluation Problem (5, 50, 100, 1000 blobs)

Their Solution (1000 blobs)

07: Atlas InsuranceTechnologies: C#, .NET Framework 3.5, multithreading.

Heuristic driven using image resampling: Bitmap is downsampled, by a factor of 2, to a minimum resample area of 1×1. And also upsampled, by a factor of 3, to a maximum area of 2048×2048.

From the smallest downsample containing as-yet-unmatched pixels, randomly pick one unmatched pixel, and use it as the bounding square for a blob which would set it to the correct color.

Repeat the following until the largest upsample has been processed: Compute the respective similarity gains obtained from transforming the bounding

square by 1 pixel (only) in all directions, and pick the best transformation as the improved bounding square.

Multiply the bounding square’s dimensions and bottom-left coordinate by the stepwise resample factor (2 or 3) so as to translate it onto the corresponding pixels in the next (larger) sample.

Add the blob and go to the next unmatched pixel… Downsamples guarantee convergence of optimal bounding square for next blob

in O (log2 n) time. Upsamples help by achieving sub-pixel precision on the original bitmap.

An Evaluation Problem (5, 50, 100, 1000 blobs)

Their Solution (1000 blobs)

08: Iš VilniausImplementation Language: C# Heuristics and Algorithms

Images larger than 100x100 pixels area were resized down to fit an area of 100x100. The image is split into single colour components using a floodfill algorithm. Components are considered largest first.

A modified maximum number of blobs is used in generation. For performance, a simpler similarity function to the original bitmap was employed that

only considered the number of absolute correct pixels. Method 1: To generate timer-safe blobs, the center of the rectangular area defined by

each component is used as the centre of a blob. Method 2: A Genetic algorithm

Chromosomes were configurations of a random number of components up to the limit, and the number of splits for each component.

Each split of the components became a blob and the Center of Gravity calculated from the area of the split sub-component. The radius was calculated from the width and height of the split and also from the difference to the center of gravity.

The similarity function is used as a fitness function. Termination upon timeout or convergance.

The best solution is used

Evaluation Problem (5, 20, 100 blobs)

Their Solution (5 blobs)

09: Papa Charlie

Language of choice: Java The algorithm:

Divide the bitmap into components, each holding a colour area;

The similarity calculator was used to find the best result on a component;

Until the maximum number of blobs is reached, repeat the following procedure: Remove components with a small number of correct pixels Subdivide components with the most wrong pixels

Heuristics to cater for special cases were also created.

Another Evaluation Problem (5, 50, 100 blobs)

Their solution with 100 blobs

10: WM\{James} A suite of 3 algorithms, run separately, and the best result

chosen. Colour Based Overapproximation: For each colour (starting with

the most frequent one), the centroid of the extremal points of the points where the colour appears is calculated, and a blob drawn covering the whole area. Very fast but crude algorithm.

Cluster Based Overapproximation: Contiguous clusters of colours are ordered by size, largest first. The algorithm then assigns blobs of colour to each cluster until the blobs are exhausted, using the centroid approach. Increased cost to generate the clusters.

Cluster Based Overapproximation and Merging: Build upon previous by also merging clusters which have not been assigned a blob to an existing blob.

Result is also checked with a different blob ordering (biggest first).

There are also minor trigonometric optimisations.

Another Evaluation Problem (with 5, 20, 100 blobs)

Their solution with 20 blobs

11: daemon

“We used java to implement our blobify.  The approach we took was a genetic algorithm.”

All the problems timed out

12: AJaZTackled the problem in C# on .NET Framework 3.5

If there are enough blobs, put a blob of radius 0.0001 (we liked the way such a blob can cover a pixel!) at the centre of each pixel.

If not, use a Genetic Algorithm Large images (>2000 pixels), are scaled down to around 1000 pixels. Places blobs randomly (with no colour) Colour is calculated to be the most common color of the visible part of the circle Mutation:

Blobs with low amount of correct pixels are removed Blobs that overlap on a blob of the same color are removed Other Mutations: Change position, Change Radius, Change Blob order chances for each one are variable and change depending on which produces the most

"better" mutations. (Sort of tailor making the mutation for every different bitmap) The last 5 seconds spend mutating the best solution Stopping condition: 57 seconds (excluding input and output)

Guess the original image (1000 blobs)

Che Guevara (5, 50, 100, 1000 blobs)

13: C^D

The blobmap is divided into monochrome blobmaps. With each monochrome blobmap:

Extract a rectangle from this map such that all colored pixels are within it.

Create a blob with radius half the smallest side of the rectangle.

If half the pixels in the rectangle are coloured then use the blob

Otherwise, split the monochrome map and repeat. Different splitting mechanisms were used.

The blobs used are the biggest ones generated.

The smiley faces ended up as…

this using 100 blobs

14: y.a.t.n. Implementation language: C

Randomly place blobs in areas of the image which use the same colour.

The best blob (in terms of correct pixels and paint used) is kept.

The remaining blobs are randomly refined by adding blobs in positions which enclose the same colour of the blob being put and again assessing the similarity.

Repeat until the algorithm converges or until 50 seconds have elapsed.

Guess who this is (1000 blobs)

5, 50, 100, 1000 blobs

15: TDGS

Language used: Java The size of the input bitmap is increased to a square

with sides of length a power of two The bitmap is then split into multiple grids of various

sizes to determine the occurrence of colours at various areas of the image. In those areas where the occurrence of a colour is higher than a certain threshold, a blob is added to cover that area.

The grids in an outward spiral fashion (“just to be cool”)

Another Evaluation Problem (5, 50, 100, 1000 blobs)

1000 blobs

16: B3ANi3S Tried to emulate the way a human would try to solve

the problem: Recognize the basic shapes in the picture by reading the

bitmap line by line, storing the coordinates of the start and end points of each line having the same colour, and eliminating errors through sampling.

From the coordinates collected earlier, calculate the optimal blob position and size which will fill all the shape with least overflow, and if the total number of blobs is exceeded, remove the smallest blobs.

Order the blobs by checking the perimeter of the shape in the original image and seeing which colour is best on top.

No working solution submitted

Some other impressive solutions

More impressive solutions

Some other impressive solutions

Some other impressive solutions

Some other impressive solutions

Some other impressive solutions

Some other impressive solutions

Some other impressive solutions

But who won?

Student Teams

# Team Name Score

Student Teams

# Team Name Score

09 papa charlie 720

Student Teams

# Team Name Score

08 Iš Vilniaus 765

09 papa charlie 720

Student Teams

# Team Name Score

10 WM\{James} 1075

08 Iš Vilniaus 765

09 papa charlie 720

Student Teams

# Team Name Score

15 TDGS 1790

10 WM\{James} 1075

08 Iš Vilniaus 765

09 papa charlie 720

Student Teams

# Team Name Score

12 AJaZ 2685

15 TDGS 1790

10 WM\{James} 1075

08 Iš Vilniaus 765

09 papa charlie 720

Industry Teams

# Team Name Score

Industry Teams

# Team Name Score

04 Gigi tal-MITA 455

Industry Teams

# Team Name Score

03 Ixaris 465

04 Gigi tal-MITA 455

Industry Teams

# Team Name Score

05 CCBill EU 1595

03 Ixaris 465

04 Gigi tal-MITA 455

Industry Teams

# Team Name Score

01 CasaSoft 1710

05 CCBill EU 1595

03 Ixaris 465

04 Gigi tal-MITA 455

Industry Teams

# Team Name Score

07 Atlas Insurance 3380

01 CasaSoft 1710

05 CCBill EU 1595

03 Ixaris 465

04 Gigi tal-MITA 455

Merging the Results

# Team Name Score

07 Atlas Insurance 3320

12 AJaZ 1260

01 CasaSoft 840

05 CCBill EU 830

15 TDGS 690

Prosit tal-Programmi! Hope to see you again next year… To check the evaluation problems were unchanged:

1. Take the encrypted file2. Uncompress using the following password:

Il-materjal li inbeżaq mill-impatti ikkrejaw krateri oħra sekondarji. Xi krateri għandhom kisja ta' l-eġekta (materjal) li jifraq għal mijiet ta' kilometri mill-punt tal-impatt.

3. Rename to .zip4. Uncompress using the following password:

Il-Qamar huwa l-unika satellita naturali tal-Art. Jorbita il-pjaneta tagħna minn 350,000 għal 400,000 km, jakkonpanja l-Art fit-tidwir madwar l-istilla Sol (Xemx).Ġieli jissejjaħ Luna (mill-Kelma Latina għal qamar), jew Selene.

Thanks to…