Upload
daphne-whybrew
View
219
Download
1
Tags:
Embed Size (px)
Citation preview
Who killed Mr Blobby?
Mathematics and (partly) Computer Science are really all about one question: What can we say about objects which can be described in a finite manner?
Compression is usually done by trying to find regularities in the source: Finding concise rules which generate part of the
data bigger than the rule. Find common subparts and describe them once
Mr Blobby is innocent
Applications of compression are numerous: Smaller data for transmission and storage e.g. fit
a whole movie at high resolution on a single DVD. Less resources if the description we are
compressing is of a real life process: e.g. process chain optimisation.
Allow you to win the 2010 ICTSA Programming Challenge
Mr Blobby woz ’ere
Given: an image with a 3-bit depth (8 colours) a number n
To produce: not more than n blobs of paint approximating the
original image in less than one minute. using as little as possible paint
Some Interesting Notes
The problem of finding an optimal solution is hard: There’s an infinite search space (allowing fractions, and
blobs with centres outside the grid) But could be reduced to a finite one if we ignore the less
paint is better constraint.
Someone asked “is it NP-complete?” It is in NP (so certainly not worse than NPC), and seems to be in NPC, but am not sure.
A brute force approach is out of the question
The Teams: Industry-Based
1. CasaSoft: Karl Cassar, Mark Cassar2. Exigy Ltd: Christian Tabone, Clint Spiteri, Bernard Cachia,
Jean-Paul Sultana3. Ixaris: JP Ebejer, Jimmy Borg, Brian Vella, Brian Attard Bason4. Gigi tal-MITA: Daniel D'Agostino5. CCBill EU: Emerson Farrugia, Christopher Torpuland, Briun,
Wallace Wadge, Edward Mallia6. Holistic: Ruben Gatt, Matthew Falzon, Shaun Camilleri, Luke
Rocco7. Atlas Insurance: Karl Fenech
The Student Teams
8. Iš Vilniaus: Michael Debono, Andrew Borg Cardona9. papa charlie: Ian Agius, James Scicluna10. WM\{James}: Ingram Bondin, Kevin Falzon, Andrew Gauci11. daemon: Kurt Farrugia, Stefan Lia12. AJaZ: Andrew J Said, Julian Zammit13. C^D: Claire Dimech, Daniela Cauchi14. y.a.t.n.: Darren Demicoli 15. TDGS: Kyle Pullicino, Adrian Duca, Kurt Portelli, Paul Felice16. B3ANi3S: David Borg, Duncan Borg, Thorsten Callus
01: CasaSoft Implemented using C# (.Net Framework 3.5) Based on a genetic algorithm. Reduced search space by specifying a boundary in which blobs
can be placed and using bounded accuracy.
Each generation of the genetic algorithm results in the best found blob according to the population which gives the best score. The genetic algorithm continues if there is still time available.
The population size and the number of epochs per generation are calculated as a function of the bitmap size.
An optimization to automatically calculate the largest square that can fit in the circle, enabling easy flagging of pixels to be painted.
The fitness function does not to check for similar pixels, so as not to slow the speed of the algorithm.
02: Holistic
Implementation language C#
Used a basic Genetic Algorithm to approximate the result.
No fine tuning and optimizations were done on this algorithm.
Alas, all problems timed out
03: Ixaris Approach:
Based on genetic algorithms (where genes are blobs) Initial population based on each colour layer being considered as one blob (and 25
random mutants generated on this) Genetic Operators – split (gives us two blobs), swap blobs, forceful remove, remove (if
the blob has little effect), tweak radius, tweak centre position. Changing probability distribution for the above Multi-Threaded – reproduction of chromosomes done in parallel because of inherent
independence from each other Issues
Loads of magic variables (e.g. children count, max fit population, noise filter threshold, initial population, convergence limit, probability distributions for genetic operators)
Interesting note: how to find good approximations for these (in a few hours)? (Neural Networks, Approximate Bayesian Computing for parameter estimation seem too onerous)
As it very so often happens, last minute improvement fixes actually broke the solution (image reflected along x-axis – damn cartesian coordinate system)
Don’t laugh. (Fixed at 8:20pm – too late).
04: Gigi tal-MITA Implementation language: C
Using a floodfill approach, identify connected areas of the same colour. Count the pixels in each area, order them in descending order of number of pixels, and
calculate the centroid of each group. Fit circles into the image:
Start at the centroid with a circle of maximum radius for that area. Decrease the radius until it has a pretty damn high level of inclusion (defined by a
threshold variable) Inclusion is good if it includes that particular colour or even any other - as long as it's
not white (background colour) Blob the circle with the best fit
Repeat these steps, renumbering pixels which still have not been blobbed.
On Sunday morning I realised I did not have enough time to complete the algorithm as I would have wanted to. Realising that submitting an average solution is better than submitting nothing at all, I went for a compromise - I simply used a circle of average radius. Although this leaves room for a good deal if inaccurate pixels especially if the area is non-circular, overall it has given me an accuracy of 50-69%, which I guess is reasonable for an approximate solution.
05: Pr0nsters Inc. CCBill EU Our solution has the following main stages:
Edge detection Voronoi generation Clustering/filtering
Start by detecting colour edges. Next, using the edges, generate a Voronoi tessellation diagram to create a
number of polygons giving the locations where cells meet. These are transformed into circles. Clustering/filtering mechanism:
Take the most frequent colour and use that as our base blob to cover the image. Order the circles by radius Calculate circle overlaps - anything more than a fixed percentage gets dropped
(changing the threshold until the number of blobs is reduced as required). Stuff we didn’t manage to complete in time:
A better clustering mechanism by use of the DBScan algorithm to detect “noise” circles. This was implemented but was too late to be merged in the main program
07: Atlas InsuranceTechnologies: C#, .NET Framework 3.5, multithreading.
Heuristic driven using image resampling: Bitmap is downsampled, by a factor of 2, to a minimum resample area of 1×1. And also upsampled, by a factor of 3, to a maximum area of 2048×2048.
From the smallest downsample containing as-yet-unmatched pixels, randomly pick one unmatched pixel, and use it as the bounding square for a blob which would set it to the correct color.
Repeat the following until the largest upsample has been processed: Compute the respective similarity gains obtained from transforming the bounding
square by 1 pixel (only) in all directions, and pick the best transformation as the improved bounding square.
Multiply the bounding square’s dimensions and bottom-left coordinate by the stepwise resample factor (2 or 3) so as to translate it onto the corresponding pixels in the next (larger) sample.
Add the blob and go to the next unmatched pixel… Downsamples guarantee convergence of optimal bounding square for next blob
in O (log2 n) time. Upsamples help by achieving sub-pixel precision on the original bitmap.
08: Iš VilniausImplementation Language: C# Heuristics and Algorithms
Images larger than 100x100 pixels area were resized down to fit an area of 100x100. The image is split into single colour components using a floodfill algorithm. Components are considered largest first.
A modified maximum number of blobs is used in generation. For performance, a simpler similarity function to the original bitmap was employed that
only considered the number of absolute correct pixels. Method 1: To generate timer-safe blobs, the center of the rectangular area defined by
each component is used as the centre of a blob. Method 2: A Genetic algorithm
Chromosomes were configurations of a random number of components up to the limit, and the number of splits for each component.
Each split of the components became a blob and the Center of Gravity calculated from the area of the split sub-component. The radius was calculated from the width and height of the split and also from the difference to the center of gravity.
The similarity function is used as a fitness function. Termination upon timeout or convergance.
The best solution is used
09: Papa Charlie
Language of choice: Java The algorithm:
Divide the bitmap into components, each holding a colour area;
The similarity calculator was used to find the best result on a component;
Until the maximum number of blobs is reached, repeat the following procedure: Remove components with a small number of correct pixels Subdivide components with the most wrong pixels
Heuristics to cater for special cases were also created.
10: WM\{James} A suite of 3 algorithms, run separately, and the best result
chosen. Colour Based Overapproximation: For each colour (starting with
the most frequent one), the centroid of the extremal points of the points where the colour appears is calculated, and a blob drawn covering the whole area. Very fast but crude algorithm.
Cluster Based Overapproximation: Contiguous clusters of colours are ordered by size, largest first. The algorithm then assigns blobs of colour to each cluster until the blobs are exhausted, using the centroid approach. Increased cost to generate the clusters.
Cluster Based Overapproximation and Merging: Build upon previous by also merging clusters which have not been assigned a blob to an existing blob.
Result is also checked with a different blob ordering (biggest first).
There are also minor trigonometric optimisations.
11: daemon
“We used java to implement our blobify. The approach we took was a genetic algorithm.”
All the problems timed out
12: AJaZTackled the problem in C# on .NET Framework 3.5
If there are enough blobs, put a blob of radius 0.0001 (we liked the way such a blob can cover a pixel!) at the centre of each pixel.
If not, use a Genetic Algorithm Large images (>2000 pixels), are scaled down to around 1000 pixels. Places blobs randomly (with no colour) Colour is calculated to be the most common color of the visible part of the circle Mutation:
Blobs with low amount of correct pixels are removed Blobs that overlap on a blob of the same color are removed Other Mutations: Change position, Change Radius, Change Blob order chances for each one are variable and change depending on which produces the most
"better" mutations. (Sort of tailor making the mutation for every different bitmap) The last 5 seconds spend mutating the best solution Stopping condition: 57 seconds (excluding input and output)
13: C^D
The blobmap is divided into monochrome blobmaps. With each monochrome blobmap:
Extract a rectangle from this map such that all colored pixels are within it.
Create a blob with radius half the smallest side of the rectangle.
If half the pixels in the rectangle are coloured then use the blob
Otherwise, split the monochrome map and repeat. Different splitting mechanisms were used.
The blobs used are the biggest ones generated.
14: y.a.t.n. Implementation language: C
Randomly place blobs in areas of the image which use the same colour.
The best blob (in terms of correct pixels and paint used) is kept.
The remaining blobs are randomly refined by adding blobs in positions which enclose the same colour of the blob being put and again assessing the similarity.
Repeat until the algorithm converges or until 50 seconds have elapsed.
15: TDGS
Language used: Java The size of the input bitmap is increased to a square
with sides of length a power of two The bitmap is then split into multiple grids of various
sizes to determine the occurrence of colours at various areas of the image. In those areas where the occurrence of a colour is higher than a certain threshold, a blob is added to cover that area.
The grids in an outward spiral fashion (“just to be cool”)
16: B3ANi3S Tried to emulate the way a human would try to solve
the problem: Recognize the basic shapes in the picture by reading the
bitmap line by line, storing the coordinates of the start and end points of each line having the same colour, and eliminating errors through sampling.
From the coordinates collected earlier, calculate the optimal blob position and size which will fill all the shape with least overflow, and if the total number of blobs is exceeded, remove the smallest blobs.
Order the blobs by checking the perimeter of the shape in the original image and seeing which colour is best on top.
No working solution submitted
Student Teams
# Team Name Score
15 TDGS 1790
10 WM\{James} 1075
08 Iš Vilniaus 765
09 papa charlie 720
Student Teams
# Team Name Score
12 AJaZ 2685
15 TDGS 1790
10 WM\{James} 1075
08 Iš Vilniaus 765
09 papa charlie 720
Industry Teams
# Team Name Score
01 CasaSoft 1710
05 CCBill EU 1595
03 Ixaris 465
04 Gigi tal-MITA 455
Industry Teams
# Team Name Score
07 Atlas Insurance 3380
01 CasaSoft 1710
05 CCBill EU 1595
03 Ixaris 465
04 Gigi tal-MITA 455
Merging the Results
# Team Name Score
07 Atlas Insurance 3320
12 AJaZ 1260
01 CasaSoft 840
05 CCBill EU 830
15 TDGS 690
Prosit tal-Programmi! Hope to see you again next year… To check the evaluation problems were unchanged:
1. Take the encrypted file2. Uncompress using the following password:
Il-materjal li inbeżaq mill-impatti ikkrejaw krateri oħra sekondarji. Xi krateri għandhom kisja ta' l-eġekta (materjal) li jifraq għal mijiet ta' kilometri mill-punt tal-impatt.
3. Rename to .zip4. Uncompress using the following password:
Il-Qamar huwa l-unika satellita naturali tal-Art. Jorbita il-pjaneta tagħna minn 350,000 għal 400,000 km, jakkonpanja l-Art fit-tidwir madwar l-istilla Sol (Xemx).Ġieli jissejjaħ Luna (mill-Kelma Latina għal qamar), jew Selene.