ReMorph - Report - 2005

Re-Morph Facial Warping and Morphing Software

3rd Year Group Project

Final Report

January 2006

Project Supervisor

Dr Daniel Rueckert

Group Members

Anish Mittal Jonathan Enav

Chris Roebuck Dharmesh Malam

Rikin Shah Ravi Madlani


Contents

1. Abstract 3

1.1. Introduction 3

1.2. Team Members 3

1.3. Project Supervisor 3

2. Background 4

2.1. Image Metamorphosis 4

3. Existing Solutions 6

3.1. Common Functionality 6

4. Specification Introduction 7

4.1. Minimum Specification 7

4.2. Extended Specification 8

4.3. Optional Specification 8

4.4. Choice of Programming Language 8

5. GUI 9

5.1. Introduction 9

5.2. Analysis 9

5.3. Implementation 9

5.3.1. Class Listing 16

5.3.2. Screen Shots of GUI 18

5.3.3. Feature List 30

5.4. Testing 32

5.5. Evaluation 35

6. Core 38

6.1. Introduction 38

6.2. Analysis 38

6.3. Design 38

6.4. Implementation 43

6.4.1. Class Listing (Core) 47

6.4.2. Class Listing (Morph) 48

6.5. Evaluation 49

7. Core Algorithms 50

7.1. Introduction 50

7.2. Field Morphing (Beier-Neely Algorithm) 53

7.2.1. Background 53

7.2.2. Analysis 56


7.2.3. Design and Implementation 56

7.2.4. Testing 59

7.2.5. Evaluation 63

7.3. Mesh Warping 64

7.3.1. Background and Analysis 65

7.3.2. Design 66

7.3.3. Implementation 68

7.3.4. Testing 72

7.3.5. Evaluation 74

8. Performance Analysis 75

9. Project Summary 83

9.1. Dividing the Project 83

9.2. Continuous Integration 83

9.3. Module Integration 84

10. Bibliography 88

Appendixes 89

A. Organisational tools and Methods 89

B. Meetings and Pair-Programming Logbooks 91


1. Abstract

Facial warping and morphing

Warping is the process of changing the spatial configuration of one image to match

that of another whilst maintaining its colouring. Control points are used to mark out the

spatial landmarks, which a warping algorithm uses as a reference when manipulating

the image pixels. Morphing is a technique which combines warping with cross-

dissolution to create a smooth transition from one image to another. In this document

we outline our approach to solving the problem of implementing a tool to apply these

principles in morphing faces.

1.1 Introduction

The goal of this project was to create a program that would perform facial morphing.

Our implementation has focused on providing a tool which could be used as a

teaching aid in learning the different morphing techniques. We have chosen to

implement two different algorithms, one based on Field Morphing and the other on

the Mesh technique. These were chosen because of their different approaches to

morphing, one using lines and the other points, as well as their contrasting

performance and quality levels. In terms of the software engineering aspect, we have

focused on making the application fully extensible, providing a plug-in architecture

whereby it can easily be integrated with third party algorithms.

1.2 Team Members

Jonathan Enav je203

Anish Mittal (Leader) akm103

Ravi Madlani rdm03

Dharmesh Malam (Secretary) dm203

Christopher Roebuck cjr03

Rikin Shah rs303

1.3 Project Supervisor

Dr Daniel Rueckert


2. Background

Re-Morph is a program which allows users to morph between multiple images using

various algorithms and get a visual representation of the processes required to do so.

In this section we will give a brief account of facial ‘warping’ and ‘morphing’ which

will aid the reader in understanding the core of our project.

2.1 Image Metamorphosis

Metamorphosis is the fluid transformation from one image to another through facial

‘warping’ and ‘morphing’. In this project when we mention ‘morphing’ we will be

referring to what we defined as ‘metamorphosis’.

Cross Dissolve

Before warping, the most common method of transition from one image to another

was the cross-dissolve. A cross-dissolve works by interpolating the colour of each

pixel over time from that of the source image to that of the destination image. The

weights of the start and end pixels depend on how far the transition has progressed.

An example of a cross-dissolve is given below.

Figure [N] Example of Cross Dissolve [vc98]
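The per-pixel interpolation described above can be sketched in a few lines. This is an illustrative example only, written in Python for brevity rather than taken from the project's code; the function name cross_dissolve, the nested-list image representation and the parameter t (0 at the start of the transition, 1 at the end) are assumptions made for the sketch.

```python
def cross_dissolve(source, destination, t):
    """Blend two equally sized RGB images.

    Images are lists of rows, each row a list of (r, g, b) tuples.
    t = 0 returns the source colours, t = 1 the destination colours.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(source) != len(destination) or any(
        len(a) != len(b) for a, b in zip(source, destination)
    ):
        raise ValueError("images must have the same dimensions")
    return [
        [
            # Weighted average of each colour channel.
            tuple(round((1.0 - t) * s + t * d) for s, d in zip(src_px, dst_px))
            for src_px, dst_px in zip(src_row, dst_row)
        ]
        for src_row, dst_row in zip(source, destination)
    ]


# A 1x2 black image dissolving into a 1x2 white image.
black = [[(0, 0, 0), (0, 0, 0)]]
white = [[(255, 255, 255), (255, 255, 255)]]
early_frame = cross_dissolve(black, white, 0.2)
```

Calling the function with a sequence of increasing t values yields the frames of the dissolve.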

Warping

Warping an image causes it to change its spatial configuration whilst maintaining its

colouring. In our project we have implemented two types of warping: Field Warping

and Mesh Warping. Field Warping uses corresponding lines in both the source and

destination images to generate the warp. Mesh Warping uses corresponding points in

the images. Both types of warping will be discussed in further detail in the

Algorithms section. An example of a warp is shown below.


Morphing

Morphing is the process of one image fading into another. It does not necessarily use

warping to achieve this but doing so yields much more realistic effects. Below is an

example of a morph sequence turning a human into a cat.


3. Existing Solutions

A great number of morphing solutions are currently available. In this section we

detail the features common to all of them. Our solution aims to address the weaknesses of

existing solutions and extend the existing feature set.

3.1 Common Functionality

All the solutions currently available require the user to follow more or less the same

process for creating a morphing sequence.

Process

Control Points:

Control Points are used to mark corresponding points on the two images e.g. match

the ears on the source image to the ears on the destination image. Control Points are

either points or lines depending on the warping algorithm being used.

Our Solution

The unique feature of our solution is the breakdown of a morph which is available to the

user. The user can see all the images which were used to generate the morphing

sequence. We have also developed a plug-in architecture which will allow the

development of more algorithms. The full feature-list is available in the Feature List

section.

Figure: process flow (Select Start Image -> Select End Image -> Draw corresponding control points on each picture -> Generating Morphing Sequence)

4. Specification Introduction

The following section describes the key deliverables necessary for the application to

satisfy our brief.

4.1 Minimum Specification

In order for the application to be complete in its functionality, it must as a minimum

conform to the specification detailed below.

The application will be split into three tangible sections, the GUI, a Core and the

Algorithms. This approach should enable us to split the work most effectively, and as

each section is independent should aid debugging and future extensibility of the

application. The key features of each section are described below.

GUI

The GUI is the interface between the user and the backend processes, and hence

should be both interactive and intuitive in use. It will incorporate the following

features as a means to achieve this goal:

• A wizard to guide the user through the warping-morphing process

• Loading of images either by browsing user-files or through drag-drop support

• Selection of control points for the morph/warp, automatically providing the right

‘control point tool’ depending on the morphing algorithm selected

• Automatic resizing of images

• Provision of an interface to set user-preferences and parameters

• Integration of a movie player to show the generated morph/warp clips

• Provision of a ‘slider’ control to enable viewing at the intermediate stages of the

metamorphosis

Core

The Core is the interface between the user-oriented front-end and the backend

algorithms. It will control the following features:

• Provide a standard interface to enable communication between the front and back

ends, maximizing flexibility, e.g. the plug-in can specify the number of parameters

required by the algorithm to the GUI dynamically through the core

• File management services to create/edit/save user preferences and files to a

‘project’ which can be reloaded

• Provision of a plug-in architecture for loading algorithms

• Generation of a movie from the series of images returned by the algorithms
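To make the idea of a plug-in describing its own parameters more concrete, the following sketch shows one possible shape for such a contract. It is written in Python purely for illustration and is not the project's interface; every name in it (ParameterSpec, MorphPlugin, PluginRegistry, DemoLinePlugin, the "falloff" parameter) is hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterSpec:
    """One tunable value that a plug-in asks the GUI to expose (hypothetical)."""
    name: str
    default: float
    minimum: float
    maximum: float


class MorphPlugin(ABC):
    """Hypothetical contract between the core and an algorithm plug-in."""

    name: str
    control_point_kind: str  # "point" or "line"; tells the GUI which tool to offer

    @abstractmethod
    def parameters(self):
        """Return the list of ParameterSpec objects this algorithm accepts."""

    @abstractmethod
    def morph(self, images, control_points, frames, settings):
        """Return the generated frames for the given inputs."""


class PluginRegistry:
    """Core-side lookup that the GUI can query to build its menus."""

    def __init__(self):
        self._plugins = {}

    def register(self, plugin):
        self._plugins[plugin.name] = plugin

    def names(self):
        return sorted(self._plugins)

    def describe(self, name):
        return self._plugins[name].parameters()


class DemoLinePlugin(MorphPlugin):
    name = "Demo line warp"
    control_point_kind = "line"

    def parameters(self):
        return [ParameterSpec("falloff", 1.0, 0.0, 2.0)]

    def morph(self, images, control_points, frames, settings):
        return list(images)  # placeholder: a real plug-in would generate frames


registry = PluginRegistry()
registry.register(DemoLinePlugin())
```

With a contract of this kind the GUI never needs to know a specific algorithm: it asks the registry for the available names and for each plug-in's parameter list, and builds its controls from the answers.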

Algorithms

The algorithmic component of the application is where the warping/morphing of

the images actually takes place. The specification is as follows:

• A set of code libraries which can be ‘plugged’ into the core to perform the desired

functions

• The algorithms will be as self-contained as possible, with common functions

being grouped into shared code libraries to minimize duplication

• Morphing will be provided through two different warping algorithms to show

their relative effectiveness

• A cross-dissolving algorithm

4.2 Extended Specification

To increase the marketability of the product the following features would be suitable

additions, and should be achievable in the time available.

• Allow multiple face metamorphosis

• A teaching mode

• Output the stages of the metamorphosis individually to describe the intermediate

stages of the process

• Integration with a web-cam to enable instant image capture and metamorphosis

4.3 Optional Specification

These features would enhance the overall usability of the system, but are by no means

necessary and may not be achievable in the allotted project lifecycle.

• Real-time warping of an image

• Automatic detection of facial features and auto-cropping of the face

• Enhance application usability through features such as context-sensitive help

• A more intuitive ‘control point tool’ – such as a lasso to increase the accuracy of

the metamorphosis

• Implement more algorithms to illustrate their effectiveness when used as a teaching

tool

4.4 Choice of Programming Language

For our project we decided to develop using C#. This was after careful consideration of

implementing in Java. The primary reason for using C# is the GUI-intensive nature of

this project. C# has a very rich API full of features which are very useful for our

project, including image manipulation and graphics features.

As a group we decided we needed a rich language with great GUI tools and an IDE to

help us with rapid development.


5. The GUI

5.1 Introduction

In this section we describe the full development cycle of the GUI (Graphical User

Interface). The GUI is the side of the program which the User interacts with in order

to use the features and perform tasks.

In order to measure the usability of the application we shall use Nielsen’s 10 Usability

Heuristics. These are:

1) Visibility of System Status

2) Match between System and Real World

3) User Control and Freedom

4) Consistency and Standards

5) Error Prevention

6) Recognition rather than Recall

7) Flexibility and efficiency of use

8) Aesthetic and minimalist design

9) Help users recognise, diagnose and recover from errors.

10) Help and Documentation

If we find that our GUI satisfies all of Nielsen’s 10 Usability Heuristics then we have

sufficient evidence that a usable interface has been developed.

5.2 Analysis

The GUI should provide an easy to use, intuitive interface for the user. From the

specifications it was agreed that the GUI should provide the following functionality:

• Load multiple images into the application for morphing. This should be possible

through Drag-and-Drop and also a File Chooser dialogue box.

• Manipulate images to enable morphing to work

• Add, Select and Manipulate Control Points on all images in a Morph Sequence.

• Provide a wizard which walks the user through a morphing project.

• Display all intermediary frames required to generate the morph for use in a

‘Teaching Mode’.

• Provide a Movie Player to view the animated morph sequence.

5.3 Design

Figure: Main Form Design (labels: Re-Morph, Menu Bar, Control Toolbar, Image 1, Image 2, Image .., Status Bar)

Figure: Result Form Design (labels: Re-Morph, Menu Bar, Control Toolbar, Forward, Backward, Cross-Fade, Status Bar)

Figure: Movie Player Design (labels: Re-Morph, Movie Display box, Slider Control)

Figure: Image Resize form (labels: Re-Morph, Image 1, Image 2)

Above is the design of the 4 main forms. These forms contain all of the functionality

that is required by the specification. Below is a breakdown of the forms and their use.

The main form

This form is the first form that the user will be presented with. It will allow the user to

select a morph from a menu item that will be dynamically created based on the plug-

ins that are available (a full listing of the menu items can be seen below).

Once the user has selected a type of morph, based on the plug-in that was chosen, a

picture box will then appear on the form. At this point the user has two options for

loading an image: firstly they can simply drag and drop an image on to the picture

box, or secondly they can click on an ‘add image’ button and a file browser will appear

allowing the user to navigate to the image's location.

Now that a start image has been loaded the user can add more images depending on

the number of faces they wish to warp between. We will also allow the user to choose

the number of frames they would like to have between each two adjacent images for that

part of the warp.
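As a rough illustration of how a per-pair frame count could translate into a sequence, the sketch below computes evenly spaced interpolation weights and the resulting sequence length. It is written in Python for brevity and rests on two assumptions that the report does not state: that in-between frames are evenly spaced, and that the original images are themselves included in the output. The function names are hypothetical.

```python
def frame_weights(frames_between):
    """Interpolation weights for the in-between frames of one image pair.

    The endpoints (weight 0 and weight 1) are the original images and are
    excluded; evenly spaced weights are an assumption of this sketch.
    """
    return [k / (frames_between + 1) for k in range(1, frames_between + 1)]


def total_frames(frames_between_pairs):
    """Length of the whole sequence, given the frame count chosen for each
    adjacent pair of images (assumes the original images are included)."""
    image_count = len(frames_between_pairs) + 1
    return image_count + sum(frames_between_pairs)


weights = frame_weights(3)       # three frames between two images
length = total_frames([5, 5])    # three images, five frames per pair
```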

The user now has to place the anchor markers on the faces to facilitate the morph.

These will be different depending on the plug-in chosen. We propose to allow the user

to add anchors in several different ways, such as right clicking the image, holding down

shift and clicking, or clicking a button.

Once a user has placed all of the anchors in the correct place they are now ready to

proceed with the morph.

When the user clicks the morph button we may show a preview of the

morph occurring in real time, but this depends on how fast the morph runs.

Once the morph has finished the next form will pop up.

Result Form

This form will display the information required for the teaching mode as described in

our specification. Firstly the forward warp will be displayed. This means that the

deformation of the leftmost image through all of the points of the other images to the

last will be shown. Secondly the reverse of this will be shown. Finally the

combination of the two images at each index will be shown; this image will be the

output of the warp.

This Form will allow the user to overlay the points on the morph for each image to

see how the morph has progressed, and will also allow the user to proceed to the next

form to view the video.

Movie Player

This form will allow the user to play a video representation of the warp. It should also

allow the user to save the video to disk and encode it with any codec that is installed

on the host PC.

Image Resize

This form was a late addition, made after we noticed that morphing between images of

differing sizes was not desirable. We decided to alert the user when two

images differ in size as the second is imported into the program. The Image Resize

form will open and allow the user to choose to resize an image to match the other

either by stretching, adding blank space to the image to keep the aspect ratio, or

cropping.
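The three resizing options differ only in how the scale factor is chosen. The sketch below, in Python for brevity, computes the scaled size and placement for each option; it is not the project's code, and the function name fit_geometry, the mode names and the choice to centre the image when padding or cropping are assumptions made for illustration.

```python
def fit_geometry(src_size, target_size, mode):
    """Return (scaled_size, offset) for fitting an image into a target canvas.

    offset is where the scaled image's top-left corner sits on the canvas;
    a negative offset means that part of the image is cropped away.
    Centring is an assumption of this sketch.
    """
    src_w, src_h = src_size
    tgt_w, tgt_h = target_size
    if mode == "stretch":
        # Ignore the aspect ratio and fill the canvas exactly.
        return (tgt_w, tgt_h), (0, 0)
    if mode == "pad":
        # Largest scale at which the whole image still fits.
        scale = min(tgt_w / src_w, tgt_h / src_h)
    elif mode == "crop":
        # Smallest scale at which the image covers the whole canvas.
        scale = max(tgt_w / src_w, tgt_h / src_h)
    else:
        raise ValueError("mode must be 'stretch', 'pad' or 'crop'")
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    return (new_w, new_h), ((tgt_w - new_w) // 2, (tgt_h - new_h) // 2)


# Fitting a 400x200 image to a 200x200 target.
padded = fit_geometry((400, 200), (200, 200), "pad")
cropped = fit_geometry((400, 200), (200, 200), "crop")
```

In the padded case the image shrinks to 200x100 with blank bands above and below; in the cropped case it keeps its size and loses 100 pixels from each side.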

Menu System

During the development of the application the menu structure changed; the final

structure is described below.

File
    New
        Beier-Neely
        Mesh
    Open
    Save
    Save As
    Close
    Exit
Edit
    Undo
    Redo
    Parameters
Anchors
    Add
    Remove
    Select All
    Clear All
    Colour –
    Style –
    Size –
    View Anchors
Images
    Zoom --
    Reset Size
    Increase
    Decrease
    Auto Horizontal
    Auto Vertical
Help
    About
    Manual

Most of these items are self explanatory and intuitive for a user of Microsoft products

to use.

File
    New
        Beier-Neely
        Mesh

The above is created dynamically based on the plug-ins that are available.

5.4 Implementation

The GUI was implemented using the C# form designer in Visual Studio. We made

use of both forms and Windows GUI controls in our implementation to allow the reuse

of code and flexibility of the design.

Below is a diagrammatic representation of the structure of the program.

Figure: structure of the GUI (classes shown: GuiController, MainForm, CtlImageStrip, CtlImageLine, CtlImagePoint, CtlAnchorPoint, PictureBox, Image, clsResult, Result, ctlLine, ImageEditing, VideoForm; the diagram links onward to the Core)

Web Cam

One of our requirements was to allow the user to add an image by capturing it

through a web cam. Most of the code to do this is integrated into DirectShow.

DirectShow acts as a wrapper for some C++ code that is written to handle web cam

usage. We have a form with a picture box on it and we allow the user to select an

option to start the capture of an image. This image is then overlaid onto the screen on

top of the picture box. The user is then given 5 seconds before an image is taken. This

image is then displayed in the picturebox, so the user can use it as part of the

morph.

5.4.1 Class listing

Below is a list of the classes and a brief description of their functionality.

GuiController

As described in the section ‘Our Solution’ the interfaces between the model and view

act as the controls, and this is such a class. This class acts as the interface between the

GUI and the Core and as such passes information around to allow the morph to

happen. Any change to the underlying model made from the GUI will pass through

this class to the core. For this reason we have made this class a singleton.
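The singleton arrangement described above means that every part of the GUI talks to the same controller object. A minimal sketch of the pattern follows, in Python for brevity; the class name is taken from the report, but the body, the model_changes list and the apply_change method are hypothetical and stand in for whatever the real class forwards to the core.

```python
class GuiController:
    """Single shared controller; every construction returns the same object."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # Hypothetical state: a log of changes forwarded to the core.
            cls._instance.model_changes = []
        return cls._instance

    def apply_change(self, description):
        """Record a model change requested by some part of the GUI."""
        self.model_changes.append(description)


from_main_form = GuiController()
from_image_strip = GuiController()
from_main_form.apply_change("anchor added")
```

Because both names refer to one object, a change made through the main form is immediately visible to the image strip, which is the property the design relies on.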

Actions such as opening and saving images and videos and loading saved projects are

conducted by this class. If the user wishes to open an existing project this class will

create the model and pass that to ctlImageStrip which will create the view using a

static method to decide which type of morph, i.e. a line or point morph the model

represents.

GuiController is responsible for serialisation; more about that can be read in the

Core section under the Model.

MainForm

This form contains all of the menus that get displayed, and the code to call the correct

code elsewhere to execute the desired action. This form makes extensive use of the

GuiController.

CtlImageStrip

This class is the parent class of both CtlImageLine and CtlImagePoint, and as such

contains all of the common functionality. It allows the user to drag and drop an image

into a picture box, add, select and delete multiple anchors (the points on the images that

facilitate the morph) and do any form of image editing that is required. All of these

features are listed in the Feature List.

This class is also where changes to the model are initiated. To keep the model and

the GUI in step, we use redraw functions: the user updates the view, the model is then

updated, and finally the view is cleared and redrawn. We took this approach for

correctness, and it does not lead to a drop off in performance.

This class also contains the context menus which are displayed when the user right

clicks an area on the control.

ctlResult

This class contains the result of an executed morph. This means that it has to display 3

sets of images. To achieve this we made use of CtlImageStrip and so this form

contains 3 of these, one for the forward morph, one for the reverse morph and then


one for the morph with the cross dissolve between the two other sets of images. This

control sits in result.cs.

Result

This form contains the ctlResult, and allows the user to manipulate what they see and

to zoom in and out.

ctlImagePoints

This child GUI control sits on the MainForm form and contains a picture box for each

image that will form the morph. This class contains all of the code specific to points,

e.g. the code to delete multiple points.

ctlImageLines

This child GUI control sits on the MainForm form and contains a picture box for each

image that will form the morph. This class contains all of the code specific to lines, e.g.

the code to delete multiple lines.

The way that the view links into the model is that the view has a picture box with a

list of anchors. These anchors are grouped into families, i.e. when an anchor is added

to a picture one is added to all of the other pictures, and these anchors are then part of the

same family. This family number is then the index in the collection in the arraylist in

the model.
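The family idea can be sketched as a small data structure in which the family number is simply a shared list index. The example is in Python for brevity and is not the project's model; the class AnchorModel and its method names are hypothetical, and the default position of (0, 0) is chosen for the sketch.

```python
class AnchorModel:
    """Anchors for every image, kept in step by a shared family index."""

    def __init__(self, image_count):
        if image_count < 1:
            raise ValueError("at least one image is required")
        # anchors[i] holds the (x, y) positions on image i;
        # position k in every list belongs to family k.
        self.anchors = [[] for _ in range(image_count)]

    def add_family(self, position=(0, 0)):
        """Add one anchor to every image and return the new family index."""
        for per_image in self.anchors:
            per_image.append(position)
        return len(self.anchors[0]) - 1

    def move(self, image_index, family, position):
        """Reposition one family member on a single image."""
        self.anchors[image_index][family] = position

    def remove_family(self, family):
        """Delete the anchor from every image so the counts stay equal."""
        for per_image in self.anchors:
            del per_image[family]

    def family(self, family):
        """Positions of one family across all images, in image order."""
        return [per_image[family] for per_image in self.anchors]


model = AnchorModel(image_count=3)
eyes = model.add_family()
mouth = model.add_family()
model.move(1, eyes, (10, 20))
```

Note that in this sketch removing a family shifts the indices of all later families down by one, so any view holding family numbers would need to be refreshed afterwards.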

ctlAnchorPoint

This class represents an anchor point, which is one of the anchor types that the user can

use, which depends on the algorithm that the user chooses to use in the morph. An

anchor knows which picture that it is on, what family it belongs to and has a reference

to the strip that it is on. All of this information is required due to the nature of deleting

an anchor and the fact that it is required that the corresponding anchor is deleted from

all other images, leaving each image with the same number of anchor points.

The anchor markers are each a single character from a system font available on all

Windows machines (Wingdings). This is a vector font containing symbols such as crosshairs,

triangles and diamonds, and these scale to any size easily. This class contains the

information about the font and colour of the anchor; this is required as we allow the

user to change the appearance of the anchors. There are 2 colouring algorithms that

colour the anchors according to their families, making them distinguishable.
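The report does not say how the two colouring algorithms work, so the following is only one plausible way to give each family a distinguishable colour: spacing hues evenly around the colour wheel. It is a Python sketch under that assumption, and the function name family_colour is hypothetical.

```python
import colorsys


def family_colour(family, family_count):
    """Return an (r, g, b) colour for a family by spacing hues evenly.

    This is an illustrative scheme, not one of the project's algorithms.
    """
    hue = family / max(family_count, 1)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))


# Colours for four anchor families.
colours = [family_colour(k, 4) for k in range(4)]
```

A scheme like this keeps neighbouring families visually distinct when only a handful of anchors are present; with many families the hues become close together, which may be one reason to offer more than one algorithm.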

clsViewLine

For lines the situation is more complex. The approach we decided on was to create two

ctlAnchorPoints and then to instantiate a class clsViewLine which represents the line.

With this representation we use GDI to draw the line. This class is also aware of its

location, as the ctlAnchorPoint is.

AboutForm


This has some information about the program.

CustomMenuItem

This class is used for anchor settings and for giving the user a dynamic list of the

plug-ins.

GetParams

This class is required to hold the information about the parameters that the algorithm

can take as input so that the user can change them as desired.

globalGUIConsts

This class contains the values of variables such as maximum sizes.

ImageEditing

This class contains the code to allow the user to edit the sizes of images. This is

required as images that are different sizes cannot be used in a morph.

VideoForm

This is the form that plays the video that is created as a result of the morph. It contains

play and pause functionality and a slider to allow the user to view any

part of the video. The stream generation is conducted in the core, and so the library we

used is described in that section.

5.4.2 Screen shots of the GUI

In this section we describe the GUI design and some of the features available for

the user to use. Below is the icon that the user double clicks to run our application.


Above is the splash screen that first comes up once the user opens our application. We

designed this to grab the user's attention.

Above is the wizard that guides the user through either opening a morph or creating a

new one. The list of the algorithms is created dynamically based on the implemented

plug-ins that are in the plug-ins folder.

User clicks open ->


If the user clicks open they will be presented with a file browser to locate any saved

projects. They can choose one and open it like the example below.


This has loaded up the images that the user has chosen previously, with all of the

anchors and preferences set from the point that the user clicked save. From this point

the user is free to morph.

User clicks New Morph ->

If the user clicks new morph they will be presented with this window to allow them to

import the images and add anchors. Below is the toolbar.

• The first icon allows the user to create a new morph. The list that hangs off this

icon is dynamically created, as the list of the algorithms on the wizard was.

• Clicking on the second icon will open up the file browser seen before.

• The third icon allows the user to save their project to disk. This is done through

XML serialisation which is explained later on.

• The fourth icon is the add anchor icon. This button will add an anchor to each

image that is in this window.

• The fifth icon is the add image icon and it will add another picture box to the

form; the user may add as many images as they see fit.

• The next 4 icons are resizing icons which resize the appearance of the images

without changing the underlying image.

Figure: annotated screenshot of the main window (labels: CtlImageStrip, MainForm, PictureBox, Toolbar, Menus, StatusBar)

All of these icons have tool tips to help describe what they do, and at the bottom of

the form there is a status bar which keeps the user informed of any information we

deem necessary, based on where their mouse pointer is on the form.

Below are some menu items of interest for the anchors:

What sets our application apart from many others is the extent to which it will allow

the user to change the appearance of the program. This serves two purposes. Firstly, it

allows the user to set the sizes and font of the anchors to help them best see the

anchors and so place them in the optimal positions. This was lacking

from many of the existing applications, where anchors became lost when trying to

produce very detailed warps.

From this new project state the user can get to a point where they are ready to morph either by

browsing to an image or by dragging and dropping an image on to the picture box.

The user can choose as many images as they desire. Below is an example of this state.


On this form we now have a new box which indicates the number of frames the user

wants the morph to output between the two images. The default is 5 as seen

above. The user would now click the morph button on the toolbar.

Figure: morph progress form (labels: pictureBox, progressBar, button)

The above form will then pop up. It displays the morph as it occurs in real-time and a

progress bar of the progress through the morph. It also displays the time elapsed and

time remaining for the morph, which is extrapolated based on the time elapsed and

percentage completed. The video stream is displayed on the picturebox: a call back

function from a worker thread passes back the percentage completed and the frame

outputted from the algorithm. More on threading later. After the morph finishes this

window disappears.
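The extrapolation of the remaining time from the elapsed time and the percentage completed amounts to a one-line proportion. A minimal Python sketch follows; the function name estimate_remaining and the choice to return None before any progress has been reported are assumptions of the example, not details taken from the project.

```python
def estimate_remaining(elapsed_seconds, fraction_done):
    """Extrapolate the time remaining from progress so far.

    If fraction_done of the work took elapsed_seconds, the whole job is
    assumed to take elapsed_seconds / fraction_done at the same rate.
    Returns None when no progress has been reported yet.
    """
    if not 0.0 < fraction_done <= 1.0:
        return None
    estimated_total = elapsed_seconds / fraction_done
    return estimated_total - elapsed_seconds


# A quarter of the morph has taken ten seconds.
remaining = estimate_remaining(10.0, 0.25)
```

The estimate assumes every frame takes roughly the same time to produce, so it will drift if later frames are more expensive than earlier ones.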

The user is then prompted with the video screen as seen below.

This form allows the user to view a video stream of the morph that was just produced.

The video area is a picturebox and we use a library avi-previewer which draws on top

of it to display the video. Avi previewer is described in a later section.

The track bar is moved when the video is played from within avi-previewer, and

conversely the slider can be moved to change the frame that is in the picturebox.

The video can also be sped up or slowed down by altering the frames/Sec box; the

code to do this is also within avi-previewer.

The two buttons on the bottom right of the form are the ones of interest. They are

from the top, export video and teaching mode.

User clicks export video ->


They are required to give a name and location.

We also allow the user to compress the video, as videos can become quite large in their

raw state. This form dynamically creates the list of all of the codecs

that are installed on the machine and allows the user to choose one; the rest is

handled by Windows components.

User clicks teaching mode ->


This form contains 3 image strips of the images that go into making a morph. This

allows the user to see all of the stages of the morph, i.e. the forward warp, backwards

warp and the cross fade. The toolbar is shown below.

The first four icons allow the user to resize the images.

The next 3 allow them to select which image strip or combination of image strips

they can see.

The eighth icon allows the user to view the anchors on the image; this then shows the

user exactly how the warp has happened as seen below.


The last item is a printout of the time that the morph took. This was initially just used

in our performance analysis, but we decided that it would be good information for the

user to have. The time is generated in GuiController.

The way that this form works is that the algorithm passes out each frame and its

anchors after each iteration. This is then held in a clsResult object, which is in the

model. This object contains 3 clsTransforms, which are the model for the

ctlImageStrips. When this object reaches the GUI it looks like the above screenshot, and

holds all of the information needed to act as a teaching aid. All of these classes are described

in their respective sections; this should act as a brief description of what is happening

behind the scenes. The lines are just drawn on to the pictureboxes using GDI.

The last form to describe is the image editing form.


This form is loaded when the user imports an image that is not compatible with an

existing one. This will usually be due to the size of the image not being the same, and

so this form allows the user to preview the two images and see how altering the

different sizes affects the images so an informed choice of the size of all images can

be made.

Annotation

When clicking the add anchor button an anchor will appear in the top left corner of

each image. Dependent on the type of morph, this anchor could be a line or a dot.

The user can also position anchors with more accuracy by holding down shift as well

as clicking in the desired area where they want the line to appear.

The anchor should correspond to the same feature in both images, for example if there

is an anchor positioned over a mouth on the first image, then the corresponding

anchor should be positioned over the mouth on the second image and every other

image in the proposed morph.


To help users identify corresponding anchors, when the user

hovers their mouse over a particular anchor, each corresponding anchor will become

highlighted in every other image.

For increased accessibility and usability, each anchor can be modified in terms of its

size, shape and colour, all from the Anchors toolbar menu. This enables greater

accuracy when placing anchors.

Lines

default line anchor

When clicking the add anchor button from within the Beier-Neely algorithm, a line

pops up on the screen as shown below:

The line can now be positioned on a facial landmark on each image. This is done by

dragging either end of the line to position it in the correct place.

It is important that the line points in the same direction in each image, otherwise this

would lead to an undesirable morph. For this reason the line has a square end and a

circle end, so the user can always tell which way round the line is.


Dots

default dot anchor

When clicking the add anchor button from within the Mesh algorithm, a dot pops up

on the screen as shown below:

The dot can now be positioned on a facial landmark on each image. This is done by

selecting the dot and dragging the dot to the desired position on the image.

Please refer to the manual for further explanation of the GUI.

5.4.3 Feature List


As discussed in the section 'Our Solution', we have taken a user-centric approach and

as such we have a list of features that will facilitate the user conducting a morph.

Below is a list of features we intend to include in our final solution:

Images

User should easily be able to:

• Add – Images through dragging and dropping an image into an image

placeholder, through a file browser dialog and by clicking a button on the

toolbar.

• Remove - a specific Image from the proposed morph.

• Morph – the user can morph one image to another image, or can choose to do

a multi-morph with several images.

• Set Frame Rate – the user can set the number of frames in a morph by

adjusting the number selector in-between images.

• Save – Separate images from before and after a morph.

• Swap Images – since images are displayed left to right in the order in which
they will be morphed, swapping images in the GUI alters the order in which
images are morphed into one another.

• Scale – Image sizes to one another to make them compatible for a morph.

• Capture – Live Images direct from a webcam or similar connected device.

• Zoom – In and out on images to enable accurate placing of anchors and to

view detail post-morph.

• Warp – the user can warp an image by placing anchors in different places on two or
more copies of the same image.

Anchors

Anchors are the general term we use to describe the dots or lines that the user can

place on facial landmarks, such as the eyes, nose, mouth, hairline etc. It is essential

that these ‘anchors’ are set up accurately and placed on the same landmark on each

image in the morph to produce a good morph. Therefore the procedure for placing

anchors should be as efficient and intuitive as possible.

User should easily be able to:

• Add – anchors to the image in a variety of ways, using the toolbar button,

holding shift down while clicking, and through the right click menu.

• Modify – each anchor in terms of its size, shape, colour for increased

accessibility and usability.

• Match – the anchor co-ordinates from one image onto another image, through

the right click menu.

• Swap – the anchor co-ordinates between one image and another, through
the right click menu.

• Select – anchors either single select by clicking the specific anchor or multi-

select in a variety of ways, by dragging a control box around anchors that need

selecting or by holding the ctrl key down and clicking on anchors to add them

one by one to the selection.

Miscellaneous

• Teacher Mode – After a morph, the user should be able to view the anchors

and how they have moved to morph the images.

• Show Video – Preview a video of the morph.

• Export Video – Export a video to a number of different video formats.

• Wizard – Basic wizard that prompts the user to open an existing morph file or

to create a new morph.

5.5 Testing

The first part of our external testing was to test the usability of the program from a

user’s perspective. We did this in two parts described below.

Firstly, we used two independent testers who were given no instruction, just a brief
description of what warping and morphing are.

Sagir ‘Haji’ Hassam (sah03) was the first tester. His feedback is as follows.

‘The splash screen is amazing; it really grabs your attention to the application and

gives a professional appearance to the program. I found conducting the morph very

simple and intuitive, the first problem that I ran into was that the images I used were

too small, but I realised that there was a zooming option and that impressed me a lot.

There are also a lot of options to change the appearance of the icons on the images

and that’s good, but maybe they can be taken out of the toolbar and moved into

menus, but overall I give the program 10/10 for usability and just think that the line

morph took quite a long time but the outcome was amazing and I was kept up-to-date

with the progress so that was a bonus.’

Sagir’s feedback is very positive. He managed to use the program effectively
from the start without any prompting and seemed very impressed with the application.
His point that the toolbar may be too complicated was a big concern to us, so we
have since moved many of the items out of the toolbar and into the menus. He is
pleased with the result.

A.P. – Apurva Udeshi was the second tester.

‘I liked it a lot. One problem that I found was that the anchors pop up in random

places and it is hard to keep up with where the new ones have come. Apart from that

the program was very easy to use, and a lot of fun to see the outcome. Also when

choosing 2 images of differing sizes, it would be good to have a preview of the

changes that are being made and not just the option to do it. Apart from that it is very

good.’

The testing here was interesting and A.P. gave good feedback. To explain his
findings further: before our final release, new anchors appeared along a
diagonal line from the top left of the image, which meant that if the user added many
anchors the new ones might appear on the facial part of the image. We changed
this so that all new anchors now appear in the same place when the button is used to
create a new anchor. After we showed A.P. this, he was happy that it solved the problem.

The second issue that he pointed out to us was that if two images are of differing size
the program intervenes and asks the user to resize one of them, but no preview is
shown of how the image will look; it just appears in its altered state in the picture
box. He proposed that we show the user a preview of what the changes mean to the
look of the image before it is too late for the user to rectify the problem. We did this
by adding a new form called the image editing form, and he was also pleased with this
addition.

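The second approach to testing was to check the program against Nielsen’s
10 Usability Heuristics. These are:

1) Visibility of System Status

2) Match Between System and the Real World

3) User Control and Freedom

4) Consistency and Standards

5) Error Prevention

6) Recognition Rather than Recall

7) Flexibility and Efficiency of Use

8) Aesthetic and Minimalist Design

9) Help Users Recognise, Diagnose and Recover from Errors

10) Help and Documentation

[Screenshot: first form, annotated with heuristics 1, 2, 4, 8, 6 and 10]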

[Screenshot: second form, annotated with heuristics 3, 5 and 9]

I have picked the above forms to best convey our approach to fulfilling the criteria.

1) Visibility of System Status

The status bar keeps the user apprised of the status of the system.

2) Match Between System and the Real World

We have many links between the real world and the application, such as picture
boxes, but the most vivid has to be the idea of an anchor.

3) User Control and Freedom

The second form above shows how we give the user freedom. We tried to make
the program as customisable as possible, allowing the user to change the appearance
of anchors, change the values of constants and so on, but the most vivid example has
to be the fact that we allow the user to alter incompatible images however they desire.

4) Consistency and Standards

We tried to keep the GUI design of each form as standard as possible. One way
we did this was to have the toolbars in the same place and use the same icons on
each form for the same functionality.

5) Error Prevention

The second screen above is also an example of error prevention, as it alerts the
user to the problem of having two images of different sizes.

6) Recognition Rather than Recall

Again, we reuse icons that the user will be familiar with from everyday use of
Windows.

7) Flexibility and Efficiency of Use

This really was the motto of our project. We used standard Windows hotkeys to
allow efficient use, and introduced new ones such as shift and mouse click to
place an anchor onto the image. The feature list outlines many more of these
features.

8) Aesthetic and Minimalist Design

Again the toolbar is a good representation of this: it is very minimal and still
effective.

9) Help Users Recognise, Diagnose and Recover from Errors

The second screen shot above provides this functionality nicely. It is an error for a
user to try a morph between two images of differing size, as the morph is not
possible. This form is a nice way to allow users to recover from this error.

10) Help and Documentation

We have help in the program itself. We also have tool tips, which act as help, as well
as manuals and a wizard to guide the user through the use of our program.

Overall I believe that all of our forms meet these criteria well and as such we have
passed all of the tests that we have.

5.6 Evaluation

Overall, the screen shots and the testing show that the GUI is well designed
from a user’s perspective. One of our main aims was to make the GUI as intuitive as
possible. To test this we used two independent testers who were given no instruction,
just a brief description of what warping and morphing are, as described earlier in the
testing section. This was a huge success and so I think that we have met our requirement
here.

To further analyse our performance I will go through the main requirements and

evaluate how we performed in respect to what was expected.

• Load multiple images into the application for morphing. This should be possible

through Drag-and-Drop and also a File Chooser dialogue box.

This has been achieved fully in mainForm where the images are loaded. The user

can import the images through the file browser and by the process of dragging and

dropping it onto the picture box.

• Manipulate images to enable morphing to work

As described before, the morph will not work when images are different sizes, and
this requirement addresses that. We have an image editing form that allows
the user to edit the images to make them compatible. This addresses the problem and
allows the manipulation of the images.

• Add, Select and Manipulate Control Points on all images in a Morph Sequence.

These points are now known as anchors. We allow the full manipulation of

anchors, from changing their position and their appearance to deleting. This is also

functionality displayed on the main form. We implemented many user friendly

features, all listed in the feature list and so I think we have surpassed our

requirement. From the very positive user feedback on this I think that we have met

this requirement well.

• Provide a wizard which walks the user through a morphing project.

We have provided a wizard to help the user. The feedback on this was that it may

not be required as the GUI is extremely easy to use, but we intend to cater for

every level of user and so this will help in doing that. This wizard is visible each

time the application is loaded.

• Display all intermediary frames required to generate the morph for use in a
‘Teaching Mode’.

This teaching mode is an option that can be viewed when a morph is complete. The form
is called Result, and it contains all of the functionality discussed with our
supervisor for teaching purposes.

• Provide a Movie Player to view the animated morph sequence.

There is also a form to provide the functionality of playing the stream produced as

a result of the morph. This was explained above, and it provides all of the

desirable features. The users found that the features were all they required and

were very impressed with the export function that could encode the video.

We have met all of our requirements to a good level and so I think that overall our

GUI development was very good.

From a critical point of view, I think that the way we implemented the redraw
function, which is called each time a change is made, gave the GUI a flickering
appearance, where icons disappeared and reappeared often. This gives the program a
sluggish look and could have been avoided.

Also, the user is able to put two points on the same pixel, which causes problems
for the mesh algorithm. This could have been guarded against, but the guard caused other
undesirable behaviour, so we decided to leave the behaviour as it is. The undesirable
behaviour was that the user would not be able to put two anchors very close together,
which gives good morphs. Also, as described earlier in the user feedback, new
anchors should all be placed on the image in the same position when the add anchor
button is used to add them.

The main approach that we took was to compare our solution to the Morphious
application available online. Our GUI is a lot more intuitive and we have many more
anchor options, which we think distinguish our application from Morphious and make
it a lot more usable.

Improvements that we could implement:-

• Get rid of redraw. The way that we used redraw worked well when it came to
validation but it led to a jerky application. We should have had the GUI drive
the model, whereas the way we ended up is having the model drive
the GUI.

• Not all image types are accepted by the application. We could improve
this by allowing the user to use any format they want, but this would increase the
complexity of the program a lot, and we did not have the time to finish this
aspect.

• Automated addition of anchors would make the program a lot more user
friendly to a beginner. We looked into this and it became apparent that facial
feature recognition is a group project in its own right, so trying to add it to our
application would be very ambitious. We did find some implementations
online, but they did not integrate well with our application.

• Getting the application to be platform independent is always an advantage, but

our choice of development environment restricted this opportunity so getting

this application to work on Linux would be an improvement.

• We also would have liked to make our program appear more professional and
themed to XP. We think that making the GUI appear more ‘flashy’ would give
the wow factor that is somewhat lacking for a novice who may not be able to
appreciate what is behind the GUI.

6. Core

6.1 Introduction

The core’s main functions are to connect the GUI to the algorithms and to carry out
common procedures in the process of morphing.

The core includes the model, which is an internal representation of the state of the
program, including all of the images and anchors in a hierarchy.

6.2 Analysis

The core’s main functionality should be to:

• Load and manage all plug-ins and figure out which algorithms are
available to the user.

• Act as the middle man between the GUI and the algorithms.

• Encode the .avi video and output it in a variety of different formats.

• Carry out all parts of a morph common to all algorithms.

• Figure out which result generator to call, and send the results back to the
GUI.

• Include a model that represents each control’s state at each stage of
the program.

6.3 Design

One of the main design goals was for the core to do as much of the common
work as possible; in theory it should then be relatively simple for users to write
new plug-ins for our application. Thus it was imperative that the design of the core
was kept modular, with a clean separation drawn between core and GUI. We ensured
that the core was designed early, so that no bottlenecks were created in the
development of the GUI and algorithms, and that they could be properly tested and
debugged.

The following basic requirements were used to design the core.

1. Load and manage GUI and plug-ins.

The core must first of all load the other components, i.e. the GUI and the plug-

ins/algorithms.

The sequence of events for this functionality is shown below:

2. Provide a Project object

We wanted to introduce the concept of projects and allow the user to save the

state of the program, and preferences to reload at a later date. For this to be

possible we used XML serialization. This integrates well with our model.

3. Provide an interface for communication with other components

The actual interfaces needed became clearer during implementation; however, it
was obvious from the outset that we would need a standard interface for:

• Anchors

We need to define what an anchor is and then let each different

algorithm extend this definition for its own anchor types.

• Algorithm parameters

We needed a standard way in which we call the algorithm, to do this

we needed to agree upon the amount and type of arguments that each

algorithm takes.

[Sequence diagram: start-up. Participants: Application Manager, Core, Plugin Manager, GUI Controller, Main Form. Messages: 1. Create Instance; 2.0 run(); 2.1 Create Instance; 2.2 loadPlugins(); 3.0 Create Instance; 4.0 run(); 4.1 Create Instance; 4.3 updatePlugin()]


• Algorithms

An algorithm takes a set of anchors and generates a series of images,

which the core then streams to a video file.
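
The standard interfaces described above can be sketched as follows. This is an illustrative outline in Python, not the project's C# code: the names Anchor, DotAnchor, LineAnchor, MorphAlgorithm and HoldSourceAlgorithm, and the argument list of morph, are assumptions made for the example.

```python
from abc import ABC, abstractmethod


class Anchor(ABC):
    """Common base for whatever markers an algorithm needs (hypothetical name)."""

    @abstractmethod
    def points(self):
        """Return the (x, y) points that define this anchor."""


class DotAnchor(Anchor):
    def __init__(self, x, y):
        self.x, self.y = x, y

    def points(self):
        return [(self.x, self.y)]


class LineAnchor(Anchor):
    def __init__(self, start, end):
        self.start, self.end = start, end

    def points(self):
        return [self.start, self.end]


class MorphAlgorithm(ABC):
    """Every plug-in is called with the same agreed argument list (assumed here)."""

    @abstractmethod
    def morph(self, source, target, source_anchors, target_anchors, frame_count):
        """Return a list of frame_count images."""


class HoldSourceAlgorithm(MorphAlgorithm):
    """Trivial plug-in used only to exercise the interface: repeats the source."""

    def morph(self, source, target, source_anchors, target_anchors, frame_count):
        if len(source_anchors) != len(target_anchors):
            raise ValueError("each source anchor needs a matching target anchor")
        return [source for _ in range(frame_count)]


frames = HoldSourceAlgorithm().morph("src", "dst", [DotAnchor(1, 2)], [DotAnchor(3, 4)], 3)
```

Because the caller only depends on the two abstract classes, a new algorithm with its own anchor type can be added without changing the calling code.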

The sequence diagram below shows how the core acts as an intermediary
between the GUI and the algorithms when a morph is executed.

We then multi-threaded this process, as described below:

Threads are a powerful abstraction for allowing parallelized operations and for

our purposes are useful because they allow graphical updates to occur by using

another thread to perform computations. It is for this reason that we decided to

make use of a multi-threading architecture, whereby on execution of a morph,

a new thread is invoked to handle the processing, thus enabling the GUI to be

fully functional rather than in a blocked/waiting state.

In our implementation, the following sequence of events occurs:

• GUI sends a request to the Core for a morph

• Core creates a new thread, on which it runs the morph

• For every frame of the morph this thread performs a call-back to a delegate

which displays the current morph progress on the GUI

• On completion of the morph, a Result object is created and passed back to
the GUI through the Core, which can then be traversed and displayed as the
user requests

• The thread created by the Core is then disposed
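
The sequence of events above can be sketched as follows. This is a minimal illustration in Python rather than the project's C# code; run_morph, start_morph and the string frames are placeholders invented for the example, and a real GUI would keep processing events instead of joining the worker.

```python
import threading


def run_morph(frame_count, on_progress, on_done):
    """Pretend morph: reports progress after each frame, then hands back a result."""
    frames = []
    for i in range(frame_count):
        frames.append(f"frame-{i}")        # stand-in for an expensive warp and dissolve
        on_progress(i + 1, frame_count)    # per-frame call-back
    on_done(frames)                        # result handed back on completion


def start_morph(frame_count, on_progress, on_done):
    """Start the morph on a new thread and return immediately."""
    worker = threading.Thread(target=run_morph, args=(frame_count, on_progress, on_done))
    worker.start()
    return worker


progress_log = []
results = []
worker = start_morph(4, lambda done, total: progress_log.append((done, total)), results.append)
worker.join()  # only for the example; a GUI thread would stay free to redraw
```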

[Sequence diagram: executing a morph. Participants: MainForm, :GuiController, :Core, AbstractResultGenerator, :IPlugin. Messages: 1. Morph(); 2.0 res = doMorph(transform); 3 getResult(); 3.1 Create Instance; 3.2 GetResult(); 4 Create Instance; 4.1.* getNextFrame(); 4.2 Morph(Pixel, controlPoints)]


The main problem we faced when implementing this was that we had threads
which interact with the GUI. A requirement in the .NET Framework is that controls
are only accessed from the thread that created them, the Single Threaded Apartment
(STA) GUI thread. After some research we found that to overcome this
problem, you have to use the BeginInvoke / InvokeRequired pattern. In
Microsoft’s words, BeginInvoke

"executes the specified delegate asynchronously on the thread that the
control's underlying handle was created on"

The pattern checks InvokeRequired to find out whether the current thread is the one
on which the delegate should be run. If it is, the delegate is called directly. If it
is not, BeginInvoke queues the delegate so that it is run on the thread that created
the control.
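
The idea behind this pattern can be illustrated outside .NET. The sketch below is Python, not the project's code: UiDispatcher is a hypothetical stand-in for a control, with a queue playing the role of the GUI thread's message loop, and its method names only mirror InvokeRequired and BeginInvoke for readability.

```python
import queue
import threading


class UiDispatcher:
    """Stand-in for a control: remembers its owning thread and queues foreign calls."""

    def __init__(self):
        self.owner_thread = threading.get_ident()
        self._pending = queue.Queue()

    def invoke_required(self):
        # True when the caller is not on the thread that owns the control.
        return threading.get_ident() != self.owner_thread

    def begin_invoke(self, func, *args):
        # Queue the call; the owning thread runs it later in process_pending().
        self._pending.put((func, args))

    def process_pending(self):
        # Called by the owning thread, like one pass of a message loop.
        while True:
            try:
                func, args = self._pending.get_nowait()
            except queue.Empty:
                return
            func(*args)


ui = UiDispatcher()
updates = []  # (id of the thread that performed the update, progress value)


def set_progress(value):
    if ui.invoke_required():
        ui.begin_invoke(set_progress, value)  # re-post onto the owning thread
    else:
        updates.append((threading.get_ident(), value))


worker = threading.Thread(target=lambda: [set_progress(v) for v in (25, 50, 100)])
worker.start()
worker.join()
ui.process_pending()
```

Every update ends up being applied by the owning thread, even though all three were requested from the worker.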

4. Generate video from a series of images

The core should provide this function and return immediately, so that the GUI

can stay responsive. Even though the code can easily be put into the GUI, we

feel that to make a clear separation between core and GUI, this task should be

done from the core.

5. Provide an object model that relates images in a morph to their anchors

The reason why we decided on a model was that it fits the model view

controller approach which gave us the greatest degree of freedom when it

came to things like saving projects and passing parameters around. It also

meant that the team were able to work independently to a greater extent than

we would have if there were no model, and the state was solely represented on

the controls and forms.

The design of the Model is very simple. For each control we need a representation of

its state at each stage of the program. Below is a class diagram of the final

representation of the model.

[Class diagram of the model. Classes: PluginInfo, clsTransform, clsResult, clsImages, clsImageFrame, Image, IAnchor, Coordinate, Triangle, Line. The associations carry multiplicities of 1, 3 and many.]


6.4 Implementation

C# was chosen as the implementation language as we wanted to use Visual

Studio.NET to aid in the design of the GUI and to use its various plug-ins to enable

our continuous integration server.

We went through our design goals and added classes, properties and method stubs
in the right places to accommodate them. With this done, other developers
could write their code with these standards in mind, as the desired functionality of
the GUI and algorithms was now finalised. All that remained was to fill out the method
stubs and debug any unforeseen problems, which usually required collaboration between
developers on different parts of the project.

The implementation of the core evolved in a way such that small changes could be

made as problems were discovered or new features decided upon.

The implementation of the model within the core was a little more complicated. We
implemented the model and view to be as abstract as possible so that they can be
extended as simply as possible. An example of this is that the anchors are just generic
shapes, so if an algorithm is designed that requires square anchors then Anchor
can be extended to implement this. Also, the nature of the plug-in manager means that
other algorithms can be plugged in to our solution.

Below I have outlined two key components of the core, XML Serialization and AVI

Encoding.

XML Serialization

We quickly identified the most tedious part of the warping process for the animator to
be the addition of control-points to the source and target images. It followed that
a feature enabling them to save their progress would be extremely beneficial. After
some quick research we came upon the built-in XmlSerializer class in the .NET Framework,
which would enable us to transform our objects into a serial data format, namely XML.
We would make use of the Serialize() and Deserialize() methods to provide the
saving and loading functionality respectively.

We decided to use XML serialization as opposed to binary serialization for a number

of reasons. Firstly, XML being a text-based format meant it would be human

readable, and so we would be able to understand and manipulate the files, simplifying

debugging considerably. This would also mean that our files could be used by a third

party morphing program, because the author would be able to understand our files and

add functionality to their software to deserialize them. Finally, as XML was designed

to be a lightweight, platform independent format that could easily be transported, we

could extend our application by providing a web service, to which the user could send

the XML over HTTP.

The object at the root of the XML is the clsTransform object. The following list
details the attributes that are serialized:

• Morph Type

• Source/Target image and associated properties – size etc.

• Control-Points

• Algorithmic Parameters

On implementing the serialization we came across a number of problems which we
needed to solve. Firstly, we found certain rules to which the classes to be serialized
have to conform:

• XML serialization serializes only the public fields and property values of an

object into an XML stream.

• XML serialization does not include type information.

• XML serialization requires a default constructor to be declared in the class that is

to be serialized.

• XML serialization requires all properties that are to be serialized to be read-write
properties. Read-only properties are not serialized.

Having altered our classes to conform to these rules, and added XML attributes such as
the ones shown below, we came to the problem of how to serialize an image.

The problem we came across is that an image is stored as a bit map of the RGB
values for each pixel, and is therefore binary data. In order to store the
image in XML we would need to convert it into ASCII format. To overcome this,
we had to write a new property which would read the image into a memory stream
object, which we would then convert into an ASCII byte array and embed into the
XML file.
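
One common way of turning binary image data into ASCII text for XML is Base64. The report does not say which encoding the project used, so the Python sketch below is an assumption for illustration only; the element name Image and the functions image_to_xml and image_from_xml are invented for the example.

```python
import base64
import xml.etree.ElementTree as ET


def image_to_xml(name, image_bytes):
    """Wrap raw image bytes in an XML element as ASCII (Base64) text."""
    element = ET.Element("Image", {"name": name})
    element.text = base64.b64encode(image_bytes).decode("ascii")
    return ET.tostring(element, encoding="unicode")


def image_from_xml(xml_text):
    """Recover the name and the original bytes from the XML produced above."""
    element = ET.fromstring(xml_text)
    return element.get("name"), base64.b64decode(element.text)


raw = bytes([0, 255, 16, 32, 128, 7])  # stand-in for a bitmap's bytes
xml_text = image_to_xml("source", raw)
name, restored = image_from_xml(xml_text)
```

The round trip returns exactly the bytes that went in, which is what makes the saved project self-contained.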

To view a sample XML saved data file, see the appendix.

Encoding an AVI

To create an AVI file from a sequence of C# images (each stored as a bit map), we
made use of an existing AVI module. Using this module, which is essentially a
wrapper to Windows' own AVI processing library, the AVIFil32.dll COM object, we
can create a new AVI and then use the supplied AVI player to view it.

The library is structured as described by the diagram below:

[Figure from the XML Serialization section above: XML attributes, with the callouts "Tells the serializer how to serialise the object" and "Tells the serializer to serialize the object, using the given name, including the specified types for the elements"]


AviManager manages the streams in an AVI file. The constructor takes the name of
the file, which we obtain using System.IO.Path.GetTempFileName(), and opens it
with an empty video stream. We then call the AddVideoStream method with the first
image from the result, which creates a new video stream whose format is set to
that of the given image. We then iterate through the remaining images, calling the
AddFrame method on the video stream with each one. The stream is then stored in the
GUIController, so it can be exported using the MakeFileFromStream method should
the animator choose to.

The AviPlayer greatly simplifies the task of displaying the AVI file. The VideoForm
simply creates a new instance of the AviPlayer, giving it the video stream and the
picture box in which the video is to be displayed. Some features were not supported
by the player module, such as pause, loop and a slider showing the progress through
the movie. We extended the module to provide these functions.

There was one key implementation issue during the integration of this library into our
application. The problem came when the AviManager was encoding each bit map
into an AVI video frame. We found that after encoding the video, all the
images on the result form were flipped vertically, but the orientation of the video was
correct. On stepping through the code we found that the process of encoding an image
involves flipping it vertically. Unfortunately the developer had forgotten to reverse
this process, and so the images which were displayed on our form were upside-down.
Once these lines of code were found the solution was trivial, but we were reminded of
how important it is to rigorously test all third party code before integrating it into our
application.
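
The kind of bug described above can be reproduced in a few lines. The Python sketch below is hypothetical and is not the third party module's code: it assumes the encoder flips the caller's bitmap in place, and shows a fixed variant that flips it back afterwards.

```python
def flip_vertically(rows):
    """Reverse the row order of a bitmap in place."""
    rows.reverse()


def encode_frame_buggy(rows):
    """Flips the caller's bitmap for encoding and never restores it."""
    flip_vertically(rows)
    return [list(row) for row in rows]


def encode_frame_fixed(rows):
    """Flips for encoding, then restores the caller's bitmap."""
    flip_vertically(rows)
    try:
        return [list(row) for row in rows]
    finally:
        flip_vertically(rows)


bitmap = [["top"], ["middle"], ["bottom"]]
encoded = encode_frame_fixed(bitmap)

shared = [["top"], ["bottom"]]
encode_frame_buggy(shared)  # 'shared' is now upside-down for every other user
```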

6.4.1 Class Listings (Core)

Below is a list of all the classes in the Core and the Model, with a brief outline of
each.

AbstractResultGenerator.cs

This class generates the results for any type of morph. The class also contains the

code for cross fading.

Core.cs

This is the main core class that figures out which algorithms and plug-ins are loaded

and present when the program is launched. It contains the code that encodes the AVI

video. It also decides which type of result generator to call the morph method on

before passing the results back to the GUI to process.

Line.cs

Contains all code to model a line, including the start and end point, code to calculate

the length of the line, the centre of the line and the gradient of the line.
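
The three calculations mentioned for Line.cs are straightforward. A minimal Python sketch follows; it is not the project's C# class, the field names are assumed, and reporting a vertical line's gradient as infinity is a choice made for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    x1: float
    y1: float
    x2: float
    y2: float

    def length(self):
        return math.hypot(self.x2 - self.x1, self.y2 - self.y1)

    def centre(self):
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

    def gradient(self):
        # A vertical line has no finite gradient; report it as infinity.
        dx = self.x2 - self.x1
        if dx == 0:
            return math.inf
        return (self.y2 - self.y1) / dx


line = Line(0, 0, 3, 4)
```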

LineResultGenerator.cs

See Field Morphing (Beier-Neely Algorithm)

MeshResultGenerator.cs

See Mesh Warping

PluginAttribute.cs

This class is used to decorate a plug-in with metadata, i.e. the name of the plug-in, a
brief description of what the plug-in does and the markerType used by the plug-in.

6.4.2 Class Listings (Model)

Most of these classes are just the representation of the states of the forms and controls

and so just contain state and very little functionality.

clsImageFrame

This class is the representation of a picture box with the anchors on it in the GUI. The

functionality is just various adding and deleting functions needed in order to best

mimic the user’s actions.

clsImages

This class contains a collection of images and a pixel format. This is the
representation of clsImageStrip in the GUI. The pixel format is needed to make
sure that the images used in the morph have the same pixel type, so that the morph
gives the best results. There is also a method called compatible which checks whether
two images are compatible when the second is dragged into a picture box on the GUI.
This is in order to alert the user to the fact that the types of the images should be
the same for the morph to work.

Another important point about this class is that there is a method add which, given
an image, creates a new ImageFrame and copies the anchors of the previous
ImageFrame. This is required in the normal use of the program.
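
The behaviour of add and compatible described above might look like the following. This is a Python sketch under stated assumptions, not the project's clsImages: images are represented as dictionaries with a "format" key, anchors as [x, y] lists, and the class names are shortened.

```python
import copy


class ImageFrame:
    """One image together with the anchors placed on it."""

    def __init__(self, image, anchors=None):
        self.image = image
        self.anchors = anchors if anchors is not None else []


class Images:
    """Ordered collection of frames that must share one pixel format."""

    def __init__(self):
        self.frames = []
        self.pixel_format = None  # fixed by the first image added

    def compatible(self, image):
        return self.pixel_format is None or image["format"] == self.pixel_format

    def add(self, image):
        if not self.compatible(image):
            raise ValueError("pixel format differs from the images already loaded")
        self.pixel_format = image["format"]
        previous = self.frames[-1].anchors if self.frames else []
        frame = ImageFrame(image, copy.deepcopy(previous))  # copy, do not share
        self.frames.append(frame)
        return frame


strip = Images()
first = strip.add({"format": "rgb24", "name": "a"})
first.anchors.append([10, 20])
second = strip.add({"format": "rgb24", "name": "b"})
second.anchors[0][0] = 99  # moving the copy leaves the original untouched
```

Copying the anchors means every new image starts with a marker for each landmark already placed, which the user then only has to drag into position.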

clsResult

This class is used to represent what is shown on the ctlresult control. This means that

it contains 3 clsImages and has functionality to manipulate that data.

clsTransform

This is the representation of the clsImageStrip control. It contains the functionality to
add images, replace images and delete images; again, all of the functionality needed
to represent what the user can do in the control.

Coordinate

This class is the internal representation of a point control, and contains functionality
to validate where the point is, i.e. the point has to be within the picture box that
contains its image.
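
The validation described for Coordinate amounts to a bounds check. The Python sketch below is illustrative only; the method names is_within and clamped_to are assumptions, and the report does not say whether an invalid point is rejected or moved back inside.

```python
from dataclasses import dataclass


@dataclass
class Coordinate:
    x: int
    y: int

    def is_within(self, width, height):
        """True when the point lies on a pixel of a width x height image."""
        return 0 <= self.x < width and 0 <= self.y < height

    def clamped_to(self, width, height):
        """Return the nearest point that does lie inside the image."""
        return Coordinate(min(max(self.x, 0), width - 1),
                          min(max(self.y, 0), height - 1))


inside = Coordinate(5, 5).is_within(10, 10)
moved = Coordinate(-3, 12).clamped_to(10, 10)
```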

IAnchor

This class is the abstract representation of a line or a point, and so is the parent class

of each. Within the core all anchors are referred to as of type IAnchor.

PluginInfo

This class contains the data that is held in the plug-in attributes that describe the plug-

in and outlines the marker type for that morph as well as the parameters that can be

altered by the user.

6.5 Evaluation

The finished core fulfils the specification it was built to.

However there was a lot less code re-use than was originally designed. This was due

to the result generators being dependent upon the anchor type used. (i.e. an anchor of

type line returns different results to an anchor of type point). Therefore part of the

originally planned core was actually implemented in the GUI controller.

Even though it would have been easier to encode video in the GUI controller,
structurally, encoding video is more of a responsibility of the core than of the
GUI and so was implemented in the core. This meant that a lot of message passing
had to occur: a button on the form would have to communicate with the form, which
communicates with the GUI controller, which in turn communicates with the core, all
in order to initiate an action. This followed from our guideline that there
should be a clear separation between GUI and core.

The model was carefully planned and implemented, and does everything that we

planned. It keeps track of collections of images along with the anchors and their

respective positions on the images. It also enables us to load and save whole morph

projects to files using XML serialisation.

Limitations

Limited image format support:

Due to limitations in the way C# and Visual Studio.NET handle images, the
application only supports JPEG and bitmap images.

7. Core Algorithms

7.1 Introduction

In this section we describe in greater detail how the morphing and warping
algorithms work.

Morphing

The generic process of morphing two images (SRC and DEST) is described
below:

1) SRC image is warped to DEST

2) DEST image is warped to SRC

3) Sequences 1) and 2) are run concurrently and in opposite directions, such that
when SRC is fully warped DEST will be unchanged. Therefore in a morphing sequence
of f frames, as we move from frame 1 to frame f we see SRC progressively
warping to DEST’s shape (See Fig [n].a) and DEST un-warping from SRC’s
shape to its unaltered state (See Fig [n].c)

4) Each frame of the concurrent sequence now contains two images: Image 1
from Sequence 1 and Image 2 from Sequence 2. Image 1 and Image 2 are then
cross-dissolved (see below) to form the morphed frame. Image 1 goes from
full opacity to complete transparency whilst Image 2 does the opposite.

The above process produces a morph sequence from SRC to DEST.
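
The four steps above imply a simple per-frame schedule. The Python sketch below is an illustration, not the project's code: morph_schedule is an invented name, and dividing by f - 1 (so that the last frame is exactly DEST) is an assumption about how the end points are handled.

```python
def morph_schedule(frame_count):
    """Yield (frame index, SRC warp amount, DEST warp amount, DEST opacity).

    A warp amount of 0.0 means the image is unaltered and 1.0 means it has
    taken on the other image's shape.
    """
    if frame_count < 2:
        raise ValueError("a morph needs at least two frames")
    for i in range(frame_count):
        t = i / (frame_count - 1)  # 0.0 at the first frame, 1.0 at the last
        yield i, t, 1.0 - t, t


schedule = list(morph_schedule(5))
```

For five frames the middle frame warps both images half way and blends them equally, while the first and last frames are the unaltered SRC and DEST.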

Cross Dissolution

The most basic morph technique, used before warping came about, was the
cross-dissolve. This is when a transition from one image to the other is done simply
through a fading of the colours in each, using the interpolation technique described by
the diagram below:

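[Figure: morphing as two warp sequences combined by a cross dissolve, over frames f0 to f4. a) Warp of the source image to the destination image. b) Cross dissolve of the two sequences, weighted 75 % / 25 %, 50 % / 50 % and 25 % / 75 % in the intermediate frames. c) Warp of the destination image to the source image.]
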

The above diagram describes the morphing process described in Section [N].[1]. In
section b) we can see that each frame of the resulting morph sequence is taken as a
cross dissolution of one image from a) and another from c). In a sequence of f frames
the morph frame ‘n’ will consist of frame ‘n’ (Image A) from a) and frame ‘f-n’
(Image C) from c). The colour composition of frame n from the morph sequence will
be a fraction (1-n/f) of Image A and a fraction (n/f) of Image C.
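
The weighting can be sketched per pixel as follows. This is a minimal Python illustration, not the project's C# implementation: images are assumed to be equally sized lists of rows of (R, G, B) tuples, and cross_dissolve and weight_c are names invented for the example.

```python
def cross_dissolve(image_a, image_c, weight_c):
    """Blend two equally sized RGB images; weight_c is Image C's share (0.0 to 1.0)."""
    if not 0.0 <= weight_c <= 1.0:
        raise ValueError("weight_c must lie between 0 and 1")
    if len(image_a) != len(image_c) or any(
        len(row_a) != len(row_c) for row_a, row_c in zip(image_a, image_c)
    ):
        raise ValueError("images must have the same dimensions")
    weight_a = 1.0 - weight_c
    return [
        [
            tuple(round(weight_a * a + weight_c * c) for a, c in zip(pixel_a, pixel_c))
            for pixel_a, pixel_c in zip(row_a, row_c)
        ]
        for row_a, row_c in zip(image_a, image_c)
    ]


red = [[(255, 0, 0), (255, 0, 0)]]
blue = [[(0, 0, 255), (0, 0, 255)]]
quarter = cross_dissolve(red, blue, 0.25)  # 75 % red, 25 % blue
```

With weight_c set to n/f this gives the colour composition of morph frame n.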

The algorithm for cross-dissolving implements the method detailed in the diagram

above directly. It simply creates a new image, iterates through each of its pixels and

samples the correct percentage of the RGB values from the corresponding pixels in

each of the warped images. The diagram below should confirm the effectiveness of

the algorithm:

Warping

Warping an image is the process of changing its spatial configuration whilst maintaining its colouring. There are a number of different algorithms commonly used for this purpose. They take as input two images and two sets of corresponding Control Points, and output a warped version of the source image whose control-point features match those of the destination image. Generally, a set of intermediate warps is also created to provide a smooth transition between the source and target images.

Control Points

The warping algorithms use Control Points to identify the common features in both images, using them as a reference for how the spatial configuration should be changed. We are focussing on the warping of faces, so the control points in our case are the corresponding facial features such as the eyes, lips, mouth and hairline. There are two ways to determine the common features on a face: 1) Automatic Feature Recognition, and 2) User Input.

After researching the industry-standard techniques for automatic recognition, and

through discussion with our supervisor we decided it would be best not to implement

automatic recognition, as the time available would not have permitted us to do so

convincingly.

Once the corresponding points have been marked on the two faces, the algorithm can ensure that the specified features in the first image end up at the corresponding points in the second image. The warping can then be combined with cross-dissolution to provide a more fluid and realistic morph. An example of a morph using these two techniques together is shown below.

The warping technique used here is Mesh Warping and the interpolated meshes are

shown in Section a) and c) above. Section b) shows the cross dissolution of the two

warp sequences, resulting in particularly convincing metamorphosis.

In the following sections we will detail the two algorithms we decided to implement,

but it is best to first understand our thinking with regard to the logical placement of

these algorithms and how they would be integrated with the overall system.

A fundamental design consideration we have maintained throughout this project has been to have a plugin architecture for the algorithms. We wanted users to be able to write their own algorithms and easily integrate them into our program in the form of a DLL. To enable this feature and make it easier for users to write plugins, we wanted the algorithm plugin to contain only the lowest-level operations, i.e. only those which are unique to the particular algorithm. Typically this means taking a single pixel as input and outputting the coordinates of the replacement pixel.

We therefore created an abstract class AbstractResultGenerator which was

responsible for those features which are common to all algorithms, regardless of the

marker-type. The role of this class then boils down to performing the cross-dissolve

between the Forward Warp and the Backwards Warp. This class then acts as the

interface between the GUI and the algorithms, sending the morphological sequence

back to the GUI for display/AVI encoding.
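The split of responsibilities described above can be sketched as follows. This is a simplified illustration, not the project's C# code: the names AlgorithmPlugin, ResultGenerator, source_coordinate and IdentityPlugin are hypothetical, images are greyscale lists of rows, and the marker-specific wrapper class is collapsed into the generator for brevity.

```python
from abc import ABC, abstractmethod


class AlgorithmPlugin(ABC):
    """Lowest-level, algorithm-specific step (hypothetical interface)."""

    @abstractmethod
    def source_coordinate(self, x, y, t, forward):
        """Return the (x, y) to sample in the input image for output pixel (x, y)."""


class ResultGenerator:
    """Algorithm-independent work: run both warps and cross-dissolve them."""

    def __init__(self, plugin):
        self.plugin = plugin

    def warp(self, image, t, forward):
        height, width = len(image), len(image[0])
        out = []
        for y in range(height):
            row = []
            for x in range(width):
                sx, sy = self.plugin.source_coordinate(x, y, t, forward)
                sx = min(max(sx, 0), width - 1)    # clamp to the image bounds
                sy = min(max(sy, 0), height - 1)
                row.append(image[sy][sx])
            out.append(row)
        return out

    def frame(self, src, dst, t):
        forward = self.warp(src, t, True)     # SRC warping towards DEST's shape
        backward = self.warp(dst, t, False)   # DEST un-warping from SRC's shape
        return [[(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(forward, backward)]


class IdentityPlugin(AlgorithmPlugin):
    """Trivial plugin used only to exercise the generator."""

    def source_coordinate(self, x, y, t, forward):
        return x, y


generator = ResultGenerator(IdentityPlugin())
```

With the identity plugin no warping takes place, so a frame is a plain cross-dissolve; a real plugin would only have to replace source_coordinate.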

[Figure: the Source Image and the Destination Image are each warped, and the two warps are combined by cross-dissolution.]

7.2 Field Morphing (Beier-Neely)

7.2.1 Background

The Field Warping technique uses two-dimensional control points to specify corresponding features in the source and destination images. Each control point exerts a field of influence on its surrounding area, with the strength of this field decreasing as the distance from the control point increases.

The algorithm presented by Thaddeus Beier and Shawn Neely in 1992 was based upon this technique, with the aim of simplifying the user interface for morphing programs. They presented a technique which allows the user to specify corresponding facial features using line pairs, thus providing a greater degree of control over the morph by allowing the animator to choose the features they want the warp to focus on.

7.2.2 Analysis

There are two ways to warp an image: Forward-Mapping and Reverse-Mapping. Forward mapping scans the source image pixel by pixel, copying each one to the appropriate place in the destination image. The problem with this method is that the source pixels do not necessarily map one-to-one onto the destination pixels, so some destination pixels are never written and remain dead. Reverse mapping instead goes through the destination image pixel by pixel and samples the correct pixel from the source image, thus ensuring that every pixel in the destination image is painted with something appropriate.

The Beier-Neely algorithm implements Reverse-Mapping, building a warped image

by using coordinate mappings to find the pixel to be sampled. A mapping from one

image to the other can be defined as a pair of lines (one defined relative to the source,

and the other relative to the destination image), and it would be best to understand

how a single mapping is used to warp a pair of images, and then expand this for the

case of multiple line pairs.

Figure 1: Single line pair [Beier-Neely paper]

Taking the case above, the mapping is used to find a pixel (X’) in the source image

that corresponds to a particular pixel (X) in the destination image. The position of X’

is found by calculating the position of X relative to the control-line PQ, and finding

the pixel from the source image that is in the same position relative to P’Q’. This

process is then repeated for every pixel in the destination image, resulting in an image

whose pixels have all been replaced by a corresponding pixel from the source image.

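The formulae below, as given in the Beier-Neely paper, explain the mathematics behind this principle. Note that Perpendicular() returns a vector perpendicular to, and of the same length as, the argument vector.

u = (X - P) · (Q - P) / |Q - P|²

v = (X - P) · Perpendicular(Q - P) / |Q - P|

X’ = P’ + u · (Q’ - P’) + v · Perpendicular(Q’ - P’) / |Q’ - P’|

The value u is the proportional distance along the line, ranging from 0 to 1 as the pixel moves from P to Q, and lying outside this range should the pixel be beyond either end of the line. The value v is the signed perpendicular distance from the pixel to the line. The separate distance used for weighting in the multiple line case is |v| when u lies between 0 and 1, and otherwise the distance to the closest endpoint, P or Q.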
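The single line-pair mapping can be sketched in Python as below. The function and parameter names (map_single_line, p_src, q_src) are hypothetical; the arithmetic follows the u, v and X’ definitions of the Beier-Neely paper.

```python
import math


def perpendicular(v):
    """Vector perpendicular to v and of the same length."""
    return (-v[1], v[0])


def map_single_line(x, p, q, p_src, q_src):
    """Map destination pixel x (relative to PQ) to X' (relative to P'Q').

    Returns (x_src, u, v).
    """
    pq = (q[0] - p[0], q[1] - p[1])
    xp = (x[0] - p[0], x[1] - p[1])
    length_sq = pq[0] ** 2 + pq[1] ** 2

    # u: proportional distance along PQ; v: signed perpendicular distance.
    u = (xp[0] * pq[0] + xp[1] * pq[1]) / length_sq
    perp = perpendicular(pq)
    v = (xp[0] * perp[0] + xp[1] * perp[1]) / math.sqrt(length_sq)

    # Rebuild the same relative position against the source line P'Q'.
    pq_src = (q_src[0] - p_src[0], q_src[1] - p_src[1])
    perp_src = perpendicular(pq_src)
    src_length = math.hypot(pq_src[0], pq_src[1])
    x_src = (
        p_src[0] + u * pq_src[0] + v * perp_src[0] / src_length,
        p_src[1] + u * pq_src[1] + v * perp_src[1] / src_length,
    )
    return x_src, u, v


# Destination line along the x-axis; source line rotated by 90 degrees and doubled.
x_src, u, v = map_single_line((5, 2), (0, 0), (10, 0), (0, 0), (0, 20))
```

Note that u scales with the length of the line while v is an absolute distance in pixels, which is why the pixel stays 2 pixels away from the line even though the line has doubled in length.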

The single line case is simply a special case of the multiple line case, in which each line has an associated weight that determines the influence it exerts on a pixel, depending on the pixel's distance from it. The following formula is used in the calculation:

weight = (length^ρ / (α + dist))^β

where length is the length of the line, dist is the distance of the pixel to the line and α, β, and ρ are algorithmic parameters as defined below (typical range in brackets).

• α determines the extent of the user’s control over the warp. A low value means the pixels will go exactly where the user intended; an increasing value gives less precise control but an increased overall smoothness to the warp. (>0)

• β determines how a line’s weight is affected by its distance from a pixel. A large value means a pixel will only be affected by the line closest to it; a zero value means every line has the same relative influence. (0.5 – 2)

• ρ determines how line length influences line weight. A zero value means length has no influence; a higher value means longer lines carry more weight. (0 – 1)

The multiple line algorithm is described below:

For each pixel X in the destination
    DSUM = (0,0)
    weightsum = 0
    For each line Pi Qi
        calculate u,v based on Pi Qi
        calculate X'i based on u,v and Pi'Qi'
        calculate displacement Di = X'i - X for this line
        dist = shortest distance from X to Pi Qi
        weight = (length^ρ / (α + dist))^β
        DSUM += Di * weight
        weightsum += weight
    X' = X + DSUM / weightsum
    destinationImage(X) = sourceImage(X')

Briefly, for each line the algorithm calculates the u and v values and uses these to find

the position of X’. The displacement of the current pixel, X, from X’ is then weighted

and added to an accumulator, the process then being repeated for every line. Having

iterated through every line, the actual pixel X’ is calculated by taking X and adding

the weighted displacements to find the cumulative effect of every line. The process is

then re-run for every pixel X.
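The per-pixel part of the multiple line algorithm can be sketched in Python as follows. The function name warp_pixel and the tuple-based line representation are assumptions made for the example; the default parameter values are the ones this report found most effective for faces.

```python
import math


def warp_pixel(x, dst_lines, src_lines, alpha=1.0, beta=2.0, rho=0.5):
    """Return the source position X' for destination pixel x.

    dst_lines[i] = (P, Q) and src_lines[i] = (P', Q') are corresponding lines.
    """
    dsum_x = dsum_y = weight_sum = 0.0
    for (p, q), (ps, qs) in zip(dst_lines, src_lines):
        dx, dy = q[0] - p[0], q[1] - p[1]
        length = math.hypot(dx, dy)
        # u along the line, v perpendicular to it (the perpendicular is (-dy, dx)).
        u = ((x[0] - p[0]) * dx + (x[1] - p[1]) * dy) / (length * length)
        v = ((x[0] - p[0]) * -dy + (x[1] - p[1]) * dx) / length
        # Position of X'i relative to the source line.
        sx, sy = qs[0] - ps[0], qs[1] - ps[1]
        src_length = math.hypot(sx, sy)
        xi = (ps[0] + u * sx + v * -sy / src_length,
              ps[1] + u * sy + v * sx / src_length)
        # Shortest distance from X to the segment PQ.
        if u < 0:
            dist = math.hypot(x[0] - p[0], x[1] - p[1])
        elif u > 1:
            dist = math.hypot(x[0] - q[0], x[1] - q[1])
        else:
            dist = abs(v)
        weight = (length ** rho / (alpha + dist)) ** beta
        dsum_x += (xi[0] - x[0]) * weight
        dsum_y += (xi[1] - x[1]) * weight
        weight_sum += weight
    return (x[0] + dsum_x / weight_sum, x[1] + dsum_y / weight_sum)


# Two destination lines whose source counterparts are both shifted 3 pixels right.
dst_lines = [((0, 0), (10, 0)), ((0, 5), (0, 15))]
src_lines = [((3, 0), (13, 0)), ((3, 5), (3, 15))]
shifted = warp_pixel((4, 4), dst_lines, src_lines)
```

When every line pair describes the same translation, every displacement Di is identical and the weighted average returns that translation regardless of the weights.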

To create a realistic morphing sequence, it is necessary to generate multiple warped

images at intermediate stages between the source and destination frames. A new set of

coordinate mappings (line pairs) is then generated by interpolating the lines between

their positions in the source and destination images. The algorithm is then run at each

intermediate frame with the interpolated feature lines for that frame.

7.2.3 Design & Implementation

As the key design consideration we have maintained from the outset has been to have a plugin architecture, with only the features unique to the algorithm in the plugin, we structured the classes as detailed in the following section.

In the case of Beier-Neely, the lowest-level unique feature is the calculation of the

pixel to be sampled from the source image. It was decided that this calculation would

be performed by the Beier-Neely plugin. The features common to all morphs using

Line control-points would be handled by a wrapper class, Line Result Generator. This

class would extend an Abstract Result Generator class, which is responsible for the

functionality common to all morphing techniques regardless of control-point type.

The diagram below clarifies the structure:

It is important to understand that we arrived at the structure detailed above after several stages of refactoring. At first, we had everything needed for the warp in the Beier-Neely plugin, including the cross-dissolve and line-interpolation code. We then realised that this was wasteful, because certain methods would be required by other algorithms which use line pairs as control points. To handle the features common to all morphs using Line control-points, we decided to have a wrapper LineResultGenerator class. After a further stage of refactoring we found that we still had code repetition, for example in the method which cross-dissolves the forward and reverse warped frames. This is when we decided to have an AbstractResultGenerator class, containing the features common to all morphs regardless of control-point type.

Below is a description of the functionality of each class:

Abstract Result Generator

As detailed above

Line Result Generator

• Interpolate Control Lines for each intermediate frame

• Iterate through every pixel in the destination image and call the algorithm plugin

to find the corresponding pixel to be sampled from the source image.

• Call the algorithm for each intermediate warp frame, once in the forward and once

in the reverse direction

[Class diagram: the ALGBeierNeely plugin is used by LineResultGenerator, which extends the abstract class AbstractResultGenerator.]

Interpolation of Control-Point Lines:

There are two different methods available for interpolating lines. The first is to interpolate just the endpoints of each line, and the second is to interpolate the centre position, the orientation and the length of each line. In the first case, rotating a line through 180° causes it to shrink in the middle of the morph. The second case, however, does not always create the most obvious interpolations, and so the user may be confused. As we plan to show the user the intermediate stages of the morph, we will be using the second method, thus avoiding the problem of line shrinkage. The diagram below shows an example of the two interpolation techniques:

It is clear that with the first method the line disappears halfway through the morph, whereas the line length is maintained with the second method.

The algorithm to interpolate the control-point lines works on the following basis. Given an array of all the control-point lines and the progress P through the warp (a fraction between 0 and 1), it calculates the centre, length and gradient of each new line. For a source line S, a target line T and a progress P, it takes a fraction (1-P) of S’s values and a fraction P of T’s values, so that the interpolated line coincides with S at the start of the warp and with T at the end. It then creates a new line using a dedicated constructor which takes as input a centre coordinate, length and gradient. The pseudocode below should clarify this:

Line[] interpolatedLines = new Line[sourceLinesArray.Length]
i = 0
While i < sourceLinesArray.Length
    S = sourceLinesArray[i]
    T = targetLinesArray[i]
    centre = (1-P)*S.Centre + P*T.Centre
    gradient = (1-P)*S.Gradient + P*T.Gradient
    length = (1-P)*S.Length + P*T.Length
    interpolatedLines[i] = new Line(centre, length, gradient)
    i = i + 1
Loop
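The difference between the two interpolation methods can be demonstrated with a short Python sketch. It is an illustration only: the function names are hypothetical, and the orientation is stored as an angle (via atan2) rather than as a gradient, which is a substitution made here so that vertical lines can be represented. The angle is interpolated naively, without choosing the shorter direction of rotation.

```python
import math


def lerp(a, b, t):
    """Linear interpolation from a (t = 0) to b (t = 1)."""
    return a + (b - a) * t


def interpolate_endpoints(src, dst, t):
    """Method 1: interpolate the two endpoints independently."""
    (p0, q0), (p1, q1) = src, dst
    return ((lerp(p0[0], p1[0], t), lerp(p0[1], p1[1], t)),
            (lerp(q0[0], q1[0], t), lerp(q0[1], q1[1], t)))


def interpolate_centre_length_angle(src, dst, t):
    """Method 2: interpolate centre, length and orientation."""
    def describe(line):
        (px, py), (qx, qy) = line
        return ((px + qx) / 2, (py + qy) / 2,
                math.hypot(qx - px, qy - py), math.atan2(qy - py, qx - px))

    cx0, cy0, len0, ang0 = describe(src)
    cx1, cy1, len1, ang1 = describe(dst)
    cx, cy = lerp(cx0, cx1, t), lerp(cy0, cy1, t)
    half = lerp(len0, len1, t) / 2
    angle = lerp(ang0, ang1, t)
    dx, dy = half * math.cos(angle), half * math.sin(angle)
    return ((cx - dx, cy - dy), (cx + dx, cy + dy))


def line_length(line):
    (px, py), (qx, qy) = line
    return math.hypot(qx - px, qy - py)


# A line rotated through 180 degrees: its endpoints swap places.
src = ((0, 0), (10, 0))
dst = ((10, 0), (0, 0))
mid_endpoints = interpolate_endpoints(src, dst, 0.5)
mid_centre = interpolate_centre_length_angle(src, dst, 0.5)
```

Halfway through, the endpoint method collapses the line to a single point, while the centre/length/orientation method yields a vertical line of the original length.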

[Figure: Source and Target lines over time t, with the interpolated lines in blue, for gradient/mid-point interpolation and for end-point interpolation.]

Beier-Neely Plugin

The Beier-Neely plugin is a direct implementation of the pseudocode describing the multiple-line algorithm listed above. We implemented it directly in Visual C# with a view to porting it to Unmanaged C to improve performance.

C# .NET hides most of the memory management, which makes development much easier thanks to features such as the Garbage Collector and the use of references. However, these features hinder performance, and Unmanaged C is often needed so that memory can be accessed and manipulated directly using pointers rather than references. On the other hand, Unmanaged C has an extremely complex syntax and is harder to use, as you need to be more careful and logical when using pointers. If you misuse a pointer you risk overwriting other variables, causing stack overflows or causing the program to crash. Furthermore, there is no in-built type checking, so if you access a variable with the wrong type, .NET won’t execute the statement.

Once we had implemented the algorithm in C#, we performed some further research into Unmanaged C and found that integrating it with Visual C# is inherently unstable. We had also begun implementing the Mesh algorithm, and decided that it would be best to concentrate on providing the user with more features, as Mesh is already a much faster algorithm and, even after porting to Unmanaged C, Beier-Neely would never match its performance.

Integrating Algorithm with Core

When integrating the algorithm with the GUI and Core, we came across a few problems. When we saw visually for the first time how the algorithm worked, we realised that there were errors in our code. We had misinterpreted some of the maths in the pseudocode, and had written our NUnit tests, run with the primitive GUI, using the same misinterpreted maths, so all the tests had passed. After debugging the algorithm code, we found the erroneous functions and corrected them.

After this minor setback, we made the decision to concentrate on a suitable GUI that would support the second algorithm, the mesh algorithm, before we actually developed the algorithm.

7.2.4 Testing

This section illustrates how well the algorithm performs, showing real-life examples of a metamorphosis. It is best to first analyse how effectively it morphs a circle into a square, using the cross-dissolve-only morph as a comparison. The figure below displays the two morph sequences, using 8 control points:

It is obvious that the second sequence is considerably more effective. The areas of overlap between the two objects have been minimised through the warp, which has resulted in much smaller shadowed portions. From the warping sequences below, it can be seen that the algorithm is pulling out the four quadrants of the circle, and is squashing the corners of the square, to match the spatial configuration defined by the control points.

[Figure: morph sequences produced by the cross-dissolve-only technique and by the Beier-Neely technique, followed by the two warp sequences.]

However, on further inspection of the sequence, it is clear that the morph is not fully convincing, in that there are still shadowed portions. This is due to the incompatibility of the straight control lines with the curvature of the circle; it is simply not possible to accurately specify the spatial configuration of the circle with only 8 straight lines. The sequence below shows how the morph improves with 16 lines:

This result is quite significant as we are focusing on facial morphing: as the spatial configuration of a face is best defined with curves, it will be necessary to use a large number of lines to achieve an effective metamorphosis.

Facial Morphing

The following images display a facial metamorphosis using 40 lines, with the

parameters we found to be most effective: α:1, β:2, ρ:0.5.

[Figure annotations: "The warp has produced a much more convincing shape, as the curvature was defined more accurately, resulting in smoother edges." "The size of the shadowed portion in this image is greatly reduced, showing that the effect could be reduced further by adding more control points."]

The result is a very convincing morph, as can be seen from the realism of the middle picture in the morph sequence. In the forward warp the most significant reconfigurations are the subject's jawline, which has been straightened, and his facial features, which have been skewed up and pulled out slightly. It can also be seen that the portion showing his clothing has been squashed vertically to match that of his counterpart. In the reverse warp the most significant changes are in the shape of his hair, which has been pulled up in the middle to create a more triangular shape, and the overall shift of features to the left. On close inspection of the centre image, some shadowing can still be seen around the neck and left ear but, as confirmed by the square-circle results, this could be improved by the addition of further lines.

Parameters

When testing the effects of the parameters, we ran 3 tests, each with one of the three parameters taking its highest value whilst the others were held at their lowest. The results are shown below (the two extreme warped frames and the centre metamorphosis frame):

[Figure: the Forward Warp, Reverse Warp and Morph Sequence of the facial morph.]

From the two extreme warps in each sequence it can be seen that a high value of α or ρ causes very little facial warping, whilst a high value of β has the most effect. This is directly related to the quality of the morph, which can be seen to be significantly better in the second sequence, where β is high. This is consistent with our previous finding that a large number of lines is required to define a human face, and so the weight of each line should depend strongly on the pixel’s distance from it. Through further testing we were able to conclude that the best parameters for a facial morph are those we stated previously, α:1, β:2, ρ:0.5.

[Figure: three test sequences, with parameters (1) α:2, β:0.5, ρ:0; (2) α:0.1, β:2, ρ:0; (3) α:0.1, β:0.5, ρ:1.]

7.2.5 Evaluation

As can be seen from the morph sequences produced by the Beier-Neely algorithm, there are no real issues with the quality of the morphs produced, provided suitable parameters and control points are used. However, the nature of the algorithm is such that it is difficult to specify the spatial configuration of the face, due to the straight lines used as control points.

Upon some further research into developments of the algorithm, we found that Birkholz and Jackel have proposed an extension of Beier and Neely’s work whereby feature curves can be used instead of feature lines. This development directly addresses the issue with our current field-warping algorithm, and works on the basis of calculating the shortest distance from the given pixel to the curve, and scaling this to the curve length and arc ratio of the target feature curve. The problem with this development is the computational power required. As performance is already the key issue we have with this algorithm, as explained below, we did not think it necessary to attempt an implementation of it.

Performance testing, as detailed in the Performance Analysis section, has highlighted the timing issues with this algorithm. After testing the Mesh algorithm and seeing the performance gains from restructuring it, we believe we could potentially improve the performance of Beier-Neely by:

• Removing all object creation from the plug-in. Currently, new Coordinate objects are created within the loop, so several objects are created for every feature line. As the algorithm is called on every pixel, a very large number of objects is created for one warp frame. When optimising the Mesh algorithm, we replaced all object creation for storing Coordinates with primitive ints for the x and y components. This brought significant gains.

• Checking the progress through the morph at the start of the algorithm. Currently the algorithm is called even at 0%, effectively executing the heavy algorithm but returning an unaltered image.

• Restructuring the classes so they have a lower cohesion. This would improve the performance greatly, as the sheer number of cross-class calls means we are taking a performance hit. However, as this obviously breaks key software engineering concepts we will not look into it further.

All the above improvements are ones we discovered after rigorously testing and comparing the performance with the mesh algorithm. Unfortunately, as this was not done until late in the development cycle, we have not implemented any of them, instead focussing on adding user functionality.

The improvements we did make, however, include the following:

• Loop analysis to remove all invariants from the loop

• Storing of repeated calculations, such as (q-p) in the calculations of u and v.

7.3 Mesh Warping

7.3.1 Background & Analysis

Mesh warping was pioneered at Industrial Light and Magic (ILM) by Douglas Smythe [Smythe] for use in the movie Willow in 1988. It has been successfully employed in many subsequent motion pictures. The warp is based upon segmenting the image into a quadrilateral mesh [Wolberg90]. For the explanation of Mesh warping the source image will be referred to as IA and the destination image as IB. The source image has a mesh associated with it, called MA. A mesh is a set of control points which correspond to key features of that image. The destination image also has a mesh associated with it, called MB.

All the intermediary frames in the morph are created from the following process:

For each frame f do
    Linearly interpolate mesh M, between MA and MB
    Warp IA to I1, using meshes MA and M
    Warp IB to I2, using meshes MB and M
    Cross fade I1 and I2 to create If
End

In our project we decided to implement Triangulation-based warping [Gosh86]. In Triangulation the source and target images are first dissected into a suitable set of triangles, with the control points as the corners of the triangles.

There are many different methods for optimal triangulation. We chose to implement Delaunay Triangulation [Shew96], which works by maximising the minimum inner angle of all triangles, making them as close to equilateral as possible and avoiding wedge-shaped triangles.

The triangulation can be computed with a divide-and-conquer algorithm of complexity O(n log n), where n is the number of data points. Due to the strict mathematical mapping of pixels from one image to the other, the transformation of each pixel can be computed in constant time. With N pixels in an image the overall complexity is O(n log n + N) [Lee96].

A common problem with triangulation methods is that ‘foldover’ is possible. This occurs when the control points are moved in such a way that they cause self-intersection. This is shown in the diagram below: in the destination image it is clear that the pixels in triangle BCD will come from both source triangles, ABC and BCD. This causes ‘foldover’, as multiple source pixels reference a single destination pixel.

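[Figure: the quadrilateral ABCD in the source image, and the same points in the destination image after a control point has been moved so that the triangles overlap.]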

This is commonly fixed by freezing the user’s movement. This means that the

resulting warp will not be as expressive as possible.

Each of the generated triangles is subject to an affine transformation (Fig…). The coordinates of the three vertices of the source triangle and the corresponding coordinates in the destination triangle allow us to formulate six simultaneous equations in x and y, which are solved to create a mapping function.

[Figure: a triangle with vertices (x0,y0), (x1,y1), (x2,y2) in the Input Image mapped to the triangle with vertices (x’0,y’0), (x’1,y’1), (x’2,y’2) in the Destination Image.]

7.3.2 Design

Due to the plug-in nature of the algorithm section of our project, the lowest-level part of the algorithm, which simply returns a pixel, must be a separate plug-in component. The UML diagram below shows the structure of the algorithm section:

Abstract Result Generator

As detailed above

Mesh Result Generator

The Mesh Result Generator is responsible for the following:

• Performs Delaunay triangulation on the Source Image.

• Performs Match Triangulate on the Destination Image.

• For each frame, interpolates the new mesh.

• Calculates the parameters required to map between source and destination triangles.

• Calculates the bounds for the Fill method.

• Calls the Fill method and, for each pixel encountered, calls the method in ALGMesh to find the replacement pixel.

Foldover

As a group we had to decide how to handle the case of a user moving control points such that the triangulation algorithm would create more triangles in one image than in the other. This would cause foldover and thus undesirable effects in the resulting morph sequence.

The common solution is to freeze the mesh boundaries of MA. This has a detrimental effect on the warp and on the ability of the user to fully control the features they wish to warp.

Preventing the user from moving control points would result in a lot of overhead in the GUI. Instead, the group decided it would be best to let the user move the points wherever they want and then perform a different triangulation algorithm on the destination image, one which always results in both images having the same number of triangles. For the purpose of this report, this second triangulation algorithm is called Match Triangulate.

[UML diagram: the ALGMesh plugin is used by MeshResultGenerator, which extends the abstract class AbstractResultGenerator.]

Match Triangulate

Match Triangulate takes the set of triangles from the Source Image and replaces the coordinates of each vertex with the coordinates of the corresponding point in the Destination Image.
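A minimal sketch of Match Triangulate in Python is given below. It assumes, for illustration, that each triangle is stored as a triple of control-point indices and that the two control-point lists are in corresponding order; the function name match_triangulate is hypothetical.

```python
def match_triangulate(source_triangles, destination_points):
    """Reuse the source connectivity with the destination coordinates.

    source_triangles: triples of indices into the control-point list.
    destination_points: destination coordinates, in the same order as the
    source control points.
    """
    return [tuple(destination_points[i] for i in triangle)
            for triangle in source_triangles]


source_points = [(0, 0), (10, 0), (0, 10), (10, 10)]
destination_points = [(0, 0), (12, 1), (1, 9), (10, 10)]
source_triangles = [(0, 1, 2), (1, 3, 2)]  # e.g. from the Delaunay step
destination_triangles = match_triangulate(source_triangles, destination_points)
```

Because only the coordinates change, both images are guaranteed to have the same number of triangles, with triangle k in one image corresponding to triangle k in the other.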

Interpolating intermediate Meshes

There are two ways in which the intermediate meshes can be calculated. The first is to take an edge from the Input Image and the corresponding edge in the Destination Image and perform a linear interpolation between them. This would have to be done twice in order to obtain values for all three vertices of the triangle. The advantage of this is that linear interpolation of lines would already be coded for the Beier-Neely algorithm. The disadvantage is the overhead of creating line objects and then accessing them multiple times.

The second method is to take each vertex in the Input Image in turn and linearly interpolate towards the corresponding vertex in the Destination Image. This is done by taking a ratio of the x and y coordinates of both vertices. This is a faster method but requires additional code in the Core. The benefit is that it allows a programmer to develop an algorithm plug-in for single Control Points. This means that Radial Basis algorithms can be implemented, as they are point-based warping algorithms.

Fill Method

The Fill method is required to scan within the bounds of the triangle, calling a method in ALGMesh which, given a vector and the set of six parameters, outputs a new vector corresponding to the replacement pixel.

ALGMesh

• Contains the lowest-level method, which returns the coordinate of the replacement pixel.

7.3.3 Implementation

In this section we will discuss the implementation of the Mesh Warping algorithm in

the following order

1) Triangulation

2) Interpolation

3) Solving the Affine transformation

4) The Fill Method

Triangulation

The Delaunay triangulation algorithm which was implemented can be summarised by

the pseudo code below.

subroutine triangulate
input : vertex list + 8 boundary points
output : triangle list
    initialize the triangle list
    determine the supertriangle
    add supertriangle vertices to the end of the vertex list
    add the supertriangle to the triangle list
    for each sample point in the vertex list
        initialize the edge buffer
        for each triangle currently in the triangle list
            calculate the triangle circumcircle center and radius
            if the point lies in the triangle circumcircle then
                add the three triangle edges to the edge buffer
                remove the triangle from the triangle list
            endif
        endfor
        delete all doubly specified edges from the edge buffer
            this leaves the edges of the enclosing polygon only
        add to the triangle list all triangles formed between the point
            and the edges of the enclosing polygon
    endfor
    remove any triangles from the triangle list that use the
        supertriangle vertices
    remove the supertriangle vertices from the vertex list
end
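The pseudocode above translates fairly directly into Python. The sketch below is an illustration rather than the project's implementation: triangles are returned as triples of indices into the point list, the supertriangle size (20 times the extent of the points) is an arbitrary assumption, and degenerate input (duplicate or collinear points) is not handled.

```python
def circumcircle(a, b, c):
    """Centre (x, y) and squared radius of the circle through a, b and c."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def triangulate(points):
    """Return Delaunay triangles as triples of indices into `points`."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    span = max(max(xs) - min(xs), max(ys) - min(ys), 1.0)
    mid_x, mid_y = (min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0
    # Supertriangle: three far-away vertices enclosing every sample point.
    verts = list(points) + [
        (mid_x - 20 * span, mid_y - span),
        (mid_x, mid_y + 20 * span),
        (mid_x + 20 * span, mid_y - span),
    ]
    triangles = [(n, n + 1, n + 2)]
    for i in range(n):
        px, py = verts[i]
        edge_count = {}  # edge buffer: undirected edge -> number of occurrences
        kept = []
        for tri in triangles:
            ux, uy, r2 = circumcircle(*(verts[k] for k in tri))
            if (px - ux) ** 2 + (py - uy) ** 2 < r2:
                a, b, c = tri
                for edge in ((a, b), (b, c), (c, a)):
                    key = (min(edge), max(edge))
                    edge_count[key] = edge_count.get(key, 0) + 1
            else:
                kept.append(tri)
        # Edges seen once bound the enclosing polygon; doubly specified
        # edges are interior and are dropped.
        triangles = kept + [(a, b, i)
                            for (a, b), seen in edge_count.items() if seen == 1]
    # Discard triangles that still use a supertriangle vertex.
    return [tri for tri in triangles if max(tri) < n]


points = [(0, 0), (10, 0), (0, 10), (11, 12), (4, 5)]
triangles = triangulate(points)
```

For these five points (four on the convex hull and one inside) the result is four triangles, none of whose circumcircles contains another point.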

Most implementations of the above algorithm have a complexity of O(n²), where n is the number of vertices. This is a result of the fact that other programs add vertices incrementally, as they usually need to create the mesh dynamically while the user adds more and more control points. Because of this, every time a new point is added, every existing triangle is checked to determine whether its circumcircle encloses the new vertex.

Another disadvantage of adding the vertices incrementally and calling the algorithm n times is that quantities such as the triangle circumcircle data have to be calculated n times.

This is an overhead associated with the common method of freezing the mesh edges to prevent the user from causing foldovers, and it strengthens our decision not to place constraints on control points in the destination image. As a result, in our implementation the complexity is almost a linear function of the number of points.

Black Triangles

The triangulation method above went through two iterations. In the first iteration the vertex list passed to the algorithm was populated with the control points from the GUI as well as the four corners of the picture box, in order to create a triangulation representative of our source image.

Unfortunately, although the majority of the triangulations produced desirable results, a few did not. On a few occasions a triangulation similar to the one below occurred. As a result there was a black triangle in the warped image, which resulted in an undesirable morph.

As shown above, the left edge of the image has been completely ignored by the triangulation algorithm.

In order to fix this we reduced the likelihood of an edge being missed by also adding the mid-points of the four edges (marked in red below). The resulting triangulation was evidently better, and it is unlikely that the user will create a morph that contains black triangles.

Interpolation

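To interpolate the triangles we used point-to-point linear interpolation, as opposed to interpolating the lines. Each vertex (x0, y0) in the source mesh is interpolated towards its corresponding vertex (x1, y1) in the destination mesh to give the vertex (xf, yf) for the current frame:

xf = x0 + (f/n)(x1 - x0)

yf = y0 + (f/n)(y1 - y0)

f = current frame number

n = number of frames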
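As a small worked example, the interpolation formulae can be applied to a whole mesh as follows. The function name interpolate_mesh and the list-of-tuples mesh representation are assumptions made for illustration.

```python
def interpolate_mesh(mesh_a, mesh_b, f, n):
    """Vertices of the intermediate mesh for frame f of n.

    mesh_a and mesh_b are lists of (x, y) vertices in corresponding order.
    """
    t = f / n
    return [(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            for (x0, y0), (x1, y1) in zip(mesh_a, mesh_b)]


mesh_a = [(0, 0), (10, 20)]
mesh_b = [(4, 8), (10, 0)]
frame_1_of_4 = interpolate_mesh(mesh_a, mesh_b, 1, 4)
```

Frame 0 reproduces the source mesh and frame n reproduces the destination mesh.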

Solving the Affine transformation

In order to solve the affine transformation a matrix-solving C# add-on was used which, when given a matrix problem in the common form Ax=b, solves it and returns x. This gave us the unique parameters for each transformation, which are used by ALGMesh to find the coordinates of the replacement pixel. This saved development time and there were no issues involved in integrating this package into our existing program.
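To make the six parameters concrete, the sketch below solves the two 3x3 systems with Cramer's rule instead of an external matrix package; this substitution, and the names solve_affine and apply_affine, are assumptions made for illustration. The mapping has the form x' = a·x + b·y + c and y' = d·x + e·y + f.

```python
def solve_affine(from_triangle, to_triangle):
    """Return ((a, b, c), (d, e, f)) mapping from_triangle onto to_triangle."""
    (x0, y0), (x1, y1), (x2, y2) = from_triangle
    det = x0 * (y1 - y2) - y0 * (x1 - x2) + (x1 * y2 - x2 * y1)
    if det == 0:
        raise ValueError("degenerate triangle: vertices are collinear")

    def solve(r0, r1, r2):
        # Cramer's rule for [x y 1] . [p q r]^T = target value at each vertex.
        p = (r0 * (y1 - y2) - y0 * (r1 - r2) + (r1 * y2 - r2 * y1)) / det
        q = (x0 * (r1 - r2) - r0 * (x1 - x2) + (x1 * r2 - x2 * r1)) / det
        r = (x0 * (y1 * r2 - y2 * r1) - y0 * (x1 * r2 - x2 * r1)
             + r0 * (x1 * y2 - x2 * y1)) / det
        return p, q, r

    xs = [vertex[0] for vertex in to_triangle]
    ys = [vertex[1] for vertex in to_triangle]
    return solve(*xs), solve(*ys)


def apply_affine(params, x, y):
    """Map one point using the six parameters."""
    (a, b, c), (d, e, f) = params
    return a * x + b * y + c, d * x + e * y + f


# Unit right triangle scaled by (2, 3) and translated by (2, 3).
params = solve_affine([(0, 0), (1, 0), (0, 1)], [(2, 3), (4, 3), (2, 6)])
```

For reverse mapping, from_triangle would be the triangle in the frame being painted and to_triangle the triangle in the image being sampled.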

Fill Algorithm

The fill algorithm works from the vertex with the lowest y-value to the vertex with the greatest y-value. On each increment it calls a function which works out the bounds of the scan line. This is a two-step process. First, a call is made to findBoundaries, which takes a Triangle and a y value and returns the three x-coordinates corresponding to the intersections of the three edges with the current scan line. Only two of these x-values will lie on the edges of the triangle, and these are the bounds of the scan line. The Fill Algorithm then scans between the chosen x boundaries and calls the algorithm on the pixels. The diagram below shows how the fill algorithm works.

Ref. http://astronomy.swin.edu.au/~pbourke/modelling/triangulate/

[Figure: the fill algorithm, showing the scan line, the points returned by findBoundaries, the boundary points for filling and the fill direction]
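The scan-line fill described above can be sketched as follows, in Python rather than C#. This is a simplified variant and not the Re-Morph implementation: instead of returning all three intersections and then discarding one, the hypothetical `edge_crossings` helper only returns crossings that lie on an edge.

```python
import math


def edge_crossings(triangle, y):
    """Return the x coordinates where the scan line at y crosses the triangle's edges."""
    xs = []
    for i in range(3):
        (x0, y0), (x1, y1) = triangle[i], triangle[(i + 1) % 3]
        if y0 == y1:
            continue  # a horizontal edge is covered by the other two edges
        if min(y0, y1) <= y <= max(y0, y1):
            xs.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
    return xs


def fill_triangle(triangle, visit):
    """Call visit(x, y) for every integer pixel inside the triangle."""
    ys = [p[1] for p in triangle]
    # Work from the lowest y-value to the greatest y-value.
    for y in range(math.ceil(min(ys)), math.floor(max(ys)) + 1):
        xs = edge_crossings(triangle, y)
        if not xs:
            continue
        # The smallest and largest crossings are the bounds of the scan line.
        for x in range(math.ceil(min(xs)), math.floor(max(xs)) + 1):
            visit(x, y)
```

In the warp, `visit` would look up the replacement pixel for (x, y) using the affine parameters of the triangle being filled.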

7.3.4 Testing

To test the quality of the morphing algorithms we use the same pair of images as input to both algorithms and then compare the results. The inputs to our program were the following two pictures. The control points are marked as the multi-coloured circles on the two faces.

As a group we are very satisfied with the overall result of this morph. The key frame

to look at is the middle frame on the cross-fade. From this we are able to see if there

is any shadow effect. It is apparent that there is a little shadow on the bottom of the

image where there is a difference in where the two jackets start.


7.3.5 Evaluation

On the whole we were very impressed with the speed of this algorithm. However, the code could be optimised further by implementing it in unmanaged C. Unmanaged C allows us to access and manipulate memory directly using pointers rather than references. On the other hand, unmanaged C has an extremely complex syntax and is harder to use, as you need to be more careful and logical when using pointers. If you misuse a pointer you risk overwriting other variables, causing stack overflows or crashing the program. Furthermore, there is no built-in type checking, so if you access a variable of the wrong type, .NET won't execute the statement.

As discussed later in the technical analysis of the solution, the Mesh algorithm could be optimized by checking whether each triangle has been moved before performing the fill algorithm on it. This would prevent unnecessary runs of the fill algorithm, which is computationally quite expensive.
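A minimal sketch of such a check is given below, in Python. It assumes that each triangle is stored as three (x, y) vertices and that the source and destination lists hold corresponding triangles in the same order; neither assumption is confirmed by the Re-Morph source, and the function names are hypothetical.

```python
def triangle_moved(source_tri, dest_tri, tolerance=0.0):
    """Return True if any vertex differs between the two triangles."""
    return any(abs(sx - dx) > tolerance or abs(sy - dy) > tolerance
               for (sx, sy), (dx, dy) in zip(source_tri, dest_tri))


def triangles_to_fill(source_tris, dest_tris):
    """Return the indices of the triangles that actually need the fill algorithm."""
    return [index
            for index, (src, dst) in enumerate(zip(source_tris, dest_tris))
            if triangle_moved(src, dst)]
```

Pixels inside a triangle that has not moved could be copied across unchanged, leaving the fill algorithm to run only on the indices returned.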


8. Performance Analysis

This section of our report analyses our application from a performance point of view. We expect a linear relationship between the time taken for a morph and both the number of anchors and the number of pixels in an image.

We will compare and contrast the two algorithms and aim to reach a conclusion about the optimal number of anchors each algorithm requires for a good morph. We also hope to identify problems or bottlenecks and alter the application to improve the performance that we observe.

The tests were performed on a Dell P4 1.8GHz machine with 256MB of RAM, which should be taken into account when viewing the time taken for a morph; in the labs the performance is considerably better.

The first thing that we investigate is the time taken for a cross dissolve producing 5, 10 and 15 frames as output, for two images, one of 142x190 pixels and another of 285x380 pixels. Unlike the next set, these results are not skewed by .NET optimisations: the first run of the application is slower than subsequent ones, which hugely skewed the original results, so the tests were re-run.

Below are a table and a graph of the times taken for the cross dissolve of the 142x190 image.

Frames                        5      10     15
Time for small image (Sec)    0.5    2.6    4.4

[Graph: Time for small image (Sec) plotted against Frames]


The cross-dissolve is a basic requirement for a morph, and is therefore required for each algorithm. This test gives an indication of the pixel retrieval times and cross-dissolve overhead. As expected, the time taken for the dissolve increases roughly linearly with the total number of frames.

To back this up, below are the table and graph for the larger image.

Frames                        5      10     15
Time for large image (Sec)    5.0    13.5   20.5

[Graph: Time for large image (Sec) plotted against Frames]


Below are the two graphs plotted on the same axes:

[Graph: Time (Sec) plotted against Frames for the small and large images]



As you can see the gradient for the larger image is bigger than that of the smaller.

This is expected as the larger image has more pixels and so the dissolving algorithm

will take longer.
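The two gradients can be checked from the tables above with a least-squares fit. The Python sketch below uses only the measured values quoted in this section; the helper name is hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares straight line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


frames = [5, 10, 15]
small_times = [0.5, 2.6, 4.4]    # 142x190 image, seconds
large_times = [5.0, 13.5, 20.5]  # 285x380 image, seconds

small_slope = least_squares_slope(frames, small_times)  # seconds per frame
large_slope = least_squares_slope(frames, large_times)  # seconds per frame
pixel_ratio = (285 * 380) / (142 * 190)
```

The fitted gradients are about 0.39 and 1.55 seconds per frame, a ratio close to 4, which matches the ratio of the pixel counts of the two images.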


The amount of time taken is surprising. It takes this long because we store every image that is created so that it can be passed back to the user for the teaching mode, and creating the model has a big overhead. In addition, we loop through each pixel of each pair of images, read both values and then take their weighted average, which is an inefficient way to do the cross fade.
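The per-pixel weighted average described above can be sketched as follows. This is a Python illustration, not the C# code of the application; it assumes each image is a list of rows of (r, g, b) tuples and that the weight is the frame number divided by the number of frames.

```python
def cross_dissolve(image_a, image_b, frame, frame_count):
    """Blend two equally sized images for the given frame.

    Each output channel is (1 - t) * a + t * b, where t = frame / frame_count.
    """
    t = frame / frame_count
    return [
        [
            tuple(round((1 - t) * ca + t * cb) for ca, cb in zip(pixel_a, pixel_b))
            for pixel_a, pixel_b in zip(row_a, row_b)
        ]
        for row_a, row_b in zip(image_a, image_b)
    ]
```

Every pixel of both images is visited once per frame, which is why the cost grows with both the frame count and the image size.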

Next we investigate the relationship between the time taken for a morph and the number of anchors placed on the images. We are not hugely concerned with the quality of the morph but still attempt to make good morphs. While testing the application on lab machines we found that the time taken for the same morph varied, so we perform each test 3 times and take an average. In this set of results the first run usually takes longer; this is due to .NET optimisations which speed up later use of some DLLs.

Below are the results from the tests that we performed. The times shown are the time taken to create a five-frame morph sequence. From this point onwards in the report, assume five frames unless stated otherwise.

[Graph: Time (Sec) plotted against No. of control points for Small Image (Beier Neely), Small Image (Mesh), Large Image (Beier Neely) and Large Image (Mesh)]


0 5 6 7 8 9 10 11 20

Small Image (Beier Neely) 2.7 7.1 7.8 9.9 11.3 13.2 18.2

Small Image (Mesh) 2.4 2.7 2.6 2.6 2.6 2.6 2.6

Large Image (Beier Neely) 14.1 31.6 34.4 38.1 44.5 50.2 78.7

Large Image (Mesh) 13.2 13.1 12.8 12.9 13.1 13.1 12.9

The mesh warping results are not surprising on further inspection of the algorithm. If one increases the number of points, the processing time of the initial triangulation component of the algorithm increases; we calculate this to be of O(n) time complexity. This is no problem for modern computers: only if the number of points were in the order of tens of thousands would it become a major factor. The Delaunay triangulation algorithm is only run once at the beginning of the morph, so the only other cost is a very small overhead as the fill algorithm has to traverse more triangles when filling new images. Because both mesh graphs are essentially flat, we are able to say that the performance of the algorithm is independent of the number of anchors, and decreases as the resolution of the image increases. We quartered the number of pixels and saw the morph speed improve by a factor of roughly five to six.

This is reinforced by the results shown for the zero-anchor plot of the algorithm. This time is at an equivalent level to that of the rest of the graph, and indicates that the processing that occurs before the manoeuvring of pixels accounts for the bulk of the time taken. We have eight default points on the image to reduce the loss of parts of the image to blank triangles (explained in the core algorithms section), which means that the triangulation code, the fill code and the parameter solving code are all executed even with no anchors. This indicates that these functions perform independently of the number of anchors.

The Beier Neely results do not show the same flat gradient that the Mesh algorithm exhibits. The gradient for the large picture is in the order of 3 seconds per control point, whereas the gradient for the small picture is in the order of 0.7 seconds per control point. This is expected, as there are approximately four times as many pixels in the large picture as in the small picture. If one extrapolated the results for the small picture, the plot for 0 control points would be close to 3.6s; however, the measured time is actually 2.7s, which is below 3.6s as expected. This shows that the Beier Neely algorithm is responsive to the absence of control points on the image, although there is still no performance gain over Mesh.
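The 3.6s figure follows from a straight-line extrapolation. The short Python sketch below reproduces it, assuming the 7.1s small-image measurement corresponds to 5 control points and using the 0.7 seconds per control point gradient quoted above; the function name is hypothetical.

```python
def extrapolate_to_zero(points, time_at_points, seconds_per_point):
    """Extend a straight line back to zero control points."""
    return time_at_points - points * seconds_per_point


# Assumed reading of the table: 7.1 s was measured with 5 control points.
predicted_zero = extrapolate_to_zero(5, 7.1, 0.7)
measured_zero = 2.7
saving = predicted_zero - measured_zero
```

The predicted value is about 3.6s, so the measured 2.7s is roughly 0.9s faster than the trend would suggest.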

An interesting point that we were not expecting is that the Beier Neely algorithm on a small image with more than 12 anchor points seems to take longer than the Mesh algorithm on a large image with the same number of anchor points. The performance of Beier Neely is therefore consistently dominated by that of the Mesh algorithm.

From the above analysis we can derive that there is a greater gain in performance by

optimizing the code in Beier Neely which deals with the calculation of replacement

pixels when there are control points on the image. The Mesh algorithm would be best

optimized with code that recognizes unchanged triangles.

[Graph: Time (Sec) plotted against number of control points for Small Image and Small Image (Cross Fade)]

The chart above shows the time taken for a complete morph in comparison to the time taken for the cross fade, for a number of different control points.

One significant thing to note from this graph, as with all graphs showing cross fades, is that the cross fade time is independent of the number of anchors: they have no significance for the cross fade and are simply ignored.

From this graph we can see that the difference between the time taken for a morph with no anchors and the time taken for its cross fade is 2 seconds. The code responsible for these 2 seconds can be identified easily. In this case it is the time taken to loop through 142x190 pixels twice for each frame (5 frames in this case, so 10 passes). An instant optimization would therefore be a check for no anchors before this loop.

Another observation is that the cross fade code takes a fraction of the time of a morph and so does not really require optimization.

Below is the same graph for the larger image.

[Graph: Time (Sec) plotted against number of control points for Large Image and Large Image (Cross Fade)]

The same conclusions can be drawn from this graph. In addition, increasing the number of pixels by a factor of 4 has led to the time taken to cross fade increasing by a factor of 10. This is an interesting observation and may be an anomalous result. To analyse this further, the graph below shows the time taken for the cross fade as the number of pixels in the image increases.

[Graph: Time (Sec) plotted against No. of Pixels, with fitted line y = 5E-05x - 0.3]

This indicates a linear relationship in which each additional pixel adds about 5E-5 seconds to the cross fade. The same analysis applies to the Mesh results and so is not included.
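The fitted line y = 5E-05x - 0.3 can be evaluated for any image size. The Python sketch below does so for the two test images; the function name is hypothetical, and it is an assumption that the fit applies to the five-frame cross fade used elsewhere in this section.

```python
def predicted_cross_fade_seconds(pixel_count, slope=5e-05, intercept=-0.3):
    """Evaluate the fitted line y = slope * x + intercept."""
    return slope * pixel_count + intercept


small_pixels = 142 * 190  # 26,980 pixels
large_pixels = 285 * 380  # 108,300 pixels
small_prediction = predicted_cross_fade_seconds(small_pixels)
large_prediction = predicted_cross_fade_seconds(large_pixels)
```

Because of the negative intercept, the line predicts a ratio between the two images that is larger than the ratio of their pixel counts.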

In this next section we compare the previous analysis with the same morphs, but now with two-stage morphing. One would expect the time to increase linearly with the number of stages in the morph. We assume that a one-stage morph with ten frames would have the same characteristics as a two-stage morph with five frames in each stage. The chart below shows our findings.

[Graph: Time (sec) plotted against number of control points for Small Image (Beier Neely), Small Image (Mesh), Large Image (Beier Neely) and Large Image (Mesh)]

No. of control points        5       9       20
Small Image (Beier Neely)    15.2    23.7    37.3
Small Image (Mesh)           6.09    5.95    6.04
Large Image (Beier Neely)    64.2    89.79   161.59
Large Image (Mesh)           27.2    27.1    27.2

Above we examined how each morph behaves as the number of anchors on an image changes, and the same relationships between the algorithms and image sizes hold here, so the same analysis applies. Below is a table of the times for the one-stage investigation.


0 5 6 7 8 9 10 11 20

Small Image (Beier Neely) 2.7 7.1 7.8 9.9 11.3 13.2 18.2

Small Image (Mesh) 2.4 2.7 2.6 2.6 2.6 2.6 2.6

Large Image (Beier Neely) 14.1 31.6 34.4 38.1 44.5 50.2 78.7

Large Image (Mesh) 13.2 13.1 12.8 12.9 13.1 13.1 12.9

Comparing the values, we can see that there is roughly a two to one ratio between the times, with the two-stage morph taking about twice as long as the one-stage morph. This is as expected, as there is double the work.
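The ratio can be checked numerically. The Python sketch below divides each two-stage time by the matching one-stage time. Note the assumption: the one-stage table lists more column headings than values, so the values used for 5, 9 and 20 control points here are our reading of that table, not a confirmed alignment.

```python
# One-stage times (seconds); the column assignment is an assumption.
one_stage = {
    "Small Image (Beier Neely)": {5: 7.1, 9: 13.2, 20: 18.2},
    "Small Image (Mesh)": {5: 2.7, 9: 2.6, 20: 2.6},
    "Large Image (Beier Neely)": {5: 31.6, 9: 50.2, 20: 78.7},
    "Large Image (Mesh)": {5: 13.1, 9: 13.1, 20: 12.9},
}

# Two-stage times (seconds), taken from the two-stage table.
two_stage = {
    "Small Image (Beier Neely)": {5: 15.2, 9: 23.7, 20: 37.3},
    "Small Image (Mesh)": {5: 6.09, 9: 5.95, 20: 6.04},
    "Large Image (Beier Neely)": {5: 64.2, 9: 89.79, 20: 161.59},
    "Large Image (Mesh)": {5: 27.2, 9: 27.1, 20: 27.2},
}


def stage_ratios(one, two):
    """Return two-stage time divided by one-stage time for every entry."""
    return {
        series: {points: two[series][points] / one[series][points]
                 for points in one[series]}
        for series in one
    }


ratios = stage_ratios(one_stage, two_stage)
```

Under that reading, every ratio falls between about 1.8 and 2.3, which is consistent with a two-stage morph doing double the work.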


All of this analysis was very useful in changing the architecture of the application to

the state that it currently is in. The major changes came in our implementation of the

algorithms. We found that placing all of the Mesh code into one class and not making

the process create new instances of other classes improved the performance greatly.

We also moved the bulk of the algorithm into the core and this had the desired effect

of improving the performance.

We did not follow the same approach for the other algorithm because of its nature,

and the fact that a morph with no anchors is 2 seconds longer than just the fade shows

to some extent the performance hit we have taken. We did make some changes

however. Due to time constraints we were not able to go as far as we did with Mesh

(see evaluation of core algorithms), but we made small changes like taking common

parts of repeated calculations out of loops which were executed on each pixel and this

did give a marked improvement.


9. Project Summary

9.1 Dividing the Project

When tackling the project it was decided that the software was best split into three different sections:

1) The GUI

2) The Core

3) The Algorithms

As a group we then collaborated to decide who would work on which areas of the project. The outcome was that Dharmesh Malam and Christopher Roebuck would work on the GUI, Jonathan Enav and Anish Mittal would work on the Core, and Ravi Madlani and Rikin Shah would work on the Algorithms.

To allow the teams to develop concurrently without interference, we created a source

repository enveloped with a continuous integration engine. This led us to adopt many

of the practices of Extreme Programming methodology. After some development

iterations, we decided to re-engineer our practices and devise a more team oriented

plan. As a result, the original teams were broken up, the projects were unified under a

single solution and the development efforts became more centric. This meant that all

developers in the team were responsible for the whole project and not just their own

classes. If code needed improvement in the Core a previous GUI developer would

still take up the task. This helped the global understanding of the project and how

each unit was meant to integrate with all of the others.

9.2 Continuous Integration

Continuous integration ensures that any time code is checked in to the repository it is compiled, regression tested and archived as a release. During heavy coding, this

can be several times a day. Since each of these integration cycles produces working

binaries, it forces the most stringent of practices and ensures that we have a

deliverable available to demonstrate progress at any time.


9.3 Module Integration

We were told early on by our project supervisor Daniel Rueckert that most previous

groups did not realise errors in their algorithms soon enough as they took too long to

develop a GUI to test them. As we did not want to fall into this trap we ensured that

our GUI would have the ability to return visual results from the morphing algorithms

as soon as possible. This allowed us to integrate the three modules and begin ironing

out bugs as soon as possible.

Integration of GUI and Core

The first GUI was extremely simple and primarily used to test the functionality of the

core. On the 1st of November a completely overhauled GUI was uploaded to the

continuous build server. This included support for drag and drop of images and had

the ability for the first time to return visually the results from the Beier Neely

algorithm.

The GUI and core sat together very nicely, and due to the nature of our setup we could easily integrate the two components. The application evolved with these two components sitting alongside each other.

Integration of Algorithms with GUI and Core

Beier Neely:

When integrating the first algorithm, Beier Neely, with the GUI and Core, we came across a few problems. When we saw visually for the first time how the algorithm worked, we realised that there were errors in our code. We had misinterpreted some of the maths from the pseudo code, and the NUnit tests we had run with the primitive GUI were written using the same misinterpreted maths, so all the tests passed. After debugging the algorithm code, we found the functions that were erroneous and corrected them.

After this minor set-back, we made the decision to concentrate on a suitable GUI that

would support the second algorithm, the mesh algorithm, before we actually

integrated the algorithm.

Mesh Algorithm:

When integrating the mesh algorithm, we realised that too much code was in the mesh

plug-in and not enough in meshResultGenerator so code had to be moved around as

there was no clear separation between the GUI and core part of the algorithm.

Once the initial integration was complete we decided to change the entire team structure. This was due to there being too much overlap between the modules once they had been integrated. In order to ensure all manpower was being used in the project we decided to make an Action List. Due to the nature of development after the initial integration, testing was occurring all the time. As a rule, every time a team member tested something he was working on he recorded any fixes required in an Action Point list which was available to all developers. When someone felt they could do something on the Action Point list they checked out the point so the whole team knew what everyone was working on.

This proved to be an extremely effective development methodology and helped us use

our time as efficiently as possible.

Final Usability Testing

Nielsen's Ten Usability Heuristics are listed below, along with a description of how

re-morph attempts to satisfy each heuristic.

1 Visibility of system status -- The system should always keep users informed about

what is going on, through appropriate feedback within reasonable time.

Re-morph keeps users informed about the current system status through the status bar,

and does not allow users to carry out functions (e.g. by graying out options/buttons) if

the system is not in the appropriate state.

2 Match between system and the real world --The system should speak the users'

language, with words, phrases and concepts familiar to the user, rather than system

oriented terms. Follow real-world conventions, making information appear in a

natural and logical order.

Re-morph's interface is intuitive. Placing anchors is a metaphor: once placed, the anchors should not move from their position, as they should be placed on facial landmarks such as eyes, mouth and nose. Each icon clearly represents the function it corresponds to.

3 User control and freedom - Users often choose system functions by mistake and will

need a clearly marked "emergency exit" to leave the unwanted state without having to

go through an extended dialogue.

Re-morph has the functionality to be able to delete unwanted images and anchors, to

quickly switch from one algorithm to another without having to carry out any

additional procedure.

4 Consistency and standards -- Users should not have to wonder whether different

words, situations, or actions mean the same thing. Follow platform conventions.

In Re-morph, each button has one and only one function and this is clear to the user

by its description and its icon, however many functions can be achieved by a variety

of methods, leading to increased usability.

5 Error prevention -- Even better than good error messages is a careful design which

prevents a problem from occurring in the first place.

Re-morph prevents a function being called if the system is not in the correct state for

that function to be carried out. This prevents errors from occurring in the first place.


6 Recognition rather than recall -- Make objects, actions, and options visible. The

user should not have to remember information from one part of the dialogue to

another. Instructions for use of the system should be visible or easily retrievable

whenever appropriate.

All actions can be carried out through the use of clicking a button or through the right

click menu. Each action can be recalled or recognised by its icon or description.

7 Flexibility and efficiency of use -- Accelerators -- unseen by the novice user – may

often speed up the interaction for the expert user such that the system can cater to

both inexperienced and experienced users. Allow users to tailor frequent actions.

Re-morph does not have support for user created macros, however there are a number

of hotkeys which can be used to speed up frequent actions, for example, adding

anchors is a frequent operation which can be sped up through the use of holding shift

down while clicking to add an anchor.

8 Aesthetic and minimalist design -- Dialogues should not contain information which

is irrelevant or rarely needed. Every extra unit of information in a dialogue competes

with the relevant units of information and diminishes their relative visibility.

Re-morph has minimal dialogue, and where dialogue appears it does not overpower the more relevant parts of the forms. It is always clear what the user's next step should be.

9 Help users recognize, diagnose, and recover from errors -- Error messages should

be expressed in plain language (no codes), precisely indicate the problem, and

constructively suggest a solution.

Error messages are infrequent in re-morph because the user is unable to carry out certain functions at inappropriate times. However, any error messages that do appear should be sufficiently detailed for the user to know what has caused the error and how they should go about recovering from it.

10 Help and documentation -- Even though it is better if the system can be used

without documentation, it may be necessary to provide help and documentation. Any

such information should be easy to search, focused on the user's task, list concrete

steps to be carried out, and not be too large.

Re-morph does not come with any documentation as we feel that the interface is

intuitive enough on its own, however each icon has a tooltip describing its function.


9.4 Evaluation

The initial separation of the system into three components, GUI, Core and Algorithms

was, in our opinion, the best approach. It gave us the ability to follow the Model-

View-Controller design pattern and provided a natural separation which fits into the

philosophy of object orientated programming. The current class structure means there

is very low coupling between the classes of each component, so for example changing

a user form will require no alteration in the rest of the system providing it is

compatible with the GUI controller. However, the high cohesion has led to large

performance overheads in the form of excessive class interaction during the warping

process, as described in the Core Algorithm Section. There is a trade-off between a

performance orientated architecture and one that conforms to good software

engineering practices. In our opinion, we made this trade-off too close to good

software design for Beier-Neely, but learning from our mistakes struck the right

balance when implementing the Mesh algorithm.

A key focus throughout the system development has been to ensure that the software

is designed with a user-centric approach. By giving the system to a panel of

independent users we were able to tweak the application and are now confident with

its usability. We performed a blind test, whereby the panel were given two

applications Re-Morph and Morpheus, and were told to comment on their preference.

The feedback we received confirmed that our application had a higher overall

usability, with the teaching mode receiving high praise. However, there were

comments that our application was slower than its counterpart, though this is

understandable as it’s more feature-rich.

In terms of our group dynamic, by adopting XP practices and guidelines such as the use of continuous integration, re-factoring and unit-testing tools, we were able to develop quickly and productively. We held stand-up meetings at regular intervals, starting on a weekly basis and progressing to a daily frequency as the deadline approached. Through these meetings and by pair-programming we found that each pair could be focussed on their individual task whilst receiving guidance from the leader, who maintained the project's focus.

Overall, we are very happy with the application we have produced and are confident

that it stands up well against the existing market leaders.


10. Bibliography

Beier Neely

http://www.hammerhead.com/thad/morph.html (George Wolberg Servey)

http://www.cs.princeton.edu/courses/archive/fall00/cs426/papers/beier92.pdf

http://mambo.ucsc.edu/psl/beier.html

http://www.fmrib.ox.ac.uk/~yongyue/morphing.html

http://www.cs.cornell.edu/zeno/projects/vmorph/MM97/VMorph-MM97.html

Mesh Warping

http://davis.wpi.edu/~matt/courses/morph/2d.htm

http://www.cs.berkeley.edu/~jrs/mesh/

Delaunay Triangulation

http://wwwicg.informatik.uni-rostock.de/~hb01/03%20-%20Birkholz%20-%20Image%20Warping%20with%20Feature%20Curves.pdf

General

rme04 MSC paper Facial Warping and Morphing (2004)

Metise Third Year Project Report (2004)

George Wolberg Image Morphing: A Survey (Visual Computer, vol. 14, pp. 360-372,

1998.)

www.codeproject.com

www.csharpcorner.com

www.citeseer.com


Appendix A. Organisational tools and methods

A number of tools were used to aid the success of the project. This section documents

some of those tools.

Microsoft Visual Studio.NET -Visual Studio was the IDE used to develop the

application with a number of plug-ins:

Visual SourceSafe

Source code management and version control system for Visual Studio.NET.

Members can safely and easily manage source code, Web content, and any

other type of file—all from the comfort and convenience of Visual Studio

.NET

CruiseControl.NET (CCNet)

CCNet consists of a suite of applications, but at its core is the

CruiseControl.NET Server, which is an automated continuous integration

server. The Server automates the integration process by monitoring the team's

source control repository directly. Every time a member commits a new set of

modifications, the server will automatically launch an integration build to

validate the changes. When the build is complete, the server notifies the

developer whether the changes that they committed integrated successfully or

not.

ReSharper

A plug-in that increases productivity in C# by re-factoring code, providing on-the-fly error highlighting and quick error correction.

NUnit

A Unit Testing framework for C#, to enable test-driven development.

Easy Icon Maker and Macromedia Fireworks 8 – To aid in the design of the

project logo and icons.

Agile Methods

• Pair programming – One member codes and another observes and adds input.

(XP – Extreme Programming)

• UML Diagrams to provide a visual conceptualisation of the functionality of

the application. (UP – Unified Process)

• Frequent Stand Up Meetings – Where each member would announce what

they have achieved, what they hope to achieve and any problems they have

encountered (Scrum)

• Continuous Integration – a process that completely rebuilds and tests an

application. (XP - Extreme Programming)

• Test driven development – writing test cases first and then implementing the

code necessary to pass the tests. (XP - Extreme Programming)


Appendix B. Meetings and Pair-Programming Logbooks

Friday 14th October Meeting with dr (All present)

• Positions of Group Leader (Anish Mittal) and Group Secretary (Chris

Roebuck) decided.

• Background to Facial Warping and Morphing discussed.

• Possible Specification and extensions discussed.

Wednesday 19th October Meeting (All present)

• As a group discussed ideas for minimum and extended specification

• Agreed upon first draft of specification

• Agreed on division of responsibilities

Thursday 20th October Meeting with dr

• Went over the specification with dr

• Modified part of the specification to keep targets realistic

Friday 21st October – Deadline for Report 1

Tuesday 18th October Meeting with dr about algorithms (Present: rdm03, rs303, akm103)

• Meeting with dr specifically about algorithms

• Given pseudo code and given advice on how to code a solution to Beier Neely

algorithm

• Anish Mittal(GUI/Project Leader) attended to oversee the structure of the first

algorithm and how it fits in with the project as a whole.

Monday 24th October Meeting (Present: cjr03, dm203)

• After making a start on GUI, realise that a lot of the GUI/Core functionality is

shared

• Discuss the beginnings of what is to become the model which will keep track

of all images and anchors attached

• Agree that to further develop a solution, it would be wise to meet with the

Core developers

Tuesday 25th October Meeting (Present: cjr03, dm203, je203, akm103)

• Discuss the object model, and how it should be engineered to keep track of all

images and anchors

• Agree that the model should be part of the core but will be heavily referenced

by the GUI.


Tuesday 8th November Meeting (Present: je203, rs303, dm203)

• Plug-in Manager updated

Friday 11th November Meeting (Present: all)

• Group meeting to discuss any outstanding content that should be in the second

report

Deadline – Project Report 2

Monday 14th November Meeting (Present: je203, rs303, dm203, rdm203)

• Mesh algorithm integrated into core/GUI and debugged/tested

Wednesday 16th November Meeting (Present: rs303, rdm203)

• Mesh algorithm working ok with GUI/core

• Progress Bar

Monday 21st November Meeting (Present: dm203,cjr03)

• Lots of usability features added to GUI e.g. selecting anchors, modifying

anchors size, shape, colour.

Wednesday 30th November Meeting (Present: akm103, je203)

• XML Serialisation fixed for saving and loading of project files.

Rikin Shah and Jonathan Enav

15th October – Worked on Beier Neely algorithm with Ravi Madlani (8 hours)

20th October – Wrote the line interpolation and cross dissolve (5 hours)

28th October – Integrated Beier Neely with core, re-factoring to new structure (abstractResultGenerator) (8 hours)

4th November – Researched and implemented AVI encoder and player (5 hours)

10th November – Researched and implemented XML serialisation (9 hours)

22nd November – Extended AVI player fundamentals: slider to select individual frames in the video (2 hours)

Dharmesh Malam and Chris Roebuck

18th October – Worked on new GUI with drag and drop of images (4 hours)

20th October – Planned object model to keep track of images and their anchors (5 hours)

28th October – Integrated GUI with Beier Neely algorithm, wrote result form to view intermediary images post morph (9 hours)

5th November – Updated GUI to enable video preview and export (4 hours)

9th November – Wrote code to integrate webcam into program, with the ability to get a live capture direct from webcam into the program ready to be used in the morph (5 hours)

14th November – Wrote, tested and debugged Mesh algorithm with Ravi Madlani (11 hours)

15th November – Cleaned up GUI with code for progress bar and timer for statistical analysis of the program (3 hours)

23rd November – Added many functions to increase usability of GUI, i.e. multi-select of anchors, modifying anchors' colour, size and shape, scaling images (7 hours)

27th November – Polished application, added icons and logos (4 hours)

Ravi Madlani and Anish Mittal

15th October – Researched Beier Neely algorithm (4 hours)

17th October – Designed pseudo-code for Beier Neely (3 hours)

18th October – Implemented Beier Neely with Rikin Shah (8 hours)

2nd November – Researched mesh algorithm and designed appropriate pseudo code (4 hours)

14th November – Implemented mesh algorithm with Dharmesh Malam (11 hours)

20th November – Began technical analysis of program using timer function to compare speed of morphs (7 hours)

15th December – Write-up of technical analysis and mesh algorithm in final report (5 hours)
