Multimodal Interaction
Dr. Mike Spann
m.spann@bham.ac.uk
http://www.eee.bham.ac.uk/spannm

Contents
  Introduction
  ImageJ introduction
  Image representation in ImageJ
  Displaying images in ImageJ
  ImageJ utility class
  Regions of interest
  Writing plug-ins
  OpenCV introduction
  Image handling and display in OpenCV
  Other libraries
  Key tips
  Key resources

Introduction
Two free platforms for developing image processing software are ImageJ and OpenCV
  ImageJ - Java
  OpenCV - C++, C#
Both allow rapid development of image processing algorithms
  ImageJ through the use of plug-ins
Both have extensive libraries of well-known image processing functions for the user to build upon

ImageJ introduction
All of this material is taken from the ImageJ tutorial accessed from my web-site
  http://mtd.fh-hagenberg.at/depot/imaging/imagej/
Also check out the ImageJ home page
  http://rsb.info.nih.gov/ij/
ImageJ is a free image processing system allowing easy development of Java-based image processing algorithms in the form of plug-ins

ImageJ introduction
It comes with a user-friendly GUI and can run either as an application or an applet
  http://rsb.info.nih.gov/ij/applet/

ImageJ introduction
It can handle 8, 16, 24 and 32 bit images
It can handle most standard image formats
  TIFF
  GIF
  JPEG
  BMP
  DICOM
It can handle stacks of images
Also there are plug-ins allowing it to handle movie files
It can also handle regions of interest (ROIs)

ImageJ introduction
The key point about ImageJ is that it is simple to add your own algorithms (written as plug-ins) callable from the front-end GUI
  File I/O is taken care of by ImageJ
  Pixel access is easy from image handles defined within ImageJ

Image representation in ImageJ
ImageJ has 5 built-in image classes
  8 bit grayscale (byte)
  8 bit colour (byte)
  16 bit grayscale (short)
  RGB colour (int)
  32 bit image (float)
It also supports image stacks consisting of images (slices) of the same size

Image representation in ImageJ
ImageJ uses 2 classes to represent and manipulate images
  ImagePlus
    An image is represented by an ImagePlus object
  ImageProcessor
    This holds the pixel data and contains methods to access the pixel data

Image representation in ImageJ
Pixel access methods in ImageProcessor include
  Object getPixels() - returns a reference to the pixel array (need to cast to appropriate type)
  int getHeight() - returns height of pixel array
  int getWidth() - returns width of pixel array
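
As a rough illustration of how these three methods fit together: the array returned by getPixels() is one-dimensional and stored row by row, so the pixel at column x, row y sits at index y*width + x. The sketch below assumes an 8-bit grayscale image and uses a plain byte[] in place of a real ImageProcessor so that it runs without ImageJ. The names PixelArrayDemo, greyAt and meanGrey are made up for this example.

```java
// Sketch only: a plain byte[] stands in for (byte[]) ip.getPixels(),
// and width/height stand in for ip.getWidth()/ip.getHeight().
class PixelArrayDemo {

    /** Grey level (0..255) at column x, row y of a row-major 8-bit image. */
    static int greyAt(byte[] pixels, int width, int x, int y) {
        return pixels[y * width + x] & 0xff; // mask turns the signed byte into 0..255
    }

    /** Mean grey level over the whole image. */
    static double meanGrey(byte[] pixels, int width, int height) {
        long sum = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                sum += greyAt(pixels, width, x, y);
            }
        }
        return (double) sum / (width * height);
    }

    public static void main(String[] args) {
        // A 3x2 test image: first row 0,10,20 and second row 30,40,50.
        byte[] pixels = { 0, 10, 20, 30, 40, 50 };
        System.out.println(greyAt(pixels, 3, 1, 1)); // column 1, row 1
        System.out.println(meanGrey(pixels, 3, 2));
    }
}
```

Inside a plug-in the same loops would run over the array obtained from the ImageProcessor passed to run().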

Image representation in ImageJ
A subclass of ImageProcessor is passed to the run() method of the plug-in filter (see later)
  ByteProcessor - 8 bit grayscale images
  ShortProcessor - 16 bit grayscale images
  ColorProcessor - RGB images
  FloatProcessor - 32 bit floating point images

Image representation in ImageJ
Pixel representation in ImageJ uses the byte datatype for 8-bit grayscale and 8-bit colour images and short for 16-bit grayscale
  byte/short are signed data types
    byte ranges from -128 to 127
    short ranges from -32768 to 32767
  Grayscale values, however, are non-negative (0 to 255 for 8-bit images), so the stored bits have to be read as unsigned

Image representation in ImageJ
To convert a byte to an integer, we need to eliminate the sign bit by masking with 0xff
  byte[] pixels=(byte[]) ip.getPixels();
  int grey=0xff & pixels[j];
Can cast back the other way easily enough
  pixels[j]=(byte) grey;
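
To see why the mask matters, here is a small standalone sketch of the round trip. It assumes nothing about ImageJ beyond the byte storage described above, and the class and method names (GreyLevelDemo, toGrey, toStored) are illustrative only.

```java
// Sketch: reading and writing 8-bit grey levels held in Java's signed byte type.
class GreyLevelDemo {

    /** Interprets the stored byte as an unsigned grey level in 0..255. */
    static int toGrey(byte stored) {
        return stored & 0xff; // keeps the low 8 bits, discarding the sign extension
    }

    /** Stores a grey level (expected to be 0..255) back into a byte. */
    static byte toStored(int grey) {
        return (byte) grey; // narrowing cast keeps the low 8 bits
    }

    public static void main(String[] args) {
        byte[] pixels = { 0, 127, (byte) 128, (byte) 255 }; // stand-in for ip.getPixels()
        for (byte p : pixels) {
            // Without the mask, 128 and 255 would print as -128 and -1.
            System.out.println(p + " -> " + toGrey(p));
        }
    }
}
```

Forgetting the mask is easy to miss because values up to 127 behave correctly; only the brighter half of the range goes negative.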

Image representation in ImageJ
The ColorProcessor returns the pixels as int[] and the RGB values are packed into the one int variable
  int[] pixels=(int[]) ip.getPixels();
  int red=(0xff0000 & pixels[j]) >> 16;
  int green=(0x00ff00 & pixels[j]) >> 8;
  int blue=(0x0000ff & pixels[j]);
[Figure: bit layout of the packed int, bit 31 down to bit 0; red occupies bits 16-23, green bits 8-15, blue bits 0-7]

Image representation in ImageJ
Can reconstitute a packed RGB pixel by shifting the other way:
  pixels[j]=((red & 0xff)<<16)+((green & 0xff)<<8)+(blue & 0xff);
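
The unpacking and repacking expressions can be tried out together in a standalone sketch. The int[] below stands in for the array a ColorProcessor would return; the class name RgbPackDemo and the red/blue swap are invented for the example and are not part of ImageJ.

```java
// Sketch: unpacking and repacking int pixels that hold packed RGB values.
class RgbPackDemo {

    static int red(int pixel)   { return (pixel & 0xff0000) >> 16; }
    static int green(int pixel) { return (pixel & 0x00ff00) >> 8; }
    static int blue(int pixel)  { return pixel & 0x0000ff; }

    /** Packs three 8-bit components into one int, red in the highest of the three bytes. */
    static int pack(int red, int green, int blue) {
        return ((red & 0xff) << 16) + ((green & 0xff) << 8) + (blue & 0xff);
    }

    /** Swaps the red and blue components of every pixel in place. */
    static void swapRedBlue(int[] pixels) {
        for (int j = 0; j < pixels.length; j++) {
            int p = pixels[j];
            pixels[j] = pack(blue(p), green(p), red(p));
        }
    }

    public static void main(String[] args) {
        int[] pixels = { 0x102030, 0xff0000 }; // stand-in for (int[]) ip.getPixels()
        swapRedBlue(pixels);
        System.out.println(Integer.toHexString(pixels[0]));
        System.out.println(Integer.toHexString(pixels[1]));
    }
}
```

Masking before shifting in red() means any bits above bit 23 are ignored, so the extraction works whatever the top byte contains.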

Displaying images in ImageJ
A class ImageWindow is used to display ImagePlus objects
We don't normally need to access methods of ImageWindow
  These are automatically called from ImagePlus methods show(), draw() and updateAndDraw()

ImageJ utility class
ImageJ contains a class called IJ which contains a number of useful static methods
Error messages
  static void error(String message) - displays an error message in a dialog box
  static boolean showMessageWithCancel(String title, String message) - allows the user to cancel the plug-in or continue

ImageJ utility class
Displaying text
  static void write(String s) - outputs text in a window; useful for displaying textual or numerical results of algorithms
Displaying text in a status bar
  static void showStatus(String s)

ImageJ utility class
User input
  static double getNumber(String prompt, double defaultValue) - allows the user to input a number in a dialog box
  static String getString(String prompt, String defaultString) - allows the user to input a string in a dialog box
The GenericDialog class is a more sophisticated way of inputting more than a single number or string

Regions of interest (ROIs)
A plug-in filter does not always have to work on the whole image
ImageJ supports ROIs, which are usually rectangular but can be other shapes as well
We set/get the ROI using the following methods of the ImageProcessor class
  void setRoi(int x, int y, int w, int h)
  Rectangle getRoi()
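
A minimal sketch of what restricting a filter to a rectangular ROI might look like. It assumes an 8-bit grayscale image stored row by row and uses java.awt.Rectangle directly, with a plain byte[] standing in for the ImageProcessor's pixel array; RoiDemo and invertInside are hypothetical names.

```java
import java.awt.Rectangle;

// Sketch: process only the pixels that fall inside a rectangular region of interest.
class RoiDemo {

    /** Inverts the pixels inside roi; pixels is a row-major 8-bit image of the given width. */
    static void invertInside(byte[] pixels, int width, Rectangle roi) {
        for (int y = roi.y; y < roi.y + roi.height; y++) {
            for (int x = roi.x; x < roi.x + roi.width; x++) {
                int i = y * width + x;
                pixels[i] = (byte) (255 - (pixels[i] & 0xff));
            }
        }
    }

    public static void main(String[] args) {
        byte[] pixels = new byte[4 * 4];        // 4x4 image, all black
        Rectangle roi = new Rectangle(1, 1, 2, 2); // x, y, width, height
        invertInside(pixels, 4, roi);
        System.out.println(pixels[5] & 0xff);   // a pixel inside the ROI
        System.out.println(pixels[0] & 0xff);   // a pixel outside the ROI
    }
}
```

In a plug-in the Rectangle would come from the ImageProcessor rather than being constructed by hand.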

Writing plug-ins
To write a plug-in requires developing a class which implements either the PlugIn or PlugInFilter interface
  The second is more usual as this is used when the filter requires an input image
  import ij.*;
  import ij.plugin.filter.PlugInFilter;
  import ij.process.*;
  class myPlugin implements PlugInFilter

Writing plug-ins
Methods setup() and run() must then be provided
Method setup() sets up the plug-in filter for use
  int setup(String arg, ImagePlus imp)
  String arg allows input arguments to be passed to the plug-in
  Argument imp is handled automatically by ImageJ
    It's the currently active image

Writing plug-ins
Method setup() returns a flag word representing the capabilities of the plug-in filter (for example the types of images it can handle). For example:
  static int DOES_8G
  static int DOES_RGB
  static int DOES_ALL
  static int NO_CHANGES (plug-in doesn't change the image data)
  static int ROI_REQUIRED
  etc
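
A flag word is an int in which each capability occupies its own bit, so several capabilities can be combined into a single return value. The sketch below shows the mechanism only. The numeric values are placeholders chosen for this example and should not be relied on; a real plug-in uses the constants defined by the PlugInFilter interface. FlagWordDemo and has() are made-up names.

```java
// Sketch of how a flag word works. The numeric values are placeholders
// for illustration; real plug-ins use the constants that ImageJ defines.
class FlagWordDemo {
    static final int DOES_8G      = 1; // placeholder bit
    static final int DOES_RGB     = 2; // placeholder bit
    static final int NO_CHANGES   = 4; // placeholder bit
    static final int ROI_REQUIRED = 8; // placeholder bit

    /** The kind of value a setup() method might return: several flags combined. */
    static int exampleSetupResult() {
        return DOES_8G | DOES_RGB | NO_CHANGES;
    }

    /** True if the given capability bit is set in the flag word. */
    static boolean has(int flags, int capability) {
        return (flags & capability) != 0;
    }

    public static void main(String[] args) {
        int flags = exampleSetupResult();
        System.out.println("handles RGB:  " + has(flags, DOES_RGB));
        System.out.println("requires ROI: " + has(flags, ROI_REQUIRED));
    }
}
```

Because the bits do not overlap, combining flags with + gives the same result as |.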

Writing plug-ins
The run() method is called when the plug-in is run from the ImageJ menu
It contains the code to process individual pixels
  If no input image is required
    void run(String arg)
  If an input image is required
    void run(ImageProcessor ip)

Writing plug-ins
Once the Java program is written and compiled, the .class file is stored in the plugins directory, which is a sub-directory of the ImageJ home directory
The class name should contain an underscore symbol (e.g. MyAlgorithm_)
It then appears under the Plugins sub-menu of the ImageJ GUI
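
Putting the pieces of this section together, a complete plug-in filter might look roughly like the sketch below, which inverts an 8-bit grayscale image. So that it compiles without ij.jar, the file declares its own minimal stand-ins for ImagePlus, ImageProcessor and PlugInFilter, exposing only the methods named in these slides; these stand-ins, the placeholder value of DOES_8G and the class name Invert_Demo are all assumptions made for illustration. In a real plug-in the stand-ins would be deleted and replaced by the imports shown earlier.

```java
// ---- Stand-ins (assumptions) so the sketch compiles without ij.jar. ----
// In a real plug-in, remove these three types and use instead:
//   import ij.*; import ij.process.*; import ij.plugin.filter.PlugInFilter;
class ImagePlus { }

class ImageProcessor {
    private final byte[] pixels;
    private final int width, height;
    ImageProcessor(int width, int height, byte[] pixels) {
        this.width = width; this.height = height; this.pixels = pixels;
    }
    Object getPixels() { return pixels; }
    int getWidth()  { return width; }
    int getHeight() { return height; }
}

interface PlugInFilter {
    int DOES_8G = 1; // placeholder value for the sketch
    int setup(String arg, ImagePlus imp);
    void run(ImageProcessor ip);
}

// ---- The plug-in itself: inverts an 8-bit grayscale image in place. ----
// Under real ImageJ the class would normally also be declared public.
class Invert_Demo implements PlugInFilter {

    public int setup(String arg, ImagePlus imp) {
        return DOES_8G; // declare that only 8-bit grayscale images are handled
    }

    public void run(ImageProcessor ip) {
        byte[] pixels = (byte[]) ip.getPixels();
        int width = ip.getWidth();
        int height = ip.getHeight();
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int j = y * width + x;
                int grey = 0xff & pixels[j];     // signed byte to 0..255
                pixels[j] = (byte) (255 - grey); // inverted value, stored back
            }
        }
    }

    public static void main(String[] args) {
        byte[] data = { 0, 100, (byte) 255, 55 }; // a 2x2 test image
        Invert_Demo plugin = new Invert_Demo();
        plugin.setup("", new ImagePlus());
        plugin.run(new ImageProcessor(2, 2, data));
        for (byte b : data) System.out.println(b & 0xff);
    }
}
```

The main() method is only there to exercise the sketch; ImageJ itself calls setup() and run() when the plug-in is chosen from the menu.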

OpenCV introduction
Based on the Intel Image Processing Library (IPL)
  More than 300 functions
Features
  Multi-platform
  Provides a simple window manager with sliders, mouse call backs etc
  Uses Intel Integrated Performance Primitives to enhance performance
  Tutorials & Documentation
  Sample code
Download from
  sourceforge.net/projects/opencvlibrary/
Operates under
  Windows 95/98/NT/2000/XP
  POSIX Linux/BSD/OSX/Win2K/WinXP

Image handling and display in OpenCV
Still images from file
  Using cvLoadImage()
    IplImage* cvLoadImage(const char* filename, int iscolor=1);
  Returns a pointer to an IplImage
  filename: the name of the image file
  iscolor: >0 image is loaded as 3 channel colour
           0 image is loaded as grey-level
           <0 number of channels determined by file
File formats
  Windows bitmaps: bmp, DIB
  JPEG files: jpeg, jpg, jpe
  Portable Network Graphics: png
  Portable image format: pbm, pgm, ppm
  Sun raster: sr, ras
  TIFF files: tiff, tif

Image handling and display in OpenCV
Simple code to load and display an image
  Uses library HighGUI
  Need to register image window and display it
    int cvNamedWindow(const char* name, int flags);
      name: id name and text on image banner
      flags: 1, autosize to fit the image
    void cvShowImage(const char* name, const CvArr* img);
      name: id name and text on image banner
      img: pointer to image to be displayed
    CvCapture* cvCaptureFromFile(const char* fname);
      fname: video file name
    CvCapture* cvCaptureFromCAM(int index);
      index: positive integer specifies video source

/* usage: prog <image_name> */
#include "cv.h"      /* OpenCV data types and prototypes */
#include "highgui.h" /* Image read and display */

int main(int argc, char** argv)
{
    IplImage* img;

    if (argc == 2 && (img = cvLoadImage(argv[1], 1)) != 0) {
        cvNamedWindow("Image view", 1);
        cvShowImage("Image view", img);
        cvWaitKey(0); /* very important, contains event processing loop inside */
        cvDestroyWindow("Image view");
        cvReleaseImage(&img);
        return 0;
    }
    return -1;
}

Other libraries: CV
Image Processing
  Edges, filters, morphologic operators, pyramids, histograms, boundary tracing
Structural analysis
  Contour processing, computational geometry
Motion Analysis and Object Tracking
  Object tracking, optical flow, motion templates
Pattern Recognition
Camera calibration and 3D Reconstruction
  Camera calibration, pose estimation, epipolar geometry

Other libraries: cxcore
Operations on arrays
  Linear algebra, statistics, random number generator
Dynamic structures
  Trees, graphs, sets, sequences
Drawing
  Curves, text, point sets, contours

Other libraries: ML
Machine Learning
  Bayes classifier
  K-nearest neighbour classifier
  Support Vector Machine
  Decision Trees
  Boosting
  Random Forest
  Expectation Maximisation
  Neural Networks

Other libraries: CvCam/Cvaux
CvCam
  Video Input / Output
  Select and set up a camera
  Render a video stream
  Control the video
  Process video frames
  Multiple cameras under Linux and Windows
Cvaux
  Experimental functions

Key tips
IplImage data types
  IPL_DEPTH_8U unsigned 8 bit
  IPL_DEPTH_8S signed 8 bit
  IPL_DEPTH_16U unsigned 16 bit
  IPL_DEPTH_16S signed 16 bit
  IPL_DEPTH_32S signed 32 bit
  IPL_DEPTH_32F single precision floating point
  IPL_DEPTH_64F double precision floating point
Channels: 1, 2, 3 or 4
  For colour images the channels are interleaved in BGR order; the sequence is: b0, g0, r0, b1, g1, r1

Key tips
Accessing image pixels
  Coordinates are indexed from 0
  Origin is top left or bottom left
  To access an element of an 8-bit 1-channel image, I (IplImage* img):
    I(x,y) ~ ((uchar*)(img->imageData + img->widthStep*y))[x]

Key tips
To access an element of an 8-bit 3-channel image, I (IplImage* img):
  I(x,y)blue ~ ((uchar*)(img->imageData + img->widthStep*y))[x*3]
  I(x,y)green ~ ((uchar*)(img->imageData + img->widthStep*y))[x*3+1]
  I(x,y)red ~ ((uchar*)(img->imageData + img->widthStep*y))[x*3+2]
To access an element of a 32-bit floating point 1-channel image, I (IplImage* img):
  I(x,y) ~ ((float*)(img->imageData + img->widthStep*y))[x]
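
The address arithmetic in these expressions can be modelled outside OpenCV. The sketch below is written in Java, to match the other added examples, and is not OpenCV code: a byte[] stands in for img->imageData, and the fields mirror the IplImage members width, height, nChannels and widthStep. The point it illustrates is that widthStep is the length of a row in bytes, which can be larger than width times the number of channels when rows are padded for alignment, so rows must be stepped by widthStep rather than by width. The class name StridedImageDemo and the padding value of 16 are assumptions for the example.

```java
// Sketch: models the offset calculation for an 8-bit interleaved image only.
class StridedImageDemo {
    final int width, height, channels, widthStep;
    final byte[] imageData;

    StridedImageDemo(int width, int height, int channels, int widthStep) {
        this.width = width;
        this.height = height;
        this.channels = channels;
        this.widthStep = widthStep; // bytes per row, including any padding
        this.imageData = new byte[widthStep * height];
    }

    /** Offset of channel c of pixel (x, y): widthStep*y + x*channels + c. */
    int offset(int x, int y, int c) {
        return widthStep * y + x * channels + c;
    }

    int get(int x, int y, int c) {
        return imageData[offset(x, y, c)] & 0xff;
    }

    void set(int x, int y, int c, int value) {
        imageData[offset(x, y, c)] = (byte) value;
    }

    public static void main(String[] args) {
        // 5 pixels wide, 3 channels: 15 bytes of pixel data per row, padded here to 16.
        StridedImageDemo img = new StridedImageDemo(5, 2, 3, 16);
        img.set(4, 1, 2, 200); // channel 2 is red in BGR order
        System.out.println(img.offset(4, 1, 2));
        System.out.println(img.get(4, 1, 2));
    }
}
```

Using width*channels in place of widthStep would give the same offsets only when rows happen to have no padding.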

Key tips
In general, to access an element of an N-channel image of type T
  I(x,y)c ~ ((T*)(img->imageData + img->widthStep*y))[x*N+c]
More simply, using the provided macro
  CV_IMAGE_ELEM(img, T, y, x*N+c)
Efficient way to increment brightness of point (100, 100) by 30
  CvPoint pt = {100, 100};
  uchar *tmp_ptr=&((uchar*)(img->imageData + img->widthStep*pt.y))[pt.x*3];
  tmp_ptr[0]+=30;
  tmp_ptr[1]+=30;
  tmp_ptr[2]+=30;
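
One thing to watch with the brightness snippet: arithmetic on an unsigned 8-bit value wraps around, so adding 30 to a channel that already holds 240 gives 14 rather than 255. Whether that matters depends on the application. A clamped version of the same update is sketched below in Java, to match the other added examples, with a byte[] standing in for img->imageData. BrightnessDemo, addSaturated and brighten are illustrative names, not OpenCV functions.

```java
// Sketch: brightness increment with clamping to the 0..255 range.
class BrightnessDemo {

    /** Adds delta to an 8-bit value and clamps the result to 0..255. */
    static int addSaturated(int value, int delta) {
        return Math.max(0, Math.min(255, value + delta));
    }

    /** Brightens the three channels of pixel (x, y) in an interleaved 8-bit, 3-channel buffer. */
    static void brighten(byte[] imageData, int widthStep, int x, int y, int delta) {
        int base = widthStep * y + x * 3; // same offset that tmp_ptr points at
        for (int c = 0; c < 3; c++) {
            int v = imageData[base + c] & 0xff;
            imageData[base + c] = (byte) addSaturated(v, delta);
        }
    }

    public static void main(String[] args) {
        byte[] row = new byte[8];   // one row, two 3-channel pixels, padded to 8 bytes
        row[3] = 10;                // pixel (1,0): blue
        row[4] = (byte) 240;        // pixel (1,0): green
        row[5] = (byte) 255;        // pixel (1,0): red
        brighten(row, 8, 1, 0, 30);
        System.out.println((row[3] & 0xff) + " " + (row[4] & 0xff) + " " + (row[5] & 0xff));
    }
}
```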

Key resources
A brief intro to OpenCV:
  www.cs.iit.edu/~agam/cs512/lect-notes/opencv-intro/index.html
A more detailed guide (also describes DirectX/DirectShow):
  http://www.site.uottawa.ca/~laganier/tutorial/opencv+directshow/cvision.htm
Wikipedia
  http://en.wikipedia.org/wiki/OpenCV
Book
  Learning OpenCV: Computer Vision with the OpenCV Library (Paperback), Gary Bradski and Adrian Kaehler, O'Reilly Media, Inc. (15 Jul 2008), ISBN 0596516134