Matteo Valoriani
CEO of
Speaker and Consultant
PhD at Politecnico di Milano
Microsoft MVP
Intel Software Innovator
@MatteoValoriani
https://it.linkedin.com/in/matteovaloriani
Nice to Meet You
Clemente Giorio
Senior Developer at
Speaker, Author and Instructor
Microsoft MVP
@Tinux80
http://it.linkedin.com/pub/clemente-giorio/11/618/3a
Agenda
• Intro
• The Sensor
• Data Source
• Kinect Evolution
• Store App
• Body Frame
• Coordinate Mapper
• Kinect Studio
• Gesture Recognition
• Gesture Builder
You have to be a magician…
… or at least a good illusionist
Kinect 2 - Specs
Components: 3D depth sensor, RGB camera, multi-array microphone
Hardware:
Depth resolution: 512×424
RGB resolution: 1920×1080 (16:9)
Frame rate: 30 FPS
Mic sampling rate: 48 kHz
ColorFrameSource
• 1920×1080 array of color pixels
• 30 or 15 fps, depending on lighting conditions
• Converted image formats: RGBA, BGRA, YUY2, …
• Raw data: YUY2
DepthFrameSource
• Range: 0.5 m (near) to 4.5 m (far), with extended depth up to 8 m
• Pixel data: 16-bit distance in millimeters from the sensor's focal plane
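Because every depth pixel is a 16-bit distance in millimeters, making a frame displayable is mostly a range check and a scale. The following is a minimal sketch, assuming the frame data has already been copied into a ushort[] (for example with DepthFrame.CopyFrameDataToArray). The class name, the method name, and the 500 to 4500 mm defaults are illustrative choices based on the range quoted above, not SDK constants.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Kinect SDK.
static class DepthSketch
{
    // Maps depth values in millimeters to 8-bit gray levels.
    // Values outside [minMm, maxMm] become black; this includes 0,
    // which the sensor typically reports when it has no reading.
    public static byte[] ToGray(ushort[] depthMm, ushort minMm = 500, ushort maxMm = 4500)
    {
        if (depthMm == null) throw new ArgumentNullException("depthMm");
        if (maxMm <= minMm) throw new ArgumentException("maxMm must be greater than minMm");

        var gray = new byte[depthMm.Length];
        for (int i = 0; i < depthMm.Length; i++)
        {
            ushort d = depthMm[i];
            if (d < minMm || d > maxMm)
            {
                gray[i] = 0;
                continue;
            }

            // Near pixels are bright, far pixels are dark.
            gray[i] = (byte)(255 - (d - minMm) * 255 / (maxMm - minMm));
        }
        return gray;
    }
}
```

The resulting byte array holds one gray value per depth pixel, so for a 512×424 frame it can back an 8-bit grayscale bitmap of the same size.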
BodyIndexFrameSource
• Pixel data 0 to 5: index of the corresponding body, as tracked by the body source
• Pixel data > 5: no tracked body at that pixel
(Figure: body index image with two bodies labeled 0 and 1 and the background labeled 255.)
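The body index encoding can be read with a single comparison per pixel. A small sketch, assuming the frame data has been copied into a byte[] (for example with BodyIndexFrame.CopyFrameDataToArray); the class and method names are hypothetical.

```csharp
// Hypothetical helper for illustration; not part of the Kinect SDK.
static class BodyIndexSketch
{
    // The body source tracks up to six bodies, indexed 0 to 5.
    public const int MaxBodies = 6;

    // True when the pixel belongs to a tracked body, false for background.
    public static bool IsBody(byte pixel)
    {
        return pixel < MaxBodies;
    }

    // Counts how many pixels each body covers in one body index frame.
    public static int[] CountPixelsPerBody(byte[] bodyIndexData)
    {
        var counts = new int[MaxBodies];
        foreach (byte pixel in bodyIndexData)
        {
            if (pixel < MaxBodies)
            {
                counts[pixel]++;
            }
        }
        return counts;
    }
}
```

A per-body pixel count like this is a cheap way to pick the body that occupies the most screen space, for instance to decide which player is closest to the sensor.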
Feature          Version 1              Version 2
Depth range      0.4 m → 4.0 m          0.5 m → 4.5 m
Color stream     640×480 @ 30 fps       1920×1080 @ 30 fps
Depth stream     320×240                512×424
Infrared stream  None                   512×424
Type of light    Light coding           ToF (time of flight)
Audio stream     4-mic array, 16 kHz    4-mic array, 48 kHz
USB              2.0                    3.0
Bodies tracked   2 (+4)                 6
Joints           20                     25
Hand tracking    External tools         Yes
Face tracking    Yes                    Yes + expressions
FOV              57° H, 43° V           70° H, 60° V
Tilt             Motorized              Manual
System / Software Requirements
OS        Windows 8, 8.1, Embedded 8, Embedded 8.1 (x64)
CPU       Intel Core i7 (recommended)
RAM       4 GB (or more recommended)
GPU       DirectX 11 (required)
USB       USB 3.0 (Intel or Renesas chipsets)
Compiler  Visual Studio 2012, 2013 (Express editions supported)
Language  Native (C++), Managed (C#, VB.NET), WinRT (C#, HTML)
Other     Unity (plugin), Cinder, openFrameworks (wrapper)
Basic Flow of Programming
Kinect for Windows SDK v1: Sensor → Stream → Frame → Data
Kinect for Windows SDK v2: Sensor → Source → Reader → Frame → Data
• Each kind of data has its own independent source (e.g. ColorSource, DepthSource, InfraredSource, BodyIndexSource, BodySource, …)
• Sources do not depend on each other (e.g. you do not need the depth source to retrieve body data)
Creating a new store app using Kinect
1. In “New Project”, create a new Windows Store app
2. Enable the Microphone and Webcam capabilities
3. Add a reference to Microsoft.Kinect
4. Use the Microsoft.Kinect namespace in your code
The KinectSensor class
• Represents a single physical sensor
• Always valid: when the device is disconnected, no more frames are generated
• Use the IsAvailable property to verify that the device is connected
this.sensor = KinectSensor.GetDefault();
this.sensor.Open();
// Make the world a better place with Kinect
this.sensor.Close();
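Since the sensor object stays valid while the device is unplugged, an app usually wants to react when availability changes rather than check it once. A minimal sketch of that pattern, assuming the Microsoft.Kinect namespace named on the slides; the class name SensorSketch and its Start/Stop methods are hypothetical.

```csharp
using System.Diagnostics;
using Microsoft.Kinect;

// Hypothetical wrapper for illustration.
class SensorSketch
{
    private KinectSensor sensor;

    public void Start()
    {
        // GetDefault returns a sensor object even when no device is plugged in.
        sensor = KinectSensor.GetDefault();
        sensor.IsAvailableChanged += OnIsAvailableChanged;
        sensor.Open();
    }

    private void OnIsAvailableChanged(object sender, IsAvailableChangedEventArgs args)
    {
        // Fires when the device is connected or disconnected.
        Debug.WriteLine(args.IsAvailable ? "Kinect available" : "Kinect not available");
    }

    public void Stop()
    {
        if (sensor != null)
        {
            sensor.IsAvailableChanged -= OnIsAvailableChanged;
            sensor.Close();
            sensor = null;
        }
    }
}
```

A typical use is to show a "please connect your Kinect" message in the handler while IsAvailable is false.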
Readers
• Give access to frames
  – Events
  – Polling
• Multiple readers may be created on a single source
• Readers can be paused
InfraredFrameReader reader = sensor.InfraredFrameSource.OpenReader();
reader.FrameArrived += InfraredReaderFrameArrived;
...
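The event subscription above is one of the two access styles. Polling and pausing can be sketched as follows, assuming a reader that was already opened on the infrared source; the class and method names are hypothetical.

```csharp
using Microsoft.Kinect;

// Hypothetical helper for illustration.
static class ReaderSketch
{
    // Polls the reader once. Returns true when a new frame was available.
    public static bool TryPoll(InfraredFrameReader reader)
    {
        // AcquireLatestFrame returns null when no new frame has arrived.
        using (InfraredFrame frame = reader.AcquireLatestFrame())
        {
            if (frame == null)
            {
                return false;
            }

            // Get what you need from the frame here.
            return true;
        }
    }

    // Pauses or resumes frame delivery without closing the reader.
    public static void SetPaused(InfraredFrameReader reader, bool paused)
    {
        reader.IsPaused = paused;
    }
}
```

Polling fits apps that already run their own update loop, such as games, while events fit UI-driven apps.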
Frame references
void InfraredReaderFrameArrived(InfraredFrameReader sender, InfraredFrameArrivedEventArgs args)
{
    using (InfraredFrame frame = args.FrameReference.AcquireFrame())
    {
        if (frame != null)
        {
            // Get what you need from the frame
        }
    }
}
Sent in frame event args
AcquireFrame gives access to the actual frame
Frames
• Give access to the frame data
  – Make a local copy or access the underlying buffer directly
• Contain metadata for the frame
  – e.g. Color: format, width, height, etc.
• Important: minimize how long you hold onto the frame
  – If you do not dispose frames, you will stop receiving new ones
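One way to keep the hold time short is to copy the pixels into a reusable array and dispose the frame before doing any real work. A sketch under that assumption; the class name and the Process method are hypothetical placeholders.

```csharp
using Microsoft.Kinect;

// Hypothetical handler class for illustration.
class InfraredCopySketch
{
    // Reused across frames to avoid allocating on every event.
    private ushort[] pixels;

    public void OnFrameArrived(object sender, InfraredFrameArrivedEventArgs args)
    {
        using (InfraredFrame frame = args.FrameReference.AcquireFrame())
        {
            if (frame == null)
            {
                return;
            }

            // Frame metadata: size of the image in pixels.
            FrameDescription description = frame.FrameDescription;
            if (pixels == null || pixels.Length != description.LengthInPixels)
            {
                pixels = new ushort[description.LengthInPixels];
            }

            // Local copy of the frame data.
            frame.CopyFrameDataToArray(pixels);
        } // The frame is disposed here, before the slow work starts.

        Process(pixels);
    }

    private void Process(ushort[] data)
    {
        // Placeholder for application-specific processing.
    }
}
```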
MultiSourceFrameReader
• Allows the app to get a matched set of frames from multiple sources on a single event
• Delivers frames at the lowest FPS of the selected sources
MultiSourceFrameReader multiReader = sensor.OpenMultiSourceFrameReader(
    FrameSourceTypes.Color | FrameSourceTypes.BodyIndex | FrameSourceTypes.Body);
...
var frame = args.FrameReference.AcquireFrame();
if (frame != null)
{
    using (var colorFrame = frame.ColorFrameReference.AcquireFrame())
    using (var bodyFrame = frame.BodyFrameReference.AcquireFrame())
    using (var bodyIndexFrame = frame.BodyIndexFrameReference.AcquireFrame())
    {
        // Use the matched frames here; check each one for null first
    }
}
BodyFrameSource
• Range: 0.5 to 4.5 meters
• Frame data is a collection of Body objects, each with 25 joints
• Each joint has a position in 3D space and an orientation
• Up to 6 simultaneous bodies
• 30 fps
• Hand state on 2 bodies
• Lean
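Reading the body data listed above could look like the following sketch. It assumes a BodyFrameReader whose FrameArrived event is wired to this handler; the class name is hypothetical, and the handler only prints a few of the available values.

```csharp
using System.Diagnostics;
using Microsoft.Kinect;

// Hypothetical handler class for illustration.
class BodySketch
{
    // Reused across frames; the SDK refreshes the Body objects in place.
    private Body[] bodies;

    public void OnBodyFrameArrived(object sender, BodyFrameArrivedEventArgs args)
    {
        using (BodyFrame frame = args.FrameReference.AcquireFrame())
        {
            if (frame == null)
            {
                return;
            }

            if (bodies == null)
            {
                bodies = new Body[frame.BodyCount];
            }
            frame.GetAndRefreshBodyData(bodies);
        }

        foreach (Body body in bodies)
        {
            if (body == null || !body.IsTracked)
            {
                continue;
            }

            // Joint position in camera space, in meters.
            CameraSpacePoint head = body.Joints[JointType.Head].Position;
            Debug.WriteLine("Body " + body.TrackingId
                + " head at (" + head.X + ", " + head.Y + ", " + head.Z + ")"
                + ", right hand: " + body.HandRightState
                + ", lean: (" + body.Lean.X + ", " + body.Lean.Y + ")");
        }
    }
}
```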
Skeletal Tracking Features
• Improved reliability and accuracy
  – More reliable lock-on and more stable joints
• More anatomically correct skeleton
  – Hips in the right place, new shoulder parent
• Six players tracked at all times
  – Simplified engagement, bystander involvement
• Hand-tip and thumb joints
  – Enable subtle and more nuanced hand gestures
• Per-joint orientation
  – Great for character retargeting
Coordinate System
• ColorSpace (coordinate system of the color image): Color
• DepthSpace (coordinate system of the depth data): Depth, Infrared, BodyIndex
• CameraSpace (coordinate system with its origin at the depth sensor): Body (Joint)
Coordinate mapping
• Three coordinate systems
• The coordinate mapper provides conversions between each system
• Convert single or multiple points

Name              Applies to                   Dimensions  Units   Range      Origin
ColorSpacePoint   Color                        2           pixels  1920×1080  Top left corner
DepthSpacePoint   Depth, Infrared, Body index  2           pixels  512×424    Top left corner
CameraSpacePoint  Body                         3           meters  –          Infrared/depth camera
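Both single-point and whole-frame conversions go through the sensor's CoordinateMapper. A short sketch, assuming an opened sensor; the class and method names are hypothetical wrappers around the mapper calls.

```csharp
using Microsoft.Kinect;

// Hypothetical helper for illustration.
static class MappingSketch
{
    // Single point: where does a joint appear in the 1920×1080 color image?
    public static ColorSpacePoint JointToColor(KinectSensor sensor, Joint joint)
    {
        return sensor.CoordinateMapper.MapCameraPointToColorSpace(joint.Position);
    }

    // Single point: where does a joint appear in the 512×424 depth image?
    public static DepthSpacePoint JointToDepth(KinectSensor sensor, Joint joint)
    {
        return sensor.CoordinateMapper.MapCameraPointToDepthSpace(joint.Position);
    }

    // Multiple points: one color-space coordinate for every depth pixel.
    public static ColorSpacePoint[] DepthFrameToColor(KinectSensor sensor, ushort[] depthData)
    {
        var colorPoints = new ColorSpacePoint[depthData.Length];
        sensor.CoordinateMapper.MapDepthFrameToColorSpace(depthData, colorPoints);
        return colorPoints;
    }
}
```

Mapped coordinates can fall outside the target image, so it is worth checking them against the image bounds before using them as pixel indices.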
Recordable Data Sources
• Infrared: 13 MB/s
• Depth: 13 MB/s
• BodyFrame
• BodyIndex
• Color: 120 MB/s
• Audio: 32 KB/s
Legend: Record/Play, Record Only
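The depth, infrared, and color figures are close to what the raw frame sizes give. The arithmetic below is a rough check, assuming 2 bytes per pixel (16-bit depth and infrared, raw YUY2 color), 30 fps, and 1 MB = 1,000,000 bytes; these assumptions are mine, not stated on the slide.

```csharp
using System;

// Hypothetical helper for illustration.
static class DataRateSketch
{
    // Uncompressed data rate in megabytes per second (1 MB = 1,000,000 bytes).
    public static double MegabytesPerSecond(int width, int height, int bytesPerPixel, int fps)
    {
        return (double)width * height * bytesPerPixel * fps / 1e6;
    }

    static void Main()
    {
        // Depth or infrared: 512 × 424 × 2 bytes × 30 fps, about 13.0 MB/s.
        Console.WriteLine(MegabytesPerSecond(512, 424, 2, 30));

        // Raw YUY2 color: 1920 × 1080 × 2 bytes × 30 fps, about 124.4 MB/s.
        Console.WriteLine(MegabytesPerSecond(1920, 1080, 2, 30));
    }
}
```

Under these assumptions the color stream dominates the recording bandwidth by roughly a factor of ten.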
Gesture Builder
• New tool, shipping with the v2 SDK
• Organize data using projects and solutions
• Give meaning to data by tagging gestures
• Build gestures using machine learning technology
  – Adaptive Boosting (AdaBoost) Trigger: determines if the player is performing the gesture
  – Random Forest Regression (RFR) Progress: determines the progress of the gesture performed by the player
• Analyze / test the results of gesture detection
• Live preview of results
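At run time, a gesture database produced by the tool is consumed through the Visual Gesture Builder API of the SDK. The sketch below follows the pattern used in the SDK samples as far as I recall it, so treat the exact calls as assumptions to verify against the SDK documentation; the class name, the databasePath parameter, and the way trackingId is obtained are hypothetical.

```csharp
using System.Diagnostics;
using Microsoft.Kinect;
using Microsoft.Kinect.VisualGestureBuilder;

// Hypothetical wrapper for illustration.
class GestureSketch
{
    private VisualGestureBuilderFrameSource source;
    private VisualGestureBuilderFrameReader reader;

    // trackingId is the TrackingId of the Body whose gestures should be detected.
    public void Start(KinectSensor sensor, string databasePath, ulong trackingId)
    {
        source = new VisualGestureBuilderFrameSource(sensor, trackingId);

        // Load every gesture stored in the database built with Gesture Builder.
        using (var database = new VisualGestureBuilderDatabase(databasePath))
        {
            source.AddGestures(database.AvailableGestures);
        }

        reader = source.OpenReader();
        reader.FrameArrived += OnFrameArrived;
    }

    private void OnFrameArrived(object sender, VisualGestureBuilderFrameArrivedEventArgs args)
    {
        using (VisualGestureBuilderFrame frame = args.FrameReference.AcquireFrame())
        {
            if (frame == null)
            {
                return;
            }

            foreach (Gesture gesture in source.Gestures)
            {
                if (gesture.GestureType == GestureType.Discrete && frame.DiscreteGestureResults != null)
                {
                    // AdaBoost trigger: is the gesture being performed?
                    DiscreteGestureResult result;
                    if (frame.DiscreteGestureResults.TryGetValue(gesture, out result) && result.Detected)
                    {
                        Debug.WriteLine(gesture.Name + " detected, confidence " + result.Confidence);
                    }
                }
                else if (gesture.GestureType == GestureType.Continuous && frame.ContinuousGestureResults != null)
                {
                    // RFR progress: how far along is the gesture?
                    ContinuousGestureResult result;
                    if (frame.ContinuousGestureResults.TryGetValue(gesture, out result))
                    {
                        Debug.WriteLine(gesture.Name + " progress " + result.Progress);
                    }
                }
            }
        }
    }
}
```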
Gesture Recognition
Heuristic
• Gesture is a coding problem
• Quick for simple gestures/poses (hand over head)
• ML can also be useful to find good signals for a heuristic approach
Machine Learning (ML) with Gesture Builder
• Gesture is a data problem
• Signals which may not be easily human-understandable (progress in a baseball swing)
• Large investment for production
• Danger of over-fitting, which makes the detector too specific and eliminates recognition of generic cases
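The "hand over head" pose mentioned for the heuristic approach fits in a few lines of code. A minimal sketch, assuming a Body obtained from a body frame; the class and method names are hypothetical, and the rule ignores refinements such as a height margin or a minimum hold time.

```csharp
using Microsoft.Kinect;

// Hypothetical helper for illustration.
static class HeuristicGestures
{
    // True when the right hand is higher than the head.
    public static bool IsRightHandOverHead(Body body)
    {
        if (body == null || !body.IsTracked)
        {
            return false;
        }

        Joint hand = body.Joints[JointType.HandRight];
        Joint head = body.Joints[JointType.Head];

        // Ignore joints the sensor cannot see at all.
        if (hand.TrackingState == TrackingState.NotTracked ||
            head.TrackingState == TrackingState.NotTracked)
        {
            return false;
        }

        // In camera space the Y axis points up, in meters.
        return hand.Position.Y > head.Position.Y;
    }
}
```

A rule like this is quick to write and debug, which is the trade-off the slide describes: it works for simple poses but does not scale to motions such as a baseball swing.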
Kinect Resources
General Info & Blog -> http://kinectforwindows.com
Purchase Sensor -> http://aka.ms/k4wv2purchase
Developer Forums -> http://aka.ms/k4wv2forum
Twitter Account -> @KinectWindows
A Facebook Group -> http://on.fb.me/1LSflbX
A LinkedIn Group -> http://linkd.in/1J9gFcY
A Twitter Account -> @KinectDevelop
A Google Plus Page -> http://bit.ly/1SHtduT
Slides and Code will be available on:
http://www.dotnetcampus.it
THANK YOU!
Q&A