The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

DESCRIPTION

AAA games now carry a consumer expectation that the developer has a post-release strategy, and that strategy goes beyond DLC. Users expect bug fixes, balancing updates, game mode variations and constant tuning of the game experience. So how can you architect your game technology to facilitate all of this? Stewart explains the patching system developed for Crysis 3 Multiplayer, which allowed the team to hot-patch pretty much any asset or data used by the game. He also details the telemetry, server and testing infrastructure required to support it, along with some lessons learned.

Page 1: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

The POST RELEASE TECHNOLOGIES OF CRYSIS 3

Twitter: @coolbeenz

Email: [email protected]

Page 2: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

Introduction: Job done?

Page 3: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CONTENTS

1. The reasoning: Why, What, How

2. Data Patching: Asset systems, Patch paks, Multiplayer flow, Handling failure & messaging

3. Telemetry: Collection, Storage, Syncing, Analysing, Matchmaking telemetry case study

4. Release-Debug: Other production mechanisms for gathering data

5. Summary: Lessons learned and future developments

6. Questions? Over to you...

Page 4: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

PART 1: THE REASONING

Page 5: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

Post-Release Technologies... “What are THEY for?”

Page 6: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

TWEAKING the gameplay

IMPROVING the game experience

DIAGNOSING the cause of problems

FIXING bugs

FACILITATING themed weekends

Page 7: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

Post-Release Technologies... “What EXACTLY are THEY?”

Page 8: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

POST RELEASE TECHNOLOGIES

= DATA PATCHING + RELEASE DEBUG + TELEMETRY

Page 9: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

Post-Release Technologies... “WHY DO WE NEED THEM?”

Page 10: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

Because things do not always go to plan

Page 11: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

THE CRYSIS 3 TEST SCHEDULE (T200 = EA Worldwide Tech 200)

T200 (X360): 27th Sept

T200 (PC): 4th Oct

T200 (PS3): 11th Oct

Closed Alpha: 2nd Nov

T200 (X360): 8th Nov

T200 (PS3): 22nd Nov

T200 (PC): 29th Nov

Open Beta: 29th Jan

Because despite alphas, betas and numerous large scale tests, things will still slip through the net. The players are your most thorough QA.

Page 12: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

AS A WAY TO DEPLOY ASSET FIXES RAPIDLY

... For certification failures

... On discovering copyrighted content

... When players are abusing an exploit

Page 13: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham
Page 14: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

BECAUSE CERTIFICATION COSTS TIME & MONEY

[Timeline, weeks commencing 03-Dec 2012 to 11-Mar 2013: Open-beta cert, Open-beta live, Final cert, RTM, Release, Day 10 cert, Day 10 live]

40% of commits during CERT & RTM were ASSETS & DATA

Page 15: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

BECAUSE WE WANT PEOPLE TO KEEP PLAYING THE GAME

Page 16: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

Because things don’t always go to plan

SELL YOUR THEMED WEEKENDS

Page 19: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

SO THAT WE CAN REACT TO FEEDBACK

Page 20: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

AND BUILD A COMMUNITY

Page 21: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

PART 2: DATA PATCHING

Page 22: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYENGINE ASSET FILE SYSTEM - OVERVIEW

Files referenced using paths: objects/level_specific/airport/architecture/terminal/main.cgf

A virtual file system: files can be loose or part of asset package (.pak) files

Platform agnostic API: files can be stored in memory, on media or on HDD

Page 23: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYENGINE ASSET FILE SYSTEM - PAK FILES

A collection of files: these are essentially zip archives of a folder hierarchy

Anti-tamper mechanisms: paks are digitally signed and encrypted in mastered builds

Stack based searching: paks are searched in order of most recently opened

Page 24: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYENGINE ASSET FILE SYSTEM - PAK FILES

gEnv->pCryPak->OpenPak("objects1.pak");

gEnv->pCryPak->OpenPak("objects2.pak");

gEnv->pCryPak->OpenPak("objects3.pak");

Search order (most recently opened first): objects3.pak, objects2.pak, objects1.pak

gEnv->pCryPak->FOpen("objects/level_specific/airport/architecture/terminal/main.cgf", "rbx");
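The stack based search can be sketched in a few lines. This is a minimal illustration only, not the CryPak implementation: `Pak`, `PakStack`, `Open` and `Resolve` are hypothetical names, and a pak is modelled as nothing more than a name plus the set of paths it contains.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mounted pak: a name plus the paths it contains.
struct Pak {
    std::string name;
    std::set<std::string> files;
};

// Hypothetical pak stack (an assumption for illustration, not CryPak itself).
class PakStack {
public:
    // Opening a pak pushes it on top of the stack.
    void Open(Pak pak) { m_paks.push_back(std::move(pak)); }

    // Returns the pak that would service a request for `path`, searching
    // from the most recently opened pak down to the oldest, or nullptr.
    const Pak* Resolve(const std::string& path) const {
        for (auto it = m_paks.rbegin(); it != m_paks.rend(); ++it) {
            if (it->files.count(path) != 0) {
                return &*it;
            }
        }
        return nullptr;
    }

private:
    std::vector<Pak> m_paks;
};
```

Because the newest pak wins, a pak opened later that contains the same path as an older one silently overrides it, which is the property patch paks rely on.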

Page 25: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYENGINE ASSET FILE SYSTEM - PAK FILES

Contents generally organised by type: objects, animation, scripts, music, sounds, etc.

Some created for specific loading: level loading, MPModeSwitch.pak

Some created for streaming: .dds0, .chr, .cgf, .cga

Page 26: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

PATCH PAKS

A simple way to override ANY EXISTING ASSET?

... Create a patch.pak containing updated versions of specific assets

... Mount this new pak file: mount it last or mark it with a special ‘priority’ flag

... New assets will be prioritised: any subsequent file requests will be serviced by these patched files first

... Patching at the asset system level, so individual game subsystems are oblivious

... Only suitable for Title Updates and DLC, as we need to hardcode the loading of this pak file in a new executable

Page 27: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

ON DEMAND PATCHING

DOWNLOADING & APPLYING PATCH PAKS TRANSPARENTLY

“Why do we need to support a variable number of patch paks?”

... Differing lifetimes: Double XP Weekend vs level setup fixes

... Separate hot/cold assets: weapon balancing vs player stats fixes

... Risk reduction: smaller files mean less chance of failure

Page 28: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

ON DEMAND PATCHING

CRYSIS 3 IMPLEMENTATION DETAILS

Multiplayer only

Process hidden within the transition to MP: we already show a loading screen and re-initialise most game systems anyway

Cache size of 2 MB (X360 only): self-imposed limitations to reduce risk

Patch paks un-mounted on returning to single player

Regularly check for new updates, so that players can be informed if they need to re-enter MP

Page 29: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

ON DEMAND PATCHING

DOWNLOAD PAKS INTO MEMORY OVER HTTP

It all starts with a file called Permissions.xml...

Page 30: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

MULTIPLAYER FLOW - Overview

User selects Multiplayer -> Login Online Services -> TCR Reqs -> Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak -> Mount paks -> Init Game systems

Page 31: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

MULTIPLAYER FLOW - Points of failure

User selects Multiplayer -> Login Online Services -> TCR Reqs -> Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak -> Mount paks -> Init Game systems

Page 32: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

MULTIPLAYER FLOW - TCR Requirements

Steps covered: User selects Multiplayer -> Login Online Services -> TCR Reqs

TCR Requirements

Online play checks

Need extra storage to cache paks

How do we handle these?

Hook into existing handling

Cannot proceed unless allowed online

Require an extra 2 MB in save game

Page 33: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

MULTIPLAYER FLOW - Failing to download

Steps covered: TCR Reqs -> Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak

What can go wrong?

General networking failures

Bespoke networking configurations

How do we handle these?

Abort! No patches, no telemetry

Page 34: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

MULTIPLAYER FLOW - Failing to download

Steps covered: Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak -> Mount paks

What can go wrong?

MD5 checks

Timeouts

General networking failures

How do we handle these?

Cache paks (anti-tamper checks)

Continue to download in the background

Provide help with manual download?

Page 35: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

MULTIPLAYER FLOW - Failing to download

Steps covered: Download Permissions.xml -> Check Cache -> Download Patch1.pak -> Download Patch2.pak -> Mount paks

Implement a configurable timeout

Page 36: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

But... “WON’T THIS RESULT IN PLAYERS HAVING MISMATCHING SETS OF PATCHES?”

Page 37: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

YES. But it is OK because we have a plan...

Page 38: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

1. ISOLATE PLAYERS

This is basically using the same checks used to isolate people running old builds (Retail & Development)

Version code used as a matchmaking filter & during context establishment.

Version 0xA5BC: Client A, Client C, Server 1

Version 0x3370: Client B, Client D, Server 2

Patch paks shown on clients: P1, P2

Page 39: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

1. ISOLATE PLAYERS

XOR in the MD5s of each patch pak to create a unique version code

Version 0xA5BC: Client A, Client C, Server 1

Version 0x3370: Client B, Client D, Server 2

Patch paks shown on clients: P1, P2

Matchmaking = Exe XOR P1 XOR P2

0x96CC XOR 0x0100 XOR 0xA4BC = 0x3370
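The version code arithmetic can be sketched as follows. This is a minimal sketch under assumptions: the slides do not say how a 128-bit MD5 is reduced to the short codes shown, so the code simply takes 16-bit values as input, and `VersionCode`, `MatchmakingVersion` and `CanMatch` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical 16-bit code, matching the width of the values on the slide.
using VersionCode = std::uint16_t;

// XOR the executable's code with the (already reduced) hash of every
// mounted patch pak. XOR is order independent, so the order in which
// paks were downloaded or mounted does not affect the result.
VersionCode MatchmakingVersion(VersionCode exe,
                               const std::vector<VersionCode>& pakHashes) {
    VersionCode version = exe;
    for (VersionCode hash : pakHashes) {
        version ^= hash;
    }
    return version;
}

// Two peers may only be matched when their version codes agree.
bool CanMatch(VersionCode a, VersionCode b) { return a == b; }
```

Fed the three values from the slide in any order, this yields 0x3370. A client missing one of the paks produces a different code and is filtered into a separate pool.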

Page 40: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

2. COMMUNICATE

Let players know that they are matchmaking against a reduced pool

Page 41: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

DATA PATCHING FUTURE DEVELOPMENTS - ASSET DELTAS

Full file must be deployed for a small modification: some of our XML files can be up to 500 KB in size

Text based assets: XML & Lua files can easily have a delta injected after asset loading

Regularly check for new updates

Page 42: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

DATA PATCHING FUTURE DEVELOPMENTS - ASSET DELTAS

Patch XML nodes: add, remove or modify at a node level

More complicated but huge savings: extra tools & build steps required, but XML patches reduced in size to 1-2% of the original
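A node-level delta could look roughly like the sketch below. Everything here is an assumption made for illustration: the loaded XML is flattened to a map from a node path to its value, and `PatchOp`, `NodePatch`, `NodeTree` and `ApplyDelta` are hypothetical names, not part of CryEngine.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class PatchOp { Add, Remove, Modify };

// One entry of a delta. `path` is a hypothetical node address and
// `value` is ignored for Remove.
struct NodePatch {
    PatchOp op;
    std::string path;
    std::string value;
};

// The loaded asset, flattened to node path -> value for this sketch.
using NodeTree = std::map<std::string, std::string>;

// Applies a delta after the asset has loaded and returns how many entries
// took effect. Add is skipped if the node already exists; Modify and
// Remove are skipped if it does not.
std::size_t ApplyDelta(NodeTree& tree, const std::vector<NodePatch>& delta) {
    std::size_t applied = 0;
    for (const NodePatch& patch : delta) {
        const bool exists = tree.count(patch.path) != 0;
        switch (patch.op) {
        case PatchOp::Add:
            if (exists) continue;
            tree[patch.path] = patch.value;
            break;
        case PatchOp::Modify:
            if (!exists) continue;
            tree[patch.path] = patch.value;
            break;
        case PatchOp::Remove:
            if (!exists) continue;
            tree.erase(patch.path);
            break;
        }
        ++applied;
    }
    return applied;
}
```

Only the changed nodes travel over the wire, which is where the size saving comes from. The cost is the extra tooling needed to generate such deltas at build time.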

Page 43: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

DATA PATCHING FUTURE DEVELOPMENTS - REDIRECT HTTP REQUESTS

Current permissions.xml end-point is fixed: this makes testing new patches difficult

Need a way to redirect the request externally: using build-version, SKU-ID, tags etc.

Added bonus: could use this to patch net-tests, fix dev builds etc.

Page 44: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

DATA PATCHING FUTURE DEVELOPMENTS - DIFFERENTIATE GAME CHANGING PAKS

Some patches are not gameplay critical: for example cosmetic asset changes or players’ personal stats configurations

Exclude these from any filtering: basically, do not XOR this pak’s MD5 into the matchmaking version

Values on the slide: Exe, P1, P2 = 0x96CC, 0x0100, 0xA4BC

Matchmaking = 0xA4BC XOR 0x0100 = 0xA5BC (0x96CC is left out)
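One way to express the exclusion is a per-pak flag, sketched below. The `gameplayCritical` flag, the `PatchPak` struct and `FilteredMatchmakingVersion` are hypothetical, and which of the slide's values belongs to the executable is an assumption made only so the example has concrete numbers.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of a mounted patch pak.
struct PatchPak {
    std::uint16_t hash;     // reduced hash of the pak (reduction not shown)
    bool gameplayCritical;  // false for cosmetic or personal-stats paks
};

// Only gameplay-critical paks contribute to the matchmaking version, so
// players who differ only in cosmetic paks stay in the same pool.
std::uint16_t FilteredMatchmakingVersion(std::uint16_t exe,
                                         const std::vector<PatchPak>& paks) {
    std::uint16_t version = exe;
    for (const PatchPak& pak : paks) {
        if (pak.gameplayCritical) {
            version ^= pak.hash;
        }
    }
    return version;
}
```

With 0xA4BC as the executable code, 0x0100 as a critical pak and 0x96CC as a cosmetic pak, the result is 0xA5BC, matching the slide.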

Page 45: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

PART 3: TELEMETRY COLLECTION

Page 46: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

TELEMETRY COLLECTION - CLIENT OVERVIEW

Collection and uploading via HTTP: simple API to push data from files or memory

Compressed and streamed: data zipped up and streamed asynchronously

No guarantees: fire & forget. Upload may fail for numerous reasons

Page 47: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

TELEMETRY COLLECTION - SERVER OVERVIEW

No complex processing on the server: no requirements for immediate results

Storage of files received only: organised by date, platform and type

Anonymous data: any usernames & accounts salted and hashed

Page 48: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

TELEMETRY COLLECTION - SYNCING DATA

Server data kept for a fixed time period: data deleted after seven days

Rsync-ed daily to internal servers: downloaded to Crytek servers

Analysed locally: ultimately discarded

Page 49: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

TELEMETRY COLLECTION - PROCESSING

Turning raw telemetry into useful data: achieved with a mixture of Python & Excel

Manually triggered and collated: considered the weakest link in the chain

Processing is slow and intensive: optimising has never been a high priority

Page 50: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

So... “HOW DO YOU HANDLE HUNDREDS OF THOUSANDS OF CLIENTS UPLOADING SIMULTANEOUSLY?”

Page 51: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

SAMPLE PLAYERS

Sample deterministically at the client end

bool shouldUpload = (Hash( username ) % denominator) < numerator;

Example from the slide: User: coolbeenz -> 0x12345678 -> 0x2E8 -> Hash % 1000 < 100 ? NO

Page 52: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

SAMPLE PLAYERS

Sample deterministically at the client end

Select a large denominator and do not change this (e.g. 1000): this sets the amount the sampling ratio can be incremented by

Choose a numerator to give you the desired sampling ratio (e.g. 100): 100/1000 = 10%

Vary the numerator to meet changing sampling demands: the individual users being sampled remain consistent

[Diagram: user coolbeenz placed on a 0 to 1000 scale, split at 100 into Upload / Do not Upload]
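A runnable version of the sampling test might look like this. The slides do not name the hash function, so FNV-1a is an assumption here, chosen only because it is simple and gives the same result on every platform (unlike `std::hash`); `Fnv1a` and `ShouldUpload` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// 32-bit FNV-1a. An assumed stand-in for the unspecified Hash() on the slide.
std::uint32_t Fnv1a(const std::string& text) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 16777619u;             // FNV prime
    }
    return hash;
}

// Deterministic client-side sampling: the same user always lands in the
// same bucket, so raising the numerator only ever adds users to the sample.
bool ShouldUpload(const std::string& username,
                  std::uint32_t numerator,
                  std::uint32_t denominator) {
    assert(denominator > 0 && numerator <= denominator);
    return (Fnv1a(username) % denominator) < numerator;
}
```

Since the numerator and denominator are plain data, they can themselves be delivered by a data patch, letting the upload rate be throttled without a new executable.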

Page 53: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

And... “WHAT KIND OF TELEMETRY DO YOU COLLECT?”

Page 54: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYSIS 3 MATCHMAKING TELEMETRY - QUANTIFYING THE BLACKBOX

Matchmaking one of the top 5 complaints: based on MyCrysis Forum feedback

Find a session fast but find a good session: this essentially boils down to ping times

For consoles & PC: PC also has a quick match option as well as a server browser

Tricky to balance and impossible to predict: requires constant re-evaluation even with adaptive algorithms

User experience feedback not good enough: you know people are not happy, but why exactly?

Page 55: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYSIS 3 MATCHMAKING TELEMETRY - SO WHERE DO WE START?

Create a system which is data driven: if we are going to collect telemetry we need to be able to action a response

Server side: used Blaze servers. Rule based, highly configurable, including relaxation criteria

Client side: the rules and times used can be configured and therefore data patched

Page 56: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYSIS 3 MATCHMAKING TELEMETRY - DECIDE WHAT QUESTIONS NEED ANSWERING

Q. How many times does a player matchmake?

Q. What kind of ping times do players get during that session?

Q. How long does it take a player to get into a session successfully?

Q. What is the most popular method of joining a session?

Q. What is the average matchmaking time?

Page 57: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYSIS 3 MATCHMAKING TELEMETRY - IMPLEMENT AN APPROPRIATE TELEMETRY SOLUTION

Need a solution that does not result in GBs of data, but still want to be flexible enough to answer a range of questions

Collect a series of timestamped events in XML; also collect metadata for each event

Timestamps based on a zero base time, but still store a server timestamp for collating multiple clients’ data

<AttemptConnection Method="MatchMake" Timestamp="0.000" />

Other Method values: "GameBrowser", "Join Session in progress", "Friend Invite", "Join Squad"
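A small sketch of how such events might be recorded with zero-based timestamps. `MatchmakingLog` and its members are hypothetical; only the `AttemptConnection` element and its `Method` and `Timestamp` attributes come from the slide. The sketch assumes `method` is drawn from a fixed set of names, so no XML escaping is done.

```cpp
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical recorder for one matchmaking attempt.
class MatchmakingLog {
public:
    // `baseTimeSeconds` is the local clock value treated as time zero.
    explicit MatchmakingLog(double baseTimeSeconds) : m_base(baseTimeSeconds) {}

    // Records an event stamped relative to the base time, to millisecond
    // precision, in the XML shape shown on the slide.
    void AttemptConnection(const std::string& method, double nowSeconds) {
        std::ostringstream xml;
        xml << "<AttemptConnection Method=\"" << method << "\" Timestamp=\""
            << std::fixed << std::setprecision(3) << (nowSeconds - m_base)
            << "\" />";
        m_events.push_back(xml.str());
    }

    const std::vector<std::string>& Events() const { return m_events; }

private:
    double m_base;
    std::vector<std::string> m_events;
};
```

Relative timestamps keep each event short and make durations trivial to compute, while a single server timestamp per upload is enough to line up different clients.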

Page 58: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

CRYSIS 3 MATCHMAKING TELEMETRY - IMPLEMENT AN APPROPRIATE TELEMETRY SOLUTION

Collect time stamped events with metadata

Q. How many times does a player matchmake?

Q. How long does it take a player to get into a session successfully?

Q. What is the most popular method of joining a session?

Q. What is the average matchmaking time?

Page 59: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

RESULTS - Matchmaking Telemetry

The results were very insightful: resulted in several iterative improvements

Initially 65% of players took less than 5 seconds to find a match. Eventually this was increased to 82%

1 in 15 matchmaking requests fail: still not perfect, but there are many external factors at play

The most surprising result was that there were still 2 major bugs in the client side code. One of these was fixed with a data patch. Win!

Page 60: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

RESULTS - Matchmaking Telemetry

[Chart: Time to matchmake. x-axis: Time (s); y-axis: % Users Matched, 0 to 100]

Page 61: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

RESULTS - Matchmaking Telemetry

How do players join a session?

[Two charts: Console and PC]

Legend, first chart: Quick Match, Join Squad - Already In Game, Join Squad - Lobby, Private Game, Join Friends Game, Server Browser

Legend, second chart: Quick Match, Join Squad - Already In Game, Join Squad - Lobby, Join Friends Game

Page 62: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

FUTURE DEVELOPMENTS - Matchmaking Telemetry

Automate the analysis of the telemetry: manual process meant delays in turning around changes

Utilise A/B testing: the results change over time, so results can be skewed by a different player pool

User actions telemetry: we did not collect all user action events, for example when the user backed out

Page 63: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

PART 4: RELEASE DEBUG

Page 64: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

DEBUG SCREENS

Ensure you can gather the info you need in large scale public testing

Page 65: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

ERROR CODES

Embed error codes as well as user friendly (TCR) messaging

Page 66: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

SUMMARY

Start early: think ahead, the technology involved is complex and cannot be bolted on

Collecting telemetry is easy: turning that into useful information is difficult

Have the ability to scale collection: be able to balance server load and fail safe

Make it easy to test: don’t underestimate the amount of testing required in development

Automate as much as you can: any manual elements of the system become its weakest point

Get buy-in from management: it is difficult to justify continued support when the returns are not directly financial

Page 67: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

That is it! “Do you HAVE ANY QUESTIONS?”

Page 68: The post release technologies of Crysis 3 (Slides Only) - Stewart Needham

THANK YOU FOR LISTENING

Any feedback, positive or negative, welcomed

Twitter: @coolbeenz

Email: [email protected]