yan-cui
Scalability & Big Data challenges in Real-Time Multiplayer games
Real-Time games in Top 100 Grossing (2017)
2014: 3
2015: 6
2016: 8
2017: 13
2018: ???
Enabling Factors
Source: PC Mag
Source: OpenSignal
QUIZ TIME: In 2017, which of these games has made the most revenue?
● The world’s most popular MOBA on PC
● The world’s most popular First Person Shooter
● Some game by Blizzard
● Some game by EA
● A Chinese 5v5 mobile game you never heard of
● Some game by King
>$400M Monthly Revenue (Source: Bloomberg)
>80M DAU (Source: Tencent)
10-20 inputs/s, sensitive to lags (> 300ms)
unpredictable network, limited bandwidth
Decisions, decisions...
Build vs Buy?
Self-hosted vs Cloud?
Global deployment vs Centralized?
TCP vs UDP?
Server Authoritative vs Lock-Step?
Constraints/Trade-offs
Latency (RTT)
Cost
Complexity
Scalability
Operational overhead
Global Deployment vs Centralised
10-20 inputs/s, sensitive to lags (> 300ms)
optimize for this
Global Deployment
● Players are geo-routed to the closest multiplayer server.
● Matched with other players in the same geo-region for best UX.
● No need for players to “choose server”, it should just work.
● Should leaderboards be global or regional?
● Should guilds/alliances be global or regional?
● Should chatrooms be global or regional?
● Should liveops events be global or regional?
● Should players be allowed to play with others in another region? i.e. play with distant relatives/friends.
● Should players be allowed to switch their default region? e.g. after moving to Europe post-Brexit.
Server Authoritative vs Lock-Step
Server Authoritative
● Server decides game logic.
● Client sends all inputs to server.
● Client receives game state (either full, or delta) from server.
● Client keeps internal state for the game world, which mirrors server state.
● Client doesn’t modify world state directly; it only displays it, with some prediction to mask network latency.
[Sequence diagram, built up over several slides: Client 1 and Client 2 send control inputs (C1 control 1, C2 control 1, C2 control 2, C2 control 3) to the Server. The Server steps through game state 1 to game state 5 and sends state updates to both clients, e.g. game state 3 reflects C1 control 1, C2 control 1 and C2 control 2, and game state 5 reflects C2 control 3.]
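The server-authoritative flow can be sketched roughly as follows. This is a minimal illustration, not the approach used in the talk: the class names, the one-dimensional "world" and the delta format are all hypothetical, and prediction on the client is left out.

```python
from dataclasses import dataclass, field


@dataclass
class AuthoritativeServer:
    """Owns the world state; clients only send inputs and render what comes back."""
    positions: dict = field(default_factory=dict)  # player id -> x position (hypothetical world)
    tick: int = 0

    def apply_inputs(self, inputs):
        """Run the game logic for one tick and return a delta of what changed."""
        changed = {}
        for player, move in inputs:
            new_x = self.positions.get(player, 0) + move
            self.positions[player] = new_x
            changed[player] = new_x
        self.tick += 1
        return {"tick": self.tick, "changed": changed}


@dataclass
class Client:
    """Mirrors server state; never runs authoritative game logic itself."""
    world: dict = field(default_factory=dict)

    def on_state(self, update):
        # Apply the delta sent by the server to the local mirror.
        self.world.update(update["changed"])


server = AuthoritativeServer()
c1, c2 = Client(), Client()

# Tick 1: both clients sent an input. Tick 2: only C2 did.
for inputs in ([("C1", 1), ("C2", -1)], [("C2", -1)]):
    update = server.apply_inputs(inputs)
    for client in (c1, c2):
        client.on_state(update)

print(c1.world == c2.world == server.positions)
```

Because every client receives state rather than inputs, a client that joins late only needs the latest full state, which is why joining mid-match is quick in this model.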
Pros
● Always in-sync.
● Hard to cheat - no memory hacks, etc.
● Easy (and quick) to join mid-match.
● Server can detect lagged/DC’d client and take over with AI.
Cons
● High server load.
● High bandwidth usage.
● Synchronization on the client is complicated.
● Little experience in the company with a server-side .NET stack (bus factor of 1).
● .NET Core was/is still a moving target.
high server load and bandwidth needs
client has to receive more data
Lock-Step*
● Client sends all inputs to server.
● Server collects all inputs, and buffers them.
● Server sends all buffered inputs to all clients X times a second.
● Client executes all inputs in the same order.
● Because everyone is 'guaranteed' to have executed the same input at the same frame in the same order, we get synchronicity.
● Use prediction to mask network latency.
* traditional RTS games tend to use a peer-to-peer model
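The lock-step steps above can be sketched as follows. This is an illustrative toy under stated assumptions: the class names are hypothetical, the "game state" is a single integer, and prediction and server-side validation are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class LockStepServer:
    """Relays inputs only; it never simulates the game."""
    buffer: list = field(default_factory=list)
    frame: int = 0

    def receive(self, player, command):
        # Inputs are buffered in arrival order until the next broadcast.
        self.buffer.append((player, command))

    def broadcast(self):
        """Called X times a second: flush the buffered inputs as one ordered frame."""
        self.frame += 1
        inputs = tuple(self.buffer)
        self.buffer = []
        return (self.frame, inputs)


@dataclass
class LockStepClient:
    """Runs the full simulation locally, applying frames in the order received."""
    score: int = 0  # stand-in for the whole game state

    def execute(self, frame):
        _, inputs = frame
        for _, command in inputs:
            # Integer-only, order-dependent update: same inputs in the same order
            # always produce the same state.
            self.score = self.score * 3 + command


server = LockStepServer()
clients = [LockStepClient(), LockStepClient()]

server.receive("C1", 1)
server.receive("C2", 2)
frame1 = server.broadcast()
server.receive("C2", 5)
frame2 = server.broadcast()

for client in clients:
    for frame in (frame1, frame2):
        client.execute(frame)

print(clients[0].score == clients[1].score)
```

The update rule is deliberately order-dependent: swapping two inputs would change the result, which is why every client has to execute the same inputs in the same order.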
[Sequence diagram, built up over several slides: Client 1 and Client 2 send control inputs to the Server, which sends the same ordered inputs (C1 control 1, C2 control 1, C2 control 2, then C2 control 3) back to both clients: inputs, instead of game state.]
RTT: time between sending an input and receiving it back from the server.
[Diagram annotations: RTT, frame time]
RTT = latency x 2 + X, where X is the time an input waits in the server's buffer before the next broadcast (X min = 0, X max = frame time)
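A quick worked example of the RTT bounds. The 50 ms one-way latency and 10 broadcasts per second are made-up numbers chosen only to make the arithmetic easy to follow.

```python
def rtt_bounds_ms(one_way_latency_ms, broadcasts_per_second):
    """Best and worst case RTT for an input under lock-step.

    Best case: the input arrives just before a broadcast (no wait).
    Worst case: it just missed one and waits a full frame time.
    """
    frame_time_ms = 1000 / broadcasts_per_second
    best = 2 * one_way_latency_ms
    worst = best + frame_time_ms
    return best, worst


# Hypothetical numbers: 50 ms one-way latency, server broadcasts 10 times a second.
best, worst = rtt_bounds_ms(one_way_latency_ms=50, broadcasts_per_second=10)
print(best, worst)
```

With these assumed numbers the worst case stays under the 300ms lag threshold quoted earlier, but a lower broadcast rate or a more distant server would eat into that budget quickly.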
Pros
● Light server load.
● Lower bandwidth usage.
● Simpler server implementation.
Cons
● Needs a deterministic game engine.
● Unity has a long-standing determinism problem with floating point.
● Hackable, requires some form of server-side validation.
● All clients must take over lagged/DC’d client with AI.
● Slower to join mid-match, need to process all inputs.
● Need to ensure all clients in a match are compatible.
fixed-point math, server validation, ...
bandwidth
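Fixed-point math sidesteps floating-point differences by doing all game arithmetic on scaled integers, which behave identically on every machine. A minimal sketch, assuming a hypothetical resolution of three decimal places (`SCALE`, `to_fixed` and `fixed_mul` are illustrative names, not from the talk):

```python
SCALE = 1000  # hypothetical resolution: 3 decimal places


def to_fixed(text):
    """Parse a decimal string such as '0.1' into a scaled integer without using floats."""
    sign = -1 if text.startswith("-") else 1
    whole, _, frac = text.lstrip("+-").partition(".")
    frac = (frac + "000")[:3]  # pad or truncate to exactly 3 fractional digits
    return sign * (int(whole or "0") * SCALE + int(frac))


def fixed_mul(a, b):
    """Multiply two fixed-point values. Floor division is the rounding rule here;
    any rule works as long as every client uses the same one."""
    return (a * b) // SCALE


# Moving 0.1 units per frame for 10 frames lands exactly on 1.0.
position = to_fixed("0")
velocity = to_fixed("0.1")
for _ in range(10):
    position += velocity

area = fixed_mul(to_fixed("1.5"), to_fixed("2.25"))  # 3.375 in fixed-point
print(position, area)
```

The trade-off is range and precision: the scale has to be chosen so that intermediate products neither overflow (in languages with fixed-width integers) nor lose too much detail.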
Build vs Buy
Pros
● Easy to use.
● Already use it for prototype games.
● Multi-region, lobby, etc. come out-of-the-box.
● Had a long time to optimize their solution.
Cons
● Quite expensive, pay for provisioned peak monthly CCU.
● “Can we bet the future of our company on a third-party?”
● Unknown global distribution at scale.
● Accessibility of support.
● Limited extensibility.
● Runs on Windows.
So, we decided to build our own networking stack
Actor Model
A model for describing computation, introduced by Carl Hewitt and colleagues in 1973.
Later popularised by Erlang.
Everything is an actor. Every actor has a mailbox.
An actor is the fundamental unit that embodies the 3 essential things for computation:
● processing
● storage
● communications
Actors don’t share memory, they communicate only via messages.
When an actor receives a message, it can:
● create new actors
● send messages to other actors
● do work
Not sharing memory prevents cascade failures when an actor crashes.
Ericsson AXD301
Inside an actor, messages are processed one at a time, in a single-threaded fashion.
No need for locks!
Simplifies concurrency: no lock-induced deadlocks or race conditions on an actor's own state.
Lifts concurrency management to the mailbox.
Allows you to “think globally, but act locally”.
Easier to think about a complex system in terms of states and transitions, than to manage state mutations.
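To make the mailbox idea concrete, here is a toy actor built on the Python standard library. It is only a sketch of the concept, not how Akka or Erlang implement actors; `CounterActor` and its message tuples are hypothetical.

```python
import queue
import threading


class CounterActor:
    """A tiny actor: private state, a mailbox, and one thread draining it."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # only ever touched by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget send; the only way other code interacts with the actor."""
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message[0] == "add":
                self._count += message[1]    # no lock: one message at a time
            elif message[0] == "get":
                message[1].put(self._count)  # reply by sending a message back
            elif message[0] == "stop":
                return


def send_many(actor, n):
    for _ in range(n):
        actor.tell(("add", 1))


actor = CounterActor()
senders = [threading.Thread(target=send_many, args=(actor, 1000)) for _ in range(4)]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()

reply = queue.Queue()
actor.tell(("get", reply))
total = reply.get(timeout=5)
actor.tell(("stop",))
print(total)
```

Four threads send concurrently, yet the counter needs no lock because all concurrency is absorbed by the mailbox and the state is only mutated inside the actor.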
[Diagram, built up over several slides: MATCH 1 holds a current frame (buffered events such as C1 input, C2 input, C3 joined, C3 input) and a history of past frames (frame 1, frame 2, frame 3, ...). Step labels: connection open, authenticate, send/receive, buffering, broadcast! After a broadcast the current frame moves into the history (frame 4) while new inputs keep arriving; one slide is annotated "concurrency".]
[Diagram: a grid of many concurrent MATCH instances.]
[Diagram, built up over several slides: the responsibilities of a MATCH (connection open, authenticate, send/receive, buffering, broadcast!) are split between Socket actors and a Match actor that holds the current frame (C1 input, C2 input, C3 joined) and the frame history. Labels: Socket actor, Match actor, Root Aggregate.]
act locally
think globally: how actors interact with each other, aka the “protocol”
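One way to read "the protocol" is as the set of message types actors exchange, plus how each actor's state changes per message. The sketch below is an assumption-laden illustration: the message names (`Joined`, `Input`, `Tick`) and the `handle` function are hypothetical, and real actors would receive these through a mailbox rather than a plain list.

```python
from dataclasses import dataclass, field


# Messages a socket actor (or a timer) might send to a match actor. Names are hypothetical.
@dataclass(frozen=True)
class Joined:
    player: str


@dataclass(frozen=True)
class Input:
    player: str
    command: str


@dataclass(frozen=True)
class Tick:
    """Sent by a timer: time to broadcast the current frame."""


@dataclass
class MatchState:
    current_frame: list = field(default_factory=list)
    history: list = field(default_factory=list)


def handle(state, message):
    """Process one message; return the frame to broadcast, or None."""
    if isinstance(message, (Joined, Input)):
        state.current_frame.append(message)  # buffering
        return None
    if isinstance(message, Tick):
        frame = tuple(state.current_frame)   # broadcast!
        state.history.append(frame)
        state.current_frame = []
        return frame
    raise TypeError(f"unknown message: {message!r}")


state = MatchState()
mailbox = [Input("C1", "move"), Input("C2", "fire"), Joined("C3"), Tick(), Input("C3", "move")]

broadcasts = []
for message in mailbox:
    frame = handle(state, message)
    if frame is not None:
        broadcasts.append(frame)

print(len(broadcasts), len(state.current_frame))
```

Since messages are handled strictly one at a time, an input that arrives "during" a broadcast simply lands in the next frame, with no locking needed.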
the secret to building high performance systems is simplicity
complexity kills performance
Performance Matters
Higher CCU per server
Fewer servers
Lower cost
Less operational overhead
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” (Donald Knuth)
Threads are heavy OS constructs.
Each thread typically reserves 1MB of stack space by default.
Context switching is expensive at scale.
Actors are cheap.
An actor system can optimise its use of threads to minimise context switching.
Actor Model >
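A back-of-the-envelope calculation shows why the stack size matters. The thread-per-connection design and the 10,000 connections are hypothetical; the 1MB figure is the default quoted above, and it is reserved address space rather than memory that is necessarily in use.

```python
STACK_MB_PER_THREAD = 1  # default stack size quoted above


def stack_reservation_gb(connections, threads_per_connection=1):
    """Stack space reserved if every connection gets its own thread(s)."""
    return connections * threads_per_connection * STACK_MB_PER_THREAD / 1024


# Hypothetical load: 10,000 concurrent players, one thread each.
reserved = stack_reservation_gb(10_000)
print(f"{reserved:.2f} GB reserved for thread stacks")
```

An actor per connection avoids this because many actors share a small pool of threads.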
Netty
Non-blocking I/O framework for the JVM.
Highly performant.
Simplifies implementation of socket servers (TCP/UDP).
UDP support is “meh”...
Performance Tuning
● Custom network protocol (bandwidth).
● Buffer pooling (GC pressure).
● Minimise Netty object creations (GC pressure).
● Using direct buffers (GC pressure).
● Disable Nagle's algorithm (latency).
● Epoll.
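To illustrate why a custom binary protocol helps with bandwidth, the sketch below compares a fixed-layout binary encoding of one input with the same data as JSON. The wire layout (frame number, player id, command id) is invented for this example and is not the protocol from the talk.

```python
import json
import struct

# Hypothetical wire layout, big-endian with no padding:
# frame number (uint32), player id (uint8), command id (uint8).
INPUT_FORMAT = struct.Struct(">IBB")


def encode_input(frame, player_id, command_id):
    return INPUT_FORMAT.pack(frame, player_id, command_id)


def decode_input(payload):
    return INPUT_FORMAT.unpack(payload)


binary = encode_input(1234, 2, 7)
as_json = json.dumps({"frame": 1234, "player": 2, "command": 7}).encode("utf-8")

print(len(binary), "bytes as binary vs", len(as_json), "bytes as JSON")
```

At 10-20 inputs per second per player, a several-fold difference per message adds up, especially on limited mobile bandwidth.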
Automated Load Testing
AWS Lambda functions to run bot clients (written with Akka):
● Cheaper
● Faster to boot up
● Easy to update
Each Lambda invocation could simulate up to 100 bots.
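Given the limit of 100 bots per invocation, sizing a load test is simple arithmetic. A small sketch, where the target of 2,550 bots and the `plan_invocations` helper are hypothetical:

```python
BOTS_PER_INVOCATION = 100  # upper bound quoted above


def plan_invocations(target_bots):
    """Split a target bot count into per-invocation batches of at most 100 bots."""
    full, rest = divmod(target_bots, BOTS_PER_INVOCATION)
    return [BOTS_PER_INVOCATION] * full + ([rest] if rest else [])


# Hypothetical target: simulate 2,550 concurrent bot clients.
batches = plan_invocations(2_550)
print(len(batches), "invocations, last one runs", batches[-1], "bots")
```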
from US-EAST (Lambda) to EU-WEST (game server)
optimize for tail latencies
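Tail latencies are usually reported as high percentiles rather than averages, because a few slow round trips barely move the mean but are exactly what players notice. A minimal sketch using the nearest-rank method, with made-up RTT samples:

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical RTT samples in ms: mostly fast, with a few slow outliers.
rtts = [80] * 95 + [150] * 4 + [900]

mean = sum(rtts) / len(rtts)
p50 = percentile(rtts, 50)
p99 = percentile(rtts, 99)
worst = percentile(rtts, 100)
print(mean, p50, p99, worst)
```

In this invented data set the mean and median both look healthy, while the slowest sample is far beyond the 300ms lag threshold mentioned earlier.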
http://bit.ly/2xgGHXZ
Thank You!
QUESTIONS?