End to End Exactly once processing in Apache Apex
Hitesh [email protected] Gugale Shah
Agenda
● What is End to End Exactly once
● Fault Tolerance
● Recovery from Operator failure.
● Recovery Mechanisms.
● Importance of End to End Exactly once
● Achieving End to End Exactly once in Apache Apex
● Example DAG achieving the desired goal
● Conclusion & Questions.
Fault
Recovery Mechanisms
Replay data .. Big Data ..?
Image source https://www.pinterest.com/toysconcept/baby-fun-things/
Retrieve Operator State & Results
Recovery Mechanisms
At most once
● Subscribes to data from the start of the next window.
● Ignores the lost windows and continues to process incoming data normally.
● No duplicates, no recomputation.
● Possible missing data.
At least once
● Operator is brought back to its latest checkpointed state and the upstream buffer server replays all subsequent windows.
● Lost windows are recomputed and the application catches up with live incoming data.
● Likely duplicates and recomputation.
● No missing data.
Exactly once
● Operator is brought back to its latest checkpointed state and the upstream buffer server replays all subsequent windows.
● Lost windows are recomputed in a logical way, so the effect is as if the computation had been done exactly once.
● Duplicates/recomputation?
● No missing data.
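The difference between the three modes can be illustrated with a small simulation. This is a plain-Java sketch, not Apex code: the class RecoveryModes, its parameters and the window numbers are hypothetical and only model which windows reach the output in each mode.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the three recovery modes; every name here is hypothetical. */
final class RecoveryModes {
    enum Mode { AT_MOST_ONCE, AT_LEAST_ONCE, EXACTLY_ONCE }

    /**
     * Returns the ids of the windows whose results reach the output, in order.
     *
     * @param lastWindow   last window of the stream (windows are 1..lastWindow)
     * @param checkpoint   last window covered by the operator's checkpoint
     * @param lastOutput   last window whose results were output before the failure
     * @param resumeWindow first live window after the operator is redeployed
     */
    static List<Long> outputWindows(Mode mode, long lastWindow, long checkpoint,
                                    long lastOutput, long resumeWindow) {
        List<Long> out = new ArrayList<>();
        // Windows that were output before the failure.
        for (long w = 1; w <= lastOutput; w++) {
            out.add(w);
        }
        switch (mode) {
            case AT_MOST_ONCE:
                // Skip whatever was lost and continue with live data.
                for (long w = resumeWindow; w <= lastWindow; w++) {
                    out.add(w);
                }
                break;
            case AT_LEAST_ONCE:
                // Replay everything after the checkpoint, duplicates included.
                for (long w = checkpoint + 1; w <= lastWindow; w++) {
                    out.add(w);
                }
                break;
            case EXACTLY_ONCE:
                // Replay and recompute, but suppress windows already output.
                for (long w = checkpoint + 1; w <= lastWindow; w++) {
                    if (w > lastOutput) {
                        out.add(w);
                    }
                }
                break;
        }
        return out;
    }
}
```

With six windows, a checkpoint after window 2, a failure after window 4 was output and a redeploy that resumes at window 6, the model yields 1 2 3 4 6 for at most once (window 5 is lost), 1 2 3 4 3 4 5 6 for at least once (windows 3 and 4 are duplicated) and 1 2 3 4 5 6 for exactly once.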
If a window is recomputed, how is that “exactly” once?
Image source https://www.pinterest.com/toysconcept/baby-fun-things/
End-to-End Exactly Once
Idempotency
End-to-End Exactly-Once
[Diagram: Kafka → Words → Aggregate Counts → Database]
● Input
○ Uses com.datatorrent.contrib.kafka.KafkaSinglePortStringInputOperator
○ Emits words as a stream
○ Operator is idempotent
● Counter
○ com.datatorrent.lib.algo.UniqueCounter
○ Aggregates over a window, retains the aggregates as state for the duration of the window, emits them at the end of the window and clears the state.
● Store
○ Uses CountStoreOperator
○ Inserts into JDBC
○ Exactly-once results (End-To-End Exactly-once = At-least-once + Idempotency + Consistent State)
https://github.com/DataTorrent/examples/blob/master/tutorials/exactly-once
https://www.datatorrent.com/blog/end-to-end-exactly-once-with-apache-apex/
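The counter's per-window behaviour (aggregate, emit at the end of the window, clear) can be sketched in plain Java. This is not the Malhar UniqueCounter: WindowedWordCounter and its methods are hypothetical stand-ins that only mirror the begin-window / process / end-window life cycle described above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Plain-Java stand-in for a per-window unique counter (hypothetical class). */
final class WindowedWordCounter {
    private final Map<String, Integer> counts = new HashMap<>();
    private final Consumer<Map<String, Integer>> output;

    WindowedWordCounter(Consumer<Map<String, Integer>> output) {
        this.output = output;
    }

    void beginWindow(long windowId) {
        // Nothing to do: endWindow() left the state empty.
    }

    void process(String word) {
        // Aggregate within the current window only.
        counts.merge(word, 1, Integer::sum);
    }

    void endWindow() {
        // Emit a copy of the aggregates, then clear the state so that
        // no counts are carried over into the next window.
        output.accept(new HashMap<>(counts));
        counts.clear();
    }
}
```

Because the state is empty at every window boundary, recomputing a replayed window from the same input produces the same aggregates.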
End-to-End Exactly-Once
● Apex connectors retrieving data from external systems need to ensure recovery.
● Involves rewinding the stream and replaying unprocessed data from the source.
● The capabilities of the external system determine complexity.
● Kafka persists messages, so the message stream can be replayed directly.
● The Apex input operator needs to remember the offsets.
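A minimal sketch of why remembering offsets makes replay idempotent. It is plain Java with an in-memory list standing in for a Kafka partition; ReplayableInput, emitWindow and rewindTo are hypothetical names, and the map of offset ranges is assumed to live in durable storage that survives the failure.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy idempotent input; the list stands in for a Kafka partition (assumption). */
final class ReplayableInput {
    private final List<String> log;
    // windowId -> {fromOffset, toOffset}; assumed to survive operator failure.
    private final Map<Long, long[]> windowOffsets = new HashMap<>();
    private long nextOffset = 0;

    ReplayableInput(List<String> log) {
        this.log = log;
    }

    /** Emits the tuples of one window, recording the offset range on first use. */
    List<String> emitWindow(long windowId, int maxTuples) {
        long[] range = windowOffsets.get(windowId);
        if (range == null) {
            // First time this window is seen: read new data and remember the range.
            long to = Math.min(nextOffset + maxTuples, log.size());
            range = new long[] {nextOffset, to};
            windowOffsets.put(windowId, range);
        }
        // Replayed window: reuse the recorded range, ignoring maxTuples.
        nextOffset = range[1];
        return new ArrayList<>(log.subList((int) range[0], (int) range[1]));
    }

    /** Simulates recovery by rewinding the read position to a checkpointed offset. */
    void rewindTo(long checkpointedOffset) {
        nextOffset = checkpointedOffset;
    }
}
```

After a rewind, a replayed window emits exactly the tuples it emitted the first time, even if more data is available by then, so downstream operators recompute the same results.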
Idempotency - Apex Kafka operator
Exactly Once Strategy
Exactly Once Strategy
[Figure: Data Table (rows d11 d12 d13, d21 d22 d23, then the rows of windows wn and wn+1) and Meta Table (op-id with window id wn, then wn+1), with a timeline marking the checkpoint, wn and wn+1]
• Data in a window is written out in a single transaction
• Window id is also written to a meta table as part of the same transaction
• Operator reads the window id from meta table on recovery
• Ignores data for windows it has already committed (window ids up to and including the recovered window id) and writes new data
• Partial window data before failure will not appear in data table as transaction was not committed
• Assumes idempotency for replay
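The strategy above can be sketched with an in-memory stand-in for the database. This is plain Java under stated assumptions, not the Malhar JDBC operator: ToyStore, ExactlyOnceWriter and their methods are hypothetical, and the single commit method models a transaction that writes the data and the window id together or not at all.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a database with a data table and a meta table. */
final class ToyStore {
    final Map<String, Integer> dataTable = new HashMap<>();
    final Map<String, Long> metaTable = new HashMap<>(); // operator id -> committed window id

    /** Models one transaction: window data and window id are written together. */
    synchronized void commit(String operatorId, long windowId, Map<String, Integer> counts) {
        counts.forEach((word, n) -> dataTable.merge(word, n, Integer::sum));
        metaTable.put(operatorId, windowId);
    }
}

/** Hypothetical output operator that skips windows already committed. */
final class ExactlyOnceWriter {
    private final ToyStore store;
    private final String operatorId;
    private final Map<String, Integer> pending = new HashMap<>();
    private long committedWindowId;
    private long currentWindowId;

    ExactlyOnceWriter(ToyStore store, String operatorId) {
        this.store = store;
        this.operatorId = operatorId;
        // On (re)start, read the last committed window id from the meta table.
        this.committedWindowId = store.metaTable.getOrDefault(operatorId, -1L);
    }

    void beginWindow(long windowId) {
        currentWindowId = windowId;
        pending.clear(); // uncommitted partial data is discarded
    }

    void process(String word, int count) {
        if (currentWindowId > committedWindowId) {
            pending.merge(word, count, Integer::sum);
        }
    }

    void endWindow() {
        if (currentWindowId <= committedWindowId) {
            return; // replayed window, already in the data table
        }
        store.commit(operatorId, currentWindowId, pending);
        committedWindowId = currentWindowId;
    }
}
```

If the writer fails in the middle of a window, nothing from that window has reached the data table. A new writer reads the committed window id, ignores the replayed windows up to that id and commits the rest, which relies on upstream replay delivering the same tuples per window.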
End-to-End Exactly-Once (Contd.)
public static class CountStoreOperator extends AbstractJdbcTransactionableOutputOperator<KeyValPair<String, Integer>>
{
  // Upsert: add the window's count to an existing word, or insert a new row.
  public static final String SQL =
      "MERGE INTO words USING (VALUES ?, ?) I (word, wcount)"
      + " ON (words.word=I.word)"
      + " WHEN MATCHED THEN UPDATE SET words.wcount = words.wcount + I.wcount"
      + " WHEN NOT MATCHED THEN INSERT (word, wcount) VALUES (I.word, I.wcount)";

  @Override
  protected String getUpdateCommand()
  {
    return SQL;
  }

  // Binds one (word, count) tuple to the two placeholders of the MERGE statement.
  @Override
  protected void setStatementParameters(PreparedStatement statement, KeyValPair<String, Integer> tuple) throws SQLException
  {
    statement.setString(1, tuple.getKey());
    statement.setInt(2, tuple.getValue());
  }
}
Everything Tailored Together
Conclusion
Conclusion contd.