1
Understanding Log Lines Using Development Knowledge
Ahmed E. Hassan, Meiyappan Nagappan, Weiyi Shang, Zhen Ming Jiang
2
Practitioners have challenges in understanding log lines
Fetch failure
What exactly does this message mean?
What could be the cause?
Is it affecting my data?
4
We performed an exploratory study on 3 large software systems
Zookeeper
5,641 logging statements
1,080 logging statements
1,163 logging statements
5
We manually examined real-life inquiries about log lines from 3 sources
User mailing lists
Randomly sampled logs
6
Inquiries about logs ask for 5 types of information
Meaning: What exactly does this message mean?
Cause: What could be the cause?
Impact: Is it affecting my data?
Solution: How can I avoid this message/problem?
Context: When does this occur?
7
Experts are crucial in resolving log inquiries
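[Figure: bar chart of the number of log inquiries (0 to 6) for Hadoop, Cassandra, and Zookeeper. Resolved inquiries are split into "by expert" and "by non-expert"; unresolved inquiries are split into "replied by expert", "only replied by non-expert", and "not answered".]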
8 out of 11 resolved inquiries are resolved by experts.
Inquiries are always resolved if experts reply.
10
Looking for an expert is not the optimal approach to resolve log inquiries
Over 20% of the inquiries have no reply.
Wrong answers may be posted in reply to inquiries.
Identifying the expert of a log line is challenging.
First reply can take up to 210 hours.
12
Nothing in common between inquired logs
An on-demand approach is needed to assist in understanding logs.
Different log verbosity levels
0 to 2 degrees of fan-in
0 to 200 prior code changes
Real-life inquiries
13
We propose to attach development knowledge to logs
Code commit
Issue reports
Source code
Call graph
Code comments
14
An example of using development knowledge to resolve inquiries about the log "fetch failure"
Knowledge sources: code commit, issue reports, source code, code comments, call graph
Source code: From method checkAndInformJobTracker of file ShuffleScheduler.java
15
Code comments: Notify the JobTracker after every read error, if `reportReadErrorImmediately' is true or after every `maxFetchFailuresBeforeReporting' failures
16
Call graph: Called by method copyFailed in class ShuffleScheduler
17
Code commit: Allow shuffle retries and read-error reporting to be configurable. Contributed by Amareshwari Sriramadasu.
18
Issue reports: MAPREDUCE-1171.… This is caused by a behavioral change in hadoop 0.20.1. ……One solution I could see is "Provide a config option... ”…
19
Resolve the inquiry by development knowledge:
Meaning: There is a data reading error.
Cause: One of the possible reasons is a configuration.
Context: The event happens during the shuffle period, while copying data.
Impact: The event impacts the JobTracker.
Solution: Changing a configuration option would solve the issue.
Go to the expert for help:
Amareshwari Sriramadasu is the expert to go to.
20
Overview of our approach
Version control system
Source code
Generating templates for logs
Log templates
Matching logs with log templates
Attaching development knowledge to logs
Development knowledge
21
Step 1: Generating templates for logs
Source code (from the version control system):
foo() {
  …
  Log_statement("time=%d, Trying to launch, TaskID=%s", time, taskid);
  …
}
Log template:
time=\d+, Trying to launch, TaskID=\S+
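The template generation in Step 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' tool: the function name format_to_template and the specifier table are assumptions, and only the two specifiers shown on the slide (%d and %s) are handled. Because the static text is escaped, the printed pattern may differ textually from the slide's template (for example, spaces are escaped) while accepting the same log lines.

```python
import re

# Assumed mapping from printf-style specifiers to regex fragments,
# following the slide's example (%d -> \d+, %s -> \S+).
SPECIFIER_PATTERNS = {"%d": r"\d+", "%s": r"\S+"}


def format_to_template(format_string: str) -> str:
    """Turn a logging format string into a regular-expression log template."""
    # Split on the specifiers but keep them, so static text can be escaped.
    parts = re.split(r"(%[ds])", format_string)
    pieces = []
    for part in parts:
        if part in SPECIFIER_PATTERNS:
            pieces.append(SPECIFIER_PATTERNS[part])  # dynamic part
        else:
            pieces.append(re.escape(part))  # static text, matched literally
    return "".join(pieces)


template = format_to_template("time=%d, Trying to launch, TaskID=%s")
print(template)
```

A real extractor would also have to locate the logging statements in the source code and handle string concatenation, which this sketch leaves out.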
22
Step 2: Matching logs with log templates
Log template:
time=\d+, Trying to launch, TaskID=\S+
Logs:
time=1, Trying to launch, TaskID=task_1
time=2, launch task, TaskID=task_1
…
time=10, task finished, TaskID=task_1
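Step 2 can be illustrated with a short Python sketch that treats the log template as a regular expression and keeps the log lines it matches in full. The helper name lines_matching is an assumption made for this example; the template and log lines are the ones on the slide.

```python
import re

# Log template from the slide, used as a regular expression.
LOG_TEMPLATE = re.compile(r"time=\d+, Trying to launch, TaskID=\S+")

# Log lines from the slide.
log_lines = [
    "time=1, Trying to launch, TaskID=task_1",
    "time=2, launch task, TaskID=task_1",
    "time=10, task finished, TaskID=task_1",
]


def lines_matching(template, lines):
    """Return the log lines that the template matches from start to end."""
    return [line for line in lines if template.fullmatch(line)]


matched = lines_matching(LOG_TEMPLATE, log_lines)
print(matched)  # → ['time=1, Trying to launch, TaskID=task_1']
```

Only the first line was produced by the logging statement behind this template; the other two would be matched by the templates of other logging statements.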
23
Step 3: Attaching development knowledge to logs
Version control system: code commit, source code, code comments, call graph
Issue tracking system: issue reports
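One way to picture Step 3 is as a lookup from a matched log template to a record of its development knowledge. The Python sketch below is an assumption about how such a record could be organized, not the authors' implementation: the class DevelopmentKnowledge, the index KNOWLEDGE_INDEX, and the function attach_knowledge are hypothetical names, and the field values are quoted from the "fetch failure" example in this deck.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DevelopmentKnowledge:
    """Hypothetical record of the five knowledge sources for one log template."""
    source_code: str = ""
    code_comments: str = ""
    call_graph: str = ""
    code_commits: List[str] = field(default_factory=list)
    issue_reports: List[str] = field(default_factory=list)


# Hypothetical index keyed by log template (a regular expression).
KNOWLEDGE_INDEX: Dict[str, DevelopmentKnowledge] = {
    r"fetch failure": DevelopmentKnowledge(
        source_code="method checkAndInformJobTracker of file ShuffleScheduler.java",
        code_comments=(
            "Notify the JobTracker after every read error, if "
            "'reportReadErrorImmediately' is true or after every "
            "'maxFetchFailuresBeforeReporting' failures"
        ),
        call_graph="called by method copyFailed in class ShuffleScheduler",
        code_commits=[
            "Allow shuffle retries and read-error reporting to be "
            "configurable. Contributed by Amareshwari Sriramadasu."
        ],
        issue_reports=["MAPREDUCE-1171"],
    ),
}


def attach_knowledge(
    log_line: str, index: Dict[str, DevelopmentKnowledge]
) -> Optional[DevelopmentKnowledge]:
    """Return the knowledge of the first template found in the log line."""
    for template, knowledge in index.items():
        if re.search(template, log_line):
            return knowledge
    return None  # no template matched, so nothing can be attached


knowledge = attach_knowledge("fetch failure", KNOWLEDGE_INDEX)
print(knowledge.issue_reports)  # → ['MAPREDUCE-1171']
```

Building the index itself (mining the version control system and the issue tracking system) is the substantial part of the approach and is not shown here.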
24
Resolving real-life log inquiries: Can development knowledge help resolve real-life inquiries?
Complementing logging statements: Can development knowledge complement logging statements?
25
We compare our approach against Google and mailing lists for resolving real-life log inquiries
Real-life inquiries
26
Our approach outperforms Google and is comparable to mailing lists in resolving log inquiries
[Figure: bar chart; y-axis: percentage of resolved log inquiries (0% to 80%).]
27
Our approach provides 62% of inquired log information
[Figure: bar chart of the percentage (0% to 100%) of each type of inquired information provided by our approach; types: Meaning, Cause, Context, Solution, Impact.]
28
Resolving log inquiries: Can development knowledge help resolve real-life inquiries? YES!
Complementing logging statements: Can development knowledge complement logging statements?
30
We complement a random sample of logging statements using our approach
Zookeeper
300 randomly sampled logging statements
31
Development knowledge can complement logging statements
[Figure: bar chart of the percentage (0 to 100) of logging statements complemented by our approach, by information type (meaning, cause, context, solution, impact), for Hadoop, Cassandra, and Zookeeper.]
Issue reports are the best development knowledge to complement logging statements.
32
Resolving log inquiries: Can development knowledge help resolve real-life inquiries? YES!
Complementing logging statements: Can development knowledge complement logging statements? YES!
33
Practitioners have challenges in understanding log lines
Fetch failure
What exactly does this message mean?
… could this be the cause?
Is it affecting my data?
35
Inquiries about logs ask for 5 types of information
Meaning: What exactly does this message mean?
Cause: … could this be the cause?
Impact: Is it affecting my data?
Solution: It will be great if some one can point to the direction how to solve this?
Context: When does this occur?
37
We propose to attach development knowledge to logs
Code commit
Issue reports
Source code
Call graph
Code comments
39
Resolving log inquiries: Can development knowledge help resolve real-life inquiries? YES!
Complementing logging statements: Can development knowledge complement logging statements? YES!