Extracting Hidden Information From Application Log Files. [email protected] Dag Nygaard. Testify AS, Norway.


Extracting Hidden Information From Application Log Files

[email protected]

Dag Nygaard.

Testify AS, Norway.


Agenda

• Who am I. Disclaimer. Background for this topic.

• Application Logging 001: Primary usage, how and with what tools.

• Secondary usage: Audit, security and system quality.

• What are testers looking for.

• How to find what we are looking for.

• Analysis of findings.

• Challenges related to analyzing application log files.

• The key points.

• Questions!


CV

• Cand. scient. in Computer Science at University of Oslo, Norway.

• Worked as a programmer on consulting contracts since 1996.

• Shifted focus to CI and testing 5 years ago, often as a test team hacker.

• Senior Test Engineer at Testify AS.

• Current project: test developer at Skatteetaten, The Norwegian Tax Administration.


Disclaimer

Examples in these slides are anonymized unless considered public domain.

Analysis of application log files requires that logging is implemented in the runtime environment and that the tester has access to the log files.

Analyzing application log files is an exploratory testing track. There are few absolute truths. What is OK in one instance is a serious bug in another setting.


Background for this Talk

User acceptance test project.

• No access to code.

• Access to documentation, application GUI, database and application log files.

• Suspicion of not too good quality SW based upon error messages up front and timing issues.


Application Logging

A sample from an application log file generated by a logging library such as Log4j might look like this:

Each line is divided into 4 fields:

Timestamp | Severity | Module/Class | Message

12.07.2016 09:56:29 | INFO | no.testify.petstore.ComCenter.RemoteDriver | Stream closed to device '42'. Stream was open for 00:00:01.3281274
12.07.2016 09:56:29 | DEBUG | no.testify.petstore.ComCenter.ConnectionManager | Removed connection initiated from 127.0.0.1:49182.
12.07.2016 09:56:29 | ERROR | no.testify.petstore.ComCenter.CommandDispatcher | Could not process push connection from end point (127.0.0.1:49182): The network stream has been closed.
12.07.2016 09:56:29 | INFO | no.testify.petstore.ComCenter.DataAccess | Added 1 entities to database.
12.07.2016 09:57:05 | INFO | no.testify.petstore.ComCenter.SocketListener | New incoming connection received from 127.0.0.1:49183
12.07.2016 09:57:06 | DEBUG | no.testify.petstore.ComCenter.DataAccess | Getting module with address '2001915'.
12.07.2016 09:57:06 | DEBUG | no.testify.petstore.ComCenter.DataAccess | Batch filled with 1 StatisticRowEntity items, 0 items left in queue.
12.07.2016 09:57:06 | ERROR | no.testify.petstore.ComCenter.RemoteDriver | Received an open session request with an unknown network address: 2001915. Disposing the driver.
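Splitting such a line back into its four fields is the first step of most analyses that follow. Below is a minimal Python sketch. It assumes the timestamp is day.month.year, and the names `LogEntry` and `parse_line` are made up for this illustration; they are not part of any logging library.

```python
from datetime import datetime
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: datetime
    severity: str
    module: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one 'Timestamp | Severity | Module/Class | Message' line."""
    # Split on the first three pipes only, so a '|' inside the message survives.
    parts = [p.strip() for p in line.split("|", 3)]
    if len(parts) != 4:
        return None  # e.g. a stack trace continuation line
    try:
        # Assumption: dates are written day.month.year.
        ts = datetime.strptime(parts[0], "%d.%m.%Y %H:%M:%S")
    except ValueError:
        return None
    return LogEntry(ts, parts[1].upper(), parts[2], parts[3])


sample = ("12.07.2016 09:57:05 | INFO | no.testify.petstore.ComCenter.SocketListener"
          " | New incoming connection received from 127.0.0.1:49183")
entry = parse_line(sample)
print(entry.severity, entry.module.rsplit(".", 1)[-1], entry.message)
```

Lines that do not fit the pattern return `None` rather than raising, because real log files usually contain stack traces and other multi-line entries.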


Log Configurations

• Severity Threshold: The severity is hierarchical:

ALL < DEBUG < INFO < WARN < ERROR < FATAL < OFF

All entries at the set threshold or higher are written to the log file.

• Rolling mechanism: Set how the log file is split - no split, daily or by size.
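The threshold rule can be sketched in a few lines of Python. This only illustrates the ordering shown on this slide; it is not how any particular logging library implements it, and the name `is_written` is made up.

```python
# The hierarchy exactly as listed on the slide, lowest first.
LEVELS = ["ALL", "DEBUG", "INFO", "WARN", "ERROR", "FATAL", "OFF"]
RANK = {name: position for position, name in enumerate(LEVELS)}


def is_written(entry_severity: str, threshold: str) -> bool:
    """True if an entry of this severity passes the configured threshold."""
    return RANK[entry_severity] >= RANK[threshold]


for severity in ("DEBUG", "INFO", "WARN", "ERROR"):
    print(severity, is_written(severity, "WARN"))
```

With the threshold at WARN, the DEBUG and INFO entries are dropped and the WARN and ERROR entries are kept. With the threshold at OFF, nothing is written.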


Changing Log Level

Changing the threshold level in the configuration changes what is logged.

• Without need to recompile the code.

• Without need to redeploy, only restart.

The logging library is designed so that log statements can stay in the shipped binary at little performance cost, with a focus on speed, reliability and flexibility.


Primary Usage

The primary use of the logging library is to:

• Debug code.

• Audit the system.

An evolution of programming practice: from embedding System.out.print(msg) in runtime code to calling log.debug(msg) and choosing after deployment what to print.
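The same idea can be shown with Python's standard `logging` module, used here as a stand-in for Log4j. In this sketch the application statements never change; only the threshold does. The logger name `petstore.demo` and the helper `run_with_threshold` are made up for the illustration.

```python
import io
import logging


def run_with_threshold(level_name: str) -> str:
    """Run the same code under a given threshold and return what was logged."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s | %(name)s | %(message)s"))
    log = logging.getLogger("petstore.demo")  # hypothetical logger name
    log.handlers.clear()
    log.addHandler(handler)
    log.propagate = False
    log.setLevel(getattr(logging, level_name))
    # The "application code" below is identical on every run.
    log.debug("Getting module with address '2001915'.")
    log.info("Added 1 entities to database.")
    log.error("The network stream has been closed.")
    return stream.getvalue()


print(run_with_threshold("DEBUG"))  # all three entries
print(run_with_threshold("ERROR"))  # only the ERROR entry
```

In a real deployment the level would normally come from a configuration file rather than a function argument.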


Test Usage!

Test: Execute a test that triggers some error event.

Assert: Confirm that the event is logged in the log file with an appropriate severity level and a meaningful message.
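The assert step could be sketched like this in Python, assuming the four-field pipe format from the earlier sample. The helper `find_entries` is hypothetical, and in a real test the lines would be read from the log file after the test action has run.

```python
import re


def find_entries(log_lines, severity, message_pattern):
    """Return the log lines with the given severity whose message matches."""
    wanted = re.compile(message_pattern)
    hits = []
    for line in log_lines:
        parts = [p.strip() for p in line.split("|", 3)]
        if len(parts) == 4 and parts[1].upper() == severity and wanted.search(parts[3]):
            hits.append(line)
    return hits


# Stand-in for the lines written while the test ran.
log_lines = [
    "12.07.2016 09:57:06 | DEBUG | no.testify.petstore.ComCenter.DataAccess"
    " | Getting module with address '2001915'.",
    "12.07.2016 09:57:06 | ERROR | no.testify.petstore.ComCenter.RemoteDriver"
    " | Received an open session request with an unknown network address: 2001915."
    " Disposing the driver.",
]

hits = find_entries(log_lines, "ERROR", r"unknown network address: 2001915")
assert len(hits) == 1, "expected exactly one ERROR entry for the triggered event"
```

Whether the message is meaningful still needs a human judgment; the check only confirms that the event was logged at the expected severity.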


Other uses – Audit

General purpose audit
12.07.2016 09:57:06 | INFO | no.testify.petstore.ComCenter.DataAccess | Value saved, Id=312, operator=JD

Progress audit
12.07.2016 09:57:05 | INFO | no.testify.petstore.ComCenter.SocketListener | Received packet 32 of 256, receiverId=412

Component audit
12.07.2016 09:56:29 | INFO | no.testify.petstore.ComCenter | ComCenter version 3.14 started on 127.0.0.1

Security audit
12.07.2016 09:57:06 | WARN | no.testify.petstore.UserAdm | User 'JohnDoe' failed to login


Analyze: Counting and Grouping

Search: How many log file entries with severity Critical, Error and Warning are found?

Group per hour, per 24-hour period, per week, per month.

Do the errors come evenly dispersed or do they peak at certain periods of time, such as around backup time or lunchtime?

Group the log entries by similar message types, i.e. messages that differ only in timestamp and id, and then recount.

Do the numbers appear reasonable?
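The counting and grouping above could be sketched as follows in Python. The line pattern assumes the four-field format from the earlier sample, and `normalize` is a deliberately crude, hypothetical way of making messages that differ only in ids, addresses or quoted values look the same.

```python
import re
from collections import Counter
from datetime import datetime

LINE = re.compile(
    r"^(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) \| *(\w+) *\| *(\S+) *\| *(.*)$"
)


def normalize(message):
    """Replace quoted values and numbers so similar messages group together."""
    message = re.sub(r"'[^']*'", "'<val>'", message)
    return re.sub(r"\d+(?:[.:]\d+)*", "<n>", message)


def summarize(lines, severities=("CRITICAL", "FATAL", "ERROR", "WARN", "WARNING")):
    """Count matching entries per hour and per normalized message."""
    per_hour, per_message = Counter(), Counter()
    for line in lines:
        m = LINE.match(line)
        if not m or m.group(2).upper() not in severities:
            continue
        ts = datetime.strptime(m.group(1), "%d.%m.%Y %H:%M:%S")
        per_hour[ts.strftime("%Y-%m-%d %H:00")] += 1
        per_message[(m.group(2).upper(), normalize(m.group(4)))] += 1
    return per_hour, per_message


lines = [
    "12.07.2016 09:56:29 | ERROR | no.testify.petstore.ComCenter.CommandDispatcher"
    " | Could not process push connection from end point (127.0.0.1:49182):"
    " The network stream has been closed.",
    "12.07.2016 09:58:11 | ERROR | no.testify.petstore.ComCenter.CommandDispatcher"
    " | Could not process push connection from end point (127.0.0.1:49190):"
    " The network stream has been closed.",
    "12.07.2016 09:56:29 | INFO | no.testify.petstore.ComCenter.DataAccess"
    " | Added 1 entities to database.",
]
per_hour, per_message = summarize(lines)
for key, count in per_message.most_common():
    print(count, key)
```

Changing the `strftime` pattern to a day, week or month gives the coarser groupings.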


Counting and Grouping

Use dashboards for high visibility!


Grouping – by periods of time

Grouping and displaying results graphically aids in locating problem times.


Continuing - Exploratory Testing

• We use the log file entries as system indicators.

• Logged system faults that are expected confirm test results.

• Unexpected log entries indicate faults that need to be reported and followed up.

• Other log file entries will hopefully give us a feeling of good quality or help us create tests that confirm suspicion of lesser quality software.


Search and Grouping Tools

• All text editors have search functionality.

• BareTail is used by some testers and developers.

• The Unix/Cygwin commands grep and wc are excellent for ad hoc analysis.

• Splunk, Logstash and similar tools collect and index log files and make searches faster, but they begin to cost money and resources.


Message Content

Having grouped the log file entries by similarities in the message field: do the messages convey something meaningful, and with the correct severity?

12.07.2016 09:56:29 | WARN | no.testify.petstore.ComCenter | A critical error occurred
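A mismatch like the one above, where the message says "critical" but the entry is logged as WARN, can be hunted for mechanically. The Python sketch below is a naive heuristic: the word lists are assumptions, the function name is made up, and every hit needs a human look.

```python
# Words in the message text that hint at a severity (assumed, not exhaustive).
SEVERITY_WORDS = {
    "FATAL": ("fatal", "critical"),
    "ERROR": ("error", "exception"),
}
ORDER = ["DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def suspicious_severity(line):
    """Return the severity the message text implies if it is higher than the
    severity the entry was logged with, otherwise None."""
    parts = [p.strip() for p in line.split("|", 3)]
    if len(parts) != 4 or parts[1].upper() not in ORDER:
        return None
    logged = parts[1].upper()
    text = parts[3].lower()
    for implied in ("FATAL", "ERROR"):  # check the strongest hint first
        if any(word in text for word in SEVERITY_WORDS[implied]):
            if ORDER.index(logged) < ORDER.index(implied):
                return implied
            return None
    return None


line = "12.07.2016 09:56:29 | WARN | no.testify.petstore.ComCenter | A critical error occurred"
print(suspicious_severity(line))
```

The heuristic says nothing about whether the message is informative; "A critical error occurred" would still be flagged as vague only by a reader.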


Points of Interest - Negative

Not necessarily bugs but something to be aware of:

• Log file entries with no useful information.

• Extreme logging volumes: disk space and cleanup issues, slower searches, and a small per-entry logging overhead that adds up.

• Several log file entries may duplicate the same information from different sources.


Point of interest – Progress Data

Practical extraction example:

12.07.2016 09:57:05 | INFO | no.testify.petstore.ComCenter.SocketListener | Received packet 32 of 256, receiverId=412

• Group by unique receiverId, sorted by descending timestamp.
  Answer to: How many receivers are downloading something, by the hour or by 24-hour period?

• Group by receiverId and highest packet number received, sorted by descending timestamp.
  Answer to: What is the max, min and average download progress for some period of time?
  Answer to: How many have received all packets and finished in the past hour?
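The extraction could be sketched like this in Python. The pattern is taken from the sample entry above, the function name is made up, and restricting the input to "the past hour" is left out to keep the sketch short.

```python
import re
from collections import defaultdict

PROGRESS = re.compile(r"Received packet (\d+) of (\d+), receiverId=(\d+)")


def download_progress(lines):
    """Map receiverId to the highest fraction of packets received so far."""
    best = defaultdict(float)
    for line in lines:
        m = PROGRESS.search(line)
        if m:
            got, total, receiver = int(m.group(1)), int(m.group(2)), m.group(3)
            if total > 0:
                best[receiver] = max(best[receiver], got / total)
    return dict(best)


prefix = "12.07.2016 09:57:05 | INFO | no.testify.petstore.ComCenter.SocketListener | "
lines = [
    prefix + "Received packet 32 of 256, receiverId=412",
    prefix + "Received packet 64 of 256, receiverId=412",
    prefix + "Received packet 256 of 256, receiverId=413",
]
progress = download_progress(lines)
values = list(progress.values())
done = [receiver for receiver, fraction in progress.items() if fraction >= 1.0]
print("receivers:", len(progress))
print("min/max/avg:", min(values), max(values), sum(values) / len(values))
print("finished:", done)
```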


Serious Issues

Key words that should mean serious-level bugs unless handled properly:

• OutOfMemoryException

• IndexOutOfBounds

• NullPointerException

• ParameterOutOfRange

• SQLException - database not responding
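A quick scan for these key words could look like the Python sketch below. Two liberties are taken and should be read as assumptions: "OutOfMemory" is used as a prefix so that Java's OutOfMemoryError is caught as well, and matching ignores case so that .NET's SqlException also matches.

```python
import re
from collections import Counter

KEYWORDS = [
    "OutOfMemory",
    "IndexOutOfBounds",
    "NullPointerException",
    "ParameterOutOfRange",
    "SQLException",
]
PATTERN = re.compile("|".join(map(re.escape, KEYWORDS)), re.IGNORECASE)
CANONICAL = {keyword.lower(): keyword for keyword in KEYWORDS}


def count_serious(lines):
    """Count occurrences of each key word across all lines."""
    counts = Counter()
    for line in lines:
        for hit in PATTERN.findall(line):
            counts[CANONICAL[hit.lower()]] += 1
    return counts


lines = [
    'Exception in thread "worker" java.lang.OutOfMemoryError: Java heap space',
    "---> System.Data.SqlClient.SqlException: Cannot insert duplicate key row",
    "12.07.2016 09:56:29 | INFO | no.testify.petstore.ComCenter.DataAccess | Added 1 entities to database.",
]
print(count_serious(lines))
```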


OutOfMemory example

Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOfRange(Arrays.java:3210)
    at java.lang.String.<init>(String.java:216)
    at java.lang.StringBuffer.toString(StringBuffer.java:585)
    at org.netbeans.lib.profiler.server.ProfilerRuntimeMemory.traceVMObjectAlloc(ProfilerRuntimeMemory.java:170)
    at java.lang.Throwable.getStackTraceElement(Native Method)
    at java.lang.Throwable.getOurStackTrace(Throwable.java:590)
    at java.lang.Throwable.getStackTrace(Throwable.java:582)
    at org.apache.juli.logging.DirectJDKLog.log(DirectJDKLog.java:155)
    at org.apache.juli.logging.DirectJDKLog.error(DirectJDKLog.java:135)
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1603)
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1610)
    at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1590)
    at java.lang.Thread.run(Thread.java:619)
Exception in thread "*** JFluid Monitor thread ***" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2760)
    at java.util.Arrays.copyOf(Arrays.java:2734)
    at java.util.Vector.ensureCapacityHelper(Vector.java:226)
    at java.util.Vector.add(Vector.java:728)
    at org.netbeans.lib.profiler.server.Monitors$SurvGenAndThreadsMonitor.updateSurvGenData(Monitors.java:230)
    at org.netbeans.lib.profiler.server.Monitors$SurvGenAndThreadsMonitor.run(Monitors.java:169)
Nov 30, 2009 2:22:05 PM org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor processChildren
SEVERE: Exception invoking periodic operation: java.lang.OutOfMemoryError: Java heap space
    at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
    at java.lang.StringCoding.encode(StringCoding.java:272)
    at java.lang.String.getBytes(String.java:946)
    at java.io.UnixFileSystem.getLastModifiedTime(Native Method)
    at java.io.File.lastModified(File.java:826)
    at org.apache.catalina.startup.HostConfig.checkResources(HostConfig.java:1175)


Unhandled Exception with Stacktrace

06.11.2017 09:46:43 | Error | no.testify.petstore.ComCenter.StoreWriter | Data access error while inserting FooValue into the database: Cannot insert duplicate key row in object 'dbo.FooValue' with unique index 'IX_FooValue_Type_Timestamp_SystemSequenceNumber_EntryId'. The duplicate key value is (131168, 1, 2017-05-11 09:40:33, 199286, 35000-2).
The statement has been terminated.
Exception: no.testify.petstore.ComCenter.Exceptions.DataAccessException: Error while trying to add entities. ---> System.Data.SqlClient.SqlException: Cannot insert duplicate key row in object 'dbo.FooValue' with unique index 'IX_FooValue_Type_Timestamp_SystemSequenceNumber_EntryId'. The duplicate key value is (131168, 1, 2017-11-06 09:40:33, 199286, 35000-2).
The statement has been terminated.
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
   at System.Data.Entity.Infrastructure.Interception.InternalDispatcher`1.Dispatch[TTarget,TInterceptionContext,TResult](TTarget target, Func`3 operation, TInterceptionContext interceptionContext, Action`3 executing, Action`3 executed)
   at System.Data.Entity.Infrastructure.Interception.DbCommandDispatcher.Reader(DbCommand command, DbCommandInterceptionContext interceptionContext)
   at System.Data.Entity.Core.Mapping.Update.Internal.DynamicUpdateCommand.Execute(Dictionary`2 identifierValues, List`1 generatedValues)
   at System.Data.Entity.Core.Mapping.Update.Internal.UpdateTranslator.Update()
   --- End of inner exception stack trace ---
   at no.testify.petstore.ComCenter.EntityServiceBase`1.Add(Int32 customerId, TEntity entity)
   at no.testify.petstore.ComCenter.DataWriters`2.HandleFooValueDataUsingSequenceNumbers(Int32 customerId, TLogEntryEntity fooValueData)
   at no.testify.petstore.ComCenter.DataWriters`2.InsertFooValueDatas(Int32 customerId, List`1 fooValues)

Inner exception: System.Data.SqlClient.SqlException (0x80131904): Cannot insert duplicate key row in object 'dbo.FooValue' with unique index 'IX_FooValue_Type_Timestamp_SystemSequenceNumber_EntryId'. The duplicate key value is (131168, 1, 2017-11-06 09:40:33, 199286, 35000-2).
The statement has been terminated.
   at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString, Boolean isInternal, Boolean forDescribeParameterEncryption)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, Boolean inRetry, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean& usedCache, Boolean asyncWrite, Boolean inRetry)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
   at System.Data.Entity.Infrastructure.Interception.InternalDispatcher`1.Dispatch[TTarget,TInterceptionContext,TResult](TTarget target, Func`3 operation, TInterceptionContext interceptionContext, Action`3 executing, Action`3 executed)
   at System.Data.Entity.Infrastructure.Interception.DbCommandDispatcher.Reader(DbCommand command, DbCommandInterceptionContext interceptionContext)
   at System.Data.Entity.Core.Mapping.Update.Internal.DynamicUpdateCommand.Execute(Dictionary`2 identifierValues, List`1 generatedValues)


More Content - Exceptions

Public stack traces can be embarrassing.

Usually we only find them in the log files.

Not a confidence booster...

[Image: snapshot of a stack trace on the screen of a faucet(!) in a casino in Las Vegas.]


Service keeps restarting

12.07.2016 09:56:29 | INFO | no.testify.petstore.ComCenter | ComCenter version 3.14 started on 127.0.0.1

This log file entry seems innocent. But what if it is found 500 times each day and you have no idea why the service keeps restarting? You have an issue!

Also, have we lost or corrupted any data? Are there more services that have stopped?
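Counting start-up entries per day is enough to surface this. In the Python sketch below the message pattern is copied from the sample entry, and the limit of 3 starts per day is an arbitrary placeholder that would have to be agreed for the system at hand.

```python
import re
from collections import Counter

STARTED = re.compile(
    r"^(\d{2}\.\d{2}\.\d{4}) \d{2}:\d{2}:\d{2} \|.*\bversion \S+ started on\b"
)


def starts_per_day(lines):
    """Count 'started' entries per calendar day."""
    return Counter(m.group(1) for m in map(STARTED.match, lines) if m)


def restart_storms(lines, max_starts_per_day=3):
    """Return the days with more starts than the (assumed) acceptable limit."""
    return {day: n for day, n in starts_per_day(lines).items() if n > max_starts_per_day}


started = " | INFO | no.testify.petstore.ComCenter | ComCenter version 3.14 started on 127.0.0.1"
lines = ["12.07.2016 09:5%d:29%s" % (i, started) for i in range(5)]
lines.append("13.07.2016 08:00:00" + started)
print(restart_storms(lines))
```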


Password Guessing or Just Forgetful

12.07.2016 09:57:06 | WARN | no.testify.petstore.UserAdm | User 'JohnDoe' failed to login

1-3 of these are OK. With 5000 in 5 minutes, you have reason to believe the application is being attacked.

Not a bug, but time to add more security.
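Telling the forgetful user from the attacker comes down to counting failures inside a time window. The Python sketch below does this per user. The limit of 3 and the 5-minute window echo the numbers on this slide but are still assumptions, and the lines are assumed to be in chronological order.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

FAILED = re.compile(
    r"^(\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) \|.*User '([^']+)' failed to login"
)


def login_bursts(lines, limit=3, window=timedelta(minutes=5)):
    """Return users with more than `limit` failed logins inside any `window`."""
    recent = defaultdict(deque)
    flagged = set()
    for line in lines:  # assumes chronological order
        m = FAILED.match(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%d.%m.%Y %H:%M:%S")
        attempts = recent[m.group(2)]
        attempts.append(ts)
        # Drop attempts that have fallen out of the sliding window.
        while ts - attempts[0] > window:
            attempts.popleft()
        if len(attempts) > limit:
            flagged.add(m.group(2))
    return flagged


template = "12.07.2016 09:57:0%d | WARN | no.testify.petstore.UserAdm | User '%s' failed to login"
lines = [template % (i, "JohnDoe") for i in range(4)]
lines += [template % (i, "JaneDoe") for i in range(2)]
print(login_bursts(lines))
```

An attack that spreads its guesses over many user names would slip past a per-user count; counting all failures in the window, regardless of user, is a natural second check.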


General Data Protection Regulation

Breaches of GDPR are another log file content issue that should be reported as a bug when found.

Custom scripts for searching for social security numbers and other personal data can be found on the Internet. Use these scripts to catch breaches.
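As a rough idea of what such a script does, here is a naive Python sketch. It assumes the identity numbers of interest have the shape of a Norwegian national identity number (11 digits starting with a plausible DDMMYY date) and it also looks for e-mail addresses. There is no checksum validation, so false positives are expected, and the sample values are invented.

```python
import re

# 11 digits starting with a plausible day and month. Shape check only.
ID_LIKE = re.compile(r"(?<!\d)(0[1-9]|[12]\d|3[01])(0[1-9]|1[0-2])\d{7}(?!\d)")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def find_personal_data(lines):
    """Return (line number, kind, matched text) for each suspicious value."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for kind, pattern in (("id-number", ID_LIKE), ("email", EMAIL)):
            for m in pattern.finditer(line):
                findings.append((number, kind, m.group(0)))
    return findings


prefix = "12.07.2016 09:57:06 | INFO | no.testify.petstore.ComCenter.DataAccess | "
lines = [
    prefix + "Value saved for customer 01017012345",       # invented number
    prefix + "Receipt sent to john.doe@example.com",       # invented address
    prefix + "Getting module with address '2001915'.",
]
for finding in find_personal_data(lines):
    print(finding)
```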


Batch Processing

These two messages are not from the same issue, but from a real test:

• Level: Information. Added 1 entities to database.

• Level: Warning. Duplicate found while inserting a batch of 1 FooValue into the database. Attempting one FooValue at a time.

The key is the number: 1.

Not a bug, but it smells of big overhead: a possible resource hog that may be a performance issue.
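To judge whether batches of 1 are the exception or the rule, the batch sizes can be pulled out of the log and counted. A small Python sketch, with the pattern taken from the first message above and function names made up for the illustration:

```python
import re
from collections import Counter

ADDED = re.compile(r"Added (\d+) entities to database")


def batch_sizes(lines):
    """Count how often each batch size occurs."""
    return Counter(int(m.group(1)) for line in lines for m in ADDED.finditer(line))


def share_of_single_item_batches(lines):
    """Fraction of all batches that contained exactly one entity."""
    sizes = batch_sizes(lines)
    total = sum(sizes.values())
    return sizes[1] / total if total else 0.0


prefix = "12.07.2016 09:56:29 | INFO | no.testify.petstore.ComCenter.DataAccess | "
lines = [prefix + "Added 1 entities to database."] * 3
lines.append(prefix + "Added 50 entities to database.")
print(batch_sizes(lines), share_of_single_item_batches(lines))
```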


Wrong Severity

Wrong severity on a log file entry is not necessarily a bug, but it does affect the perceived quality.

Level: ERROR

Session timed out, userId='JohnDoe'.

Too low a severity may be more serious.

Too many entries with too high a severity are noise pollution. With too much noise pollution, testers might overlook serious issues.


Challenges

• We are using brittle data. Log file entries and the texts in them are often not based on the requirements. The texts might change from version to version.

• Extreme volumes at performance cost.

• Disk space and cleanup issues.

• Incidents found are often not functional bugs, only hints of sloppy coding that look ugly and do not inspire confidence.

• Quick and dirty fixes to bugs related only to log file entries: changing the logging threshold or just removing the debug statements, i.e. removing the evidence.

• Lack of project status and support in the development team.

• Reporting what might be a non-functional bug could be touchy. Be careful!


Key Points

• Get logging into the requirements:

  • Consistent logging format on all applications.

  • Same level of logging on similar issues.

  • No missing exception handling.

  • Agree on the log file threshold that will be used after the acceptance period.

  • Acceptable count on CRITICAL, ERROR and WARNING levels.

• Get Maintenance involved: agree on which threshold they need to act upon.

• Not all ERRORs are bugs, but too many of them should be a warning of possible quality issues.

• Few absolute truths, but possibly many indications that might need to be acted upon.


Questions?