Regular Expressions and XML Parsing
Objectives
After this session you should be able to:
Understand and write Regular Expressions
Create XML code that will use Regular Expressions to parse data from providers into parameters
Regular Expressions
Used to parse and analyze fields
Designed for matching text items
Requires extremely precise syntax
Regular Expression Overview
Can be used in:
− Rule criteria
− View criteria
− Computer Group formulas
− XML / parameter parsing
Popular Boost.org regular expression parser
Perl-like regular expression syntax
Features include:
− Advanced text pattern matching
− Timestamp conversion
− UI support
− Syslog IP filtering
Note: The regex syntax used in Rules, Views and Computer Groups is not the same as the syntax used in parsers
Regular Expression Example
Regular Expression = ^World\s+.*
“^” means “Start of Line”
“^World” means the text line must begin with “World”
“\s+” means one or more whitespace characters
“.*” is a wildcard that will match anything else
Matches:
“World Wide Web Publishing Service”
“World with lots of space”
“World Class”
“World War”
Does not match:
“WorldWide”
“Wide World of Sports”
“Wayne’s World”
“War of the Worlds”
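The match lists above can be checked with a short sketch. It uses Python's `re` module as a stand-in for the Boost parser, which is an assumption; for this particular expression the two engines share the same syntax.

```python
import re

# Pattern copied from the slide.
pattern = re.compile(r"^World\s+.*")

should_match = [
    "World Wide Web Publishing Service",
    "World with lots of space",
    "World Class",
    "World War",
]
should_not_match = [
    "WorldWide",
    "Wide World of Sports",
    "Wayne's World",
    "War of the Worlds",
]

for text in should_match + should_not_match:
    # match() anchors at the start of the string, like "^".
    print(f"{text!r}: {bool(pattern.match(text))}")
```

"WorldWide" fails because `\s+` requires at least one whitespace character after "World".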
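Regular Expression Example #2
Regular Expression = ^\s+(TCP|ICMP)\s+\d+.\d+.\d+.\d+:?(\d+)?
“^” means “Start of Line”
“\s+” means “one or more whitespace characters”
“(TCP|ICMP)” means the literal word “TCP” or “ICMP” must be present. The parentheses are required: without them, “|” splits the entire expression into two alternatives
“\d+.\d+.\d+.\d+:?(\d+)?” means a field with up to 5 numeric parts, separated by periods and a colon; the colon and the 5th part may or may not exist. Writing the last part as “(\d+)?” makes it optional, whereas “\d+?” would be a non-greedy match that still requires at least one digit
An unescaped “.” matches any single character; use “\.” to match only a literal period
Since each numeric part has a “+”, it can be one or more consecutive digits
Matches (each line begins with whitespace, as “^\s+” requires):
  TCP 192.168.1.1:80
  ICMP 192.168.1.1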
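How much the grouping matters in an expression like Example #2 can be seen in a small sketch. It compares a variant without parentheses around the alternation (and with a non-greedy `\d+?`) against a variant with `(TCP|ICMP)` and an optional `(\d+)?`. Python's `re` module stands in for the Boost parser here, and the third sample line is made up for illustration.

```python
import re

# Without parentheses, "|" splits the whole expression into two alternatives:
#   ^\s+TCP      or      ICMP\s+\d+.\d+.\d+.\d+:?\d+?
ungrouped = re.compile(r"^\s+TCP|ICMP\s+\d+.\d+.\d+.\d+:?\d+?")

# With parentheses, only the protocol name varies and the port is optional.
grouped = re.compile(r"^\s+(TCP|ICMP)\s+\d+.\d+.\d+.\d+:?(\d+)?")

lines = [
    "  TCP    192.168.1.1:80",
    "  ICMP   192.168.1.1",
    "  TCP    not-an-address",   # hypothetical line that should be rejected
]

for line in lines:
    print(f"{line!r}: ungrouped={bool(ungrouped.search(line))} "
          f"grouped={bool(grouped.search(line))}")
```

The ungrouped variant accepts the last line because its first alternative only asks for leading whitespace followed by "TCP", and it rejects the ICMP line because `\d+?` still demands a fifth run of digits.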
Regular Expression Example #3
Regular Expression = [^\,]*
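Matches every character up to, but not including, the next comma. Any other delimiter character can be used in place of the comma.
Useful for matching data within a given sub-expression that can vary greatly
Matches: a single comma-delimited field (shown in red on the original slide), such as 250606001E05, in the line below:
250606001E05,25,2,8,HOUAV03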
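A minimal sketch of the negated character class on the sample line, using Python's `re` module as a stand-in for the Boost parser (an assumption). The `findall` variant with `+` is an addition for illustration and is not from the slide.

```python
import re

line = "250606001E05,25,2,8,HOUAV03"

# [^\,]* matches the run of non-comma characters at the current position.
first_field = re.match(r"[^\,]*", line).group(0)

# Using + (one or more) with findall walks through every field in turn.
all_fields = re.findall(r"[^\,]+", line)

print(first_field)
print(all_fields)
```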
Regular Expression Operators and Their Definitions
Menu Item                 Character   Definition
Any Character             .           Matches any single character
Character in Range        [ ]         Matches any single character from within the bracketed list
Character Not in Range    [^]         Specifies a set of characters not to be matched
Beginning of Line         ^           Matches the beginning of a line
End of Line               $           Matches the end of a line
Special Characters
Special Characters include \ ^ $ * . [ ] | + ( )
Any time you want to use a special character as a literal, it must be escaped with a backslash
− Example: The path c:\myfile.txt would need to be entered as c:\\myfile\.txt
− Example: The User-ID $ExchangeService would need to be entered as \$ExchangeService
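The escaping rule can be tried out in a short sketch. It assumes Python 3.7 or later, where `re.escape` escapes only characters that are special in a regular expression; the Boost parser itself has no such helper in the slides, so this is purely an illustration.

```python
import re

path = r"c:\myfile.txt"
user_id = "$ExchangeService"

# re.escape() puts a backslash in front of each special character.
escaped_path = re.escape(path)
escaped_user = re.escape(user_id)
print(escaped_path)
print(escaped_user)

# The escaped expressions match the original text literally.
path_ok = re.fullmatch(escaped_path, path) is not None
user_ok = re.fullmatch(escaped_user, user_id) is not None

# If the "." is left unescaped it matches any character, so this
# look-alike path (hypothetical) is accepted as well.
loose_match = re.fullmatch(r"c:\\myfile.txt", r"c:\myfileXtxt") is not None
print(path_ok, user_ok, loose_match)
```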
Taking Apart the Regular Expression
^\s+   TCP   \s+   \d+   .   \d+   .   \d+   .   \d+   :?   (\d+)?
       TCP         192   .   168   .   106   .   134   :    80
Syntax Must Be Precise
Regular Expression = ^\d.\d.\d.\d:?\d?
Without a “+”, each “\d” matches exactly one digit, so multi-digit addresses are not matched
Matches:
1.2.3.4:5
1.2.3.5
Does Not Match:
192.168.1.20:25
192.168.1.20
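A sketch of the difference a quantifier makes. It assumes the whole line must match the expression (Python's `fullmatch`); with a substring search the single-digit pattern would still find a partial match inside the longer addresses. The second pattern is a hypothetical correction, not taken from the slides.

```python
import re

single_digit = re.compile(r"^\d.\d.\d.\d:?\d?")           # pattern from the slide
multi_digit = re.compile(r"^\d+\.\d+\.\d+\.\d+(:\d+)?")   # hypothetical correction

samples = ["1.2.3.4:5", "1.2.3.5", "192.168.1.20:25", "192.168.1.20"]

for text in samples:
    print(f"{text}: single={bool(single_digit.fullmatch(text))} "
          f"multi={bool(multi_digit.fullmatch(text))}")
```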
Examples of Regular Expressions and Matches
Example        Matches                            Does Not Match
st.n           Austin and Houston                 Webster
st[io]n        Austin and Houston                 Stanton
st[^io]n       Stanton                            Houston or Austin
^houston       Houston                            Sam Houston
ston$          Houston and Galveston              Stonewall
dall|hart      Dallas and Dalhart and Lockhart    Dale
dal(l|h)art    Dalhart                            Dallas or Lockhart
il?e$          Etoile                             Beeville
il*e$          Etoile and Beeville                Bellaire
il+e$          Etoile and Beeville                Wylie
ad{2}          Addison and Caddo                  Adkins
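The table can be verified mechanically. The sketch below makes two assumptions that the table implies but does not state: matching is case-insensitive ("^houston" matches "Houston") and the expression may match anywhere in the text. Python's `re` module stands in for the Boost parser.

```python
import re

# (expression, names that should match, names that should not match)
TABLE = [
    ("st.n",        ["Austin", "Houston"],             ["Webster"]),
    ("st[io]n",     ["Austin", "Houston"],             ["Stanton"]),
    ("st[^io]n",    ["Stanton"],                       ["Houston", "Austin"]),
    ("^houston",    ["Houston"],                       ["Sam Houston"]),
    ("ston$",       ["Houston", "Galveston"],          ["Stonewall"]),
    ("dall|hart",   ["Dallas", "Dalhart", "Lockhart"], ["Dale"]),
    ("dal(l|h)art", ["Dalhart"],                       ["Dallas", "Lockhart"]),
    ("il?e$",       ["Etoile"],                        ["Beeville"]),
    ("il*e$",       ["Etoile", "Beeville"],            ["Bellaire"]),
    ("il+e$",       ["Etoile", "Beeville"],            ["Wylie"]),
    ("ad{2}",       ["Addison", "Caddo"],              ["Adkins"]),
]

def found(expression, text):
    """True if the expression matches anywhere in text, ignoring case."""
    return re.search(expression, text, re.IGNORECASE) is not None

# Collect every table entry that disagrees with the engine.
mismatches = []
for expression, matches, non_matches in TABLE:
    mismatches += [(expression, t) for t in matches if not found(expression, t)]
    mismatches += [(expression, t) for t in non_matches if found(expression, t)]

print(mismatches or "all rows agree")
```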
Regular Expression Tools & Links
Expresso
http://www.ultrapico.com/Expresso.htm
− Helps with the actual writing of regular expressions
Regular expression syntax help
http://www.boost.org/libs/regex/doc/syntax_perl.html
Timestamp format
http://icu.sourceforge.net/userguide/formatDateTime.html
Date Section
<DateTimeMap>
  <TimeStamp>
    <TimeStampSample>2005-9-11T14:18:11 GMT</TimeStampSample>
    <TimeStampFormat>yyyy-MM-dd'T'HH:mm:ss z</TimeStampFormat>
    <TimeStampRE>\d+-\d+-\d+T\d+:\d+:\d+\w+[^|]*</TimeStampRE>
  </TimeStamp>
</DateTimeMap>
<DateTimeFormat>yyyy-MM-dd'T'HH:mm:ss z</DateTimeFormat>
When using a DateTimeMap, your regex code should include the following comment tags:
<!--TimeStampStartTag-->
<!--TimeStampEndTag-->
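The interplay of TimeStampRE and TimeStampFormat can be sketched as follows. The product uses ICU format patterns; the `strptime` format below is an assumed Python counterpart of `yyyy-MM-dd'T'HH:mm:ss z`, and the pipe-delimited event line is hypothetical (only the timestamp comes from the sample).

```python
import re
from datetime import datetime

# TimeStampRE from the XML above.
TIMESTAMP_RE = re.compile(r"\d+-\d+-\d+T\d+:\d+:\d+\w+[^|]*")

# Assumed strptime equivalent of the ICU pattern yyyy-MM-dd'T'HH:mm:ss z.
STRPTIME_FORMAT = "%Y-%m-%dT%H:%M:%S %Z"

# Hypothetical event line; the text after the first "|" is invented.
event_line = "2005-9-11T14:18:11 GMT|myhost|service started"

# Step 1: the regular expression isolates the timestamp text.
raw_timestamp = TIMESTAMP_RE.search(event_line).group(0)

# Step 2: the format string converts that text into a date/time value.
parsed = datetime.strptime(raw_timestamp, STRPTIME_FORMAT)

print(raw_timestamp)
print(parsed.isoformat())
```

The `[^|]*` at the end of the expression is what stops the match at the first pipe character.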
Filter Section
Used to pre-filter high volume Events or unwanted Events
Used to improve Provider performance
Should be as efficient and specific as possible
Sample filter section:
<Filters>
  <RegEx>.*last message repeated\s+\w+\s+times.*</RegEx>
</Filters>
This particular Filter removes UNIX Syslog messages that report the previous message was repeated X times.
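A minimal sketch of pre-filtering with the expression above. The syslog lines are invented for illustration, and treating the filter as a whole-line match (Python's `fullmatch`) is an assumption suggested by the leading and trailing `.*`.

```python
import re

# RegEx from the sample <Filters> section.
FILTER_RE = re.compile(r".*last message repeated\s+\w+\s+times.*")

# Hypothetical syslog messages.
messages = [
    "Sep 11 14:18:11 host1 sshd[412]: Accepted password for admin",
    "Sep 11 14:18:15 host1 last message repeated 3 times",
    "Sep 11 14:19:02 host1 kernel: eth0 link up",
]

# Messages that match the filter are dropped before any Event parsing.
kept = [m for m in messages if not FILTER_RE.fullmatch(m)]

for message in kept:
    print(message)
```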
Event Section
Contains one or more Event matching nodes
An Event node is used to match a particular message and format it in a specific way
Each Event node contains 3 sections:
− Regular Expression section – the RegEx itself
− Instruction section – parameter mapping
− Message section – SM description definition
Event Node Mapping – RegEx Section
<RegEx>^\s+TCP\s+(\d+.\d+.\d+.\d+):?(\d+)?\s+(\d+.\d+.\d+.\d+):?(\d+)?\s+(\w+)</RegEx>
Sub-expressions, numbered left to right: (F 0) local address, (F 1) local port, (F 2) foreign address, (F 3) foreign port, (F 4) status
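How the parenthesized sub-expressions become fields (F 0) through (F 4) can be sketched in Python, used here as a stand-in for the Boost parser. The input line is hypothetical: the local address echoes the earlier example, while the foreign address, port and status are invented.

```python
import re

# RegEx from the Event node above.
EVENT_RE = re.compile(
    r"^\s+TCP\s+(\d+.\d+.\d+.\d+):?(\d+)?\s+(\d+.\d+.\d+.\d+):?(\d+)?\s+(\w+)"
)

# Hypothetical netstat-style line.
line = "  TCP    192.168.106.134:80    10.0.0.5:1025    ESTABLISHED"

match = EVENT_RE.match(line)
fields = match.groups()   # one entry per parenthesized sub-expression

# Python numbers groups from 1; the slides number fields from 0.
for index, value in enumerate(fields):
    print(f"(F {index}) = {value}")
```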
Event Node Mapping – Instruction Section
<Instructions>
  <Field name="$EventSource" source="MYEVTSRC" />
  <Field name="1" source="%0%" />
  <Field name="2" source="%1%" />
  <Field name="3" source="%2%" />
  <Field name="4" source="%4%" />
  <Field name="5" source="" />
  <Field name="6" source="%3%" />
</Instructions>
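The parameter mapping can be sketched by reading the Instructions with Python's standard XML parser and substituting each `%n%` placeholder. This assumes that `%n%` refers to the n-th sub-expression counted from 0; the captured values and the `resolve` helper are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

INSTRUCTIONS_XML = """
<Instructions>
  <Field name="$EventSource" source="MYEVTSRC" />
  <Field name="1" source="%0%" />
  <Field name="2" source="%1%" />
  <Field name="3" source="%2%" />
  <Field name="4" source="%4%" />
  <Field name="5" source="" />
  <Field name="6" source="%3%" />
</Instructions>
""".strip()

# Hypothetical values captured by sub-expressions (F 0) .. (F 4).
captures = ["192.168.106.134", "80", "10.0.0.5", "1025", "ESTABLISHED"]

def resolve(source, captures):
    """Replace each %n% placeholder with the n-th captured value."""
    return re.sub(r"%(\d+)%", lambda m: captures[int(m.group(1))], source)

root = ET.fromstring(INSTRUCTIONS_XML)
parameters = {
    field.get("name"): resolve(field.get("source"), captures)
    for field in root.iter("Field")
}

for name, value in parameters.items():
    print(f"{name} = {value!r}")
```

Literal sources such as MYEVTSRC pass through unchanged, and an empty source leaves the parameter empty.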
Event Node Mapping – Message Section
<Message><![CDATA[
Protocol: TCP
Local Address: %0%
Local Port: %1%
Foreign Address: %2%
Foreign Port: %3%
Status: %4%
]]></Message>
Note: <![CDATA[ ]]> tags tell the code that interprets the XML to ignore the enclosed contents from an XML syntax standpoint.
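A sketch of what the CDATA wrapper buys, using Python's standard XML parser as a stand-in for the product's XML reader. The captured values and the small "a < b && c" test string are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

MESSAGE_XML = """<Message><![CDATA[
Protocol: TCP
Local Address: %0%
Local Port: %1%
Foreign Address: %2%
Foreign Port: %3%
Status: %4%
]]></Message>"""

# Hypothetical values captured by sub-expressions (F 0) .. (F 4).
captures = ["192.168.106.134", "80", "10.0.0.5", "1025", "ESTABLISHED"]

# The CDATA contents arrive as plain text, placeholders intact.
template = ET.fromstring(MESSAGE_XML).text
message = re.sub(r"%(\d+)%", lambda m: captures[int(m.group(1))], template)
print(message)

# Inside CDATA, characters such as "<" and "&" need no escaping...
with_cdata = ET.fromstring("<Message><![CDATA[a < b && c]]></Message>").text

# ...whereas the same text outside CDATA is not well-formed XML.
try:
    ET.fromstring("<Message>a < b && c</Message>")
    parsed_without_cdata = True
except ET.ParseError:
    parsed_without_cdata = False

print(with_cdata, parsed_without_cdata)
```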
Message Example
Breaking the event down into details in the Message section, as shown above, is acceptable but not necessary. A better way will be explained shortly.
Where are the Parameters?
Parameters are not stored separately unless SM is specifically instructed to do so.
Preventing Data Loss
Adding “catch-all” parsers will allow you to collect anything that slipped through the cracks.
Examples:
<Event id="">
  <RegEx>.*snort.*:.*</RegEx>
  <Instructions>
    <Field name="$EventSource" source="Snort IDS" />
    <Field name="$EventSeverity" source="1" />
  </Instructions>
  <Message></Message>
</Event>
<Event id="">
  <RegEx>.*</RegEx>
  <Instructions>
    <Field name="$EventSource" source="Syslog" />
    <Field name="$EventSeverity" source="1" />
  </Instructions>
  <Message></Message>
</Event>
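The catch-all idea can be sketched as an ordered list of parsers. Two things here are assumptions rather than documented behavior: that Event nodes are tried in document order with the first match winning, and that the expression must match the whole message. The `classify` helper and the sample messages are hypothetical.

```python
import re

# (expression, event source) pairs in document order.
# The most general catch-all ".*" goes last so it only sees leftovers.
PARSERS = [
    (re.compile(r".*snort.*:.*"), "Snort IDS"),
    (re.compile(r".*"), "Syslog"),
]

def classify(message):
    """Return the event source of the first parser that matches the whole message."""
    for expression, source in PARSERS:
        if expression.fullmatch(message):
            return source
    return None  # unreachable while a ".*" catch-all is present

# Hypothetical messages.
print(classify("host1 snort[212]: ICMP PING NMAP"))
print(classify("host1 cron[99]: job finished"))
```

Under these assumptions, placing `.*` first would label every message as Syslog, which is why ordering matters in this sketch.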
Putting it All Together
Change the Provider to XML
Click on the Configure XML button
Cut and Paste XML code from Editor
Custom Alert Descriptions – The right way to create alert messages!
The default is to use $Description$ for the Alert Description
This causes the alert to look like this:
By creating a descriptive alert description, you can make the alert look like this:
This is accomplished by modifying the event processing rule that generates the alert so that it has a more detailed alert description
Limitations of Regular Expression Parsing
It is a lexical parser and it works only for sequence-based regular expression parsing
Does not support XML format messages, i.e., IDMEF messages
Sub-expressions are limited to 0–24
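A small pre-check for the sub-expression limit, as a sketch. It reads "0–24" as at most 25 sub-expressions, which is an interpretation, and it counts groups with Python's `re` module rather than the Boost parser. The helper names are hypothetical.

```python
import re

# Assumption: "0-24" means indexes 0 through 24, i.e. at most 25 sub-expressions.
MAX_SUBEXPRESSIONS = 25

def count_subexpressions(expression):
    """Number of capturing groups in the expression."""
    return re.compile(expression).groups

def within_limit(expression):
    return count_subexpressions(expression) <= MAX_SUBEXPRESSIONS

event_re = r"^\s+TCP\s+(\d+.\d+.\d+.\d+):?(\d+)?\s+(\d+.\d+.\d+.\d+):?(\d+)?\s+(\w+)"
too_many = r"(\d)" * 26   # contrived expression with 26 groups

print(count_subexpressions(event_re), within_limit(event_re))
print(count_subexpressions(too_many), within_limit(too_many))
```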
XML Tools & Links
SciTE
http://scintilla.sourceforge.net/ScintillaDownload.html
− Small fast text editor with color coding for XML
Notepad++
http://notepad-plus.sourceforge.net/uk/site.htm
− Slightly larger text editor, but more robust than SciTE