Fast Portscan Detection Using Sequential Hypothesis Testing
Jaeyeon Jung, Vern Paxson, Arthur W. Berger, and Hari Balakrishnan

Introduction
• Port Scanning: Reconnaissance
  • Hackers will scan host/hosts for vulnerable ports as potential avenues of attack
• Not clearly defined
  • Scan sweeps
    • Connection to a few addresses, some fail?
  • Granularity
    • Separate sources as one scan?
  • Temporal
    • Over what timeframe should activity be tracked?
  • Intent
    • Hard to differentiate between benign scans and scans with malicious intent

Previous Scan Detection Techniques
• Malformed packets
  • Packets used for “stealth scanning”
• Connections to ports/hosts per unit time
  • Checks whether a source hits more than X ports on Y hosts in Z time
• Failed connections
  • Malicious sources will have a higher ratio of failed connection attempts
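
The "X ports on Y hosts in Z time" rule can be read in more than one way. The sketch below takes one possible reading: a source is flagged when, within a sliding window, it has touched more than `max_ports` distinct ports or more than `max_hosts` distinct hosts. The function name, parameters, and the "or" combination are assumptions made for illustration, not the rule used by any particular system.

```python
from collections import defaultdict, deque


def make_detector(max_ports, max_hosts, window):
    """Return an observe() function implementing a sliding-window threshold.

    max_ports, max_hosts and window correspond to X, Y and Z on the slide;
    the exact way they are combined here is an assumption.
    """
    # source -> deque of (timestamp, destination host, destination port)
    events = defaultdict(deque)

    def observe(ts, source, host, port):
        q = events[source]
        q.append((ts, host, port))
        # Drop events that have fallen out of the time window.
        while q and q[0][0] <= ts - window:
            q.popleft()
        hosts = {h for _, h, _ in q}
        ports = {p for _, _, p in q}
        return len(ports) > max_ports or len(hosts) > max_hosts

    return observe
```

A detector of this kind only reacts once the counts are exceeded, which is why its thresholds need tuning per site.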

Bro NIDS
• Current algorithm in use for years
• High efficiency
• Counts local connections from remote host
• Differentiates connections by service
• Sets threshold
• Blocks suspected malicious hosts

Flaws in Bro
• Skewed for little-used servers
  • Example: a private host that one worker remotely logs into from home
• Difficult to choose probabilities
• Difficult to determine never-accessed hosts
• Needs data to determine appropriate parameters

Threshold Random Walk (TRW)
• Objectives for the new algorithm:
  • Require performance near Bro
  • High speed
  • Flag as scanner if no useful connection
  • Detect single remote hosts

Data Analysis
• Data analyzed from two sites, LBL and ICSI
  • Research laboratories with minimal firewalling
  • LBL: 6000 hosts, sparsely populated address space
  • ICSI: 200 hosts, densely populated address space

Separating Possible Scanners
• Which of the remainder are likely, but undetected, scanners?
  • The argument is nearly circular
  • Show that there are properties that can plausibly be used to distinguish likely scanners in the remainder
  • Use that as ground truth to develop an algorithm against

Data Analysis (cont.)
• First model
  • Look at remainder hosts making failed connections
  • Compare all of the remainder to the known bad hosts
  • Hope for two modes, where the failed-connection mode resembles the known bad
  • No such modality exists

Data Analysis (cont.)
• Second model
  • For each remote host, examine the ratio of hosts it made failed connections to versus hosts it made successful connections to
  • Known bad hosts have a high percentage of failed connections
  • Conclusion: remainder hosts with <80% failure are potentially benign
  • The rest are suspect
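
The 80% rule of the second model can be written down as a small helper. This is a minimal sketch: the function name, the "unknown" label for hosts with no connections, and treating exactly 80% as suspect are assumptions made here, not details taken from the slides.

```python
def classify_remainder(failed_hosts, successful_hosts, cutoff=0.8):
    """Label a remainder host by its share of failed connections.

    failed_hosts / successful_hosts: number of distinct local hosts the
    remote host failed / succeeded in connecting to.
    """
    total = failed_hosts + successful_hosts
    if total == 0:
        return "unknown"  # no evidence either way (assumed label)
    failure_ratio = failed_hosts / total
    # Below the cutoff: potentially benign; otherwise: suspect.
    return "suspect" if failure_ratio >= cutoff else "potentially benign"
```

For example, a host with 9 failures and 1 success is labelled suspect, while one with 1 failure and 9 successes is labelled potentially benign.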

TRW (cont.)
• Detect failed/succeeded connections
• Sequential Hypothesis Testing
  • Two hypotheses: benign (H_0) and scanner (H_1)
  • Probabilities determined by the equations
  • theta_0 > theta_1 (a benign host has a higher chance of a successful connection)
  • Four outcomes: detection, false positive, false negative, nominal
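
Written out, the model behind the two hypotheses looks as follows. The indicator Y_i (0 if the i-th connection attempt to a new host succeeds, 1 if it fails) and the symbol Lambda for the likelihood ratio are notation assumed here for illustration:

```latex
\begin{aligned}
\Pr[Y_i = 0 \mid H_0] &= \theta_0, & \Pr[Y_i = 1 \mid H_0] &= 1 - \theta_0,\\
\Pr[Y_i = 0 \mid H_1] &= \theta_1, & \Pr[Y_i = 1 \mid H_1] &= 1 - \theta_1,
\end{aligned}
\qquad
\Lambda(Y) = \prod_{i=1}^{n} \frac{\Pr[Y_i \mid H_1]}{\Pr[Y_i \mid H_0]}
```

Because theta_0 > theta_1, each failure multiplies the ratio by (1 - theta_1)/(1 - theta_0) > 1 and each success multiplies it by theta_1/theta_0 < 1, so the ratio performs a random walk between the two thresholds.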

Thresholds
• Choose thresholds
  • Set lower and upper thresholds, n_0 and n_1
  • Calculate the likelihood ratio
  • Compare it to the thresholds

Choosing Thresholds
• Choose two constants, alpha and beta
  • Probability of false positive (P_f) <= alpha
  • Detection probability (P_d) >= beta
  • Typical values: alpha = 0.01, beta = 0.99
• Thresholds can be defined in terms of P_f and P_d or alpha and beta
  • n_1 <= P_d/P_f
  • n_0 >= (1-P_d)/(1-P_f)
  • Can be approximated using alpha and beta
  • n_1 = beta/alpha
  • n_0 = (1-beta)/(1-alpha)
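
As a quick check of the approximation, the two thresholds can be computed directly from alpha and beta. The function name and the input validation are assumptions added for this sketch:

```python
def trw_thresholds(alpha=0.01, beta=0.99):
    """Return (n_0, n_1) using the approximations from the slide."""
    if not 0 < alpha < beta < 1:
        raise ValueError("expected 0 < alpha < beta < 1")
    n_1 = beta / alpha              # upper threshold: declare scanner
    n_0 = (1 - beta) / (1 - alpha)  # lower threshold: declare benign
    return n_0, n_1
```

With the typical values alpha = 0.01 and beta = 0.99, this gives n_1 = 99 and n_0 of roughly 0.0101.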

Evaluation Methodology
• Used the data from the two labs
  • Knowledge of whether each connection is established, rejected, or unanswered
• Maintains 3 variables for each remote host
  • D_s, the set of distinct hosts previously connected to
  • S_s, the decision state (pending, H_0, or H_1)
  • L_s, the likelihood ratio

Evaluation Methodology (cont.)
• For each line in the dataset
  • Skip if the remote host's state is not pending
  • Determine if the connection is successful
  • Check whether the destination is already in the connection set D_s; if so, proceed to the next line
  • Update D_s and L_s
  • If L_s goes beyond either threshold, update the state accordingly
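
The per-line procedure above can be sketched as a short loop. This is an illustrative sketch, not the authors' code: the record layout `(source, destination, success)`, the class and function names, and the default values theta_0 = 0.8 and theta_1 = 0.2 are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class RemoteHostState:
    contacted: set = field(default_factory=set)  # D_s
    decision: str = "pending"                    # S_s: pending, H_0 or H_1
    ratio: float = 1.0                           # L_s


def process(records, n_0, n_1, theta_0=0.8, theta_1=0.2):
    """Run the threshold random walk over (source, destination, success) records."""
    states = {}
    for source, dest, success in records:
        st = states.setdefault(source, RemoteHostState())
        if st.decision != "pending":
            continue  # a decision was already made for this remote host
        if dest in st.contacted:
            continue  # only the first connection to each destination counts
        st.contacted.add(dest)
        if success:
            st.ratio *= theta_1 / theta_0              # step toward benign
        else:
            st.ratio *= (1 - theta_1) / (1 - theta_0)  # step toward scanner
        if st.ratio >= n_1:
            st.decision = "H_1"
        elif st.ratio <= n_0:
            st.decision = "H_0"
    return states
```

Under these assumed parameters each failure multiplies L_s by about 4 and each success by 0.25, so with thresholds of 99 and 0.0101 a host is decided after four consecutive first-contact failures or four consecutive first-contact successes.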

Results
TRW Evaluation
• Efficiency: ratio of true positives to all hosts flagged as H_1
• Effectiveness: ratio of true positives to all scanners
• N: average number of hosts probed before detection
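
The first two measures are simple set ratios. A minimal sketch, assuming the flagged hosts and the ground-truth scanners are available as collections of host identifiers (the function name and the convention of returning 0.0 for an empty denominator are assumptions):

```python
def trw_metrics(flagged, true_scanners):
    """Return (efficiency, effectiveness) for a set of flagged hosts."""
    flagged = set(flagged)
    true_scanners = set(true_scanners)
    true_positives = flagged & true_scanners
    # Efficiency: share of flagged hosts that really are scanners.
    efficiency = len(true_positives) / len(flagged) if flagged else 0.0
    # Effectiveness: share of all scanners that were flagged.
    effectiveness = (
        len(true_positives) / len(true_scanners) if true_scanners else 0.0
    )
    return efficiency, effectiveness
```

Efficiency corresponds to what is usually called precision, and effectiveness to recall.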

TRW Evaluation (cont.)
• TRW is far more effective than the other two algorithms evaluated
• TRW is almost as efficient as Bro
• TRW detects scanners in far less time

Potential Improvements
• Leverage Additional Information
  • Factor for specific services (e.g. HTTP)
  • Distinguish between unanswered and rejected connections
  • Consider time local host has been inactive
  • Consider rate
  • Introduce correlations (e.g. 2 failed in a row worse than 1 fail, 1 success, 1 fail)
  • Devise a model on history of the hosts

Improvements (cont.)
• Managing State
  • Requires a large amount of maintained state for tracking
  • However, capping the state is vulnerable to state overflow attacks
• How to Respond
  • What to do when a scanner is detected?
  • Is it worth blocking?
• Evasion and Gaming
  • Spoofed IPs
    • Institute “whitelists”
    • Use a honeypot to try to connect
  • Evasion (inserting legitimate connections in scan)
    • Incorporate other information, such as a model of what is normal for legitimate users, and give less weight to connections not fitting the pattern
  • Distributed Scans
    • Scans originating from more than one source
    • Difficult to fix in this framework

Conclusion/Summary
• TRW: based on ratio of failed/succeeded connections
• Sequential Hypothesis Testing
• Highly accurate
• Quick response