
Algorithms for Port of Entry Inspection for WMDs
Fred S. Roberts, DIMACS Center, Rutgers University


Port of Entry Inspection Algorithms

• Goal: Find ways to intercept illicit nuclear materials and weapons destined for the U.S. via the maritime transportation system.
• Currently, only a small percentage of containers arriving at ports is inspected.
• Even inspecting 8% of containers in the Port of NY/NJ might bring international trade to a halt (Larrabbee 2002).

Port of Entry Inspection Algorithms

• Aim: Develop decision support algorithms that will help us to “optimally” intercept illicit materials and weapons subject to limits on delays, manpower, and equipment.
• Find inspection schemes that minimize total “cost,” including the “cost” of false positives and false negatives.

Mobile Vacis: truck-mounted gamma ray imaging system

Sequential Decision Making Problem

• Stream of containers arrives at a port.
• The Decision Maker’s Problem:
  – Which containers to inspect?
  – Which inspections to perform next, based on previous results?
• Approach:
  – “Decision logics”
  – Combinatorial optimization methods
  – Builds on ideas of Stroud and Saeger at Los Alamos National Laboratory
  – Need for new models and methods

Sequential Diagnosis Problem

• Such sequential diagnosis problems arise in many areas:
  – Communication networks (testing connectivity, paging cellular customers, sequencing tasks, …)
  – Manufacturing (testing machines, fault diagnosis, routing customer service calls, …)
  – Artificial intelligence/CS (optimal derivation strategies in knowledge bases, best-value satisficing search, coding decision trees, …)
  – Medicine (diagnosing patients, sequencing treatments, …)

Sequential Decision Making Problem

• Containers arriving at the port are to be classified into categories.
• Simple case: 0 = “ok”, 1 = “suspicious”.
• Inspection scheme: specifies which inspections are to be made based on previous observations.

Sequential Decision Making Problem

• Containers have attributes, each in one of a number of states.
• Sample attributes:
  – Levels of certain kinds of chemicals or biological materials
  – Whether or not there are items of a certain kind in the cargo list
  – Whether cargo was picked up in a certain port

Sequential Decision Making Problem

• Currently used attributes:
  – Does the ship’s manifest set off an “alarm”?
  – What is the neutron or gamma emission count? Is it above threshold?
  – Does a radiograph image come up positive?
  – Does an induced fission test come up positive?

Gamma ray detector

Sequential Decision Making Problem

• We can imagine many other attributes.
• This project is concerned with general algorithmic approaches.
• We seek a methodology not tied to today’s technology.
• Detectors are evolving quickly.

Sequential Decision Making Problem

• Simplest case: Attributes are in state 0 or 1.
• Then: A container is a binary string like 011001.
• So: Classification is a decision function F that assigns each binary string to a category.

011001 → F(011001)

If attributes 2, 3, and 6 are present, assign the container to category F(011001).

Sequential Decision Making Problem

• If there are two categories, 0 and 1, the decision function F is a boolean function.

Example: F(000) = F(111) = 1, F(abc) = 0 otherwise.

This classifies a container as positive iff it has none of the attributes or all of them.
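The example decision function can be written out directly. The following is a minimal sketch; the function name `decision_function` and the choice of a Python string to represent a container are illustrative assumptions, not part of the original slides.

```python
def decision_function(container: str) -> int:
    """Example F from the slide: 1 iff the container has none or all of the attributes."""
    if not container or any(bit not in "01" for bit in container):
        raise ValueError("expected a non-empty string of 0s and 1s")
    # All bits equal means the string is 00...0 or 11...1.
    return 1 if len(set(container)) == 1 else 0


if __name__ == "__main__":
    for container in ("000", "111", "011", "100"):
        print(container, "->", decision_function(container))
```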

Sequential Decision Making Problem

• Given a container, test its attributes until we know enough to calculate the value of F.
• An inspection scheme tells us in which order to test the attributes to minimize cost.
• Even this simplified problem is computationally hard.
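One way to picture "test until we know enough" is the sketch below. It is an illustration under stated assumptions: the names `inspect` and `example_F` are hypothetical, the testing order is supplied by the caller, and the stopping rule simply checks every completion of the untested attributes, which is only workable for small n.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Tuple


def inspect(container: str, order: Sequence[int],
            F: Callable[[str], int]) -> Tuple[int, int]:
    """Test attributes in `order` and stop as soon as F is determined.

    Returns (value of F, number of tests performed).
    `order` must be a permutation of range(len(container)).
    """
    n = len(container)
    known: Dict[int, str] = {}
    tests = 0
    for step in range(n + 1):
        unknown = [i for i in range(n) if i not in known]
        # Collect the values F can still take over all completions.
        values = set()
        for fill in product("01", repeat=len(unknown)):
            guess = dict(zip(unknown, fill))
            bits = "".join(known[i] if i in known else guess[i] for i in range(n))
            values.add(F(bits))
        if len(values) == 1:
            return values.pop(), tests
        attribute = order[step]
        known[attribute] = container[attribute]
        tests += 1
    raise AssertionError("unreachable: F is determined once all attributes are known")


def example_F(bits: str) -> int:
    """Hypothetical F: attribute 2 together with attribute 0 or attribute 1."""
    return int(bits[2] == "1" and (bits[0] == "1" or bits[1] == "1"))


if __name__ == "__main__":
    print(inspect("110", (2, 0, 1), example_F))
    print(inspect("110", (0, 1, 2), example_F))
```

The two calls use the same container and the same F but different testing orders, which is the choice an inspection scheme has to make.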

Sequential Decision Making Problem

• This assumes F is known.
• Simplifying assumption: Attributes are independent.
• At any point we can stop inspecting and output the value of F based on the outcomes of inspections so far.
• Complications: There may be precedence relations among the components (e.g., we can’t test attribute a4 before testing a6).
• Or: cost may depend on the attributes tested before.
• F may depend on variables that cannot be directly tested or for which tests are too costly.

Sequential Decision Making Problem

• Such problems are computationally hard.
• There are many possible boolean functions F.
• Even if F is fixed, the problem of finding a good classification scheme (to be defined precisely below) is NP-complete.
• Several classes of functions F allow for efficient inspection schemes:
  – k-out-of-n systems
  – Certain series-parallel systems
  – Read-once systems
  – “Regular” systems
  – Horn systems

Sensors and Inspection Lanes

• n types of sensors measure the presence or absence of the n attributes.
• Many copies of each sensor.
• Complication: different characteristics of sensors.
• Entities come for inspection.
• Which sensor of a given type to use?
• Think of inspection lanes and queues.
• Besides efficient inspection schemes, we could decrease costs by:
  – Buying more sensors
  – Changing the allocation of containers to sensor lanes

Binary Decision Tree Approach

• Sensors measure presence/absence of attributes.
• Binary Decision Tree:
  – Nodes are sensors or categories (0 or 1).
  – Two arcs exit from each sensor node, labeled left and right.
  – Take the right arc when the sensor says the attribute is present, the left arc otherwise.


Binary Decision Tree Approach

•Reach category 1 from the root only through the path a0 to a1 to 1.

•Container is classified in category 1 iff it has both attributes a0 and a1 .

•Corresponding boolean function F(11) = 1, F(10) = F(01) = F(00) = 0.

Figure 1

Binary Decision Tree Approach

• Reach category 1 from the root by:
  a0 L to a1 R to a2 R to 1, or
  a0 R to a2 R to 1.
• Container is classified in category 1 iff it has a1 and a2 and not a0, or a0 and a2 and possibly a1.
• Corresponding boolean function: F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.

Figure 2

Binary Decision Tree Approach

• This binary decision tree corresponds to the same boolean function:

F(111) = F(101) = F(011) = 1, F(abc) = 0 otherwise.

However, it has one fewer observation node ai. So, it is more efficient if all observations are equally costly and equally likely.

Figure 3
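The comparison between the two trees can be checked mechanically. In the sketch below, `FIG2` encodes the paths listed for Figure 2. `FIG3_ASSUMED` is a hypothetical three-node tree for the same function, since the figure itself is not reproduced here. The nested-tuple encoding and all function names are illustrative assumptions.

```python
from itertools import product

# A tree is a leaf (0 or 1) or a tuple (attribute index, left subtree, right subtree).
# Take the right subtree when the attribute is present, the left subtree otherwise.
FIG2 = (0, (1, 0, (2, 0, 1)), (2, 0, 1))   # a0 L a1 R a2 R 1, or a0 R a2 R 1
FIG3_ASSUMED = (2, 0, (0, (1, 0, 1), 1))   # hypothetical: test a2 first, then a0, then a1


def classify(tree, bits):
    """Follow the tree for the binary string `bits` and return the category."""
    while not isinstance(tree, int):
        attribute, left, right = tree
        tree = right if bits[attribute] == "1" else left
    return tree


def sensor_nodes(tree):
    """Count the observation (sensor) nodes in the tree."""
    if isinstance(tree, int):
        return 0
    _, left, right = tree
    return 1 + sensor_nodes(left) + sensor_nodes(right)


def truth_table(tree, n):
    """Map every binary string of length n to its category."""
    strings = ("".join(b) for b in product("01", repeat=n))
    return {s: classify(tree, s) for s in strings}


if __name__ == "__main__":
    print(truth_table(FIG2, 3) == truth_table(FIG3_ASSUMED, 3))
    print(sensor_nodes(FIG2), sensor_nodes(FIG3_ASSUMED))
```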

Binary Decision Tree Approach

• Even if the boolean function F is fixed, the problem of finding the “optimal” binary decision tree for it is very hard (NP-complete).
• For small n = number of attributes, we can try to solve it by brute-force enumeration.
• Even for n = 5, this is not practical. (n = 4 at Port of Long Beach-Los Angeles)

Port of Long Beach

Binary Decision Tree Approach

Promising Approaches:

• Heuristic algorithms, approximations to optimal.
• Special assumptions about the boolean function F.
• Example: For “monotone” boolean functions, integer programming formulations give promising heuristics.
• Stroud and Saeger enumerate all “complete,” monotone boolean functions and calculate the least expensive corresponding binary decision trees.

Binary Decision Tree Approach

Monotone Boolean Functions:

• Given two strings x1x2…xn, y1y2…yn.
• Suppose that xi ≥ yi for all i implies that F(x1x2…xn) ≥ F(y1y2…yn).
• Then we say that F is monotone.
• Then 11…1 has the highest probability of being in category 1.
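The monotonicity condition can be tested by brute force for small n. This is a sketch only: `is_monotone` and the two example functions are hypothetical names, and inputs are represented as tuples of 0s and 1s.

```python
from itertools import product


def is_monotone(F, n):
    """Check that x >= y componentwise always implies F(x) >= F(y)."""
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            dominates = all(xi >= yi for xi, yi in zip(x, y))
            if dominates and F(x) < F(y):
                return False
    return True


def and_of_or(x):
    """a2 and (a0 or a1): adding attributes can only raise the category."""
    return int(x[2] == 1 and (x[0] == 1 or x[1] == 1))


def none_or_all(x):
    """1 iff no attribute or every attribute is present."""
    return int(len(set(x)) == 1)


if __name__ == "__main__":
    print(is_monotone(and_of_or, 3))
    print(is_monotone(none_or_all, 3))
```

The second function fails the check because F(000) = 1 while F(100) = 0, even though 100 has every attribute that 000 has.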

Binary Decision Tree Approach

Incomplete Boolean Functions:

• A boolean function F is incomplete if F can be calculated by finding at most n-1 attributes and knowing the value of the input string on those attributes.
• Example: F(111) = F(110) = F(101) = F(100) = 1, F(000) = F(001) = F(010) = F(011) = 0.
• F(abc) is determined without knowing b (or c).
• F is incomplete.
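Completeness can be checked by asking, for each attribute, whether flipping it ever changes F. The sketch below assumes tuple inputs; `relevant_attributes`, `is_complete`, and `slide_example` are hypothetical names chosen for illustration.

```python
from itertools import product


def relevant_attributes(F, n):
    """Return the indices i for which flipping attribute i changes F on some input."""
    relevant = []
    for i in range(n):
        for x in product((0, 1), repeat=n):
            flipped = x[:i] + (1 - x[i],) + x[i + 1:]
            if F(x) != F(flipped):
                relevant.append(i)
                break
    return relevant


def is_complete(F, n):
    """F is complete iff it depends on all n attributes."""
    return len(relevant_attributes(F, n)) == n


def slide_example(x):
    """The example above: F(abc) = 1 exactly when a = 1."""
    return x[0]


if __name__ == "__main__":
    print(relevant_attributes(slide_example, 3))
    print(is_complete(slide_example, 3))
```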

Binary Decision Tree Approach

Complete, Monotone Boolean Functions:

• Stroud and Saeger: algorithm for enumerating binary decision trees implementing complete, monotone boolean functions.
• Feasible to implement up to n = 4.
• n = 2:
  – There are 6 monotone boolean functions.
  – Only 2 of them are complete and monotone.
  – There are 4 binary decision trees for calculating these 2 complete, monotone boolean functions.

Binary Decision Tree Approach

Complete, Monotone Boolean Functions:

• n = 3:
  – 9 complete, monotone boolean functions.
  – 60 distinct binary decision trees for calculating them.
• n = 4:
  – 114 complete, monotone boolean functions.
  – 11,808 distinct binary decision trees for calculating them.
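The function counts for small n can be reproduced by exhaustive search over truth tables. The sketch below is not the Stroud-Saeger algorithm (which enumerates decision trees); it only counts monotone and complete, monotone functions, and the name `count_functions` is a hypothetical choice.

```python
from itertools import product


def count_functions(n):
    """Return (number of monotone functions, number that are also complete)."""
    points = list(product((0, 1), repeat=n))
    index = {p: k for k, p in enumerate(points)}
    # Pairs (lower, upper, i): upper equals lower with attribute i raised from 0 to 1.
    pairs = []
    for p in points:
        for i in range(n):
            if p[i] == 0:
                q = p[:i] + (1,) + p[i + 1:]
                pairs.append((index[p], index[q], i))
    all_attributes = (1 << n) - 1
    monotone = complete = 0
    # Bit k of `table` is the value of F on points[k].
    for table in range(1 << len(points)):
        depends = 0
        is_monotone = True
        for lo, hi, i in pairs:
            f_lo = (table >> lo) & 1
            f_hi = (table >> hi) & 1
            if f_lo > f_hi:
                is_monotone = False
                break
            if f_lo != f_hi:
                depends |= 1 << i
        if is_monotone:
            monotone += 1
            if depends == all_attributes:
                complete += 1
    return monotone, complete


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, count_functions(n))
```

Checking only single-attribute increases is enough for monotonicity, because any componentwise comparison can be broken into such steps. The search visits 2^(2^n) truth tables, so this direct approach already stops being practical at n = 5.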

Binary Decision Tree Approach

Complete, Monotone Boolean Functions:

• n = 5:
  – 6894 complete, monotone boolean functions.
  – 263,515,920 corresponding binary decision trees.
• Combinatorial explosion!
• Need alternative approaches; enumeration is not feasible!

Cost Functions

• Above analysis: Only uses the number of sensors.
• Using a sensor has a cost:
  – Unit cost of inspecting one item with it
  – Fixed cost of purchasing and deploying it
  – Delay cost from queuing up at the sensor station
• Preliminary problem: Disregard fixed and delay costs. Minimize unit costs.

Cost Functions

• Simplification so far: Disregard characteristics of the population of entities being inspected.
• Only count the number of observation (attribute) nodes in the tree.
• Unit Cost Complication: How many nodes of the decision tree are actually visited during an average container’s inspection? This depends on the “distribution” of containers. In our early models, it will depend on the probability of sensor errors and the probability of a bomb in a container.

Cost Functions: Delay Costs

• Tradeoff between fixed costs and delay costs: Adding more sensors cuts down on delays.
• Stochastic process of containers arriving
• Distribution of delay times for inspections
• Use queuing theory to find average delay times under different models

Cost Functions

• Cost of false positive: Cost of additional tests.
  – If it means opening the container, it’s very expensive.
• Cost of false negative:
  – Complex issue.
  – What is the cost of a bomb going off in Manhattan?

The Brute Force Approach

• The cost of each binary decision tree corresponding to a complete, monotone boolean function is calculated.
• The optimum tree is selected.
• The optimum depends on assumptions about sensor errors, costs of false positive and false negative outcomes, and unit, fixed, and delay costs for each sensor.

Cost Functions: Sensor Errors

• One approach to false positives/negatives: Assume there can be sensor errors.
• Simplest model: Assume that all sensors checking for attribute ai have the same fixed probability of saying ai is 0 if in fact it is 1, and similarly of saying it is 1 if in fact it is 0.
• A more sophisticated analysis later describes a model for determining probabilities of sensor errors.
• Notation:
  X = state of nature (bomb or no bomb)
  Y = outcome (of sensor or entire inspection process)

Probability of Error for the Entire Tree

State of nature is zero (X = 0): absence of a bomb.
State of nature is one (X = 1): presence of a bomb.

[Figure: binary decision tree on sensors A, B, and C with leaves 0 and 1, shown once for each error type]

Probability of false positive for this tree:

P(Y=1|X=0) = P(YA=1|X=0) * P(YB=1|X=0) + P(YA=1|X=0) * P(YB=0|X=0) * P(YC=1|X=0)

Probability of false negative for this tree:

P(Y=0|X=1) = P(YA=0|X=1) + P(YA=1|X=1) * P(YB=0|X=1) * P(YC=0|X=1)
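The two formulas translate directly into code. The sketch below assumes, as the product form suggests, that sensor readings are independent given the state of nature. The function name and the numeric error rates in the usage example are hypothetical.

```python
def tree_error_probabilities(false_alarm, detection):
    """Error probabilities for the tree: A first, then B if A is positive, then C if B is negative.

    false_alarm[s] = P(Ys=1|X=0) and detection[s] = P(Ys=1|X=1) for s in "A", "B", "C".
    Assumes sensor readings are independent given X.
    """
    fa, det = false_alarm, detection
    # False positive: A and B both fire, or A fires, B does not, and C fires.
    p_false_positive = fa["A"] * fa["B"] + fa["A"] * (1 - fa["B"]) * fa["C"]
    # False negative: A misses, or A fires but both B and C miss.
    p_false_negative = (1 - det["A"]) + det["A"] * (1 - det["B"]) * (1 - det["C"])
    return p_false_positive, p_false_negative


if __name__ == "__main__":
    # Illustrative error rates only; not values from the experiments.
    false_alarm = {"A": 0.10, "B": 0.05, "C": 0.02}
    detection = {"A": 0.90, "B": 0.95, "C": 0.98}
    print(tree_error_probabilities(false_alarm, detection))
```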

Cost Function Used for Evaluating the Decision Trees

CTot = CFalsePositive * PFalsePositive + CFalseNegative * PFalseNegative + Cutil

The error probability of the entire tree is computed from the error probabilities of the individual sensors.

• CFalsePositive is the cost of a false positive (Type I error).
• CFalseNegative is the cost of a false negative (Type II error).
• PFalsePositive is the probability of a false positive occurring.
• PFalseNegative is the probability of a false negative occurring.
• Cutil is the cost of utilization of the tree.


Cost Function used for Evaluating the Decision Trees.

Cutil is the cost of utilization of the tree.

Simplest assumption: Cutil is the expected sum of unit costs associated with the tree. Count unit cost of each sensor each time it is used. Use P(X = 1) and probability of errors at each type of sensor to calculate expected value.

Later: models for distribution of attributes of containers and more sophisticated analysis of expected cost of utilizing the tree, bringing in delay costs.

Stroud Saeger Experiments

• Stroud-Saeger ranked all trees formed from 3 or 4 sensors A, B, C, and D according to increasing tree costs.
• Used the cost function defined above.
• Values used in their experiments:
  – CA = .25; P(YA=1|X=1) = .90; P(YA=1|X=0) = .10
  – CB = 10; P(YB=1|X=1) = .99; P(YB=1|X=0) = .01
  – CC = 30; P(YC=1|X=1) = .999; P(YC=1|X=0) = .001
  – CD = 1; P(YD=1|X=1) = .95; P(YD=1|X=0) = .05
  – Here, Ci = cost of utilization of sensor i.
• Also fixed were: CFalseNegative, CFalsePositive, P(X=1)
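To make the cost function concrete, the sketch below evaluates CTot for the tree that tests A first, then B if A is positive, then C if B is negative. It reads each row of the values above as one sensor's unit cost, detection probability, and false-alarm probability. PFalsePositive is taken as P(X=0) * P(Y=1|X=0) and PFalseNegative as P(X=1) * P(Y=0|X=1). Sensor readings are assumed independent given X. The values chosen for CFalseNegative, CFalsePositive, and P(X=1) are hypothetical, as are all names in the code.

```python
COST = {"A": 0.25, "B": 10.0, "C": 30.0}
DETECTION = {"A": 0.90, "B": 0.99, "C": 0.999}     # P(Ys=1|X=1)
FALSE_ALARM = {"A": 0.10, "B": 0.01, "C": 0.001}   # P(Ys=1|X=0)


def tree_total_cost(c_false_negative, c_false_positive, p_bomb):
    """CTot for the tree A, then B if A is positive, then C if B is negative."""
    def positive_and_cost(p):
        # p[s] = probability that sensor s reads positive in this state of nature.
        p_positive = p["A"] * (p["B"] + (1 - p["B"]) * p["C"])
        # A is always used; B only after a positive A; C only after a negative B.
        expected_cost = COST["A"] + p["A"] * (COST["B"] + (1 - p["B"]) * COST["C"])
        return p_positive, expected_cost

    pos_given_no_bomb, util_no_bomb = positive_and_cost(FALSE_ALARM)
    pos_given_bomb, util_bomb = positive_and_cost(DETECTION)
    c_util = (1 - p_bomb) * util_no_bomb + p_bomb * util_bomb
    p_false_positive = (1 - p_bomb) * pos_given_no_bomb
    p_false_negative = p_bomb * (1 - pos_given_bomb)
    return (c_false_positive * p_false_positive
            + c_false_negative * p_false_negative
            + c_util)


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    print(tree_total_cost(c_false_negative=1e9, c_false_positive=450.0, p_bomb=1e-6))
```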

Stroud Saeger Experiments: Our Sensitivity Analysis

• We have explored the sensitivity of the Stroud-Saeger conclusions to variations in the values of three parameters: CFalseNegative, CFalsePositive, and P(X=1).
• We estimated high and low values for these parameters.
• We chose one of the values from the interval of values and then explored the highest ranked tree as the other two were chosen at random in the interval of values. 10,000 experiments for each pair of fixed values.
• We looked for the variation in the top-ranked tree and how the top rank related to the choice of parameter values.
• Very surprising results.

Stroud Saeger Experiments: Our Sensitivity Analysis

– CFalseNegative was varied between 25 million and 10 billion dollars.
  • Low and high estimates of direct and indirect costs incurred due to a false negative.
– CFalsePositive was varied between $180 and $720.
  • Cost incurred due to a false positive: 4 men * (3 to 6 hrs) * (15 to 30 $/hr), that is, 4 * 3 * 15 = $180 at the low end and 4 * 6 * 30 = $720 at the high end.
– P(X=1) was varied between 1/10,000,000 and 1/100,000.
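A randomized experiment of this kind can be sketched as follows. This is not the original experimental code. It compares only three trees, for the boolean expressions 00000111, 01111111, and 00000001, and it assumes each tree tests sensors in the order A, B, C. It also assumes uniform sampling over the stated ranges, the midpoint of the P(X=1) range as the fixed value, and independence of sensor readings given X. All names are hypothetical.

```python
import random

COST = {"A": 0.25, "B": 10.0, "C": 30.0}
P_POSITIVE = {  # P(Ys=1 | X=x), indexed by x and then by sensor s
    0: {"A": 0.10, "B": 0.01, "C": 0.001},
    1: {"A": 0.90, "B": 0.99, "C": 0.999},
}
# A tree is a leaf (0 or 1) or (sensor, left subtree, right subtree); right = positive.
TREES = {
    "A and (B or C)": ("A", 0, ("B", ("C", 0, 1), 1)),
    "A or B or C": ("A", ("B", ("C", 0, 1), 1), 1),
    "A and B and C": ("A", 0, ("B", 0, ("C", 0, 1))),
}


def walk(tree, x):
    """Return (P(Y=1|X=x), expected unit cost given X=x) for the tree."""
    if isinstance(tree, int):
        return float(tree), 0.0
    sensor, left, right = tree
    p = P_POSITIVE[x][sensor]
    pos_left, cost_left = walk(left, x)
    pos_right, cost_right = walk(right, x)
    positive = (1 - p) * pos_left + p * pos_right
    cost = COST[sensor] + (1 - p) * cost_left + p * cost_right
    return positive, cost


def total_cost(tree, c_fn, c_fp, p_bomb):
    """CTot = CFP * P(X=0) * P(Y=1|X=0) + CFN * P(X=1) * P(Y=0|X=1) + Cutil."""
    pos0, util0 = walk(tree, 0)
    pos1, util1 = walk(tree, 1)
    c_util = (1 - p_bomb) * util0 + p_bomb * util1
    return c_fp * (1 - p_bomb) * pos0 + c_fn * p_bomb * (1 - pos1) + c_util


def experiment(trials, seed=0):
    """Count how often each tree is cheapest as CFN and CFP vary at random."""
    rng = random.Random(seed)
    p_bomb = (1e-7 + 1e-5) / 2  # assumed fixed value for P(X=1)
    wins = {name: 0 for name in TREES}
    for _ in range(trials):
        c_fn = rng.uniform(25e6, 10e9)
        c_fp = rng.uniform(180.0, 720.0)
        best = min(TREES, key=lambda name: total_cost(TREES[name], c_fn, c_fp, p_bomb))
        wins[best] += 1
    return wins


if __name__ == "__main__":
    print(experiment(10_000))
```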


Stroud Saeger Experiments: Sensitivity Analysis

• First set of experiments: 3 attributes or types of sensors, A, B, C.

• Extensive computer experimentation.

Frequency of Top-ranked Trees when CFalseNegative and CFalsePositive are Varied

• 10,000 randomized experiments (randomly selected values of CFalseNegative and CFalsePositive from the specified range of values) for the median value of P(X=1).
• The graph has frequency counts of the number of experiments in which a particular tree was ranked first, second, third, and so on.
• Only three trees (7, 55, and 1) ever came first. 6 trees came second, 10 came third, and 13 came fourth.

[Figure: frequency (0 to 7000) against tree number (0 to 60), with series for 1st, 2nd, 3rd, 4th, and 5th rank]

Frequency of Top-ranked Trees when CFalseNegative and P(X=1) are Varied

• 10,000 randomized experiments for the median value of CFalsePositive.
• Only 2 trees (7 and 55) ever came first. 4 trees came second. 7 trees came third. 10 and 13 trees came 4th and 5th, respectively.

[Figure: frequency (0 to 8000) against tree number (0 to 60), with series for 1st, 2nd, 3rd, 4th, and 5th rank]

Frequency of Top-ranked Trees when P(X=1) and CFalsePositive are Varied

• 10,000 randomized experiments for the median value of CFalseNegative.
• Only 3 trees (7, 55, and 1) ever came first. 6 trees came second. 10 trees came third. 13 and 16 trees came 4th and 5th, respectively.

[Figure: frequency (0 to 7000) against tree number (0 to 60), with series for 1st, 2nd, 3rd, 4th, and 5th rank]

Most Frequent Tree Groups Attaining the Top Three Ranks

• Trees 7, 9 and 10

[Figure: binary decision trees 7, 9, and 10 on sensors A, B, and C]

All three decision trees have been generated from the same boolean expression, 00000111, representing F(000)F(001)…F(111). Both Tree 9 and Tree 10 are ranked second and third more than 99% of the time when Tree 7 is ranked first.

Most Frequent Tree Groups Attaining the Top Three Ranks

• Trees 55, 57 and 58

[Figure: binary decision trees 55, 57, and 58 on sensors A, B, and C]

The boolean expression for these three decision trees is 01111111. Tree 57 is ranked second 96% of the time and tree 58 is ranked third 79% of the time when tree 55 is ranked first.

Most Frequent Tree Groups Attaining the Top Three Ranks

• Trees 1, 3, and 2

[Figure: binary decision trees 1, 3, and 2 on sensors A, B, and C]

The boolean expression for these three decision trees is 00000001. Tree 3 is ranked second 98% of the time and tree 2 is ranked third 80% of the time when tree 1 is ranked first.


Values of CFalseNegative and CFalsePositive when Tree 7 was Ranked First

• This is a graph of CFalsePositive against CFalseNegative values obtained from the randomized experiments. The black dots represent points at which tree 7 scored first rank.


Values of CFalseNegative and CFalsePositive when Tree 55 was Ranked First

• Tree 55 fills up the lower area in the range of CFalseNegative and CFalsePositive values.

Values of CFalseNegative and CFalsePositive when Tree 1 was Ranked First

• Tree 1 fills up the major area in the range of CFalseNegative and CFalsePositive.

[Figure: CFalsePositive (0 to 900) against CFalseNegative (0 to 10 x 10^9)]


Values of CFalseNegative and CFalsePositive for the Three First Ranked Trees

• Trees 7, 55 and 1 fill up the entire area in the range of CFalseNegative and CFalsePositive among themselves.


Values of CTot, CFalseNegative and CFalsePositive for First Ranked Trees

• This graph shows total costs for trees 7, 55 and 1 in the respective regions in which they were ranked first.

• Each tree’s total cost is a hyperplane which cuts other hyperplanes as it gains and then loses first rank.

Values of CTot, CFalseNegative and CFalsePositive for Trees 1, 7 and 55 (Even When They Were not Ranked First)

This graph shows the extended CTot hyperplanes for trees 7, 55 and 1 for all regions.


Values of CFalseNegative and P(X=1) when Tree 7 was Ranked First

• Tree 7 again fills up the major area in the range of CFalseNegative and P(X=1).


Values of CFalseNegative and P(X=1) when Tree 55 was Ranked First

• Tree 55 fills up the rest of the area in the range of CFalseNegative and P(X=1).


Values of CFalseNegative and P(X=1) for First Ranked Trees

• Together trees 7 and 55 fill up the entire region of CFalseNegative and P(X=1).

Variations of CTot, CFalseNegative and P(X=1) for First Ranked Trees

• This graph has CTot on the third axis for trees 7 and 55 in the respective regions in which they were optimal.

• Each tree’s total cost is a conic surface.


Values of CFalsePositive and P(X=1) When Tree 7 was Ranked First

• Tree 7 fills up the major area in the range of CFalsePositive and P(X=1).


Values of CFalsePositive and P(X=1) when Tree 55 was Ranked First

• Tree 55 fills up the lower area in the range of CFalsePositive and P(X=1).

Values of CFalsePositive and P(X=1) when Tree 1 was Ranked First

• Tree 1 fills up the major area in the range of CFalsePositive and P(X=1).

[Figure: P(X=1) (0 to 1 x 10^-5) against CFalsePositive (0 to 900)]


Values of CFalsePositive and P(X=1) for First Ranked Trees

• Trees 7, 55 and 1 fill up the entire area in the range of CFalsePositive and P(X=1) among themselves.

Values of CTot, CFalsePositive and P(X=1) for First Ranked Trees

• This graph shows total costs for trees 7, 55 and 1 in the respective regions in which they were optimal.

• Each tree’s total cost is a hyperplane which cuts other hyperplanes as it gains and then loses first rank.

Modeling Sensor Errors

• One approach to sensor errors: Modeling sensor operation.
• Threshold Model:
  – Sensors have different discriminating power.
  – Many use counts (e.g., gamma radiation counts).
  – See if the count exceeds a threshold.
  – If so, say the attribute is present.

Modeling Sensor Errors

Threshold Model:

• Sensor i has discriminating power Ki and threshold Ti.
• Attribute is present if counts exceed Ti.
• Calculate the fraction of objects in each category whose readings exceed the threshold.
• Seek threshold values that minimize all costs: inspection, false positive/negative.
• Assume readings of category 0 containers follow a Gaussian distribution, and similarly for category 1 containers.
• Simulation approach.

Probability of Error for Individual Sensors

• For the ith sensor, the type 1 (P(Yi=1|X=0)) and type 2 (P(Yi=0|X=1)) errors are modeled using Gaussian distributions.
  – State of nature X=0 represents absence of a bomb.
  – State of nature X=1 represents presence of a bomb.
  – The outcome of sensor i is a count.
  – Σi is the variance of the distributions.

[Figure: characteristics of a typical sensor i, showing the count distributions for X=0 and X=1, the discriminating power Ki, and the threshold Ti]

Modeling Sensor Errors

The probability of false positive for the ith sensor is computed as:

P(Yi=1|X=0) = 0.5 erfc[Ti/√2]

The probability of detection for the ith sensor is computed as:

P(Yi=1|X=1) = 0.5 erfc[(Ti-Ki)/(Σ√2)]

Here erfc is the complementary error function, erfc(x) = Γ(1/2, x²)/√π.

The following experiments have been done using sensors A, B, C and using:

KA = 4.37; ΣA = 1
KB = 2.9; ΣB = 1
KC = 4.6; ΣC = 1

We then varied the individual sensor thresholds TA, TB and TC from -4.0 to +4.0 in steps of 0.4. These values were chosen since they gave us an “ROC curve” (see later) for the individual sensors over a complete range of P(Yi=1|X=0) and P(Yi=1|X=1).
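The two threshold formulas can be evaluated with the standard library's `math.erfc`. The sketch below traces the ROC points for one sensor over the stated threshold range. The function names are hypothetical. The default spread of 1 matches the values listed above.

```python
from math import erfc, sqrt


def false_positive_rate(threshold):
    """P(Yi=1|X=0) = 0.5 erfc[Ti/sqrt(2)]."""
    return 0.5 * erfc(threshold / sqrt(2))


def detection_rate(threshold, power, spread=1.0):
    """P(Yi=1|X=1) = 0.5 erfc[(Ti-Ki)/(spread * sqrt(2))]."""
    return 0.5 * erfc((threshold - power) / (spread * sqrt(2)))


def roc_points(power, spread=1.0, low=-4.0, high=4.0, step=0.4):
    """(false positive rate, detection rate) for each threshold in the range."""
    count = round((high - low) / step)
    thresholds = [low + k * step for k in range(count + 1)]
    return [(false_positive_rate(t), detection_rate(t, power, spread)) for t in thresholds]


if __name__ == "__main__":
    for name, power in (("A", 4.37), ("B", 2.9), ("C", 4.6)):
        points = roc_points(power)
        print(name, points[0], points[-1])
```

Raising the threshold lowers both rates, so each sensor's points run from the upper right of the ROC plot toward the lower left.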

Frequency of First Ranked Trees for Variations in Sensor Thresholds

• 68,921 experiments were conducted, as each Ti was varied through its entire range.
• The graph has frequency counts of the number of experiments in which a particular tree was ranked first. There are 15 such trees. Tree 37 had the highest frequency of attaining rank one.

[Figure: frequency (0 to 18000) against tree number (0 to 60)]

Stroud Saeger Experiments: Our Sensitivity Analysis: 4 Sensors

• Second set of computer experiments: 4 attributes or types of sensors, A, B, C, D.
• Same values as before.
• Experiment 1: Fix values of two of CFalseNegative, CFalsePositive, P(X=1) and vary the third.
• Experiment 2: Fix a value of one of CFalseNegative, CFalsePositive, P(X=1) and vary the other two through their interval of possible values. Do 10,000 experiments each time.
• Look for the variation in the highest ranked tree.


Stroud Saeger Experiments: Our Sensitivity Analysis: 4 Sensors

• Experiment 1: Fix values of two of CFalseNegative, CFalsePositive, P(X=1) and vary the third.


CTot vs CFalseNegative for Ranked 1 Trees (Trees 11485(9651) and 10129(349))

Only two trees ever were ranked first, and one, tree 11485, was ranked first in 9651 out of 10,000 runs.


CTot vs CFalsePositive for Ranked 1 Trees (Tree no. 11485 (10000))

One tree, number 11485, was ranked first every time.


CTot vs P(X=1) for Ranked 1 Trees (Tree no. 11485(8372), 10129(488), 11521(1056))

Three trees dominated first place. Trees 10201(60), 10225(17) and 10153(7) also achieved first rank but with relatively low frequency.

Tree Structure and Corresponding Boolean Expressions

[Figure: two binary decision trees on sensors a, b, c, and d]

• Tree number 11485. Boolean expression: 0101011101111111
• Tree number 10129. Boolean expression: 0001011101111111


Stroud Saeger Experiments: Our Sensitivity Analysis: 4 Sensors

• Experiment 2: Fix the values of one of CFalseNegative, CFalsePositive, P(X=1) and vary the others.

Frequency of First Ranked Trees when Two Parameters (CFalseNegative and CFalsePositive) were Varied, Keeping P(X=1) Constant at Randomly Selected Values

[Figure: frequency (0 to 2 x 10^5) against tree number (0 to 12000)]

Trees coming first: 9541, 10129, 10153, 10201, 11485, 11521

10,000 randomized experiments with randomly selected values of P(X=1). The experiments were repeated for 20 different randomly selected values of P(X=1).

Frequency of First Ranked Trees when Two Parameters (CFalseNegative and P(X=1)) were Varied, Keeping CFalsePositive Constant at Randomly Selected Values

[Figure: frequency (0 to 14 x 10^4) against tree number (0 to 12000)]

Trees coming first: 505, 4695, 5105, 5129, 7353, 9541, 10129, 10153, 10201, 10225, 11485, 11521

10,000 randomized experiments with randomly selected values of CFalsePositive. The experiments were repeated for 20 different randomly selected values of CFalsePositive.

Page 75

Frequency of First Ranked Trees when Two Parameters (P(X=1) and CFalsePositive) were Varied Keeping CFalseNegative Constant at Randomly Selected Values

[Figure: frequency (scale x 10^4) plotted against tree number (0.95 x 10^4 to 1.2 x 10^4).]

Trees coming first: 9541, 10129, 10153, 10201, 10225, 11485, 11521

10,000 randomized experiments with randomly selected values of CFalseNegative. The experiments were repeated for 20 different randomly selected values of CFalseNegative.

Page 76

Variation of CTot wrt CFalseNegative and CFalsePositive, for Tree Ranked First (Tree nos. 11485 and 10129)

CTot = CFalsePositive *P(X=0)*P(Y=1|X=0) + CFalseNegative *P(X=1)*P(Y=0|X=1) + Cutil

[Figure: plots of CTot, with regions labeled for trees 11485 and 10129.]

Page 77

Variation of CTot wrt CFalseNegative and P(X=1), for Tree Ranked First (Tree no. 11485(8121), 10129(728) and 11521(984))

CTot = CFalsePositive *P(X=0)*P(Y=1|X=0) + CFalseNegative *P(X=1)*P(Y=0|X=1) + Cutil

Trees 505, 5105, 5129, 9541, 10153, 10201 and 10225 also attained rank 1, but with very low frequency (<100).

[Figure: plots of CTot, with regions labeled for trees 10129 and 11485.]

Page 78

Variation of CTot wrt CFalsePositive and P(X=1), for Tree Ranked First (Tree no. 11485(7162), 10129(1690) and 11521(851))

CTot = CFalsePositive *P(X=0)*P(Y=1|X=0) + CFalseNegative *P(X=1)*P(Y=0|X=1) + Cutil

Trees 10153, 10201 and 10225 also attained first rank, 80, 195 and 22 times respectively.

[Figure: plots of CTot, with regions labeled for trees 10129 and 11485.]

Page 79

Receiver Operating Characteristic (ROC) Curve

• The ROC curve is the plot of the probability of correct detection (PD) vs. the probability of false positive (PF).

• The ROC curve is used to select an operating point, which sets the tradeoff between PD and PF.

• Each sensor has a ROC curve and the combination of the sensors into a decision tree has a composite ROC curve.

• Different operating points on the ROC curve are obtained by varying the sensor threshold or, for a decision tree, the combination of its sensors' thresholds.

• Equal Error Rate (EER) is the operating point on the ROC curve where PF = 1 – PD

• We can use ROC curves to identify optimal thresholds for sensors.
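To make the relationship between a threshold, an operating point and the EER concrete, here is a minimal sketch. It assumes a sensor whose reading is normally distributed, with different means and spreads for X=0 and X=1, and that an alarm is raised when the reading exceeds the threshold. The distribution parameters and function names are illustrative assumptions, not values from the study.

```python
import math

def tail(threshold, mean, sd):
    """P(reading > threshold) for a normally distributed reading."""
    return 0.5 * math.erfc((threshold - mean) / (sd * math.sqrt(2)))

def roc_point(threshold, clean=(0.0, 1.0), threat=(3.0, 1.5)):
    """Return (PF, PD) for one threshold.

    clean and threat are assumed (mean, sd) pairs for X=0 and X=1.
    """
    pf = tail(threshold, *clean)   # alarm although X=0
    pd = tail(threshold, *threat)  # alarm when X=1
    return pf, pd

def equal_error_threshold(lo=-10.0, hi=10.0, iters=60):
    """Bisect for the threshold where PF = 1 - PD.

    PF falls and 1 - PD rises as the threshold increases, so the
    difference PF - (1 - PD) changes sign exactly once.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        pf, pd = roc_point(mid)
        if pf > 1 - pd:
            lo = mid  # too many false positives: raise the threshold
        else:
            hi = mid
    return (lo + hi) / 2

# Sweep thresholds to trace the ROC curve, then locate the EER point.
curve = [roc_point(t / 10) for t in range(-30, 71)]
t_eer = equal_error_threshold()
pf_eer, pd_eer = roc_point(t_eer)
print("EER threshold:", t_eer, "PF:", pf_eer, "1 - PD:", 1 - pd_eer)
```

Each threshold yields one (PF, PD) pair, so sweeping the threshold traces the curve and choosing a threshold picks the operating point.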

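[Figure: left, the distributions P(i|X=0) and P(i|X=1) of a sensor reading, with labels Ti and Ki; right, the ROC curve plotting PD against PF on axes from 0 to 1, with the operating point and the EER marked.]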

Page 80

Receiver Operating Characteristic (ROC) Curve

• We seek operating characteristics of sensors that place us in the upper left hand corner of the ROC curve.

• Here, PF is small and PD is large.

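[Figure: left, the distributions P(i|X=0) and P(i|X=1) of a sensor reading, with labels Ti and Ki; right, the ROC curve plotting PD against PF on axes from 0 to 1, with the operating point and the EER marked.]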

Page 81

Performance of Sensors Against that of Tree 37 (Most Frequent Tree Attaining Rank 1)

• The black, blue and red dotted lines represent performance characteristics (ROC curve) of sensors A, B and C.

• The green dots represent the performance characteristics (P(Y=1|X=0), P(Y=1|X=1)) of the tree over all combinations of sensor thresholds (Ti).

Page 82

Performance of Sensors Against that of Tree 37

• This zoomed-in figure of the ROC curve displays the region of high detection probabilities and low false positive probabilities.

• Points lying on the diagonal line are the Equal Error Rates for this tree and the sensors. The tree achieves an equal error rate of 0.0027, while sensors A, B and C have EERs of 0.0145, 0.0738 and 0.0107, respectively.

Page 83

Best Possible ROC Curve for Tree 37

• Assuming the performance probabilities P(Y=1|X=1) and P(Y=1|X=0) are monotonically related (in the sense that P(Y=1|X=1) can be called a monotonic function of P(Y=1|X=0)), we can find an ROC curve for the tree by taking, for each value of P(Y=1|X=0), the maximum value of P(Y=1|X=1) attained.

• The blue dots represent such an ROC curve, the “best” ROC curve for tree 37.
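One way to extract such a "best" curve from a cloud of (P(Y=1|X=0), P(Y=1|X=1)) points is sketched below. It keeps a point only if its detection probability exceeds that of every point with an equal or lower false positive probability, which is one reading of the monotonicity assumption above. The function name `best_roc` and the sample points are illustrative assumptions, not data from the study.

```python
def best_roc(points):
    """Return the upper envelope of (pf, pd) points as a rising staircase.

    points: iterable of (P(Y=1|X=0), P(Y=1|X=1)) pairs, one per
    combination of sensor thresholds.
    """
    frontier = []
    best_pd = -1.0
    # Sort by false positive probability; on ties, try the higher
    # detection probability first so the lower one is discarded.
    for pf, pd in sorted(points, key=lambda p: (p[0], -p[1])):
        if pd > best_pd:
            frontier.append((pf, pd))
            best_pd = pd
    return frontier

# Made-up operating points for illustration only.
sample = [(0.10, 0.50), (0.10, 0.70), (0.20, 0.60), (0.30, 0.90), (0.05, 0.20)]
print(best_roc(sample))
```

Points such as (0.20, 0.60) are dropped because another threshold combination achieves a higher detection probability at a lower false positive probability.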

Page 84

Conclusions from Sensitivity Analysis

• Considerable lack of sensitivity to modification in parameters for trees using 3 or 4 sensors.

• Very few optimal trees.

• Very few boolean functions arise among optimal trees.

Page 85

Some Complications

•More complicated cost models; bringing in costs of delays

•More than two values of an attribute

(present, absent, present with probability > 75%, absent with probability at least 75%)

(ok, not ok, ok with probability > 99%, ok with probability between 95% and 99%)

•Inferring the boolean function from observations (partially defined boolean functions)

Page 86

Some Research Challenges

•Explain why conclusions are so insensitive to variation in parameter values.

•Explore the structure of the optimal trees and compare the different optimal trees.

•Develop less brute force methods for finding optimal trees that might work if there are more than 4 attributes.

•Develop methods for approximating the optimal tree.

Pallet vacis

Page 87

Closing Remark

•Recall that the “cost” of inspection includes the cost of failure, including failure to foil a terrorist plot.

•There are many ways to lower the total “cost” of inspection:

Use more efficient orders of inspection.

Find ways to inspect more containers.

Find ways to cut down on delays at inspection lanes.

Page 88

Research Team

• Saket Anand, Rutgers, ECE graduate student

• Endre Boros, Rutgers, Operations Research

• Elsayed Elsayed, Rutgers, Ind. & Systems Engineering

• Liliya Fedzhora, Rutgers, Operations Res. grad. student

• Paul Kantor, Rutgers, Schl. of Infor. & Library Studies

• Abdullah Karaman, Rutgers Ind. & Syst. Eng. grad. student

• Alex Kogan, Rutgers, Business School

• Paul Lioy, Rutgers/UMDNJ, Environmental and Occupational Health and Sciences Institute

• David Madigan, Rutgers, Statistics

• Richard Mammone, Rutgers, Center for Advanced Information Processing

• S. Muthukrishnan, Rutgers, Computer Science

• Saumitr Pathak, Rutgers ECE graduate student

• Richard Picard, Los Alamos, Statistical Sciences Group

• Fred Roberts, Rutgers, DIMACS Center

• Kevin Saeger, Los Alamos, Homeland Security

• Phillip Stroud, Los Alamos, Systems Engineering and Integration Group

• Hao Zhang, Rutgers Ind. & Systems Eng., graduate student

Page 89

Collaborators on Sensitivity Analysis:

• Saket Anand

• David Madigan

• Richard Mammone

• Saumitr Pathak

Research Support:

• Office of Naval Research

• National Science Foundation

Los Alamos National Laboratory:

• Rick Picard

• Kevin Saeger

• Phil Stroud