Large-Scale Knowledge Processing (1st part, Lecture-1) Shin-ichi Minato Division of Computer Science and Information Technology, Graduate School of Information Science and Technology, Hokkaido University


2017.12.01 Large-Scale Knowledge Processing 2

Schedule of this year

• 1st part (given by Prof. Minato)
  – Techniques for large-scale discrete structure manipulation
  – BDDs/ZDDs and their applications
• 2nd part (given by Prof. Takigawa)
  – Probability distributions and statistical models
  – Inference algorithms for various statistical models


Contents of the 1st part (tentative)

• Lec-1 (2017.12.01) BDDs for knowledge representation
• Lec-2 (2017.12.06) Implementation of BDD manipulator
• Lec-3 (2017.12.08) Variable ordering for BDDs
• Lec-4 (2017.12.13) BDD applications and ZDDs
• Lec-5 (2017.12.15) ZDD app. (1): database analysis
• Lec-6 (2017.12.20) ZDD extensions for manipulating sequences and permutations
• Lec-7 (2017.12.22) ZDD app. (2): enumeration & indexing

URL of lecture notes: http://art.ist.hokudai.ac.jp/~minato/lskp2017.html

(No lecture on 2017.12.27 so far.)


Today’s contents

• Boolean functions and knowledge representation
  – Binary/multi-valued, Boolean expressions and functions, propositional and predicate logic
• “Manipulation” of Boolean functions
  – Generating Boolean function data from Boolean expressions
  – Binary logic operations between Boolean functions
  – Checking of tautology, equivalence and implication
  – Searching for optimal solutions
• Techniques for representing Boolean functions
  – Required properties
  – Total number of Boolean functions, information-theoretic bit size, randomized functions
  – Karnaugh maps, truth tables, SOP forms and BDDs
• BDD: basic structures and properties

BDDs for Knowledge representation


Boolean functions

• Function: a binary relation from a set A to a set B such that exactly one element of B corresponds to each element of A.
• Boolean function / logic function: B = {0,1} and f: B^n → B (binary-valued logic function, or switching function).
• We may also consider multi-valued Boolean functions.

[Figure: a box labeled “Boolean function f(X)” with inputs x1, x2, …, xn (input vector X) and a single output.]


One of the most basic models for knowledge representation on computers.


Use of Boolean functions (1): Logic design

• Optimal design of VLSI logic circuits
  – Small-size, low-power design of cellular phones.
  – High-speed, low-power design of microprocessors.
• Formal design verification
  – Mathematical proof that a design contains no errors.


Use of Boolean functions (2): Machine learning

• Optimal design of machines for classifying data.
  – For any input bit pattern, the machine (Boolean function) answers Yes or No.
  – Given training data (input patterns and correct answers), the Boolean function is changed adaptively, so that an appropriate machine can be obtained automatically.


Use of Boolean functions (3): Data mining

• Discovering useful knowledge from large-scale data
  – For example, checking supermarket POS data to find frequently purchased item combinations.
  – There are n items in the supermarket. For a 0/1 bit pattern that represents the items in a customer’s basket, the output is Yes if the item combination is a frequent pattern.


Use of Boolean functions (4): Constraint satisfaction

• Various real-life problems can be interpreted as finding input patterns that satisfy a given constraint.
  – For any input pattern (0/1 bit pattern), the Boolean function returns Yes or No.
  – To solve the problem, we sometimes divide it into smaller sub-problems. This is where Boolean function manipulation comes in.


Boolean function and propositional logic

• Propositional logic: proposition symbols (true or false) and their logic operations. This is at the same conceptual level as Boolean functions.
  (ex.) A student is a human being.
        Student ⇒ Human ≡ (~Student + Human)
• Predicate logic: predicates and their logic operations. A predicate takes a symbol for its subject (not limited to Boolean symbols). This is a higher-level concept than Boolean functions.
  (ex.) If X is a student, then X is a human being.
        Student(X) ⇒ Human(X) ≡ (~Student(X) + Human(X))


Boolean expressions and Boolean functions

• Boolean expressions: expressions that consist of Boolean symbols and logic operations (AND, OR, NOT, etc.).
• A Boolean expression represents a Boolean function.

  (ex.) F = a b + a ~c + ~a b c + d

  a b c d | F
  --------+--
  1 1 - - | 1
  1 - 0 - | 1
  0 1 1 - | 1
  - - - 1 | 1

  [Figure: AND-OR two-level circuit for F, with AND gates over (a, b), (a, ~c) and (~a, b, c), and input d, feeding one OR gate.]

• Different Boolean expressions may represent the same function.

  (ex.) F1 = x ~y + x z + ~x y + ~x ~z
        F2 = x ~y + ~x y + y z + ~y ~z
        F3 = x ~y + ~x ~z + y z
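One way to see that F1, F2 and F3 denote the same function is to compare their truth tables. The following is a minimal Python sketch, not part of the lecture material; the helper name `truth_table` is an assumption chosen for illustration.

```python
from itertools import product

def truth_table(f, n):
    """Return the outputs of f for all 2**n input patterns, in counting order."""
    return tuple(int(bool(f(*bits))) for bits in product((0, 1), repeat=n))

# The three expressions from the slide, written with Python's and/or/not.
def F1(x, y, z):
    return (x and not y) or (x and z) or (not x and y) or (not x and not z)

def F2(x, y, z):
    return (x and not y) or (not x and y) or (y and z) or (not y and not z)

def F3(x, y, z):
    return (x and not y) or (not x and not z) or (y and z)

tables = [truth_table(f, 3) for f in (F1, F2, F3)]
print(tables[0])                              # truth table of F1
print(tables[0] == tables[1] == tables[2])    # do all three tables agree?
```

Enumeration is only practical for small n, because the table has 2^n rows.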


Notations of logic operators

• There are various notations for logic operators; unfortunately, they are not well standardized.

  AND:   x y     x・y     x ∧ y     x & y
  OR:    x + y     x | y     x ∨ y
  NOT:   ~x     ¬x     !x     x̄
  EXOR:  x ⊕ y     x↑y     x ^ y


“Manipulation” of Boolean functions (1)

• Generating Boolean function data from Boolean expressions or circuits (logic operations between two Boolean functions).
• Logic operations are used in many kinds of real-life problems.

[Figure: a logic circuit with inputs X and Y; each signal is annotated with its truth table F. The seven tables, in order of appearance:]

  XY | 00 10 01 11
  ---+------------
  F  |  0  1  0  1
  F  |  0  0  1  1
  F  |  1  1  0  0
  F  |  1  0  1  0
  F  |  1  0  1  1
  F  |  1  1  0  1
  F  |  0  1  1  0


“Manipulation” of Boolean functions (2)

• Tautology / inconsistency checking (co-NP-complete problem)
  (ex.) x y + ~x + ~y ~z + z = ?
• Equivalence checking: (F ≡ G) ⇔ (F G + ~F ~G ≡ 1)
  (ex.) x ~y + x z + ~x y + ~x ~z
        x ~y + ~x y + y z + ~y ~z
        x ~y + ~x ~z + y z
• Implication checking: (F ⇒ G) ⇔ (~F + G ≡ 1)
• Searching for constraint satisfaction:
  – SAT: the problem of finding a pattern of input values that satisfies the Boolean function (NP-complete problem)
  – Optimization: the problem of finding a solution with the minimal (or maximal) cost (NP-hard; its decision version is NP-complete)
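The reductions above (equivalence and implication both reduce to tautology checking) can be sketched by brute force. This is an illustrative Python sketch under the assumption that functions are small enough to enumerate; the names `is_tautology`, `equivalent`, `implies` and `find_satisfying` are hypothetical helpers, not from the lecture.

```python
from itertools import product

def is_tautology(f, n):
    """F ≡ 1: f is true for every one of the 2**n input patterns."""
    return all(f(*bits) for bits in product((0, 1), repeat=n))

def equivalent(f, g, n):
    """(F ≡ G) ⇔ (F G + ~F ~G ≡ 1)."""
    return is_tautology(
        lambda *b: (f(*b) and g(*b)) or (not f(*b) and not g(*b)), n)

def implies(f, g, n):
    """(F ⇒ G) ⇔ (~F + G ≡ 1)."""
    return is_tautology(lambda *b: (not f(*b)) or g(*b), n)

def find_satisfying(f, n):
    """SAT by exhaustive search: return one satisfying pattern, or None."""
    for bits in product((0, 1), repeat=n):
        if f(*bits):
            return bits
    return None

# The tautology example from the slide: x y + ~x + ~y ~z + z
def T(x, y, z):
    return (x and y) or (not x) or (not y and not z) or z

print(is_tautology(T, 3))
print(implies(lambda x, y: x and y, lambda x, y: x, 2))
print(find_satisfying(lambda x, y, z: x and not y and z, 3))
```

Each check visits up to 2^n patterns, which is why more compact representations become important for larger n.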


Method for representing small Boolean functions

• In many cases, schematic representations have been used, since they are easy for human beings to read.

Karnaugh map:

  zw \ xy | 00 01 11 10
  --------+------------
     00   |  1  1  0  0
     01   |  0  1  1  1
     11   |  1  1  1  1
     10   |  1  0  0  0

Logic circuit:

[Figure: a logic circuit with inputs x, y, z and output F.]


Requirements for Boolean function representations

• Compactness
  – Total number of n-input Boolean functions: 2^(2^n).
    With a fixed-length data format, at least 2^n bits are needed (cf. truth table).
  – In real-life problems, not all functions appear equally often. The idea is to use a variable-length data format that represents frequently used functions more compactly.
• Fast processing
  – How quickly can the equivalence of two functions be checked?
  – How efficiently can logic operations such as AND, OR and NOT be performed?
• Readability for human beings is not important.
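The count 2^(2^n) follows from the truth table: there are 2^n input patterns, and each can independently map to 0 or 1. A small Python sketch (the function name `count_functions` is an assumption for illustration) confirms this by enumerating every possible truth table for small n.

```python
from itertools import product

def count_functions(n):
    """Count n-input Boolean functions by enumerating every possible truth table."""
    rows = 2 ** n                      # one output bit per input pattern
    return sum(1 for _ in product((0, 1), repeat=rows))

for n in range(4):
    # n, bits per truth table, enumerated count, closed form
    print(n, 2 ** n, count_functions(n), 2 ** (2 ** n))
```

Since 2^(2^n) distinct functions must be distinguished, no fixed-length encoding can use fewer than 2^n bits.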


Truth table representation

• Essentially the same as Karnaugh maps, but there is no need to lay them out for human beings (simple arrays).
• Good for vector / parallel computation.
• Even for simple Boolean functions, data of exponential size is always needed. (Hard to deal with 30 or 40 inputs.)
• Any function, however complex, can be processed in the same computation time.

  x1 x2 x3 … xn | F1 F2 F3
  --------------+---------
   0  0  0 …  0 |  1  0  1
   1  0  0 …  0 |  0  1  0
   0  1  0 …  0 |  1  1  0
   1  1  0 …  0 |  0  0  0
   0  0  1 …  0 |  1  1  0
   1  0  1 …  0 |  0  1  0
   0  1  1 …  0 |  1  1  0
   …            |  …
   1  1  1 …  1 |  1  0  0
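The remark about vector / parallel computation can be illustrated by storing a whole truth table in one machine word and using bitwise operators. The following Python sketch is an assumption-laden illustration (names such as `variable`, `NOT` and `MASK` are hypothetical): bit k of the integer holds the output for the input pattern whose binary code is k.

```python
N = 3                                  # number of input variables
ROWS = 1 << N                          # a truth table has 2**N rows
MASK = (1 << ROWS) - 1                 # keeps results within ROWS bits

def variable(i):
    """Truth table of the i-th variable: bit k is 1 iff bit i of row index k is 1."""
    return sum(1 << k for k in range(ROWS) if (k >> i) & 1)

def NOT(f):
    return ~f & MASK

a, b, c = variable(0), variable(1), variable(2)
F = (a & b) | NOT(c)                   # F = a b + ~c, all 8 rows computed at once

print(format(F, "08b"))                # rows 7..0, most significant bit first
print(bin(F).count("1"))               # number of input patterns with F = 1
print(F | NOT(F) == MASK)              # tautology check: F + ~F is all ones
```

Every operation touches all 2^N rows regardless of how simple the function is; for 30 inputs a single table already occupies 2^30 bits (128 MiB).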


Boolean expressions in sum-of-products form

• Positive and negative literals and AND-OR two-level logic circuits.
• Size is linear in the total number of literals.
• Simple Boolean functions can be represented compactly.
• The form is not unique for a function, and equivalence checking is time-consuming.
• Not good at NOT and EXOR operations.
• Most widely used until BDDs were invented (and still often used).

  (ex.) a b + a ~c + ~a b c + d

  a b c d | F
  --------+--
  1 1 - - | 1
  1 - 0 - | 1
  0 1 1 - | 1
  - - - 1 | 1

  [Figure: AND-OR two-level circuit for F, with AND gates over (a, b), (a, ~c) and (~a, b, c), and input d, feeding one OR gate.]
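The cube table for this example maps directly onto a simple data structure. The sketch below is a hypothetical Python illustration (the names `COVER`, `cube_covers`, `eval_sop` and `literal_count` are assumptions): each product term is a string over '1', '0' and '-', and the function value is the OR over all matching cubes.

```python
from itertools import product

# Cover of F = a b + a ~c + ~a b c + d, one cube per product term.
# Each cube lists a, b, c, d in order:
# '1' = positive literal, '0' = negative literal, '-' = variable absent.
COVER = ["11--", "1-0-", "011-", "---1"]

def cube_covers(cube, bits):
    """True if the input pattern satisfies every literal of the cube."""
    return all(lit == "-" or int(lit) == bit for lit, bit in zip(cube, bits))

def eval_sop(cover, bits):
    """OR over all cubes: the output of the AND-OR two-level circuit."""
    return int(any(cube_covers(cube, bits) for cube in cover))

def literal_count(cover):
    """Total number of literals; the storage size is proportional to this."""
    return sum(ch != "-" for cube in cover for ch in cube)

def F(a, b, c, d):
    """The same function written as a Boolean expression, for comparison."""
    return int(bool((a and b) or (a and not c) or (not a and b and c) or d))

agree = all(eval_sop(COVER, bits) == F(*bits) for bits in product((0, 1), repeat=4))
print(agree, literal_count(COVER))
```

OR of two covers is just list concatenation, whereas NOT has no comparably cheap operation, which matches the weakness noted above.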


BDD (Binary Decision Diagram)

Techniques for Boolean function representation & manipulation

• Reduced (compressed) binary decision tree.
• Canonical and compact representation for many practical Boolean functions (100~1000x compression in some cases).
• Innovative BDD synthesis algorithm, proposed by Bryant (CMU) in 1986. (The most cited paper for many years in all EE & CS areas.)
• BDDs have been used more widely as PC memory capacity has grown (especially after 2000).

[Figure: a binary decision tree over the variables a, b, c is reduced (compressed) into a BDD with terminal nodes 0 and 1.]

[Figure: a binary logic operation (e.g. AND) between two BDDs yields a new BDD; computation time is almost linear in the compressed BDD size.]


Examples of BDDs

[Figure: three diagrams over the variables a, b, c:
  – Binary decision tree (a BDD equivalent to the truth table)
  – Reduced Ordered BDD
  – Unordered BDD (neither reduced nor ordered)]


Nodes and edges in BDDs

• Decision node
• Terminal node
• 0-edge
• 1-edge
• Sub-graph

[Figure: a BDD over a, b, c with terminal nodes 0 and 1, and a decision node v of function F whose 0-edge leads to sub-graph F0 and whose 1-edge leads to sub-graph F1.]

Shannon’s expansion (Boole’s expansion): F(v, X) = ~v F(0, X) + v F(1, X)
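Shannon's expansion is what a decision node encodes: the 0-edge leads to F(0, X) and the 1-edge to F(1, X). As a hedged illustration, the Python sketch below checks the identity exhaustively for the example function F = a b + ~c; the helper names `cofactor` and `expansion_holds` are assumptions, not from the lecture.

```python
from itertools import product

def F(a, b, c):
    """F = a b + ~c, the example function used in these slides."""
    return int(bool((a and b) or not c))

def cofactor(f, index, value):
    """Fix the variable at position index to value, giving F0 or F1."""
    def restricted(*bits):
        bits = list(bits)
        bits[index] = value
        return f(*bits)
    return restricted

def expansion_holds(f, n, index):
    """Check F(v, X) = ~v F(0, X) + v F(1, X) for v = variable number index."""
    f0, f1 = cofactor(f, index, 0), cofactor(f, index, 1)
    return all(
        f(*bits) == int(bool((not bits[index] and f0(*bits))
                             or (bits[index] and f1(*bits))))
        for bits in product((0, 1), repeat=n)
    )

print([expansion_holds(F, 3, i) for i in range(3)])
```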


ROBDD (Reduced Ordered BDD)

• In general, the BDD for a Boolean function is not unique.
• Ordered BDD
  – A fixed total order is defined on the variables.
  – On each path from the root node to a terminal node, the variables appear in an order consistent with that total order.
• Reduced BDD
  – A BDD to which the two reduction rules have been applied as much as possible.
• ROBDDs have the most important properties.
  – In this lecture, we may simply say “BDD” when discussing ROBDDs.


BDD reduction rules

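(a) Eliminate all redundant nodes.
    [Figure (a): a node x whose 0-edge and 1-edge both lead to the same sub-graph f is skipped (jump).]
(b) Share all equivalent nodes.
    [Figure (b): two nodes x with the same children f0 and f1 are merged into one node (share).]

Applying both rules as much as possible gives a reduced BDD.

(cf.) When using rule (b) only, we obtain a “quasi-reduced” BDD.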
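Both rules fit in a single node-creation routine. The following Python sketch is an illustration under stated assumptions (the names `build_robdd`, `mk` and `unique` are hypothetical, and node ids 0 and 1 are reserved for the terminals): it walks the full decision tree of F = a b + ~c, so it still takes exponential time, but it shows how rules (a) and (b) collapse the tree.

```python
def build_robdd(f, n):
    """Enumerate the decision tree of f bottom-up, applying rules (a) and (b).

    f takes a tuple of n bits (in variable order). Returns (root, unique),
    where unique maps (variable index, 0-child, 1-child) to a node id.
    Ids 0 and 1 are the terminal nodes.
    """
    unique = {}

    def mk(var, low, high):
        if low == high:                 # rule (a): redundant node, jump over it
            return low
        key = (var, low, high)
        if key not in unique:           # rule (b): share equivalent nodes
            unique[key] = len(unique) + 2
        return unique[key]

    def expand(bits):
        if len(bits) == n:              # leaf of the decision tree
            return int(bool(f(bits)))
        return mk(len(bits), expand(bits + (0,)), expand(bits + (1,)))

    return expand(()), unique

# F = a b + ~c with variable order a, b, c
root, nodes = build_robdd(lambda v: (v[0] and v[1]) or not v[2], 3)
print("decision nodes in the tree  :", 2 ** 3 - 1)
print("decision nodes after reduce :", len(nodes))
for (var, low, high), node_id in sorted(nodes.items(), key=lambda kv: kv[1]):
    print(f"node {node_id}: variable {'abc'[var]}, 0-edge -> {low}, 1-edge -> {high}")
```

Dropping the `low == high` test in `mk` would apply rule (b) only, giving the quasi-reduced form mentioned above.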


Example of BDD reduction

• Whatever order the reduction rules are applied in, the final BDD form is unique.
• It is a unique (canonical) form for a Boolean function.
  – However, different variable orders may produce different forms (occasionally the same, but in general different).

[Figure: step-by-step reduction of the binary decision tree over a, b, c into its reduced BDD.]


Size of BDDs

• Total number of n-input Boolean functions: 2^(2^n).
  To distinguish all the functions, at least 2^n bits are needed.
• For BDDs too, the worst-case size is O(2^n / n) nodes.
  – O(n) bits for each node, so O(2^n) bits in total.
• However, for many practical Boolean functions, the number of BDD nodes is bounded by a polynomial in n.


n-input AND, OR, EXOR functions

• The number of BDD nodes is linear in n: O(n).
• Swapping the 0- and 1-terminal nodes gives NAND/NOR/XNOR.

[Figure: BDDs over x1, x2, x3, x4 for AND, OR and EXOR (parity).]
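The linear growth can be observed by counting nodes for small n. The Python sketch below is a brute-force illustration (it enumerates all 2^n inputs, so it is only meant for small n); `build_robdd` and `node_counts` are hypothetical helper names, and the builder applies the two reduction rules while enumerating the decision tree.

```python
from functools import reduce
from operator import xor

def build_robdd(f, n):
    """Reduced ordered BDD of f (a function of an n-bit tuple); returns the node table."""
    unique = {}                         # (variable index, 0-child, 1-child) -> node id

    def mk(var, low, high):
        if low == high:                 # eliminate redundant nodes
            return low
        return unique.setdefault((var, low, high), len(unique) + 2)

    def expand(bits):
        if len(bits) == n:
            return int(bool(f(bits)))
        return mk(len(bits), expand(bits + (0,)), expand(bits + (1,)))

    expand(())
    return unique

FUNCTIONS = {
    "AND": all,                                 # 1 iff every input is 1
    "OR": any,                                  # 1 iff some input is 1
    "EXOR": lambda bits: reduce(xor, bits),     # parity of the inputs
}

def node_counts(n):
    """Number of decision nodes of each n-input function."""
    return {name: len(build_robdd(f, n)) for name, f in FUNCTIONS.items()}

for n in (2, 4, 8):
    print(n, node_counts(n))
```

In this sketch AND and OR need one node per variable, while parity needs two nodes per level except at the root.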


n-bit binary-coded arithmetic addition

• As n increases, the BDD grows vertically but its width is bounded. BDD size: O(n) nodes.
• The subtraction function also needs O(n) nodes.
• The equality/inequality function also needs O(n) nodes.

[Figure: BDD for the 3-bit adder function with outputs S0, S1, S2 and C. (Variable order: a2, b2, a1, b1, a0, b0, c0)]


Effect of variable ordering (2)

• 8-input data selector
  – Control inputs higher than data inputs: O(n) nodes.
  – Data inputs higher than control inputs: O(2^n) nodes.
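The gap between the two orderings can be reproduced for the 8-input selector (3 control inputs, 8 data inputs). The following Python sketch is a brute-force illustration, not the lecture's implementation; `robdd_size`, `mux`, `control_first` and `data_first` are hypothetical names, and the assumption is that the control inputs encode the selected data index in binary.

```python
def robdd_size(f, n):
    """Number of decision nodes in the reduced ordered BDD of f over n ordered bits."""
    unique = {}                         # (variable index, 0-child, 1-child) -> node id

    def mk(var, low, high):
        if low == high:                 # eliminate redundant nodes
            return low
        return unique.setdefault((var, low, high), len(unique) + 2)

    def expand(bits):
        if len(bits) == n:
            return int(f(bits))
        return mk(len(bits), expand(bits + (0,)), expand(bits + (1,)))

    expand(())
    return len(unique)

def mux(control, data):
    """8-input data selector: the 3 control bits pick one of the 8 data bits."""
    index = control[0] + 2 * control[1] + 4 * control[2]
    return data[index]

# Same function, two variable orders (position in the tuple = position in the order).
def control_first(bits):
    return mux(bits[:3], bits[3:])      # control inputs higher than data inputs

def data_first(bits):
    return mux(bits[8:], bits[:8])      # data inputs higher than control inputs

print("control inputs first:", robdd_size(control_first, 11))
print("data inputs first   :", robdd_size(data_first, 11))
```

With the control inputs on top, each data input needs only one node; with the data inputs on top, the BDD has to remember every data bit before it learns which one is selected.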


BDD construction algorithm

• Reducing a “binary decision tree” to a BDD always requires exponential time and space. Instead, we use the algorithm of [Bryant1986] to construct BDDs directly from given Boolean expressions.

[Figure: for F = a b + ~c, the binary decision tree is turned into the BDD by reduction (compression), whereas direct construction builds the BDD from the expression without going through the tree.]


BDD construction from Boolean expressions

• Any BDD can be generated by repeating binary logic operations between two BDDs (computation time is almost linear in the compressed BDD sizes).

[Figure: construction of the BDD for a b + ~c. Starting from the single-variable BDDs for a, b and c: AND gives a b, NOT gives ~c, and OR gives a b + ~c.]
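The construction of a b + ~c by repeated binary operations can be sketched as follows. This is a simplified Python illustration in the spirit of the apply-style algorithm, not the lecture's implementation: the class name `BDD` and its methods `mk`, `var`, `apply`, `neg` and `size` are assumptions, and NOT is realized here as EXOR with the 1-terminal.

```python
import operator

class BDD:
    """Minimal ROBDD manager: node ids 0 and 1 are terminals, others are decision nodes."""

    def __init__(self, order):
        self.level = {name: i for i, name in enumerate(order)}
        self.nodes = {0: None, 1: None}     # id -> (variable, 0-child, 1-child)
        self.unique = {}                    # (variable, 0-child, 1-child) -> id

    def mk(self, var, low, high):
        if low == high:                     # rule (a): skip redundant node
            return low
        key = (var, low, high)
        if key not in self.unique:          # rule (b): share equivalent nodes
            self.unique[key] = len(self.nodes)
            self.nodes[self.unique[key]] = key
        return self.unique[key]

    def var(self, name):
        return self.mk(name, 0, 1)

    def _top(self, u):
        return self.level[self.nodes[u][0]] if u > 1 else len(self.level)

    def apply(self, op, f, g, memo=None):
        """Combine two BDDs with a binary operator by simultaneous Shannon expansion."""
        memo = {} if memo is None else memo
        if f <= 1 and g <= 1:
            return int(op(f, g))
        if (f, g) not in memo:
            top = min(self._top(f), self._top(g))
            f0, f1 = self.nodes[f][1:] if self._top(f) == top else (f, f)
            g0, g1 = self.nodes[g][1:] if self._top(g) == top else (g, g)
            var = self.nodes[f if self._top(f) == top else g][0]
            memo[(f, g)] = self.mk(var, self.apply(op, f0, g0, memo),
                                   self.apply(op, f1, g1, memo))
        return memo[(f, g)]

    def neg(self, f):
        return self.apply(operator.xor, f, 1)

    def size(self, f):
        """Number of decision nodes reachable from f."""
        seen, stack = set(), [f]
        while stack:
            u = stack.pop()
            if u > 1 and u not in seen:
                seen.add(u)
                stack.extend(self.nodes[u][1:])
        return len(seen)

bdd = BDD(["a", "b", "c"])
a, b, c = bdd.var("a"), bdd.var("b"), bdd.var("c")
F = bdd.apply(operator.or_, bdd.apply(operator.and_, a, b), bdd.neg(c))   # a b + ~c
G = bdd.neg(bdd.apply(operator.and_, bdd.neg(bdd.apply(operator.and_, a, b)), c))
print(bdd.size(F), F == G)      # G = ~(~(a b) c) is the same function as F
```

Because every node goes through `mk`, equal functions end up with the same node id, so the equivalence check at the end is a single integer comparison.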


Properties of BDDs

• Canonical form for a Boolean function.
  – Equivalence checking is easy.
• Compact for many practical Boolean functions.
  – Linear size for n-bit parity functions and n-bit full-adder functions.
  – More than 100 inputs can be handled in good cases.
• Binary logic operations between two BDDs can be performed in almost linear time.
  – The NOT operation is also easy.
• BDDs cannot be compact for some Boolean functions.
  – Exponential size for n-bit multiplier functions.
• BDDs become large if the variable ordering is poor.
  – Finding an optimal variable order is an NP-complete problem. (Some heuristic methods have been developed, however.)


BDD package

• BDD manipulation programs were actively developed in the 1990s.
  – Some of them are public-domain software, known as “BDD packages.”
• In many cases, they are implemented as a C or C++ library.
  – We call a library function with pointers to BDDs as function parameters; a new BDD is then constructed in memory, and a pointer to the new BDD is returned.
  – The user writes a main program that calls BDD operations, and then compiles it with the BDD package.
• Detailed implementation techniques of BDD packages will be shown in the next lecture.


Summary

• Boolean functions and knowledge representation
  – Binary/multi-valued, Boolean expressions and functions, propositional and predicate logic
• “Manipulation” of Boolean functions
  – Generating Boolean function data from Boolean expressions
  – Binary logic operations between Boolean functions
  – Checking of tautology, equivalence and implication
  – Searching for optimal solutions
• Techniques for representing Boolean functions
  – Required properties
  – Total number of Boolean functions, information-theoretic bit size, randomized functions
  – Karnaugh maps, truth tables, SOP forms and BDDs
• BDD: basic structures and properties

BDDs for Knowledge representation


Exercises

1. Write a Karnaugh map and a sum-of-products form for the 4-input parity function. (It returns 1 when the input pattern contains an odd number of 1s.)

2. Consider how to check the equivalence of the following three Boolean expressions.
   x ~y + x z + ~x y + ~x ~z
   x ~y + ~x y + y z + ~y ~z
   x ~y + ~x ~z + y z