ECE 647 TERM PROJECT MiniSAT parallelization Shahadat Hossain Saud Wasly


ECE 647 TERM PROJECT
MiniSAT parallelization

Shahadat Hossain

Saud Wasly

AGENDA

Introduction
MiniSAT Background
MiniSAT Algorithm
Objectives
Difficulties
Implemented changes
Results
Conclusion
Questions

THE SAT PROBLEM + MINISAT

Boolean satisfiability (SAT) is the mathematical problem of finding whether a given set of variables can be assigned in such a way that all given constraints are met
SAT problems commonly arise in EDA applications
MiniSAT is an open source SAT solver created in 2003
  Designed to be extensible
  Original code was ~700 lines; current code is ~1k lines
SAT competition
  Main track: optimize any SAT solver
  Parallel track: parallelize any SAT solver
  MiniSAT Hack track: minor changes to MiniSAT

FORMAT AND TERMINOLOGY

SAT problems are presented in Conjunctive Normal Form (CNF)
  F = (x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x5 ∨ ¬x7 ∨ x20) ∧ (...) ∧ ...
Variables: x1, x2, x3, ...
Literals: A variable and its negation
  x1 and ¬x1, x2 and ¬x2, ...
Clause: A collection of literals that have an OR relationship
  (x1 ∨ ¬x2 ∨ ¬x3)
A clause can be true, false, free, or asserting
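The four clause states can be illustrated with a short Python sketch. It assumes DIMACS-style integer literals (3 for x3, -3 for ¬x3). The slide does not define the states, so the readings used here are assumptions: "asserting" is taken to mean that exactly one literal is unassigned and all others are false, and "free" covers any other undecided clause. The names `clause_status` and `assignment` are illustrative and not taken from MiniSAT.

```python
def clause_status(clause, assignment):
    """Classify a clause under a partial assignment.

    clause: list of non-zero ints (3 means x3, -3 means NOT x3).
    assignment: dict mapping variable number -> bool; missing = unassigned.
    """
    unassigned = 0
    for lit in clause:
        value = assignment.get(abs(lit))
        if value is None:
            unassigned += 1
        elif value == (lit > 0):
            return "true"        # one satisfied literal satisfies the clause
    if unassigned == 0:
        return "false"           # every literal is falsified: a conflict
    if unassigned == 1:
        return "asserting"       # assumed meaning: the last literal is forced
    return "free"


clause = [1, -2, -3]             # (x1 v NOT x2 v NOT x3)
print(clause_status(clause, {}))                             # free
print(clause_status(clause, {1: False, 2: True}))            # asserting
print(clause_status(clause, {1: False, 2: True, 3: True}))   # false
print(clause_status(clause, {2: False}))                     # true
```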

INTERNAL STRUCTURES

Clause Database: Contains all clauses (input constraints) and all assigned literals
Learnts: Vector of literals that have been learnt from assertions
Trail: Literal path taken so far
Watches: Vector of literals to watch since they are closely related to those that have been changed
Order Heap: Priority heap
Decision Level: The level an assertion has been propagated to; used for backtracking

Example CNF input file (DIMACS format):

c simple_v3_c2.cnf
c
p cnf 3 2
1 -3 0
2 3 -1 0
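The example file on this slide is in DIMACS CNF format: `c` lines are comments, the `p cnf` line gives the variable and clause counts, and each clause is a list of signed integers terminated by 0. A minimal Python reader might look like the following; `parse_dimacs` is a hypothetical helper for illustration, not MiniSAT's own parser.

```python
def parse_dimacs(text):
    """Parse DIMACS CNF text into (num_vars, clauses)."""
    num_vars = 0
    clauses = []
    current = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("c"):
            continue                     # comment or blank line
        if line.startswith("p"):
            _, fmt, n_vars, _n_clauses = line.split()
            if fmt != "cnf":
                raise ValueError("expected a 'p cnf' header")
            num_vars = int(n_vars)
            continue
        for token in line.split():
            lit = int(token)
            if lit == 0:                 # 0 terminates a clause
                clauses.append(current)
                current = []
            else:
                current.append(lit)
    return num_vars, clauses


EXAMPLE = """c simple_v3_c2.cnf
c
p cnf 3 2
1 -3 0
2 3 -1 0
"""

num_vars, clauses = parse_dimacs(EXAMPLE)
print(num_vars, clauses)   # 3 [[1, -3], [2, 3, -1]]
```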

MINISAT ALGORITHM

MiniSAT uses 2 methods to solve a SAT problem:
Conflict-driven backtracking
  Program makes assertions on variables
  When a clause becomes asserting, MiniSAT ‘learns’ a literal
  When a conflict arises, backtrack and ‘unlearn’ literals
Dynamic Variable Ordering*
  When a literal is asserted, the related literal is moved to the top of the stack
Exit Condition
  All literals assigned with no conflicts (SAT)
  Backtracked to root of tree (UNSAT)
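The assert, propagate, and backtrack loop with its two exit conditions can be sketched as a tiny recursive search. This is a simplified illustration only: it uses plain chronological backtracking and a fixed variable order, and it omits MiniSAT's clause learning, watches, and activity-based ordering. The names `unit_propagate` and `solve` are made up for this sketch.

```python
def unit_propagate(clauses, assignment, trail):
    """Assign forced literals until nothing changes; False means conflict."""
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                value = assignment.get(abs(lit))
                if value is None:
                    unassigned.append(lit)
                elif value == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return False                  # all literals false: conflict
            if len(unassigned) == 1:          # asserting clause forces a literal
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                trail.append(abs(lit))        # remember it for backtracking
                changed = True
    return True


def solve(clauses, num_vars, assignment=None):
    """Return a satisfying assignment (dict) or None if UNSAT."""
    assignment = {} if assignment is None else assignment
    trail = []
    if unit_propagate(clauses, assignment, trail):
        free = [v for v in range(1, num_vars + 1) if v not in assignment]
        if not free:
            return dict(assignment)           # everything assigned: SAT
        var = free[0]
        for value in (True, False):           # make a decision, then recurse
            assignment[var] = value
            result = solve(clauses, num_vars, assignment)
            if result is not None:
                return result
            del assignment[var]
    for var in trail:                         # backtrack: undo this level
        del assignment[var]
    return None                               # both branches failed here


print(solve([[1, -3], [2, 3, -1]], 3))        # a satisfying assignment
print(solve([[1], [-1]], 1))                  # None (UNSAT)
```

Returning `None` from the outermost call corresponds to backtracking to the root of the tree.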

PARALLELIZATION

Why Parallelize?
  Large problems can take hours to solve
  On modern multi-core computers, MiniSAT isn’t taking advantage of total processing power
  MiniSAT is essentially a trial-and-error algorithm, so it should benefit from running multiple copies of the same program
  MiniSAT occasionally learns constraints, data that can be shared between threads

PROPOSED IMPROVEMENTS

Parallelize base program
  Task parallel vs. data parallel
  Parallelize tasks (with synchronization) or copy entire program
Assign different heuristics to different threads
  Current MiniSAT uses activity based (dynamic) prioritization
  Excellent for single thread, but cannot be extended
  Divide literal branches and assign a sequential heuristic*
  Assign a random heuristic within a given solution space*
  Hybrid heuristic combining all 3
Share common database between threads**
  Difficult to implement because of backtracking
  Only clauseDB is global, and frequent R/W will become a bottleneck

IMPLEMENTATION

Currently have multiple threads working in parallel
  Threads are synchronized for exit
  Thread creation is dynamic, based on the number of cores in the system
  Threads are pinned to cores to prevent the OS from changing core assignment
Currently have 3 working heuristics
  Single processor execution results in original MiniSAT (activity based stack heuristic)
  2nd thread will be a FIFO search algorithm
  Following threads will be random search heuristics*
  All new heuristics are activity based, but have different prioritization
Random decisions are randomized based on thread
  When there is no chain activity to follow, MiniSAT makes a random decision
  Random decision is seeded with thread ID and is thus unrelated from thread to thread
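The thread layout described above (one worker per core, a different heuristic per thread index, per-thread random seeds, and synchronized exit) can be sketched in Python. This shows the structure only: the slides reference the pthread library, whereas this sketch uses Python's `threading` module, leaves out core pinning, and takes the actual search as a caller-supplied function. `decision_order`, `run_portfolio`, and `toy_search` are hypothetical names.

```python
import os
import random
import threading


def decision_order(thread_id, variables, activity):
    """Pick the variable order one worker will use (illustrative only)."""
    if thread_id == 0:
        # Thread 0: highest activity first, like the original heuristic.
        return sorted(variables, key=lambda v: -activity.get(v, 0.0))
    if thread_id == 1:
        # Thread 1: FIFO, i.e. variables in input order.
        return list(variables)
    # Remaining threads: random order seeded with the thread ID, so every
    # thread gets its own reproducible sequence.
    rng = random.Random(thread_id)
    order = list(variables)
    rng.shuffle(order)
    return order


def run_portfolio(search, variables, activity, num_threads=None):
    """Run search(order, stop) in several threads; keep the first answer."""
    num_threads = num_threads or os.cpu_count() or 1
    stop = threading.Event()        # set once any worker has an answer
    lock = threading.Lock()
    winner = {}

    def worker(thread_id):
        order = decision_order(thread_id, variables, activity)
        result = search(order, stop)
        with lock:
            if not stop.is_set():   # only the first finisher is recorded
                winner["thread"] = thread_id
                winner["result"] = result
                stop.set()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                    # synchronize all threads on exit
    return winner


def toy_search(order, stop):
    """Stand-in for a solver: just report the first decision variable."""
    return order[0]


winner = run_portfolio(toy_search, [1, 2, 3, 4], {3: 2.0}, num_threads=3)
print(winner["thread"], winner["result"])
```

A real search function would poll `stop.is_set()` between decisions so that the remaining threads give up as soon as one of them finishes.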

CHALLENGES

Testing
  CNF files can be extremely large and time consuming to solve
  SAT competition benchmark files can have over 200,000 variables and take 30+ mins to solve
Poorly Written Code
  Algorithms are clearly defined in documentation, but actual code has very few comments
  Variable names are not always descriptive
Thread and data synchronization
  Data structures are large and smaller structures are prone to ‘backtracking’
Multiple solutions
  A SAT problem may have more than 1 possible solution
  Verifying a solution?
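Because a problem can have several valid solutions, two threads may return different answers that are both correct, so comparing outputs is not enough. One way to check a reported solution is to evaluate every clause against it, as in this small sketch (`verify_model` is a hypothetical helper; literals are DIMACS-style integers):

```python
def verify_model(clauses, model):
    """Return True if every clause has at least one literal made true.

    model: dict mapping variable number -> bool.
    """
    return all(
        any(model.get(abs(lit)) == (lit > 0) for lit in clause)
        for clause in clauses
    )


clauses = [[1, -3], [2, 3, -1]]
print(verify_model(clauses, {1: True, 2: True, 3: False}))    # True
print(verify_model(clauses, {1: False, 2: False, 3: False}))  # True
print(verify_model(clauses, {1: True, 2: False, 3: False}))   # False
```

The first two assignments differ yet both satisfy the formula, which is the "multiple solutions" situation from the slide.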

TRADEOFFS - ALGORITHM

Initial activity based heuristic has variable performance
  Because of the way MiniSAT is programmed, changing the input variable order has a significant impact on processing time
Ordered Heap is a stack implementation, so removing from the bottom adds considerable processing time
  Virtually requires a rebuilding of the stack every time
Ideally, a few threads will execute sequentially, while other threads “learn” randomly for the main threads
  Requires a moderate level of communication between threads

TRADEOFFS - PARALLELIZATION

Frequent access to shared/protected data reduces processing speed

For long literals and deep propagations, parallelization should be useful

More threads result in more memory usage
  Leads to frequent garbage collection
  Reduces work space of every thread

WORK REMAINING

Introduce data sharing/synchronization
Small data structures (e.g. learnts list) cannot be shared because of backtracking
  Sharing learnts would require tracking which thread has learned what, so the appropriate part can be ‘unlearnt’
  After unlearning, the entire list would have to be recreated
  Requires substantial changes to existing code and algorithm
Large data structures (e.g. clauseDB) are ideal for sharing
  Would reduce the overall memory usage
  Would slow down processing time because of frequent access
  Actual database structure is not well defined
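To make the bookkeeping cost concrete, here is a hypothetical sketch of a shared learnt list that records which thread contributed each entry. It is not part of the project, which had not implemented sharing; the class name `SharedLearnts` and its methods are invented for illustration. Every access takes the same lock, and unlearning rebuilds the whole list, which are the two costs the slide points to.

```python
import threading


class SharedLearnts:
    """Learnt entries shared across threads, tagged with their owner."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = []          # list of (owner_thread_id, clause tuple)

    def add(self, thread_id, clause):
        with self._lock:            # every writer serializes on this lock
            self._entries.append((thread_id, tuple(clause)))

    def unlearn(self, thread_id):
        """Drop everything thread_id contributed (rebuilds the list)."""
        with self._lock:
            self._entries = [e for e in self._entries if e[0] != thread_id]

    def snapshot(self):
        with self._lock:            # readers also wait for the lock
            return [clause for _, clause in self._entries]


shared = SharedLearnts()
shared.add(0, [1, -3])
shared.add(1, [2])
shared.add(0, [-2, 3])
shared.unlearn(0)                   # thread 0 backtracks past its learnts
print(shared.snapshot())            # [(2,)]
```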

IDEAL IMPLEMENTATION

Functions currently built to search branches independently
With data synchronization, separate branches can help each other by learning new information and introducing new constraints

PRELIMINARY RESULTS

The sequential heuristic can beat the miniSAT heuristic in some cases

Because of random variables introduced, the execution time of multiple threads is variable

Tested small problem on 2 cores
  Execution time depends on particular CNF file used
  Sequential heuristic is occasionally better (by as much as 20%)
Tested small problem on 6 cores (ecelinux), but system has memory/quota issues
Time tracking is potentially incorrect on multicore
  Re-implemented to reflect actual time
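The slides do not say which clock caused the timing problem. One plausible explanation, stated here as an assumption, is that process CPU time adds up the time spent by all threads, so on several cores it can exceed the elapsed time. The sketch below records both clocks side by side using Python's standard `time` module; `timed` is a hypothetical helper.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed ("actual") time
    cpu_start = time.process_time()    # CPU time summed over all threads
    result = fn(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, wall, cpu


result, wall, cpu = timed(sum, range(100000))
print(result)
print(f"wall: {wall:.6f} s, cpu: {cpu:.6f} s")
```

For comparing a parallel run against the original solver, the wall-clock figure is the one that reflects how long the user actually waits.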

PRELIMINARY RESULTS 2

Parallelized miniSAT output

Original miniSAT output

CONCLUSION

Synchronization of parallel miniSAT is an extremely complex problem

Provides excellent exposure to the SAT solving algorithm (DPLL algorithm) and the practical problems faced while implementing it
4 competitors in 2009 SAT competition
  No solver could solve hard difficulty SAT problems within time limit

SAT COMPETITION RESULTS

MiniSAT performance is hard to beat!!

Only 1 contestant from the parallel group placed in the top 20 (ManySAT)

Modifications can result in previously solvable problems becoming unsolvable

REFERENCE

2009 SAT Competition Results: www.satcompetition.org/2009/sat09comp-slides.pdf

POSIX thread (pthread) libraries: http://www.yolinux.com/TUTORIALS/LinuxTutorialPosixThreads.html

QUESTIONS