GRAPHS - INTRODUCTION
Many real-life problems can be formulated in terms of sets of objects and relationships or connections between objects. Examples include:
Finding routes between cities: the objects could be towns, and the connections could be road/rail links.
Deciding what first-year courses to take: the objects are courses, and the relationships are prerequisite and co-requisite relations. Similarly, planning a course: the objects are topics, and the relations are prerequisites between topics (you have to understand topic X before topic Y will make sense).
Graphs
DEFINITIONS
A graph is a data structure (ADT) that consists of a set of vertices (or nodes) (which can represent objects), and a set of edges linking vertices (which can represent relationships between the objects).
A tree is a special kind of graph (with certain restrictions).
Graph algorithms operate on a graph data structure, and allow us to, for example, search a graph for a path between two given nodes, find the shortest path between two nodes, or order the vertices of the graph in a particular way.
DEFINITIONS (continued)
A graph is a generalization of the tree structure, where instead of a strict parent/child relationship between tree nodes, any kind of complex relationships between the nodes can be represented.
The graph ADT follows directly from the GRAPH concept from mathematics.
DEFINITIONS (continued)
Incident edge: if (vi, vj) is an edge, then the edge (vi, vj) is said to be incident on the vertices vi and vj.
If vi and vj are connected by an edge, they are said to be adjacent vertices/nodes.
vi and vj are the endpoints of the edge {vi, vj}. If an edge e is connected to a vertex v, then v is said to be incident on e, and the edge e is said to be incident on v.
DEFINITIONS
Cycle: a path that ends back at the starting node. Example: A, B, C, G, A.
Simple path: a path with no cycles.
Acyclic graph: a graph with no cycles. Acyclic connected undirected graphs are trees.
[Figure: an unconnected graph with two connected components (vertices A, B, C, G and H, K, N)]
DEFINITIONS
Reachable: two nodes are reachable from each other if a path exists between them.
Connected graph: every node is reachable from every other node.
[Figure: two examples of connected graphs, on vertices A, C, D, E, F, G, H, J]
Degree of a vertex
The degree of a vertex is the number of edges incident on it.
For a directed graph:
The in-degree of a vertex vi is the number of edges incident on vi with vi as the head, i.e. the number of edges that point to vi.
The out-degree of a vertex vi is the number of edges incident on vi with vi as the tail, i.e. the number of edges that point from vi.
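The in-degree/out-degree definitions above can be sketched in code. This is an illustrative Python sketch; the function name and the example edges are invented:

```python
def degrees(n, edges):
    """In-degree and out-degree of each vertex of a directed graph.
    edges are (tail, head) pairs: each edge points from tail to head."""
    indeg = [0] * n
    outdeg = [0] * n
    for tail, head in edges:
        outdeg[tail] += 1   # one more edge points from the tail
        indeg[head] += 1    # one more edge points to the head
    return indeg, outdeg

# Example: edges 0->1, 0->2, 2->1
indeg, outdeg = degrees(3, [(0, 1), (0, 2), (2, 1)])
print(indeg)   # [0, 2, 1]
print(outdeg)  # [2, 0, 1]
```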
Directed Graph (Digraph)
Each edge has an origin node and a terminating node.
A graph is connected if there is a path between any two vertices. A directed graph is strongly connected if there is a directed path between any two vertices (edges have directions).
The degree of a vertex is the number of edges adjacent to it.
Undirected Graph (Undigraph)
A graph is undirected if (x, y) being an edge implies (y, x) is also an edge. An edge of the form (x, x) is said to be a loop.
If x is y's friend several times over, that could be modeled using multiedges: multiple edges between the same pair of vertices.
A graph is said to be simple if it contains no loops and no multiple edges.
A path is a sequence of edges connecting two vertices.
Since Brooks is my father's sister's husband's cousin, there is a path between me and him, and so on.
Graphs
Directed graph: edges have direction.
Undirected graph: undirected edges.
[Figure: an undirected graph on vertices A, C, D, E, F, G, H, J]
Weighted graph
A weight (cost) is associated with each edge.
[Figure: a weighted graph on vertices A, C, D, E, F, G, H, J, K, N, with a cost on each edge]
Edges are of 2 types:
Directed edge: a directed edge between the vertices vi and vj is an ordered pair, denoted by <vi, vj>.
Undirected edge: an undirected edge between the vertices vi and vj is an unordered pair, denoted by (vi, vj).
Maximum number of edges: The maximum number of edges in an undirected graph with n vertices is n(n−1)/2.
In a directed graph, it is n(n−1).
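The edge-count formulas can be checked with a small helper. This is an illustrative sketch; the function name is invented:

```python
def max_edges(n, directed=False):
    """Maximum number of edges in a simple graph with n vertices:
    n(n-1)/2 if undirected, n(n-1) if directed."""
    return n * (n - 1) if directed else n * (n - 1) // 2

print(max_edges(4))                 # undirected: 4*3/2 = 6
print(max_edges(4, directed=True))  # directed: 4*3 = 12
```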
PATHS
Path (in directed graph) Examples:
A, B, C is a path. A, G, K is not a path.
[Figure: a directed graph on vertices A, B, C, G, H, K, N]
PATHS
Path in undirected graph Examples:
A, B, C is a path. H, K, C is not a path.
[Figure: an undirected graph on vertices A, B, C, G, H, K, N]
Representing Graphs
Adjacency list: each node holds a list of its neighbors.
Adjacency matrix: each cell keeps whether (and how) two nodes are connected.
Set of edges: the graph is stored as a list of its edges.

Example: a directed graph on vertices 1-4 with edges 1→2, 1→4, 2→3, 3→1, 4→2.

Adjacency matrix (rows are sources, columns are destinations):
   1 2 3 4
1  0 1 0 1
2  0 0 1 0
3  1 0 0 0
4  0 1 0 0

Set of edges: {1,2} {1,4} {2,3} {3,1} {4,2}

Adjacency lists: 1 → {2, 4}; 2 → {3}; 3 → {1}; 4 → {2}

[Figure: the four-vertex directed graph]
Adjacency Matrix
• A 2D array of size n × n, where n is the number of vertices in the graph.
• Each row and column is indexed by a vertex id, e.g. a=0, b=1, c=2, d=3, e=4.
• An array entry A[i][j] is equal to 1 if there is an edge connecting vertices i and j. Otherwise, A[i][j] is 0.
[Figure: a 10-vertex graph (vertices 0-9)] and its adjacency matrix:

   0 1 2 3 4 5 6 7 8 9
0  0 0 0 0 0 0 0 0 1 0
1  0 0 1 1 0 0 0 1 0 1
2  0 1 0 0 1 0 0 0 1 0
3  0 1 0 0 1 1 0 0 0 0
4  0 0 1 1 0 0 0 0 0 0
5  0 0 0 1 0 0 1 0 0 0
6  0 0 0 0 0 1 0 1 0 0
7  0 1 0 0 0 0 1 0 0 0
8  1 0 1 0 0 0 0 0 0 1
9  0 1 0 0 0 0 0 0 1 0
Adjacency List
• The adjacency list is an array A[0..n-1] of lists, where n is the number of vertices in the graph.
• Each array entry is indexed by the vertex id (as with the adjacency matrix).
• The list A[i] stores the ids of the vertices adjacent to vertex i.
An adjacency list consists of an array of pointers, where the ith element points to a linked list of the edges incident on vertex i.
It is implemented by representing each node as a data structure that contains a list of all adjacent nodes.
(By contrast, in an adjacency matrix, the rows and columns of a two-dimensional array represent source and destination vertices, and the entries indicate whether an edge exists between them.)
Adjacency List
[Figure: the 10-vertex graph from the adjacency-matrix example] and its adjacency lists:

0: 8
1: 2 3 7 9
2: 1 4 8
3: 1 4 5
4: 2 3
5: 3 6
6: 5 7
7: 1 6
8: 0 2 9
9: 1 8
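Building the adjacency lists from the same edge list can be sketched as follows (Python used for illustration):

```python
def adjacency_list(n, edges):
    """Build an adjacency list (one neighbor list per vertex)
    for an undirected graph on vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)  # j is a neighbor of i ...
        adj[j].append(i)  # ... and i is a neighbor of j
    return adj

# Edges of the 10-vertex example graph
edges = [(0, 8), (1, 2), (1, 3), (1, 7), (1, 9),
         (2, 4), (2, 8), (3, 4), (3, 5), (5, 6),
         (6, 7), (8, 9)]
adj = adjacency_list(10, edges)
print(adj[1])  # neighbors of vertex 1: [2, 3, 7, 9]
```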
Adjacency Multilist
In the adjacency-list representation, each edge (u, v) is represented by two entries: one on the list for u and the other on the list for v.
Multilists are lists in which nodes may be shared among several lists.
In an adjacency multilist there is exactly one node for each edge, but this node appears in two lists (i.e., the adjacency lists of the two vertices on which it is incident).
Adjacency Lists vs. Matrix
Adjacency lists:
More compact than adjacency matrices if the graph has few edges.
Require more time to find whether an edge exists.
Adjacency matrix:
Always requires n² space; this can waste a lot of space if the edges are sparse.
Can quickly find whether an edge exists.
Operations
Typical operations associated with graphs include finding a path between two nodes, e.g. the shortest path from one node to another.
A directed graph can be seen as a flow network, where each edge has a capacity and each edge receives a flow.
Comparison with other data structures
Graph data structures are non-hierarchical and therefore suitable for data sets where the individual elements are interconnected in complex ways.
For example, a computer network can be simulated with a graph.
Hierarchical data sets can be represented by a binary or non-binary tree.
It is worth mentioning, however, that trees can be seen as a special form of graph.
Graph traversal
Traversal of a graph means visiting the nodes of the graph.
A graph can be traversed in 2 ways: depth-first traversal and breadth-first traversal.
Depth-First Traversal
When a graph is traversed by visiting the nodes in the forward (deeper) direction as long as possible, the traversal is called depth-first traversal.
E.g. a depth-first traversal starting at vertex 0 might visit the nodes in the order 0 1 2 6 7 8 5 3 4, or in the order 0 4 3 5 8 6 7 2 1, depending on the order in which neighbors are explored.
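An iterative depth-first traversal can be sketched with a stack. The 9-vertex graph below is a hypothetical example, chosen so that the traversal from vertex 0 visits 0 1 2 6 7 8 5 3 4, one of the orders mentioned above:

```python
def dfs(adj, start):
    """Iterative depth-first traversal; returns vertices in visit order."""
    visited, order, stack = set(), [], [start]
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        # Push neighbors in reverse so smaller ids are explored first
        for w in reversed(adj[v]):
            if w not in visited:
                stack.append(w)
    return order

# Hypothetical 9-vertex undirected graph (neighbor lists)
adj = {0: [1, 4], 1: [0, 2], 2: [1, 6], 3: [4, 5],
       4: [0, 3], 5: [3, 8], 6: [2, 7], 7: [6, 8], 8: [5, 7]}
print(dfs(adj, 0))  # [0, 1, 2, 6, 7, 8, 5, 3, 4]
```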
Breadth first traversal
When a graph is traversed by visiting all the adjacent nodes/vertices of a node/vertex first, the traversal is called breadth-first traversal.
For a graph in which the breadth-first traversal starts at vertex v1, the nodes are visited level by level: first v1, then all vertices adjacent to v1, then all unvisited vertices adjacent to those, and so on.
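A breadth-first traversal can be sketched with a queue (Python used for illustration; the 9-vertex graph below is a hypothetical example):

```python
from collections import deque

def bfs(adj, start):
    """Breadth-first traversal; returns vertices in visit order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()          # visit the oldest discovered vertex first
        order.append(v)
        for w in adj[v]:
            if w not in visited:     # discover each vertex only once
                visited.add(w)
                queue.append(w)
    return order

# Hypothetical 9-vertex undirected graph (neighbor lists)
adj = {0: [1, 4], 1: [0, 2], 2: [1, 6], 3: [4, 5],
       4: [0, 3], 5: [3, 8], 6: [2, 7], 7: [6, 8], 8: [5, 7]}
print(bfs(adj, 0))  # [0, 1, 4, 2, 3, 6, 5, 7, 8]
```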
Minimum-Cost Spanning Tree
When the edges of the graph have weights representing cost in some suitable terms, we can obtain the spanning tree of the graph whose cost is minimum in terms of the weights of the edges.
For this, we start with the minimum-cost/weight edge, add it to a set T, and mark it as visited.
We next consider the minimum-cost edge that has not yet been visited, add it to T, and mark it as visited. Before adding an edge to the set T, we check whether both of its vertices are already connected by edges in T; if they are, we do not add the edge, because it would form a cycle.
The result is the minimum-cost spanning tree of the graph.
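The procedure described above is essentially Kruskal's algorithm. A minimal sketch, using a union-find structure for the "would it form a cycle?" check (the example edge weights below are invented for illustration):

```python
def kruskal(n, edges):
    """Kruskal's minimum spanning tree. edges: (weight, u, v) tuples."""
    parent = list(range(n))          # union-find: each vertex starts alone

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):    # consider edges by increasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                 # different components: adding is cycle-free
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

# Hypothetical weighted graph on 4 vertices
edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
mst = kruskal(4, edges)
print(mst)                        # [(1, 0, 1), (2, 1, 3), (3, 1, 2)]
print(sum(w for w, _, _ in mst))  # total cost 6
```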
BFS and Shortest Path Problem
Given any source vertex s, BFS visits the other vertices at increasing distances away from s. In doing so, BFS discovers paths from s to other vertices
What do we mean by “distance”? The number of edges on a path from s.
[Figure: the 10-vertex example graph, with source s = vertex 1]
Consider s = vertex 1.
Nodes at distance 1? 2, 3, 7, 9
Example
Nodes at distance 2? 8, 6, 5, 4
Nodes at distance 3? 0
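The distances above can be computed with a BFS that records the distance at which each vertex is first discovered (a sketch in Python, using the adjacency lists of the 10-vertex example graph):

```python
from collections import deque

def bfs_distances(adj, s):
    """Distance (number of edges) from s to every reachable vertex."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:          # first discovery gives the shortest distance
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

# Adjacency lists of the 10-vertex example graph
adj = [[8], [2, 3, 7, 9], [1, 4, 8], [1, 4, 5], [2, 3],
       [3, 6], [5, 7], [1, 6], [0, 2, 9], [1, 8]]
dist = bfs_distances(adj, 1)
print(sorted(v for v in dist if dist[v] == 1))  # [2, 3, 7, 9]
print(sorted(v for v in dist if dist[v] == 2))  # [4, 5, 6, 8]
print(sorted(v for v in dist if dist[v] == 3))  # [0]
```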
Graphs and Their Applications
Graphs have many real-world applications:
Modeling a computer network like the Internet: routes are simple paths in the network.
Modeling a city map: streets are edges, crossings are vertices.
Social networks: people are nodes and their connections are edges.
State machines: states are nodes, transitions are edges.
Representing Graphs in C#

public class Graph
{
    int[][] childNodes;

    public Graph(int[][] nodes)
    {
        this.childNodes = nodes;
    }
}

Graph g = new Graph(new int[][] {
    new int[] {3, 6},          // successors of vertex 0
    new int[] {2, 3, 4, 5, 6}, // successors of vertex 1
    new int[] {1, 4, 5},       // successors of vertex 2
    new int[] {0, 1, 5},       // successors of vertex 3
    new int[] {1, 2, 6},       // successors of vertex 4
    new int[] {1, 2, 3},       // successors of vertex 5
    new int[] {0, 1, 4}        // successors of vertex 6
});
[Figure: the 7-vertex graph (vertices 0-6) represented by the code above]
HASH TABLES - INTRODUCTION
WHY use Hash Tables?
Hash tables are good for doing a quick search on things.
For instance, suppose we have an array full of data (say 100 items). If we knew the position at which a specific item is stored in the array, then we could access it quickly.
For instance, if we happen to know that the item we want is at position 3, we can apply: myitem = myarray[3];
With this, we don't have to search through each element in the array; we just access position 3.
The question is: how do we know that position 3 stores the data we are interested in?
This is where hashing comes in handy. Given some key, we can apply a hash function to it to find the index or position that we want to access.
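A minimal sketch of that idea in Python. The polynomial hash below is a toy example invented for illustration, not a recommended hash function:

```python
def hash_index(key, table_size):
    """Map a string key to a bucket index using a toy polynomial hash."""
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) % table_size  # mix in each character
    return h

table_size = 100
i = hash_index("Joan", table_size)
# i is the position in the table where Joan's record would be stored
assert 0 <= i < table_size
```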
HASH FUNCTION
Hashed Table
A hashed table is defined as one that is managed with an internal hash procedure.
A hashed table is a set, whose elements you can address using their unique key.
Unlike standard and sorted tables, you cannot access hash tables using an index.
All entries in the table must have a unique key.
[Figure: a small phone book as a hash table]
Choosing a good hash function
A good hash function is essential for good hash table performance.
A poor choice of hash function is likely to lead to clustering, in which the probability of keys mapping to the same hash bucket (i.e. a collision) is significantly greater than would be expected from a random function.
Collision resolution
If two keys hash to the same index, the corresponding records cannot be stored in the same location.
So, if it's already occupied, we must find another location to store the new record, and do it so that we can find it when we look it up later on.
There are a number of collision resolution techniques; the main ones are chaining and open addressing.
The difference has to do with whether collisions are stored outside the table (open hashing, as in chaining) or whether a collision results in storing one of the records at another slot inside the table (closed hashing, as in open addressing).
Chaining
[Figure: a hash collision resolved by chaining]
In the simplest chained hash table technique, each slot in the array references a linked list of inserted records that collide to the same slot.
Insertion requires finding the correct slot and appending the new record to either end of the list in that slot; deletion requires searching the list and removing the record.
Chained hash tables inherit the disadvantages of linked lists.
When storing small records, the overhead of the linked list can be significant. Also, traversing a linked list has poor cache performance.
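A chained hash table can be sketched as an array of buckets, each holding a list of (key, value) pairs. This is an illustrative Python sketch with no resizing:

```python
class ChainedHashTable:
    """Hash table with separate chaining: each slot holds a chain
    (here a Python list) of (key, value) pairs."""

    def __init__(self, size=8):
        self.slots = [[] for _ in range(size)]

    def _bucket(self, key):
        return self.slots[hash(key) % len(self.slots)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # otherwise append to the chain

    def get(self, key):
        for k, v in self._bucket(key):   # search only the key's own chain
            if k == key:
                return v
        raise KeyError(key)

t = ChainedHashTable()
t.put("alice", 1)
t.put("bob", 2)
print(t.get("alice"))  # 1
```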
Open Addressing
Open addressing hash tables can store the records directly within the array.
A hash collision is resolved by probing, or searching through alternate locations in the array (the probe sequence) until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table.
Probe sequences include:
Linear probing: the interval between probes is fixed, often at 1.
Quadratic probing: the interval between probes increases linearly (hence, the indices are described by a quadratic function).
Double hashing: the interval between probes is fixed for each record but is computed by another hash function.
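Linear probing can be sketched as follows. This is a toy Python illustration with no resizing or deletion; it assumes the table never fills up:

```python
class LinearProbingTable:
    """Open-addressing hash table with linear probing (probe interval 1)."""

    def __init__(self, size=8):
        self.keys = [None] * size
        self.values = [None] * size

    def put(self, key, value):
        i = hash(key) % len(self.keys)
        # Probe forward until we find an empty slot or the same key
        while self.keys[i] is not None and self.keys[i] != key:
            i = (i + 1) % len(self.keys)
        self.keys[i], self.values[i] = key, value

    def get(self, key):
        i = hash(key) % len(self.keys)
        while self.keys[i] is not None:   # an empty slot means "not present"
            if self.keys[i] == key:
                return self.values[i]
            i = (i + 1) % len(self.keys)  # keep probing the sequence
        raise KeyError(key)

t = LinearProbingTable()
t.put("alice", 1)
t.put("bob", 2)
print(t.get("bob"))  # 2
```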
Open Addressing vs. Chaining
Chained hash tables are simple to implement effectively and require only basic data structures. From the point of view of writing suitable hash functions, chained hash tables are insensitive to clustering, requiring only minimization of collisions.
Open addressing depends on better hash functions to avoid clustering. This is particularly important if novice programmers can add their own hash functions.
Chained hash tables also degrade in performance more gracefully. Although chains grow longer as the table fills, a chained hash table cannot "fill up" and does not exhibit the sudden increases in lookup times that occur in a near-full table with open addressing.
If the hash table stores large records, about 5 or more words per record, chaining uses less memory than open addressing.
If the hash table is sparse (that is, it has a big array with many free array slots), chaining uses less memory than open addressing even for small records of 2 to 4 words per record due to its external storage.
Applications of Hash Tables
Hash tables are good in situations where you have enormous amounts of data from which you would like to quickly search and retrieve information.
A few typical hash table implementations would be in the following situations:
Driver's license records. With a hash table, you could quickly get information about the driver (i.e. name, address, age) given the license number.
Compiler symbol tables. The compiler uses a symbol table to keep track of the user-defined symbols in a program. This allows the compiler to quickly look up attributes associated with symbols (for example, variable names).
For Internet search engines.
For telephone book databases: you could make use of a hash table implementation to quickly look up Joan's telephone number.
For electronic library catalogs: hash table implementations allow for a fast find among the millions of materials stored in the library.
For implementing passwords for systems with multiple users: hash tables allow for fast retrieval of the password that corresponds to a given username.
QUESTIONS
END