Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu


Page 1: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Data Abstraction

CS 201j: Engineering Software

University of Virginia

Computer Science

Nathanael Paul

nate@virginia.edu

Page 2: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Overview

• Data abstraction

• Specification/Design of Abstract Data Types (ADTs)

• Implementation of ADTs

Page 3: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

The Problem

• Programs are complex.
– Windows XP: ~45 million lines of code
– Mathematica: over 1.5 million lines of code

• Abstraction helps
– Many-to-one: “forget the details”
– Must separate “what” from “how”

Page 4: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Information Hiding

• Modularity: procedural abstraction
– By specification
  • Locality
  • Modifiability
– By parameterization

• Data abstraction
– What you can do with the data is separated from how it is represented
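The two forms of procedural abstraction can be seen in one small method. The sketch below is illustrative only: the `UserSearch` class and its `indexOf` helper are assumed names that do not appear in the slides. The parameters let a single method serve any array and any ID (abstraction by parameterization), and the EFFECTS comment tells a caller what the method does without saying how (abstraction by specification).

```java
// Hypothetical helper, not part of the original slides.
class UserSearch {
    static int indexOf(String[] userIDs, String userID) {
        // EFFECTS: returns an index i such that userIDs[i].equals(userID),
        //          or -1 if userID does not appear in userIDs.
        for (int i = 0; i < userIDs.length; i++) {
            if (userIDs[i].equals(userID)) {
                return i;
            }
        }
        return -1;
    }
}
```

A caller can rely on the EFFECTS clause alone; replacing the loop with a different search strategy would not change any calling code.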

Page 5: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Software development cycle

• Specifications – What do you want to do?

• Design – How will you do what you want?

• Implement – Code it.

• Test – Check if it works.

• Maintain – School projects don’t usually make it this far.

Bugs are cheaper earlier in the cycle!

Page 6: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Database Implementation

• Database on library web-server stores information on users: userID, name, email, etc.

• You are responsible for implementing the interface between the web-server and database
– What happens when we ask for the email address for a specific user?

Page 7: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Client asks for email address

What is email address of nate?

Client

Server

Database

Page 8: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Client/Server/Database Interaction

I need Nate’s email.

The interaction between the server and database is your part.

Database

Server

Client

Page 9: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Client/Server/Database Interaction

nate@virginia.edu

Client

Server

Database

Page 10: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Client/Server/Database Interaction

nate@virginia.edu

Client

Server

Database

Page 11: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Example: Database System

• Need a new data type

• Abstract Data Types (ADTs)
– Help separate what from how
– Client will use the specifications for interaction with data
– Client of the web database should not know the “guts” of the implementation

Page 12: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Data abstraction in Java

• An ADT is defined by a class
– The ADT in the web/database application will be a User
– A private instance variable hides the class internals
– public String getEmail();
  • What is private in the implementation?
  • OVERVIEW, EFFECTS, MODIFIES
– A class does not provide data abstraction by itself

Page 13: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Accessibility

    class User {
        // OVERVIEW: A mutable object where the User is a library member.
        public String email;
    }

    /* Client code using a User object, myUser */
    String nateEmail = myUser.email;
    sendEmail(nateEmail);

    /* The client's code can only see what is made public in the User class.
       The user's email data is public in the User class. This is BAD. */

Page 14: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Program Maintenance

• Suppose storage space is at a premium
– Everyone in the database is userID@virginia.edu, so we can drop the @virginia.edu
– nate@virginia.edu becomes nate
– What kind of problems will occur with the code just seen?

Page 15: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Program Maintenance

• Suppose storage space is at a premium
– Everyone in the database is userID@virginia.edu, so we can drop the @virginia.edu
– nate@virginia.edu becomes nate
– What kind of problems could occur had the client code been able to access the email address directly?

    String nateEmail = myUser.email;
    sendEmail(nateEmail);

Email was public in User class. ***ERROR!!!***

Page 16: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Accessibility (fixed)

    class User {
        // OVERVIEW: A mutable object where the User is a library member.
        private String email;

        public String getEmail() {
            // EFFECTS: returns user's primary email
            return email;
        }
    }

    // Client code using a User object, myUser
    String nateEmail = myUser.getEmail();
    sendEmail(nateEmail);

    /* This code properly uses data abstraction when returning the full email address. */

Page 17: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Accessibility (fixed)

    class User {
        // OVERVIEW: A mutable object where the User is a library member.
        private String email;

        public String getEmail() {
            // EFFECTS: returns user's primary email
            return email + "@virginia.edu";
        }
    }

    // Client code using a User object, myUser
    String nateEmail = myUser.getEmail();
    sendEmail(nateEmail);

    /* The database dropped the @virginia.edu, and only one line of code needed changing. */
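A minimal runnable sketch of the representation change, under stated assumptions: the constructor, the `DOMAIN` constant, and the `UserDemo` class are illustrative names that are not in the slides. The point is that the client line `myUser.getEmail()` stays the same even though the stored value shrank to the part before the `@`.

```java
class User {
    // Assumed constant: every address in this database shares one domain.
    private static final String DOMAIN = "@virginia.edu";

    // Representation after the storage change: only the name before the '@'.
    private String email;

    // Assumed constructor; the slides do not show how a User is created.
    User(String localPart) {
        this.email = localPart;
    }

    public String getEmail() {
        // EFFECTS: returns the user's full primary email address
        return email + DOMAIN;
    }
}

class UserDemo {
    public static void main(String[] args) {
        User myUser = new User("nate");
        // Client code is identical before and after the representation change.
        String nateEmail = myUser.getEmail();
        System.out.println(nateEmail);
    }
}
```

Because `email` is private, no client could have depended on the old stored format, so the change is confined to the `User` class.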

Page 18: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Advantages/Disadvantages of Data Abstraction?

- More code to write and maintain initially

- Overhead of calling a method

- Greater initial time investment

+ Client doesn’t need to know about representation

+ Maintenance is easier.

+ Increases locality and modifiability

Page 19: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Specifying ADTs

Page 20: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Bad Users at the Library

• The library now wants to crack down on bad Users with overdue books, so the code will need to work with a group of Users.

• What should be used to represent the group? What data structures do we know about? How should we integrate this code with what we have?

• What operations should be supported?
– deleteUser(String userID);
– isInGroup(String userID);

Page 21: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Library keeping track of “bad” people

• You need to write some code that will manipulate a group of Users that are on the “bad” list.

• The implementation below uses an array

    class GroupUsers {
        // OVERVIEW: Operations provided to manage a mutable group of users
        private User[] latePeople;
        …
        public String toString() {
            // EFFECTS: Returns the user names as a single string
            …
        }
    }

Page 22: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Array implementation initialization for GroupUsers

    class GroupUsers {
        // OVERVIEW: Unbounded, mutable group of Users
        private User[] latePeople;
        …
        public GroupUsers(String[] userIDs) {
            // EFFECTS: Initializes the group from userIDs
            latePeople = new User[userIDs.length + 10];
            for (int i = 0; i < userIDs.length; i++) {
                latePeople[i] = new User(userIDs[i]);
            }
        }
    }
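The overview promises an unbounded group, but an array has a fixed length, so the ten spare slots eventually run out. One possible way to keep the promise is to grow the array when it fills. The sketch below is an assumption about how that might look, not the course's implementation: `addUser`, `size`, the `numUsers` counter, and the minimal `User` class are all illustrative.

```java
// Minimal stand-in for the User class (assumed shape).
class User {
    private final String userID;
    User(String userID) { this.userID = userID; }
    String getUserID() { return userID; }
}

class GroupUsers {
    // Slots [0, numUsers) hold users; the remaining slots are null.
    private User[] latePeople;
    private int numUsers;

    GroupUsers(String[] userIDs) {
        // EFFECTS: initializes this to contain one User per element of userIDs
        latePeople = new User[userIDs.length + 10];
        for (int i = 0; i < userIDs.length; i++) {
            latePeople[i] = new User(userIDs[i]);
        }
        numUsers = userIDs.length;
    }

    void addUser(String userID) {
        // MODIFIES: this
        // EFFECTS: adds a User with the given ID to this
        if (numUsers == latePeople.length) {
            // The array is full: copy its contents into one twice as large.
            latePeople = java.util.Arrays.copyOf(latePeople, latePeople.length * 2);
        }
        latePeople[numUsers++] = new User(userID);
    }

    int size() {
        // EFFECTS: returns the number of users in this
        return numUsers;
    }
}
```

Clients never see the resize, which is the benefit of keeping `latePeople` private.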

Page 23: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

ADT design

• Mutable/Immutable ADTs
– Mutable: an object’s fields or values can change
– Immutable: an object’s fields are permanently set at creation
– Is this being modified?

• Tradeoffs
– Immutability is simpler and safer
– Immutability can be slower (creation/deletion of objects)
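To make the tradeoff concrete, here is a small sketch of the same data in both styles. The class names `MutableUser` and `ImmutableUser` and the `withEmail` method are hypothetical and chosen only for this illustration.

```java
// Mutable version: a setter changes the object in place.
class MutableUser {
    private String email;

    MutableUser(String email) { this.email = email; }

    String getEmail() { return email; }

    void setEmail(String email) {
        // MODIFIES: this
        this.email = email;
    }
}

// Immutable version: the field is final, so "changing" the email
// means creating a new object.
final class ImmutableUser {
    private final String email;

    ImmutableUser(String email) { this.email = email; }

    String getEmail() { return email; }

    ImmutableUser withEmail(String newEmail) {
        // EFFECTS: returns a new user with newEmail; this is unchanged
        return new ImmutableUser(newEmail);
    }
}
```

With the mutable version, any code holding a reference to the same object observes the change; with the immutable version, existing references keep their old value, at the cost of allocating a new object for each update.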

Page 24: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Classification of ADT operations

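• Creator (constructor): builds a new object from scratch
– GroupUsers(String userIDs[ ])

• Producer: builds a new object from an existing one
– addUser(String userID), if it returns a new group rather than modifying this

• Mutator: modifies the object
– setUserEmail(String email)

• Observer: reports on the object without modifying it
– isMember(String userID)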
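The four kinds of operation can sit side by side in one class. The sketch below is illustrative: it stores user IDs as strings in a `java.util.ArrayList` for brevity, and `withUser` is a hypothetical producer added only to show the category. `deleteUser` and `isMember` follow the operation names used in these slides.

```java
class GroupUsers {
    // Representation (assumed for this sketch): a list of user IDs.
    private final java.util.ArrayList<String> ids = new java.util.ArrayList<>();

    // Creator: builds a group from scratch.
    GroupUsers(String[] userIDs) {
        for (String id : userIDs) {
            ids.add(id);
        }
    }

    // Producer: returns a new group and leaves this unchanged.
    GroupUsers withUser(String userID) {
        GroupUsers copy = new GroupUsers(ids.toArray(new String[0]));
        copy.ids.add(userID);
        return copy;
    }

    // Mutator: changes this in place.
    void deleteUser(String userID) {
        // MODIFIES: this
        ids.remove(userID);
    }

    // Observer: reports on the state without changing it.
    boolean isMember(String userID) {
        return ids.contains(userID);
    }
}
```

A quick way to classify an operation is to read its specification: a MODIFIES clause marks a mutator, while a producer returns a new object of the same type.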

Page 25: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Implementing ADTs

Page 26: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

A bad implementation

• Most common characteristics
– Modifying the implementation forces other code to be changed (violates modifiability)
– Must understand more code than necessary to reason about code (violates locality)
– Maintenance is difficult

Page 27: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

A good implementation

• The User class needs a way to store the state of a user, so its operations are built around that stored state.

• Methods should be (procedural abstraction):
– As easy to code as possible
– Efficient
– Local (exhibit locality)
– Structured to enable better testing and maintenance

Page 28: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Changing the group implementation

• The “guts” of the implementation are subject to change.

• What happens in the GroupUsers operation deleteUser(String userID)?

Page 29: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

deleteUser(String userID)

• The array must shift down an average of n/2 items when deleting an element

<user> <user> <user> <user><user> <user><user><user>X
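The shifting cost can be seen directly in code. The following is a sketch under assumptions: user IDs stand in for `User` objects, and the `boolean` return value, `isInGroup` body, and `size` method are illustrative additions. Deleting the element in slot i moves the numUsers - i - 1 elements after it, which averages roughly n/2 moves when the deleted position is equally likely to be anywhere.

```java
class GroupUsers {
    // User IDs stand in for User objects in this sketch.
    private final String[] latePeople;
    private int numUsers;

    GroupUsers(String[] userIDs) {
        latePeople = userIDs.clone();
        numUsers = userIDs.length;
    }

    boolean deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes userID from this if present; returns true if removed
        for (int i = 0; i < numUsers; i++) {
            if (latePeople[i].equals(userID)) {
                // Shift everything after slot i down by one position.
                System.arraycopy(latePeople, i + 1, latePeople, i, numUsers - i - 1);
                latePeople[--numUsers] = null;  // clear the vacated last slot
                return true;
            }
        }
        return false;
    }

    boolean isInGroup(String userID) {
        for (int i = 0; i < numUsers; i++) {
            if (latePeople[i].equals(userID)) {
                return true;
            }
        }
        return false;
    }

    int size() {
        return numUsers;
    }
}
```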

Page 30: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Linked Lists

A new data structure

User 1 User 2 User 3

Each User has its own representation, but we store the collection in a list. In the following implementation, each user object is contained in a Node object.

Head

X

Page 31: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

    class Node {
        // OVERVIEW: A mutable node used in a linked list of users
        private User theUser;
        private Node next;
        …
    }

List-node implementation

User 1

next points to the next “bad” user

User 2 …

latePeople

Page 32: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

    class GroupUsers {
        // OVERVIEW: Mutable, unbounded group of users
        private Node latePeople; /* head of list */
        private int numUsers;
        …
    }

    /* Nodes are users with an additional member field called next. The Node
       class was added so that the User class would not need modification. */

List implementation

Page 33: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Adding a user into GroupUsers

/* in GroupUsers.java */

    public void addUser(User newUser) {
        // MODIFIES: this
        // EFFECTS: this_post = this_pre U { newUser }
        latePeople.add(new Node(newUser));
        numUsers++;
    }

Page 34: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Adding a node into a group of nodes (Node.java)

    public void add(Node n) {
        // MODIFIES: this
        // EFFECTS: n is inserted just after this in the list

        // first user in list?
        if (this.next == null) {
            this.next = n;
        } else {
            n.next = this.next;
            this.next = n;
        }
    }

Page 35: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

deleteUser(String userID) cont.

User 1 User 2 User 3

Head

X

User 1 User 3

Head

X

X

Page 36: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

deleteUser(String userID) (Node.java)

    public void delete(String userID) {
        // MODIFIES: this
        // EFFECTS: this_post = this_pre – node
        //          where node.userID = userID

        Node currNode;
        Node prevNode;
        if (this.next == null) return;

        prevNode = this;
        currNode = this.next;

        // continued on next slide

Page 37: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

deleteUser(String userID) cont.

        while (currNode.next != null) {
            if (userID.equals(currNode.getUserID())) {
                prevNode.next = currNode.next;
                break;
            }
            currNode = currNode.next;
            prevNode = prevNode.next;
        }

        // user at end of list?
        if (currNode.next == null &&
                userID.equals(currNode.getUserID())) {
            prevNode.next = null;
        }
    }
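For reference, the pieces from the last few slides can be assembled into one runnable sketch. Several details are assumptions rather than the slides' code: `latePeople` is treated as a sentinel head node that holds no user (which is why `delete` never removes `this`), the constructors and the `contains`, `isInGroup`, and `size` methods are illustrative, and `delete` is a simplified variant that returns a boolean and folds the end-of-list case into the loop.

```java
// Minimal stand-in for the User class (assumed shape).
class User {
    private final String userID;
    User(String userID) { this.userID = userID; }
    String getUserID() { return userID; }
}

class Node {
    private final User theUser;  // null only in the sentinel head node
    private Node next;

    Node(User theUser) { this.theUser = theUser; }

    String getUserID() { return theUser.getUserID(); }

    void add(Node n) {
        // MODIFIES: this, n
        // EFFECTS: n is inserted just after this in the list
        n.next = this.next;
        this.next = n;
    }

    boolean delete(String userID) {
        // MODIFIES: this
        // EFFECTS: unlinks the first node after this whose user has userID;
        //          returns true if a node was removed
        Node prevNode = this;
        Node currNode = this.next;
        while (currNode != null) {
            if (userID.equals(currNode.getUserID())) {
                prevNode.next = currNode.next;
                return true;
            }
            prevNode = currNode;
            currNode = currNode.next;
        }
        return false;
    }

    boolean contains(String userID) {
        // EFFECTS: returns true if some node after this holds userID
        for (Node n = this.next; n != null; n = n.next) {
            if (userID.equals(n.getUserID())) {
                return true;
            }
        }
        return false;
    }
}

class GroupUsers {
    // Assumption: the head is a sentinel node with no user.
    private final Node latePeople = new Node(null);
    private int numUsers;

    void addUser(User newUser) {
        // MODIFIES: this
        latePeople.add(new Node(newUser));
        numUsers++;
    }

    void deleteUser(String userID) {
        // MODIFIES: this
        if (latePeople.delete(userID)) {
            numUsers--;
        }
    }

    boolean isInGroup(String userID) { return latePeople.contains(userID); }

    int size() { return numUsers; }
}
```

Unlinking a node changes one reference, so no elements are shifted, although finding the node still requires walking the list.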

Page 38: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Linked List vs. Array

• Array is better for:
– Random access to an element

• Linked list is better at:
– Inserting
– Deleting
– Dynamic resizing

• Users of your implementation may need to use a list or an array for efficiency, so you need an implementation that can be changed easily.
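One way to make the representation easy to change is to have clients depend on an interface and pick the representation in a single place. The sketch below is an illustration only: `UserGroup`, `ArrayUserGroup`, `LinkedUserGroup`, and `Library.countBad` are hypothetical names, and the two classes lean on `java.util.ArrayList` and `java.util.LinkedList` instead of the hand-written array and node code from the slides.

```java
interface UserGroup {
    void addUser(String userID);      // MODIFIES: this
    void deleteUser(String userID);   // MODIFIES: this
    boolean isInGroup(String userID);
}

// Array-backed representation: fast random access.
class ArrayUserGroup implements UserGroup {
    private final java.util.List<String> ids = new java.util.ArrayList<>();
    public void addUser(String userID) { ids.add(userID); }
    public void deleteUser(String userID) { ids.remove(userID); }
    public boolean isInGroup(String userID) { return ids.contains(userID); }
}

// Linked representation: no shifting on insert or delete.
class LinkedUserGroup implements UserGroup {
    private final java.util.List<String> ids = new java.util.LinkedList<>();
    public void addUser(String userID) { ids.add(userID); }
    public void deleteUser(String userID) { ids.remove(userID); }
    public boolean isInGroup(String userID) { return ids.contains(userID); }
}

class Library {
    // Client code depends only on the UserGroup specification.
    static int countBad(UserGroup bad, String[] userIDs) {
        // EFFECTS: returns how many of userIDs are members of bad
        int count = 0;
        for (String id : userIDs) {
            if (bad.isInGroup(id)) {
                count++;
            }
        }
        return count;
    }
}
```

Switching representations then means changing one constructor call, for example `new ArrayUserGroup()` to `new LinkedUserGroup()`, while `Library.countBad` stays untouched.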

Page 39: Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul nate@virginia.edu

Questions?