Arrays and Pointers
Prepared by
Manuel E. Bermúdez, Ph.D.
Associate Professor
University of Florida
Programming Language Principles
Lecture 23
Arrays
• Most common composite data type.
• Semantically, viewed as a mapping from the index type to the element type.
• Some languages permit only integers as the index type; others allow any discrete type.
Array Declaration Syntax
C: char upper[26]; /* array of 26 chars, 0..25 */
Fortran:
character(26) upper
Pascal:
var upper: array['a' .. 'z'] of char;
Ada:
upper: array (character range 'a' .. 'z') of character;
Arrays and functions in Ada
Whether upper is declared as an array or as a function, upper('a') returns 'A'.
Multi-Dimension Arrays
Ada: matrix: array (1..10, 1..10) of real;
Modula-3: VAR matrix: ARRAY [1..10],[1..10] OF REAL;
(same as) VAR matrix: ARRAY [1..10] OF ARRAY [1..10] OF REAL;
and matrix[3,4] is the same as matrix[3][4].
Multi-Dimension Arrays (cont’d)
In Ada,
matrix: array (1..10, 1..10) of real;
is NOT the same as
matrix: array (1..10) of array (1..10) of real;
matrix(3)(4) is not legal in the first form;
matrix(3,4) is not legal in the second form.
Multi-Dimension Arrays (cont’d)
With an array of arrays, a single row such as matrix[3] can be treated as a slice.
In C, double matrix[10][10];
However, C integrates arrays and pointers, so matrix[3] seldom behaves as a plain array of 10 doubles.
In most expressions it decays to a pointer to the first element of the fourth row (index 3); dereferencing it, *matrix[3], gives the value of matrix[3][0].
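A minimal sketch of the row behaviour described above, assuming the same 10 by 10 matrix of double. The function names row_bytes and first_of_row are illustrative and do not come from the lecture.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof is one of the few contexts where an array does not decay,
   so this is the size of a whole row: 10 * sizeof(double). */
size_t row_bytes(void)
{
    double matrix[10][10];
    return sizeof matrix[3];
}

/* In an ordinary expression, matrix[row] decays to a pointer to
   matrix[row][0]; dereferencing that pointer yields the element. */
double first_of_row(double matrix[10][10], int row)
{
    assert(0 <= row && row < 10);
    double *p = matrix[row];      /* same as &matrix[row][0] */
    return *p;
}
```

The two functions show both faces of matrix[3]: under sizeof it is still an array of 10 doubles, while in an assignment it behaves as a pointer to double.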
Slices in Fortran
Array Dimensions, Bounds and Allocation
Five cases:
• Global lifetime, static shape: static allocation (easy).
• Local lifetime, static shape: space allocated in the stack frame.
• Local lifetime, shape bound at elaboration: stack frame needs fixed-size part and variable-size part.
• Arbitrary lifetime, shape bound at elaboration: as in Java; programmer allocates space.
• Arbitrary lifetime, dynamic shape: array lives on the heap.
Array allocation in Ada (shape bound at elaboration time)
Conformant Arrays in Pascal
• Array shape determined at time of call.
• Pascal doesn’t allow local dynamic-shaped arrays.
• Ada DOES allow local dynamic-shaped arrays (see textbook).
Other Forms of Dynamic Arrays
• C: arrays are passed by reference, so the callee never sees the bounds (keeping track of them is the programmer's problem).
• Java strings: String s = "short"; s = s + " and sweet"; // strings are immutable: s now refers to a new String
Resizing Arrays in Java
• Create a new array of the proper length and data type:
Integer[] a = new Integer[10];
Integer[] newArray = new Integer[newLength];
• Copy all elements from the old array into the new one:
System.arraycopy(a, 0, newArray, 0, a.length);
• Rename the array:
a = newArray; // old space reclaimed by garbage collector
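For comparison, the same allocate, copy, rename sequence can be sketched in C, where there is no garbage collector and the old block must be freed by hand. The function name grow_int_array is hypothetical; the standard library function realloc combines these steps in a single call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a new zero-filled array of new_len ints holding the first
   old_len elements of old, and frees old. Returns NULL (leaving old
   untouched) if allocation fails. */
int *grow_int_array(int *old, size_t old_len, size_t new_len)
{
    assert(new_len >= old_len && new_len > 0);
    int *fresh = calloc(new_len, sizeof *fresh);     /* 1. new array   */
    if (fresh == NULL)
        return NULL;
    if (old != NULL && old_len > 0)
        memcpy(fresh, old, old_len * sizeof *fresh); /* 2. copy        */
    free(old);                                       /* explicit reclaim */
    return fresh;                                    /* 3. caller renames */
}
```

A caller would write a = grow_int_array(a, 10, 20); after checking the result for NULL, which plays the role of the final renaming step.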
Dynamic Arrays in Fortran 90
Arrays sized at run time, but can’t be changed once set.
Classic Array Memory Layouts
Memory Layout in C
Address calculation (static array bounds)
Virtual Location of Array
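With static array bounds, the constant part of the address calculation can be precomputed; in effect we have “moved” the array in all three dimensions so that every index starts at zero.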
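The address calculation and the virtual location can be sketched as follows, assuming a three-dimensional array stored in row-major order with inclusive bounds in each dimension. The type bounds and all function names are assumptions made for this illustration.

```c
#include <assert.h>

typedef struct { long lo, hi; } bounds;   /* inclusive index range */

/* Row-major strides in bytes for a 3-D array with the given bounds. */
static void strides(const bounds b[3], long elem_size, long s[3])
{
    s[2] = elem_size;
    s[1] = (b[2].hi - b[2].lo + 1) * s[2];
    s[0] = (b[1].hi - b[1].lo + 1) * s[1];
}

/* Offset of A[i][j][k] from the array's base address, computed directly. */
long offset_direct(const bounds b[3], long elem_size, long i, long j, long k)
{
    long s[3];
    assert(b[0].lo <= i && i <= b[0].hi);
    assert(b[1].lo <= j && j <= b[1].hi);
    assert(b[2].lo <= k && k <= b[2].hi);
    strides(b, elem_size, s);
    return (i - b[0].lo) * s[0] + (j - b[1].lo) * s[1] + (k - b[2].lo) * s[2];
}

/* Offset of the virtual element A[0][0][0] from the base address.
   With static bounds a compiler can fold this into a constant. */
long virtual_origin(const bounds b[3], long elem_size)
{
    long s[3];
    strides(b, elem_size, s);
    return -(b[0].lo * s[0] + b[1].lo * s[1] + b[2].lo * s[2]);
}

/* Same offset as offset_direct, using the precomputed virtual origin. */
long offset_folded(const bounds b[3], long elem_size, long i, long j, long k)
{
    long s[3];
    strides(b, elem_size, s);
    return virtual_origin(b, elem_size) + i * s[0] + j * s[1] + k * s[2];
}
```

Both functions return the same offset; the folded form leaves only three multiplications and additions to be done at run time, since the lower-bound subtractions have been absorbed into the constant.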
Dope Vectors
• A “run-time” descriptor for the array.
• Contains, for each dimension (except last one, always statically known):
  • Lower bound
  • Size
  • Upper bound (if dynamic checks are required)
• Size of dope vector depends on # of dimensions (i.e. static).
• Typically placed next to the array pointer, in the fixed-size portion of the stack frame.
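A minimal sketch of a dope vector for a two-dimensional array whose bounds are only known at run time. For simplicity it stores lower bound, upper bound and size for both dimensions; the types dope_dim and matrix2d and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { long lower, upper, size; } dope_dim;

typedef struct {
    double  *data;     /* pointer to the elements (variable-size part) */
    dope_dim dim[2];   /* dope vector: its size is fixed by the rank   */
} matrix2d;

/* Fills in the dope vector and allocates zeroed storage.
   Returns 0 on success, -1 if allocation fails. */
int matrix2d_init(matrix2d *m, long lo1, long hi1, long lo2, long hi2)
{
    assert(m != NULL && lo1 <= hi1 && lo2 <= hi2);
    m->dim[0] = (dope_dim){ lo1, hi1, hi1 - lo1 + 1 };
    m->dim[1] = (dope_dim){ lo2, hi2, hi2 - lo2 + 1 };
    m->data = calloc((size_t)(m->dim[0].size * m->dim[1].size), sizeof *m->data);
    return m->data != NULL ? 0 : -1;
}

/* Address of element (i, j), or NULL if the dynamic bounds check fails. */
double *matrix2d_at(matrix2d *m, long i, long j)
{
    const dope_dim *d = m->dim;
    if (i < d[0].lower || i > d[0].upper || j < d[1].lower || j > d[1].upper)
        return NULL;
    return &m->data[(i - d[0].lower) * d[1].size + (j - d[1].lower)];
}

void matrix2d_free(matrix2d *m)
{
    free(m->data);
    m->data = NULL;
}
```

The upper bounds are consulted only by the range check, which matches the note above that they are needed only when dynamic checks are required.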
Strings
• Usually an array of characters.
• Many languages allow more flexibility with strings than with other types of arrays.
• Single-character string vs. single character:
  • Pascal: no distinction.
  • C: *very* different.
• String constants: 'abc', "abc".
• Rules for embedding special characters:
  • Pascal: double the character: 'ab''cd'
  • C: escape sequence: "ab\"cd".
Strings
• C, Pascal, Ada: string length bound no later than elaboration time (allocate in stack frame).
• Lisp, Icon, ML, Java: allow dynamically-bound strings, stored in the heap.
• Pascal supports lexicographically-ordered comparison of strings ('abc' < 'abd'). Ada supports it on all 1D discrete-valued arrays.
• C: no string assignment, elements copied individually (library functions).
Strings in C
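Because C has no string assignment (a declaration char a[4] followed by a = "abc" does not compile), copying goes through library functions that move the characters one by one. The helper below is a sketch under that assumption; copy_string is a hypothetical name, while strlen and memcpy are standard library functions.

```c
#include <assert.h>
#include <string.h>

/* Copies src into dst (capacity dst_size bytes), always NUL-terminating.
   Returns 1 if the whole string fit, 0 if it had to be truncated. */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t n = strlen(src);
    int fits = n < dst_size;
    if (!fits)
        n = dst_size - 1;      /* leave room for the terminator */
    memcpy(dst, src, n);       /* element-by-element copy */
    dst[n] = '\0';
    return fits;
}
```

The explicit capacity argument reflects the earlier point that the string length is bound no later than elaboration time: the destination cannot grow to fit.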
Sets
Pascal supports sets of any discrete type:
var a,b,c: set of char; d,e: set of weekday;
a := b + c; (* union *)
a := b * c; (* intersection *)
a := b - c; (* difference *)
Set implementations
• Arrays, hash tables, trees.
• Bit-vectors: each entry true (element in the set), or false (element not in the set).
• Efficient operations:
  • Union is inclusive bit-wise OR.
  • Intersection is bit-wise AND.
  • Difference is NOT, followed by AND.
• Won’t work for large base types:
  • A set of 32-bit integers needs about 500 MB (2^32 bits).
  • A set of 64-bit integers needs about 2^41 MB (2^64 bits).
• Set size is usually limited to 128 or 512 elements.
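The bit-vector operations can be sketched in C for a base type of 128 elements (character codes 0 to 127), stored in two 64-bit words. The type charset and the cs_ function names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t bits[2]; } charset;   /* one bit per element */

charset cs_empty(void)
{
    charset s = {{0, 0}};
    return s;
}

void cs_add(charset *s, unsigned c)
{
    assert(c < 128);
    s->bits[c / 64] |= (uint64_t)1 << (c % 64);
}

int cs_has(const charset *s, unsigned c)
{
    assert(c < 128);
    return (int)((s->bits[c / 64] >> (c % 64)) & 1u);
}

/* Union: inclusive bit-wise OR. */
charset cs_union(charset a, charset b)
{
    charset r = {{ a.bits[0] | b.bits[0], a.bits[1] | b.bits[1] }};
    return r;
}

/* Intersection: bit-wise AND. */
charset cs_intersection(charset a, charset b)
{
    charset r = {{ a.bits[0] & b.bits[0], a.bits[1] & b.bits[1] }};
    return r;
}

/* Difference: NOT of the right operand, followed by AND. */
charset cs_difference(charset a, charset b)
{
    charset r = {{ a.bits[0] & ~b.bits[0], a.bits[1] & ~b.bits[1] }};
    return r;
}
```

Each operation touches two machine words regardless of how many elements the sets contain, which is why the representation is attractive for small base types.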
Pointers and Recursive Types
• Most recursive types are records.
• Reference model languages (Lisp, ML, Clu, Java): every field is a reference.
  • A record of type f contains a reference to another record of type f.
• Value model languages (C, Pascal, Ada): need a pointer (a variable whose value is a reference).
• Don’t confuse pointer with address: an address may be segmented.
Storage Reclamation
• Explicit (C, C++, Pascal, Modula-2): programmer must reclaim unused heap space.
  • Can be done efficiently.
  • Easy to get wrong; if so, can lead to memory leaks or dangling references.
• Implicit (Lisp, ML, Modula-3, Ada, Java): heap space reclaimed automatically.
  • Not so efficient (but getting better).
  • Simplifies programmer’s task a LOT.
Reference Model (ML)
node ('R', node ('X', empty, empty),
           node ('Y', node ('Z', empty, empty),
                      node ('W', empty, empty)))
Reference Model (Lisp)
'(#\R (#\X () ()) (#\Y (#\Z () ()) (#\W () ())))
Value Model
• Pascal:
type chr_tree_ptr = ^chr_tree;
     chr_tree = record
       left, right: chr_tree_ptr;
       val: char
     end;
• C:
struct chr_tree {
  struct chr_tree *left, *right;
  char val;
};
• In C, struct names are not quite type names. Shorthand:
typedef struct chr_tree chr_tree_type;
Memory Allocation
• Pascal: new(my_ptr);
• Ada: my_ptr:=new chr_tree;
• C: my_ptr=(struct chr_tree *) malloc(sizeof (struct chr_tree));
• C++, Java: my_ptr = new chr_tree(args);
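Putting the C declaration and the malloc call together, the tree from the reference-model slides (R with children X and Y, and Y with children Z and W) can be built in the value model as sketched below. The helper names make_node, root_val, tree_size and free_tree are illustrative.

```c
#include <assert.h>
#include <stdlib.h>

struct chr_tree {
    struct chr_tree *left, *right;
    char val;
};

/* Allocates one node on the heap; returns NULL if malloc fails. */
struct chr_tree *make_node(char val, struct chr_tree *left,
                           struct chr_tree *right)
{
    struct chr_tree *n = malloc(sizeof *n);
    if (n != NULL) {
        n->val = val;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Reads a field through the pointer. */
char root_val(const struct chr_tree *t)
{
    assert(t != NULL);
    return t->val;
}

int tree_size(const struct chr_tree *t)
{
    return t == NULL ? 0 : 1 + tree_size(t->left) + tree_size(t->right);
}

/* Explicit reclamation: every node obtained from malloc must be freed. */
void free_tree(struct chr_tree *t)
{
    if (t != NULL) {
        free_tree(t->left);
        free_tree(t->right);
        free(t);
    }
}
```

Where the ML and Lisp versions use empty or (), the C version uses NULL pointers, and free_tree is the price of explicit storage reclamation.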
Pointer References
Pascal: my_ptr^.val := 'X';
C: (*my_ptr).val = 'X'; my_ptr->val = 'X';
Ada: T: chr_tree; P: chr_tree_ptr;
T.val := 'X'; P.val := 'X'; -- the same notation works for a record or a pointer to one
T := P.all; -- when the whole record must be referenced
Pointers and Arrays in C
int n;
int *a;
int b[10];
All are valid:
a = b;
n = a[3];
n = *(a+3);
n = b[3];
n = *(b+3);
Pointers and Arrays in C (cont’d)
Interoperable, but not the same:
int *a[n] allocates n pointers;
int a[n][m] allocates a full 2D array.
In fact, assuming int a[n];
*(a+i)
*(i+a)
a[i]
i[a]
are all equivalent!
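Both claims can be checked with a short C sketch. The function names are hypothetical, and the dimensions 3 and 4 are arbitrary values chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if all four spellings designate the same element of a. */
int same_element(int *a, int n, int i)
{
    assert(0 <= i && i < n);
    return &*(a + i) == &*(i + a)
        && &a[i] == &i[a]
        && &a[i] == a + i;
}

/* An array of 3 pointers: storage for the pointers only. */
size_t pointer_table_bytes(void)
{
    int *rows[3];
    return sizeof rows;
}

/* A full 3 x 4 array: storage for all 12 ints in one block. */
size_t full_grid_bytes(void)
{
    int grid[3][4];
    return sizeof grid;
}
```

The odd-looking i[a] compiles because a[i] is defined as *(a+i) and addition is commutative.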
Pointers and Arrays in C (cont’d)
• In C, arrays are passed by reference: the array name decays to a pointer to its first element.
• It’s customary to pass the array name, and its dimensions:
double det(double *M, int rows, int cols)
{
    int i, j;
    ...
    val = *(M + i*cols + j);   /* M[i][j] */
}
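The same indexing expression in a complete function: a sketch that sums the main diagonal of a matrix passed as a flat pointer plus its dimensions. The function diagonal_sum is an illustrative stand-in and is not the det function outlined above.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of the main diagonal of a rows x cols matrix stored row-major.
   The callee learns the shape only from the two extra parameters. */
double diagonal_sum(const double *M, int rows, int cols)
{
    assert(M != NULL && rows > 0 && cols > 0);
    int n = rows < cols ? rows : cols;
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += *(M + i * cols + i);   /* M[i][i] */
    return sum;
}
```

A caller holding double m[2][3] would pass &m[0][0], 2 and 3; nothing in the pointer itself records those dimensions.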
Tombstones
Technique for catching dangling references
Tombstones
• Advantages:
  • Catch dangling references.
  • Prevent memory leaks.
  • Helpful in heap compaction.
• Disadvantages:
  • Cheap on the heap, expensive on the stack (procedure entry/return).
  • Tombstones themselves can dangle.
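A minimal sketch of the tombstone idea for heap objects, assuming every program pointer refers to a tombstone rather than to the object itself. The type tombstone and the ts_ function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* The extra level of indirection: pointers refer to the tombstone,
   and the tombstone refers to the object (NULL once the object is dead). */
typedef struct { void *target; } tombstone;

tombstone *ts_alloc(size_t size)
{
    tombstone *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->target = malloc(size);
    if (t->target == NULL) {
        free(t);
        return NULL;
    }
    return t;
}

/* Reclaims the object but keeps the tombstone, so every remaining
   copy of the pointer can still discover that the object is gone. */
void ts_free(tombstone *t)
{
    assert(t != NULL);
    free(t->target);
    t->target = NULL;
}

/* Every access goes through the tombstone; NULL signals a dangling use. */
void *ts_deref(const tombstone *t)
{
    assert(t != NULL);
    return t->target;
}
```

The sketch also shows the cost: each access pays for one extra indirection, and the tombstone itself has to outlive the object.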
Locks and Keys
Locks and Keys
• Advantage:
  • No need to keep tombstones around.
• Disadvantages:
  • Only work for heap objects.
  • Significant overhead.
  • Increase the cost of copying a pointer.
  • Increase the cost of every access.
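A sketch of locks and keys, assuming each heap object carries a lock word and each pointer is an (address, key) pair. To keep the example well defined it allocates from a small static pool instead of malloc; the pool, its size and all lk_ names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 8

typedef struct { unsigned long lock; int value; } lk_object;
typedef struct { lk_object *addr; unsigned long key; } lk_pointer;

static lk_object pool[POOL_SIZE];      /* lock == 0 means "free" */
static unsigned long next_key = 1;

/* Allocates an object, giving it a fresh lock and the pointer a matching key.
   Returns a pointer with addr == NULL if the pool is full. */
lk_pointer lk_new(int value)
{
    lk_pointer p = { NULL, 0 };
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].lock == 0) {
            pool[i].lock = next_key++;
            pool[i].value = value;
            p.addr = &pool[i];
            p.key = pool[i].lock;
            break;
        }
    }
    return p;
}

/* A pointer is valid only while its key still matches the object's lock. */
int lk_valid(lk_pointer p)
{
    return p.addr != NULL && p.key != 0 && p.addr->lock == p.key;
}

/* Deallocation changes the lock, so all outstanding keys stop matching. */
void lk_delete(lk_pointer p)
{
    if (lk_valid(p))
        p.addr->lock = 0;
}

/* Every access pays for the key comparison. */
int lk_deref(lk_pointer p)
{
    assert(lk_valid(p));
    return p.addr->value;
}
```

Copying an lk_pointer copies two words instead of one, and a stale pointer stays invalid even after its slot has been reused, because the new object receives a different lock.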
Reference Counts
• Set count to 1 upon object creation.
• Upon assignment:
  • Decrement count of object on left.
  • Increment count of object on right.
• Upon subroutine entry, increment counts for local pointers.
• Upon subroutine return, decrement counts for local pointers.
• Need type descriptors for this: objects can be deeply structured.
• WILL FAIL ON CIRCULAR STRUCTURES!
Reference Counts Fail on Circular Structures
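The counting rules, and the failure on cycles, can be reproduced with a small C sketch. It assumes a node with a single pointer field; the type rc_node, the rc_ functions and the live-node counter are hypothetical and exist only to make the leak observable.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct rc_node {
    int count;                 /* number of references to this node */
    struct rc_node *next;
} rc_node;

static int live_nodes = 0;     /* nodes allocated and not yet freed */

int rc_live_nodes(void) { return live_nodes; }

/* Creation: the count starts at 1 for the reference being returned. */
rc_node *rc_new(void)
{
    rc_node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->count = 1;
        n->next = NULL;
        live_nodes++;
    }
    return n;
}

/* Drops one reference; frees the node, and releases what it points to,
   when the count reaches zero. */
void rc_release(rc_node *n)
{
    if (n == NULL || --n->count > 0)
        return;
    rc_node *next = n->next;
    free(n);
    live_nodes--;
    rc_release(next);
}

/* Assignment *lhs = rhs: increment the right, decrement the old left. */
void rc_assign(rc_node **lhs, rc_node *rhs)
{
    assert(lhs != NULL);
    if (rhs != NULL)
        rhs->count++;          /* increment first: safe if *lhs == rhs */
    rc_release(*lhs);
    *lhs = rhs;
}
```

If two nodes point at each other and the program then releases its own references to both, each count stops at 1 rather than 0, so rc_live_nodes() still reports them although nothing can reach them.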
Garbage Collection
• The system determines which memory is not in use and returns that memory to the pool of free storage.
• Done in two or three steps:
  • Mark nodes that are in use.
  • Compact free space (optional).
  • Move free nodes to storage pool.
Marking
• Unmark all nodes (set all mark bits to false).
• Start at each program variable that contains a reference, follow all pointers, mark nodes that are reached.
[Figure: heap nodes c, a, e, d, b, with the program variable firstNode referring into them]
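The marking step can be sketched in C, assuming the heap is an array of nodes with two pointer fields each and a single root variable. The type gc_node and the gc_ function names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct gc_node {
    bool marked;
    struct gc_node *left, *right;
} gc_node;

/* Step 1: set all mark bits to false. */
void gc_unmark_all(gc_node *heap, size_t n)
{
    assert(heap != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        heap[i].marked = false;
}

/* Step 2: follow all pointers from a root, marking every node reached.
   The mark bit doubles as a "visited" flag, so cycles terminate. */
void gc_mark(gc_node *node)
{
    if (node == NULL || node->marked)
        return;
    node->marked = true;
    gc_mark(node->left);
    gc_mark(node->right);
}

/* Nodes still unmarked after marking are garbage. */
size_t gc_count_unmarked(const gc_node *heap, size_t n)
{
    size_t garbage = 0;
    for (size_t i = 0; i < n; i++)
        if (!heap[i].marked)
            garbage++;
    return garbage;
}
```

Unlike reference counting, marking is not confused by circular structures: a cycle that no root can reach is simply never marked.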
Compaction
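Move all marked nodes (i.e., nodes in use) to one end of memory, updating all pointers as necessary.
[Figure: the heap before and after compaction, with the in-use nodes reachable from firstNode gathered at one end and the rest shown as free memory]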
Lists in Lisp and ML
Equality Testing and Assignment
• Equality comparison is easy for scalars.
• For complex or abstract data types, say, strings s and t, s = t could mean:
  • s and t are aliases
  • s and t occupy the same storage
  • s and t contain the same sequence of characters
  • s and t print the same
Deep and Shallow Comparisons
• Shallow Comparison:
  • Both expressions refer to the same object.
• Deep Comparison:
  • Expressions refer to objects that are “equal” in content somehow.
• Most PLs use shallow comparisons, and shallow assignments.
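For C strings the two notions correspond to comparing pointers and comparing contents. The wrapper names below are hypothetical; strcmp is the standard library function.

```c
#include <assert.h>
#include <string.h>

/* Shallow: do both expressions refer to the same object? */
int shallow_equal(const char *s, const char *t)
{
    return s == t;
}

/* Deep: do the two objects hold the same sequence of characters? */
int deep_equal(const char *s, const char *t)
{
    assert(s != NULL && t != NULL);
    return strcmp(s, t) == 0;
}
```

Two separate arrays that both contain "abc" are deep-equal but not shallow-equal, while a string and an alias for it are equal in both senses.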