
CSE 305 Introduction to Programming Languages

Lecture 15 – Object Oriented Programming

CSE @ SUNY-Buffalo Zhi Yang

Courtesy of Professor Yacov Hel-Or and Dr. David Reed

Notice Board

•  On August 1 we will have the fourth long quiz, which covers Lecture 16 to Lecture 19.

Our objective

•  The first objective of our class is to be able to comprehend a new programming language within a very short time. Because this ability shortens your learning curve, you will be able to work with the language with real insight.

•  The second objective is to even engineer your own language!

Review what we’ve learnt and see the future

[Figure: roadmap of the course so far and what comes next]

•  Number System (e.g., Egyptian Number System; Complement Number)
•  Basic Calculation System (e.g., Abacus)
•  1st Generation language: Machine Code (e.g., gate system, including different underlying devices)
•  2nd Generation language: Assembly Code (e.g., MIPS)
•  3rd Generation language: Macro function (e.g., Fortran)
•  What’s next? Lexer, Parser, Regular Expression, Context-Free Grammar, Push Down Automata, Lambda Calculus Theory, Type Checking, Compiler System, Virtual Machine

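A family tree of languages

[Figure: family tree of languages. Nodes shown: <Fortran>, BASIC, Cobol, <LISP>, <Scheme>, <ML>, <Prolog>, PL/1, Algol 60, Algol 68, Pascal, Modula 3, Ada, C, <C++>, <Simula>, <Smalltalk>, <Java>, Dylan, <Ruby>, <Perl>, <Python>, <C#>, <Haskell>, <JavaScript>. Angle brackets are reproduced from the figure.]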

Type Inference / Type Reconstruction

Chapter 17

Type Reconstruction

For technical reasons, this chapter uses a slightly different definition of substitution from what we had before. This should be changed to correspond exactly to the earlier notion. Aside from that, it’s essentially finished.

The present chapter does not mention let-polymorphism. I think that’s a shame and I’ve tried to figure out how to put in something about it, but it’s apparently quite hard to do it in a rigorous way without going on for page after page of technicalities.

Given an explicitly typed term in the simply typed lambda-calculus, we have seen an algorithm for determining its type, if it has one. In this chapter, we develop a more powerful type reconstruction algorithm, capable of calculating a principal type for a term in which some of the explicit type annotations are replaced by variables. Similar algorithms lie at the heart of languages like ML and Haskell.

The term “type inference” is often used instead of “type reconstruction.”

17.1 Substitution

We will work in this chapter with the system , the simply typed lambda-calculus with numbers and an infinite collection of atomic types. When we saw them in Section 8.1, these atomic types were completely uninterpreted. Now we are going to think of them as type variables, i.e., placeholders standing for other types. In order to make this idea precise, we need to define what it means to substitute arbitrary types for type variables.

17.1.1 Definition: A type substitution (or, for purposes of this chapter, just substitution) is a finite function from type variables to types. For example, we write

for the substitution that maps to and to and is undefined on other arguments.


Principal Typings

January 15, 2000 17. TYPE RECONSTRUCTION 128

The main data structure needed for this exercise is a representation of substitutions. There are many alternatives; one simple one is to reuse the datatype from Exercise 17.3.10: a substitution is just a constraint set, all of whose left-hand sides are unification variables. If is a function that performs substitution of a type for a single type variable

then application of a whole substitution to a type can be defined as follows:

(Solution on page 258.)

17.5 Principal Typings

17.5.1 Definition: A principal typing for is a typing such that, whenever is also a typing for , we have .

17.5.2 Exercise [Quick check]: Find a principal typing for

17.5.3 Theorem [Principal typing]: If has any typing, then it has a principal one. Moreover, the unification algorithm can be used to determine whether has a typing and, if so, to return a principal one.

Proof: Immediate from the definition of typing and the properties of unification.

17.5.4 Corollary: It is decidable whether has a typing.

Proof: By Corollary 17.3.8 and Theorem 17.5.3.

17.5.5 Exercise [Recommended]: Use the implementations of constraint generation (Exercise 17.3.10) and unification (Exercise 17.4.7) to construct a running typechecker that calculates principal typings, using the checker provided in the course directory as a starting point.



Notice: in terms of principal types, the following slides include experiments run on a 64-bit Intel architecture computer.

Learn O-O from a C/C++ perspective (Principal types)

144 Thinking in C++ www.BruceEckel.com

// Demonstrates the use of specifiers
#include <iostream>
using namespace std;

int main() {
  char c;
  unsigned char cu;
  int i;
  unsigned int iu;
  short int is;
  short iis; // Same as short int
  unsigned short int isu;
  unsigned short iisu;
  long int il;
  long iil; // Same as long int
  unsigned long int ilu;
  unsigned long iilu;
  float f;
  double d;
  long double ld;
  cout
    << "\n char= " << sizeof(c)
    << "\n unsigned char = " << sizeof(cu)
    << "\n int = " << sizeof(i)
    << "\n unsigned int = " << sizeof(iu)
    << "\n short = " << sizeof(is)
    << "\n unsigned short = " << sizeof(isu)
    << "\n long = " << sizeof(il)
    << "\n unsigned long = " << sizeof(ilu)
    << "\n float = " << sizeof(f)
    << "\n double = " << sizeof(d)
    << "\n long double = " << sizeof(ld)
    << endl;
} ///:~

Be aware that the results you get by running this program will probably be different from one machine/operating system/compiler to the next, since (as mentioned previously) the only thing that must be consistent is that each different type hold the minimum and maximum values specified in the Standard.

When you are modifying an int with short or long, the keyword int is optional, as shown above.

 char= 1
 unsigned char = 1
 int = 4
 unsigned int = 4
 short = 2
 unsigned short = 2
 long = 8
 unsigned long = 8
 float = 4
 double = 8
 long double = 16

Pointers and References (1)

Whenever you run a program, it is first loaded (typically from disk) into the computer’s memory. Thus, all elements of your program are located somewhere in memory. Memory is typically laid out as a sequential series of memory locations; we usually refer to these locations as eight-bit bytes but actually the size of each space depends on the architecture of the particular machine and is usually called that machine’s word size. Each space can be uniquely distinguished from all other spaces by its address. For the purposes of this discussion, we’ll just say that all machines use bytes that have sequential addresses starting at zero and going up to however much memory you have in your computer.

Pointers and References (2)


called; in main( ) the argument is x, which has a value of 47, so this value is copied into a when f( ) is called.

When you run this program you’ll see:

x = 47
a = 47
a = 5
x = 47

Initially, of course, x is 47. When f( ) is called, temporary space is created to hold the variable a for the duration of the function call, and a is initialized by copying the value of x, which is verified by printing it out. Of course, you can change the value of a and show that it is changed. But when f( ) is completed, the temporary space that was created for a disappears, and we see that the only connection that ever existed between a and x happened when the value of x was copied into a.

When you’re inside f( ), x is the outside object (my terminology), and changing the local variable does not affect the outside object, naturally enough, since they are two separate locations in storage. But what if you do want to modify the outside object? This is where pointers come in handy. In a sense, a pointer is an alias for another variable. So if we pass a pointer into a function instead of an ordinary value, we are actually passing an alias to the outside object, enabling the function to modify that outside object, like this:

//: C03:PassAddress.cpp
#include <iostream>
using namespace std;

void f(int* p) {
  cout << "p = " << p << endl;
  cout << "*p = " << *p << endl;
  *p = 5;
  cout << "p = " << p << endl;
}

int main() {
  int x = 47;
  cout << "x = " << x << endl;
  cout << "&x = " << &x << endl;
  f(&x);
  cout << "x = " << x << endl;
} ///:~

3: The C in C++ 151

Now f( ) takes a pointer as an argument and dereferences the pointer during assignment, and this causes the outside object x to be modified. The output is:

x = 47
&x = 0065FE00
p = 0065FE00
*p = 47
p = 0065FE00
x = 5

Notice that the value contained in p is the same as the address of x – the pointer p does indeed point to x. If that isn’t convincing enough, when p is dereferenced to assign the value 5, we see that the value of x is now changed to 5 as well.

Thus, passing a pointer into a function will allow that function to modify the outside object. You’ll see plenty of other uses for pointers later, but this is arguably the most basic and possibly the most common use.

Introduction to C++ references

Pointers work roughly the same in C and in C++, but C++ adds an additional way to pass an address into a function. This is pass-by-reference and it exists in several other programming languages so it was not a C++ invention.

Your initial perception of references may be that they are unnecessary, that you could write all your programs without references. In general, this is true, with the exception of a few important places that you’ll learn about later in the book. You’ll also learn more about references later, but the basic idea is the same

x = 47
&x = 0x7fff5aff914
p = 0x7fff5aff914
*p = 47
p = 0x7fff5aff914
x = 5

Composite type creation

The fundamental data types and their variations are essential, but rather primitive. C and C++ provide tools that allow you to compose more sophisticated data types from the fundamental data types. As you’ll see, the most important of these is struct, which is the foundation for class in C++. However, the simplest way to create more sophisticated types is simply to alias a name to another name via typedef.

Aliasing names with typedef

This keyword promises more than it delivers: typedef suggests “type definition” when “alias” would probably have been a more accurate description, since that’s what it really does. The syntax is:

typedef existing-type-description alias-name

People often use typedef when data types get slightly complicated, just to prevent extra keystrokes. Here is a commonly-used typedef:

typedef unsigned long ulong;

Now if you say ulong the compiler knows that you mean unsigned long. You might think that this could as easily be accomplished using preprocessor substitution, but there are key situations in which the compiler must be aware that you’re treating a name as if it were a type, so typedef is essential.

Combining variables with struct



The struct declaration must end with a semicolon. In main( ), two instances of Structure1 are created: s1 and s2. Each of these has their own separate versions of c, i, f, and d. So s1 and s2 represent clumps of completely independent variables. To select one of the elements within s1 or s2, you use a ‘.’, syntax you’ve seen in the previous chapter when using C++ class objects – since classes evolved from structs, this is where that syntax arose from.

One thing you’ll notice is the awkwardness of the use of Structure1 (as it turns out, this is only required by C, not C++). In C, you can’t just say Structure1 when you’re defining variables, you must say struct Structure1. This is where typedef becomes especially handy in C:

//: C03:SimpleStruct2.cpp
// Using typedef with struct
typedef struct {
  char c;
  int i;
  float f;
  double d;
} Structure2;

int main() {
  Structure2 s1, s2;
  s1.c = 'a';
  s1.i = 1;
  s1.f = 3.14;
  s1.d = 0.00093;
  s2.c = 'a';
  s2.i = 1;
  s2.f = 3.14;
  s2.d = 0.00093;
} ///:~

By using typedef in this way, you can pretend (in C; try removing the typedef for C++) that Structure2 is a built-in type, like int or float, when you define s1 and s2 (but notice it only has data –


One place where typedef comes in handy is for pointer types. As previously mentioned, if you say:

int* x, y;

This actually produces an int* which is x and an int (not an int*) which is y. That is, the ‘*’ binds to the right, not the left. However, if you use a typedef:

typedef int* IntPtr;
IntPtr x, y;

Then both x and y are of type int*.

You can argue that it’s more explicit and therefore more readable to avoid typedefs for primitive types, and indeed programs rapidly become difficult to read when many typedefs are used. However, typedefs become especially important in C when used with struct.

Combining variables with struct

A struct is a way to collect a group of variables into a structure. Once you create a struct, then you can make many instances of this “new” type of variable you’ve invented. For example:

//: C03:SimpleStruct.cpp
struct Structure1 {
  char c;
  int i;
  float f;
  double d;
};

int main() {
  struct Structure1 s1, s2;
  s1.c = 'a'; // Select an element using a '.'
  s1.i = 1;
  s1.f = 3.14;
  s1.d = 0.00093;
  s2.c = 'a';
  s2.i = 1;
  s2.f = 3.14;
  s2.d = 0.00093;
} ///:~


By using typedef in this way, you can pretend (in C; try removing the typedef for C++) that Structure2 is a built-in type, like int or float, when you define s1 and s2 (but notice it only has data – characteristics – and does not include behavior, which is what we get with real objects in C++).

Abstract data typing (ADT)

The ability to package data with functions allows you to create a new data type. This is often called encapsulation. An existing data type may have several pieces of data packaged together. For example, a float has an exponent, a mantissa, and a sign bit. You can tell it to do things: add to another float or to an int, and so on. It has characteristics and behavior.

Sizeof (ADT)

#include <iostream>
using namespace std;

struct A {
  int i[100];
};

struct B {
  void f();
};

void B::f() {}

int main() {
  cout << "sizeof struct A = " << sizeof(A) << " bytes" << endl;
  cout << "sizeof struct B = " << sizeof(B) << " bytes" << endl;
} ///:~

sizeof struct A = 400 bytes
sizeof struct B = 1 bytes

Each int occupies 4 bytes. struct B is something of an anomaly because it is a struct with no data members. In C, this is illegal, but in C++ we need the option of creating a struct whose sole task is to scope function names, so it is allowed. Still, the result produced by the second print statement is a somewhat surprising nonzero value. In early versions of the language, the size was zero, but an awkward situation arises when you create such objects: They have the same address as the object created directly after them, and so are not distinct. One of the fundamental rules of objects is that each object must have a unique address, so structures with no data members will always have some minimum nonzero size. The last two sizeof statements show you that the size of the structure in C++ is the same as the size of the equivalent version in C. C++ tries not to add any unnecessary overhead.

sizeof measures the data members of a composite data structure, not its member functions.

Scoping

Scoping rules tell you where a variable is valid, where it is created, and where it gets destroyed (i.e., goes out of scope). The scope of a variable extends from the point where it is defined to the first closing brace that matches the closest opening brace before the variable was defined. That is, a scope is defined by its “nearest” set of braces.

access control (1)

C++ introduces three new keywords to set the boundaries in a structure: public, private, and protected. Their use and meaning are remarkably straightforward. These access specifiers are used only in a structure declaration, and they change the boundary for all the declarations that follow them. Whenever you use an access specifier, it must be followed by a colon.

public means all member declarations that follow are available to everyone. public members are like struct members. For example, the following struct declarations are identical

access control (2)

struct B {
private:
  char j;
  float f;
public:
  int i;
  void func();
};

void B::func() {
  i = 0;
  j = '0';
  f = 0.0;
}

int main() {
  B b;
  b.i = 1;
  return 0;
}

protected acts just like private, with one exception that we can’t really talk about right now: “Inherited” structures (which cannot access private members) are granted access to protected members.

The class

In the original OOP language, Simula-67, the keyword class was used to describe a new data type. This apparently inspired Stroustrup to choose the same keyword for C++, to emphasize that this was the focal point of the whole language: the creation of new data types that are more than just C structs with functions. This certainly seems like adequate justification for a new keyword. However, the use of class in C++ comes close to being an unnecessary keyword. It’s identical to the struct keyword in absolutely every way except one: class defaults to private, whereas struct defaults to public. Here are two structures that produce the same result:

The Class (2)

struct A {
private:
  int i, j, k;
public:
  int f();
  void g();
};

int A::f() {
  return i + j + k;
}

void A::g() {
  i = j = k = 0;
}

class B {
  int i, j, k;
public:
  int f();
  void g();
};

int B::f() {
  return i + j + k;
}

void B::g() {
  i = j = k = 0;
}

A == B? Yes

int main() {
  A a;
  B b;
  a.f(); a.g();
  b.f(); b.g();
}

Initialization & Cleanup

Two of these safety issues are initialization and cleanup. A large segment of C bugs occur when the programmer forgets to initialize or clean up a variable. This is especially true with C libraries, when client programmers don’t know how to initialize a struct, or even that they must. (Libraries often do not include an initialization function, so the client programmer is forced to initialize the struct by hand.) Cleanup is a special problem because C programmers are comfortable with forgetting about variables once they are finished, so any cleaning up that may be necessary for a library’s struct is often missed.

We already have struct. Why should we have class instead?

Guaranteed initialization with the constructor

class X {
public:
  int i;
  X(); // Constructor
};

X::X() : i(0) {} // Definition, needed for the example to link

// Now, when an object is defined,
int main() {
  X a;
}

the same thing happens as if a were an int: storage is allocated for the object. But when the program reaches the sequence point (point of execution) where a is defined, the constructor is called automatically. That is, the compiler quietly inserts the call to X::X( ) for the object a at the point of definition. Like any member function, the first (secret) argument to the constructor is the this pointer – the address of the object for which it is being called. In the case of the constructor, however, this is pointing to an un-initialized block of memory, and it’s the job of the constructor to initialize this memory properly.

Guaranteed cleanup with the destructor

class Y {
public:
  ~Y();
};

The destructor is called automatically by the compiler when the object goes out of scope. You can see where the constructor gets called by the point of definition of the object, but the only evidence for a destructor call is the closing brace of the scope that surrounds the object. Yet the destructor is still called, even when you use goto to jump out of a scope. (goto still exists in C++ for backward compatibility with C and for the times when it comes in handy.) You should note that a nonlocal goto, implemented by the Standard C library functions setjmp( ) and longjmp( ), doesn’t cause destructors to be called. (This is the specification, even if your compiler doesn’t implement it that way. Relying on a feature that isn’t in the specification means your code is nonportable.)

Default constructors

A default constructor is one that can be called with no arguments. A default constructor is used to create a “vanilla object,” but it’s also important when the compiler is told to create an object but isn’t given any details. For example, if you take the struct Y defined previously and use it in a definition like this,

Y y2[2] = { Y(1) };

Function Overloading & Default Arguments

void print(char);
void print(float);

It doesn’t matter whether they are both inside a class or at the global scope. The compiler can’t generate unique internal identifiers if it uses only the scope of the function names. You’d end up with _print in both cases. The idea of an overloaded function is that you use the same function name, but different argument lists. Thus, for overloading to work the compiler must decorate the function name with the names of the argument types. The functions above, defined at global scope, produce internal names that might look something like _print_char and _print_float. It’s worth noting there is no standard for the way names must be decorated by the compiler, so you will see very different results from one compiler to another. (You can see what it looks like by telling the compiler to generate assembly-language output.) This, of course, causes problems if you want to buy compiled libraries for a particular compiler and linker – but even if name decoration were standardized, there would be other roadblocks because of the way different compilers generate code.

Overloading on return values

It’s common to wonder, “Why just scopes and argument lists? Why not return values?” It seems at first that it would make sense to also decorate the return value with the internal function name. Then you could overload on return values, as well:

void f();
int f();

This works fine when the compiler can unequivocally determine the meaning from the context, as in int x = f( );. However, in C you’ve always been able to call a function and ignore the return value (that is, you can call the function for its side effects). How can the compiler distinguish which call is meant in this case? Possibly worse is the difficulty the reader has in knowing which function call is meant. Overloading solely on return value is a bit too subtle, and thus isn’t allowed in C++.

static area (1): variable in functions

#include <iostream>
using namespace std;

char oneChar(const char* charArray = 0) {
  static const char* s; // keeps its value between calls
  if(charArray) {
    s = charArray;
    return *s;
  }
  if(*s == '\0')
    return 0;
  return *s++;
}

// const added: a string literal cannot initialize a plain char* in modern C++
const char* a = "abcdefghijklmnopqrstuvwxyz";

int main() {
  oneChar(a); // Initializes s to a
  char c;
  while ((c = oneChar()) != 0) {
    cout << c << endl;
  }
}

a b c d e f g h i j k l m n o p q r s t u v w x y z

static class objects inside functions

#include <iostream>
using namespace std;

class X {
  int i;
public:
  X(int ii = 0) : i(ii) {} // Default
  ~X() { cout << "X::~X()" << endl; }
};

void f() {
  static X x1(47);
  static X x2; // Default constructor required
}

int main() {
  f();
} ///:~

X::~X()
X::~X()

Take a look at scope



all functionality inline. In main( ), the call to bigfun( ) starts as you might guess – the entire contents of B is pushed on the stack. (Here, you might see some compilers load registers with the address of the Big and its size, then call a helper function to push the Big onto the stack.)

In the previous code fragment, pushing the arguments onto the stack was all that was required before making the function call. In PassingBigStructures.cpp, however, you’ll see an additional action: the address of B2 is pushed before making the call, even though it’s obviously not an argument. To comprehend what’s going on here, you need to understand the constraints on the compiler when it’s making a function call.

Function-call stack frame

When the compiler generates code for a function call, it first pushes all the arguments on the stack, then makes the call. Inside the function, code is generated to move the stack pointer down even farther to provide storage for the function’s local variables. (“Down” is relative here; your machine may increment or decrement the stack pointer during a push.) But during the assembly-language CALL, the CPU pushes the address in the program code where the function call came from, so the assembly-language RETURN can use that address to return to the calling point. This address is of course sacred, because without it your program will get completely lost. Here’s what the stack frame looks like after the CALL and the allocation of local variable storage in the function:

Function arguments

Return address

Local variables

Copy-construction

The compiler makes an assumption about how to create a new object from an existing object. The compiler’s assumption is that you want to perform this creation using a bitcopy, and in many cases this may work fine. A common example where it may not occurs if the class contains pointers – what do they point to, and should you copy them or should they be connected to some new piece of memory?

Fortunately, you can intervene in this process and prevent the compiler from doing a bitcopy. You do this by defining your own function to be used whenever the compiler needs to make a new object from an existing object. Logically enough, you’re making a new object, so this function is a constructor, and also logically enough, the single argument to this constructor has to do with the object you’re constructing from. But that object can’t be passed into the constructor by value because you’re trying to define the function that handles passing by value, and syntactically it doesn’t make sense to pass a pointer because, after all, you’re creating the new object from an existing object. Here, references come to the rescue, so you take the reference of the source object. This function is called the copy-constructor and is often referred to as X(X&), which is its appearance for a class called X.

Inheritance

class A {
  int i;
public:
  A(int ii) : i(ii) {}
  ~A() {}
  void f() const {}
};

class B {
  int i;
public:
  B(int ii) : i(ii) {}
  ~B() {}
  void f() const {}
};

class C : public B {
  A a;
public:
  C(int ii) : B(ii), a(ii) {}
  ~C() {} // Calls ~A() and ~B()
  void f() const { // Redefinition
    a.f();
    B::f();
  }
};

int main() {
  C c(47);
} ///:~

Polymorphism (overriding)

#include <iostream>
using namespace std;

enum note { middleC, Csharp, Eflat }; // Etc.

class Instrument {
public:
  void play(note) const {
    cout << "Instrument::play" << endl;
  }
};

// Wind objects are Instruments
// because they have the same interface:
class Wind : public Instrument {
public:
  // Redefine interface function:
  void play(note) const {
    cout << "Wind::play" << endl;
  }
};

void tune(Instrument& i) {
  i.play(middleC);
}

int main() {
  Wind flute;
  tune(flute); // Upcasting
} ///:~

Instrument::play

Polymorphism (virtual function)

#include <iostream>
using namespace std;

enum note { middleC, Csharp, Cflat }; // Etc.

class Instrument {
public:
  virtual void play(note) const {
    cout << "Instrument::play" << endl;
  }
};

// Wind objects are Instruments
// because they have the same interface:
class Wind : public Instrument {
public:
  // Override interface function:
  void play(note) const {
    cout << "Wind::play" << endl;
  }
};

void tune(Instrument& i) {
  i.play(middleC);
}

int main() {
  Wind flute;
  tune(flute); // Upcasting
} ///:~

Wind::play

How is it implemented?


“dummy” member. Try commenting out the int a in all the classes in the example above to see this.

Picturing virtual functions

To understand exactly what’s going on when you use a virtual function, it’s helpful to visualize the activities going on behind the curtain. Here’s a drawing of the array of pointers A[ ] in Instrument4.cpp:

[Figure: the array of Instrument pointers A[ ], the objects it points to, and their VTABLEs]

Objects (each holding a vptr)    VTABLEs
Wind object                      &Wind::play, &Wind::what, &Wind::adjust
Percussion object                &Percussion::play, &Percussion::what, &Percussion::adjust
Stringed object                  &Stringed::play, &Stringed::what, &Stringed::adjust
Brass object                     &Brass::play, &Brass::what, &Wind::adjust

The array of Instrument pointers has no specific type information; they each point to an object of type Instrument. Wind, Percussion, Stringed, and Brass all fit into this category because they are derived from Instrument (and thus have the same interface as Instrument, and can respond to the same messages), so their addresses can also be placed into the array. However, the compiler doesn’t know that they are anything more than Instrument objects, so left to its own devices it would normally call the base-class versions of all the functions. But in this case, all those functions have been declared with the virtual keyword, so something different happens.

Template syntax




uncomfortable with inheritance can still use canned container classes right away (as we’ve been doing with vector throughout the book).

The template keyword tells the compiler that the class definition that follows will manipulate one or more unspecified types. At the time the actual class code is generated from the template, those types must be specified so that the compiler can substitute them.

To demonstrate the syntax, here’s a small example that produces a bounds-checked array:

//: C16:Array.cpp
#include "../require.h"
#include <iostream>
using namespace std;

template<class T>
class Array {
  enum { size = 100 };
  T A[size];
public:
  T& operator[](int index) {
    require(index >= 0 && index < size,
      "Index out of range");
    return A[index];
  }
};

int main() {
  Array<int> ia;
  Array<float> fa;
  for(int i = 0; i < 20; i++) {
    ia[i] = i * i;
    fa[i] = float(i) * 1.414;
  }
  for(int j = 0; j < 20; j++)
    cout << j << ": " << ia[j]
         << ", " << fa[j] << endl;
} ///:~

You can see that it looks like a normal class except for the line

template<class T>

which says that T is the substitution parameter, and that it represents a type name. Also, you see T used everywhere in the class where you would normally see the specific type the container holds.

In Array, elements are inserted and extracted with the same function: the overloaded operator [ ] . It returns a reference, so it can be used on both sides of an equal sign (that is, as both an lvalue and an rvalue). Notice that if the index is out of bounds, the require( ) function is used to print a message. Since operator[] is an inline, you could use this approach to guarantee that no array-bounds violations occur, then remove the require( ) for the shipping code.

In main( ), you can see how easy it is to create Arrays that hold different types of objects. When you say

Array<int> ia; Array<float> fa;

the compiler expands the Array template (this is called instantiation) twice, to create two new generated classes, which you can think of as Array_int and Array_float. (Different compilers may decorate the names in different ways.) These are classes just like the ones you would have produced if you had performed the substitution by hand, except that the compiler creates them for you as you define the objects ia and fa. Also note that duplicate class definitions are either avoided by the compiler or merged by the linker.

Multiple Inheritance

#include <fstream>
using namespace std;
ofstream out("mithis.out");

class Base1 {
  char c[0x10];
public:
  void printthis1() {
    out << "Base1 this = " << this << endl;
  }
};

class Base2 {
  char c[0x10];
public:
  void printthis2() {
    out << "Base2 this = " << this << endl;
  }
};

class Member1 {
  char c[0x10];
public:
  void printthism1() {
    out << "Member1 this = " << this << endl;
  }
};

class Member2 {
  char c[0x10];
public:
  void printthism2() {
    out << "Member2 this = " << this << endl;
  }
};

class MI : public Base1, public Base2 {
  Member1 m1;
  Member2 m2;
public:
  void printthis() {
    out << "MI this = " << this << endl;
    printthis1();
    printthis2();
    m1.printthism1();
    m2.printthism2();
  }
};

int main() {
  MI mi;
  out << "sizeof(mi) = "
      << hex << sizeof(mi) << " hex" << endl;
  mi.printthis();
  Base1* b1 = &mi; // Upcast
  Base2* b2 = &mi; // Upcast
  out << "Base 1 pointer = " << b1 << endl;
  out << "Base 2 pointer = " << b2 << endl;
} ///:~

Output:

sizeof(mi) = 40 hex
MI this = 0x223e
Base1 this = 0x223e
Base2 this = 0x224e
Member1 this = 0x225e
Member2 this = 0x226e
Base 1 pointer = 0x223e
Base 2 pointer = 0x224e

Run-time type identification (RTTI)

There are two different ways to use RTTI. The first acts like sizeof( ) because it looks like a function, but it’s actually implemented by the compiler. typeid( ) takes an argument that’s an object, a reference, or a pointer and returns a reference to a global const object of type type_info (declared in <typeinfo>). These can be compared to each other with operator== and operator!=, and you can also ask for the name( ) of the type, which returns a string representation of the type name (the exact text is implementation-defined). The second syntax for RTTI is called a “type-safe downcast.” The reason for the term “downcast” is (again) the historical arrangement of the class hierarchy diagram.
