CSE 305 Introduction to Programming Languages
Lecture 15: Object Oriented Programming
CSE @ SUNY-Buffalo, Zhi Yang
Courtesy of Professor Yacov Hel-Or
Courtesy of Dr. David Reed



   

Notice Board

•  On August 1, we will have the fourth long quiz, which covers lecture 16 to lecture 19.

   

   

Our objective

•  The first objective of our class is to comprehend a new programming language within a very short time period. Because this ability shortens your learning curve, you will be able to work with a language with real insight into how it is built.

•  The second objective is to even engineer your own language!

Review what we've learnt and see the future

[Diagram: course roadmap. Stages shown: Number System; Basic Calculation System; 1st Generation language: Machine Code; 2nd Generation language: Assembly Code; 3rd Generation language: Macro function; What's next? Examples shown: Egyptian Number System; Complement Number; Abacus; Gate system, including different underlying devices; MIPS; Fortran. Topics shown: Lexer; Parser; Regular Expression; Context-Free Grammar; Push Down Automata; Lambda Calculus Theory; Type Checking; Compiler System; Virtual Machine.]

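A family tree of languages

[Diagram: family tree of programming languages. Names shown, with angle brackets as in the original figure: <Fortran>, BASIC, Cobol, <LISP>, <Scheme>, <ML>, <Prolog>, PL/1, Algol 60, Algol 68, Pascal, Modula 3, Ada, C, <C++>, <Simula>, <Smalltalk>, <Java>, Dylan, <Ruby>, <Perl>, <Python>, <C#>, <Haskell>, <JavaScript>.]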

Type Inference / Type Reconstruction

Chapter 17

Type Reconstruction

For technical reasons, this chapter uses a slightly different definition of substitution from what we had before. This should be changed to correspond exactly to the earlier notion. Aside from that, it’s essentially finished.

The present chapter does not mention let-polymorphism. I think that’s a shame and I’ve tried to figure out how to put in something about it, but it’s apparently quite hard to do it in a rigorous way without going on for page after page of technicalities.

Given an explicitly typed term in the simply typed lambda-calculus, we have seen an algorithm for determining its type, if it has one. In this chapter, we develop a more powerful type reconstruction algorithm, capable of calculating a principal type for a term in which some of the explicit type annotations are replaced by variables. Similar algorithms lie at the heart of languages like ML and Haskell.

The term “type inference” is often used instead of “type reconstruction.”

17.1 Substitution

We will work in this chapter with the simply typed lambda-calculus with numbers and an infinite collection of atomic types. When we saw them in Section 8.1, these atomic types were completely uninterpreted. Now we are going to think of them as type variables, i.e., placeholders standing for other types. In order to make this idea precise, we need to define what it means to substitute arbitrary types for type variables.

17.1.1 Definition: A type substitution (or, for purposes of this chapter, just substitution) is a finite function from type variables to types. For example, we write [X ↦ T, Y ↦ U] for the substitution that maps X to T and Y to U and is undefined on other arguments.


Principal  Typings              

January 15, 2000 17. TYPE RECONSTRUCTION 128

The main data structure needed for this exercise is a representation of substitutions. There are many alternatives; one simple one is to reuse the datatype from Exercise 17.3.10: a substitution is just a constraint set, all of whose left-hand sides are unification variables. Given a function that performs substitution of a type for a single type variable, application of a whole substitution to a type can be defined by applying that function in turn for each constraint in the set.

(Solution on page 258.)

17.5 Principal Typings

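17.5.1 Definition: A principal typing for a term (in a given context) is a typing such that, whenever there is another typing for the same term, that other typing can be obtained from the principal one by a further substitution. In other words, a principal typing is a most general one.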

17.5.2 Exercise [Quick check]: Find a principal typing for

17.5.3 Theorem [Principal typing]: If a term has any typing, then it has a principal one. Moreover, the unification algorithm can be used to determine whether the term has a typing and, if so, to return a principal one.

Proof: Immediate from the definition of typing and the properties of unification.

17.5.4 Corollary: It is decidable whether a term has a typing.

Proof: By Corollary 17.3.8 and Theorem 17.5.3.

17.5.5 Exercise [Recommended]: Use the implementations of constraint generation (Exercise 17.3.10) and unification (Exercise 17.4.7) to construct a running typechecker that calculates principal typings, using the checker provided in the course directory as a starting point.



Notice: in terms of principal types, the following slides include experiments run on a 64-bit Intel architecture computer.

Learn O-O from a C/C++ perspective (Principal types)

144 Thinking in C++ www.BruceEckel.com

// Demonstrates the use of specifiers
#include <iostream>
using namespace std;

int main() {
  char c;
  unsigned char cu;
  int i;
  unsigned int iu;
  short int is;
  short iis; // Same as short int
  unsigned short int isu;
  unsigned short iisu;
  long int il;
  long iil; // Same as long int
  unsigned long int ilu;
  unsigned long iilu;
  float f;
  double d;
  long double ld;
  cout
    << "\n char= " << sizeof(c)
    << "\n unsigned char = " << sizeof(cu)
    << "\n int = " << sizeof(i)
    << "\n unsigned int = " << sizeof(iu)
    << "\n short = " << sizeof(is)
    << "\n unsigned short = " << sizeof(isu)
    << "\n long = " << sizeof(il)
    << "\n unsigned long = " << sizeof(ilu)
    << "\n float = " << sizeof(f)
    << "\n double = " << sizeof(d)
    << "\n long double = " << sizeof(ld)
    << endl;
} ///:~

Be aware that the results you get by running this program will probably be different from one machine/operating system/compiler to the next, since (as mentioned previously) the only thing that must be consistent is that each different type hold the minimum and maximum values specified in the Standard.

When you are modifying an int with short or long, the keyword int is optional, as shown above.

 char= 1
 unsigned char = 1
 int = 4
 unsigned int = 4
 short = 2
 unsigned short = 2
 long = 8
 unsigned long = 8
 float = 4
 double = 8
 long double = 16
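The sizes above are specific to the lecturer's machine. As a rough sketch of what does not vary, the following C++11 fragment spells out a few guarantees that hold on every conforming implementation. The function name bits_in_int is made up for this illustration.

```cpp
#include <limits>

// Properties guaranteed by the language, independent of the machine:
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(short) <= sizeof(int), "short is no larger than int");
static_assert(sizeof(int) <= sizeof(long), "int is no larger than long");
static_assert(std::numeric_limits<short>::max() >= 32767, "short holds at least 16 bits");
static_assert(std::numeric_limits<int>::max() >= 32767, "int holds at least 16 bits");
static_assert(std::numeric_limits<long>::max() >= 2147483647L, "long holds at least 32 bits");

// Value bits plus the sign bit of int on the machine this is compiled for.
int bits_in_int() { return std::numeric_limits<int>::digits + 1; }
```

If any of the static_assert lines failed, the file would not compile. On the 64-bit machine used in these slides bits_in_int() would be expected to report 32, but that figure is an observation about one platform, not a guarantee.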

Pointers  and  References  (1)  

Whenever you run a program, it is first loaded (typically from disk) into the computer’s memory. Thus, all elements of your program are located somewhere in memory. Memory is typically laid out as a sequential series of memory locations; we usually refer to these locations as eight-bit bytes but actually the size of each space depends on the architecture of the particular machine and is usually called that machine’s word size. Each space can be uniquely distinguished from all other spaces by its address. For the purposes of this discussion, we’ll just say that all machines use bytes that have sequential addresses starting at zero and going up to however much memory you have in your computer.

Pointers  and  References  (2)  


called; in main( ) the argument is x, which has a value of 47, so this value is copied into a when f( ) is called.

When you run this program you’ll see:

x = 47
a = 47
a = 5
x = 47

Initially, of course, x is 47. When f( ) is called, temporary space is created to hold the variable a for the duration of the function call, and a is initialized by copying the value of x, which is verified by printing it out. Of course, you can change the value of a and show that it is changed. But when f( ) is completed, the temporary space that was created for a disappears, and we see that the only connection that ever existed between a and x happened when the value of x was copied into a.
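The program being described is not reproduced on this slide. A minimal sketch that is consistent with the output and the explanation above might look like the following; the driver name run_by_value is an assumption made for this illustration, standing in for main( ).

```cpp
#include <iostream>

// f receives a copy of its argument; changing a does not touch the caller's variable.
void f(int a) {
    std::cout << "a = " << a << std::endl;
    a = 5;  // modifies only the local copy
    std::cout << "a = " << a << std::endl;
}

// Hypothetical driver that plays the role of main() in the description.
int run_by_value() {
    int x = 47;
    std::cout << "x = " << x << std::endl;
    f(x);   // the value of x is copied into a
    std::cout << "x = " << x << std::endl;
    return x;  // unchanged by the call
}
```

Calling run_by_value() from main( ) prints the four lines shown above and returns 47, since the assignment inside f( ) never reaches x.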

When you’re inside f( ), x is the outside object (my terminology), and changing the local variable does not affect the outside object, naturally enough, since they are two separate locations in storage. But what if you do want to modify the outside object? This is where pointers come in handy. In a sense, a pointer is an alias for another variable. So if we pass a pointer into a function instead of an ordinary value, we are actually passing an alias to the outside object, enabling the function to modify that outside object, like this:

//: C03:PassAddress.cpp
#include <iostream>
using namespace std;

void f(int* p) {
  cout << "p = " << p << endl;
  cout << "*p = " << *p << endl;
  *p = 5;
  cout << "p = " << p << endl;
}

int main() {
  int x = 47;
  cout << "x = " << x << endl;
  cout << "&x = " << &x << endl;
  f(&x);
  cout << "x = " << x << endl;
} ///:~

Now f( ) takes a pointer as an argument and dereferences the pointer during assignment, and this causes the outside object x to be modified. The output is:

x = 47
&x = 0065FE00
p = 0065FE00
*p = 47
p = 0065FE00
x = 5

Notice that the value contained in p is the same as the address of x – the pointer p does indeed point to x. If that isn’t convincing enough, when p is dereferenced to assign the value 5, we see that the value of x is now changed to 5 as well.

Thus, passing a pointer into a function will allow that function to modify the outside object. You’ll see plenty of other uses for pointers later, but this is arguably the most basic and possibly the most common use.

Introduction to C++ references

Pointers work roughly the same in C and in C++, but C++ adds an additional way to pass an address into a function. This is pass-by-reference and it exists in several other programming languages so it was not a C++ invention.

Your initial perception of references may be that they are unnecessary, that you could write all your programs without references. In general, this is true, with the exception of a few important places that you’ll learn about later in the book. You’ll also learn more about references later, but the basic idea is the same

x = 47
&x = 0x7fff5aff914
p = 0x7fff5aff914
*p = 47
p = 0x7fff5aff914
x = 5
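For comparison with the pointer version, here is a small sketch of the pass-by-reference style introduced above. It is written for this summary rather than copied from the book, and the driver name run_by_reference is hypothetical.

```cpp
#include <iostream>

// r is a reference: another name for the caller's variable, not a copy.
void f(int& r) {
    std::cout << "r = " << r << std::endl;
    r = 5;  // modifies the outside object directly, no dereference needed
    std::cout << "r = " << r << std::endl;
}

// Hypothetical driver that plays the role of main().
int run_by_reference() {
    int x = 47;
    f(x);      // looks like pass-by-value at the call site
    return x;  // the caller's x has been changed by f
}
```

The call site does not need the & operator and the function body does not need *, yet the effect on x is the same as in the pointer version: run_by_reference() returns 5.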

Composite type creation

The fundamental data types and their variations are essential, but rather primitive. C and C++ provide tools that allow you to compose more sophisticated data types from the fundamental data types. As you’ll see, the most important of these is struct, which is the foundation for class in C++. However, the simplest way to create more sophisticated types is simply to alias a name to another name via typedef.

Aliasing names with typedef

This keyword promises more than it delivers: typedef suggests “type definition” when “alias” would probably have been a more accurate description, since that’s what it really does. The syntax is:

typedef existing-type-description alias-name

People often use typedef when data types get slightly complicated, just to prevent extra keystrokes. Here is a commonly-used typedef:

typedef unsigned long ulong;

Now if you say ulong the compiler knows that you mean unsigned long. You might think that this could as easily be accomplished using preprocessor substitution, but there are key situations in which the compiler must be aware that you’re treating a name as if it were a type, so typedef is essential.

One place where typedef comes in handy is for pointer types. As previously mentioned, if you say:

int* x, y;

This actually produces an int* which is x and an int (not an int*) which is y. That is, the ‘*’ binds to the right, not the left. However, if you use a typedef:

typedef int* IntPtr;
IntPtr x, y;

Then both x and y are of type int*.

You can argue that it’s more explicit and therefore more readable to avoid typedefs for primitive types, and indeed programs rapidly become difficult to read when many typedefs are used. However, typedefs become especially important in C when used with struct.

Combining variables with struct

A struct is a way to collect a group of variables into a structure. Once you create a struct, then you can make many instances of this “new” type of variable you’ve invented. For example:

//: C03:SimpleStruct.cpp
struct Structure1 {
  char c;
  int i;
  float f;
  double d;
};

int main() {
  struct Structure1 s1, s2;
  s1.c = 'a'; // Select an element using a '.'
  s1.i = 1;
  s1.f = 3.14;
  s1.d = 0.00093;
  s2.c = 'a';
  s2.i = 1;
  s2.f = 3.14;
  s2.d = 0.00093;
} ///:~

The struct declaration must end with a semicolon. In main( ), two instances of Structure1 are created: s1 and s2. Each of these has their own separate versions of c, i, f, and d. So s1 and s2 represent clumps of completely independent variables. To select one of the elements within s1 or s2, you use a ‘.’, syntax you’ve seen in the previous chapter when using C++ class objects. Since classes evolved from structs, this is where that syntax arose from.

One thing you’ll notice is the awkwardness of the use of Structure1 (as it turns out, this is only required by C, not C++). In C, you can’t just say Structure1 when you’re defining variables, you must say struct Structure1. This is where typedef becomes especially handy in C:

//: C03:SimpleStruct2.cpp
// Using typedef with struct
typedef struct {
  char c;
  int i;
  float f;
  double d;
} Structure2;

int main() {
  Structure2 s1, s2;
  s1.c = 'a';
  s1.i = 1;
  s1.f = 3.14;
  s1.d = 0.00093;
  s2.c = 'a';
  s2.i = 1;
  s2.f = 3.14;
  s2.d = 0.00093;
} ///:~

By using typedef in this way, you can pretend (in C; try removing the typedef for C++) that Structure2 is a built-in type, like int or float, when you define s1 and s2 (but notice it only has data, that is, characteristics, and does not include behavior, which is what we get with real objects in C++).

Abstract  data  typing  (ADT)    

The ability to package data with functions allows you to create a new data type. This is often called encapsulation. An existing data type may have several pieces of data packaged together. For example, a float has an exponent, a mantissa, and a sign bit. You can tell it to do things: add to another float or to an int, and so on. It has characteristics and behavior.
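As a minimal sketch of packaging data with functions, consider the following struct. The type Counter and the helper count_to_three are invented for this illustration and do not appear in the lecture.

```cpp
// A small abstract data type: the data (count) and the operations on it
// live together in one struct. The name Counter is illustrative only.
struct Counter {
    int count;                         // characteristic (data)
    void reset() { count = 0; }        // behavior
    void increment() { ++count; }      // behavior
    int value() const { return count; }
};

// Hypothetical helper showing how the new type is used.
int count_to_three() {
    Counter c;
    c.reset();
    c.increment();
    c.increment();
    c.increment();
    return c.value();
}
```

Just as you can tell a float to add itself to another float, you tell a Counter to reset or increment itself; count_to_three() returns 3.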

Sizeof (ADT)

#include <iostream>
using namespace std;

struct A { int i[100]; };
struct B { void f(); };
void B::f() {}

int main() {
  cout << "sizeof struct A = " << sizeof(A) << " bytes" << endl;
  cout << "sizeof struct B = " << sizeof(B) << " bytes" << endl;
} ///:~

sizeof struct A = 400 bytes
sizeof struct B = 1 bytes

Each int occupies 4 bytes. struct B is something of an anomaly because it is a struct with no data members. In C, this is illegal, but in C++ we need the option of creating a struct whose sole task is to scope function names, so it is allowed. Still, the result produced by the second print statement is a somewhat surprising nonzero value. In early versions of the language, the size was zero, but an awkward situation arises when you create such objects: They have the same address as the object created directly after them, and so are not distinct. One of the fundamental rules of objects is that each object must have a unique address, so structures with no data members will always have some minimum nonzero size. The last two sizeof statements show you that the size of the structure in C++ is the same as the size of the equivalent version in C. C++ tries not to add any unnecessary overhead.

sizeof measures the data members of a composite data structure, not its member functions.
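A quick way to see this is to compare a struct that has only data with one that has the same data plus (non-virtual) member functions. The struct names below are made up for this sketch, and the equality of the two sizes is what common compilers produce rather than something quoted from the slides.

```cpp
#include <cstddef>

struct DataOnly {
    int i[100];
};

struct DataAndFunctions {
    int i[100];
    void clear() { for (int& x : i) x = 0; }  // functions live in the code area,
    int first() const { return i[0]; }        // not inside each object
};

struct Empty {
    void f() {}  // no data members at all
};

// Report the sizes so they can be printed or compared.
std::size_t size_data_only()          { return sizeof(DataOnly); }
std::size_t size_data_and_functions() { return sizeof(DataAndFunctions); }
std::size_t size_empty()              { return sizeof(Empty); }
```

On typical implementations the first two functions return the same number, while size_empty() is small but never zero, matching the unique-address rule described above.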

Scoping    

Scoping rules tell you where a variable is valid, where it is created, and where it gets destroyed (i.e., goes out of scope). The scope of a variable extends from the point where it is defined to the first closing brace that matches the closest opening brace before the variable was defined. That is, a scope is defined by its “nearest” set of braces.
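The rule can be traced in a short sketch. The function name scope_demo is an assumption for this example.

```cpp
int scope_demo() {
    int outer = 1;        // in scope until the closing brace of scope_demo
    {
        int inner = 2;    // in scope only inside this inner pair of braces
        outer += inner;   // outer is still visible here
    }                     // inner goes out of scope here
    // inner = 3;         // would not compile: inner no longer exists
    return outer;
}
```

Here scope_demo() returns 3. Uncommenting the assignment to inner after the inner closing brace produces a compile-time error, because the nearest enclosing braces of inner have already closed.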

access control (1)

C++ introduces three new keywords to set the boundaries in a structure: public, private, and protected. Their use and meaning are remarkably straightforward. These access specifiers are used only in a structure declaration, and they change the boundary for all the declarations that follow them. Whenever you use an access specifier, it must be followed by a colon.

public means all member declarations that follow are available to everyone. public members are like struct members. For example, a struct declaration that begins with public: is identical to the same declaration without it.
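A sketch of two such identical declarations follows. The member names and the helper use_both are chosen for illustration only.

```cpp
struct A {
    int i;
    char j;
    float f;
    void func();
};
void A::func() {}

struct B {
public:          // redundant in a struct: members are public by default
    int i;
    char j;
    float f;
    void func();
};
void B::func() {}

// Hypothetical helper: both structs are used in exactly the same way.
int use_both() {
    A a;
    B b;
    a.i = 1;
    b.i = 2;
    a.func();
    b.func();
    return a.i + b.i;
}
```

Every member of A and of B is reachable from outside the struct, so use_both() compiles and returns 3.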

access control (2)

struct B {
private:
  char j;
  float f;
public:
  int i;
  void func();
};

void B::func() {
  i = 0;
  j = '0';
  f = 0.0;
}

int main() {
  B b;
  b.i = 1;
  return 0;
}

protected acts just like private, with one exception that we can’t really talk about right now: “Inherited” structures (which cannot access private members) are granted access to protected members.
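Looking ahead to the inheritance slides, the difference between private and protected can be sketched as follows. The class names Base and Derived and all member names are hypothetical.

```cpp
class Base {
private:
    int secret = 1;   // only Base's own members can touch this
protected:
    int shared = 2;   // Base and classes derived from it
public:
    int open = 3;     // everyone
    int peek() const { return secret; }  // Base itself may read its private data
};

class Derived : public Base {
public:
    int sum() const {
        // return secret;       // would not compile: private in Base
        return shared + open;   // OK: protected and public members are accessible
    }
};
```

Code outside both classes can read open and call peek() or sum(), but it cannot name secret or shared directly.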

The  class          

In the original OOP language, Simula-67, the keyword class was used to describe a new data type. This apparently inspired Stroustrup to choose the same keyword for C++, to emphasize that this was the focal point of the whole language: the creation of new data types that are more than just C structs with functions. This certainly seems like adequate justification for a new keyword. However, the use of class in C++ comes close to being an unnecessary keyword. It’s identical to the struct keyword in absolutely every way except one: class defaults to private, whereas struct defaults to public. Here are two structures that produce the same result:

The Class (2)

struct A {
private:
  int i, j, k;
public:
  int f();
  void g();
};

int A::f() {
  return i + j + k;
}

void A::g() {
  i = j = k = 0;
}

class B {
  int i, j, k;
public:
  int f();
  void g();
};

int B::f() {
  return i + j + k;
}

void B::g() {
  i = j = k = 0;
}

int main() {
  A a;
  B b;
  a.f();
  a.g();
  b.f();
  b.g();
}

A == B? Yes

Initialization & Cleanup

Two of these safety issues are initialization and cleanup. A large segment of C bugs occur when the programmer forgets to initialize or clean up a variable. This is especially true with C libraries, when client programmers don’t know how to initialize a struct, or even that they must. (Libraries often do not include an initialization function, so the client programmer is forced to initialize the struct by hand.) Cleanup is a special problem because C programmers are comfortable with forgetting about variables once they are finished, so any cleaning up that may be necessary for a library’s struct is often missed.

We already have struct, so why should we have class instead?

Guaranteed initialization with the constructor

class X {
public:
  int i;
  X(); // Constructor
};

X::X() : i(0) {} // Definition added so that the example links

Now, when an object is defined,

int main() {
  X a;
}

the same thing happens as if a were an int: storage is allocated for the object. But when the program reaches the sequence point (point of execution) where a is defined, the constructor is called automatically. That is, the compiler quietly inserts the call to X::X( ) for the object a at the point of definition. Like any member function, the first (secret) argument to the constructor is the this pointer, the address of the object for which it is being called. In the case of the constructor, however, this is pointing to an un-initialized block of memory, and it’s the job of the constructor to initialize this memory properly.

Guaranteed cleanup with the destructor

class Y {
public:
  ~Y();
};

The destructor is called automatically by the compiler when the object goes out of scope. You can see where the constructor gets called by the point of definition of the object, but the only evidence for a destructor call is the closing brace of the scope that surrounds the object. Yet the destructor is still called, even when you use goto to jump out of a scope. (goto still exists in C++ for backward compatibility with C and for the times when it comes in handy.) You should note that a nonlocal goto, implemented by the Standard C library functions setjmp( ) and longjmp( ), doesn’t cause destructors to be called. (This is the specification, even if your compiler doesn’t implement it that way. Relying on a feature that isn’t in the specification means your code is nonportable.)
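The timing of the two automatic calls can be made visible with a small sketch. The class Tracker, the global event_log, and the function lifetime_demo are all invented for this illustration.

```cpp
#include <string>

std::string event_log;  // records the order of events (illustrative helper)

class Tracker {
public:
    Tracker()  { event_log += "ctor "; }  // runs at the point of definition
    ~Tracker() { event_log += "dtor "; }  // runs when the object goes out of scope
};

void lifetime_demo() {
    event_log += "before ";
    {
        Tracker t;              // constructor is called here
        event_log += "inside ";
    }                           // destructor is called at this closing brace
    event_log += "after";
}
```

After one call to lifetime_demo(), event_log reads "before ctor inside dtor after": nothing in the code calls the constructor or destructor explicitly, yet both run at the points described above.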

Default constructors

A default constructor is one that can be called with no arguments. A default constructor is used to create a “vanilla object,” but it’s also important when the compiler is told to create an object but isn’t given any details. For example, if you take the struct Y defined previously and use it in a definition like this:

Y y2[2] = { Y(1) };

the second array element has no initializer, so the compiler needs a default constructor for it and reports an error if Y does not have one.
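The slide does not show the struct Y being referred to. The sketch below assumes a version of Y with an int member, a one-argument constructor, and an explicitly written default constructor; all of these details are assumptions made for illustration.

```cpp
struct Y {
    int i;
    Y() : i(0) {}         // default constructor: callable with no arguments
    Y(int ii) : i(ii) {}  // one-argument constructor
};

// Hypothetical helper that inspects the array after it is built.
int second_element() {
    Y y2[2] = { Y(1) };   // y2[0] uses Y(int); y2[1] falls back to Y()
    return y2[1].i;
}
```

With the default constructor present, second_element() returns 0. Deleting the line that declares Y() makes the array definition fail to compile, which is the situation the paragraph above describes.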

Function Overloading & Default Arguments

void print(char);
void print(float);

It doesn’t matter whether they are both inside a class or at the global scope. The compiler can’t generate unique internal identifiers if it uses only the scope of the function names. You’d end up with _print in both cases. The idea of an overloaded function is that you use the same function name, but different argument lists. Thus, for overloading to work the compiler must decorate the function name with the names of the argument types. The functions above, defined at global scope, produce internal names that might look something like _print_char and _print_float. It’s worth noting there is no standard for the way names must be decorated by the compiler, so you will see very different results from one compiler to another. (You can see what it looks like by telling the compiler to generate assembly-language output.) This, of course, causes problems if you want to buy compiled libraries for a particular compiler and linker, but even if name decoration were standardized, there would be other roadblocks because of the way different compilers generate code.
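Here is a small sketch of the two overloads in use. To keep the result easy to check, these versions return a string naming the overload that was chosen instead of printing; that return type is an assumption of this example and differs from the void declarations on the slide.

```cpp
#include <string>

// Same function name, different argument lists.
std::string print(char)  { return "print(char)"; }
std::string print(float) { return "print(float)"; }
```

The compiler selects the overload from the argument type at each call: print('a') picks the char version and print(1.5f) picks the float version. A call such as print(1.5) with a double argument would be rejected as ambiguous, since neither overload is a better match.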

Overloading on return values

It’s common to wonder, “Why just scopes and argument lists? Why not return values?” It seems at first that it would make sense to also decorate the internal function name with the return type. Then you could overload on return values, as well:

void f();
int f();

This works fine when the compiler can unequivocally determine the meaning from the context, as in int x = f( );. However, in C you’ve always been able to call a function and ignore the return value (that is, you can call the function for its side effects). How can the compiler distinguish which call is meant in this case? Possibly worse is the difficulty the reader has in knowing which function call is meant. Overloading solely on return value is a bit too subtle, and thus isn’t allowed in C++.

static area (1): variables in functions

#include <iostream>
using namespace std;

char oneChar(const char* charArray = 0) {
  static const char* s; // Keeps its value between calls
  if(charArray) {
    s = charArray;
    return *s;
  }
  if(*s == '\0')
    return 0;
  return *s++;
}

const char* a = "abcdefghijklmnopqrstuvwxyz"; // const: a string literal must not be modified

int main() {
  oneChar(a); // Initializes s to a
  char c;
  while ((c = oneChar()) != 0) {
    cout << c << endl;
  }
}

a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t  u  v  w  x  y  z  
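The same mechanism can be seen in an even smaller sketch. The function next_id is made up for this example.

```cpp
int next_id() {
    static int id = 0;  // initialized once; the storage lives in the static data area
    return ++id;        // the updated value survives until the next call
}
```

Successive calls return 1, 2, 3, and so on. Without the static keyword, id would be re-created on the stack and reset to 0 on every call, so the function would always return 1.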

static class objects inside functions

#include <iostream>
using namespace std;

class X {
  int i;
public:
  X(int ii = 0) : i(ii) {} // Default
  ~X() { cout << "X::~X()" << endl; }
};

void f() {
  static X x1(47);
  static X x2; // Default constructor required
}

int main() {
  f();
} ///:~

X::~X()
X::~X()

Take  a  look  at  scope    



all functionality inline. In main( ), the call to bigfun( ) starts as you might guess – the entire contents of B is pushed on the stack. (Here, you might see some compilers load registers with the address of the Big and its size, then call a helper function to push the Big onto the stack.)

In the previous code fragment, pushing the arguments onto the stack was all that was required before making the function call. In PassingBigStructures.cpp, however, you’ll see an additional action: the address of B2 is pushed before making the call, even though it’s obviously not an argument. To comprehend what’s going on here, you need to understand the constraints on the compiler when it’s making a function call.

Function-call stack frame

When the compiler generates code for a function call, it first pushes all the arguments on the stack, then makes the call. Inside the function, code is generated to move the stack pointer down even farther to provide storage for the function’s local variables. (“Down” is relative here; your machine may increment or decrement the stack pointer during a push.) But during the assembly-language CALL, the CPU pushes the address in the program code where the function call came from, so the assembly-language RETURN can use that address to return to the calling point. This address is of course sacred, because without it your program will get completely lost. Here’s what the stack frame looks like after the CALL and the allocation of local variable storage in the function:

Function arguments
Return address
Local variables

Copy-construction

The compiler makes an assumption about how to create a new object from an existing object. The compiler’s assumption is that you want to perform this creation using a bitcopy, and in many cases this may work fine. A common example where it does not occurs if the class contains pointers: what do they point to, and should you copy them or should they be connected to some new piece of memory?

Fortunately, you can intervene in this process and prevent the compiler from doing a bitcopy. You do this by defining your own function to be used whenever the compiler needs to make a new object from an existing object. Logically enough, you’re making a new object, so this function is a constructor, and also logically enough, the single argument to this constructor has to do with the object you’re constructing from. But that object can’t be passed into the constructor by value because you’re trying to define the function that handles passing by value, and syntactically it doesn’t make sense to pass a pointer because, after all, you’re creating the new object from an existing object. Here, references come to the rescue, so you take the reference of the source object. This function is called the copy-constructor and is often referred to as X(X&), which is its appearance for a class called X.
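The pointer case mentioned above can be sketched with a class that owns a character array. The class Buffer and the function by_value are hypothetical, and the const on the copy-constructor's parameter is a common convention added here rather than something stated on the slide.

```cpp
#include <cstddef>
#include <cstring>

class Buffer {
    char* data;
    std::size_t len;
public:
    explicit Buffer(const char* s)
        : data(new char[std::strlen(s) + 1]), len(std::strlen(s)) {
        std::strcpy(data, s);
    }
    // Copy-constructor: allocate fresh memory instead of copying the pointer bits.
    Buffer(const Buffer& other)
        : data(new char[other.len + 1]), len(other.len) {
        std::strcpy(data, other.data);
    }
    Buffer& operator=(const Buffer&) = delete;  // assignment left out of this sketch
    ~Buffer() { delete[] data; }

    void set_first(char c) { data[0] = c; }
    char first() const { return data[0]; }
    bool shares_storage_with(const Buffer& o) const { return data == o.data; }
};

// Passing by value invokes the copy-constructor to build the parameter b.
char by_value(Buffer b) {
    b.set_first('X');  // changes only the copy
    return b.first();
}
```

With this copy-constructor, a Buffer passed to by_value( ) is unaffected by the change made inside the function. With a plain bitcopy both objects would hold the same pointer, so the change would leak into the original and both destructors would try to delete the same memory.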

Inheritance

class A {
  int i;
public:
  A(int ii) : i(ii) {}
  ~A() {}
  void f() const {}
};

class B {
  int i;
public:
  B(int ii) : i(ii) {}
  ~B() {}
  void f() const {}
};

class C : public B {
  A a;
public:
  C(int ii) : B(ii), a(ii) {}
  ~C() {} // Calls ~A() and ~B()
  void f() const { // Redefinition
    a.f();
    B::f();
  }
};

int main() {
  C c(47);
} ///:~
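The comment "Calls ~A() and ~B()" in the example can be made visible by logging each constructor and destructor. The sketch below is an illustration with hypothetical names (events, Part, Base, Derived, lifetime_trace, check_order): Part stands in for the member class A, Base for the base class B, and Derived for C.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log used only for this illustration.
std::vector<std::string>& events() {
  static std::vector<std::string> log;
  return log;
}

struct Part {                      // plays the role of member class A
  Part()  { events().push_back("Part()"); }
  ~Part() { events().push_back("~Part()"); }
};

struct Base {                      // plays the role of base class B
  Base()  { events().push_back("Base()"); }
  ~Base() { events().push_back("~Base()"); }
};

struct Derived : public Base {     // plays the role of C
  Part part;
  Derived()  { events().push_back("Derived()"); }
  ~Derived() { events().push_back("~Derived()"); }
};

// Creates and destroys one Derived object and returns what was logged.
std::vector<std::string> lifetime_trace() {
  events().clear();
  { Derived d; }                   // d is destroyed at the closing brace
  return events();
}

// Base first, then members, then the derived body; destruction reverses it.
void check_order() {
  const std::vector<std::string> expected = {
    "Base()", "Part()", "Derived()", "~Derived()", "~Part()", "~Base()"
  };
  assert(lifetime_trace() == expected);
}
```

Calling check_order() from main confirms that the base-class and member destructors run automatically after the derived destructor body, in the reverse of construction order.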

Polymorphism (overriding)

#include <iostream>
using namespace std;

enum note { middleC, Csharp, Eflat }; // Etc.

class Instrument {
public:
  void play(note) const {
    cout << "Instrument::play" << endl;
  }
};

// Wind objects are Instruments
// because they have the same interface:
class Wind : public Instrument {
public:
  // Redefine interface function:
  void play(note) const {
    cout << "Wind::play" << endl;
  }
};

void tune(Instrument& i) {
  i.play(middleC);
}

int main() {
  Wind flute;
  tune(flute); // Upcasting
} ///:~

Output: Instrument::play

Polymorphism (virtual function)

#include <iostream>
using namespace std;

enum note { middleC, Csharp, Cflat }; // Etc.

class Instrument {
public:
  virtual void play(note) const {
    cout << "Instrument::play" << endl;
  }
};

// Wind objects are Instruments
// because they have the same interface:
class Wind : public Instrument {
public:
  // Override interface function:
  void play(note) const {
    cout << "Wind::play" << endl;
  }
};

void tune(Instrument& i) {
  i.play(middleC);
}

int main() {
  Wind flute;
  tune(flute); // Upcasting
} ///:~

Output: Wind::play

How is it implemented?

672 Thinking in C++ www.BruceEckel.com

“dummy” member. Try commenting out the int a in all the classes in the example above to see this.

Picturing virtual functions

To understand exactly what’s going on when you use a virtual function, it’s helpful to visualize the activities going on behind the curtain. Here’s a drawing of the array of pointers A[ ] in Instrument4.cpp:

Array of Instrument pointers A[ ]: each element points to one of the objects below.

Objects:                     VTABLEs:
Wind object        vptr ->   &Wind::play, &Wind::what, &Wind::adjust
Percussion object  vptr ->   &Percussion::play, &Percussion::what, &Percussion::adjust
Stringed object    vptr ->   &Stringed::play, &Stringed::what, &Stringed::adjust
Brass object       vptr ->   &Brass::play, &Brass::what, &Wind::adjust

The array of Instrument pointers has no specific type information; they each point to an object of type Instrument. Wind, Percussion, Stringed, and Brass all fit into this category because they are derived from Instrument (and thus have the same interface as Instrument, and can respond to the same messages), so their addresses can also be placed into the array. However, the compiler doesn’t know that they are anything more than Instrument objects, so left to its own devices it would normally call the base-class versions of all the functions. But in this case, all those functions have been declared with the virtual keyword, so something different happens.
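One way to picture the vptr and VTABLE arrangement is to build it by hand with function pointers. The sketch below is an emulation under stated assumptions, not what any particular compiler emits: the struct VTable, the free functions, the make_ helpers, and demo_dispatch are hypothetical names, and only two of the three slots from the drawing (play and adjust) are modelled. As in the drawing, the Brass table reuses the Wind entry for adjust.

```cpp
#include <cassert>
#include <string>

struct Instrument;                       // forward declaration

// One table of function pointers per class (the "VTABLE").
struct VTable {
  std::string (*play)(const Instrument*);
  std::string (*adjust)(const Instrument*);
};

// Every object carries a pointer to its class's table (the "vptr").
struct Instrument {
  const VTable* vptr;
};

std::string instrument_play(const Instrument*)   { return "Instrument::play"; }
std::string instrument_adjust(const Instrument*) { return "Instrument::adjust"; }
std::string wind_play(const Instrument*)         { return "Wind::play"; }
std::string wind_adjust(const Instrument*)       { return "Wind::adjust"; }
std::string brass_play(const Instrument*)        { return "Brass::play"; }

const VTable instrument_vtable = { instrument_play, instrument_adjust };
const VTable wind_vtable       = { wind_play, wind_adjust };
// Brass does not supply its own adjust, so its slot holds the Wind version.
const VTable brass_vtable      = { brass_play, wind_adjust };

// "Constructors": each one stores the address of the right table.
Instrument make_instrument() { return Instrument{ &instrument_vtable }; }
Instrument make_wind()       { return Instrument{ &wind_vtable }; }
Instrument make_brass()      { return Instrument{ &brass_vtable }; }

// Roughly what a virtual call i.play(...) turns into:
// fetch the vptr, index the table, call through the pointer.
std::string tune(const Instrument& i) {
  return i.vptr->play(&i);
}

void demo_dispatch() {
  assert(tune(make_wind()) == "Wind::play");
  Instrument brass = make_brass();
  assert(brass.vptr->adjust(&brass) == "Wind::adjust");
}
```

The function tune knows only the type Instrument, yet the call lands in the right function because the choice is made through the table stored in the object at run time.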

Template  syntax      


uncomfortable with inheritance can still use canned container classes right away (as we’ve been doing with vector throughout the book).

The template keyword tells the compiler that the class definition that follows will manipulate one or more unspecified types. At the time the actual class code is generated from the template, those types must be specified so that the compiler can substitute them.

To demonstrate the syntax, here’s a small example that produces a bounds-checked array:

//: C16:Array.cpp
#include "../require.h"
#include <iostream>
using namespace std;

template<class T>
class Array {
  enum { size = 100 };
  T A[size];
public:
  T& operator[](int index) {
    require(index >= 0 && index < size,
      "Index out of range");
    return A[index];
  }
};

int main() {
  Array<int> ia;
  Array<float> fa;
  for(int i = 0; i < 20; i++) {
    ia[i] = i * i;
    fa[i] = float(i) * 1.414;
  }
  for(int j = 0; j < 20; j++)
    cout << j << ": " << ia[j]
         << ", " << fa[j] << endl;
} ///:~

You can see that it looks like a normal class except for the line

template<class T>

which says that T is the substitution parameter, and that it represents a type name. Also, you see T used everywhere in the class where you would normally see the specific type the container holds.

In Array, elements are inserted and extracted with the same function: the overloaded operator [ ] . It returns a reference, so it can be used on both sides of an equal sign (that is, as both an lvalue and an rvalue). Notice that if the index is out of bounds, the require( ) function is used to print a message. Since operator[] is an inline, you could use this approach to guarantee that no array-bounds violations occur, then remove the require( ) for the shipping code.

In main( ), you can see how easy it is to create Arrays that hold different types of objects. When you say

Array<int> ia; Array<float> fa;

the compiler expands the Array template (this is called instantiation) twice, to create two new generated classes, which you can think of as Array_int and Array_float. (Different compilers may decorate the names in different ways.) These are classes just like the ones you would have produced if you had performed the substitution by hand, except that the compiler creates them for you as you define the objects ia and fa. Also note that duplicate class definitions are either avoided by the compiler or merged by the linker.
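The point that each instantiation is a separate generated class can be checked at compile time. The following sketch is a variation on Array, not the book's code: it assumes a second template parameter N for the size, and it throws std::out_of_range instead of calling require( ), since require.h is a helper header from the book that may not be available. The function rejects_bad_index is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// Variation on the Array template: the size is a template parameter
// (an assumption of this sketch) and bad indexes raise an exception.
template<class T, std::size_t N = 100>
class Array {
  T a_[N];
public:
  T& operator[](std::size_t index) {
    if (index >= N) throw std::out_of_range("Index out of range");
    return a_[index];
  }
  std::size_t size() const { return N; }
};

// Each instantiation is its own class: Array<int> and Array<float>
// are unrelated types generated from the same template.
static_assert(!std::is_same<Array<int>, Array<float>>::value,
              "different type arguments give different classes");

// Returns true if indexing one past the end is reported.
bool rejects_bad_index() {
  Array<int, 4> a;
  try {
    a[4] = 1;                      // valid indexes are 0..3
  } catch (const std::out_of_range&) {
    return true;
  }
  return false;
}

void demo_array() {
  Array<float, 8> fa;
  fa[3] = 1.5f;
  assert(fa[3] == 1.5f);           // operator[] works as lvalue and rvalue
  assert(fa.size() == 8);
  assert(rejects_bad_index());
}
```

Because operator[] returns a reference, the same function serves for both storing and reading an element, as the text describes.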

Multiple Inheritance

#include <fstream>
using namespace std;
ofstream out("mithis.out");

class Base1 {
  char c[0x10];
public:
  void printthis1() {
    out << "Base1 this = " << this << endl;
  }
};

class Base2 {
  char c[0x10];
public:
  void printthis2() {
    out << "Base2 this = " << this << endl;
  }
};

class Member1 {
  char c[0x10];
public:
  void printthism1() {
    out << "Member1 this = " << this << endl;
  }
};

class Member2 {
  char c[0x10];
public:
  void printthism2() {
    out << "Member2 this = " << this << endl;
  }
};

class MI : public Base1, public Base2 {
  Member1 m1;
  Member2 m2;
public:
  void printthis() {
    out << "MI this = " << this << endl;
    printthis1(); printthis2();
    m1.printthism1(); m2.printthism2();
  }
};

int main() {
  MI mi;
  out << "sizeof(mi) = " << hex << sizeof(mi) << " hex" << endl;
  mi.printthis();
  Base1* b1 = &mi; // Upcast
  Base2* b2 = &mi; // Upcast
  out << "Base 1 pointer = " << b1 << endl;
  out << "Base 2 pointer = " << b2 << endl;
} ///:~

Output:

sizeof(mi) = 40 hex
MI this = 0x223e
Base1 this = 0x223e
Base2 this = 0x224e
Member1 this = 0x225e
Member2 this = 0x226e
Base 1 pointer = 0x223e
Base 2 pointer = 0x224e
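The output shows that the Base2 pointer differs from the address of the whole object, which means the compiler adjusts the address during the upcast. A minimal sketch of that adjustment follows, using stripped-down versions of the classes with no member functions. The helpers base2_offset and upcasts_round_trip are hypothetical names, and the exact offset is an implementation detail: the C++ standard does not fix the order of base-class subobjects, so the sketch only checks properties that hold on any layout.

```cpp
#include <cassert>
#include <cstddef>

struct Base1 { char c[0x10]; };
struct Base2 { char c[0x10]; };
struct MI : public Base1, public Base2 {};

// Byte distance from the start of the whole object to its Base2 subobject.
std::ptrdiff_t base2_offset(MI& mi) {
  Base2* b2 = &mi;                 // implicit upcast; the address may change
  return reinterpret_cast<char*>(b2) - reinterpret_cast<char*>(&mi);
}

// Upcasting and then downcasting must give back the original address,
// so the compiler has to undo whatever adjustment it applied.
bool upcasts_round_trip() {
  MI mi;
  Base1* b1 = &mi;
  Base2* b2 = &mi;
  assert(static_cast<MI*>(b1) == &mi);
  assert(static_cast<MI*>(b2) == &mi);
  // The two base subobjects occupy different storage inside mi.
  return static_cast<void*>(b1) != static_cast<void*>(b2);
}
```

In the lecture's output the Base2 address is 0x10 bytes past the object's own address, which matches the size of the Base1 subobject placed in front of it on that compiler.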

Run-time type identification (RTTI)

There are two different ways to use RTTI. The first acts like sizeof( ) because it looks like a function, but it’s actually implemented by the compiler. typeid( ) takes an argument that’s an object, a reference, or a pointer and returns a reference to a global const object of type type_info. These can be compared to each other with the operator== and operator!=, and you can also ask for the name( ) of the type, which returns a string representation of the type name.

The second syntax for RTTI is called a “type-safe downcast.” The reason for the term “downcast” is (again) the historical arrangement of the class hierarchy diagram.
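Both forms of RTTI can be sketched in a few lines. In standard C++ the type-safe downcast is written dynamic_cast, and it works only on classes with at least one virtual function. The classes Shape, Circle, and Square and the helpers is_circle, as_circle, and demo_rtti are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <typeinfo>

// A virtual function (here the destructor) makes the class polymorphic,
// which both typeid on a reference and dynamic_cast rely on.
struct Shape { virtual ~Shape() {} };
struct Circle : public Shape {};
struct Square : public Shape {};

// First form: typeid yields a type_info that can be compared with ==.
// Applied to a reference to a polymorphic class, it reports the
// run-time type of the object, not the declared type of the reference.
bool is_circle(const Shape& s) {
  return typeid(s) == typeid(Circle);
}

// Second form: the type-safe downcast. The result is a null pointer
// when the object is not actually a Circle.
const Circle* as_circle(const Shape* s) {
  return dynamic_cast<const Circle*>(s);
}

void demo_rtti() {
  Circle c;
  Square q;
  assert(is_circle(c));
  assert(!is_circle(q));
  assert(as_circle(&c) == &c);
  assert(as_circle(&q) == nullptr);
}
```

The string returned by name( ) is left out of the sketch because its exact form differs between compilers.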