Common Sense C - Advice and Warnings for C and C++ Programmers - [1882419006]

  • 8/8/2019 Common Sense C - Advice and Warnings for C and C++ Programmers - [1882419006]

    Common Sense C - Advice & Warnings

    for C and C++ Programmers

    (Publisher: 29th Street Press)

    Author(s): Paul Conte

    ISBN: 1882419006

    Publication Date: 10/01/92

    Preface

    About the Author

    Chapter 1 Introduction

    What's the Problem?

    "Real Programmers" And C

    A Better C

    Conquering C

    Chapter 2 Common Mistakes and How to Avoid Them

    Lazy Logic

    Precedence Without Precedent

    No Such Number, Address Unknown

    It Hurts So Good

    Sidebar 1 C Coding Suggestions

    Chapter 3 Foolproof Statement and Comment Syntax

    Brace Yourself

    Follow This Advice, or Else

    Give Me a Break

    One Last Comment

    From C to Shining C

    (Sidebar 1) C Coding Suggestions

    Chapter 4 Hassle-free Arrays and Strings

    String Symphony

    Sidebar 1 C Coding Suggestions

    Chapter 5 Simplified Variable Declarations

    Chapter 6 Practical Pointers

    Finger Pointing

    C's a Real Nowhere, Man

    You Can't Get There from Here

    Amnesia

    One Blankety-Blank Trap After Another

    Letting the Cat Out of the Bag

    Sidebar 1 Pulling a Fast One

    Sidebar 2 C Coding Suggestions

    Chapter 7 Macros and Miscellaneous Pitfalls

    Chapter 8 Working with C++

    Starting on the Right Foot

    Your Constant Companion

    The Calm Before the Storm

    New and Improved

    Merrily Down the Streams

    Non-Plused

    OOP, Not Oooops!

    Weighing the Pluses and Minuses

    C Coding Suggestions

    Chapter 9 Managing C and C++ Development

    Discipline Has Its Rewards

    How Big Is the World?

    Getting Started With Standards

    The Evolution of Standards

    No Train, No Gain

    The Right Tool For the Job

    Debugging Is a Waste of Time

    Order Out of Chaos

    Reuse It Or Lose It

    Principles Of Reuse

    Bibliography

    Appendix

    Index

    COMMON-SENSE C -- ADVICE AND WARNINGS FOR C

    AND C++ PROGRAMMERS

    C is a powerful programming language, but not without risks.

    Without help, even experienced C programmers can find

    themselves in trouble, despite "careful" programming, lint filters

    and good debuggers. And managers of programming projects

    can discover too late that using C carelessly can lead to delayed

    and defect-ridden software. This book helps avoid problems by

    illuminating the dangers of C and describing specific

    programming techniques to make C programming both faster

    and safer.

    Paul Conte draws on more than 15 years of software

    development, including writing commercial products using C, to

    warn you of C and C++ features that trip up even the best C

    programmers. This book is unique in that it takes a critical look

    at C's deficiencies, but offers tried-and-proven techniques to

    minimize the chances that common C coding mistakes will lead

    to serious or hard-to-find software defects. Managers will find

    Paul's descriptions of C pitfalls and hard-hitting assessment of

    the language invaluable in deciding when -- or whether -- to use

    C for programming projects. No other book on C programming

    combines the depth of specific technical information and the

    strategic assessment of C's capabilities and risks that you'll find

    in Common Sense C.

    About the Author

    Paul Conte is a senior technical editor for NEWS 3X/400 and

    president of Picante Software of Eugene, Oregon, which develops

    workstation-based applications development tools for S/36 and

    AS/400 programmers. Paul has published numerous articles on

    the AS/400, programming languages, software engineering, and

    database design. His interest in programming languages led to

    the development of RPG/free, the widely used free-format

    version of RPG. During his career, Paul has developed

    applications on a variety of platforms, including the S/38,

    AS/400, S/370, DEC, and PCs. His language expertise covers a

    wide range: C/C++, COBOL, RPG, Pascal, FORTRAN, Awk,

    and SNOBOL, to name a few.

    Paul has a B.A. in psychology from Georgia State University

    and an M.S. in computer science from the University of Oregon.

    He served on the University of Oregon faculty for eight years

    and has run his own consulting firm, prior to starting Picante

    Software, Inc. Paul has received several awards for his writing,

    including a Society for Technical Communication's International

    Award of Excellence for an article about C pitfalls.

    Acknowledgments

    Several people played a key role in creating this book. Jennifer

    Hamilton pressed the case for C and C++ and stimulated my

    analysis of where C's problems lie. Arguing with her over C

    facilities and programming style helped me refine my own side

    of the debate. Mike Otey provided invaluable technical review.

    Trish Faubion helped turn the original rough style into one that

    retained its bite, but was much more polished. Katie McCormick

    Tipton, Barb Gibbens, and Kathy Blomstrom all helped refine

    my writing. And Dave Bernard and Sharon Hamm wielded just

    the right mix of encouragement and threat to make the book

    actually happen. My sincere thanks to all.

    Dedication

    To my parents, Theodore and Sybil Conte, who've always been

    my example of lives well-lived

    Chapter 1 Introduction

    C and C++ are widely promoted as ideal portable, fast, and (in

    the case of C++) "object-oriented" languages. This

    characterization is deserved when C is considered for systems-

    level programs such as compilers, or for mass-market products

    such as word processing or spreadsheet programs. C was

    designed as a reasonably transportable replacement for assembly

    language that would add some high-level language constructs,

    but would retain almost all the low-level procedural capabilities

    found at the machine instruction level. C++ follows in that

    tradition, adding object-oriented capabilities (encapsulation and

    inheritance) to improve productivity while retaining C's original

    features and its philosophy of "bare metal" performance.

    But C is increasingly being considered as the best replacement

    for outdated commercial languages such as COBOL, RPG, and

    Basic. And many proponents also recommend C and C++ as

    superior alternatives to the Pascal family of languages (including

    Modula-2 and other successors to Pascal); to object-oriented

    languages such as Smalltalk, Eiffel, and Actor; and to the

    general-purpose language, Ada. C has its place, but in many

    cases, especially business programming, C can be a poor

    choice.

    What's the Problem?

    The fundamental problem with C is that it doesn't hide enough

    machine-level details. A good example is the central role that

    pointer variables play in C programs. C pointers were designed

    to provide machine-independent address arithmetic; and, for the

    most part, pointers do make it easier to write system programs

    that transport across machines. (Even this advantage is qualified,

    however, because pointers don't always transport easily between

    machines with flat addresses (e.g., VAX) and machines with

    segmented addresses (e.g., Intel 808x).)

    But at an application level, C pointers are a burden and a danger.

    They're burdensome because the programmer has to attend to

    details that a compiler can readily handle. For example, in C, to

    use a function (procedure) parameter as an output parameter

    (i.e., one that changes a value in the calling function), you have to

    pass the address of the variable that is to receive the value. This

    mechanism requires special attention when calling a function to

    code an argument as arg when it's passed to a function that

    defines the corresponding parameter as the same type as arg, but

    as &arg when the argument is passed to a function that defines

    the corresponding parameter as a pointer. In the called function,

    normal parameters are referenced as arg, whereas the value of

    parameters declared as pointers must be referenced as *arg. In

    all of these cases, a simple miscoding that incorrectly omits or

    adds a * or & can be fatal during program execution. By

    contrast, in languages like Pascal and Ada, you simply specify

    whether a parameter is passed by value (input only) or reference

    (allowing output) and all references are simple variable names,

    such as arg.
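
    As a minimal sketch of the bookkeeping just described, the two
    hypothetical functions below (the names add_tax_by_value and
    add_tax_by_address are invented for illustration) put an
    input-only parameter next to an output parameter:

```c
/* 'total' is passed by value: the function works on a private copy,
   so the caller's variable is left unchanged. */
void add_tax_by_value(int total, int tax)
{
    total = total + tax;   /* changes only the local copy */
}

/* 'total' is an output parameter: it is declared as a pointer, the
   caller must pass &variable, and the body must dereference it as
   *total. */
void add_tax_by_address(int *total, int tax)
{
    *total = *total + tax; /* changes the caller's variable */
}
```

    A caller codes add_tax_by_value(amount, 8) in the first case but
    add_tax_by_address(&amount, 8) in the second; only the second
    call changes amount.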

    It's true that C++ adds references as a simpler way to implement

    output parameters. But C++ still retains the error-prone use of

    pointer parameters. And, as a good example of the damage that

    can be done by conventional C/C++ advice, Bjarne Stroustrup,

    the author of C++, goes so far as to discourage the use of

    references as parameters and suggests pointer parameters

    instead!

    Pointers are often viewed as essential building blocks for

    dynamic data structures, such as sets and lists, and C proponents

    point to COBOL's (and other older languages') lack of pointers

    as a good reason to switch to C. But there are two ways to

    implement pointers: as addresses (as C does) or as "handles" (as

    Pascal does). The two implementations serve two distinctly

    different purposes. Address pointers let you directly manipulate

    a pointer variable to create a new pointer value (i.e., a new

    address). This ability is essential in many systems-level

    programs where access of specific memory locations (or even

    registers) is required. The downside of address pointers is that

    there's no guarantee that a computed pointer value will be the

    intended, or even a valid, address. As a result, a common

    experience in C programming is to have a program write over

    memory that contains the wrong data (the program's own

    instructions, or even the operating system's code), all due to an

    incorrect pointer value.

    Handle pointers contain system-defined values (which may even

    be addresses) that cannot be directly manipulated by arithmetic

    operations, and which the system can check for validity before

    using to reference storage. Thus, handle pointers provide support

    for dynamic data structures, but protect the programmer from the

    dangers of machine-level address manipulations. A similar

    argument applies when comparing C's approach to storage

    allocation (e.g., with the malloc() function) in explicit bytes

    versus other languages' built-in new and delete operations to

    allocate memory based on variable declarations, leaving the

    storage size allocations to the compiler.
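
    C has no built-in handle pointers, but the idea can be
    approximated by hand. The sketch below is only an illustration
    under stated assumptions: the names handle_t, handle_new,
    handle_set, handle_get, and handle_delete, and the fixed
    POOL_SIZE, are all hypothetical. Wrapping the slot number in a
    struct keeps the arithmetic operators from being applied to a
    handle directly, and every access checks the handle before
    touching storage.

```c
#define POOL_SIZE 8                 /* hypothetical fixed capacity */

typedef struct { int slot; } handle_t;   /* opaque, checked slot number */

int pool_value[POOL_SIZE];
int pool_in_use[POOL_SIZE];

/* Allocate a slot; the returned handle has slot == -1 if the pool is full. */
handle_t handle_new(void)
{
    handle_t h;
    int i;

    h.slot = -1;
    for (i = 0; i < POOL_SIZE; ++i) {
        if (!pool_in_use[i]) {
            pool_in_use[i] = 1;
            pool_value[i] = 0;
            h.slot = i;
            break;
        }
    }
    return h;
}

/* A handle is valid only if it names an allocated slot inside the pool. */
int handle_is_valid(handle_t h)
{
    return h.slot >= 0 && h.slot < POOL_SIZE && pool_in_use[h.slot];
}

/* Store a value; returns 1 on success, 0 if the handle is invalid. */
int handle_set(handle_t h, int value)
{
    if (!handle_is_valid(h))
        return 0;
    pool_value[h.slot] = value;
    return 1;
}

/* Fetch a value through 'out'; returns 1 on success, 0 if invalid. */
int handle_get(handle_t h, int *out)
{
    if (!handle_is_valid(h))
        return 0;
    *out = pool_value[h.slot];
    return 1;
}

/* Release a slot; later use of the same handle is rejected. */
void handle_delete(handle_t h)
{
    if (handle_is_valid(h))
        pool_in_use[h.slot] = 0;
}
```

    A stale or made-up handle is rejected with a return code rather
    than overwriting whatever memory it happens to name, which is
    the protection the text attributes to handle pointers.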

    This discussion of pointers introduces a theme that is repeated

    throughout the book: C was designed and is well-suited as a

    replacement for assembly language. But most software

    developers today agree that assembly language (even a great

    version of assembly language) isn't the right tool for most

    non-systems programming. Programmers who don't understand that

    programming with C pointers (and many other C features) is

    very close to assembly language programming are in trouble

    from the beginning. Unfortunately, most C programmers don't

    seem to get it.

    "Real Programmers" And C

    The problems with C itself would be more manageable if the

    culture and practices that have grown up around C weren't also

    rooted in machine-level, systems programming. Consider

    something as simple as adding a new element to an array. Two

    favorite C idioms for this operation are:

    array[++top] = item;

    array[next++] = item;

    The first example increments top, then adds item to array[top].

    The second example adds item to array[next], then increments

    next. In C, arrays are really just synonyms for pointers; and this

    coding style follows an assembly language practice of

    combining an address increment with a memory reference to the

    address. But in a high-level language, there's no reason to code

    these operations in a single statement. (You might, of course,

    want to create a procedure, such as add_item(array, item) so that

    a single, meaningful statement can be used to add an item. But

    that's not the point here, since both the increment and

    assignment operations are coded explicitly in the example.)

    The most important problem with this condensed coding style is

    that, when you're reading volumes of code, it's easy to overlook

    statements where the ++ increment has been placed on the

    wrong end of the index identifier. The following alternatives use

    the code's visual layout to show the critical sequence of

    operations:

    ++top;
    array[top] = item;

    array[next] = item;
    ++next;

    These alternatives also eliminate the need to use post increment

    (and post decrement) operations, removing one more piece of

    syntactic clutter and a potential source of coding errors from the

    program. Note also that the two-statement alternatives are just as

    easy to write and, with most optimizing C compilers, will

    execute as fast as the one-statement approach.
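
    One possible shape for an add_item procedure is sketched below.
    The item_list structure, the MAX_ITEMS capacity, and the
    return-code convention are assumptions made for illustration;
    the point is only that the store and the increment stay on
    separate lines, and that the bounds check lives in one place.

```c
#define MAX_ITEMS 100     /* hypothetical capacity */

struct item_list {
    int items[MAX_ITEMS];
    int next;             /* index of the next free element */
};

/* Returns 1 if the item was added, 0 if the list is already full. */
int add_item(struct item_list *list, int item)
{
    if (list->next >= MAX_ITEMS)
        return 0;

    list->items[list->next] = item;  /* store first ...           */
    ++list->next;                    /* ... then advance the index */
    return 1;
}
```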

    To most people, even non-C programmers, the difference in

    clarity in these isolated one- or two-line examples is small.

    However, in large programs or more complex statements, the

    differences mount up. As the examples in the rest of this book

    point out, conventional C style (much of it based on assembly

    language programming techniques) can also lead to subtle, but

    fatal, program errors.

    A Better C

    Many claims have been made for C++, but one thing seems

    certain: C++ is a "better" (if more complex) version of C.

    C++ adds some important language features missing in C; for

    example, reference parameters, inline functions, and templates to

    define generic functions and classes. These features aid clearer

    programming and can reduce (but not eliminate) the need

    for macros in C++ programs.

    What C++ doesn't do is eliminate any of C's traps. C++ was

    intentionally designed to be an almost complete superset of C;

    that is, almost any ANSI C program (even one using

    dangerous C techniques that have better alternatives in C++)

    will compile as a C++ program. Thus, you can still be burned by

    typing = instead of == in a C++ program (I discuss this in

    Chapter 2). C++ also continues the heavy use of special

    characters, rather than keywords, in its syntax. The problems

    that arise from C's use of * for "pointer" or "contents of" and &

    for "address of" are compounded by new C++ notations, such as

    a trailing & for "reference."

    C++ also introduces facilities for object-oriented programming

    (OOP). The primary new C++ concept is the "class," which is a

    facility to package functions and variable declarations together

    so that new data types can be defined and used in C++ programs.

    C++ also provides for "inheritance," a facility for deriving a new

    class definition from an existing class. The OOP capabilities of

    C++ are quite powerful, and when you work with a well-designed

    C++ class library, many implementation details can safely be

    ignored. But creating new classes is a different matter; and, if

    you write many non-trivial programs in C++, eventually you'll

    have to construct some of your own classes. As Chapter 8 points

    out, there are some very slippery slopes to climb as you write

    C++ classes.

    The question frequently arises of whether a programmer who

    doesn't know either C or C++ should learn C or C++ first. That's

    a hard call, and the best answer may be to learn one of the object-

    oriented extensions of Pascal or Smalltalk first, the idea being to

    learn the OOP concepts with a language not so laden with

    assembly language baggage, then learn how to do it in C++. In

    any case, you can't completely skip over learning about problem

    areas in C because most of these still exist in C++. As a result,

    much of this book is directed at problem areas common to both

    languages.

    Conquering C

    To pick the right projects for C or C++, and then use the

    language effectively, you have to ignore a lot of conventional

    attitudes towards C and C programming practices. Many of these

    attitudes and practices are rooted in a time and place 15 years

    ago when C was a major step forward for systems programmers.

    Today there are good alternatives to C for many applications,

    and programming practices have changed considerably. One of

    the most important differences between 15 years ago and today

    is that businesses are placing much more emphasis on

    controlling software development costs than on modest

    improvements in performance. Thus, developers trying to

    control costs want to avoid language features such as address

    pointers and coding practices such as folding a sequence of

    distinct operations into a single statement.

    If you do find yourself (or your staff) programming in C, the

    attitude with which you approach the task has a lot to do with

    whether you conquer C or it conquers you. To successfully

    program in C, you can't just memorize more C rules, code more

    carefully, and keep the debugger close at hand. You have to start

    with an awareness of what types of languages C and C++ are,

    and plan your strategy for preventing accidents. With some

    forewarning, and the right attitude, it's not terribly difficult to do,

    although compared to other languages, C can remain a

    frustratingly primitive (and C++ an agitatingly complex)

    way to write software.

    There are some bright spots in the world of C programming,

    however. If you don't succumb to the "this is the way all C

    programmers do it" method of programming, you can enjoy the

    benefits of an enormous collection of C and C++ source and

    executable libraries, and a large set of C-related tools, such as

    "C-aware" editors and programmer workbenches. And there's no

    question that the fierce competition among C compiler vendors,

    especially on the PC, has produced excellent and affordable C

    compilers. The performance of well-designed C programs

    compiled with one of the good optimizing compilers is usually

    excellent, too.

    So don't fear that the only result of programming in C is

    spending large amounts of time chasing wild pointers. With the

    right amount of respect for the language (and not too much

    respect for C "traditions"), you can enjoy the advantages of the

    broad C compiler and tools market. All it takes is going into it

    with your eyes wide open and programming with a little

    "common sense."

    Chapter 2 Common Mistakes and How to Avoid Them

    Deck: Get a clear look at some classic surprises you'll want to

    avoid in C programs

    by Paul Conte

    Why do some programmers think C is such a hot language? It

    must be because it has burned them so many times. Unless

    you're from the "no flame, no gain" school of programming, you

    need to watch out when you start using C. In this book, I point

    out some of the "hot" spots you really want to avoid.

    Let's start by firing up an example.

    if (x = y)
        printf("Equal values");

    Simple enough. If y is not zero, print "Equal values". And, by

    the way, replace the value of x with the value of y. Isn't it nice

    that C lets you do assignment within an if statement expression?

    But maybe you thought this code really meant: If x is equal to y,

    print "Equal values"? No, the code for that is

    if (x == y)
        printf("Equal values");

    If this example tripped you up, don't worry. Typing =

    (assignment) instead of == (equality) occasionally gets the best

    C programmers, too. The problem isn't in comprehending the

    different meanings of = and ==. The problem is that it's easy to

    mistype = when you mean ==, especially because = is the

    standard mathematical symbol for equality, and = represents

    equality in many other widely used programming languages

    (e.g., PL/I, COBOL, and Pascal). Unfortunately, C treats this

    easy-to-make typo as an intentional assignment operation. The

    resulting code will execute, and the error may be hard to

    diagnose.
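
    The difference between the two tests can be made concrete with
    a small sketch. The function names below are hypothetical, and x
    is passed by address only so that the side effect of the
    mistyped version is visible to the caller. The doubled
    parentheses are there because many compilers treat them as a
    sign that the assignment is intentional and stay quiet.

```c
/* Mistyped test: assigns y to *x, then branches on the assigned value. */
int branch_taken_with_assignment(int *x, int y)
{
    if ((*x = y))
        return 1;
    return 0;
}

/* Intended test: compares *x with y and changes nothing. */
int branch_taken_with_comparison(const int *x, int y)
{
    if (*x == y)
        return 1;
    return 0;
}
```

    With x equal to 1 and y equal to 5, the comparison version does
    not take the branch, but the assignment version does, and it
    leaves x set to 5. With both values 0, the results are reversed:
    the values are equal, yet the assignment version skips the
    branch.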

    Hard-core C programmers may try to convince you it's your

    inexperience, not C's syntax, that causes this type of coding

    error. But there's a booming market in C source-code checkers

    (known as "lint" filters) to help experienced C programmers

    protect themselves from just these kinds of sneaky problems. If

    C's pitfalls weren't so pervasive, lint utility vendors would be out

    of business.

    All programmers are not created "equal equal," so if you want to

    be an A++ C programmer (why be just a C++ programmer?), the

    first rule is don't use assignment in an if statement expression,

    unless it is absolutely necessary. In addition, use a compiler

    warning level or a lint utility that will catch = in if statement

    expressions. Be forewarned, however, that you may never be

    acknowledged as a "real" C programmer unless you're willing to

    take some risks to speed up your code by a few nanoseconds.

    Another good technique -- if you can handle accusations of

    "wimp" programmer -- is to define the macro

    #define EQ ==

    and never use == at all. Instead, you can write logical

    expressions, such as

    if (x EQ y)
        printf("Equal values");

    In addition to = and ==, C also has & (bitwise AND), &&

    (logical AND), | (bitwise OR), and || (logical OR). The bitwise

    and logical operators work the same when their operands are 0

    or 1. In other cases, however, the results are different. For

    example,

    2 && 4

    is 1, which is considered "true" in an if statement, whereas

    2 & 4

    is 0, which is "false." Because, in many cases, & and | produce

    the same effect as && and || in if statement expressions (i.e.,

    zero or non-zero), incorrect use of the bitwise operators can

    cause infrequent and hard-to-diagnose errors. If you'd rather rely

    on something more than luck for correct programs, you may

    want to define the following four macros and use them instead of

    &, |, &&, and ||.

    #define and( a, b ) ( ( a ) & ( b ) )
    #define or( a, b ) ( ( a ) | ( b ) )
    #define AND &&
    #define OR ||
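
    To see the logical and bitwise operators diverge, here is a
    short sketch using two hypothetical wrapper functions (the names
    logical_and and bitwise_and are illustrative only):

```c
/* Logical AND: yields 1 when both operands are non-zero, else 0. */
int logical_and(int a, int b)
{
    return a && b;
}

/* Bitwise AND: combines the operands bit by bit. */
int bitwise_and(int a, int b)
{
    return a & b;
}
```

    For the operands 2 and 4, logical_and returns 1, while
    bitwise_and returns 0 because binary 010 and 100 share no set
    bits. For operands restricted to 0 and 1, the two functions
    agree.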

    Lazy Logic

    Yes, C is a devilishly clever little language. It's quick to write,

    too. Suppose you've written a function, get_customer, to return

    either an integer customer ID or zero if no customer is input.

    Why waste time with "verbose" code like

    custid = get_customer();
    if (custid > 0) {
        /* Process the customer */
    }

    when you can simply write:

    if (custid = get_customer()) {
        /* Process the customer */
    }

    With the original definition of get_customer, this code works. In

    C, an if statement evaluates the expression within parentheses,

    and, if the expression's value is non-zero, the subordinate code is

    executed. In this example, the variable custid is set to the return

    value of get_customer. Because the value of a C assignment

    operation is the same as the value assigned to the target variable,

    when custid is assigned a non-zero value, the subordinate code

    to process the customer is executed.

    You'll see "simplified" if statement expressions like this all over

    C programs. But suppose you and your fellow programmers

    have been using get_customer for a while; say you have a dozen

    or so programs that call it. Then one day you get an I/O error

    that zaps one of your programs, and you decide you had better

    add to get_customer a return value of -1 for an I/O error.

    Problem solved? No, problems are created. Every

    if (custid = get_customer())

    statement will still execute the subordinate code when there's an

    error because the value of the if statement expression is

    non-zero. On the other hand, if you follow the first rule and keep the

    assignment operation separate, your code will work properly

    with the new error return value.
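
    The effect of the new error value can be sketched as follows.
    Everything here is a stand-in: simulated_result and this stub of
    get_customer are hypothetical replacements for real input, and
    each function "processes" a customer simply by returning the ID
    it would have processed (or 0 when it processes nothing).

```c
int simulated_result;      /* hypothetical stand-in for real input */

/* Stub: returns a customer ID, 0 for no customer, or -1 for an I/O error. */
int get_customer(void)
{
    return simulated_result;
}

/* Folded style: any non-zero value, including -1, takes the branch. */
int process_folded(void)
{
    int custid;

    if ((custid = get_customer())) {
        return custid;     /* "process" the customer */
    }
    return 0;
}

/* Separate assignment and an explicit comparison: -1 is skipped. */
int process_separate(void)
{
    int custid;

    custid = get_customer();
    if (custid > 0) {
        return custid;     /* "process" the customer */
    }
    return 0;
}
```

    Both versions behave the same for a valid ID and for 0. When the
    stub returns -1, process_folded goes on to "process" customer -1,
    while process_separate does nothing.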

    C is a "truth-or-consequences" language. You'll experience less

    of the latter if you use only logical expressions (ones that

    evaluate to 0 or 1) in if statements. You can define the following

    simple macros to implement Boolean variables and functions

    that return a Boolean value.

    #define BOOL int
    #define TRUE 1
    #define FALSE 0

    You should also use only Boolean variables and functions with

    the logical operators && and ||. Following this practice

    eliminates problems caused by accidentally using the bitwise

    operators & and | in logical expressions.

    Precedence Without Precedent

    In our get_customer example, you might think the following

    alternative would be safe and still have a nice "C-food" flavor.

    if (custid = get_customer() > 0) {
        /* Process the customer */
    }

    Now the code guards against negative, as well as zero, return

    values. Or does it? Something's fishy here. This code simply

    assigns 0 or 1 to custid because the > comparison operator has

    higher precedence (i.e., binds more tightly) than the assignment

    operator. This code is equivalent to

    if (custid = (get_customer() > 0)) {
        /* Process the customer */
    }

    What you really need is:

    if ((custid = get_customer()) > 0) {
        /* Process the customer */
    }

    C has 15 levels of operator precedence (so much for C being a

    "simple" language). Two other easy rules will keep you from

    floundering at C. Do all assignments as separate statements, not

    as a part of a more complex expression. And use parentheses

    liberally to explicitly define the order of evaluation.
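
    A small sketch makes the precedence trap visible. The two
    hypothetical functions below take the would-be return value of
    get_customer as a parameter and store custid through a pointer
    so the caller can inspect what was actually assigned; each
    returns the value the if statement would test.

```c
/* Parsed as *custid = (value > 0): custid receives 0 or 1. */
int test_unparenthesized(int value, int *custid)
{
    return (*custid = value > 0);
}

/* Parentheses force the assignment first: custid receives value. */
int test_parenthesized(int value, int *custid)
{
    return ((*custid = value) > 0);
}
```

    For a value of 42, both functions return 1, so the if statement
    appears to work either way; but the first leaves custid equal to
    1 and only the second leaves it equal to 42.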

    No Such Number, Address Unknown

    Understand one thing about C, and all its mysteries are revealed.

    C was -- and is -- a language meant as a portable replacement for

    machine-dependent assembly languages. Keep this in mind

    when you consider the following example.

    Suppose you code an array of part numbers and their names and

    a few lines to display a list of parts, as shown in Figure 2.1. If

    you remember C is for machine-level programming, you won't

    be surprised to find there's no part number 11. In C, 011 is not 11;

    it's 9! Integer constants that begin with 0 are octal. (Get it? The 0

    looks like O for Octal.)
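
    The arithmetic behind the surprise is easy to check. In this
    sketch the enumerator names are hypothetical; the constants 011
    and 067 are written exactly as a programmer might type them when
    padding part numbers with a leading zero.

```c
enum {
    PART_011_AS_WRITTEN = 011,  /* octal: 1*8 + 1 = 9  */
    PART_067_AS_WRITTEN = 067,  /* octal: 6*8 + 7 = 55 */
    PART_11_AS_INTENDED = 11    /* decimal 11          */
};
```

    Comparing PART_011_AS_WRITTEN with PART_11_AS_INTENDED shows
    that the two are different numbers, so a lookup for part 11
    never matches an entry coded as 011.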

    Figure 2.1 Sample C Code

    struct part {
        int part_number;
        char description[30];
    };

    main() {

        int i;

        /* Array of part numbers and descriptions */
        struct part part_table[100] = {
            {011, "Wrench" },
            {067, "Screwdriver" },
            {137, "Hammer" },

    output. Unless your magic number is 8, you should use a lint

    utility or your editor to ferret out all %i format specifications and

    numbers that begin with 0.

    The previous two examples actually have a much bigger

    problem than octal numbers. I miscoded the scanf function

    argument as part_number instead of &part_number. So instead

    of supplying scanf with the address where I want the input

    stored (i.e., the address of part_number), I supplied the

    uninitialized value of part_number. C is powerful, so powerful in

    fact, that scanf will trash some location in memory pointed to by

    whatever garbage is in part_number. If you're lucky, the trashed

    memory will be part of the debugger or operating system code,

    and you'll earn a C programming purple heart. To avoid winning

    too many battle ribbons, however, always double-check that

    you've supplied valid addresses for arguments to scanf and

    similar functions.
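
As a hedged sketch of the address rule, the helper below (read_part_number is a hypothetical name) passes the address of a local variable to sscanf and checks the return value before trusting the result.

```c
#include <stdio.h>

/* Reads one part number from a line of text.
 * Returns 1 if a number was read, 0 otherwise. */
int read_part_number(const char *line, int *part_number)
{
    int parsed = 0;
    int fields;

    /* &parsed is the address where sscanf stores the result;
     * passing parsed itself would hand sscanf an int, not a pointer. */
    fields = sscanf(line, "%d", &parsed);
    if (fields != 1) {
        return (0);
    }
    *part_number = parsed;
    return (1);
}
```

Checking the count that sscanf returns also keeps the caller's variable untouched when the input is not a number.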

    It Hurts So Good

    If you're new to C, you may think I'm blowing its problems out

    of proportion. You may wonder whether C's flaws significantly

    hamper the work-a-day C programmer. The answer is yes, most

    C programmers do suffer from C's flaws; but like some

    mainframe COBOL programmers and some midrange RPG

    programmers, C programmers sometimes take pride in their

    ability to overcome the language's deficiencies. And after

    enough years chasing errant pointers, many C programmers become numb to the pain of using a language that can crash the

    debugger and freeze their PC.

    Conversations I have had with technical staff of several large

    microcomputer software companies illustrate what, I think, is the

    prevalent viewpoint in the C programming culture. I asked

    experienced C programmers whether they regularly encountered

    the kinds of problems I've described so far, and they all said, in

    effect, "Of course, it's just part of programming in C." Then I

    asked them how they handled a couple of the most common problems, and with one exception, they said they relied on

    "careful programming," lint filters, and good debuggers. The one

    exception said he built his own layer of abstract data types to

    insulate himself from C. (In following chapters, I offer

    techniques along these lines.)

    The most insightful reflection about C I've heard is from the

    computer scientist Bertrand Meyer, who designed the Eiffel

    programming language. He said, "How could I even try to teach

    systematic algorithm construction when I knew the bulk of the C

    students' time was spent fighting tricky pointer arithmetic,

    chasing memory allocation bugs, trying to figure out whether an

    argument was a value or a pointer, making sure the number of

    asterisks was right, and so on. I'm afraid it will be hard to

    recover from the damage caused by C to an entire generation of

    programmers."

    The same assessment was put more briefly by the programmer

    who said, "C's a double-edged sword without any handle."

    C can do you harm, and not just if you're inexperienced. In this

    book, I will try to give you a handle on C so you can wield it as

    safely as possible. In future chapters, I will describe other ways

    that C can trip you up and suggest programming practices to

    avoid common problems. If you're considering C, either for workstation or AS/400 development, you'll gain a better

    understanding of some of the risks you face. If you're already

    using C, these tips will help you minimize your risks.

    * * * * *

    To experienced C programmers only: Did you catch the other

    errors I made in coding Figure 2.1? I'll point them out in Chapter

    3.

    Sidebar 1 C Coding Suggestions

    * Don't use = in an if statement expression, unless it is absolutely

    necessary.

    * Define a macro EQ for ==, and never use ==.

    * Define macros for &, |, &&, and ||.

    * Define macros for BOOL, TRUE, and FALSE.

    * Use only Boolean-valued expressions in if statements.

    * Use only Boolean variables with the logical operators && and ||.

    * Do all assignments as separate statements, not as part of a more

    complex expression.

    * Use parentheses in expressions to explicitly define order of

    evaluation.

    * Don't use %i format specifications or numbers that begin with 0.

    * Be sure to code addresses for arguments to scanf and similar

    functions.
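
The macro suggestions in this sidebar might be sketched as follows. Only EQ, BOOL, TRUE, and FALSE are named in the sidebar; the spellings AND, OR, BITAND, and BITOR, and the two sample functions, are assumptions made for illustration.

```c
/* Hypothetical spellings; the sidebar names only EQ, BOOL, TRUE, and FALSE. */
#define EQ     ==
#define AND    &&
#define OR     ||
#define BITAND &
#define BITOR  |

#define BOOL   int
#define TRUE   1
#define FALSE  0

/* Returns TRUE when month is in the range 1..12. */
BOOL month_is_valid(int month)
{
    BOOL in_range;

    in_range = ((month >= 1) AND (month <= 12));
    return (in_range);
}

/* Returns TRUE when code matches either accepted value. */
BOOL code_is_known(int code)
{
    return ((code EQ 1) OR (code EQ 2));
}
```

With EQ in place of ==, a stray = in a comparison stands out, because no comparison in the code base is ever written with an equals sign.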

    Chapter 3 Foolproof Statement and Comment Syntax

    Deck: As you learn the language, learn its pitfalls as well

    by Paul Conte

    C is not really a bad language; it's just too often misused. As a

    language for writing low-level device drivers or operating system kernels, C is superb. It's also a great language for

    torturing student programmers. But for business applications or

    other software above the operating system level, C is a

    minefield: "Explosive" results await the unwary C programmer's

    misstep. Here's an example that requires you to pick your way

    carefully:

        if (xcnt < 2)
            return
        date = x[0];
        time = x[1];

    This code appears to guard references to array x by checking the

    count of its elements first. But a semicolon is missing after the

    return, so the code really means:

        if (xcnt < 2) {
            return date = x[0];
        }
        time = x[1];

    C's "flexibility" lets you freely combine most expressions and

    statements, such as this assignment expression within a return

    statement. Unfortunately, this flexibility also means C compilers

    can't detect many errors caused by simple typos.

    Brace Yourself

    The problem occurs because of a missing semicolon. Many "old

    hand" C programmers would say the solution is simply to add

    the semicolon (after a few hours of debugging!). But does the

    following correction give us a safe program?

        if (xcnt < 2)
            return;
        date = x[0];
        time = x[1];

    What if we decide to add an error message?

        if (xcnt < 2)
            printf("Timestamp array is too small\n");
            return;
        date = x[0];
        time = x[1];

    Indeed, this is the ultimate "safe" program: even for valid arrays, it

    never does anything but return, terminating program execution!

    The code's execution is identical to

        if (xcnt < 2) {
            printf("Timestamp array is too small\n");
        }
        return;
        date = x[0];
        time = x[1];

    For quick coding, C lets you omit the { } around a conditional

    statement, a shortcut most published C programs take advantage

    of. You will be tempted to take this shortcut, too. Don't! Ever!

    Always enclose conditional code in braces. The errors

    introduced by incorrectly matched conditions and subordinate

    statements are very hard to ferret out. Our original example is

    better coded like this:

        if (xcnt < 2) {
            return;
        }
        date = x[0];
        time = x[1];

    Note that using braces also lets the compiler catch a missing

    semicolon, so you get lots of protection by following this simple

    rule.

    Unwinding this example also suggests a rule I mentioned in

    Chapter 2: Do all your assignments as separate statements, not as

    part of a more complex expression. Another helpful rule is: Use

    parentheses around expressions in return statements. For

    example, if you really did want to return the date after assigning

    it to a global variable, you might code

        if (xcnt < 2) {
            date = x[0];
            return (date);
        }

    This doesn't solve the original problem we looked at, but it does

    show how to code return statements so you're less likely to be

    tripped up by other problems with complex expressions.

    Follow This Advice, or Else

    Another problem related to if statements is the improper

    matching of else clauses. (This problem is not unique to C;

    COBOL programmers have been bitten by the same type of

    "bugs.") Suppose we change our previous example so that array

    x must either have at least two elements to be assigned to date

    and time or be empty, in which case the program should do

    nothing. Other conditions should cause a return. The following

    fragment seems to do what we require:

        if (xcnt < 2)
            if (xcnt != 0) return;
        else {
            date = x[0];
            time = x[1];
        }

    But C associates an else with the closest unmatched if inside the

    same pair of braces. The compiler treats the above code the same as the following:

        if (xcnt < 2) {
            if (xcnt != 0) {
                return;
            }
            else {
                date = x[0];
                time = x[1];
            }
        }

    In other words, nothing at all happens when xcnt is 2 or greater.

    Again, using braces for all conditional statements comes to the

    rescue:

        if (xcnt < 2) {
            if (xcnt != 0) {
                return;
            }
        }
        else {
            date = x[0];
            time = x[1];
        }

    Although full use of braces increases the number of lines of

    source code, braces make future program modifications much

    easier and less error-prone. With braces delimiting segments of

    conditional code, adding and deleting subordinate statements requires less careful checking of how else clauses and

    subordinate statements match up with the if statements.
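
To see the fully braced form at work, the fragment can be wrapped in a function that reports which path was taken. This is a sketch only: load_timestamp, its parameters, and the STAMP_ status codes are hypothetical names introduced for this example.

```c
#define STAMP_LOADED   1    /* date and time were assigned */
#define STAMP_EMPTY    0    /* array was empty; nothing to do */
#define STAMP_INVALID  (-1) /* any other count below 2 (e.g., one element) */

/* Every if and else body is braced, so each else visibly pairs
 * with the if that the programmer intended. */
int load_timestamp(const int x[], int xcnt, int *date_out, int *time_out)
{
    if (xcnt < 2) {
        if (xcnt != 0) {
            return (STAMP_INVALID);
        }
    }
    else {
        *date_out = x[0];
        *time_out = x[1];
        return (STAMP_LOADED);
    }
    return (STAMP_EMPTY);
}
```

Adding an error message inside either branch later is a one-line change that cannot shift the else to a different if.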

    Another, elegant, solution is to define the following macros:

        #define IF     { if (
        #define THEN   ) {
        #define ELSE   } else {
        #define ELSEIF } else if (
        #define ENDIF  } }

    We could then code our previous example as:

        IF xcnt < 2
        THEN IF xcnt != 0
             THEN return;
             ENDIF
        ELSE date = x[0];
             time = x[1];
        ENDIF

    Once the macros are replaced with their corresponding

    definitions, the code is executed the same as the previous

    example. This style is guaranteed to cause traditional C

    programmers apoplexy, but they'll forget about it when they

    chase down their next bug. Meanwhile, your code can be

    readable and reliable.

    Give Me a Break

    As W. A. Wulf put it, "More computing sins are committed in

    the name of efficiency (without necessarily achieving it) than for

    any other single reason including blind stupidity." C's switch

    statement could be the all-time award winner in the "stupid

    efficiency" category. The error in the following code fragment

    may be obvious outside the context of a larger program; but in real programs, such errors are easy to make and hard to find.

        switch (color) {
            case 1: printf("red\n");
            case 2: printf("blue\n");
        }

    Given this code, when color is 1, both "red" and "blue" are

    printed. The proper code is

        switch (color) {
            case 1: printf("red\n");
                    break;
            case 2: printf("blue\n");
        }

    Of course, when you add another color, you'd better add another

    break after the second case. C's switch is not what is generally recognized as a "case" multiway conditional control structure;

    it's nothing more than a jump table. The compiler evaluates the

    switch expression simply to determine the target of a jump (i.e.,

    go to) operation into the code that follows. Unless you code a

    break, execution will continue sequentially through the code for

    cases that follow the case that matches the switch expression

    value.
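
One defensive habit that follows from this behavior is to end every case, including the last one, with a break and to add a default. The sketch below assumes a hypothetical function color_name; it returns the name instead of printing it so the result is easy to check.

```c
/* Maps a color code to its name. Every case ends in a break, so adding
 * a new case later cannot fall through into an existing one, and
 * default catches unexpected values. */
const char *color_name(int color)
{
    const char *name;

    switch (color) {
        case 1:
            name = "red";
            break;
        case 2:
            name = "blue";
            break;
        default:
            name = "invalid color";
            break;
    }
    return (name);
}
```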

    installment and the macros presented above (IF, THEN,

    ELSEIF, ELSE, and ENDIF) to code the tests as in Figure 3.1.

    This solution has a compact, table-oriented layout and avoids the hazards of raw C.

    Figure 3.1 - Coding an ELSEIF in C with Macros

        IF color EQ 1 THEN printf("red\n");
        ELSEIF color EQ 2 THEN printf("blue\n");
        ELSE printf("Invalid color\n");
        ENDIF

    One Last Comment

    Most programmers know that "comments lie," which is why

    high-level languages should let you directly express what your

    program does rather than force you to comment unclear code. C

    programmers will find that comments can also make their code

    lie! Read the fragment in Figure 3.2 carefully. A comment warns

    anybody reading the code about an important condition that changes at this point in the program flow. The comment tells us

    that here is where the variable prv_opcode changes from the

    previous opcode to the current opcode. A look at the C code

    seems to verify that the comment doesn't lie. But the C code (or

    what looks like C code) itself lies. The statement

        strcpy(prv_opcode, opcode);

    doesn't copy opcode to prv_opcode. It doesn't do anything;

    it's part of a multiline comment, not executable code. The

    comment ends with the */ on the last line in the figure, making

    all of Figure 3.2 one long comment.

    Figure 3.2 - Sample C Code

        /* IMPORTANT NOTE: prv_opcode is set here,
           after handling vendor-specific translations and
           blank opcodes. After this section, opcode may be
           modified. You should _not_ test prv_opcode after
           this point because it now holds the current opcode.

        strcpy(prv_opcode, opcode);

        setnull(stk_opcode);
        setnull(op_symbol);
        setnull(op_suffix);

        op_is_ctlop = FALSE;

        /* Control opcodes are ones that cause
           indentation:
           BEGSR, IFxx, DO, DOUxx, and DOWxx.
        */

    [Note call-outs see magazine version, p. 114, Sept. 91]

    C uses /* and */ to delimit comments. C also implicitly

    continues open comments across multiple lines until the ending

    */ is encountered. This makes it easy to have "runaway"

    comments that encompass what's intended as executable code.

    Unintentionally commented-out code, especially if it's

    initialization code, can cause mysterious program behavior. You

    see the program fail, you look at the code, and it "can't do that!"

    Only when, on your tenth look, you finally catch that the

    comment a page up has no closing */ do you unfold the mystery.

    No foolproof way exists to avoid runaway C comments. (Newer

    languages such as Ada let you prevent this problem by using --

    to start comments that end at the end of the line.) Two rules can

    help: Place the opening /* and closing */ for comments on lines

    by themselves, and use a vertical bar to begin each line of

    comment text. For example,

        /*
        | Comment lines
        | are here
        */

    This practice avoids the most common cause of a missing */:

    editing the last line of a comment and accidentally deleting the

    */ at the end of the line. It's also easier to check visually for

    matching comment delimiters when they appear at the same

    indentation level in the source. In addition, some C "lint"

    utilities can catch occurrences of /* inside a comment, which

    usually indicates a missing */.

    From C to Shining C

    If C's pitfalls somewhat tarnish its image, remember that your C

    programs can still shine if you polish your programming

    techniques. The most important thing you can do to improve C

    programs is take "C-riously" the dangers of writing C code in the

    traditional (some would say, "C-eat of the pants") manner. Don't

    try to make your C code "do a lot with just a few statements,"

    and don't hesitate to use source macros to lift yourself to a

    higher, safer language level than C primitives. Sure, your code

    will look foreign to old-style C programmers, but your running

    programs will look a lot better to your end users.

    * * * * *

    Answer to Chapter 2's puzzle: Figure 3.3 shows corrections to

    the three errors in the original code. The structure definition was

    missing the final semicolon (A), causing the compiler to treat the

    definition as the return type of the main function. The for loop

    had an extra semicolon immediately after the parentheses (B), so

    the body of the for loop was the null statement instead of the

    code within the braces. And the printf format string was missing

    (C) a newline character (\n), causing the results to be printed in a

    continuous stream, rather than one item per line.

    Figure 3.3 - Corrected C from Chapter 2

        struct part {
            int part_number;
            char description[30];
        };

        main() {
            int i;
            /* Array of part numbers and descriptions */
            struct part part_table[100] = {
                {011, "Wrench" },
                {067, "Screwdriver" },
                {137, "Hammer" },
                {260, "Pliers" },
                /* etc. */
                {0, "sentinel" }
            };
            for (i=0; i

        int orders[12];
        ...
        ++orders[month - 1];

    This works fine, so long as you never forget to subtract 1 from

    month when you use month as a subscript. You also need to be

    careful when you code for loops:

        for (month = 0; month < 12; month++) {
            printf("Total for month %d is %d\n",
                   (month + 1), orders[month]);
        }

    This example suggests I should clarify my previous caution to

    remember to subtract 1. When you use month as a loop variable

    that runs across the array's range (0 to 11), you shouldn't make

    the adjustment in subscripted array references, but rather in

    printing the loop variable. Also remember that, to cover the

    array's range, the for loop must start at 0 and run to 11, not 12,

    so use < instead of

    Now we can write our first example as in Figure 4.2. And

    because we often want to do for loops across the entire range of

    a table, the macros in Figure 4.3 are handy. Using these macros, we can simplify printing the monthly counts to the code in

    Figure 4.4.

    Figure 4.2 Using Table Macros

        int month;
        TABLE( orders, int, 12 );
        .
        .
        .
        ++ orders[ month ];

        for ( month = 1; month

    By now, you might reasonably ask, "Why bother creating all

    these macros to make C look like some other language; why not

    just use another language?" Good question, and if you have a good alternative, such as Pascal or Modula-2, you should use it

    instead of C. But if you're stuck with C, well-designed macros

    can add substantial safety and clarity to your programs. And, I

    should add, well-written macros don't hurt runtime performance,

    because they are translated into ordinary C code before

    compilation.

    String Symphony

    C doesn't have built-in support for variable-length strings;

    instead, C "fakes" strings by using static character arrays,

    character pointers, and a library of string functions that take

    pointers as arguments. Because strings inherently can be any

    length, jamming strings into C's fixed-length arrays leads to

    especially perverse pitfalls. Take a simple assignment of one

    string variable to another:

    strcpy(b, a);

    This innocent statement has probably reduced the average life

    expectancy of C programmers by five years; stress is not good

    for a programmer's health. What's wrong with this statement? I

    don't know ... maybe nothing. Maybe when this statement

    executes, the string in a will be no longer than b can hold. If so,

    everything is all right. If not, everything is all wrong. The strcpy

    function is a primitive memory-to-memory copy that is not

    limited by the target's declared size. In this example, if the string in a is longer than the size of b, whatever is next to b in memory

    will be trashed. On PCs, this might even be operating system

    code, leaving your system frozen solid.

    The typical way in C to avoid string operations that overwrite

    memory is to "be careful." That works for experienced

    programmers most of the time. It's not a pretty sight,

    however, when this strategy doesn't work. The only effective strategy is: Always guard a string assignment against

    overwriting the target variable.

    Figure 4.5 shows one way to guard a string copy, using the

    sizeof operator. If the source string (including the '\0' terminator)

    fits in the target, the whole string is copied; otherwise, only as

    much as will fit is copied, and a null terminator is added. Even

    though this technique may truncate some strings, your program

    will continue its proper execution flow, rather than take some

    wild path caused by overwriting part of the program's

    instructions.

    Figure 4.5 Guarding a String Copy

        if ( strlen( source ) < sizeof( target ) ) {
            strcpy( target, source );
        }
        else {
            strncpy( target, source, ( sizeof( target ) - 1 ) );
            target[ ( sizeof( target ) - 1 ) ] = '\0';
        }

    Of course, this technique is a prime candidate for a macro, using

    source and target as parameters. You can use similar macros for

    the other C library string assignment functions, such as strcat.

    You can also add warning messages to your macros to make error diagnosis even easier.
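
One way such a macro might look is sketched below. GUARDED_STRCPY, struct part_label, and set_label are hypothetical names, not the book's. The macro assumes its target is a true character array whose declaration is visible, so that sizeof yields the array's full size, and it evaluates its arguments more than once, so they should be free of side effects.

```c
#include <string.h>

/* Hypothetical macro name. target must be a character array
 * (not a pointer), so sizeof gives its declared size. */
#define GUARDED_STRCPY( target, source ) \
    do { \
        if ( strlen( source ) < sizeof( target ) ) { \
            strcpy( ( target ), ( source ) ); \
        } \
        else { \
            strncpy( ( target ), ( source ), sizeof( target ) - 1 ); \
            ( target )[ sizeof( target ) - 1 ] = '\0'; \
        } \
    } while ( 0 )

/* A fixed-size field that holds at most 7 characters plus '\0'. */
struct part_label {
    char text[ 8 ];
};

/* Stores source in the label, truncating it if it is too long. */
void set_label( struct part_label *label, const char *source )
{
    GUARDED_STRCPY( label->text, source );
}
```

Because text is declared inside the structure, sizeof( label->text ) is 8 even though label itself is a pointer; a long source such as "Screwdriver" is cut to "Screwdr" instead of overrunning the field.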

    Unfortunately, macros using the sizeof operator won't work for

    target strings that are function parameters, because the size of a

    string parameter is not automatically passed to a function (C

    passes just a pointer to the first character in the string). To

    handle string parameters whose value you want to change (i.e.,

    output or update parameters), you must explicitly pass the string's declared size (or its maximum length, which is one less

    than the string's declared size) to the function. Figure 4.6a shows

    simple macros to implement "safe" strings. The STRING macro

    declares a string and an associated variable to hold the string's

    maximum length. STRING_TABLE declares a table (base-1

    array) of strings. The cpystr macro simplifies a call to the

    strcpymax function shown in Figure 4.6b.
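
The idea of passing the maximum length explicitly can be sketched independently of the figures. The function below is a hypothetical helper named copy_with_maxlen, not the book's strcpymax, and it assumes the caller declared the target with at least target_maxlen + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies source into target, writing at most
 * target_maxlen characters followed by a '\0' terminator.
 * The caller must supply a target of at least target_maxlen + 1 bytes. */
char *copy_with_maxlen( char target[], const char source[],
                        size_t target_maxlen )
{
    size_t copy_len = strlen( source );

    if ( copy_len > target_maxlen ) {
        copy_len = target_maxlen;
    }
    memcpy( target, source, copy_len );
    target[ copy_len ] = '\0';
    return ( target );
}
```

Inside the function, sizeof( target ) would give only the size of a pointer, which is why the length has to arrive as a separate argument.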

    Figure 4.6a Macros for Safe Strings

        #define STRING( sname, smaxlen ) \
            size_t sname##_maxlen = ( smaxlen ); \
            char sname[ ( smaxlen ) + 1 ]

        #define STRING_TABLE( tname, ttop, smaxlen ) \
            int tname##_upper_bound = ( ttop ); \
            size_t tname##_maxlen = ( smaxlen ); \
            char tname [ ( ttop ) + 1 ][ ( smaxlen ) + 1 ]

        #define strmaxlen( sname ) sname##_maxlen

        #define cpystr( target, source ) \
            strcpymax( target, source, target##_maxlen )

    Figure 4.6b Safe String Copy Function

        char * strcpymax( char target[],
                          const char source[],
                          const size_t target_maxlen ) {
            if ( strlen( source )

    A full description of implementing safe, variable-length strings

    in C is beyond the scope of this book, but you can do it using

    structures that contain string lengths and pointers to one or more memory blocks for the string contents. I've seen numerous

    programs where the programmer has built such structures "on

    the fly." Such programs are often fragile and flaky because they

    combine some of C's most treacherous features: pointers and

    dynamic memory management. A more structured approach can

    reduce your risks.
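
A minimal sketch of such a structure, assuming a single heap block per string, might look like this. The type and function names (vstring, vstring_init, vstring_assign, vstring_free) are hypothetical, and a real library would need many more operations.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type and function names, for illustration only. */
struct vstring {
    size_t length;  /* characters stored, not counting the terminator */
    char  *text;    /* heap block of length + 1 bytes, or NULL */
};

/* Puts s into a known empty state. */
void vstring_init( struct vstring *s )
{
    s->length = 0;
    s->text = NULL;
}

/* Replaces the contents of s with a copy of source.
 * Returns 1 on success, 0 if memory could not be allocated
 * (in which case s is left unchanged). */
int vstring_assign( struct vstring *s, const char *source )
{
    size_t new_length = strlen( source );
    char *new_text = malloc( new_length + 1 );

    if ( new_text == NULL ) {
        return ( 0 );
    }
    memcpy( new_text, source, new_length + 1 );
    free( s->text );
    s->text = new_text;
    s->length = new_length;
    return ( 1 );
}

/* Releases the storage owned by s and returns it to the empty state. */
void vstring_free( struct vstring *s )
{
    free( s->text );
    vstring_init( s );
}
```

Keeping every allocation and release inside these three functions is the "structured approach": the rest of the program never touches the pointer directly.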

    If you embark on an advanced string implementation, be sure to

    build a library of macros and functions that provide a safe, high-

    level set of string operations, and use these instead of C's

    primitive string functions. Your best bet is probably to use C++

    and an existing C++ string class (e.g., ones available from The

    Free Software Foundation) or switch to Awk, a language that has a

    C-like syntax and includes full support for variable-length

    strings.

    Rough C Coming

    Most of the C pitfalls I've covered in the first three chapters are

    fairly easily circumvented by avoiding certain language

    constructs and using macros. Strings presented the first example

    of inherent C constructs for which there is no simple, universal

    solution (other than perhaps moving to C++). In Chapter 6, I'll

    take up pointers, a feature of C even more difficult than strings

    to handle safely.

    *****

    Sidebar 1 C Coding Suggestions

    * Declare C arrays with one extra element, and don't use the element

    with subscript 0.

    * Use macros to define tables and loops over them.

    * Always guard a string assignment against overwriting the target

    variable.

    * Create macros and functions to define strings and provide "safe" string operations.

    Chapter 5 Simplified Variable Declarations

    Deck: Follow these tips for unlimited visibility in your C

    declarations

    by Paul Conte

    An object's scope is something you can't C very clearly in source

    code. Or should I say you can't code scope clearly in C source?

    If you can't quite C where I'm headed, it's because we're just

    beginning a winding tour through the maze of C object visibility.

    Follow closely, and by the time we exit the maze, you'll have a

    simple map for the shortest route out.

    C, like many languages, lets an identifier x refer to different objects (e.g., storage locations for variables) at different places

    in the source code. As a simple case, the two declarations of x

    below refer to different variables.

        int x;

        main( void ) {
            int x;
            ...
        }

        void func1( void ) {
            ...
        }

    Let's say the first x refers to a storage location we'll call S1, and

    the second x refers to a storage location we'll call S2. The scope

    of the S1 object (storage location) is simply the region (i.e.,

    lines) of source code where references to x are references to S1; likewise, the scope of the S2 object is the region of source code

    where references to x are references to S2. In this example, the

    scope of S1 (the first x) is everywhere outside the main function,

    and the scope of S2 (the second x) is only within the main

    function. Obviously, for the program to be clear, these two

    regions of the program can't overlap; each reference to x must

    refer to just one of the storage locations. Scope is also called

    "visibility" because you can "see" an object (e.g., read or change a storage location) only within its scope. In this chapter, I use

    "visibility" for the general concept and "scope" to refer to C's

    specific lexical scope attribute.
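
The S1 and S2 regions can be made concrete with a small sketch. The function names read_file_scope_x and read_block_scope_x and the initial values 1 and 2 are assumptions added so the two objects can be told apart.

```c
int x = 1;              /* file-scope x: the S1 object */

/* No local x here, so this reference resolves to the file-scope x. */
int read_file_scope_x( void )
{
    return ( x );
}

/* The block-scope x (the S2 object) hides the file-scope x
 * from its declaration to the end of the function body. */
int read_block_scope_x( void )
{
    int x = 2;

    return ( x );
}
```

Calling read_block_scope_x never changes what read_file_scope_x returns, because the two references to x name different storage locations.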

    The C-nic Route

    The concept of visibility is simple and useful. Among other uses,

    distinct regions of object visibility let you use identifiers in different parts of your code without worrying about whether the

    same identifier, used for different purposes, unintentionally

    refers to the same object. But why be merely simple and useful

    when you can be clever and brave, too? Take your first left turn

    into the C labyrinth.

    C splits the single concept of visibility into scope and linkage,

    with scope referring to the region of program text within which

    an identifier's characteristics are understood. C's linkage term

    refers to the connection between identifiers in independently compiled translation units. Expressing visibility with two

    attributes instead of one may help C compiler writers, but it

    makes it more difficult for programmers to determine which

    object an identifier references. I'll try to map out C's rules while

    noting some language flaws that lead to C's problems. Then I'll

    mark an easy path to declaring variables and functions with the

    desired visibility.

    C's rules for function visibility are simple: If you specify static

    storage class, a function is visible throughout the source file in

    which it's defined, but not in other source files. With extern or

    no storage class specifier, a function is visible throughout the

    program (i.e., across all files), and you can call it from

    anywhere. These rules lead to my first suggestion: Declare

    functions static if you intend to call them only from within the

    same source file.
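
As a sketch of this suggestion, the helper below is declared static because only one function in the same file calls it. The names clamp_month and month_index are hypothetical.

```c
/* static: callable only from within this source file. */
static int clamp_month( int month )
{
    if ( month < 1 ) {
        return ( 1 );
    }
    if ( month > 12 ) {
        return ( 12 );
    }
    return ( month );
}

/* No storage class specifier: callable from any source file
 * in the program. */
int month_index( int month )
{
    return ( clamp_month( month ) - 1 );
}
```

Another source file can define its own clamp_month without any clash at link time, while month_index remains the single entry point the rest of the program sees.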

    C's rules for variable visibility are far more complex than those

    for function visibility. I've listed these rules in a table (Figure

    5.1) that shows, for any variable declaration, where that variable

    is visible. I've also listed C's scope and linkage attributes and

    whether the declaration causes storage to be allocated (in C

    terminology, whether it is a definition as well as a declaration).

    You can use this guide to help you understand some of the

    mysterious changes that can occur when a C variable has unexpected visibility. You also may need it to follow my next

    few examples, but later I'll show you a far more useful table for

    C programming.

    Figure 5.1 Visibility of C Variables

    DECLARED OR REFERENCED INSIDE A BLOCK (function or nested block):

    Where           Storage   Initial  Storage     Scope/
    specified       class     value    allocation  Linkage        Visibility
    --------------  --------  -------  ----------  -------------  ------------------------------
    1. Declared     (none)    Yes      Yes         Block scope,   Within same block, including
       in block     auto      No       Yes         no linkage     all nested blocks, except any
                    register                                      nested block (and its nested
                    static                                        blocks) with an identical
                                                                  identifier without extern

    2. Declared     extern    No       CASE A: Enclosing scope has identical, visible
       in block                        identifier -- same scope, linkage, allocation,
                                       visibility as the matching identifier
                                       CASE B: Otherwise, same as if declared extern
                                       outside function (see 7, below)

    3. Declared     extern    Yes      ILLEGAL declaration
       in block

    4. Not declared                    CASE A: Enclosing scope has identical, visible
       but referenced                  identifier declared in the same source file
       in block                        prior to the reference -- same as if declared
                                       extern in block (see 2, above)
                                       CASE B: Otherwise, ILLEGAL declaration

    EXTERNAL DECLARATIONS (declared outside any function):

    Which           Storage   Initial  Storage     Scope/
    declaration     class     value    allocation  Linkage        Visibility
    --------------  --------  -------  ----------  -------------  ------------------------------
    5. First        (none)    Yes      Yes         File scope,    Rest of file except any block
       declaration            No                   external       (and its nested blocks) with
                                                   linkage        an identical identifier
                                                                  declared without extern, and
                                                                  blocks it contains; and other
                                                                  files with an identical
                                                                  external linkage identifier
                                                                  declared in them. (No other
                                                                  file may allocate (define) an
                                                                  identical external linkage
                                                                  identifier.)

    6. First        static    Yes      Yes         File scope,    Rest of file except any block
       declaration            No                   internal       (and its nested blocks) with
                                                   linkage        an identical identifier
                                                                  declared without extern, and
                                                                  blocks it contains

    7. First extern Yes Yes File

    scope,

    Rest of life

    except any

  • 8/8/2019 Common Sense C - Advice and Warnings for C and C++ Programmers - [1882419006]

    58/181

    declaration external

    linkage

    block (and

    its nested

    blocks)with an

    identical

    identifier

    declared

    without

    extern, and

    blocks it

    containsand Other

    files with

    an identical

    external

    linkage

    identifier

    declared in

    them (One

    other file

    must have

    an identical

    external

    linkage

    identifier

    allocated

    (defined) in

    it.)

    8. First extern Yes Yes Same as if declared

    outside function

    declarationwithout extern or static

    (see 5, above)

    9. Second

    orMust have same

    Same scope, linkage,

    allocation, and

  • 8/8/2019 Common Sense C - Advice and Warnings for C and C++ Programmers - [1882419006]

    59/181

    later type and linkagevisibility as first

    declaration

    declaration as first declaration


    both the extern and non-extern declarations mean the variable is

    visible outside the file in which it's declared. (If the extern

    declaration isn't the first declaration of x, however, and the first

    declaration of x isn't visible outside the file, the extern

    declaration also specifies that x isn't visible outside the file, a

    further complexity in C's approach to visibility.) Some

    programmers wise in the ways of C might argue, "But both

    declarations are external declarations, so it is consistent that they

    both are visible `externally' to the file." Nice try. However, when

    the declaration

    static int x;

    is outside any function, it also is an external declaration, yet it

    defines a variable that is not visible outside the file.
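
    A minimal sketch of this point, with hypothetical names: hits is
    declared outside any function, so it is an external declaration,
    yet the static keyword gives it internal linkage. Every function
    in the file can use it, but no other file can.

```c
/* External declaration (outside any function), internal linkage. */
static int hits = 0;

void record_hit( void ) {
    hits++;                 /* same object for every function in this file */
}

int hit_count( void ) {
    return hits;
}
```

    A declaration such as extern int hits; in another source file would
    not find this variable; the reference would be left undefined at
    link time.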

    Not only does C's syntax lack consistency, but it also confuses

    things by using the static storage class keyword to specify

    visibility. My theory is that Humpty Dumpty was on the original

    C design team. As he told Alice, "When I use a word, it means

    just what I choose it to mean, neither more nor less."

    There are other confusing cases, as in the second declaration of x

    below,

    static int x;
    main( void ) {
        extern int x;
        ...
    }

    where the use of extern results in internal linkage. I'll spare you

    a diversion down the cul-de-sac you enter when you have more

    than one external declaration for the same variable (which C

    allows). As we pass through the maze, however, notice that it

    matters whether you initialize a variable in an external


    variable in a file other than the one containing its definition.

    Naturally, a file can't import a variable that isn't exported by

    some other file, and a file doesn't need to import variables that it

    exports; such variables already are available for shared use

    within the file. Note that in C programs, only one file can export

    a particular variable (this isn't true for all languages). Another

    use of the import concept is to specify you want to use a

    nonlocal variable in a function or block. After covering a couple

    of general rules relating to visibility, I'll explain a simple way to

    implement local, share, and export visibility, as well as import

    declarations.

    The most important rule for declarations in C, or any language,

    is: Use the most restricted visibility possible for variables; avoid

    shared variables. In C, this most often means defining variables

    used in a function as local to that function by putting their

    declarations at the beginning of the function and using auto or no

    storage class specifier. The same rule and technique apply to

    nested blocks within a function.
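
    As a small illustration of this rule (the function and variable
    names are hypothetical), every working variable below is declared
    at the beginning of the block that uses it: total and i are local
    to the function, and magnitude is local to the loop body.

```c
/* Sum of absolute values; no variable is visible outside this function. */
int sum_of_abs( const int values[], int n ) {
    int total = 0;          /* local to the function, automatic storage */
    int i;

    for ( i = 0; i < n; i++ ) {
        int magnitude = values[i];  /* local to the nested block only */
        if ( magnitude < 0 ) {
            magnitude = -magnitude;
        }
        total += magnitude;
    }
    return total;
}
```

    Because nothing here is shared, no other function can change these
    variables behind this function's back.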

    In the special case of references to variables within a nested

    block that aren't local to the nested block, I usually do not

    declare the variables in the nested block. This results in implicit

    extern declarations for any variable you reference but don't

    declare in the nested block. When you follow this and other

    guidelines I present below, all references to variables not

    declared in a nested block resolve to a local or import

    declaration in an outer block. I rarely use nested blocks, except

    with "loop" macros, such as those I presented in Chapter 3. And

    in those cases, import declarations don't add any real protection

    or improve clarity. However, if you prefer to explicitly declare

    every variable in a nested block, you can either use the IMPORT

    macro (which I'll introduce in a moment) or define a similar

    macro to more specifically express your intent.


    C-through Macros

    The best way to declare share, export, and import variables is to

    define macros for the required C syntax. Figure 5.2 shows three

    suggested macro definitions and how to use them (along with

    rules for declaring local variables). For variables you want

    visible throughout the file, use the SHARE macro before the

    file's first function definition to declare (and define) each

    variable. For example,

    SHARE int x;

    This statement expands to

    static int x;

    which gives x the desired visibility. Note that you never use

    SHARE inside a function because it would just define a local

    variable.
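
    The sketch below illustrates that warning. The SHARE definition is
    the one described above; the function names and initial values are
    assumptions made for the example. Used before the first function,
    SHARE creates a file-level variable; used inside a function, it
    creates a separate local static variable that hides the file-level
    one.

```c
#define SHARE static

SHARE int x = 1;            /* file-level: visible to every function below */

int read_x( void ) {
    return x;               /* refers to the file-level x */
}

int misuse_share( void ) {
    SHARE int x = 100;      /* a local static; hides the file-level x */
    x++;
    return x;
}
```

    Calling misuse_share changes only its own private x; read_x keeps
    returning the file-level value.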

    Figure 5.2

    How to Declare C Variables and Functions

    "Visibility" Macros

    #define SHARE  static
    #define EXPORT
    #define IMPORT extern

    Declaring Variables

    Visibility  Storage    Where referenced  Storage/visibility specifier(s)
    ----------  ---------  ----------------  --------------------------------
    Local       Automatic  In block          Define at beginning of block
                                             with no specifier
    Local       Static     In block          Define as static at beginning
                                             of block
    Nonlocal               In nested block   (No declaration, use implicit
                                             extern)
    File        Static     In function       Define as SHARE before first
                                             function in file
                                             Declare as IMPORT at beginning
                                             of each function where
                                             referenced
    Program     Static     In function       Define as EXPORT before first
                                             function in file where variable
                                             is to be allocated
                                             Declare as IMPORT before first
                                             function in file(s) where
                                             variable is used


    the variable with IMPORT at the beginning of the function:

    void func1( void ) {
        IMPORT int x;
        ...
    }

    The IMPORT declaration adds the necessary extern specifier

    and clearly shows that the function uses a nonlocal variable. A

    variable declared as IMPORT in a function will resolve to one of

    the file's SHARE or IMPORT external declarations.
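
    A minimal sketch of this pattern, assuming the SHARE and IMPORT
    definitions from Figure 5.2 and two hypothetical functions: the
    variable is defined once with SHARE before the first function, and
    each function that uses it says so with an IMPORT declaration.

```c
#define SHARE  static
#define IMPORT extern

SHARE int x = 0;            /* defined once, before the first function */

void add_to_x( int amount ) {
    IMPORT int x;           /* expands to extern int x; resolves to the SHARE x */
    x += amount;
}

int current_x( void ) {
    IMPORT int x;           /* flags the use of a nonlocal variable */
    return x;
}
```

    The IMPORT declarations allocate no storage; both functions operate
    on the single file-level x.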

    In files where you want to use a variable exported from another

    file, declare the variable with IMPORT once at the beginning of

    the file and once in each function or block where you want to

    use the variable. For example,

    /*
     | Before first function
    */
    IMPORT int x;
    ...
    void func1( void ) {
        IMPORT int x;
        ...
    }

    The first IMPORT declaration adds the necessary extern

    specifier and clearly indicates that x is defined in another file

    and that x may be used in other files (as well as in the current

    file and in the one that defines it). Like the previous example,

    the second IMPORT identifies a nonlocal variable used in a

    function. In this case, the two IMPORT declarations are clearly

    warning that func1 uses a variable that functions in other files

    also might use.
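
    The following sketch puts the exporting and importing sides next to
    each other. In a real program the two parts would live in separate
    source files; they are combined into one listing here only so the
    example compiles as a unit. The variable name, the function name,
    and the file names in the comments are hypothetical.

```c
#define EXPORT
#define IMPORT extern

/* --- Defining file (hypothetical totals.c): allocates the variable --- */
EXPORT int order_total = 0;

/* --- Using file (hypothetical report.c) --- */
IMPORT int order_total;         /* before first function: defined elsewhere */

void add_order( int amount ) {
    IMPORT int order_total;     /* this function uses a nonlocal variable */
    order_total += amount;
}
```

    Only the EXPORT line allocates storage; every IMPORT line is a
    declaration that refers to that one definition.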


    Finding Your C Legs

    SHARE, EXPORT, and IMPORT let your declarations "say

    what they mean and mean what they say." You may prefer

    different macro names; almost any descriptive names produce

    clearer code than C's standard syntax. I picked up the idea for

    the macros from Steve Schustack's suggestion in Variations in C

    for similar macros he calls GLOBAL, SEMIGLOBAL, and

    IMPORT. I think EXPORT indicates better than GLOBAL the

    role of a variable allocated in one file and made available to

    other files, and SHARE seems more informative than

    SEMIGLOBAL. But whatever names you choose, these simple

    "visibility" macros and guidelines for declarations will let you,

    and anyone who reads your programs, C clearly now.

    C Coding Suggestions

    * Declare functions static if you intend to call them only from within

    the same source file.

    * Use the most restricted visibility possible for variables; avoid

    shared variables.

    * Put all external declarations before the first function definition in a

    file.

    * Put functions that must share data and external declarations for

    their shared variables in a file by themselves.

    * Use EXPORT, SHARE, and IMPORT macros to clarify the

    intended visibility of a variable.


    The fees function will be compiled and executed without raising

    an exception. But every call to fees will produce the same $100

    registration fee and $50 activity fee, regardless of age or income.

    In this example, the third and fourth assignment statements

    increment the values of rfee and afee, which are addresses

    (pointers), not the integer values stored at these two addresses.

    The assignment statements' targets should be *rfee and *afee.

    The compiler, however, can't tell the original version is wrong

    because addition operations are legal on both pointer and integer

    variables.
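
    The difference is easy to see in a small sketch. The function below
    is a hypothetical stand-in, not the original listing: the $100 and
    $50 base fees come from the text, but the age-based adjustments are
    invented for illustration. The assignments are written with the
    dereferenced targets, and a comment marks what the mistaken form
    would do.

```c
/* Hypothetical stand-in; the adjustment amounts are assumptions. */
void fees_by_age( int * rfee, int * afee, const int age ) {
    *rfee = 100;
    *afee = 50;
    /*
     | Writing "rfee += ..." here would also compile, but it would
     | change the address held in rfee, not the fee stored there.
    */
    *rfee += ( age >= 60 ) ? -25 : 0;
    *afee += ( age >= 60 ) ? -10 : 0;
}
```

    With the asterisks in place, a caller that passes the addresses of
    two int variables gets back fees that actually vary with age.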

    C's lack of output parameters forces C programmers to

    explicitly handle addresses and dereferencing (i.e., referencing

    the storage pointed to by a pointer) to return more than one value

    from a function. Combined with C's overloading of arithmetic

    operators for both integer and pointer arithmetic, dereferencing

    can easily trip you up. A good high-level language (HLL) should

    support output parameters so you don't need pointers and

    dereferencing to return multiple procedure values. (The C

    development community recognizes this C deficiency and has

    added references, which can be used for return parameters, to

    C++. But no such facility is planned for C itself.)

    HLLs suitable for business programming also should either

    prohibit direct address modification (i.e., pointer arithmetic) or

    provide distinct functions for modifying addresses so such

    operations stand out in the code rather than appear as ordinary

    arithmetic operations. As I've emphasized in previous chapters,

    C was designed as a portable assembly language, and when

    you're programming at the machine level, it's logical to treat

    addresses as integers. At the business application level, however,

    machine addresses shouldn't be visible, much less easily

    confused with ordinary numbers.

    You won't find a foolproof way to use dereferenced pointer

    parameters. If you try to code operands such as *rfee and *afee

    throughout a function, you'll eventually slip up and omit the *.

    Finding the mistake may not be easy. But a simple coding

    practice will lead you around the pitfall: For non-array output

    or input/output parameters, use local variables instead of

    dereferenced parameters in function calculations.

    Figure 6.2 shows the fees function rewritten to use two local

    variables in the calculations. The function's last two statements

    assign the calculated values to the locations pointed to by the

    pointer parameters. This technique isolates and simplifies

    dereferencing and can significantly reduce errors. Figure 6.3

    shows how to handle in/out parameters by initializing the local

    variables to the dereferenced parameters.

    Figure 6.2 Using Local Variables Instead of Dereferenced

    Parameters

    void fees( int * rfee,
               int * afee,
               const int age,
               const int income ) {
    /*
     | Calculate fees as base plus adjustment based on age
     | and income
    */
        int reg_fee = 100;
        int act_fee = 50;

    reg_fee += ( age >= 60 ) ? ( income = 60 ) ? ( income = 60 ) ? ( income = 60 ) ? ( income