Reviewing 'C' - Part 1


'C' is a function-based language; in fact, every 'C' program starts with a function.  Every 'C' program must have a function named "main", which is the function that is executed first when the program is run.  The use of functions is a topic that winds through these notes.

First Observations

Consider these observations regarding the 'C' language:

While 'C' is free-form, you are encouraged to use white space characters in a way that makes your code more readable.  I have found that formatting a program and providing good comments not only makes your code easier to read and easier to debug, but also much easier to maintain.  So 'do-it!' Be sure to study the example programs in these notes as well as example programs from other sources.

Be aware that when submitting an assignment for credit, each program must have an opening comment that contains the file name, your name, the date (month, day, year), and a brief description of what the program does.  At the very least, this information is helpful in case your program is separated from the rest of your assignment.  If program readability becomes a problem, an official policy will be announced.

A first program

This short program helps to illustrate several topics.  The double slashes start a comment that ends at the end of the line.
// hello.c - Krista Hill - Dec. 26, 2003
// The traditional first 'C' program
#include <stdio.h>
int main(void)
{
  printf("Hello World!\n");
  return 0;
}

The 'C' compiler includes a so-called preprocessor that performs basic secretarial work before the file is actually compiled.  Commands directed at the preprocessor, called preprocessor commands, are identified by a leading # symbol.  The #include command inserts the named file into the compiler input stream.  A file name surrounded by angle brackets < and > is searched for in a predefined region of the file system.  The file stdio.h contains input/output definitions.  Conversely, a file name surrounded by double quotes " " is searched for first in the project or working directory.

Early 'C' allowed a function declaration to omit its return type, in which case int was assumed.  Good practice, and the later C'99 standard, require each and every function declaration to inform the compiler of its return type.  The keyword int means integer.  The C'89 standard further requires main() to return a value of type int, but many programmers do not adhere to such standards, forcing developers to write flexible compilers.  In any case, this example follows the general Unix notion that a program returns the integer value zero to indicate successful completion.

A block of code, called a code block, can be one statement but is more often several statements contained between curly braces { }.  The body of main is formed by one such code block.  The name printf is not a keyword; it refers to a library function that prints out the character string "Hello World!\n".  The '\n' symbol at the end of the character string causes a newline character to be printed.

In the context of embedded systems this example is particularly silly.  As for what value to return, in many cases there is no operating system to return anything to.  The compiler for such a processor may have no predefined standard output through which it can send the greeting.  Part 3 of the Metrowerks Code Warrior tutorial for the 68HC12 processor walks you through the steps of implementing an embedded version of the program.

Built-In Types, Variables, and Constants

Each variable in a program must be defined prior to being used.  When a variable is defined, space is allocated in memory for its storage.  'C' has only a few built-in types.

char - typically 8 bits, used to store an ASCII character or a small integer value
int - typically the size of a basic unit of storage in the target machine - an integer is at least 16 bits
float - a so-called single precision floating point number
double - a so-called double precision floating point number

Additional qualifiers modify the basic types.  Consider the following:

short - a short int has no more bits than an int.
long - a long int has no fewer bits than an int.
signed - integers are signed by default
unsigned - modifies the integer type to be non-negative

The conventions for writing constants are straightforward.  A numeric constant with no decimal point and no exponent is an integer.  To make a constant long use the l or L suffix.  For example 6047 is an integer and 6047L is a long integer.  The u and U suffixes cause a constant to be unsigned.  Octal notation (base 8) is implied by prefixing an integer with a leading zero.  Hexadecimal numbers (base 16) are prefixed with 0x or 0X.

All floating point number constants are of type double by default.  The f and F suffixes are used to produce a single precision floating point constant.  When used with floating point numbers, the l and L suffixes generate a so-called long double constant.

Arithmetic operations are carried out between operands of the same type.  The rules of automatic promotion handle many of the cases between differing types.  In mixing integer and floating point types, integers are converted to a floating point type.  A type-cast is a deliberate type conversion; only the value immediately to the right of the type-cast operator is converted to the new type.

Ival = (int)Fval;
In this example Fval is of a floating point type and Ival is of type int.  The conversion discards the fractional part of Fval.

A Second Example

A variable of type int is used to represent signed numbers, typically using two's complement notation, where ni is the number of bits.  The sizeof operator, which is part of the language and not a macro, gives the number of bytes used to represent a thing.  Suppose that an int is two bytes so that ni is 16 bits:

Signed Integer with ni = 16
Largest value possible:   2^(ni-1) - 1 = +32767
Smallest value possible: -2^(ni-1)     = -32768

The program intest.c makes use of the sizeof operator as well as a simple loop structure that we discuss in a moment.  Since a is of type int, the expressions sizeof(int) and sizeof(a) will return the same value.

/**********************************************************
 * intest.c - Krista Hill - Dec. 26, 2003
 * Test the size of an int
 *********************************************************/
#include <stdio.h>
int main()
{
  int a, n = 1;
  
  printf("An int is represented by %d bytes\n", (int)sizeof(a));  // cast so the value matches %d

  a = 1;
  while (a > 0) {
    a = a << 1;
    n = n + 1;
  }

  printf("ni = %d min = %d\n", n, a);
  return 0;
}
To understand how the rest of the program works, consider that the two lines following the while keyword are executed repeatedly, for as long as a is greater than zero.  Each time, the value in a is shifted one place left.  Ordinarily such a shift has the same effect as multiplying the value by two.  But the loop exits when a shift moves the bit into the sign-bit position, causing the result to become negative.  In this case the final result is the most negative possible value.  Strictly speaking, the C'99 and later standards leave the result of shifting a one into the sign bit undefined, so treat this program as a demonstration of how typical two's complement machines behave rather than as portable code.
C:\CCODE>intest
An int is represented by 2 bytes
ni = 16 min = -32768
The keywords long and short also refer to integer types; a short is typically represented with fewer bits than a long.  The keyword unsigned is a modifier indicating that an integer type can only represent non-negative values.

Signed Integer with ni = 16
Largest value possible:   2^(ni-1) - 1 = 32767
Smallest value possible: -2^(ni-1)     = -32768

Unsigned Integer with ni = 16
Largest value possible:   2^ni - 1 = 65535
Smallest value possible:  0

Logical and Bitwise Operations

Logical and bitwise operators behave differently in that a logical operator treats an entire integer value as true or false, while a bitwise operator treats each bit in an integer separately.  A logical operator treats the integer value zero as false and all non-zero values as true.  The following are the logical operators.

||   Logical OR
&&   Logical AND
!    Logical NOT

The following are examples:

2 && 1   Returns true
0 && 0   Returns false
0 || 5   Returns true
!1       Returns false

The following operators perform comparison.  Each returns the int value 1 to represent a true result and 0 to represent a false result.

Test      Description
A < B     A less than B?
A <= B    A less than or equal to B?
A > B     A greater than B?
A >= B    A greater than or equal to B?
A == B    A equal to B?
A != B    A not equal to B?

Logical operators provide a useful means to combine tests, as in the following examples:

A > B && C > 5
A <= B || P != 0

The following lists the bitwise operators.  As previously indicated, these operators relate to the individual bits of an expression.

|    Bitwise inclusive OR
&    Bitwise AND
~    Bitwise NOT
^    Bitwise exclusive OR

The following program serves to illustrate the difference between bitwise and logical operators.  In each printf() function call, the characters 0x%.2x cause the value to print as at least two hexadecimal digits, preceded by the characters 0x.

/************************************************
 * bitx.c  - Krista Hill
 * Illustrate difference between logical and
 * bitwise operators.
 ***********************************************/
#include <stdio.h>
int main()
{
  int a, b, c, d;

  a = 0x00 | 0x00;  b = 0x00 || 0x00;
  printf("0x00 | 0x00 = 0x%.2x  0x00 || 0x00 = 0x%.2x\n", a, b);

  c = 0x0C | 0x0A;  d = 0x0C || 0x0A;
  printf("0x0C | 0x0A = 0x%.2x  0x0C || 0x0A = 0x%.2x\n", c, d);

  a = 0x00 & 0x00;  b = 0x00 && 0x00;
  printf("0x00 & 0x00 = 0x%.2x  0x00 && 0x00 = 0x%.2x\n", a, b);

  c = 0x0C & 0x0A;  d = 0x0C && 0x0A;
  printf("0x0C & 0x0A = 0x%.2x  0x0C && 0x0A = 0x%.2x\n", c, d);

  return 0;
}

The following is the program result.  Make sure that you know the difference between logical and bitwise operators.  Because the two kinds of operators look alike and sometimes produce the same result, confusing them can hide certain bugs in your code.

$ bitx
0x00 | 0x00 = 0x00  0x00 || 0x00 = 0x00
0x0C | 0x0A = 0x0e  0x0C || 0x0A = 0x01
0x00 & 0x00 = 0x00  0x00 && 0x00 = 0x00
0x0C & 0x0A = 0x08  0x0C && 0x0A = 0x01

Combined Operators

Combined operators represent commonly used combinations of operators.  Such combined operators are more concise and often map more easily to assembly language than the operators they represent.  We start with the increment and decrement operators.

++i   means:   i = i + 1
--i   means:   i = i - 1

To use such operators in a larger statement it is necessary to understand the difference between the pre and post position in a combined operator.  The operator ++i is said to pre-increment, returning the incremented value.  The operator i++ is said to post-increment, as it returns the original value of i.  The decrement operators --i and i-- behave the same way.  To make this more clear, suppose that the variable i initially equals 3.

PRE-INCREMENT      POST-DECREMENT
j = 2 * ++i;       j = 2 * i--;
i becomes 4        i becomes 2
j becomes 8        j becomes 6

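In general, play it safe with combined operators.  Do not write statements that modify an object more than once.  Within such a statement, the prior value of a modified object should be read only to determine the new value to store.  The 'C' standard leaves the behavior of the following statements undefined, so different compilers may produce different results, and not every compiler will warn you.  Treat them as disallowed; such expressions are confusing to compilers and humans alike.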

Disallowed Examples
j = j++;
a[j] = j++;
m = j++ + ++j;

The assignment operator is also combined with others to form additional combined operators.  The following is a summary:

Description      General Statement   Combined Operator   Value Before   Value After
Left Shift       x = x << 1;         x <<= 1;            2              4
Right Shift      x = x >> 1;         x >>= 1;            2              1
Addition         x = x + 2;          x += 2;             2              4
Subtract         x = x - 3;          x -= 3;             2              -1
Multiplication   x = x * 6;          x *= 6;             2              12
Division         x = x / 2;          x /= 2;             6              3
Modulus          x = x % 4;          x %= 4;             7              3
Bitwise OR       x = x | 3;          x |= 3;             6              7
Bitwise AND      x = x & 3;          x &= 3;             6              2
Bitwise XOR      x = x ^ 3;          x ^= 3;             6              5

Bit Twiddling

With respect to device registers, it is sometimes necessary to clear or set bits, that is make them zero or one, respectively, while leaving other bits unmodified.  A programmer can use at least two methods to express such operations.  Bitwise operations provide a simple explicit method, which we discuss here.  Data structures provide a second, slightly more convenient and abstract method, which we discuss later.

Bitwise instructions are useful for manipulating individual bits in device registers.  If individual bits need to be cleared or need to be set, it is simple enough to use a single combined operator expression.  The bitwise OR sets bits and bitwise AND is convenient for clearing bits.  In the following example, assume that REGX and REGY are the names of registers defined in a device specific header file.  The named register appears to be a variable, but is usually declared as being volatile and is mapped to the actual device address in the memory space.  Use of the bitwise NOT '~' helps to make the code more general as well as easier to understand and maintain.

REGX |= 0x80;     // make bit 7 in named register high
REGY &= ~0x18;    // make bits 3, 4 in named register low

Note in the second example above, the bitwise NOT operator '~' is preferred to simply stating the mask value, as the former makes fewer assumptions about the hardware, such as the actual size of the register.  While stating the actual mask value may work now, with fewer assumptions the code is easier to maintain.  The idea is to shift such burdens from ourselves to the compiler.  In the long run, having the compiler determine the number of bits in the register and determine the required mask is less likely to introduce errors.

Some registers interpret a write as a command, so special care is required if some bits are cleared and others set.  Consider an 8 channel analog to digital converter, where the three least significant bits express which channel to acquire.  While the intent of the following is to perform one conversion on channel 2, two conversions may be performed, because each statement writes to the register.

ADC &= ~0x07;
ADC |= 0x02;

In such cases, a programmer can use a temporary variable to clearly state intent.  Consider the following example:

tmp = ADC & ~0x07;
ADC = tmp |  0x02;

Conditional Assignments

This is one of the oddities that makes 'C' the endearing language that it is.  We start with an example.  Suppose a variable named val1 is known to take on positive values, and its value is to be copied to val2, but only if val1 is less than ten.  Otherwise, the value ten is assigned. 

val2 = (val1 < 10)? val1 : 10;

The expression before the question mark '?' is a logical expression that produces a true (non-zero) or false (zero) value.  If the expression is true, the value to the left of the colon ':' is assigned.  Otherwise the value to the right is assigned.  If you find conditional assignments to be useful, make a point to keep them relatively simple.  Dense complicated code is confusing to the eyes.  In particular, don't test and assign auto-incremented or auto-decremented variables.

The following program makes use of the above conditional assignment.

/*********************************************
 * TestAssn.c - Krista Hill - Dec. 26, 2003
 * Example with conditional assignment
 ********************************************/
#include <stdio.h>

int main()
{
  unsigned val1, val2;
  printf("TestAssn program output\n");

  val1 = 4;
  val2 = (val1 < 10)? val1 : 10;
  printf("val1 = %u, val2 = %u\n",val1,val2);  // %u matches the unsigned type

  val1 = 12;
  val2 = (val1 < 10)? val1 : 10;
  printf("val1 = %u, val2 = %u\n",val1,val2);

  return 0;
}

The following is the corresponding program output.
$ TestAssn
TestAssn program output
val1 = 4, val2 = 4
val1 = 12, val2 = 10

Operator Precedence

Consider how an expression is evaluated.  The concept of precedence is a set of rules that determines the order in which operations are performed.  Following mathematical tradition, multiplication and division are performed before addition and subtraction.  Multiplication and division are said to have higher precedence than addition and subtraction; the assignment operator '=' has even lower precedence.
F = 2 + 4 * A   means:   F = (2 + (4 * A))

In any expression grammar, operators are grouped into precedence levels.  In the table below the levels are numbered by order, from order 1, whose operators are applied first, to the highest order number, whose operators are applied last.  The number of levels depends on the language.  The 'C' language has 15 precedence levels.  For operators at the same level, an associativity rule controls the direction in which the operators are grouped.

The following table summarizes all the 'C' operators.  As implied above, we use parentheses to cause operators to evaluate in a different order.  Besides the familiar topics, the table also refers to things we have not discussed yet.  In reading along, be sure to return to this table from time to time.  Notice that some operators such as parentheses, the minus sign and * have several meanings, based on the context where they are used.  Parentheses are also associated with function calls.  When used with one number, the minus sign means form the opposite value.  Likewise * can mean multiplication or pointer reference.  Finally, while most operators of the same order group left to right, there are notable exceptions.  For example, the assignment operators group right to left.

Summary of 'C' Operators
Order  Associativity  Operator      Description
1      Left-to-right  ( )           Function call
                      [ ]           Array element reference
                      ->            Pointer to structure member reference
                      .             Structure member reference
2      Right-to-left  -             Unary minus
                      ++            Increment
                      --            Decrement
                      !             Logical negation
                      ~             Bitwise invert
                      *             Pointer reference
                      &             Address
                      sizeof        Size of an object
                      (type)        Type cast (conversion)
3      Left-to-right  *             Multiplication
                      /             Division
                      %             Modulus
4      Left-to-right  +             Addition
                      -             Subtraction
5      Left-to-right  <<            Left shift
                      >>            Right shift
6      Left-to-right  <             Less-than
                      <=            Less-than or equal
                      >             Greater than
                      >=            Greater than or equal
7      Left-to-right  ==            Equality
                      !=            Inequality
8      Left-to-right  &             Bitwise AND
9      Left-to-right  ^             Bitwise XOR
10     Left-to-right  |             Bitwise OR
11     Left-to-right  &&            Logical AND
12     Left-to-right  ||            Logical OR
13     Right-to-left  ?:            Conditional
14     Right-to-left  =  *=  /=     Assignment operators
                      %= +=  -=
                      &= ^=  |=
                      <<=  >>=
15     Left-to-right  ,             Comma operator

Just about every good 'C' programming book contains this table summary of 'C' operators.  The table here is from Stephen G. Kochan's revised Programming in C, copyright 1988.  That particular text includes an ample supply of examples.

The printf Function

The first argument to printf() is the format string, which specifies how the remaining arguments are used in forming the output.  The % symbol, together with a conversion character and optional modifier characters, is used to precisely format output.  We saw how %d causes an integer to print in decimal format.  We also saw how to print an integer in hexadecimal format.  The general format of a print conversion specification is as follows:
%[flags][width][.prec][l]type
Print Conversion Specification Table
flags   -        left justify in field
        +        precede with sign (+ or -)
        (space)  place space before positive value
width   Minimum size of the field; * means take the next argument as the field width value
prec    For integers, the minimum number of digits to display.  For floating values, the number of digits after the decimal point (f, e, E) or the number of significant digits (g, G).  For character strings, the maximum number of characters.  A * means take the next argument as the size.
l       Display long version - usually an integer
type    Type conversion character
The next table lists the conversion type characters.
Conversion Characters
Char.   Use for printing
d       Integers
u       Unsigned integers
o       Octal numbers
x       Hexadecimal using a-f
X       Hexadecimal using A-F
f       Floating point numbers
e       Floating point with e as exponent
E       Floating point with E as exponent
g       Floating point in f or e format
G       Floating point in f or E format
c       A single ASCII character
s       Null terminated character string
%       Print a percent symbol

Since the percent symbol % has special meaning in a printf() format string, a special specification is needed to print one out.  Use %% to print out a single percent symbol.

Homework Problems:

  1. Enter, compile, and execute intest.c, but modify the code so that n and a are of type char.  Explain for ni = 8 what the largest and smallest values are that may be represented.

  2. Modify the program intest.c in the previous problem so that it produces ni, the most negative number and the most positive number that may be represented.  Hint: The most negative number overflows to the most positive. Explain in a few sentences along with any suitable figures how your program works.

  3. To multiply an integer value by a power of two, it is convenient to simply perform a left shift operation.  Provide the results for the following expression, first assuming left grouping, and then right grouping.  Examine the list of operator precedences given in this document and report the actual grouping direction of the shift operator used in 'C'.  Assume the use of an integer large enough to store the result.

    F = 2 << 3 << 2

  4. Explain why the parentheses in the following are an essential part of the expression.  Pick values for A, B, C and produce results, with and without parentheses, that support your statement.

    F = (A || B) && (A || C)

  5. Show that the results in the example program bitx.c are correct.  Convert the values 0x0C and 0x0A to binary, perform the bitwise and logical AND and OR operations, then convert the results back to hexadecimal.

  6. Use the example program bitx.c to inspire the program inctest.c that verifies the pre-increment and post-decrement examples given earlier.  For each case, i initially equals 3.  For each expression print the new values for i and j.

  7. Copy the file TestAssn.c given above to TestEx2.c and make the following changes:

    For homework produce a printout of the new program and the program output.

Please let me know that you read my web pages.

This supplemental set of notes is written for the computer engineering students at the University of Hartford.  Copyright is reserved by the author, but copies of this document may be made for educational use as-is, provided that this statement remains attached.  The author welcomes corrections, comments, and constructive criticism. 
Original Author: Krista Hill kmhill@hartford.edu Date: Wed Dec 10 22:32:32 EST 2003
Revised: Wed Jan 30 18:59:51 EST 2008