Sequence points in C
Ever wondered what the value of the expression i++ * ++i is in C/C++ when
i = 1? If you have a compiler, try it once and work out in your head why you
got that result. But before racking your brain, try the same expression with a
few more compilers; you may find them giving different answers for the same
expression. What is the reason behind this? Fortunately, we have a written
specification for the language from the ISO standards committee. It defines
what the result of an expression should be, and all modern compilers follow
that standard. So is something wrong with these tools, or with the way the
expression is evaluated? Before going into the details of this expression, we
can look at some terms defined in the C standard.
Undefined Behavior: behavior, upon use of a non-portable or erroneous program
construct or of erroneous data, for which the International Standard imposes no
requirements.
An example of
undefined behavior is the behavior on integer overflow.
Unspecified Behavior: use of an unspecified value, or other behavior where the
International Standard provides two or more possibilities and imposes no
further requirements on which is chosen in any instance.
An example of
unspecified behavior is the order in which the arguments to a function are
evaluated.
Thus, the language can have some behavior which is undefined or unspecified.
In such situations, the output of the program depends on the implementation,
in other words the compiler. For unspecified behavior the compiler developer
is free to choose among the possibilities the standard allows; for undefined
behavior the standard imposes no requirements at all, so any result is
permitted. As good programming practice, a programmer should never write code
that gives different output on different implementations. Now the answer to
the first question is somewhat clearer: the expression must involve undefined
or unspecified behavior, and hence the differing results. But which rule makes
it so? The rule about sequence points.
Sequence points
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.
In simple words, a side effect (a change in the execution environment) means a
change to an object or a file. These changes accumulate as execution
progresses, so there must be a mechanism that defines when all of them are
guaranteed to have taken effect. This mechanism is the sequence point. A
change to a variable need not be written back to memory the moment it happens;
it only has to be complete by the next sequence point. That is, all side
effects since the previous sequence point will be complete at the next one.
So, if a variable is changed multiple times between 2 sequence points, which
value will the next operation see? The one just modified in the expression, or
the one stored at the previous sequence point? The standard does not say.
Instead, it requires that the stored value of an object be modified at most
once between two consecutive sequence points. An expression that breaks this
rule, as i++ * ++i does, has ‘Undefined Behavior’.
According to the C standard, there are seven sequence points:
- The call to a function, after the arguments have been evaluated.
- The end of the first operand of the following operators: logical AND '&&',
  logical OR '||', conditional '?', comma ','.
- The end of a full declarator.
- The end of a full expression: an initializer, the expression in an
  expression statement, the controlling expression of a selection statement
  (if or switch), the controlling expression of a ‘while’ or ‘do’ statement,
  each of the expressions of a ‘for’ statement, the expression in a return
  statement.
- Immediately before a library function returns.
- After the actions associated with each formatted input/output function
  conversion specifier.
- Immediately before and immediately after each call to a comparison
  function, and also between any call to a comparison function and any
  movement of the objects passed as arguments to that call.
Each declarator declares one identifier, and a full declarator is a declarator
that is not part of another declarator. The last point deals with the
searching and sorting functions defined in the C library (bsearch and qsort).
Their comparison functions are passed as function pointers and are called from
the search/sort function (callback functions). In general, this can be grouped
with the first point to make a more general one:
“The call to a function or a callback function”.
A good coding practice: avoid changing a single object more than once between
2 consecutive sequence points. Otherwise the output will be compiler
dependent, and if you have any such undefined or unspecified behavior in your
code, you can only pray that the compiler never changes, or that a new version
of it does not evaluate the expression differently. If it does, you will end
up with unexpected bugs from unexpected places!