Unsequenced functions v3

Étienne Alepins (Thales Canada) and Jens Gustedt (INRIA France)

org: ISO/IEC JCT1/SC22/WG14 document: N2825
target: IS 9899:2023 version: 3
date: 2021-10-12 license: CC BY

Revision history

Paper number Title Changes
N2477 Const functions Initial version
N2539 Unsequenced functions Supersedes N2477 WG14 poll: 15-0-1
new wording (extracted from N2522)
no application to the C library
N2825 Unsequenced functions v3 Supersedes N2539
no attribute verification imposed
support for function pointers
optional text for inclusion of lambdas

Purpose

Multiple compilers on the market support the concept of const and pure functions (so coined by GCC), also sometimes called no-side-effects functions. In that terminology, a const function is a function that does not depend on or modify the global context of the program: it only uses its input parameters to modify its return value (GCC doesn’t allow a const function to have output parameters). To avoid ambiguities with the properties of the const qualifier, the new attribute unsequenced is proposed which defines that a call to such a function can be executed unsequenced with respect to unrelated evaluations, namely as soon as its input is available and as late as its result is used by another statement. Although widely supported, this concept is currently absent from the C standard. Unsequenced property can be gradually built from sub-properties:

It is proposed to add support to all those functions attributes in C23 such that functions that aren’t unsequenced can still benefit some optimisations.

Use cases

Unsequenced functions can be leveraged in multiple ways to enhance performance and reduce code size. Optimization is key in a compiler: it allows adding more features to a given system or conversely selecting lower-power CPUs to perform the same tasks.

Global variables reload

Model-based systems define software using a series of graphical blocks such as confirmators, delays, digital filters, etc. Code generators are often used to produce source code. Some of these generators use global variables to implement the data flow between models. Each block is implemented by a call to a library function. Let’s take a model with two confirmator blocks:

extern bool bInput1, bInput2, bInit, bInitState1, bInitState2;

extern int uConfTime1, uConfTime2;

void foo() {

    static int uCounter1, uCounter2;

    ...

    bOutput1 = CONF(bInput1, uConfTime1, bInit, bInitState1, &uCounter1);

    bOutput2 = CONF(bInput2, uConfTime2, bInit, bInitState2, &uCounter2);

    ...

}

where all these are global variables. bInit represents the initialization state of the whole system (for CONF, it resets the counter output parameter to zero). Since the compiler must assume CONF may be modifying bInit (through a direct access to this global variable), it is forced to load it’s value once before the first CONF call and a second time before the second CONF call. The second load can be avoided if the compiler is told CONF is an unsequenced function. It will then re-use bInit value from a register or stack, thus reducing the number of assembly instructions, i.e. potentially saving code size and performance. Code generators are not always flexible enough to put bInit in a local variable before calling blocks in order to work around compilers not supporting unsequenced functions.

In large model-based systems, there can be hundreds of such optimization opportunities, thus having a major overall positive impact.

Function call merge

Multiple use cases of potentially unsequenced function exist in the standard library, taking for example the cos <math.h> function:

if ((cos(angle1) > cos(angle2)) || (cos(angle1) > cos(angle3))) {
    ...
}

In this case, the compiler will invoke four times the cos function, with three different parameters. If cos would be declared unsequenced, the compiler could have merged the first and the third calls, reusing the result of the first call for the third, thus reducing CPU throughput and code size.

A large portion of the <math.h> functions fit the unsequenced objective for most of their input domain. However, in case of errors, these functions may also modify errno or the floating-point environment, which contradicts the strict unsequenced property as formulated above. Special care has to be taken to allow the functions to be declared unsequenced and in a manner where additional constraints can be spelled out. More work needs to be done on that topic and this paper does not propose any attribute addition to standard library functions, yet.

Prior art

GCC

int square(int) __attribute__ ((const));
int hash(char*) __attribute__ ((pure));

The const attribute is implemented since GCC 2.5 and pure since GCC 2.96.

GCC distinguishes const from pure:

We think that these definitions go a bit too short, because they are centered around interoperability restrictions and not around the effective properties that we want for the corresponding functions.

Below, we will propose features that extend these specifications, namely unsequenced and idempotent. These extensions are done in a way that the implementations of the GCC attributes are valid implementations, in the sense that they don’t check for validation of the requested properties, anyhow, and that the wider permissions that we give still make the targeted optimizations valid.

Surprisingly, then one of the attributes that we propose, independent, also comes close to another GCC attribute, malloc, namely for functions that have no pointer parameters but a pointer return value. For such an independent function it can be concluded that, if repeatedly called with the same arguments, it will either return a null pointer, a pointer to always the same static allocation, or a pointer to a freshly allocated object at each call.

LLVM Clang

Clang supports all GCC attributes.

WindRiver Diab 5.x

#pragma pure_function function({global | n}, ...),  ...
#pragma no_side_effects function({global | n}, ...), ...

These pragma exist in Diab compiler since at least version 4.4. Diab uses pure_function for unsequenced functions and no_side_effects for idempotent functions.

In addition to the basic idempotent/unsequenced, Diab allows functions to still access some specially marked global variable. This is achieved by passing parameters to the pragma. It is the only compiler to our knowledge that goes beyond the GCC const/pure concept. As proposed in N2644, this could be used to solve the errno and floating-point environment challenge of standard library functions. Nevertheless, we think that the general idempotent and unsequenced properties are currently more widely supported and so we concentrate this paper on these.

Diab 5.x compiler is a WindRiver proprietary compiler, not based on GCC or LLVM. It is in particular used in the highly popular WindRiver VxWorks RTOS for aerospace, automotive and industrial. It supports C99 (C11 experimental) and C++14 (except thread_local, some Unicode types and some library functions related to atomic, chron/signal, thread, filesystem and localization).

Not to be confused with Diab 7.x which is a branch based on LLVM.

GreenHills MULTI

__attribute__((const))
__attribute__((pure))

MULTI supports GCC const and pure attributes since at least version 4.0.

Ada language – GNAT

pragma Pure_Function ([Entity =>] function_LOCAL_NAME);

AdaCore GNAT Ada uses Pure_Function for const functions, i.e. functions where “the compiler can assume that there are no side effects, and in particular that two calls with identical arguments produce the same result.”

SPARK language also has the “Global” aspects which is similar. Efforts are in progress to incorporate const/pure support in the next revision of Ada language.

Fortran

Fortran allows a function to be declared PURE, meaning that the function has no side effect. That is the equivalent of GCC const.

Standardization

Const/pure compiler-specific features should be promoted to language features in order to:

Implementation in C23

Unsequenced and its sub-properties fit well into the realm of the new attribute feature of C23, because conforming code that has these attributes would remain conforming if the attribute is removed or ignored. The use of the attribute feature is clearly advantageous over pragmas (attributes have well-defined applicability) and the introduction of new keywords which suggest a semantic change.

Therefore this paper proposes to use the new attribute system to implement the properties. The names “const” and “pure” are not used because their definition is split into 5 sub-properties, because “unsequenced” has not the exact same meaning as GCC “const” and because “pure” has been used in the industry for different semantics (Fortran, Ada, Diab compiler).

Attribute enforcement

GCC 8.3.0 does not seem to raise warnings when const functions modify global variables.

AdaCore GNAT Ada allows pure/const functions to modify global variables. This is to support cases like instrumentation (e.g. call counter), table-caching implementations, debug traces, etc.

WindRiver Diab const/pure attribute allows to specify exceptions, i.e. global variables that a const/pure function might still read/write. That may be used to work around above mentioned GNAT use case. However, it is proposed not to introduce this for now.

Modifying global context includes calling non-idempotent functions. Reading global context includes calling non-unsequenced functions.

Having compiler errors or warnings when code violate its properties would avoid source code bugs. Indeed, wrongly declaring as unsequenced a function might produce functionally incorrect executable code. Initial version of that paper required compiler enforcement of these properties. However, as pointed out by WG14 on August 2020 meeting, making that propagated type check might increase drastically compilation time and memory usage. Also, it would require attribute addition to existing code, some of which might be hard to change (e.g. OS headers). For these reasons, this paper proposes an “assume” flavor of the attributes meaning that:

Hence, this puts the verification burden on the developer. However, depending on the quality of implementation, compiler diagnostics could reduce that burden and increase safety, at least when function definitions are visible to the translation unit. Static analysis tools might also be used to cover that outside of the compilation process.

For the same reasons, this paper does not ask for attribute inference, i.e. that the compiler automatically tags a function with these properties by looking at its code.

Relationship with other language features

Function pointers

ISO/IEC 9899:202x “6.7.3 Type qualifiers” mentions that “If the specification of a function type includes any type qualifiers, the behavior is Undefined.” Hence, these properties do not conflict with const, restrict, volatile or _Atomic keywords.

This paper proposes these properties for function pointers as well. Indeed, function pointers would benefit from those attribute as much as functions.

inline

Although there is no conflict here, it may seem superficial to add any of these properties to an inline function since the translator sees the function’s body and can assess whether it has side effects or depends on the global context. However, it is proposed to allow that combination to allow the programmer to profit from the corresponding diagnosis and to explicitly enforce constraints where these properties can’t be automatically deduced.

_Noreturn

Although none of the two above mentioned use cases apply for a _Noreturn function, it might still be useful to declare a _Noreturn function with one of the properties: it allows the _Noreturn function programmer to add static check that it’s function does not rely on or modifies global variables. This combination is more likely to be useful for idempotent than for unsequenced, though.

nodiscard

nodiscard attribute requires function calls to use a function’s return value. This is compatible with the properties.

Rationale for some of the choices

The current implementations of these features leave a lot of the properties to be defined by the user, in particular which accesses would constitute an access to global state, be it for reading or writing. Here, this paper proposes to make three dedicated choices:

  1. An access to a const-qualified object is only considered invariant if the object is not additionally volatile-qualified. This is because any access to a volatile-qualified object may have a different result according to the context where it is placed, and in particular repeated such accesses can lead to different results.
  2. A call to an allocation function changes the global state of the current thread and thus at a first glance any such calls should be forbidden for idempotent or independent attributes. Nevertheless, we consider temporary allocations that are freed before the end of the function as not modifying the state. Because such temporary allocations may constitute an important use case, we think that these should be allowed and asserted by the noleak attribute.
  3. Calls to the special library function call_once have a special status, because they basically enforce that a certain state is reached before any execution may progress beyond such a call. The callbacks that are used for such calls usually change global state (otherwise there is not much point to them) and the change in the underlying once_flag by itself constitutes a state change. But since the guarantee is that the call is only effected exactly once, it can be thought to take place before any execution of the function that we want to annotate.

WG21 C++ liaison

These properties seem compatible with C++. It would be beneficial for the whole C and C++ communities that they be implemented the same way in both languages. Discussions for this harmonization have started in the joint WG14/WG21 SG.

A first observation has been that the emphasis of this proposal on call_once is very C specific, this C library function is not even integrated in C++. The reason is that C++ has other mechanisms to initialize static objects dynamically at startup. Integration of the proposed attributes into C++ should discuss these mechanisms instead.

The same observation holds for memory allocation; for C++ the required pairings for allocation and deallocation for the noleak attribute have to be extended by analogous properties for operators new and delete. Also the fact of returning a new object from a function should not be considered a leak, either.

Link-time optimization can achieve some of the optimization benefits these properties bring. However, in addition to lacking manually-enforced constraints addition capabilities, it is not always possible to enable LTO. For example, multiple products in safety-critical markets such as avionics, nuclear, automotive and railway explicitly disable LTO in order to keep testing credit obtained by testing individual object files. Moreover, not all linkers offer LTO.

Some differences with GCC const and pure

GCC forbids usage of pointer arguments to const functions. There are two reasons why this paper wants to support that:

The requirements for programs are less restricted for [[unsequenced]] than for gcc ((__attribute__(const))). Thus, besides the syntax, gcc’s implementation is a valid implementation of [[unsequenced]], only that it may perhaps not be able to do all optimizations that would be possible.

Programs that are specified with gcc const, can easily be switched to use [[unsequenced]], instead, and remain valid. Additional standard attributes could then be added for functions that would not fit under the gcc model.

Similarily, gcc’s pure is less restricted than the new standard attribute [[indempotent]], and an implementation of the gcc feature is, syntax aside, an implementation of the new standard attribute.

Proposed Wording

Because the main change is the insertion of a whole clause, the proposed changes can be based on any intermediate version of ISO/IEC 9899:2023. Only reference to other clauses would eventually have to be modified if indpendent changes for ISO/IEC 9899:2023 insert other clauses. The only change that effectively would modify existing text is the following:

Add the new attributes idempotent, noleak, unsequenced, independent and stateless to the list of standard attributes in 6.7.11.1 p2.

Otherwise, the text to be inserted is as given in the following sections. If lambdas would be incorporated into C23, the proposed attributes should also apply to them, so some adjustments would then be in order. Also N2840 proposes some changes to the availibility of the call_once function.

General description

This introduces the general framework of these new function attributes. Changes are merely straightforward for the syntax feature itself and its impact on the type system.

We also make the necessary connections to two parts of the C library, namely dynamic startup-initialization via call_once and memory management.

6.7.11.6 Standard attributes for function and lambda types

Constraints

1 The identifier in a standard function type attribute shall be one of:

idempotent independent noleak stateless unsequenced

2 The attribute tokens of function attributes shall appear only in the attribute specifier sequence of a function declarator or a lambda expression.FNTα) The corresponding attribute is a property of the function or lambda type.FNTβ) The attribute tokens shall appear at most once in each attribute list. The attribute argument clause shall be omitted.

FNTα) That is, they appear in the attributes right after the closing parenthesis of the parameter list or right after the capture clause, if a lambda has no parameter list. If they appear in a function declarator, the function declarator may declare a function or a pointer to function.

FNTβ) If several declarations of the same function or function pointer are visible, regardless whether an attribute is present at several or just one of the declarators, it is attached to the type of the object.

Description

3 Each attribute defined in this clause asserts a specific property of a function or lambda. The main purpose is to provide the translator with information about the access of objects by such a function or lambda such that certain assumptions about function calls can be deduced.

4 Although syntactically attached to a function type, the attributes described are not part of the prototype of a such annotated function or lambda, and redeclarations and conversions that drop such an attribute are valid and constitute compatible types. Conversely, if a definition or lambda expression that does not have the asserted property is accessed by a function declaration with external linkage or a function pointer (originating from a function definition or from a function literal expression) that have the attribute, the behavior is undefined.

5 The library function call_once and the associated type once_flag (see 7.22 and 7.26.2.1) play a special role in this clause. They provide initialization facilities that are designed to be executed exactly once before entering into the specific functionality of the function in question. The data dependencies that are described in this clause are generally only considered after any initialization of objects with static storage duration and calls to call_once callbacks have taken place. We say that a static or thread-local object Y of type once_flag is a once-guard for a static or thread-local non-volatile object X (X and Y possibly being the same) if X is never modified other then by a call to call_once(&Y) and if Y is unique with that property. If X has a once-guard Y, then, after any call to call_once(&Y) terminates, X will not be modified until the end of the execution.

6 A second set of C library functions that interact with the properties described in this clause are the memory management functions (7.22.3). These are only considered for their effects, namely of allocating or deallocating storage. The possible effect on a hidden thread-specific state of the memory management system is ignored for the purpose of the properties, here.

Recommended Practice

7 If possible, it is recommended that implementations diagnose if an attribute of this clause is applied to a function definition or lambda expression that does not have the corresponding property.

noleak

Having no memory leaks (not in the sense of just accessibility) is an property for which we think that users should be able express their expectation (for function declarations) or their intent (for function definitions or lambda expressions). It will be an important feature for other attributes that we describe, but may also ease the work of static or dynamic program analyzers.

6.7.11.6.1 The noleak attribute

Description

1 A storage leak in a function definition or lambda expression is an allocation (7.22.3) and a possible control flow such that the pointer to the allocation is not returned by the function and such that no corresponding deallocation is performed before the end of a function call with that control flow.FNT0) This not withstanding, an allocation without deallocation is not a leak if it is used to initialize a pointer object X of static storage duration that is once-guarded by Y, also with static storage duration.

FNT0) Thus allocations of objects where the address is not returned or deallocated are considered to be leaks, even if the allocated object may still be accessible by other means, for example because the address has been stored in a variable.

2 The noleak attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function does not leak any storage.

Recommended Practice

3 It is recommended that for functions with a noleak attribute implementations diagnose if a pointer to newly allocated storage is not used as an argument to free or realloc, unless it is the return value of the function or lambda.

4 NOTE For this definitions, allocations that are effected by an initializer function or lambda that is guarded by a once_flag with static storage duration are not considered leaks. In contrast to that, because the number of threads that an application starts is in general not bounded, such allocations that are only guarded by a once_flag of thread storage duration are leaks. In both cases, applications are encouraged to deallocate storage that is such allocated by the appropriate means, that is by atexit handlers, at_quick_exit handlers or tss_t destructors.

stateless

The stateless attribute ensures compile-time independence from local state. As such it is not sufficient to enable all the optimizations that are state of the art, but it is a first step that can be taken at compile-time without inspecting any function arguments.

6.7.11.6.2 The stateless attribute

Description

1 A function or lambda is stateless if in the function body any defined object X of static or thread storage duration is not volatile-qualified (including directly and indirectly in function calls) and has one of the following properties.

2 The stateless attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function is stateless.

Example

In the following, get_buffer only modifies the static variables through a call to call_once. Thus the annotation of the function with the stateless attribute is valid if the program does not use the file scope identifiers buffer and allocate_buffer other than indicated.

idempotent

6.7.11.6.3 The idempotent attribute

Description

1 An evaluation E is idempotent if it has the following properties.

FNT2) Thus if it has access by other means to a file scope object X other than using the identifier, E may still modify X provided it always stores a consistent value. Also, if X is guarded by Y, E is never evaluated during a call to call_once(&Y).

2 A function designator or lambda value f is idempotent if the function or lambda is stateless, has no storage leak and if any evaluation r = f(a1, ..., an) is idempotent, where a1, ..., an are valid function call arguments, and where r is a variable with the non-void return type of the function or lambda. Analogously, f with a return type of void is idempotent if f(a1, ..., an) is idempotent with a1, ..., an as above, and f with a prototype with an empty parameter list is idempotent if the expressions r = f() or f() are idempotent, respectively.FNT3)

FNT3) If the function call f() is idempotent it has no observable side effects.

3 The idempotent attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function is idempotent.

Example

4 The attribute in the following function declaration asserts that two consecutive calls to the function will result in the same pointer return value. So if no change to the abstract state occurs between two calls the second is redundant and has no additional side effects.

typedef struct toto toto;
toto const* toto_same(void) [[idempotent]];

Also, because the function has no storage leak the returned pointer value is either a null pointer, a pointer to a static object or a pointer to an object that has been allocated during a call to call_once. Thus application code and implementations may suppress repeated calls to the function as long as they retain the return value.

independent

6.7.11.6.4 The independent attribute

Description

1 A function call is independent if all lvalue conversions on an object X that happen during the call, inclusive calls into other functions or lambdas, refers to an object that is reachable through a pointer parameter of the function call or that has one of the following properties.

  1. X has automatic storage duration:
    • X is the instance of a function parameter associated with the function call, or
    • its definition is reached during the call.
  2. X has allocated storage duration:
    • X is allocated during the call.FNT4)
  3. X has static or thread-local storage duration and is not volatile qualified:
    • X is const qualified, or
    • X has a once-guard Y, and the called function calls call_once(&Y) before any access to X.

FNT4) An independent function call still change thread specific execution state when using memory management functions. As a consequence addresses of locally allocated storage depend on the exact moment in which an implementation effects the call.

2 A function definition or lambda expression is independent, if it has no storage leak, is stateless and if all function calls with the function designator or lambda value are independent.

3 The independent attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function is independent.

Example

3 The attribute in the following function declaration asserts that it doesn’t depend on any particular state of the abstract machine. Calls to the function can be effected out of sequence before the return value is needed, but two calls to the function with the same argument may in general result in different pointer return values.

typedef struct toto toto;
toto* toto_new(size_t size) [[independent]];

Because the function has no storage leak the returned pointer is either

If repeating the call several times with the same argument returns distinct values, such return values refer to newly allocated storage that is not aliased by any other object.

unsequenced

6.7.11.6.5 The unsequenced attribute

Description

1 The unsequenced attribute implies the independent and idempotent attributes. For all function declarations and lambda expressions that have both the independent and idempotent attributes the unsequenced attribute is implied.FNT5)

FNT5) That is, the unsequenced attribute is a short form for simultaneously applying these two attributes.

2 A function or lambda value is unsequenced if a function call can be executed as soon as the values of the arguments and all objects that are accessible through them have been determined, and if it can be executed as late as any of its return value or modified pointed-to arguments, even indirectly, are accessed.

3 The unsequenced attribute asserts that the declared function, the corresponding lambda expression or the pointed-to function is unsequenced.

4 NOTE 1 The unsequenced attribute asserts strong properties for the annotated function or lambda, in particular that certain sequencing requirements for function calls can be relaxed without affecting the state of the abstract machine. Thereby, calls to such functions are natural candidates for optimization techniques such as common subexpression elimination, local memoization or lazy evaluation.

5 NOTE 2 Whether a particular C library function is unsequenced depends on properties of the implementation. Many functions in the <math.h> header are unsequenced for an implementation that has math_errhandling evaluate to zero.

Example 1

6 The attribute in the following function declaration asserts that it doesn’t depend on any modifiable state of the abstract machine. Calls to the function can be executed out of sequence before the return value is needed and two calls to the function will result in the same return value.

bool tendency(signed char) [[unsequenced]];

Therefore such a call for a given argument value needs only to be executed once and the returned value can be reused when appropriate. For example, calls for all possible argument values can be executed during translation or program startup and tabulated.

Example 2

7 The attribute in the following function declaration asserts that it doesn’t depend on any modifiable state of the abstract machine. Calls to the function can be executed out of sequence before the return value is needed and two calls to the function will result in the same pointer return value. Therefore such a call needs only to be executed once and the returned pointer value can be reused when appropriate. For example, a single call can be executed during program startup and the result can be held in some hidden state.

typedef struct toto toto;
toto const* toto_zero(void) [[unsequenced]];

Also, because the function has no storage leak the returned pointer value is either a null pointer or a pointer to a static object, but it does not refer to a new allocation because that would be different for each call. Since the function is also stateless, any local static object of the function is const qualified. As a consequence, if the return value is not a null pointer and does not refer to an object (or subobject thereof) with internal or external linkage, the translator can deduce that it refers to a local static object with a const-qualified but non-volatile definition, and, that the pointed-to object does not alias any other object and will not change during the whole execution.