Remove ATOMIC_VAR_INIT

Jens Gustedt, INRIA, France

org: ISO/IEC JTC1/SC22/WG14 document: N2886
target: IS 9899:2023 version: 2
date: 2021-11-15 license: CC BY

Revision history

Paper number Changes
N2390 Initial version discussed in Ithaca 2019, request for more precision
N2886 replaces N2390
      splits the proposal into several variants
      takes care of allocated storage and byte tampering
      integrates changes into 6.7.9 and 7.31.10
      adds an optional change for effective types
      adds an optional recommended practice

Problem description

The macro itself

The macro ATOMIC_VAR_INIT is basically useless for the purpose for which it was designed, namely to initialize any atomic type with a constant expression of the appropriate base type. It is subject to macro argument expansion, which causes difficulties with compound literals, and compile-time constants of a particular base type (e.g. for structure types) might not even exist.

On the other hand, all implementations seem to cope well with the normal initialization syntax for variables when extending it to atomics. Therefore, the use of ATOMIC_VAR_INIT was made optional in C17 and the macro itself was declared obsolescent.

Because it is basically useless and problematic to use, we propose to remove it from C23. Since the macro name uses a reserved prefix, implementations may continue to provide the macro if they want to. They do not need to do anything to stay conforming.
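As a minimal sketch of the normal initialization syntax mentioned above, the following definitions give atomics a valid state without the macro. The names counter and bump are illustrative assumptions and not part of the proposal.

```c
#include <stdatomic.h>

/* Static storage duration: a plain initializer yields a valid state. */
static atomic_int counter = 42;

/* Hypothetical helper that uses a static and an automatic atomic. */
int bump(void) {
    /* Automatic storage duration: same plain syntax, no ATOMIC_VAR_INIT. */
    atomic_int local = 1;
    atomic_fetch_add(&local, 1);
    /* atomic_fetch_add returns the value held before the addition. */
    return atomic_fetch_add(&counter, 1) + atomic_load(&local);
}
```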

Other problems with initialization

During the discussion for this proposal several defects of the current text have been observed.

The description of atomics that are in an indeterminate state is too narrow

In the current text, the problematic automatic atomic objects are identified as “not explicitly initialized”. This falls short, because an automatic object can have an explicit initializer and still not be initialized if execution jumps over the definition of the variable.
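The situation can be sketched as follows, assuming a hypothetical function read_after_jump. When skip is nonzero, the goto bypasses the definition of x, so the initializer is never evaluated even though it is written in the source. The sketch repairs this with atomic_init so that the code stays well defined.

```c
#include <stdatomic.h>

/* Hypothetical function: returns the value of x after a possible jump. */
int read_after_jump(int skip) {
    if (skip)
        goto use;           /* jumps over the definition of x */
    atomic_int x = 1;       /* explicit initializer, not evaluated if skip != 0 */
use:
    if (skip)
        atomic_init(&x, 0); /* x is indeterminate here; set a valid state */
    return atomic_load(&x);
}
```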

The current text, because it is about the initialization macro, also omits to list allocated atomics. Also, because they have their own initialization macro, atomic_flag objects must be handled specially in some places.

Signal handlers should not touch uninitialized atomics

In the current text, provisions are made for atomic_init such that it cannot be called unsequenced from a signal handler. But the same problems as for atomic_init can in fact also occur for uninitialized atomics if a thread is interrupted in the middle of an initialization by assignment.
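A minimal sketch of the unproblematic case: the atomic has static storage duration and therefore a valid state before any handler can run. The names hits, on_signal and count_one_signal are illustrative, and the sketch assumes that atomic_int is lock-free on the platform. It uses raise only to keep the example deterministic.

```c
#include <signal.h>
#include <stdatomic.h>

/* Static storage duration: in a valid state before any signal can arrive. */
static atomic_int hits = 0;

/* Handler that only touches an already initialized atomic. */
static void on_signal(int sig) {
    (void)sig;
    atomic_fetch_add(&hits, 1);
}

/* Hypothetical driver: installs the handler and raises one signal. */
int count_one_signal(void) {
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);          /* handler runs before raise returns */
    return atomic_load(&hits);
}
```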

Representation bytes of atomics should not be tampered with

Changing any byte of an atomic object is not an atomic operation and can as such lead to a race condition. The current standard already takes care of this by making such unsynchronized changes undefined.

A more subtle problem occurs for unsequenced accesses within the same thread. If a lock-free atomic is changed by a non-atomic operation and a signal handler that uses it kicks in during the non-atomic change, the signal handler might see an inconsistent value. For a change with atomic_init this is taken care of explicitly: that function is simply forbidden in asynchronous signal handlers. Specifications for other non-atomic changes are currently missing, so we add them by making a change on the byte level equivalent to a missing initialization.
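The byte-level rule can be illustrated with a small sketch, assuming a hypothetical function reset_by_bytes. Under the wording proposed in this paper, the memset call leaves v in an indeterminate state, and atomic_init is then used to re-establish a valid state before the next atomic operation.

```c
#include <stdatomic.h>
#include <string.h>

/* Hypothetical function: overwrites an atomic bytewise, then repairs it. */
long reset_by_bytes(void) {
    atomic_long v = 5;
    /* Non-atomic store to the representation bytes: under the proposed
       wording, v is in an indeterminate state after this call. */
    memset(&v, 0, sizeof v);
    /* Re-establish a valid state before any atomic operation. */
    atomic_init(&v, 7);
    return atomic_load(&v);
}
```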

Another question that the current standard leaves open is not tackled by this proposal, namely whether atomic operations on lock-free atomics add sequencing properties in general. For example, a user could have a reasonable expectation that an evaluation of ++i - ++i works if i has a lock-free integer type. Currently this has undefined behavior, but for no good technical reason.

Requirements to use atomic_init before any atomic operation

Hans Boehm questioned C’s permission to use simple assignment on uninitialized atomics as long as access is not conflicting. Implementing simple assignment for locked atomics where it is unknown if the object has been initialized may be challenging.

The discussion around that problem showed that using assignments in such contexts seems to be common practice in C and that changing it in general for all atomics could invalidate a lot of user code.

Since the problem seems only to be present for locked atomics we include an option below that only implies changes for the status of these. Lock-free atomics (generally those data-types for which the platform has a CAS instruction or similar) don’t seem to need such precautions. Simple assignment for them may be much more expensive than initialization, but already as of today users that encounter performance difficulties because of that have the choice to either initialize their variables properly (for automatic objects) or to use atomic_init (for allocated objects).
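Whether a given atomic type falls into the lock-free category can be queried with the facilities of <stdatomic.h>. The following is a minimal sketch; the function names are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* True if atomic_int is lock-free for every object (macro value 2). */
bool int_always_lock_free(void) {
    return ATOMIC_INT_LOCK_FREE == 2;
}

/* Per-object query for a properly initialized automatic atomic. */
bool object_is_lock_free(void) {
    atomic_int probe = 0;
    return atomic_is_lock_free(&probe);
}
```

Code that has to be portable to locked atomics can avoid the question entirely by always initializing automatic objects and calling atomic_init on allocated ones.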

Suggested changes

The text concerning the macro (7.17.2.1) cannot be completely removed from C23 because it contains normative text that is important for the initialization of atomics. Therefore we propose to keep the second part of 7.17.2.1 p2 as a new introduction to “Initialization”, 7.17.2 p1.

With all the variants proposed in the following, the numbering of the clause for atomic_init would change from 7.17.2.2 to 7.17.2.1.

Change to the normative text

Variant 1

This minimal variant just removes the macros and adds some precision about the different cases that are covered. Our understanding is that it does not make normative changes.

7.17.2 Initialization

7.17.2.1 The ATOMIC_VAR_INIT macro

Synopsis

Description

1 An atomic object with automatic storage duration that is not explicitly initialized or such an object with allocated storage duration is initially in an indeterminate state; equally a non-atomic store to any byte of the representation (either directly or, for example, by calls to memcpy or memset) puts any atomic object into an indeterminate state. Explicit or default initialization for atomic objects with static or thread-local storage duration that do not have type atomic_flag is guaranteed to produce a valid state.277)

2 Concurrent access to an atomic object before it is set to a valid state, even via an atomic operation, constitutes a data race.

Variant 2

This variant adds text to Variant 1 concerning signal handlers that is similar to the corresponding text for atomic_init. In the current text this is a grey zone. This additional change might constitute a normative change.

7.17.2 Initialization

7.17.2.1 The ATOMIC_VAR_INIT macro

Synopsis

Description

1 An atomic object with automatic storage duration that is not explicitly initialized or such an object with allocated storage duration is initially in an indeterminate state; equally a non-atomic store to any byte of the representation (either directly or, for example, by calls to memcpy or memset) puts any atomic object into an indeterminate state. Explicit or default initialization for atomic objects with static or thread-local storage duration that do not have type atomic_flag is guaranteed to produce a valid state.277)

2 Concurrent access to an atomic object before it is set to a valid state, even via an atomic operation, constitutes a data race. If a signal occurs other than as the result of calling the abort or raise functions, the behavior is undefined if the signal handler reads or modifies an atomic object that is in an indeterminate state.

Variant 3

This variant adds text to Variant 2 making atomic_init for locked uninitialized atomic objects mandatory before any atomic operation can be performed. This is a normative change. Applications that use atomic operations, in particular simple assignment, in such a context would have undefined behavior with C23, where they had defined behavior before. Nevertheless, it is not yet clear whether all implementations are up to the task and whether such code is portable.

If this variant is chosen, implementations that know how to do this may still offer the use of atomic operations in such contexts as an extension.

7.17.2 Initialization

7.17.2.1 The ATOMIC_VAR_INIT macro

Synopsis

Description

1 An atomic object with automatic storage duration that is not explicitly initialized or such an object with allocated storage duration is initially in an indeterminate state; equally a non-atomic store to any byte of the representation (either directly or, for example, by calls to memcpy or memset) puts any atomic object into an indeterminate state. Explicit or default initialization for atomic objects with static or thread-local storage duration that do not have type atomic_flag is guaranteed to produce a valid state.277)

2 For an uninitialized atomic object that is not lock-free the generic function atomic_init shall be used to set the object to a valid state before any atomic operation. Otherwise, concurrent access to an atomic object before it is set to a valid state, even via an atomic operation, constitutes a data race; if a signal occurs other than as the result of calling the abort or raise functions, the behavior is undefined if the signal handler reads or modifies an atomic object that is in an indeterminate state.

Variant 4

This variant adds text to Variant 1 making atomic_init for any kind of uninitialized atomic objects mandatory before any atomic operation can be performed. This is a normative change. Applications that use atomic operations, in particular simple assignment, in such a context would have undefined behavior with C23, where they had defined behavior before.

If this variant is chosen, implementations that know how to do this may still offer the use of atomic operations in such contexts as an extension.

For this variant the current paragraph 3 becomes obsolete. Also, because that part is taken care of by atomic_init there is no need to make explicit provisions for signal handlers.

7.17.2 Initialization

7.17.2.1 The ATOMIC_VAR_INIT macro

Synopsis

Description

1 An atomic object with automatic storage duration that is not explicitly initialized or such an object with allocated storage duration is initially in an indeterminate state; equally a non-atomic store to any byte of the representation (either directly or, for example, by calls to memcpy or memset) puts any atomic object into an indeterminate state. Before any atomic operation can be performed on an atomic object that does not have type atomic_flag and that is in an indeterminate state, the generic function atomic_init shall be used to set the object to a valid state. Explicit or default initialization for atomic objects with static or thread-local storage duration that do not have type atomic_flag is guaranteed to produce a valid state.277)

3 Concurrent access to the variable being initialized, even via an atomic operation, constitutes a data race.

Change to the example

We also propose to amend the example that follows, such that it does not use the macro and such that it clarifies under which circumstances no additional initialization is necessary for race-free access.

4 EXAMPLE The following definitions ensure valid states for guide and head regardless if these are found in file scope or block scope. Thus any atomic operation that is performed on them after their initialization has been met is well defined.

Change 6.7.9, Initialization

Semantics

8 An initializer specifies the initial value stored in an object. For objects with atomic type additional restrictions apply, see 7.17.2 and 7.17.8.

Change to future library directions

We propose to remove the mention of the macro from “Future library directions” (7.31.10).

2 The macro ATOMIC_VAR_INIT is an obsolescent feature. The possibility that an atomic type name of an atomic integer type defines a different type than the corresponding direct type is an obsolescent feature.

Impose an effective type in 6.5 and 7.17.2.1 (formerly 7.17.2.2) (optional)

The current text leaves it ambiguous if atomic_init would effectively provide an atomic type to *obj if A is an atomic character type.

In 6.5:

6 The effective type of an object for an access to its stored value is the declared type of the object, if any.96) If a value is stored into an object having no declared type through an lvalue having a type that is not a non-atomic character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.

In 7.17.2.1 (formerly 7.17.2.2):

Description

2 The atomic_init generic function initializes the atomic object pointed to by obj to the value value, while also initializing any additional state that the implementation might need to carry for the atomic object. If the object has no declared type, after the call the effective type is the atomic type A.

Navigating the possible implications of lacking initializations is complicated and might need some guidance.

Recommended Practice

3 Atomic objects that are in an indeterminate state are potentially subject to race conditions and are best avoided. Therefore it is recommended that any operation that shares the address of an atomic object with other threads or with signal handlers is sequenced after the initialization of the object. Additionally, it is recommended that atomic objects of automatic storage duration are initialized explicitly and that implementations diagnose possible control flow that circumvents the initialization. Furthermore, it is recommended that immediately after their allocation atomic objects of allocated storage duration receive their effective type and a valid initial state by means of a call to atomic_init (if the type is not atomic_flag) or one of the functions in 7.17.8 (if it is atomic_flag).
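The last recommendation can be sketched as follows. The type struct node and the function node_new are hypothetical names chosen for illustration. The sketch follows the recommended practice by using atomic_init for the integer member and atomic_flag_clear, one of the functions in 7.17.8, for the flag member, before the pointer is shared with anyone else.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical node type with two atomic members. */
struct node {
    atomic_int  refs;
    atomic_flag busy;
};

/* Allocates a node and gives both atomics a valid state immediately,
   before the address can be shared with other threads or handlers. */
struct node *node_new(void) {
    struct node *n = malloc(sizeof *n);
    if (!n)
        return NULL;
    atomic_init(&n->refs, 1);     /* not atomic_flag: use atomic_init */
    atomic_flag_clear(&n->busy);  /* atomic_flag: use a 7.17.8 function */
    return n;
}
```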

Questions

WG14 does not really have a good procedure to vote on variants. For the purpose of this paper here we propose to first vote the basic version, Variant 1, with all non-optional changes into C23 and then decide if we want to replace Variant 1 with any of the others.

Normative vote

Does WG14 want to integrate Variant 1 and Changes 3.2, 3.3, and 3.4 into C23?

Variants

We propose to handle the variants one after the other, in decreasing order of strength, until one (or none) of them is voted into C23.

Normative vote for Variant 4 (impose atomic_init for all)

Does WG14 want to exchange Variant 1 with Variant 4 in C23?

Normative vote for Variant 3 (impose atomic_init for locked atomics)

Does WG14 want to exchange Variant 1 with Variant 3 in C23?

Normative vote for Variant 2 (add provisions for signal handlers)

Does WG14 want to exchange Variant 1 with Variant 2 in C23?

Optional change for effective types

Does WG14 want to add Change 3.5 to C23?

Does WG14 want to add the recommended practice 3.6 to C23?