I’ve blogged about surprising bits of the C++ object model before, and I’m back with more.
Executive summary: Don’t use mozilla::PodZero
or mozilla::PodArrayZero
. Modern C++ provides better alternatives that don’t presume that writing all zeroes will always correctly initialize the given type. Use constructors, in-class member initializers, and functions like std::fill
to zero member fields.
The briefest recap of salient parts of the C++ object model
C++ as a language really wants to know when objects are created so that compilers can know that this memory contains an object of this type. Compilers then can assume that writing an object of one type, won’t conflict with reads/writes of incompatible types.
double foo(double* d, int* i, int z) { *d = 3.14; // int/double are incompatible, so this write may be // assumed not to change the value of *d. *i = z; // Therefore *d may be assumed to still be 3.14, so this // may be compiled as 3.14 * z without rereading *d. return *d * z; }
You can’t use arbitrary memory as your desired type after a cast. An object of that type must have been explicitly created there: e.g. a local variable of that type must be declared there, a field of that type must be defined and the containing object created, the object must be created via new
, &c.
Misinterpreting an object using an incompatible type violates the strict aliasing rules in [basic.lval]p11.
memset
ting an object
memset
lets you write characters over memory. C code routinely used this to fill an array or struct
with zeroes or null pointers or similar, assuming all-zeroes writes the right value.
C++ code also sometimes uses memset
to zero out an object, either after allocating its memory or in the constructor. This doesn’t create a T
(you’d need to placement-new
), but it often still “works”. But what if T
changes to require initialization? Maybe a field in T
gains a constructor (T
might never be touched!) or a nonzero initializer, making T
a non-trivial type. memset
could hide that fresh initialization requirement or (depending when the memset
happens) overwrite a necessary initialization.
Problem intensifies
Unfortunately, Mozilla code has provided and promoted a PodZero
function that misuses memset
this way. So when I built with gcc 8.0 recently (I usually use a home-built clang
), I discovered a torrent of build warnings about memset
misuse on non-trivial types. A taste:
In file included from /home/jwalden/moz/after/js/src/jit/BitSet.h:12, from /home/jwalden/moz/after/js/src/jit/Safepoints.h:10, from /home/jwalden/moz/after/js/src/jit/JitFrames.h:13, from /home/jwalden/moz/after/js/src/jit/BaselineFrame.h:10, from /home/jwalden/moz/after/js/src/vm/Stack-inl.h:15, from /home/jwalden/moz/after/js/src/vm/Debugger-inl.h:12, from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.cpp:29, from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src32.cpp:2: /home/jwalden/moz/after/js/src/jit/JitAllocPolicy.h: In instantiation of ‘T* js::jit::JitAllocPolicy::maybe_pod_calloc(size_t) [with T = js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >; size_t = long unsigned int]’: /home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:1293:63: required from ‘static js::detail::HashTable<T, HashPolicy, AllocPolicy>::Entry* js::detail::HashTable<T, HashPolicy, AllocPolicy>::createTable(AllocPolicy&, uint32_t, js::detail::HashTable<T, HashPolicy, AllocPolicy>::FailureBehavior) [with T = js::HashMapEntry<JS::Value, unsigned int>; HashPolicy = js::HashMap<JS::Value, unsigned int, js::jit::LIRGraph::ValueHasher, js::jit::JitAllocPolicy>::MapHashPolicy; AllocPolicy = js::jit::JitAllocPolicy; js::detail::HashTable<T, HashPolicy, AllocPolicy>::Entry = js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >; uint32_t = unsigned int]’ /home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:1361:28: required from ‘bool js::detail::HashTable<T, HashPolicy, AllocPolicy>::init(uint32_t) [with T = js::HashMapEntry<JS::Value, unsigned int>; HashPolicy = js::HashMap<JS::Value, unsigned int, js::jit::LIRGraph::ValueHasher, js::jit::JitAllocPolicy>::MapHashPolicy; AllocPolicy = js::jit::JitAllocPolicy; uint32_t = unsigned int]’ /home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:92:69: required from ‘bool js::HashMap<Key, Value, HashPolicy, AllocPolicy>::init(uint32_t) [with Key = JS::Value; Value = unsigned int; HashPolicy = js::jit::LIRGraph::ValueHasher; AllocPolicy = js::jit::JitAllocPolicy; uint32_t = unsigned int]’ /home/jwalden/moz/after/js/src/jit/LIR.h:1901:38: required from here /home/jwalden/moz/after/js/src/jit/JitAllocPolicy.h:101:19: warning: ‘void* memset(void*, int, size_t)’ clearing an object of type ‘class js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >’ with no trivial copy-assignment [-Wclass-memaccess] memset(p, 0, numElems * sizeof(T)); ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from /home/jwalden/moz/after/js/src/dbg/dist/include/js/TracingAPI.h:11, from /home/jwalden/moz/after/js/src/dbg/dist/include/js/GCPolicyAPI.h:47, from /home/jwalden/moz/after/js/src/dbg/dist/include/js/RootingAPI.h:22, from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallArgs.h:73, from /home/jwalden/moz/after/js/src/jsapi.h:29, from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.h:10, from /home/jwalden/moz/after/js/src/vm/DebuggerMemory.cpp:7, from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src32.cpp:2: /home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:794:7: note: ‘class js::detail::HashTableEntry<js::HashMapEntry<JS::Value, unsigned int> >’ declared here class HashTableEntry ^~~~~~~~~~~~~~ Unified_cpp_js_src36.o In file included from /home/jwalden/moz/after/js/src/dbg/dist/include/js/HashTable.h:19, from /home/jwalden/moz/after/js/src/dbg/dist/include/js/TracingAPI.h:11, from /home/jwalden/moz/after/js/src/dbg/dist/include/js/GCPolicyAPI.h:47, from /home/jwalden/moz/after/js/src/dbg/dist/include/js/RootingAPI.h:22, from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallArgs.h:73, from /home/jwalden/moz/after/js/src/dbg/dist/include/js/CallNonGenericMethod.h:12, from /home/jwalden/moz/after/js/src/NamespaceImports.h:15, from /home/jwalden/moz/after/js/src/gc/Barrier.h:10, from /home/jwalden/moz/after/js/src/vm/ArgumentsObject.h:12, from /home/jwalden/moz/after/js/src/vm/GeneratorObject.h:10, from /home/jwalden/moz/after/js/src/vm/GeneratorObject.cpp:7, from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src33.cpp:2: /home/jwalden/moz/after/js/src/dbg/dist/include/mozilla/PodOperations.h: In instantiation of ‘void mozilla::PodZero(T*) [with T = js::NativeIterator]’: /home/jwalden/moz/after/js/src/vm/Iteration.cpp:578:15: required from here /home/jwalden/moz/after/js/src/dbg/dist/include/mozilla/PodOperations.h:32:9: warning: ‘void* memset(void*, int, size_t)’ clearing an object of type ‘struct js::NativeIterator’ with no trivial copy-assignment; use assignment or value-initialization instead [-Wclass-memaccess] memset(aT, 0, sizeof(T)); ~~~~~~^~~~~~~~~~~~~~~~~~ In file included from /home/jwalden/moz/after/js/src/vm/JSCompartment-inl.h:14, from /home/jwalden/moz/after/js/src/vm/JSObject-inl.h:32, from /home/jwalden/moz/after/js/src/vm/ArrayObject-inl.h:15, from /home/jwalden/moz/after/js/src/vm/GeneratorObject.cpp:11, from /home/jwalden/moz/after/js/src/dbg/js/src/Unified_cpp_js_src33.cpp:2: /home/jwalden/moz/after/js/src/vm/Iteration.h:32:8: note: ‘struct js::NativeIterator’ declared here struct NativeIterator ^~~~~~~~~~~~~~
Fixing the problem by not using mozilla::PodZero
Historically you’d have to add every single member-initialization to your constructor, duplicating names and risking missing one, but C+11’s in-class initializers allow an elegant fix:
// Add " = nullptr" to initialize these function pointers. struct AsmJSCacheOps { OpenAsmJSCacheEntryForReadOp openEntryForRead = nullptr; CloseAsmJSCacheEntryForReadOp closeEntryForRead = nullptr; OpenAsmJSCacheEntryForWriteOp openEntryForWrite = nullptr; CloseAsmJSCacheEntryForWriteOp closeEntryForWrite = nullptr; };
As long as you invoke a constructor, the members will be initialized. (Constructors can initialize a member to override in-class initializers.)
List-initialization using {}
is also frequently helpful: you can use it to zero trailing (or all) members of an array or struct
without naming/providing them:
class PreliminaryObjectArray { public: static const uint32_t COUNT = 20; private: // All objects with the type which have been allocated. The pointers in // this array are weak. JSObject* objects[COUNT] = {}; // zeroes public: PreliminaryObjectArray() = default; // ... };
Finally, C++ offers iterative-mutation functions to fill a container:
#include <algorithm> // mozilla::Array's default constructor doesn't initialize array // contents unless the element type is a class with a default // constructor, and no Array overload exists to zero every // element. (You could pass 1024 zeroes, but....) mozilla::Array<uint32_t, 1024> page; // array contents undefined std::fill(page.begin(), page.end(), 0); // now contains zeroes std::fill_n(page.begin(), page.end() - page.begin(), 0); // alternatively
After a long run of fixes to sundry bits of SpiderMonkey code to fix every last one of these issues last week, I’ve returned SpiderMonkey to warning-free with gcc (excluding imported ICU code). The only serious trickiness I ran into was a function of very unusual SpiderMonkey needs that shouldn’t affect code generally.
Fixing these issues is generally very doable. As people update to newer and newer gcc to build, the new -Wclass-memaccess
warning that told me about these issues will bug more and more people, and I’m confident all these problems triggered by PodZero
can be fixed.
mozilla::PodZero
and mozilla::PodArrayZero
are deprecated
PodZero
and its array-zeroing variant PodArrayZero
are ill-fitted to modern C++ and modern compilers. C++ now offers clean, type-safe ways to initialize memory to zeroes. You should avoid using PodZero
and PodArrayZero
in new code, replacing it with the initializer syntaxes mentioned above or with standard C++ algorithms to fill in zeroes.
As PodZero
is used in a ton of places right now, it’ll likely stick around for some time. But there’s a good chance I’ll rename it to DeprecatedPodZero
to highlight its badness and the desire to remove it. You should replace existing uses of it wherever and whenever you can.