blog
recent
archive
twitter

projects
Mac OS X
Keyboard
  backlight
CSC Menu
Valgrind
Fringe Player
pssh
Peal
Frankenmouse

   

Hamster Emporium

by Greg Parker, runtime wrangler and general specialist

(link) [objc explain]: Non-pointer isa   (2013-09-24 1:27 AM)
 

On iOS for arm64, the isa field of Objective-C objects is no longer a pointer.

Say what?

On iOS for arm64, the isa field of Objective-C objects is no longer a pointer.

If it's not a pointer anymore, what is it?

Some of the bits still encode the pointer to the object's class. But neither OS X nor iOS actually uses all 64 bits of virtual address space. The Objective-C runtime may use these extra bits to store per-object data like its retain count or whether it has been weakly referenced.

Why change it?

Performance. Re-purposing these otherwise unused bits increases speed and decreases memory size. On iOS 7 the focus is on optimizing retain/release and alloc/dealloc.

What does this mean for my code?

Don't read obj->isa directly. The compiler will complain if you do. Trust the Compiler. The Compiler is your friend. Use [obj class] or object_getClass(obj) instead.

Don't write obj->isa directly. Use object_setClass() instead.

If you override +allocWithZone:, you may initialize your object's isa field to a "raw" isa pointer. If you do, no extra data will be stored in that isa field and you may suffer the slow path through code like retain/release. To enable these optimizations, instead set the isa field to zero (if it is not already) and then call object_setClass().

If you override retain/release to implement a custom inline retain count, consider removing that code in favor of the runtime's implementation.

The 64-bit iOS simulator currently does not use non-pointer isa. Test your code on a real arm64 device.

What does this mean for debugging?

The debugger knows how to decode the class from the isa field. You should not need to examine it directly in most cases.

You can run your code with environment variable OBJC_DISABLE_NONPOINTER_ISA=YES to disable non-pointer isa for all classes. If your code works with this set and fails without it, you may be incorrectly accessing an isa field directly somewhere.

If you are writing a debugger-like tool, the Objective-C runtime exports some variables to help decode isa fields. objc_debug_isa_class_mask describes which bits are the class pointer: (isa & class_mask) == class pointer. objc_debug_isa_magic_mask and objc_debug_isa_magic_value describe some bits that help distinguish valid isa fields from other invalid values: (isa & magic_mask) == magic_value for isa fields that are not raw class pointers. These variables may change in the future so do not use them in application code.

No seriously, what do each of the bits mean?

For entertainment purposes only. These values will change in future OS versions. I think they already have changed, actually.

(LSB)  
1bitindexed0 is raw isa, 1 is non-pointer isa.
1bithas_assocObject has or once had an associated reference. Object with no associated references can deallocate faster.
1bithas_cxx_dtorObject has a C++ or ARC destructor. Objects with no destructor can deallocate faster.
30bitsshiftclsClass pointer's non-zero bits.
9bitsmagicEquals 0xd2. Used by the debugger to distinguish real objects from uninitialized junk.
1bitweakly_referencedObject is or once was pointed to by an ARC weak variable. Objects not weakly referenced can deallocate faster.
1bitdeallocatingObject is currently deallocating.
1bithas_sidetable_rcObject's retain count is too large to store inline.
19bitsextra_rcObject's retain count above 1. (For example, if extra_rc is 5 then the object's real retain count is 6.)
(MSB)  

(link) [objc explain]: So you crashed in objc_msgSend(): iPhone 5s Edition   (2013-09-12 12:45 PM)
 

So you crashed in objc_msgSend() has been updated with register usage for iPhone 5s's ARM64 processor. The table now looks like this:

objc_msgSend
objc_msgSend_fpret
objc_msgSend_stret
 receiverSELreceiverSEL
i386eax*ecx eax*ecx
x86_64rdirsi rsirdx
ppcr3r4 r4r5
ppc64r3r4 r4r5
armr0r1 r1r2
arm64x0x1

(link) [objc explain]: return value of message to nil   (2012-2-29 2:40 PM)
 

LLVM Compiler 3.0 (Xcode 4.2) or later

Integers up to 64 bits: 0
Floating-point up to long double: 0.0
Pointers: nil
Structs: {0}
Any _Complex type: {0, 0}

Notes

C++ objects returned by value are initialized to {0}, even if the type has a default constructor that does something else. This may be fixed in the future.
Struct return is undefined if you call objc_msgSend_stret() directly.
Struct return is undefined if you use an older compiler.
Floating-point return is undefined on Mac OS X 10.4 and earlier on Power PC.
_Complex long double return is undefined if you use an older compiler.

(link) [objc explain]: objc_msgSend_vtable   (2011-06-17 4:42 PM)
 

objc_msgSend_vtable is a version of objc_msgSend used to optimize a few of the most commonly called methods.

Most Objective-C methods are dispatched using a hash table lookup inside objc_msgSend. On x86_64, a few selectors can be dispatched using a C++-style virtual table: an array lookup, not a hash table.

The compiler knows which selectors are optimized by the runtime. It compiles the call site differently, calling objc_msgSend_fixup via a function pointer. At runtime, objc_msgSend_fixup replaces the function pointer with one of the objc_msgSend_vtable functions, if the called selector is one of the optimized selectors.

C++ vtables are notoriously fragile: the array offsets for each virtual method are hardcoded into the generated code. Objective-C's vtables are not fragile. Each vtable is built at runtime and updated when method lists change. In theory even the set of optimized methods could be changed. The non-fragile flexibility costs an extra memory load during dispatch.

Dispatch via vtable is faster than a hash table, but would consume tremendous amounts of memory if used everywhere. Objective-C's vtable implementation limits its use to a few selectors that are (1) implemented everywhere, but (2) rarely overridden. That means most classes share their superclass's vtable, which keeps memory costs low.

A crash in any objc_msgSend_vtable function should be debugged exactly like a crash in objc_msgSend itself. They both crash for all of the same reasons, like incorrect memory management or memory smashers.

Currently, the runtime uses sixteen different objc_msgSend_vtable functions, one for each slot in the sixteen-entry vtable.

objc_msgSend_vtable0allocWithZone:
objc_msgSend_vtable1alloc
objc_msgSend_vtable2class
objc_msgSend_vtable3self
objc_msgSend_vtable4isKindOfClass:
objc_msgSend_vtable5respondsToSelector:
objc_msgSend_vtable6isFlipped
objc_msgSend_vtable7length
objc_msgSend_vtable8objectForKey:
objc_msgSend_vtable9count
objc_msgSend_vtable10objectAtIndex:
objc_msgSend_vtable11isEqualToString:
objc_msgSend_vtable12isEqual:
objc_msgSend_vtable13retain (non-GC)
hash (GC)
objc_msgSend_vtable14release (non-GC)
addObject: (GC)
objc_msgSend_vtable15autorelease (non-GC)
countByEnumeratingWithState:objects:count: (GC)

The vtable's contents differ for GC and non-GC, for obvious reasons. -isFlipped is part of NSView. -countByEnumeratingWithState:objects:count: is the fast enumeration implementation, including for (x in y). Together these methods make up roughly 30-50% of calls in typical Objective-C applications.

(link) Dr. Gregory Parker, Department of Diagnostic Engineering   (2010-09-01 3:15 AM)
 

Last week, Rick Ballard came by my office for a consult. He had caught Xcode at a crash in objc_msgSend(). The crash looked like an intermittent problem that had been plaguing Xcode for months. So he called the local expert on debugging objc_msgSend(). Dr. Gregory Parker, Department of Diagnostic Engineering.

The good news was that Rick's crash was reliably reproducible. Running tests on a live patient is better than performing an autopsy on a dead one. The bad news was that the obvious debugging tools had not helped. NSZombieEnabled and guardmalloc had turned up nothing, and AUTO_USE_GUARDS=YES (the GC equivalent of guardmalloc) just thrashed the machine for two hours before running out of address space.

So you crashed in objc_msgSend(). The selector was -isAbsolutePath, which was reasonable but meant the debugger's backtrace was missing a frame. objc_msgSend() had read the class from the object, read the method cache from the class, read a method from the method cache, and crashed while trying to read the IMP from the method. Theory: either one of those data structures had been hit by a memory smasher, or the original object was bogus but happened to have dereferenceable pointers in the right places to survive that long. The method cache's mask was invalid - it should have been of the form 2n-1 - so the failure must have been at or before that point in the chain.

The object pointer itself looked plausible. Theory: the object was valid, but a previous object at the same location had been used after being freed. We had the great luxury of a reproducible crash, so we turned on MallocStackLoggingNoCompact and ran it again. That memory had only been used for one object, and it had not been deallocated. So the evidence did not support the use-after-free theory. But the history showed that the object had been allocated as an NSPathStore2 - an internal subclass of NSString for file pathnames - which matched the selector -isAbsolutePath and matched the call site's expectations. The theory that the object pointer was valid looked good.

The object pointer was good, and the method cache was not: the failure was on the chain between them. The contents of the object looked good. The bytes looked like alternating zero and ASCII, which is a dead giveaway for the UTF-16 used inside NSString. The string value decoded as @"/Xcode4/usr/bin/llvm-gcc", which made sense in the call site's context.

The object's isa pointer was not so good. Its value was 0xa0050000. This was not class NSPathStore2 or any other class. vmmap showed it to be in Foundation's data segment, and otool showed it was specifically in Foundation's constant CF strings. But instead of pointing to the start of some string, it pointed to the middle of a string object. That string object was @"tzm-Latn": some localization thingy, perhaps? Theory: some bug had replaced this object's isa pointer with a pointer to the middle of an unrelated localization string object. This did not sound like a good theory.

Go back to the board. Symptom: the object was allocated as an NSPathStore2. Symptom: the object's isa pointer is now 0xa0050000, which is not NSPathStore2. What should the isa pointer's value have been? otool and objc_getClass() agreed: the correct isa pointer should have been 0xa005f198. 0xa0050000 is suspiciously similar. Theory: something had cleared two bytes of this object, leaving a nonsense isa pointer. @"tzm-Latn" was a red herring.

Aha! This is 32-bit i386. Little endian. The pointer 0xa005f198 is stored backwards in memory: 0x98 0xf1 0x05 0xa0. Clearing the least-significant bytes of the isa pointer meant clearing bytes 0 and 1 of the object, not bytes 2 and 3. Damage to bytes 0 and 1 is exactly what you'd expect from a two-byte overrun of the object preceding this one in memory. Theory: the bug was in that preceding object, and this NSPathStore2 object was an innocent victim.

malloc_history works with a pointer to the middle of an allocation, too. We plugged in object-1 and got back an instance of DVTSourceModelItem, not deallocated. Rick recognized this as part of Xcode's indexer, which was always running in another thread at the time of the crash. A buffer overrun from the DVTSourceModelItem object fit the symptoms.

But where was the buffer? I had expected an overrun in some heap-allocated C array, not an ordinary object. Nor did DVTSourceModelItem have any C arrays in its instance variables.

Theory: the compiler or runtime had allocated too little memory for the instance of class DVTSourceModelItem, and ordinary ivar access had overrun that allocation. It was a long shot, but easy to test. malloc_size() and class_getInstanceSize() and an eyeball count of ivars all agreed that the object was 32 bytes. Theory disproved.

We tested the overrun theory again. Add an unused ivar to the end of DVTSourceModelItem, recompile, and run it. No crash. Remove the ivar. Crash. The extra ivar "fixed" the bug. The buffer overrun theory still fit the evidence, but we couldn't find it.

No more ideas. We needed data. Debugger watchpoints were out: there were thousands of instances of DVTSourceModelItem, and we couldn't watch two bytes after each of them. We were not yet desperate enough to try brute force code inspection. AUTO_USE_GUARDS=YES could catch it, if it didn't fall over first. Since we had a suspect in mind, we could play the guardmalloc trick ourselves with a narrower target. Override +[DVTSourceModelItem alloc], mprotect() the page after the allocation, and cross our fingers really hard hoping that it still reproduced after changing the timing so much.

Bang! It crashed (good) somewhere new (also good). DVTSourceModelItem -init was writing to one of its own instance variables. The ivar was a bit in a bitfield, and that bitfield was at the end of the ivar list.

Disassemble. The generated code read 4 bytes around the bit into a register, change the bit in that register, and wrote the 4 bytes back to memory. That's typical for a bitfield. The unexpected part was that the 4 bytes spanned the last two bytes of the object and the first two bytes after the object. That's a bug. Most of the time the out of bounds access is invalid - it reads two bytes it shouldn't, and writes back the same value. But if there's another thread it can crash:

Thread 1Thread 2
reads four bytes, including two
bytes outside the object
 
 allocates a new object
 writes an isa pointer
writes four bytes, clobbering the
new value written by Thread 2
 
 crashes

Theory: a compiler bug generated bad code for DVTSourceModelItem's bitfield ivar, causing a read-modify-write out of bounds by two bytes, which corrupted memory in other threads. Test: try a different compiler. DVTSourceModelItem.m was built with clang, so we recompiled with llvm-gcc. No crash, and the disassembly looked correct. Compile with clang again, crash again.

Diagnosis: clang compiler bug in bitfield ivars. The patient's symptoms were treated with an extra ivar in DVTSourceModelItem until a compiler transplant could be performed.

Elapsed time: about three hours. Too long for an episode of a TV procedural drama, unfortunately.

(link) TargetConditionals.h   (2010-8-16 2:30 PM)
 
Mac OS XiOS deviceiOS simulator
TARGET_OS_MAC111
TARGET_OS_IPHONE011
TARGET_OS_EMBEDDED010
TARGET_IPHONE_SIMULATOR001

(link) Do-it-yourself Objective-C weak import   (2010-4-8 10:23 PM)
 

WARNING DANGER HAZARD BEWARE EEK

The scheme described herein is UNTESTED and probably BUGGY. Use at your own risk.

Executive summary

The Objective-C runtime supports weak-imported classes back to iPhone OS 3.1. An app could use a class added in iPhone OS 3.2 or 4.0 and still run on 3.1. The app would check if [SomeClass class] is nil and act accordingly.

Unfortunately, the compilers and class declarations in framework headers do not support weak import yet. But you may be able to use weak linking anyway, by adding the right incantations yourself.

To use a class SomeClass that is unavailable on some of your app's deployment targets, write this in every file that uses the class:

    asm(".weak_reference _OBJC_CLASS_$_SomeClass");
To subclass a class SomeClass that is unavailable on some of your app's deployment targets, write this in the file containing your subclass's @implementation:
    asm(".weak_reference _OBJC_CLASS_$_SomeClass");
    asm(".weak_reference _OBJC_METACLASS_$_SomeClass");
This will not work for apps running on iPhone OS 3.0 or older. Only iPhone OS 3.1 and newer has any hope of success. Of course, since this is UNTESTED it may not work there either.

How it works

Say you're writing a game, and want to use the hypothetical UIDancePad class added to iPhone OS 3.2. (Do not dance on iPad.) When you use class UIDancePad in your code, the compiler emits a C symbol pointing to the class:

    .long _OBJC_CLASS_$_UIDancePad

Since UIDancePad is in a framework instead of your code, the symbol remains undefined in your executable, as shown by `nm -m`:

    (undefined) external _OBJC_CLASS_$_UIDancePad (from DanceKit)

When you run on iPhone OS 3.2, everything works great: the dynamic loader opens your executable and DanceKit, and binds your undefined symbol to their class definition.

Things don't go so well on iPhone OS 3.1. DanceKit exists but does not define UIDancePad. The dynamic loader is unable to resolve your undefined symbol, and the process halts:

    dyld: Symbol not found: _OBJC_CLASS_$_UIDancePad
        Referenced from: /path/to/YourApp
        Expected in: /path/to/DanceKit

Weak import solves this. The compiled symbol reference is now a weak one:

    .weak_reference _OBJC_CLASS_$_UIDancePad
    .long _OBJC_CLASS_$_UIDancePad

    (undefined) weak external _OBJC_CLASS_$_UIDancePad (from DanceKit)

The dynamic loader shrugs its shoulders if a weak reference cannot be resolved, and sets the pointer to NULL. The Objective-C runtime sees the NULL pointer and fixes up the rest of the metadata as if UIDancePad never existed.

As mentioned above, the compiler and framework header support is not yet in place. The incantations simply add the assembler directives that the compiler does not yet know how to emit:

    asm(".weak_reference _OBJC_CLASS_$_UIDancePad");

Et voilà: weak import of an Objective-C class. Well, maybe. I have only tested this on toy examples, none of which got anywhere close to any version of iPhone OS. Coder beware!

(What about the _OBJC_METACLASS symbol, you ask? When you subclass a class, your subclass's metaclass's superclass pointer points to the subclass's superclass's metaclass. In other words, your subclass's @implementation points to both its superclass and its superclass's metaclass. That requires two symbols: one for the class and one for the metaclass. When you simply use a class without subclassing it, you don't need the metaclass pointer.)

(link) [objc explain]: Weak-import classes   (2009-09-09 1:30 PM)
 

Weak-import classes are a useful new Objective-C feature that you can't use yet.

Weak import is a solution when you want to use something from a framework, but still need to be compatible with older versions of the framework that didn't support it yet. Using weak import you can test if the feature exists at runtime before you try to use it.

Objective-C has not previously supported weak import for classes. Instead you had to use clumsy runtime introspection to check whether a class was available, store a pointer to that class in a variable, and use that variable when you wanted to send a message to the class. Even worse, there was no reasonable way to create your own subclass of a superclass that might be unavailable. Some developers put the subclass in a separate library that was not loaded until after checking that the superclass was present, but even that trick is not allowed on iPhone OS.

Weak import for C functions works by checking the weak-imported function pointer's value before calling it:

    if (NSNewFunction != NULL) {
        NSNewFunction(...);
    } else {
        // NSNewFunction not supported on this system
    }

The same mechanism is a natural fit with Objective-C classes and Objective-C's handling of messages to nil. These constructs are much nicer than NSClassFromString() or a separate NSBundle.

    if ([NSNewClass class] != nil) {
        [NSNewClass doSomething];
    } else {
        // NSNewClass is unavailable on this system
    }
    @interface MySubclass : NSNewClass ... @end
    MySubclass *obj = [[MySubclass alloc] init];
    if (!obj) {
        // MySubclass (or a superclass thereof) is unavailable on this system
    }

Weak import of Objective-C classes is now available. But you can't use it yet. First, it's only supported today on iPhone OS 3.1; it's expected to arrive in a future Mac OS.

Second, there's nothing you can do with weak import until the first OS update after iPhone OS 3.1. Then you could write an app that adopted new features in that future version, and used weak import to be compatible with 3.1. (It still could not run on 3.0 or 2.x, because those systems lack the runtime machinery to process the weak import references.)

Weak import for Objective-C did not make Snow Leopard for scheduling reasons. Assuming it ships in Mac OS X 10.7 Cat Name Forthcoming, you won't be able to use it until Mac OS X 10.8 LOLcat.

(link) Colorized keyboard backlight   (2009-09-05 1:15 AM)
 

I use my MacBook Pro for astronomy. The backlit keyboard would be great in the dark, but its white light is bad for night vision. This mod makes it red, or any other color you want.

(link) [objc explain]: Selector uniquing in the dyld shared cache   (2009-09-01 2:10 AM)
 

Mac OS X Snow Leopard cuts in half the launch-time overhead of starting the Objective-C runtime, and simultaneously saves a few hundred KB of memory per app. This comes for free to every app, courtesy of one of the few pieces of Mac OS X that lives below even the Objective-C runtime: dyld.

dyld and the shared cache

dyld is the dynamic loader and linker. When your process starts, dyld loads your executable and its shared libraries into memory, links the cross-library C function and variable references together, and starts execution on its way towards main().

In theory a shared library could be different every time your program is run. In practice, you get the same version of the shared libraries almost every time you run, and so does every other process on the system. The system takes advantage of this by building the dyld shared cache. The shared cache contains a copy of many system libraries, with most of dyld's linking and loading work done in advance. Every process can then share that shared cache, saving memory and launch time.

(Incidentally, the shared cache beats the pants off the pre-Leopard prebinding system that was supposed to achieve the same optimizations. Remember the post-install "Optimizing System Performance" step that often took longer than the install itself? That was prebinding being updated. Rebuilding the shared cache is so blazingly fast that the installer doesn't bother to report it anymore.)

Objective-C selector uniquing

Leopard's dyld shared cache is great for C code, but it didn't do anything to help Objective-C's startup overhead. The single biggest launch cost for Objective-C is selector uniquing. The app and every shared library contain their own copies of selector names like "alloc" and "init". The runtime needs to choose a single canonical SEL pointer value for each selector name, and then update the metadata for every call site and method list to use the blessed unique value. This means building a big hash table (memory), calling strcmp() a lot (time), and modifying copy-on-write metadata (more memory).

There are tens of thousands of unique selectors present in a typical process. If you run `strings /usr/lib/libobjc.dylib` on Leopard you can see the thirty-thousand-line built-in selector table that was a previous attempt to reduce the memory cost. Even so the cost goes up with every new class and method added to Cocoa.framework; left unchecked, an identical app would take longer to launch and use more memory after every OS upgrade.

The obvious solution? Do the work of selector uniquing in the dyld shared cache. Build a selector table into the shared cache itself, and update the selector references in the cached copy of the shared libraries. Then you save memory because every process shares the same selector table, and save time because the runtime does not need to rebuild it during every app launch. The runtime only needs to fix the selector references from the app itself. The catch? Selectors are too dynamic to be implemented as C symbols, so the shared cache construction tool needed to be taught how to read and write Objective-C's metadata.

Optimization WIN

Snow Leopard's dyld shared cache uniques Objective-C selectors, and Snow Leopard's Objective-C runtime recognizes when the selectors in a shared library are already uniqued courtesy of the shared cache. About half of the runtime's initialization time is eliminated, making warm app launch several tenths of a second faster. Typical memory savings is 200-500 KB per process, adding up to a few megabytes system-wide. When this optimization ships on the iPhone OS side, it's estimated to save 1 MB on a 128 MB device. The iPhone performance team would pay any number of arms and legs for that kind of gain.

You can watch the system in action with various debugging flags.

$ sudo /usr/bin/update_dyld_shared_cache -debug -verify
[...]
update_dyld_shared_cache: for x86_64, uniquing objc selectors
update_dyld_shared_cache: for x86_64, found 68761 unique objc selectors
update_dyld_shared_cache: for x86_64, 541736/590908 bytes (91%) used in libobjc unique selector section
update_dyld_shared_cache: for x86_64, updated 205230 selector references

$ OBJC_PRINT_PREOPTIMIZATION=YES /usr/bin/defaults
objc[424]: PREOPTIMIZATION: selector preoptimization ENABLED (version 3)
objc[424]: PREOPTIMIZATION: honoring preoptimized selectors in /usr/lib/libobjc.A.dylib
objc[424]: PREOPTIMIZATION: honoring preoptimized selectors in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
objc[424]: PREOPTIMIZATION: honoring preoptimized selectors in /System/Library/Frameworks/CoreServices.framework/Versions/A/Frameworks/Metadata.framework/Versions/A/Metadata
objc[424]: PREOPTIMIZATION: honoring preoptimized selectors in /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation

You can estimate the memory savings with the allmemory tool. Record post-launch memory usage of an app run with and without environment variable OBJC_DISABLE_PREOPTIMIZATION=YES. Look for the count of dirty pages; each dirty page is 4 KB eaten by that process. With 64-bit TextEdit I see the dirty page count jump from 725 to 1069 after disabling the optimization. This is an overestimate - many of those pages would have been not-dirty in Leopard because of the old built-in selector table - but it does show the magnitude of the win.

The Objective-C runtime does more than just selector uniquing during launch. Future improvements to the dyld shared cache may precompute some of that other work, to further improve launch time, save memory, and reduce the cost of linking to Objective-C code that you don't actually use. But selector uniquing as seen in Snow Leopard is by far the biggest bang for the buck.

archive

seal! Greg Parker
gparker-web@sealiesoftware.com
Sealie Software