Mach-O (Mach Object) is the binary format used for all executables, dynamic libraries (.dylib), frameworks, and kernel extensions on Apple platforms.


Why You Need to Understand Mach-O

  • RE tools (IDA, Ghidra) parse Mach-O to display code – you need to know what they are parsing
  • Exploits sometimes need to forge Mach-O structures (fake headers or fake objects that a parser will trust)
  • Code signatures are embedded in the Mach-O file itself (in __LINKEDIT)
  • The dyld shared cache packs the Mach-O images of most system frameworks into one large cache file
  • The kernelcache is also in Mach-O format

Overall Structure

┌───────────────────────────────┐
│         Mach-O Header         │  ← magic, cputype, filetype, ncmds
├───────────────────────────────┤
│         Load Commands         │  ← LC_SEGMENT_64, LC_SYMTAB, LC_DYSYMTAB, ...
│  (array of command structs)   │     Describes the binary's layout in memory
├───────────────────────────────┤
│                               │
│           Segments            │
│  ┌─────────────────────────┐  │
│  │ __TEXT segment          │  │  ← Code, read-only data, string constants
│  │  ├── __text             │  │     Machine code
│  │  ├── __stubs            │  │     PLT-equivalent for lazy binding
│  │  ├── __stub_helper      │  │     Helper code for lazy binding
│  │  ├── __cstring          │  │     C string literals
│  │  └── __const            │  │     Read-only constants
│  ├─────────────────────────┤  │
│  │ __DATA segment          │  │  ← Writable data
│  │  ├── __data             │  │     Initialized global variables
│  │  ├── __bss              │  │     Uninitialized globals (zero-filled)
│  │  ├── __objc_classlist   │  │     ObjC class definitions
│  │  ├── __objc_selrefs     │  │     ObjC selector references
│  │  ├── __got              │  │     Global Offset Table
│  │  └── __la_symbol_ptr    │  │     Lazy symbol pointers
│  ├─────────────────────────┤  │
│  │ __DATA_CONST segment    │  │  ← Data writable only at load time
│  │  └── __const            │  │     vtables, method tables (locked after dyld)
│  ├─────────────────────────┤  │
│  │ __LINKEDIT segment      │  │  ← Metadata for linker
│  │  (symbol table, string  │  │
│  │   table, code signature,│  │
│  │   relocation info)      │  │
│  └─────────────────────────┘  │
└───────────────────────────────┘

Detailed Components

1. Mach-O Header

struct mach_header_64 {
    uint32_t magic;       // 0xFEEDFACF (64-bit) or 0xFEEDFACE (32-bit)
    cpu_type_t cputype;   // CPU_TYPE_ARM64 = 0x0100000C
    cpu_subtype_t cpusubtype;  // CPU_SUBTYPE_ARM64E = 0x02 (PAC-enabled)
    uint32_t filetype;    // MH_EXECUTE, MH_DYLIB, MH_KEXT_BUNDLE, ...
    uint32_t ncmds;       // Number of load commands
    uint32_t sizeofcmds;  // Total size of load commands
    uint32_t flags;       // MH_PIE, MH_NO_HEAP_EXECUTION, ...
    uint32_t reserved;    // Padding (64-bit only)
};

Important file types:

Constant         Value   Meaning
MH_EXECUTE       0x2     Executable (apps, daemons)
MH_DYLIB         0x6     Dynamic library (.dylib)
MH_DYLINKER      0x7     Dynamic linker (dyld itself)
MH_BUNDLE        0x8     Loadable bundle (.bundle)
MH_KEXT_BUNDLE   0xB     Kernel extension
MH_FILESET       0xC     File set: kernel + kexts in one file (newer kernelcaches)

Important CPU subtypes:

  • CPU_SUBTYPE_ARM64_ALL (0x0) – standard arm64
  • CPU_SUBTYPE_ARM64E (0x2) – arm64e with PAC support (A12+)
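
The header fields above can be decoded with a few lines of Python. The following is a minimal sketch, not a complete parser: the function name `decode_header` and the lookup tables are illustrative (hypothetical) names, only the constants listed in this section are covered, and a thin little-endian 64-bit file is assumed. Note that the top byte of cpusubtype carries capability bits, so it is masked off before the subtype is compared.

```python
import struct

# Subset of constants from <mach-o/loader.h> and <mach/machine.h>
MH_MAGIC_64 = 0xFEEDFACF
CPU_TYPE_ARM64 = 0x0100000C
CPU_SUBTYPE_MASK = 0xFF000000   # top byte holds capability bits, not the subtype
MH_PIE = 0x200000
FILETYPES = {0x2: "MH_EXECUTE", 0x6: "MH_DYLIB", 0x7: "MH_DYLINKER",
             0x8: "MH_BUNDLE", 0xB: "MH_KEXT_BUNDLE", 0xC: "MH_FILESET"}
ARM64_SUBTYPES = {0x0: "ARM64_ALL", 0x2: "ARM64E"}  # only meaningful for arm64

def decode_header(data):
    """Decode a little-endian mach_header_64 (32 bytes) into a dict."""
    (magic, cputype, cpusubtype, filetype,
     ncmds, sizeofcmds, flags, _reserved) = struct.unpack_from("<8I", data, 0)
    if magic != MH_MAGIC_64:
        raise ValueError(f"not a little-endian 64-bit Mach-O: 0x{magic:08x}")
    subtype = cpusubtype & ~CPU_SUBTYPE_MASK  # drop the capability byte
    return {
        "is_arm64": cputype == CPU_TYPE_ARM64,
        "subtype": ARM64_SUBTYPES.get(subtype, hex(subtype)),
        "filetype": FILETYPES.get(filetype, hex(filetype)),
        "ncmds": ncmds,
        "sizeofcmds": sizeofcmds,
        "pie": bool(flags & MH_PIE),
    }
```

Passing the first 32 bytes of a thin arm64e executable should yield a dict with subtype "ARM64E" and filetype "MH_EXECUTE"; a fat binary is rejected because its magic differs.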

2. Load Commands

Load commands are a sequence of variable-length structs placed immediately after the header; the cmdsize field of each one tells you where the next begins. Each command starts with:

struct load_command {
    uint32_t cmd;       // LC_SEGMENT_64, LC_SYMTAB, ...
    uint32_t cmdsize;   // Size of this command (including this header)
};

Most important load commands:

LC_SEGMENT_64 – Defines memory segments

struct segment_command_64 {
    uint32_t  cmd;            // LC_SEGMENT_64
    uint32_t  cmdsize;
    char      segname[16];    // "__TEXT", "__DATA", ...
    uint64_t  vmaddr;         // Virtual address when loaded
    uint64_t  vmsize;         // Size in virtual memory
    uint64_t  fileoff;        // Offset in file
    uint64_t  filesize;       // Size in file
    vm_prot_t maxprot;        // Maximum protection (r/w/x)
    vm_prot_t initprot;       // Initial protection
    uint32_t  nsects;         // Number of sections in segment
    uint32_t  flags;
};

Protection bits:

  • VM_PROT_READ (0x1) – __TEXT has read
  • VM_PROT_WRITE (0x2) – __DATA has write
  • VM_PROT_EXECUTE (0x4) – __TEXT has execute
  • __TEXT: r-x (read + execute, no write – W^X policy)
  • __DATA: rw- (read + write, no execute)
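
To make the segment layout and protection bits concrete, here is a small sketch that unpacks the fixed 72-byte part of a segment_command_64 and renders the protection masks in rwx notation. The helper names `prot_string` and `decode_segment` are hypothetical, and little-endian input is assumed.

```python
import struct

VM_PROT_READ, VM_PROT_WRITE, VM_PROT_EXECUTE = 0x1, 0x2, 0x4

def prot_string(prot):
    """Render a vm_prot_t bitmask in the familiar 'rwx' notation."""
    pairs = (("r", VM_PROT_READ), ("w", VM_PROT_WRITE), ("x", VM_PROT_EXECUTE))
    return "".join(ch if prot & bit else "-" for ch, bit in pairs)

def decode_segment(cmd_bytes):
    """Decode the fixed 72-byte part of a little-endian segment_command_64."""
    (cmd, cmdsize, segname, vmaddr, vmsize, fileoff, filesize,
     maxprot, initprot, nsects, flags) = struct.unpack_from(
        "<II16sQQQQiiII", cmd_bytes, 0)
    return {
        # segname is NUL-padded to 16 bytes
        "name": segname.split(b"\0", 1)[0].decode("ascii", "replace"),
        "vmaddr": vmaddr,
        "vmsize": vmsize,
        "fileoff": fileoff,
        "filesize": filesize,
        "maxprot": prot_string(maxprot),
        "initprot": prot_string(initprot),
        "nsects": nsects,
    }
```

For a typical __TEXT segment this gives initprot "r-x", and for __DATA "rw-", matching the list above.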

LC_CODE_SIGNATURE – Code signing data

This load command points to the code signature blob in __LINKEDIT. The blob contains:

  • Code Directory: hash of each page
  • CMS Signature: PKCS#7 certificate chain + signature
  • Entitlements blob: plist XML/binary
  • Requirements blob: code requirements
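
The components listed above are stored as individual blobs wrapped in one "SuperBlob". The sketch below lists them, assuming `sig` holds the raw bytes of the signature region that LC_CODE_SIGNATURE points to. The function name `list_signature_blobs` is hypothetical, and the magic table covers only the blob kinds named in this section. Unlike the rest of a Mach-O on arm64, code-signing structures are big-endian.

```python
import struct

CSMAGIC_EMBEDDED_SIGNATURE = 0xFADE0CC0  # SuperBlob wrapping the other blobs
BLOB_MAGICS = {
    0xFADE0C02: "CodeDirectory",
    0xFADE0C01: "Requirements",
    0xFADE7171: "Entitlements",
    0xFADE0B01: "CMS signature wrapper",
}

def list_signature_blobs(sig):
    """Return (slot_type, blob_name, blob_length) for each blob in a SuperBlob."""
    magic, length, count = struct.unpack_from(">III", sig, 0)
    if magic != CSMAGIC_EMBEDDED_SIGNATURE:
        raise ValueError(f"not an embedded signature: 0x{magic:08x}")
    blobs = []
    for i in range(count):
        # Index entries (type, offset) follow the 12-byte SuperBlob header
        slot_type, offset = struct.unpack_from(">II", sig, 12 + 8 * i)
        # Every blob starts with its own magic and length
        blob_magic, blob_len = struct.unpack_from(">II", sig, offset)
        name = BLOB_MAGICS.get(blob_magic, f"unknown 0x{blob_magic:08x}")
        blobs.append((slot_type, name, blob_len))
    return blobs
```

Offsets in the index are relative to the start of the SuperBlob, so the same function works on a slice of the file taken at the signature's data offset.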

LC_ENCRYPTION_INFO_64 – App Store encryption

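App Store apps ship with part of the __TEXT segment encrypted. The command's cryptoff and cryptsize fields give the encrypted file range, and a nonzero cryptid means the range is still encrypted. Such a binary must be decrypted before static analysis.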
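
Whether a binary is still encrypted can be checked by reading the cryptid field of this load command. A minimal sketch, assuming `cmd_bytes` holds one little-endian encryption_info_command_64 (24 bytes); the function name `decode_encryption_info` is hypothetical.

```python
import struct

LC_ENCRYPTION_INFO_64 = 0x2C

def decode_encryption_info(cmd_bytes):
    """Decode an encryption_info_command_64 and report the encrypted range."""
    cmd, cmdsize, cryptoff, cryptsize, cryptid, _pad = struct.unpack_from(
        "<6I", cmd_bytes, 0)
    if cmd != LC_ENCRYPTION_INFO_64:
        raise ValueError(f"unexpected load command 0x{cmd:x}")
    return {
        "cryptoff": cryptoff,        # file offset of the encrypted range
        "cryptsize": cryptsize,      # size of the encrypted range in bytes
        "encrypted": cryptid != 0,   # 0 means the range is not encrypted
    }
```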

Others

  • LC_SYMTAB – Symbol table location
  • LC_DYSYMTAB – Dynamic symbol table
  • LC_LOAD_DYLIB – Dependency declarations
  • LC_MAIN – Entry point (offset from __TEXT)
  • LC_UUID – Unique build identifier
  • LC_SOURCE_VERSION – Source version info
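
Several of these commands have small fixed bodies that are easy to decode by hand. The sketch below handles three of them (LC_MAIN, LC_UUID, LC_SOURCE_VERSION) and falls back to a generic line for everything else. The function name `describe_command` is hypothetical; `cmd_bytes` is assumed to hold exactly one little-endian load command.

```python
import struct
import uuid

LC_REQ_DYLD = 0x80000000
LC_MAIN = 0x28 | LC_REQ_DYLD
LC_UUID = 0x1B
LC_SOURCE_VERSION = 0x2A

def describe_command(cmd_bytes):
    """Return a one-line description of a single load command."""
    cmd, cmdsize = struct.unpack_from("<II", cmd_bytes, 0)
    if cmd == LC_MAIN:
        # entry_point_command: entryoff is the file offset of main()
        entryoff, stacksize = struct.unpack_from("<QQ", cmd_bytes, 8)
        return f"LC_MAIN entryoff=0x{entryoff:x} stacksize={stacksize}"
    if cmd == LC_UUID:
        # uuid_command: 16 raw UUID bytes follow the 8-byte header
        return f"LC_UUID {uuid.UUID(bytes=bytes(cmd_bytes[8:24]))}"
    if cmd == LC_SOURCE_VERSION:
        # version is A.B.C.D.E packed into 24.10.10.10.10 bits
        (v,) = struct.unpack_from("<Q", cmd_bytes, 8)
        parts = [v >> 40, (v >> 30) & 0x3FF, (v >> 20) & 0x3FF,
                 (v >> 10) & 0x3FF, v & 0x3FF]
        return "LC_SOURCE_VERSION " + ".".join(str(p) for p in parts)
    return f"cmd=0x{cmd:x} size={cmdsize}"
```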

3. Sections

Each segment contains multiple sections:

struct section_64 {
    char      sectname[16];   // "__text", "__cstring", ...
    char      segname[16];    // Parent segment name
    uint64_t  addr;           // Virtual address
    uint64_t  size;
    uint32_t  offset;         // File offset
    uint32_t  align;          // Alignment as a power-of-two exponent (2^align bytes)
    uint32_t  reloff;         // Relocation entries offset
    uint32_t  nreloc;         // Number of relocations
    uint32_t  flags;          // Section type + attributes
    uint32_t  reserved1;      // Indirect symbol table index (for stubs)
    uint32_t  reserved2;      // Stub size (for stubs)
    uint32_t  reserved3;
};
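
The section_64 records sit directly after the fixed part of their LC_SEGMENT_64 command, 80 bytes each. The following sketch decodes them, assuming `cmd_bytes` holds one complete little-endian segment command; `decode_sections` is a hypothetical helper name and only the stub-related section type is interpreted.

```python
import struct

SEGMENT_CMD_SIZE = 72            # fixed part of segment_command_64
SECTION_SIZE = 80                # sizeof(struct section_64)
SECTION_TYPE_MASK = 0x000000FF   # low byte of flags is the section type
S_SYMBOL_STUBS = 0x8             # section type used by __stubs

def decode_sections(cmd_bytes, nsects):
    """Decode the section_64 records that follow a segment_command_64."""
    sections = []
    for i in range(nsects):
        (sectname, segname, addr, size, offset, align, reloff, nreloc,
         flags, reserved1, reserved2, reserved3) = struct.unpack_from(
            "<16s16sQQ8I", cmd_bytes, SEGMENT_CMD_SIZE + SECTION_SIZE * i)
        sect_type = flags & SECTION_TYPE_MASK
        sections.append({
            "name": sectname.split(b"\0", 1)[0].decode("ascii", "replace"),
            "segment": segname.split(b"\0", 1)[0].decode("ascii", "replace"),
            "addr": addr,
            "size": size,
            "offset": offset,
            "alignment": 1 << align,   # align is stored as an exponent
            "type": sect_type,
            # reserved2 is the per-stub size only for stub sections
            "stub_size": reserved2 if sect_type == S_SYMBOL_STUBS else None,
        })
    return sections
```

Dividing a stub section's size by its stub_size gives the number of stubs, which is a quick sanity check when locating a particular import.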

Sections important for exploitation:

Section            Segment        Content                      Exploitation relevance
__text             __TEXT         Machine code                 Gadget hunting, code analysis
__stubs            __TEXT         Lazy binding stubs           Indirect calls, hooking points
__const            __TEXT         Read-only constants          vtable locations, method tables
__const            __DATA_CONST   Writable-at-load constants   vtable overwrite (before locked)
__got              __DATA         Global Offset Table          GOT overwrite attacks
__la_symbol_ptr    __DATA         Lazy symbol pointers         Symbol pointer overwrite
__objc_classlist   __DATA         ObjC class list              Class method swizzling
__bss              __DATA         Zero-initialized data        Uninitialized variable exploitation

4. Fat / Universal Binaries

struct fat_header {
    uint32_t magic;       // 0xCAFEBABE (big-endian!)
    uint32_t nfat_arch;   // Number of architectures
};

struct fat_arch {
    cpu_type_t cputype;
    cpu_subtype_t cpusubtype;
    uint32_t offset;      // Offset to Mach-O for this arch
    uint32_t size;
    uint32_t align;
};

A fat binary is a container holding multiple Mach-Os (arm64, arm64e, x86_64, …). The loader picks the matching slice: the kernel does this for the main executable at exec time, and dyld does it for the dynamic libraries it loads.
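
Listing the slices of a fat binary only requires reading the two structs above, keeping in mind that both are stored big-endian regardless of the slices' own byte order. A minimal sketch follows; `list_fat_slices` and the small `CPU_NAMES` table are hypothetical names, and only the classic 32-bit fat_arch layout shown here is handled.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}

def list_fat_slices(data):
    """Return one dict per architecture slice in a fat (universal) binary."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)  # big-endian header
    if magic != FAT_MAGIC:
        raise ValueError(f"not a fat binary: 0x{magic:08x}")
    slices = []
    for i in range(nfat_arch):
        # Each fat_arch is 20 bytes and follows the 8-byte fat_header
        cputype, cpusubtype, offset, size, align = struct.unpack_from(
            ">5I", data, 8 + 20 * i)
        slices.append({
            "cpu": CPU_NAMES.get(cputype, hex(cputype)),
            "cpusubtype": cpusubtype & 0x00FFFFFF,  # drop capability bits
            "offset": offset,                       # where the thin Mach-O starts
            "size": size,
        })
    return slices
```

The bytes at data[offset:offset + size] of a slice form an ordinary thin Mach-O that can be handed to a header parser.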

5. Kernelcache (MH_FILESET)

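Recent kernelcaches use the MH_FILESET type, which packs the kernel and all kernel extensions into a single file. (MH_FILESET appeared in the macOS 11 generation of <mach-o/loader.h>; older kernelcaches, including those of iOS 12, are prelinked MH_EXECUTE images.)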

Kernelcache (MH_FILESET)
├── kernel (com.apple.kernel)
├── com.apple.iokit.IOSurface
├── com.apple.iokit.IOGraphicsFamily
├── com.apple.driver.AppleARMPlatform
├── com.apple.security.sandbox
└── ... (hundreds of kexts)

Each entry is an embedded Mach-O with its own segments/sections, located through an LC_FILESET_ENTRY load command in the outer header.
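
The embedded images of a fileset are found through LC_FILESET_ENTRY load commands, each of which names one entry and gives its file offset. The sketch below decodes a single such command, following the fileset_entry_command layout in <mach-o/loader.h>. The function name `decode_fileset_entry` is hypothetical, and the input is assumed to be one complete little-endian command.

```python
import struct

LC_REQ_DYLD = 0x80000000
LC_FILESET_ENTRY = 0x35 | LC_REQ_DYLD

def decode_fileset_entry(cmd_bytes):
    """Return (entry_id, vmaddr, fileoff) from a fileset_entry_command."""
    cmd, cmdsize, vmaddr, fileoff, id_offset, _reserved = struct.unpack_from(
        "<IIQQII", cmd_bytes, 0)
    if cmd != LC_FILESET_ENTRY:
        raise ValueError(f"unexpected load command 0x{cmd:x}")
    # entry_id is an lc_str: an offset from the start of the command
    # to a NUL-terminated string stored inside the command itself
    raw_id = bytes(cmd_bytes[id_offset:cmdsize])
    entry_id = raw_id.split(b"\0", 1)[0].decode("ascii", "replace")
    return entry_id, vmaddr, fileoff
```

Seeking to fileoff in the kernelcache should land on the mach_header_64 of the named entry.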


Practical Analysis

Dump Mach-O header

# View header (on macOS, ls lives in /bin)
otool -hV /bin/ls
# Example output (values vary by machine and OS version):
# Mach header
#       magic  cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
# MH_MAGIC_64    ARM64        ALL  0x00     EXECUTE    19       1496   NOUNDEFS DYLDLINK TWOLEVEL PIE

# View load commands
otool -lV /bin/ls

# View sections
otool -l /bin/ls | grep -A5 "sectname"

Parse with Python

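import struct
import sys

MH_MAGIC_64 = 0xFEEDFACF  # 64-bit Mach-O, little-endian on disk

def parse_macho(path):
    """Print the load commands of a thin 64-bit little-endian Mach-O file."""
    with open(path, 'rb') as f:
        magic = struct.unpack('<I', f.read(4))[0]
        if magic != MH_MAGIC_64:
            # Fat binaries (0xCAFEBABE, big-endian) and 32-bit files are not handled here
            print(f"unsupported magic 0x{magic:08x}")
            return
        # Remaining mach_header_64 fields: 7 x 4 bytes = 28 bytes
        cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved = \
            struct.unpack('<iiIIIII', f.read(28))
        print(f"64-bit Mach-O, {ncmds} load commands")
        for i in range(ncmds):
            cmd, cmdsize = struct.unpack('<II', f.read(8))
            print(f"  LC #{i}: cmd=0x{cmd:x}, size={cmdsize}")
            if cmdsize < 8:
                raise ValueError("corrupt load command: cmdsize < 8")
            f.seek(cmdsize - 8, 1)  # skip the rest of this command

if __name__ == '__main__':
    parse_macho(sys.argv[1])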

Resources

  • Apple <mach-o/loader.h> – definitive struct definitions
  • LIEF Project – Library for parsing Mach-O (Python/C++)
  • Jonathan Levin – macOS and iOS Internals Vol. I (Binary Format chapter)
  • Mach-O Wikipedia

Exercises

  1. Hex dump analysis: Open a Mach-O binary in a hex editor, identify the header, first load command, and __TEXT segment by hand
  2. Write a Mach-O parser: Write a Python script to parse the header + all load commands
  3. Compare arm64 vs arm64e: Extract the same binary for both architectures, compare load commands
  4. Extract embedded Mach-Os: From an MH_FILESET kernelcache, extract one kext Mach-O
  5. Modify a load command: On a copy of a binary, change one load command field (for example initprot) with a hex editor and observe how the loader behaves