Contents
  1. dyld: could not load inserted library
  2. libMainThreadChecker.dylib
  3. DYLD_INSERT_LIBRARIES injection
    3.1. The dyld source
    3.2. Hands-on
    3.3. Preventing dynamic injection

dyld: could not load inserted library

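Let's start from dyld: could not load inserted library '/usr/lib/libgmalloc.dylib' because image not found. The previous post, Malloc Scribble, mentioned that you can turn on the Malloc Scribble switch in the project to check memory (libgmalloc.dylib itself is the library behind Xcode's Guard Malloc diagnostic). Enabling it on a real device, however, produces this error: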

dyld: could not load inserted library '/usr/lib/libgmalloc.dylib' because image not found

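This happens because the file does not exist under /usr/lib on the phone. On iOS 12 it no longer crashes: the failure was downgraded to a warning that only prints a log line.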

libMainThreadChecker.dylib

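A long time ago I read satanwoo's article "基于桥的全量方法Hook方案 - 探究苹果主线程检查实现" (a bridge-based scheme for hooking every method: exploring Apple's Main Thread Checker implementation). The article digs into how libMainThreadChecker.dylib works, and at the end he left one question open (he has since figured it out, he just hasn't updated the post).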

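After enabling Main Thread Checker I built the product once, but when I looked through the Load Commands section of the Mach-O file I found no trace of libMainThreadChecker.dylib.
A symbolic breakpoint on dlopen did not catch this dylib being loaded either, so I am very curious how Apple loads it. If anyone knows, please enlighten me.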

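Recently I have been working on jailbreaking and reverse engineering. When this unresolved question came back to mind, my first guess was injection.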

DYLD_INSERT_LIBRARIES injection

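A couple of days ago I read the articles posted by tk 教主's bot. One of them covers injection and anti-injection on the Mac: https://theevilbit.github.io/posts/dyld_insert_libraries_dylib_injection_in_macos_osx_deep_dive/ . It mentions that on the Mac you can inject a dylib through the DYLD_INSERT_LIBRARIES environment variable.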

Create a new iOS app and enable Main Thread Checker. Then print the DYLD_INSERT_LIBRARIES environment variable: nothing relevant shows up.

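// Method 1: read the environment through Foundation
NSDictionary *dict = [[NSProcessInfo processInfo] environment];
NSLog(@"%@", dict[@"DYLD_INSERT_LIBRARIES"]);
// Method 2: read it through the C library
NSLog(@"%s", getenv("DYLD_INSERT_LIBRARIES") ?: "(null)");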

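Neither method returns anything, though the two presumably read from the same source. At this point the only option is to pull out the dyld source and read it.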

The dyld source

Search directly for

dyld: could not load inserted

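No results. Great, shut down the computer and go to bed.

As if. The message must be assembled from several strings, so drop the prefix and it turns up right away: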

could not load inserted

static void loadInsertedDylib(const char* path)
{
    ImageLoader* image = NULL;
    unsigned cacheIndex;
    try {
        LoadContext context;
        context.useSearchPaths        = false;
        context.useFallbackPaths    = false;
        context.useLdLibraryPath    = false;
        context.implicitRPath        = false;
        context.matchByInstallName    = false;
        context.dontLoad            = false;
        context.mustBeBundle        = false;
        context.mustBeDylib            = true;
        context.canBePIE            = false;
        context.enforceIOSMac        = true;
        context.origin                = NULL;    // can't use @loader_path with DYLD_INSERT_LIBRARIES
        context.rpath                = NULL;
        image = load(path, context, cacheIndex);
    }
    catch (const char* msg) {
        if ( gLinkContext.allowInsertFailures )
            dyld::log("dyld: warning: could not load inserted library '%s' into hardened process because %s\n", path, msg);
        else
            halt(dyld::mkstringf("could not load inserted library '%s' because %s\n", path, msg));
    }
    catch (...) {
        halt(dyld::mkstringf("could not load inserted library '%s'\n", path));
    }
}

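Clearly, the strings in the catch blocks are the messages that get printed. The function name is just as clear: it loads an inserted dylib. It is called from only one place, line 6188 in the _main function.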

// load any inserted libraries
if    ( sEnv.DYLD_INSERT_LIBRARIES != NULL ) {
    for (const char* const* lib = sEnv.DYLD_INSERT_LIBRARIES; *lib != NULL; ++lib) 
        loadInsertedDylib(*lib);
}
// record count of inserted libraries so that a flat search will look at 
// inserted libraries, then main, then others.
sInsertedDylibCount = sAllImages.size()-1;

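If sEnv.DYLD_INSERT_LIBRARIES is not NULL, each listed dylib is inserted. So where does the value of sEnv.DYLD_INSERT_LIBRARIES come from? A global search shows it is set in the processDyldEnvironmentVariable function: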

#endif
    else if ( strcmp(key, "DYLD_IMAGE_SUFFIX") == 0 ) {
        gLinkContext.imageSuffix = parseColonList(value, NULL);
    }
    else if ( strcmp(key, "DYLD_INSERT_LIBRARIES") == 0 ) {
        sEnv.DYLD_INSERT_LIBRARIES = parseColonList(value, NULL);
#if SUPPORT_ACCELERATE_TABLES
        sDisableAcceleratorTables = true;
#endif
    }
    else if ( strcmp(key, "DYLD_PRINT_OPTS") == 0 ) {
        sEnv.DYLD_PRINT_OPTS = true;
    }
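The excerpt above hands the raw value to parseColonList. As a rough illustration of what splitting a colon-separated list involves, here is a minimal C sketch. It is not dyld's implementation: split_colon_list and free_colon_list are hypothetical helpers written for this post, and details such as how dyld treats empty segments are not modelled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated array produced by split_colon_list(). */
static void free_colon_list(char **list)
{
    if (list == NULL) return;
    for (char **p = list; *p != NULL; ++p)
        free(*p);
    free(list);
}

/* Hypothetical helper (not dyld's parseColonList): splits "a:b:c" into a
 * NULL-terminated array of freshly allocated strings.
 * Returns NULL if an allocation fails. */
static char **split_colon_list(const char *value)
{
    assert(value != NULL);

    /* One segment, plus one more for every ':' found. */
    size_t count = 1;
    for (const char *p = value; *p != '\0'; ++p)
        if (*p == ':') ++count;

    /* calloc zeroes the array, so the final slot is already NULL. */
    char **out = calloc(count + 1, sizeof *out);
    if (out == NULL) return NULL;

    const char *start = value;
    for (size_t i = 0; i < count; ++i) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        out[i] = malloc(len + 1);
        if (out[i] == NULL) { free_colon_list(out); return NULL; }
        memcpy(out[i], start, len);
        out[i][len] = '\0';
        start = end ? end + 1 : start + len;
    }
    return out;
}
```

Calling split_colon_list on a DYLD_INSERT_LIBRARIES value yields one entry per dylib path, which is the shape the loop around loadInsertedDylib walks.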

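This checks whether key is DYLD_INSERT_LIBRARIES and, if so, splits value on colons. The name parseColonList is clear enough that there is no need to read its body. Look at the signature of the enclosing function: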

void processDyldEnvironmentVariable(const char* key, const char* value, const char* mainExecutableDir)

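key and value are both parameters passed in from outside, so go look at the caller. At line 2192, inside checkEnvironmentVariables, processDyldEnvironmentVariable is called to handle the dyld environment variables, just as its name says. The data comes from the function's parameter const char* envp[], an array of strings.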

static void checkEnvironmentVariables(const char* envp[])
{
    if ( !gLinkContext.allowEnvVarsPath && !gLinkContext.allowEnvVarsPrint )
        return;
    const char** p;
    for(p = envp; *p != NULL; p++) {
        const char* keyEqualsValue = *p;
        if ( strncmp(keyEqualsValue, "DYLD_", 5) == 0 ) {
            const char* equals = strchr(keyEqualsValue, '=');
            if ( equals != NULL ) {
                strlcat(sLoadingCrashMessage, "\n", sizeof(sLoadingCrashMessage));
                strlcat(sLoadingCrashMessage, keyEqualsValue, sizeof(sLoadingCrashMessage));
                const char* value = &equals[1];
                const size_t keyLen = equals-keyEqualsValue;
                char key[keyLen+1];
                strncpy(key, keyEqualsValue, keyLen);
                key[keyLen] = '\0';
                if ( (strncmp(key, "DYLD_PRINT_", 11) == 0) && !gLinkContext.allowEnvVarsPrint )
                    continue;
                processDyldEnvironmentVariable(key, value, NULL);
            }
        }
        else if ( strncmp(keyEqualsValue, "LD_LIBRARY_PATH=", 16) == 0 ) {
            const char* path = &keyEqualsValue[16];
            sEnv.LD_LIBRARY_PATH = parseColonList(path, NULL);
        }
    }

    // the rest is omitted
}

Next, find where the environment pointer comes from. It turns out to come from _main, and the caller of _main is start. At last, found it. Here is the start function:

uintptr_t start(const struct macho_header* appsMachHeader, int argc, const char* argv[], 
                intptr_t slide, const struct macho_header* dyldsMachHeader,
                uintptr_t* startGlue)
{
    // if kernel had to slide dyld, we need to fix up load sensitive locations
    // we have to do this before using any global variables
    slide = slideOfMainExecutable(dyldsMachHeader);
    bool shouldRebase = slide != 0;
#if __has_feature(ptrauth_calls)
    shouldRebase = true;
#endif
    if ( shouldRebase ) {
        rebaseDyld(dyldsMachHeader, slide);
    }

    // allow dyld to use mach messaging
    mach_init();

    // kernel sets up env pointer to be just past end of agv array
    // ------------------ this is the key line ---------------------
    const char** envp = &argv[argc+1];

    // kernel sets up apple pointer to be just past end of envp array
    const char** apple = envp;
    while(*apple != NULL) { ++apple; }
    ++apple;

    // set up random value for stack canary
    __guard_setup(apple);

#if DYLD_INITIALIZER_SUPPORT
    // run all C++ initializers inside dyld
    runDyldInitializers(dyldsMachHeader, slide, argc, argv, envp, apple);
#endif

    // now that we are done bootstrapping dyld, call dyld's main
    uintptr_t appsSlide = slideOfMainExecutable(appsMachHeader);
    return dyld::_main(appsMachHeader, appsSlide, argc, argv, envp, apple, startGlue);
}

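env is the array that immediately follows the arg array. The background here is the launch-argument and environment list described in Advanced Programming in the UNIX Environment.
At the high end of the address space sit two arrays (each seems to be capped at 1024): first the arg table, then the env table. Below them is the stack memory. After that comes a stack canary setup. I first assumed it was related to address space layout randomization. After some digging it turned out not to be, although it is similar in spirit: both are defenses against stack overflow attacks.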
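The pointer arithmetic in start can be tried out on an ordinary array. The sketch below is only an illustration under stated assumptions: envp_after_argv and next_vector are hypothetical helper names, and the sample strings used in the usage note are made up. It assumes, as the dyld comments state, that the kernel places the three NULL-terminated pointer arrays back to back.

```c
#include <assert.h>
#include <stddef.h>

/* envp starts one slot past argv's terminating NULL,
 * mirroring `&argv[argc+1]` in dyld's start(). */
static const char **envp_after_argv(int argc, const char **argv)
{
    assert(argc >= 0 && argv != NULL && argv[argc] == NULL);
    return &argv[argc + 1];
}

/* Skips to the end of one NULL-terminated array and returns the slot
 * right after its NULL, mirroring how start() finds the apple array. */
static const char **next_vector(const char **vec)
{
    assert(vec != NULL);
    while (*vec != NULL)
        ++vec;
    return vec + 1;
}
```

Given a contiguous block such as {"prog", "-v", NULL, "HOME=/tmp", NULL, "executable_path=/tmp/prog", NULL}, envp_after_argv(2, block) points at "HOME=/tmp" and next_vector of that result points at the last string, which plays the role of the apple array.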

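Take the gcc compiler as an example. If stack protection is turned on at compile time, extra check instructions are added at function entry and return. On entry, a value known only to the operating system is saved on the stack just before the ret rip. On return, that stack slot is checked, and if the value has been modified the program is aborted. Because the value sits in front of the ret rip, overwriting the ret rip necessarily overwrites it too. The value is vividly called a canary.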

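Author: pandolia
Link: https://www.jianshu.com/p/47d484b9227e
Source: Jianshu
Copyright belongs to the author. For any form of reproduction, contact the author for authorization and cite the source.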

This article explains canaries well: https://veritas501.space/2017/04/28/%E8%AE%BAcanary%E7%9A%84%E5%87%A0%E7%A7%8D%E7%8E%A9%E6%B3%95/
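To make the canary idea concrete, here is a small simulation in C. It is not what the compiler emits: demo_frame, DEMO_CANARY and copy_and_check are hypothetical names, the canary is a fixed constant rather than a random value, and a struct stands in for a real stack frame so that the demo itself never writes out of bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed value; a real canary is randomised per process. */
#define DEMO_CANARY 0x5A5AA5A5u

/* Stand-in for a stack frame: a buffer with a guard value right after it. */
struct demo_frame {
    char buf[8];
    unsigned canary;
};

/* Copies n bytes to the start of the frame without checking them against
 * the buffer size, then reports whether the canary survived (1) or was
 * overwritten (0). */
static int copy_and_check(const void *src, size_t n)
{
    assert(src != NULL);
    struct demo_frame frame;
    memset(&frame, 0, sizeof frame);
    frame.canary = DEMO_CANARY;

    if (n > sizeof frame)
        n = sizeof frame;          /* keep the demo itself in bounds */
    memcpy(&frame, src, n);        /* more than 8 bytes spills into canary */

    return frame.canary == DEMO_CANARY;
}
```

A copy that fits in buf leaves the guard value intact, while a longer copy changes it, and that change is what the check before a function returns is meant to detect.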

Hands-on

Try it out with a Mac app:

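#import <Cocoa/Cocoa.h>   // NSApplicationMain
#include <stdio.h>        // printf
#include <string.h>       // strncmp, strchr, strlcat, strncpy

static char sLoadingCrashMessage[1024] = "dyld: launch, loading dependent libraries";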

int main(int argc, char * argv[]) {
  const char ** envp = &argv[argc+1];
  const char** p;
  for(p = envp; *p != NULL; p++) {
      const char* keyEqualsValue = *p;
      if ( strncmp(keyEqualsValue, "DYLD_", 5) == 0 ) {
          const char* equals = strchr(keyEqualsValue, '=');
          if ( equals != NULL ) {
              strlcat(sLoadingCrashMessage, "\n", sizeof(sLoadingCrashMessage));
              strlcat(sLoadingCrashMessage, keyEqualsValue, sizeof(sLoadingCrashMessage));
              const char* value = &equals[1];
              const size_t keyLen = equals-keyEqualsValue;
              char key[keyLen+1];
              strncpy(key, keyEqualsValue, keyLen);
              key[keyLen] = '\0';
              printf("%s\n", key);
              printf("%s\n", value);

              printf("\n");
              //                processDyldEnvironmentVariable(key, value, NULL);
          }
      }
  }

  return NSApplicationMain(argc, argv);
}

This code is lifted straight from the checkEnvironmentVariables function. Running it gives the following output (username removed):

DYLD_LIBRARY_PATH
/Users/Library/Developer/Xcode/DerivedData/Build/Products/Debug:/usr/lib/system/introspection

DYLD_INSERT_LIBRARIES
/Applications/Xcode.app/Contents/Developer/usr/lib/libBacktraceRecording.dylib:/Applications/Xcode.app/Contents/Developer/usr/lib/libMainThreadChecker.dylib:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/Library/Debugger/libViewDebuggerSupport.dylib

DYLD_FRAMEWORK_PATH
/Users//Library/Developer/Xcode/DerivedData/Build/Products/Debug

libMainThreadChecker.dylib shows up in DYLD_INSERT_LIBRARIES. Perfect~

In practice this code is no different from calling getenv("DYLD_INSERT_LIBRARIES"); directly.
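The equivalence is easy to see once the lookup is written out by hand. The sketch below is an assumption-laden illustration, not the C library's actual getenv: lookup_env is a hypothetical helper that scans a NULL-terminated "KEY=VALUE" array the same way the loop above walks envp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the value stored for `name` in a
 * NULL-terminated "KEY=VALUE" array, or NULL if the key is absent. */
static const char *lookup_env(const char *const *envp, const char *name)
{
    assert(envp != NULL && name != NULL);
    size_t len = strlen(name);
    for (const char *const *p = envp; *p != NULL; ++p) {
        /* The key must match in full and be followed directly by '='. */
        if (strncmp(*p, name, len) == 0 && (*p)[len] == '=')
            return *p + len + 1;
    }
    return NULL;
}
```

Passing the envp pointer computed in main together with "DYLD_INSERT_LIBRARIES" should return the same string the loop printed, as long as nothing has modified the environment in between.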

But on iOS there is a problem.

On the iOS simulator the output is

DYLD_FRAMEWORK_PATH
/Users/Library/Developer/Xcode/DerivedData/Build/Products/Debug-iphonesimulator

DYLD_FALLBACK_LIBRARY_PATH
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib

DYLD_ROOT_PATH
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot

DYLD_FALLBACK_FRAMEWORK_PATH
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/System/Library/Frameworks

DYLD_LIBRARY_PATH
/Users/Library/Developer/Xcode/DerivedData/Build/Products/Debug-iphonesimulator:/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/system/introspection

DYLD_INSERT_LIBRARIES has disappeared.

On a real iOS device the output is

DYLD_LIBRARY_PATH
/usr/lib/system/introspection

Even less information.

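My guess is that by the time the app's main function is reached, the parameters passed in have already been processed and some entries stripped.
Enabling the DYLD_PRINT_ENV environment variable at this point makes them visible: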

DYLD_INSERT_LIBRARIES=/Developer/usr/lib/libBacktraceRecording.dylib:/Developer/usr/lib/libMainThreadChecker.dylib:/Developer/Library/PrivateFrameworks/DTDDISupport.framework/libViewDebuggerSupport.dylib

So on a real device the dylibs are loaded from the /Developer/usr/lib/ directory. The simulator prints a lot more, so only the interesting part is pasted here:

DYLD_INSERT_LIBRARIES=/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libBacktraceRecording.dylib:/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib/libMainThreadChecker.dylib:/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/Developer/Library/PrivateFrameworks/DTDDISupport.framework/libViewDebuggerSupport.dylib

DYLD_FALLBACK_LIBRARY_PATH
/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/Library/CoreSimulator/Profiles/Runtimes/iOS.simruntime/Contents/Resources/RuntimeRoot/usr/lib

DYLD_FALLBACK_LIBRARY_PATH is the same path mentioned in the post about the real dylib behind a .tbd file.
Our libMainThreadChecker.dylib is loaded from under this path as well.

Preventing dynamic injection

The dynamic injection described above leads to another topic: preventing dynamic injection. That is left for another time.
