Bug 97229 - pointer-compare sanitizer is very slow due to __asan::IsAddressNearGlobal
Summary: pointer-compare sanitizer is very slow due to __asan::IsAddressNearGlobal
Status: WAITING
Alias: None
Product: gcc
Classification: Unclassified
Component: sanitizer
Version: 10.2.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL: https://github.com/google/sanitizers/...
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-28 13:34 UTC by Milian Wolff
Modified: 2020-10-12 10:46 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2020-09-29 00:00:00


Attachments
Reduced test-case (2.27 KB, text/plain)
2020-10-02 09:48 UTC, Martin Liška
Hackish patch candidate (1.22 KB, patch)
2020-10-02 09:51 UTC, Martin Liška

Description Milian Wolff 2020-09-28 13:34:49 UTC
I am trying to use the pointer-compare sanitizer during product development. Performance is usually fine, but one specific code path becomes extremely slow. Roughly 99% of the CPU samples point at this backtrace:

```
<our code>
__sanitizer_ptr_cmp
__asanCheckForInvalidPointerPair
__asanCheckForInvalidPointerPair
__asan::IsInvalidPointerPair
__asan::GetGlobalAddressInformation(unsigned long, unsigned long, ...)
__asan::GetGlobalsForAddress(unsigned long, __asan_global*, ...)
__asan::isAddressNearGlobal
```

I have tried to simulate what our code does in this simplified example, which copies one file to another in a naive way via mmap. The pointer comparison is in the copy() function below.

```
#include <cstdio>
#include <cstdlib>  // atoi
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>

#include <algorithm>

static
__attribute__((noinline))
void copy(const unsigned char *source, size_t source_size,
          unsigned char *target, size_t target_size)
{
    if (target + source_size > target + target_size) {
        fprintf(stderr, "bad offsets: %zu %zu\n", target_size, source_size);
        return;
    }
    std::copy_n(source, source_size, target);
}

unsigned char* mapBuffer(const char *path, size_t size)
{
    auto fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd == -1) {
        perror("failed to open file");
        return nullptr;
    }

    if (posix_fallocate64(fd, 0, size) != 0) {
        perror("failed to resize file");
        close(fd);
        return nullptr;
    }

    auto buffer = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    if (buffer == MAP_FAILED) {  // mmap reports failure with MAP_FAILED, not nullptr
        perror("failed to mmap file");
        return nullptr;
    }

    return reinterpret_cast<unsigned char*>(buffer);
}

int main(int argc, char **argv)
{
    if (argc != 3) {
        fprintf(stderr, "USAGE: ./a.out BUFFER_SIZE COPY_SIZE\n");
        return 1;
    }

    const auto size_i = atoi(argv[1]);
    if (size_i < 0) {
        fprintf(stderr, "bad size: %d\n", size_i);
        return 1;
    }
    const auto size = static_cast<size_t>(size_i);

    const auto copySize_i = atoi(argv[2]);
    if (copySize_i <= 0 || copySize_i > size_i || (size_i % copySize_i) != 0) {  // reject 0 to avoid modulo by zero
        fprintf(stderr, "bad copy size: %d %d\n", copySize_i, size_i);
        return 1;
    }
    const auto copySize = static_cast<size_t>(copySize_i);

    auto source = mapBuffer("/tmp/source.dat", size);
    if (!source) {
        return 1;
    }

    auto target = mapBuffer("/tmp/target.dat", size);
    if (!target) {
        return 1;
    }

    for (size_t i = 0; i < size; i += copySize) {
        copy(source + i, copySize, target + i, copySize);
    }

    munmap(source, size);
    munmap(target, size);
    return 0;
}
```

But that demo does not show the extreme slowdown. It actually behaves quite well: at most a 10% slowdown when pointer-compare is enabled via the ASAN_OPTIONS environment variable.

In the real application, the slowdown is on the order of 100x or more. That application links in many other libraries and runs code in multiple threads, so I suspect the issue is related to the number of globals, and potentially libraries, in the application. Any idea how I could reproduce this to create a proper MWE?
Comment 1 Martin Liška 2020-09-29 15:15:32 UTC
Thank you, Milian, for the report. I'm the author of the pointer-compare run-time implementation and I can help.

Can you please paste the 'perf report' output of your real application? I bet you have quite a lot of globals and we stupidly iterate over all of them here:
https://github.com/gcc-mirror/gcc/blob/master/libsanitizer/asan/asan_globals.cpp#L108-L127
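For illustration, the kind of linear scan suspected here can be sketched roughly as follows. This is not the libsanitizer code: `Global`, `kRedzone`, `IsNearGlobal`, and `CountGlobalsNear` are hypothetical names, and the redzone handling is a simplified assumption. The point is only that every query visits every registered global, so the cost of a single pointer comparison grows linearly with the number of globals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one registered global: start address and size.
struct Global {
    std::uintptr_t beg;
    std::size_t size;
};

// Assumed redzone width, chosen for illustration only.
constexpr std::uintptr_t kRedzone = 32;

// True if addr lies inside the global or within kRedzone bytes of it.
static bool IsNearGlobal(const Global &g, std::uintptr_t addr)
{
    return addr + kRedzone >= g.beg && addr < g.beg + g.size + kRedzone;
}

// Linear scan: every query walks the whole list, so each lookup is O(N).
// *visited receives the number of globals inspected.
static std::size_t CountGlobalsNear(const std::vector<Global> &globals,
                                    std::uintptr_t addr,
                                    std::size_t *visited)
{
    assert(visited != nullptr);
    std::size_t hits = 0;
    *visited = 0;
    for (const Global &g : globals) {
        ++*visited;
        if (IsNearGlobal(g, addr))
            ++hits;
    }
    return hits;
}
```

In this sketch a query that matches a single global still inspects all of them, which would be consistent with a slowdown that only shows up in applications with many globals.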
Comment 2 Milian Wolff 2020-09-30 13:26:13 UTC
As I said, >99% of the samples point to this backtrace:

```
<our code>
__sanitizer_ptr_cmp
__asanCheckForInvalidPointerPair
__asanCheckForInvalidPointerPair
__asan::IsInvalidPointerPair
__asan::GetGlobalAddressInformation(unsigned long, unsigned long, ...)
__asan::GetGlobalsForAddress(unsigned long, __asan_global*, ...)
__asan::isAddressNearGlobal
```

If you want per-line cost attribution, I'd first have to compile the sanitizer runtime with debug symbols.

If you really need the `perf report` output instead (why?), I can redo the measurement.
Comment 3 Martin Liška 2020-10-02 09:48:18 UTC
Created attachment 49298
Reduced test-case

Here is a reduced test case:

$ gcc pointer-cmp.c -fsanitize=address,pointer-compare && ASAN_OPTIONS="detect_invalid_pointer_pairs=2" perf record ./a.out
$ perf report --stdio
...
    95.22%  a.out    libasan.so.6.0.0    [.] __asan::GetGlobalsForAddress
     2.13%  a.out    libasan.so.6.0.0    [.] __sanitizer::internal_memcpy
     0.75%  a.out    libasan.so.6.0.0    [.] __sanitizer::BlockingMutex::Lock
     0.60%  a.out    libasan.so.6.0.0    [.] __sanitizer::BlockingMutex::Unlock
Comment 4 Martin Liška 2020-10-02 09:50:14 UTC
I've got a hackish patch that tries to resolve that.
Basically, linear iteration over the globals is very slow and a better data structure should be used (I used a sorted list), so that each lookup is at most O(log N).
I'm going to report that upstream.
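As a rough sketch of the sorted-list idea (not the attached patch, whose contents are not reproduced here), the globals can be sorted by start address once and then searched with a binary search. `Global`, `kRedzone`, `BuildIndex`, and `FindGlobalsNear` are hypothetical names, the redzone width is an assumed value, and the sketch assumes that globals do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one registered global: start address and size.
struct Global {
    std::uintptr_t beg;
    std::size_t size;
};

// Assumed redzone width, chosen for illustration only.
constexpr std::uintptr_t kRedzone = 32;

// Sort the globals by start address once, up front.
// Assumption: globals never overlap, so their end addresses are sorted too.
static std::vector<Global> BuildIndex(std::vector<Global> globals)
{
    std::sort(globals.begin(), globals.end(),
              [](const Global &a, const Global &b) { return a.beg < b.beg; });
    for (std::size_t i = 1; i < globals.size(); ++i)
        assert(globals[i - 1].beg + globals[i - 1].size <= globals[i].beg);
    return globals;
}

// Return the globals that contain addr or lie within kRedzone bytes of it.
// The binary search skips every global that ends well before addr, so a
// lookup costs O(log N) plus the (small) number of matches.
static std::vector<Global> FindGlobalsNear(const std::vector<Global> &index,
                                           std::uintptr_t addr)
{
    auto it = std::partition_point(
        index.begin(), index.end(), [addr](const Global &g) {
            return g.beg + g.size + kRedzone <= addr;
        });
    std::vector<Global> found;
    for (; it != index.end() && it->beg <= addr + kRedzone; ++it)
        found.push_back(*it);
    return found;
}
```

The index has to be rebuilt or updated whenever globals are registered or unregistered (for example when a shared library is loaded), which this sketch leaves out.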
Comment 5 Martin Liška 2020-10-02 09:51:38 UTC
Created attachment 49299
Hackish patch candidate
Comment 6 Martin Liška 2020-10-02 09:53:15 UTC
Waiting for an upstream fix.