MIME-Version: 1.0
From:   Chris Chilvers <chilversc@gmail.com>
Date:   Fri, 3 Feb 2023 11:17:28 +0000
Message-ID: <CAAmbk-cQNY3Sd9iQ7vghqw_=sk9JsG-_Mf-OM_iRuw+h8j2E_w@mail.gmail.com>
Subject: decant_cull_table intermittently aborting cachefilesd
To:     linux-cachefs@redhat.com, linux-nfs@vger.kernel.org,
        benmaynard@google.com, brennandoyle@google.com, tom@gunpowder.tech,
        daire@dneg.com, Chris Chilvers <chris.chilvers@appsbroker.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk

I have been having an issue where cachefilesd will randomly crash causing the
cache to be withdrawn. The crash is intermittent and can sometimes happen
within minutes, other times it can take hours, or never.

Fortunately it has produced a crash dump so I've been able to analyse what
happened.

From the stack trace (and debug logging) the last operation it was running is
the decant_cull_table. The code fails in the check block at the end of the
function when it calls abort().

    (gdb) bt
    #0  __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140614334650176) at ./nptl/pthread_kill.c:44
    #1  __pthread_kill_internal (signo=6, threadid=140614334650176) at
./nptl/pthread_kill.c:78
    #2  __GI___pthread_kill (threadid=140614334650176,
signo=signo@entry=6) at ./nptl/pthread_kill.c:89
    #3  0x00007fe353442476 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
    #4  0x00007fe3534287f3 in __GI_abort () at ./stdlib/abort.c:79
    #5  0x0000556d6c9f0965 in decant_cull_table () at cachefilesd.c:1571
    #6  cachefilesd () at cachefilesd.c:780
    #7  0x0000556d6c9f140b in main (argc=<optimized out>,
argv=<optimized out>) at cachefilesd.c:581

For reference the code at frame 5 from the decant_cull_table function is:

    check:
        for (loop = 0; loop < nr_in_ready_table; loop++)
            if (((long)cullready[loop] & 0xf0000000) == 0x60000000)
                abort();

Checking the cull table, the first object in the cull table appears to be
valid.

    (gdb) p nr_in_ready_table
    $1 = 242

    (gdb) p cullready[0]
    $2 = (struct object *) 0x556d6d7382a0

    (gdb) p -pretty -- *cullready[0]
    $3 = {
        parent = 0x556d6d7352b0,
        children = 0x0,
        next = 0x0,
        prev = 0x0,
        dir = 0x0,
        ino = 13631753,
        usage = 1,
        empty = false,
        new = false,
        cullable = true,
        type = OBJTYPE_DATA,
        atime = 1675349423,
        name = "E"
    }

The inode number from the struct matches a file in the fscache.

    $ sudo find /var/cache/fscache -inum 13631753
    /var/cache/fscache/cache/Infs,3.0,2,,300000a,e5e9b1269df2b0d,,,d0,100000,100000,249f0,249f0,249f0,249f0,1/@00/E210w114Hg92Az0HAMYCClFMVmkMY050002w1qO200

However, the memory address of the struct matches (fails) the check.

    (gdb) p (((long)cullready[0] & 0xf0000000) == 0x60000000)
    $4 = 1

      0000 556d 6d73 82a0
    & 0000 0000 f000 0000
    = 0000 0000 6000 0000

    $ file /sbin/cachefilesd
    /sbin/cachefilesd: ELF 64-bit LSB pie executable, x86-64

Looking at the code, I suspect that this magic 0x60000000 number is supposed
to be some kind of sentinel value that's used as a bug check for errors such
as use after free? This would make sense when the application was 32 bit, as
address pattern 0110 in the highest nibble either cannot occur, or lies within
the kernel address space. However, when compiled as 64 bit this assumption is
no longer true and the bit pattern can appear in perfectly valid addresses.

This would also explain the random nature of the crashes, as the cachefilesd
is at the whims of the OS and calloc function.

--
Chris