2022-02-02 02:46:22

by Luke Dashjr

Subject: zram corruption

I use ext4 on zram for my temp directories, and on rare occasions things get
corrupted. Using ext4 on a normal disk works fine in the same scenarios.
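When comparing the zram and normal-disk cases, it can help to confirm which block device actually backs a given directory. The following is a minimal sketch, not a definitive tool: the helper names `find_mount` and `is_zram_device` are hypothetical, `/var/tmp` is only an assumed example path, and the parser ignores the octal escapes that `/proc/mounts` uses for spaces in mount points.

```python
import os


def find_mount(mounts_text: str, path: str):
    """Return (device, mount_point, fs_type) for the mount that contains path.

    mounts_text is expected in /proc/mounts format. The longest matching
    mount point wins; on a tie, the later line wins because it shadows
    the earlier mount.
    """
    best = None
    wanted = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type = fields[:3]
        prefix = mount_point.rstrip("/") + "/"
        if wanted.startswith(prefix):
            if best is None or len(mount_point) >= len(best[1]):
                best = (device, mount_point, fs_type)
    return best


def is_zram_device(device: str) -> bool:
    """True if the device node name looks like a zram device (zram0, zram1, ...)."""
    return os.path.basename(device).startswith("zram")


if __name__ == "__main__":
    target = "/var/tmp"  # assumption: substitute the directory under test
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            found = find_mount(f.read(), os.path.realpath(target))
        if found is not None:
            device, mount_point, fs_type = found
            print(f"{target}: {fs_type} on {device} (mounted at {mount_point}), "
                  f"zram={is_zram_device(device)}")
```

Running the same check against both directories before each run makes it harder to mix up which results came from which backing device.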

I haven't managed to figure out what exactly is going on, but I do have a
157 GB strace log of it happening.

One scenario that reproduces it fairly reliably is building 3 copies of
binutils in parallel. About half the time,
/var/tmp/portage/cross-i686-w64-mingw32/binutils-2.37_p1-r2/work/build/binutils/.deps/stabs.Po
ends up truncated, and one of the builds fails.

The only other scenario I've seen it happen in (much less reproducibly) is
running Bitcoin functional tests. In this case, however, the ext4 structure
itself got corrupted, and Linux was unable to recover (the directories
affected became unusable until reboot).

I suspect it's probably a threading-related issue, but it's plausible it could
be page-size related (I *think* I'm using 64k pages), though in the latter
case I would expect it to be much more common.
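The page size in use can be confirmed from userspace rather than guessed. A minimal sketch, assuming a Python interpreter is available on the affected machine; the function names `page_size_bytes` and `describe_page_size` are hypothetical, while `os.sysconf("SC_PAGE_SIZE")` is the standard call for this on Linux.

```python
import os


def page_size_bytes() -> int:
    """Return the memory page size the running kernel reports to userspace."""
    return os.sysconf("SC_PAGE_SIZE")


def describe_page_size(size: int) -> str:
    """Format a page size in bytes as, for example, '64k pages (65536 bytes)'."""
    return f"{size // 1024}k pages ({size} bytes)"


if __name__ == "__main__":
    print(describe_page_size(page_size_bytes()))
```

A result of 65536 bytes would confirm the 64k configuration; 4096 bytes would rule out the page-size theory as stated.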

https://bugzilla.kernel.org/show_bug.cgi?id=215557