2022-02-28 20:27:45

by Linus Torvalds

Subject: Re: Intel QAT on A2SDi-8C-HLN4F causes massive data corruption with dm-crypt + xfs

On Mon, Feb 28, 2022 at 12:18 AM Kyle Sanderson <[email protected]> wrote:
>
> Makes sense - this kernel driver has been destroying users for many
> years. I'm disappointed that this critical bricking failure isn't
> searchable for others.

It does sound like we should just disable that driver entirely until
it is fixed.

Or at least the configuration that can cause problems, if there is
some particular sub-case. Although from a cursory glance and the
noises made in this thread, it looks like it's all of the 'qat_aeads'
cases (since that uses qat_alg_aead_enc() which can return -EAGAIN),
which effectively means all of the QAT stuff.
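
To illustrate why an -EAGAIN return matters, here is a minimal userspace sketch. It is not the QAT driver or dm-crypt code: `toy_ring`, `toy_submit()`, `toy_complete_one()` and `toy_submit_checked()` are hypothetical names, and the fixed-size ring is an assumption made for illustration. It only shows the general pattern that a submit path which can report "queue full" needs a caller that checks the return value and resubmits, because a request that is never resubmitted is never processed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a hardware request ring with a fixed slot count. */
struct toy_ring {
	size_t capacity;   /* total slots in the ring */
	size_t in_flight;  /* slots currently occupied */
};

/*
 * Hypothetical submit: returns 0 when the request was queued,
 * or -EAGAIN when the ring is full and nothing was queued.
 */
static int toy_submit(struct toy_ring *ring)
{
	assert(ring->capacity > 0);
	if (ring->in_flight >= ring->capacity)
		return -EAGAIN;
	ring->in_flight++;
	return 0;
}

/* Hypothetical completion handler: frees one slot if any is occupied. */
static void toy_complete_one(struct toy_ring *ring)
{
	if (ring->in_flight > 0)
		ring->in_flight--;
}

/*
 * Caller that checks the return value. On -EAGAIN it waits for one
 * completion (simulated here) and resubmits, at most max_retries times.
 * Returns 0 once queued, or the last error if retries are exhausted.
 */
static int toy_submit_checked(struct toy_ring *ring, int max_retries)
{
	int ret = toy_submit(ring);

	while (ret == -EAGAIN && max_retries-- > 0) {
		toy_complete_one(ring);
		ret = toy_submit(ring);
	}
	return ret;
}
```

In this toy model, a caller that ignores the -EAGAIN from `toy_submit()` would continue as if the request had been queued, while `toy_submit_checked()` either gets it queued or reports the failure.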

So presumably CRYPTO_DEV_QAT should just be marked as

        depends on BROKEN || COMPILE_TEST

or similar?

Linus


2022-02-28 21:19:50

by Milan Broz

Subject: Re: [dm-devel] Intel QAT on A2SDi-8C-HLN4F causes massive data corruption with dm-crypt + xfs

On 28/02/2022 20:25, Linus Torvalds wrote:
> On Mon, Feb 28, 2022 at 12:18 AM Kyle Sanderson <[email protected]> wrote:
>>
>> Makes sense - this kernel driver has been destroying users for many
>> years. I'm disappointed that this critical bricking failure isn't
>> searchable for others.
>
> It does sound like we should just disable that driver entirely until
> it is fixed.
>
> Or at least the configuration that can cause problems, if there is
> some particular sub-case. Although from a cursory glance and the
> noises made in this thread, it looks like it's all of the 'qat_aeads'
> cases (since that uses qat_alg_aead_enc() which can return -EAGAIN),
> which effectively means all of the QAT stuff.
>
> So presumably CRYPTO_DEV_QAT should just be marked as
>
>         depends on BROKEN || COMPILE_TEST
>
> or similar?

Yes, please! Or at least disable it in stable for now.

Over the last few years, we have had several reports of problems with this
driver for cryptsetup/LUKS (dm-crypt with the qat driver; here it is skcipher,
not aead, though).

The driver's misunderstanding of the crypto API queue has been known
to the authors for some time, at least since 2020; see
https://lore.kernel.org/dm-devel/[email protected]/
Apparently it is still not fixed.

Thank you,
Milan