2023-07-18 13:03:48

by Ard Biesheuvel

Subject: [RFC PATCH 05/21] ubifs: Pass worst-case buffer size to compression routines

Currently, the ubifs code allocates a worst case buffer size to
recompress a data node, but does not pass the size of that buffer to the
compression code. This means that the compression code will never use
the additional space, and might fail spuriously due to lack of space.

So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
buffer. Doing so is guaranteed not to overflow, given that the preceding
kmalloc_array() call would have failed otherwise.

Signed-off-by: Ard Biesheuvel <[email protected]>
---
fs/ubifs/journal.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index dc52ac0f4a345f30..4e5961878f336033 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -1493,6 +1493,8 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
 	if (!buf)
 		return -ENOMEM;

+	out_len *= WORST_COMPR_FACTOR;
+
 	dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
 	data_size = dn_size - UBIFS_DATA_NODE_SZ;
 	compr_type = le16_to_cpu(dn->compr_type);
--
2.39.2
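
The overflow argument in the commit message can be illustrated with a small userspace sketch. The helpers `alloc_array_checked()` and `alloc_scaled()` are hypothetical names, not kernel functions, and the sketch uses `size_t` throughout for simplicity, whereas `out_len` in the patch is an `int`. It only shows the general idea: once an overflow-checked array allocation has succeeded, repeating the same multiplication cannot wrap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for an overflow-checked array allocator.
 * Returns NULL if n * size would not fit in size_t.
 */
static void *alloc_array_checked(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* product would overflow */
	return malloc(n * size);
}

/*
 * Allocate a buffer of len * factor bytes and, on success, report the
 * full capacity through *cap. The multiplication cannot wrap here because
 * the checked allocation above already rejected overflowing products.
 * On failure *cap is left untouched.
 */
static void *alloc_scaled(size_t len, size_t factor, size_t *cap)
{
	void *buf;

	assert(cap != NULL);
	buf = alloc_array_checked(len, factor);
	if (!buf)
		return NULL;
	*cap = len * factor;
	return buf;
}
```

A caller would pass the reported capacity, rather than the unscaled length, to whatever routine writes into the buffer.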



2023-07-18 22:51:12

by Eric Biggers

Subject: Re: [RFC PATCH 05/21] ubifs: Pass worst-case buffer size to compression routines

On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
> Currently, the ubifs code allocates a worst case buffer size to
> recompress a data node, but does not pass the size of that buffer to the
> compression code. This means that the compression code will never use
> the additional space, and might fail spuriously due to lack of space.
>
> So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
> buffer. Doing so is guaranteed not to overflow, given that the preceding
> kmalloc_array() call would have failed otherwise.
>
> Signed-off-by: Ard Biesheuvel <[email protected]>
> ---
> fs/ubifs/journal.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
> index dc52ac0f4a345f30..4e5961878f336033 100644
> --- a/fs/ubifs/journal.c
> +++ b/fs/ubifs/journal.c
> @@ -1493,6 +1493,8 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
>  	if (!buf)
>  		return -ENOMEM;
>
> +	out_len *= WORST_COMPR_FACTOR;
> +
>  	dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
>  	data_size = dn_size - UBIFS_DATA_NODE_SZ;
>  	compr_type = le16_to_cpu(dn->compr_type);

This looks like another case where data that would be expanded by compression
should just be stored uncompressed instead.

In fact, it seems that UBIFS does that already. ubifs_compress() has this:

	/*
	 * If the data compressed only slightly, it is better to leave it
	 * uncompressed to improve read speed.
	 */
	if (in_len - *out_len < UBIFS_MIN_COMPRESS_DIFF)
		goto no_compr;

So it's unclear why the WORST_COMPR_FACTOR thing is needed at all.

- Eric
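
The check quoted above can be reduced to a standalone predicate. This is a minimal sketch, not the UBIFS code: `keep_compressed()` is a hypothetical helper, and the value 64 for `MIN_COMPRESS_DIFF` is an assumption chosen for illustration. It shows why data that would expand under compression never needs extra output space on this path: a negative or small saving simply selects the uncompressed form.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed threshold for illustration; the kernel defines its own value. */
#define MIN_COMPRESS_DIFF 64

/*
 * Decide whether a compressed result is worth keeping.
 * in_len:  length of the original data
 * out_len: length the compressor produced
 * Signed arithmetic is deliberate: if the compressor expanded the data,
 * in_len - out_len is negative and the data is kept uncompressed.
 */
static bool keep_compressed(int in_len, int out_len)
{
	assert(in_len >= 0 && out_len >= 0);
	return in_len - out_len >= MIN_COMPRESS_DIFF;
}
```

With these assumptions, a 4096-byte block that compresses to 4090 bytes, or expands to 5000 bytes, is stored uncompressed in both cases.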

2023-07-19 14:26:49

by Zhihao Cheng

Subject: Re: [RFC PATCH 05/21] ubifs: Pass worst-case buffer size to compression routines

On 2023/7/19 16:33, Ard Biesheuvel wrote:
> On Wed, 19 Jul 2023 at 00:38, Eric Biggers <[email protected]> wrote:
>>
>> On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
>>> Currently, the ubifs code allocates a worst case buffer size to
>>> recompress a data node, but does not pass the size of that buffer to the
>>> compression code. This means that the compression code will never use

I think you mean that 'out_len', which describes the length of 'buf', is
passed into ubifs_decompress(), where it affects the result of the
decompressor (e.g. lz4 uses the length to calculate the end position of
the buffer). So we should pass the real length of 'buf'.

Reviewed-by: Zhihao Cheng <[email protected]>
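
To illustrate the point about the decompressor deriving the end of the buffer from the length it is given, here is a toy run-length decoder. It is not LZ4 and not the UBIFS API; `rle_decompress()` and its pair-based input format are assumptions made up for this sketch. The decoder refuses to write past the capacity the caller claims, so an understated capacity can make it fail even when the real buffer is larger.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy run-length decoder, not LZ4: input is a sequence of (count, byte)
 * pairs. dst_cap is the capacity the caller claims dst has; the decoder
 * trusts it to find the end of the output buffer.
 * Returns the number of bytes written, or -1 if the input is malformed
 * or the output would not fit in dst_cap bytes.
 */
static long rle_decompress(const unsigned char *src, size_t src_len,
			   unsigned char *dst, size_t dst_cap)
{
	size_t in = 0, out = 0;

	assert(src != NULL && dst != NULL);
	if (src_len % 2 != 0)
		return -1;	/* truncated pair */

	while (in < src_len) {
		size_t count = src[in];
		unsigned char value = src[in + 1];

		if (count > dst_cap - out)
			return -1;	/* would run past the claimed end */
		for (size_t i = 0; i < count; i++)
			dst[out++] = value;
		in += 2;
	}
	return (long)out;
}
```

Decoding five bytes into an eight-byte array succeeds when the claimed capacity is 5 or more and fails when it is 4, although the array itself has not changed.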

>>> the additional space, and might fail spuriously due to lack of space.
>>>
>>> So let's multiply out_len by WORST_COMPR_FACTOR after allocating the
>>> buffer. Doing so is guaranteed not to overflow, given that the preceding
>>> kmalloc_array() call would have failed otherwise.
>>>
>>> Signed-off-by: Ard Biesheuvel <[email protected]>
>>> ---
>>> fs/ubifs/journal.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
>>> index dc52ac0f4a345f30..4e5961878f336033 100644
>>> --- a/fs/ubifs/journal.c
>>> +++ b/fs/ubifs/journal.c
>>> @@ -1493,6 +1493,8 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
>>>  	if (!buf)
>>>  		return -ENOMEM;
>>>
>>> +	out_len *= WORST_COMPR_FACTOR;
>>> +
>>>  	dlen = le32_to_cpu(dn->ch.len) - UBIFS_DATA_NODE_SZ;
>>>  	data_size = dn_size - UBIFS_DATA_NODE_SZ;
>>>  	compr_type = le16_to_cpu(dn->compr_type);
>>
>> This looks like another case where data that would be expanded by compression
>> should just be stored uncompressed instead.
>>
>> In fact, it seems that UBIFS does that already. ubifs_compress() has this:
>>
>> 	/*
>> 	 * If the data compressed only slightly, it is better to leave it
>> 	 * uncompressed to improve read speed.
>> 	 */
>> 	if (in_len - *out_len < UBIFS_MIN_COMPRESS_DIFF)
>> 		goto no_compr;
>>
>> So it's unclear why the WORST_COMPR_FACTOR thing is needed at all.
>>
>
> It is not. The buffer is used for decompression in the truncation
> path, so none of this logic even matters. Even if the subsequent
> recompression of the truncated data node could result in expansion
> beyond the uncompressed size of the original data (which seems
> impossible to me), increasing the size of this buffer would not help
> as it is the input buffer for the compression not the output buffer.
>


2023-07-20 01:29:57

by Zhihao Cheng

Subject: Re: [RFC PATCH 05/21] ubifs: Pass worst-case buffer size to compression routines

On 2023/7/19 22:38, Ard Biesheuvel wrote:
> On Wed, 19 Jul 2023 at 16:23, Zhihao Cheng <[email protected]> wrote:
>>
>> On 2023/7/19 16:33, Ard Biesheuvel wrote:
>>> On Wed, 19 Jul 2023 at 00:38, Eric Biggers <[email protected]> wrote:
>>>>
>>>> On Tue, Jul 18, 2023 at 02:58:31PM +0200, Ard Biesheuvel wrote:
>>>>> Currently, the ubifs code allocates a worst case buffer size to
>>>>> recompress a data node, but does not pass the size of that buffer to the
>>>>> compression code. This means that the compression code will never use
>>
>> I think you mean that 'out_len', which describes the length of 'buf', is
>> passed into ubifs_decompress(), where it affects the result of the
>> decompressor (e.g. lz4 uses the length to calculate the end position of
>> the buffer). So we should pass the real length of 'buf'.
>>
>
> Yes, that is what I meant.
>
> But Eric makes a good point, and looking a bit more closely, there is
> really no need for the multiplication here: we know the size of the
> decompressed data, so we don't need the additional space.
>

Right, we get 'out_len' from 'dn->size', which is the length of the
uncompressed data. ubifs_compress() makes sure the compressed length is
smaller than the original length.

> I intend to drop this patch, and replace it with the following:
>
> ----------------8<--------------
>
> Currently, when truncating a data node, a decompression buffer is
> allocated that is twice the size of the data node's uncompressed size.
> However, the fact that this space is available is not communicated to
> the compression routines, as out_len itself is not updated.
>
> The additional space is not needed even in the theoretical worst case
> where compression might lead to inadvertent expansion: first of all,
> increasing the size of the input buffer does not help mitigate that
> issue. And given the truncation of the data node and the fact that the
> original data compressed well enough to pass the UBIFS_MIN_COMPRESS_DIFF
> test, there is no way on this particular code path that compression
> could result in expansion beyond the original decompressed size, and so
> no mitigation is necessary to begin with.
>
> So let's just drop WORST_COMPR_FACTOR here.
>
> Signed-off-by: Ard Biesheuvel <[email protected]>
>
> diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
> index dc52ac0f4a345f30..0b55cbfe0c30505e 100644
> --- a/fs/ubifs/journal.c
> +++ b/fs/ubifs/journal.c
> @@ -1489,7 +1489,7 @@ static int truncate_data_node(const struct ubifs_info *c, const struct inode *in
>  	int err, dlen, compr_type, out_len, data_size;
>
>  	out_len = le32_to_cpu(dn->size);
> -	buf = kmalloc_array(out_len, WORST_COMPR_FACTOR, GFP_NOFS);
> +	buf = kmalloc(out_len, GFP_NOFS);
>  	if (!buf)
>  		return -ENOMEM;
>

This version looks better.

Reviewed-by: Zhihao Cheng <[email protected]>
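
The reasoning behind the replacement patch can be sketched in userspace as follows. This is an illustration under stated assumptions, not the UBIFS implementation: `toy_decompress()` is a made-up run-length decoder standing in for the real decompressor, and `truncate_node()` is a hypothetical helper. The buffer only ever holds decompressed data, whose size is recorded up front, so allocating exactly that size is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Toy run-length decoder standing in for a real decompressor:
 * input is (count, byte) pairs. Returns 0 and sets *out_len on success,
 * -1 if the input is malformed or does not fit in dst_cap bytes.
 */
static int toy_decompress(const unsigned char *src, size_t src_len,
			  unsigned char *dst, size_t dst_cap, size_t *out_len)
{
	size_t out = 0;

	if (src_len % 2 != 0)
		return -1;
	for (size_t in = 0; in < src_len; in += 2) {
		size_t count = src[in];

		if (count > dst_cap - out)
			return -1;
		for (size_t i = 0; i < count; i++)
			dst[out++] = src[in + 1];
	}
	*out_len = out;
	return 0;
}

/*
 * Decompress a node whose uncompressed size is recorded as orig_size,
 * then keep only the first new_size bytes. The buffer is exactly
 * orig_size bytes because it only ever holds decompressed data.
 * Returns a malloc'ed buffer whose first new_size bytes are valid,
 * or NULL on error. The caller frees the buffer.
 */
static unsigned char *truncate_node(const unsigned char *comp, size_t comp_len,
				    size_t orig_size, size_t new_size)
{
	unsigned char *buf;
	size_t got;

	assert(comp != NULL);
	if (orig_size == 0 || new_size > orig_size)
		return NULL;
	buf = malloc(orig_size);
	if (!buf)
		return NULL;
	if (toy_decompress(comp, comp_len, buf, orig_size, &got) != 0 ||
	    got != orig_size) {
		free(buf);
		return NULL;
	}
	return buf;	/* caller uses buf[0 .. new_size) */
}
```

Any later recompression would read from this buffer as its input, so enlarging it would not give the compressor more output space.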