In __es_remove_extent() we're storing seemingly arbitrary value
0x7FDEADBEEF into block variable. I assume that the reason is just to
initialize the variable before the use because the actual value does not
matter at this point.
Just remove the arbitrary value and initialized block variable to zero
which is much less suspicious.
Signed-off-by: Lukas Czerner <[email protected]>
---
fs/ext4/extents_status.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 33682aa..3c47b4e 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -782,7 +782,7 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
struct extent_status *es;
struct extent_status orig_es;
ext4_lblk_t len1, len2;
- ext4_fsblk_t block;
+ ext4_fsblk_t block = 0;
int err;
retry:
@@ -810,7 +810,6 @@ retry:
newes.es_lblk = end + 1;
newes.es_len = len2;
- block = 0x7FDEADBEEF;
if (ext4_es_is_written(&orig_es) ||
ext4_es_is_unwritten(&orig_es))
block = ext4_es_pblock(&orig_es) +
--
1.8.3.1
In ext4_es_can_be_merged() when checking whether we can merge two
extents we should use EXT_MAX_BLOCKS instead of defining it manually.
Also if it is really the case we should notify userspace because clearly
there is a bug in extent status tree implementation since this should
never happen.
Signed-off-by: Lukas Czerner <[email protected]>
---
fs/ext4/extents_status.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 3c47b4e..b38d71a 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -344,8 +344,14 @@ static int ext4_es_can_be_merged(struct extent_status *es1,
if (ext4_es_status(es1) != ext4_es_status(es2))
return 0;
- if (((__u64) es1->es_len) + es2->es_len > 0xFFFFFFFFULL)
+ if (((__u64) es1->es_len) + es2->es_len > EXT_MAX_BLOCKS) {
+ pr_warn("ES assertion failed when merging extents. "
+ "The sum of lengths of es1 (%d) and es2 (%d) "
+ "is bigger than allowed file size (%d)\n",
+ es1->es_len, es2->es_len, EXT_MAX_BLOCKS);
+ WARN_ON(1);
return 0;
+ }
if (((__u64) es1->es_lblk) + es1->es_len != es2->es_lblk)
return 0;
--
1.8.3.1
On Tue, Apr 15, 2014 at 08:08:21PM +0200, Lukas Czerner wrote:
> In __es_remove_extent() we're storing seemingly arbitrary value
> 0x7FDEADBEEF into block variable. I assume that the reason is just to
> initialize the variable before the use because the actual value does not
> matter at this point.
>
> Just remove the arbitrary value and initialized block variable to zero
> which is much less suspicious.
The whole point was for the value to be suspicious. That way, if
there is a bug, and we try to use that value, it is a large enough
value that for most storage devices, we will get an I/O error (because
the disk isn't that big :-) referencing that block value, and we'll
have a bit of a hint where that suspicious value might have come from.
- Ted
On Thu, 17 Apr 2014, Theodore Ts'o wrote:
> Date: Thu, 17 Apr 2014 12:19:44 -0400
> From: Theodore Ts'o <[email protected]>
> To: Lukas Czerner <[email protected]>
> Cc: [email protected]
> Subject: Re: [PATCH 1/2] ext4: Remove arbitrary block value in
> __es_remove_extent()
>
> On Tue, Apr 15, 2014 at 08:08:21PM +0200, Lukas Czerner wrote:
> > In __es_remove_extent() we're storing seemingly arbitrary value
> > 0x7FDEADBEEF into block variable. I assume that the reason is just to
> > initialize the variable before the use because the actual value does not
> > matter at this point.
> >
> > Just remove the arbitrary value and initialized block variable to zero
> > which is much less suspicious.
>
> The whole point was for the value to be suspicious. That way, if
> there is a bug, and we try to use that value, it is a large enough
> value that for most storage devices, we will get an I/O error (because
> the disk isn't that big :-) referencing that block value, and we'll
> have a bit of a hint where that suspicious value might have come from.
Aside from the fact that this is totally undocumented and there is
not even comment on what is that all about in couple of years we
might actually get file systems big enough that this would not be an
I/O error anymore (that might be a bit of a stretch).
But mainly this value is only going to be used if it is delayed
extent or a hole which implies that it has not been mapped and
pblock does not contain anything valid. And if we really screwed it
up and tried to use pblock of extent which is a hole or delayed
extent, then it would not help us anyway since the only place that
we actually set this is when splitting extent on removal.
Now I can see that in ext4_da_map_blocks() we're actually using ~0
value for the pblock which is a bit better I think as long as we're
using this reliably. So I'll resend the patch which will make sure
that we're using ~0 reliably when storin delayed, or hole extents in
the extent status tree. Does that make sense ?
-Lukas
>
> - Ted
On Fri, Apr 18, 2014 at 11:22:12AM +0200, Lukáš Czerner wrote:
> Aside from the fact that this is totally undocumented and there is
> not even comment on what is that all about in couple of years we
> might actually get file systems big enough that this would not be an
> I/O error anymore (that might be a bit of a stretch).
Well, do you have some suggestions about what might be a good place to
document something like this? My assumption is that it's something
that would be used by developers after a bug had been reported, so
presumably it would be someplace in the source code.
And I used "0x7FDEADBEEF" deliberately so that it would be a extremely
unlikely we would have file systems that big (we're approximately 512
PB, and honestly, if we had fixed all of the scaling limits such that
it was sane to think someone would want to be using ext4 with a file
system that big --- well, that would be a very nice problem to have
:-)
> But mainly this value is only going to be used if it is delayed
> extent or a hole which implies that it has not been mapped and
> pblock does not contain anything valid. And if we really screwed it
> up and tried to use pblock of extent which is a hole or delayed
> extent, then it would not help us anyway since the only place that
> we actually set this is when splitting extent on removal.
>
> Now I can see that in ext4_da_map_blocks() we're actually using ~0
> value for the pblock which is a bit better I think as long as we're
> using this reliably. So I'll resend the patch which will make sure
> that we're using ~0 reliably when storin delayed, or hole extents in
> the extent status tree. Does that make sense ?
So the technique that we're using in mballoc.c is that we use
different illegal flag values depending on where the bad value was
introduced:
% grep "debug value" fs/ext4/mballoc.c
ex.fe_logical = 0xDEADFA11; /* debug value */
ex.fe_logical = 0xDEADC0DE; /* debug value */
ex.fe_logical = 0xDEADF00D; /* debug value */
I think it might be useful to do the same for the physical blocks in
the extent_status tree.
- Ted
On Fri, 18 Apr 2014, Theodore Ts'o wrote:
> Date: Fri, 18 Apr 2014 07:18:26 -0400
> From: Theodore Ts'o <[email protected]>
> To: Lukáš Czerner <[email protected]>
> Cc: [email protected]
> Subject: Re: [PATCH 1/2] ext4: Remove arbitrary block value in
> __es_remove_extent()
>
> On Fri, Apr 18, 2014 at 11:22:12AM +0200, Lukáš Czerner wrote:
> > Aside from the fact that this is totally undocumented and there is
> > not even comment on what is that all about in couple of years we
> > might actually get file systems big enough that this would not be an
> > I/O error anymore (that might be a bit of a stretch).
>
> Well, do you have some suggestions about what might be a good place to
> document something like this? My assumption is that it's something
> that would be used by developers after a bug had been reported, so
> presumably it would be someplace in the source code.
>
> And I used "0x7FDEADBEEF" deliberately so that it would be a extremely
> unlikely we would have file systems that big (we're approximately 512
> PB, and honestly, if we had fixed all of the scaling limits such that
> it was sane to think someone would want to be using ext4 with a file
> system that big --- well, that would be a very nice problem to have
> :-)
Indeed :)
>
> > But mainly this value is only going to be used if it is delayed
> > extent or a hole which implies that it has not been mapped and
> > pblock does not contain anything valid. And if we really screwed it
> > up and tried to use pblock of extent which is a hole or delayed
> > extent, then it would not help us anyway since the only place that
> > we actually set this is when splitting extent on removal.
> >
> > Now I can see that in ext4_da_map_blocks() we're actually using ~0
> > value for the pblock which is a bit better I think as long as we're
> > using this reliably. So I'll resend the patch which will make sure
> > that we're using ~0 reliably when storin delayed, or hole extents in
> > the extent status tree. Does that make sense ?
>
> So the technique that we're using in mballoc.c is that we use
> different illegal flag values depending on where the bad value was
> introduced:
>
> % grep "debug value" fs/ext4/mballoc.c
> ex.fe_logical = 0xDEADFA11; /* debug value */
> ex.fe_logical = 0xDEADC0DE; /* debug value */
> ex.fe_logical = 0xDEADF00D; /* debug value */
>
> I think it might be useful to do the same for the physical blocks in
> the extent_status tree.
>
> - Ted
I agree. I'll send a patch.
Thanks!
-Lukas
On Tue, Apr 15, 2014 at 08:08:22PM +0200, Lukas Czerner wrote:
> In ext4_es_can_be_merged() when checking whether we can merge two
> extents we should use EXT_MAX_BLOCKS instead of defining it manually.
> Also if it is really the case we should notify userspace because clearly
> there is a bug in extent status tree implementation since this should
> never happen.
>
> Signed-off-by: Lukas Czerner <[email protected]>
Thanks for fixing this. It looks good to me.
Reviewed-by: Zheng Liu <[email protected]>
- Zheng
> ---
> fs/ext4/extents_status.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 3c47b4e..b38d71a 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -344,8 +344,14 @@ static int ext4_es_can_be_merged(struct extent_status *es1,
> if (ext4_es_status(es1) != ext4_es_status(es2))
> return 0;
>
> - if (((__u64) es1->es_len) + es2->es_len > 0xFFFFFFFFULL)
> + if (((__u64) es1->es_len) + es2->es_len > EXT_MAX_BLOCKS) {
> + pr_warn("ES assertion failed when merging extents. "
> + "The sum of lengths of es1 (%d) and es2 (%d) "
> + "is bigger than allowed file size (%d)\n",
> + es1->es_len, es2->es_len, EXT_MAX_BLOCKS);
> + WARN_ON(1);
> return 0;
> + }
>
> if (((__u64) es1->es_lblk) + es1->es_len != es2->es_lblk)
> return 0;
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Apr 15, 2014 at 08:08:22PM +0200, Lukas Czerner wrote:
> In ext4_es_can_be_merged() when checking whether we can merge two
> extents we should use EXT_MAX_BLOCKS instead of defining it manually.
> Also if it is really the case we should notify userspace because clearly
> there is a bug in extent status tree implementation since this should
> never happen.
>
> Signed-off-by: Lukas Czerner <[email protected]>
> Reviewed-by: Zheng Liu <[email protected]>
Applied, thanks.
- Ted