2017-03-28 09:51:30

by Arnd Bergmann

[permalink] [raw]
Subject: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

Commit 60928a91b0b3 ("md: raid1: use bio helper in process_checks()")
is probably correct, but I get a new compile-time warning after
it, and have trouble understanding what it fixes:

drivers/md/raid1.c: In function 'sync_request_write':
drivers/md/raid1.c:2172:9: error: 'page_len$' may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (memcmp(page_address(ppages[j]),
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
page_address(spages[j]),
~~~~~~~~~~~~~~~~~~~~~~~~
page_len[j]))
~~~~~~~~~~~~
drivers/md/raid1.c:2160:7: note: 'page_len$' was declared here
int page_len[RESYNC_PAGES];
^~~~~~~~

This reverts it to resolve the warning.

Fixes: 60928a91b0b3 ("md: raid1: use bio helper in process_checks()")
Signed-off-by: Arnd Bergmann <[email protected]>
---
drivers/md/raid1.c | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index b7d9651286d4..4d176c8abc33 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2113,7 +2113,6 @@ static void process_checks(struct r1bio *r1_bio)
int j;
int size;
int error;
- struct bio_vec *bi;
struct bio *b = r1_bio->bios[i];
struct resync_pages *rp = get_resync_pages(b);
if (b->bi_end_io != end_sync_read)
@@ -2132,7 +2131,9 @@ static void process_checks(struct r1bio *r1_bio)
b->bi_private = rp;

size = b->bi_iter.bi_size;
- bio_for_each_segment_all(bi, b, j) {
+ for (j = 0; j < vcnt ; j++) {
+ struct bio_vec *bi;
+ bi = &b->bi_io_vec[j];
bi->bv_offset = 0;
if (size > PAGE_SIZE)
bi->bv_len = PAGE_SIZE;
@@ -2156,22 +2157,17 @@ static void process_checks(struct r1bio *r1_bio)
int error = sbio->bi_error;
struct page **ppages = get_resync_pages(pbio)->pages;
struct page **spages = get_resync_pages(sbio)->pages;
- struct bio_vec *bi;
- int page_len[RESYNC_PAGES];

if (sbio->bi_end_io != end_sync_read)
continue;
/* Now we can 'fixup' the error value */
sbio->bi_error = 0;

- bio_for_each_segment_all(bi, sbio, j)
- page_len[j] = bi->bv_len;
-
if (!error) {
for (j = vcnt; j-- ; ) {
if (memcmp(page_address(ppages[j]),
page_address(spages[j]),
- page_len[j]))
+ sbio->bi_io_vec[j].bv_len))
break;
}
} else
--
2.9.0


2017-03-28 10:45:03

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On Tue, Mar 28, 2017 at 5:49 PM, Arnd Bergmann <[email protected]> wrote:
> Commit 60928a91b0b3 ("md: raid1: use bio helper in process_checks()")
> is probably correct, but I get a new compile-time warning after
> it, and have trouble understanding what it fixes:
>
> drivers/md/raid1.c: In function 'sync_request_write':
> drivers/md/raid1.c:2172:9: error: 'page_len$' may be used uninitialized in this function [-Werror=maybe-uninitialized]
> if (memcmp(page_address(ppages[j]),
> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> page_address(spages[j]),
> ~~~~~~~~~~~~~~~~~~~~~~~~
> page_len[j]))
> ~~~~~~~~~~~~
> drivers/md/raid1.c:2160:7: note: 'page_len$' was declared here
> int page_len[RESYNC_PAGES];
> ^~~~~~~~
>
> This reverts it to resolve the warning.

Please try the following patch:

https://lkml.org/lkml/2017/3/28/126

BTW, the compile failure is just a false positive.

Thanks,
Ming

2017-03-28 11:35:54

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On Tue, Mar 28, 2017 at 12:44 PM, Ming Lei <[email protected]> wrote:
> On Tue, Mar 28, 2017 at 5:49 PM, Arnd Bergmann <[email protected]> wrote:
>> Commit 60928a91b0b3 ("md: raid1: use bio helper in process_checks()")
>> is probably correct, but I get a new compile-time warning after
>> it, and have trouble understanding what it fixes:
>>
>> drivers/md/raid1.c: In function 'sync_request_write':
>> drivers/md/raid1.c:2172:9: error: 'page_len$' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>> if (memcmp(page_address(ppages[j]),
>> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> page_address(spages[j]),
>> ~~~~~~~~~~~~~~~~~~~~~~~~
>> page_len[j]))
>> ~~~~~~~~~~~~
>> drivers/md/raid1.c:2160:7: note: 'page_len$' was declared here
>> int page_len[RESYNC_PAGES];
>> ^~~~~~~~
>>
>> This reverts it to resolve the warning.
>
> Please try the following patch:
>
> https://lkml.org/lkml/2017/3/28/126


That patch will certainly shut up the warning, but will also prevent
the compiler from warning when the function gets changed in some
way that actually leads to an uninitialized use of the page_len array,
which is why I didn't suggest doing it that way.

Arnd

2017-03-28 11:42:53

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On Tue, Mar 28, 2017 at 7:35 PM, Arnd Bergmann <[email protected]> wrote:
> On Tue, Mar 28, 2017 at 12:44 PM, Ming Lei <[email protected]> wrote:
>> On Tue, Mar 28, 2017 at 5:49 PM, Arnd Bergmann <[email protected]> wrote:
>>> Commit 60928a91b0b3 ("md: raid1: use bio helper in process_checks()")
>>> is probably correct, but I get a new compile-time warning after
>>> it, and have trouble understanding what it fixes:
>>>
>>> drivers/md/raid1.c: In function 'sync_request_write':
>>> drivers/md/raid1.c:2172:9: error: 'page_len$' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>>> if (memcmp(page_address(ppages[j]),
>>> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> page_address(spages[j]),
>>> ~~~~~~~~~~~~~~~~~~~~~~~~
>>> page_len[j]))
>>> ~~~~~~~~~~~~
>>> drivers/md/raid1.c:2160:7: note: 'page_len$' was declared here
>>> int page_len[RESYNC_PAGES];
>>> ^~~~~~~~
>>>
>>> This reverts it to resolve the warning.
>>
>> Please try the following patch:
>>
>> https://lkml.org/lkml/2017/3/28/126
>
>
> That patch will certainly shut up the warning, but will also prevent
> the compiler from warning when the function gets changed in some
> way that actually leads to an uninitialized use of the page_len array,

Why do you think that it leads to an uninitialized use of the page_len array?

The following code does initialize the array well enough for future use:

bio_for_each_segment_all(bi, sbio, j)
page_len[j] = bi->bv_len;

That is why we don't need to initialize the array explicitly, but just
for killing
the warning.


Thanks,
Ming Lei

2017-03-28 13:21:07

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On Tue, Mar 28, 2017 at 1:42 PM, Ming Lei <[email protected]> wrote:
> On Tue, Mar 28, 2017 at 7:35 PM, Arnd Bergmann <[email protected]> wrote:
>> On Tue, Mar 28, 2017 at 12:44 PM, Ming Lei <[email protected]> wrote:
>>> On Tue, Mar 28, 2017 at 5:49 PM, Arnd Bergmann <[email protected]> wrote:
>>>> Commit 60928a91b0b3 ("md: raid1: use bio helper in process_checks()")
>>>> is probably correct, but I get a new compile-time warning after
>>>> it, and have trouble understanding what it fixes:
>>>>
>>>> drivers/md/raid1.c: In function 'sync_request_write':
>>>> drivers/md/raid1.c:2172:9: error: 'page_len$' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>>>> if (memcmp(page_address(ppages[j]),
>>>> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>> page_address(spages[j]),
>>>> ~~~~~~~~~~~~~~~~~~~~~~~~
>>>> page_len[j]))
>>>> ~~~~~~~~~~~~
>>>> drivers/md/raid1.c:2160:7: note: 'page_len$' was declared here
>>>> int page_len[RESYNC_PAGES];
>>>> ^~~~~~~~
>>>>
>>>> This reverts it to resolve the warning.
>>>
>>> Please try the following patch:
>>>
>>> https://lkml.org/lkml/2017/3/28/126
>>
>>
>> That patch will certainly shut up the warning, but will also prevent
>> the compiler from warning when the function gets changed in some
>> way that actually leads to an uninitialized use of the page_len array,
>
> Why do you think that it leads to an uninitialized use of the page_len array?

What I meant is that a future change to the function might cause
another bug to go unnoticed later.

> The following code does initialize the array well enough for future use:
>
> bio_for_each_segment_all(bi, sbio, j)
> page_len[j] = bi->bv_len;
>
> That is why we don't need to initialize the array explicitly, but just
> for killing the warning.

It's also a little less clear why that is safe than the original code:
We rely on sbio->bi_vcnt to be the same as vcnt, but it requires
careful reading of the function to see that this is always true.
gcc warns because it cannot prove this to be the case, so if
something changed here, it's likely that this would also not
get noticed.

Arnd

2017-03-28 15:02:50

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On Tue, Mar 28, 2017 at 9:20 PM, Arnd Bergmann <[email protected]> wrote:
> On Tue, Mar 28, 2017 at 1:42 PM, Ming Lei <[email protected]> wrote:
>> On Tue, Mar 28, 2017 at 7:35 PM, Arnd Bergmann <[email protected]> wrote:
>>> On Tue, Mar 28, 2017 at 12:44 PM, Ming Lei <[email protected]> wrote:
>>>> On Tue, Mar 28, 2017 at 5:49 PM, Arnd Bergmann <[email protected]> wrote:
>>>>> Commit 60928a91b0b3 ("md: raid1: use bio helper in process_checks()")
>>>>> is probably correct, but I get a new compile-time warning after
>>>>> it, and have trouble understanding what it fixes:
>>>>>
>>>>> drivers/md/raid1.c: In function 'sync_request_write':
>>>>> drivers/md/raid1.c:2172:9: error: 'page_len$' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>>>>> if (memcmp(page_address(ppages[j]),
>>>>> ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> page_address(spages[j]),
>>>>> ~~~~~~~~~~~~~~~~~~~~~~~~
>>>>> page_len[j]))
>>>>> ~~~~~~~~~~~~
>>>>> drivers/md/raid1.c:2160:7: note: 'page_len$' was declared here
>>>>> int page_len[RESYNC_PAGES];
>>>>> ^~~~~~~~
>>>>>
>>>>> This reverts it to resolve the warning.
>>>>
>>>> Please try the following patch:
>>>>
>>>> https://lkml.org/lkml/2017/3/28/126
>>>
>>>
>>> That patch will certainly shut up the warning, but will also prevent
>>> the compiler from warning when the function gets changed in some
>>> way that actually leads to an uninitialized use of the page_len array,
>>
>> Why do you think that it leads to an uninitialized use of the page_len array?
>
> What I meant is that a future change to the function might cause
> another bug to go unnoticed later.

What is the future change? And what is another bug? Please don't suppose or
assume anything in future.

BTW, I don't think it is a problem, and anyone who want to change the code
much should understand it first, right?

>
>> The following code does initialize the array well enough for future use:
>>
>> bio_for_each_segment_all(bi, sbio, j)
>> page_len[j] = bi->bv_len;
>>
>> That is why we don't need to initialize the array explicitly, but just
>> for killing the warning.
>
> It's also a little less clear why that is safe than the original code:
> We rely on sbio->bi_vcnt to be the same as vcnt, but it requires

That is absolutely true because all read bios in process_checks()
have same vector number, do you think it will be changed in future?

And what we really rely on is that RESYNC_PAGES is equal to or bigger
than the vector number, and that is what we can guarantee.

> careful reading of the function to see that this is always true.
> gcc warns because it cannot prove this to be the case, so if
> something changed here, it's likely that this would also not
> get noticed.

The compiler can't understand runtime behaviour, and
we try to let gcc check more, but that is just fine if gcc can't.

One big purpose of this patch is to remove direct access to
bvec table, so it can't be reverted, or do you have better idea?


Thanks,
Ming Lei

2017-03-28 15:38:14

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On Tue, Mar 28, 2017 at 5:02 PM, Ming Lei <[email protected]> wrote:
> On Tue, Mar 28, 2017 at 9:20 PM, Arnd Bergmann <[email protected]> wrote:
>> On Tue, Mar 28, 2017 at 1:42 PM, Ming Lei <[email protected]> wrote:

>> What I meant is that a future change to the function might cause
>> another bug to go unnoticed later.
>
> What is the future change? And what is another bug? Please don't suppose or
> assume anything in future.
>
> BTW, I don't think it is a problem, and anyone who want to change the code
> much should understand it first, right?

I did not have any specific change in mind, I was just following the
principle from https://rusty.ozlabs.org/?p=232

I tend to do a lot of fixes for warnings like this, and in general
adding a fake initialization will lead to worse object code while
changing the code to be easier to understand by the compiler
will also make it easier to understand by human readers, lead
to better object code and to being more robust against future
changes that would have otherwise triggered a correct warning.

>>> The following code does initialize the array well enough for future use:
>>>
>>> bio_for_each_segment_all(bi, sbio, j)
>>> page_len[j] = bi->bv_len;
>>>
>>> That is why we don't need to initialize the array explicitly, but just
>>> for killing the warning.
>>
>> It's also a little less clear why that is safe than the original code:
>> We rely on sbio->bi_vcnt to be the same as vcnt, but it requires
>
> That is absolutely true because all read bios in process_checks()
> have same vector number, do you think it will be changed in future?

No, I have no reason to assume it would be changed.

> And what we really rely on is that RESYNC_PAGES is equal to or bigger
> than the vector number, and that is what we can guarantee.
>
>> careful reading of the function to see that this is always true.
>> gcc warns because it cannot prove this to be the case, so if
>> something changed here, it's likely that this would also not
>> get noticed.
>
> The compiler can't understand runtime behaviour, and
> we try to let gcc check more, but that is just fine if gcc can't.
>
> One big purpose of this patch is to remove direct access to
> bvec table, so it can't be reverted, or do you have better idea?

>From what I can tell, the following walk of the pages does not
have to be done in reverse, so we can perhaps use a single
bio_for_each_segment_all() (only for illustration, haven't
thought this through completely) :

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 4d176c8abc33..e4bcc7cec8da 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2157,25 +2157,30 @@ static void process_checks(struct r1bio *r1_bio)
int error = sbio->bi_error;
struct page **ppages = get_resync_pages(pbio)->pages;
struct page **spages = get_resync_pages(sbio)->pages;
+ struct bio_vec *bi;
+ int page_len[RESYNC_PAGES];
+

if (sbio->bi_end_io != end_sync_read)
continue;
/* Now we can 'fixup' the error value */
sbio->bi_error = 0;

- if (!error) {
- for (j = vcnt; j-- ; ) {
- if (memcmp(page_address(ppages[j]),
- page_address(spages[j]),
- sbio->bi_io_vec[j].bv_len))
- break;
- }
- } else
- j = 0;
- if (j >= 0)
+ if (error) {
atomic64_add(r1_bio->sectors,
&mddev->resync_mismatches);
- if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
- && !error)) {
+ bio_copy_data(sbio, pbio);
+ continue;
+ }
+
+ bio_for_each_segment_all(bi, sbio, j) {
+ if (memcmp(page_address(ppages[j]),
+ page_address(spages[j]),
+ sbio->bi_io_vec[j].bv_len)) {
+ atomic64_add(r1_bio->sectors,
&mddev->resync_mismatches);
+ break;
+ }
+ }
+ if (j == vcnt || (test_bit(MD_RECOVERY_CHECK,
&mddev->recovery))) {
/* No need to write to this device. */
sbio->bi_end_io = NULL;
rdev_dec_pending(conf->mirrors[i].rdev, mddev);

2017-03-28 15:54:15

by Wols Lists

[permalink] [raw]
Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On 28/03/17 16:02, Ming Lei wrote:
>> What I meant is that a future change to the function might cause
>> > another bug to go unnoticed later.

> What is the future change? And what is another bug? Please don't suppose or
> assume anything in future.

What was that about some American General demanding a list of "unknown
unknowns"?
>
> BTW, I don't think it is a problem, and anyone who want to change the code
> much should understand it first, right?
>
I'm very sorry, but I think you are assuming facts not in evidence (or
rather, facts that the evidence says are wrong).

In real life, it is normal for people to change things without
understanding them. Are you saying that *you* - a couple of years down
the line - will remember this bit of code, and will block a mistaken patch?

What Arnd is doing is commonly called "defensive programming", and
unfortunately reality shows us that it is usually worth its weight in
gold. That's why you put ASSERTs in code - so that if somebody does
something stupid by accident, it blows up. This is just more of the same.

Cheers,
Wol

2017-03-28 16:14:15

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On Tue, Mar 28, 2017 at 11:37 PM, Arnd Bergmann <[email protected]> wrote:
> On Tue, Mar 28, 2017 at 5:02 PM, Ming Lei <[email protected]> wrote:
>> On Tue, Mar 28, 2017 at 9:20 PM, Arnd Bergmann <[email protected]> wrote:
>>> On Tue, Mar 28, 2017 at 1:42 PM, Ming Lei <[email protected]> wrote:
>
>>> What I meant is that a future change to the function might cause
>>> another bug to go unnoticed later.
>>
>> What is the future change? And what is another bug? Please don't suppose or
>> assume anything in future.
>>
>> BTW, I don't think it is a problem, and anyone who want to change the code
>> much should understand it first, right?
>
> I did not have any specific change in mind, I was just following the
> principle from https://rusty.ozlabs.org/?p=232
>
> I tend to do a lot of fixes for warnings like this, and in general
> adding a fake initialization will lead to worse object code while
> changing the code to be easier to understand by the compiler
> will also make it easier to understand by human readers, lead
> to better object code and to being more robust against future
> changes that would have otherwise triggered a correct warning.

What gcc can't understand is the following situation:

1) int a[N];

2) for(i = 0; i < vcnt1; i++)
a[i] = b[i];

3) for (i = 0; i < vcnt2; i++)
c[i] = a[i];

The warning is in 3), since gcc may think vcnt2 isn't same with vcnt1, but as
I mentioned that two number is absolutely same, and shouldn't be
changed in future.

That is why I suggest to fix the warning via a[N] = {0}.

>
>>>> The following code does initialize the array well enough for future use:
>>>>
>>>> bio_for_each_segment_all(bi, sbio, j)
>>>> page_len[j] = bi->bv_len;
>>>>
>>>> That is why we don't need to initialize the array explicitly, but just
>>>> for killing the warning.
>>>
>>> It's also a little less clear why that is safe than the original code:
>>> We rely on sbio->bi_vcnt to be the same as vcnt, but it requires
>>
>> That is absolutely true because all read bios in process_checks()
>> have same vector number, do you think it will be changed in future?
>
> No, I have no reason to assume it would be changed.
>
>> And what we really rely on is that RESYNC_PAGES is equal to or bigger
>> than the vector number, and that is what we can guarantee.
>>
>>> careful reading of the function to see that this is always true.
>>> gcc warns because it cannot prove this to be the case, so if
>>> something changed here, it's likely that this would also not
>>> get noticed.
>>
>> The compiler can't understand runtime behaviour, and
>> we try to let gcc check more, but that is just fine if gcc can't.
>>
>> One big purpose of this patch is to remove direct access to
>> bvec table, so it can't be reverted, or do you have better idea?
>
> From what I can tell, the following walk of the pages does not
> have to be done in reverse, so we can perhaps use a single
> bio_for_each_segment_all() (only for illustration, haven't
> thought this through completely) :
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 4d176c8abc33..e4bcc7cec8da 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -2157,25 +2157,30 @@ static void process_checks(struct r1bio *r1_bio)
> int error = sbio->bi_error;
> struct page **ppages = get_resync_pages(pbio)->pages;
> struct page **spages = get_resync_pages(sbio)->pages;
> + struct bio_vec *bi;
> + int page_len[RESYNC_PAGES];
> +
>
> if (sbio->bi_end_io != end_sync_read)
> continue;
> /* Now we can 'fixup' the error value */
> sbio->bi_error = 0;
>
> - if (!error) {
> - for (j = vcnt; j-- ; ) {
> - if (memcmp(page_address(ppages[j]),
> - page_address(spages[j]),
> - sbio->bi_io_vec[j].bv_len))
> - break;
> - }
> - } else
> - j = 0;
> - if (j >= 0)
> + if (error) {
> atomic64_add(r1_bio->sectors,
> &mddev->resync_mismatches);
> - if (j < 0 || (test_bit(MD_RECOVERY_CHECK, &mddev->recovery)
> - && !error)) {
> + bio_copy_data(sbio, pbio);
> + continue;
> + }
> +
> + bio_for_each_segment_all(bi, sbio, j) {
> + if (memcmp(page_address(ppages[j]),
> + page_address(spages[j]),
> + sbio->bi_io_vec[j].bv_len)) {

The above line should be bi->bv_len, and this way should make the array
not necessary, but still a bit ugly.

I suggest to apply the simple fix first since it is correct, then we
can refactor the code in this way
later.

Thanks,
Ming Lei

Subject: Re: [PATCH] Revert "md: raid1: use bio helper in process_checks()"

On Tue, 28 Mar 2017, Wols Lists wrote:
> What Arnd is doing is commonly called "defensive programming", and
> unfortunately reality shows us that it is usually worth its weight in
> gold. That's why you put ASSERTs in code - so that if somebody does
> something stupid by accident, it blows up. This is just more of the same.

You know, BUG_ON(vcnt1 != vcnt2) might address this quite nicely [if you
want to go the assert() way, that is!], since __attribute__((noreturn))
is set for the codepath taken when the BUG_ON condition triggers...

That said, if there is something less insane to be done than OOPsing
when the world has gone strange and vcnt1 != vcnt2, that ought to be a
better solution...

All of this assumes vcnt1 == vcnt2 is really the only possibily correct
situation.

--
Henrique Holschuh