2022-07-21 09:09:14

by hmsjwzb

[permalink] [raw]
Subject: [PATCH]btrfs: Fix fstest case btrfs/219

Hi,
fstest btrfs/291 failed.

[How to reproduce]
mkdir -p /mnt/test/219.mnt
xfs_io -f -c "truncate 256m" /mnt/test/219.img1
mkfs.btrfs /mnt/test/219.img1
cp /mnt/test/219.img1 /mnt/test/219.img2
mount -o loop /mnt/test/219.img1 /mnt/test/219.mnt
umount /mnt/test/219.mnt
losetup -f --show /mnt/test/219.img1 dev
mount /dev/loop0 /mnt/test/219.mnt
umount /mnt/test/219.mnt
mount -o loop /mnt/test/219.img2 /mnt/test/219.mnt

[Root cause]
if (fs_devices->opened && found_transid < device->generation) {
/*
* That is if the FS is _not_ mounted and if you
* are here, that means there is more than one
* disk with same uuid and devid.We keep the one
* with larger generation number or the last-in if
* generation are equal.
*/
mutex_unlock(&fs_devices->device_list_mutex);
return ERR_PTR(-EEXIST);
}

[Personal opinion]
User might back up a block device to another. I think it is improper
to forbid user from mounting it.

Signed-off-by: Flint.Wang <[email protected]>
---
fs/btrfs/volumes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6aa6bc769569a..76af32032ac85 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -900,7 +900,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
* tracking a problem where systems fail mount by subvolume id
* when we reject replacement on a mounted FS.
*/
- if (!fs_devices->opened && found_transid < device->generation) {
+ if (fs_devices->opened && found_transid < device->generation) {
/*
* That is if the FS is _not_ mounted and if you
* are here, that means there is more than one
--
2.37.0


2022-07-21 14:30:45

by Nikolay Borisov

[permalink] [raw]
Subject: Re: [PATCH]btrfs: Fix fstest case btrfs/219



On 21.07.22 г. 11:36 ч., Flint.Wang wrote:
> Hi,
> fstest btrfs/291 failed.
>
> [How to reproduce]
> mkdir -p /mnt/test/219.mnt
> xfs_io -f -c "truncate 256m" /mnt/test/219.img1
> mkfs.btrfs /mnt/test/219.img1
> cp /mnt/test/219.img1 /mnt/test/219.img2
> mount -o loop /mnt/test/219.img1 /mnt/test/219.mnt
> umount /mnt/test/219.mnt
> losetup -f --show /mnt/test/219.img1 dev
> mount /dev/loop0 /mnt/test/219.mnt
> umount /mnt/test/219.mnt
> mount -o loop /mnt/test/219.img2 /mnt/test/219.mnt
>
> [Root cause]
> if (fs_devices->opened && found_transid < device->generation) {
> /*
> * That is if the FS is _not_ mounted and if you
> * are here, that means there is more than one
> * disk with same uuid and devid.We keep the one
> * with larger generation number or the last-in if
> * generation are equal.
> */
> mutex_unlock(&fs_devices->device_list_mutex);
> return ERR_PTR(-EEXIST);
> }
>
> [Personal opinion]
> User might back up a block device to another. I think it is improper
> to forbid user from mounting it.
>
> Signed-off-by: Flint.Wang <[email protected]>


This lacks any explanation whatsoever so it's not possible to judge
whether the fix is correct or not.

> ---
> fs/btrfs/volumes.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 6aa6bc769569a..76af32032ac85 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -900,7 +900,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
> * tracking a problem where systems fail mount by subvolume id
> * when we reject replacement on a mounted FS.
> */
> - if (!fs_devices->opened && found_transid < device->generation) {
> + if (fs_devices->opened && found_transid < device->generation) {
> /*
> * That is if the FS is _not_ mounted and if you
> * are here, that means there is more than one

2022-07-22 05:48:26

by hmsjwzb

[permalink] [raw]
Subject: Re: [PATCH]btrfs: Fix fstest case btrfs/219



On 7/21/22 09:37, Nikolay Borisov wrote:
>
>
> On 21.07.22 г. 11:36 ч., Flint.Wang wrote:
>> Hi,
>> fstest btrfs/291 failed.
>>
>> [How to reproduce]
>> mkdir -p /mnt/test/219.mnt
>> xfs_io -f -c "truncate 256m" /mnt/test/219.img1
>> mkfs.btrfs /mnt/test/219.img1
>> cp /mnt/test/219.img1 /mnt/test/219.img2
>> mount -o loop /mnt/test/219.img1 /mnt/test/219.mnt
>> umount /mnt/test/219.mnt
>> losetup -f --show /mnt/test/219.img1 dev
>> mount /dev/loop0 /mnt/test/219.mnt
>> umount /mnt/test/219.mnt
>> mount -o loop /mnt/test/219.img2 /mnt/test/219.mnt
>>
>> [Root cause]
>> if (fs_devices->opened && found_transid < device->generation) {
>>     /*
>>      * That is if the FS is _not_ mounted and if you
>>      * are here, that means there is more than one
>>      * disk with same uuid and devid.We keep the one
>>      * with larger generation number or the last-in if
>>      * generation are equal.
>>      */
>>     mutex_unlock(&fs_devices->device_list_mutex);
>>     return ERR_PTR(-EEXIST);
>> }
>>
>> [Personal opinion]
>> User might back up a block device to another. I think it is improper
>> to forbid user from mounting it.
>>
>> Signed-off-by: Flint.Wang <[email protected]>
>
>
> This lacks any explanation whatsoever so it's not possible to judge whether the fix is correct or not.
>
>> ---
>>   fs/btrfs/volumes.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 6aa6bc769569a..76af32032ac85 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -900,7 +900,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>>            * tracking a problem where systems fail mount by subvolume id
>>            * when we reject replacement on a mounted FS.
>>            */
>> -        if (!fs_devices->opened && found_transid < device->generation) {
>> +        if (fs_devices->opened && found_transid < device->generation) {
>>               /*
>>                * That is if the FS is _not_ mounted and if you
>>                * are here, that means there is more than one

Hi Nikolay,

It seems the failure of btrfs/219 needs some explanation.

Here is the thing.
1. A storage device A with btrfs filesystem is running on a host.
2. For example, we backup the device A to an exactly some device B.
3. The device A continue to run for a while so the device->generation is getting bigger.
4. Then you umount the device A and try to mount device B.
5. Kernel find that device A has the same UUID as device B and has bigger device->generation.
So the mount request of device B will be rejected.

if (!fs_devices->opened && found_transid < device->generation) {
/*
* That is if the FS is _not_ mounted and if you
* are here, that means there is more than one
* disk with same uuid and devid.We keep the one
* with larger generation number or the last-in if
* generation are equal.
*/
mutex_unlock(&fs_devices->device_list_mutex);
return ERR_PTR(-EEXIST);
}

I think it is improper to reject that request. Because device A is not in open state.

Thanks

2022-07-22 09:07:51

by Nikolay Borisov

[permalink] [raw]
Subject: Re: [PATCH]btrfs: Fix fstest case btrfs/219



On 22.07.22 г. 8:34 ч., hmsjwzb wrote:
>
>
> On 7/21/22 09:37, Nikolay Borisov wrote:
>>
>>
>> On 21.07.22 г. 11:36 ч., Flint.Wang wrote:
>>> Hi,
>>> fstest btrfs/291 failed.
>>>
>>> [How to reproduce]
>>> mkdir -p /mnt/test/219.mnt
>>> xfs_io -f -c "truncate 256m" /mnt/test/219.img1
>>> mkfs.btrfs /mnt/test/219.img1
>>> cp /mnt/test/219.img1 /mnt/test/219.img2
>>> mount -o loop /mnt/test/219.img1 /mnt/test/219.mnt
>>> umount /mnt/test/219.mnt
>>> losetup -f --show /mnt/test/219.img1 dev
>>> mount /dev/loop0 /mnt/test/219.mnt
>>> umount /mnt/test/219.mnt
>>> mount -o loop /mnt/test/219.img2 /mnt/test/219.mnt
>>>
>>> [Root cause]
>>> if (fs_devices->opened && found_transid < device->generation) {
>>>     /*
>>>      * That is if the FS is _not_ mounted and if you
>>>      * are here, that means there is more than one
>>>      * disk with same uuid and devid.We keep the one
>>>      * with larger generation number or the last-in if
>>>      * generation are equal.
>>>      */
>>>     mutex_unlock(&fs_devices->device_list_mutex);
>>>     return ERR_PTR(-EEXIST);
>>> }
>>>
>>> [Personal opinion]
>>> User might back up a block device to another. I think it is improper
>>> to forbid user from mounting it.
>>>
>>> Signed-off-by: Flint.Wang <[email protected]>
>>
>>
>> This lacks any explanation whatsoever so it's not possible to judge whether the fix is correct or not.
>>
>>> ---
>>>   fs/btrfs/volumes.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 6aa6bc769569a..76af32032ac85 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -900,7 +900,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>>>            * tracking a problem where systems fail mount by subvolume id
>>>            * when we reject replacement on a mounted FS.
>>>            */
>>> -        if (!fs_devices->opened && found_transid < device->generation) {
>>> +        if (fs_devices->opened && found_transid < device->generation) {
>>>               /*
>>>                * That is if the FS is _not_ mounted and if you
>>>                * are here, that means there is more than one
>
> Hi Nikolay,
>
> It seems the failure of btrfs/219 needs some explanation.
>
> Here is the thing.
> 1. A storage device A with btrfs filesystem is running on a host.
> 2. For example, we backup the device A to an exactly some device B.
> 3. The device A continue to run for a while so the device->generation is getting bigger.
> 4. Then you umount the device A and try to mount device B.
> 5. Kernel find that device A has the same UUID as device B and has bigger device->generation.
> So the mount request of device B will be rejected.
>
> if (!fs_devices->opened && found_transid < device->generation) {
> /*
> * That is if the FS is _not_ mounted and if you
> * are here, that means there is more than one
> * disk with same uuid and devid.We keep the one
> * with larger generation number or the last-in if
> * generation are equal.
> */
> mutex_unlock(&fs_devices->device_list_mutex);
> return ERR_PTR(-EEXIST);
> }
>
> I think it is improper to reject that request. Because device A is not in open state.

But then you will gravely confuse the filesystem about which device is
the latest one, no? This code is rather old and the comments doesn't
really help. So I'd like Chris (as the original author) to chime in on
what the expected behavior should be ? IMO we shouldn't be allowing to
add devices with older generation at all, irrespective of whether the fs
is mounted or not.


>
> Thanks

2022-07-26 19:05:46

by David Sterba

[permalink] [raw]
Subject: Re: [PATCH]btrfs: Fix fstest case btrfs/219

On Fri, Jul 22, 2022 at 01:34:11AM -0400, hmsjwzb wrote:
> On 7/21/22 09:37, Nikolay Borisov wrote:
> > On 21.07.22 г. 11:36 ч., Flint.Wang wrote:
> >> Hi,
> >> fstest btrfs/291 failed.
> >>
> >> [How to reproduce]
> >> mkdir -p /mnt/test/219.mnt
> >> xfs_io -f -c "truncate 256m" /mnt/test/219.img1
> >> mkfs.btrfs /mnt/test/219.img1
> >> cp /mnt/test/219.img1 /mnt/test/219.img2
> >> mount -o loop /mnt/test/219.img1 /mnt/test/219.mnt
> >> umount /mnt/test/219.mnt
> >> losetup -f --show /mnt/test/219.img1 dev
> >> mount /dev/loop0 /mnt/test/219.mnt
> >> umount /mnt/test/219.mnt
> >> mount -o loop /mnt/test/219.img2 /mnt/test/219.mnt
> >>
> >> [Root cause]
> >> if (fs_devices->opened && found_transid < device->generation) {
> >>     /*
> >>      * That is if the FS is _not_ mounted and if you
> >>      * are here, that means there is more than one
> >>      * disk with same uuid and devid.We keep the one
> >>      * with larger generation number or the last-in if
> >>      * generation are equal.
> >>      */
> >>     mutex_unlock(&fs_devices->device_list_mutex);
> >>     return ERR_PTR(-EEXIST);
> >> }
> >>
> >> [Personal opinion]
> >> User might back up a block device to another. I think it is improper
> >> to forbid user from mounting it.
> >>
> >> Signed-off-by: Flint.Wang <[email protected]>
> >
> >
> > This lacks any explanation whatsoever so it's not possible to judge whether the fix is correct or not.
> >
> >> ---
> >>   fs/btrfs/volumes.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index 6aa6bc769569a..76af32032ac85 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -900,7 +900,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
> >>            * tracking a problem where systems fail mount by subvolume id
> >>            * when we reject replacement on a mounted FS.
> >>            */
> >> -        if (!fs_devices->opened && found_transid < device->generation) {
> >> +        if (fs_devices->opened && found_transid < device->generation) {
> >>               /*
> >>                * That is if the FS is _not_ mounted and if you
> >>                * are here, that means there is more than one
>
> Hi Nikolay,
>
> It seems the failure of btrfs/219 needs some explanation.
>
> Here is the thing.
> 1. A storage device A with btrfs filesystem is running on a host.
> 2. For example, we backup the device A to an exactly some device B.
> 3. The device A continue to run for a while so the device->generation is getting bigger.
> 4. Then you umount the device A and try to mount device B.
> 5. Kernel find that device A has the same UUID as device B and has bigger device->generation.
> So the mount request of device B will be rejected.

That's on purpose, devices are matched by UUIDs and making block copies
of the same filesystem is known "don't do that" and discouraged.

If you must store the block copies then you can change the UUID by
btrfstune, there are two ways (fast metadata_uuid, and slow rewriting
all metadata uuids in all blocks).

>
> if (!fs_devices->opened && found_transid < device->generation) {
> /*
> * That is if the FS is _not_ mounted and if you
> * are here, that means there is more than one
> * disk with same uuid and devid.We keep the one
> * with larger generation number or the last-in if
> * generation are equal.
> */
> mutex_unlock(&fs_devices->device_list_mutex);
> return ERR_PTR(-EEXIST);
> }
>
> I think it is improper to reject that request. Because device A is not in open state.

But this would prevent mounting A. There should really be some way to
distiguish the filesystems, the block device is not a stable identifier,
the UUID is. Imagine having 10 copies of the same filesystem identified
by the same UUID and device UUID, but with different generations and
data. That's asking for problems.

There's not much the filesystem driver can do than to avoid using old
devices and giving preference to the highest generation device. All
devices with btrfs signature are registered in memory and this is the
primary source when mounting the devices, not the block device itself.

2022-07-27 09:24:50

by hmsjwzb

[permalink] [raw]
Subject: [PATCH] btrfs/219: fix problems with mount old generation

Hi Nikolay & David,

As we discussed, I changed the test btrfs/219 to make it pass.

This test try to mount btrfs filesystem with old generation.

Devices are matched by UUID in btrfs filesystem. The mount of btrfs with
old generation should be failed on purpose.

Signed-off-by: Flint.Wang <[email protected]>
---
tests/btrfs/219 | 3 +--
tests/btrfs/219.out | 1 +
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/btrfs/219 b/tests/btrfs/219
index 528175b8..5152fa91 100755
--- a/tests/btrfs/219
+++ b/tests/btrfs/219
@@ -73,8 +73,7 @@ _mount $loop_dev $loop_mnt > /dev/null 2>&1 || \
$UMOUNT_PROG $loop_mnt

_mount -o loop $fs_img2 $loop_mnt > /dev/null 2>&1 || \
- _fail "We couldn't mount the old generation"
-$UMOUNT_PROG $loop_mnt
+ echo "We couldn't mount the old generation"

_mount $loop_dev $loop_mnt > /dev/null 2>&1 || \
_fail "Failed to mount the second time"
diff --git a/tests/btrfs/219.out b/tests/btrfs/219.out
index 162074d3..6fe85f24 100644
--- a/tests/btrfs/219.out
+++ b/tests/btrfs/219.out
@@ -1,2 +1,3 @@
QA output created by 219
+We couldn't mount the old generation
Silence is golden
--
2.37.0

Thanks

2022-08-03 07:00:34

by Anand Jain

[permalink] [raw]
Subject: Re: [PATCH]btrfs: Fix fstest case btrfs/219






On 27/07/2022 02:38, David Sterba wrote:
> On Fri, Jul 22, 2022 at 01:34:11AM -0400, hmsjwzb wrote:
>> On 7/21/22 09:37, Nikolay Borisov wrote:
>>> On 21.07.22 г. 11:36 ч., Flint.Wang wrote:
>>>> Hi,
>>>> fstest btrfs/291 failed.
>>>>
>>>> [How to reproduce]
>>>> mkdir -p /mnt/test/219.mnt
>>>> xfs_io -f -c "truncate 256m" /mnt/test/219.img1
>>>> mkfs.btrfs /mnt/test/219.img1
>>>> cp /mnt/test/219.img1 /mnt/test/219.img2
>>>> mount -o loop /mnt/test/219.img1 /mnt/test/219.mnt
>>>> umount /mnt/test/219.mnt
>>>> losetup -f --show /mnt/test/219.img1 dev
>>>> mount /dev/loop0 /mnt/test/219.mnt
>>>> umount /mnt/test/219.mnt
>>>> mount -o loop /mnt/test/219.img2 /mnt/test/219.mnt
>>>>
>>>> [Root cause]
>>>> if (fs_devices->opened && found_transid < device->generation) {
>>>>     /*
>>>>      * That is if the FS is _not_ mounted and if you
>>>>      * are here, that means there is more than one
>>>>      * disk with same uuid and devid.We keep the one
>>>>      * with larger generation number or the last-in if
>>>>      * generation are equal.
>>>>      */
>>>>     mutex_unlock(&fs_devices->device_list_mutex);
>>>>     return ERR_PTR(-EEXIST);
>>>> }
>>>>
>>>> [Personal opinion]
>>>> User might back up a block device to another. I think it is improper
>>>> to forbid user from mounting it.
>>>>
>>>> Signed-off-by: Flint.Wang <[email protected]>
>>>
>>>
>>> This lacks any explanation whatsoever so it's not possible to judge whether the fix is correct or not.
>>>
>>>> ---
>>>>   fs/btrfs/volumes.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>>> index 6aa6bc769569a..76af32032ac85 100644
>>>> --- a/fs/btrfs/volumes.c
>>>> +++ b/fs/btrfs/volumes.c
>>>> @@ -900,7 +900,7 @@ static noinline struct btrfs_device *device_list_add(const char *path,
>>>>            * tracking a problem where systems fail mount by subvolume id
>>>>            * when we reject replacement on a mounted FS.
>>>>            */
>>>> -        if (!fs_devices->opened && found_transid < device->generation) {
>>>> +        if (fs_devices->opened && found_transid < device->generation) {
>>>>               /*
>>>>                * That is if the FS is _not_ mounted and if you
>>>>                * are here, that means there is more than one
>>
>> Hi Nikolay,
>>
>> It seems the failure of btrfs/219 needs some explanation.
>>
>> Here is the thing.
>> 1. A storage device A with btrfs filesystem is running on a host.
>> 2. For example, we backup the device A to an exactly some device B.
>> 3. The device A continue to run for a while so the device->generation is getting bigger.
>> 4. Then you umount the device A and try to mount device B.
>> 5. Kernel find that device A has the same UUID as device B and has bigger device->generation.
>> So the mount request of device B will be rejected.
>
> That's on purpose, devices are matched by UUIDs and making block copies
> of the same filesystem is known "don't do that" and discouraged.
>
> If you must store the block copies then you can change the UUID by
> btrfstune, there are two ways (fast metadata_uuid, and slow rewriting
> all metadata uuids in all blocks).
>
>>
>> if (!fs_devices->opened && found_transid < device->generation) {
>> /*
>> * That is if the FS is _not_ mounted and if you
>> * are here, that means there is more than one
>> * disk with same uuid and devid.We keep the one
>> * with larger generation number or the last-in if
>> * generation are equal.
>> */
>> mutex_unlock(&fs_devices->device_list_mutex);
>> return ERR_PTR(-EEXIST);
>> }
>>
>> I think it is improper to reject that request. Because device A is not in open state.
>
> But this would prevent mounting A. There should really be some way to
> distiguish the filesystems, the block device is not a stable identifier,
> the UUID is. Imagine having 10 copies of the same filesystem identified
> by the same UUID and device UUID, but with different generations and
> data. That's asking for problems.
>
> There's not much the filesystem driver can do than to avoid using old
> devices and giving preference to the highest generation device. All
> devices with btrfs signature are registered in memory and this is the
> primary source when mounting the devices, not the block device itself.


David,

The unintegrated patch [1] also used the same use case.

[1]
[PATCH v2][RESEND] btrfs: allow single disk devices to mount with
older generations

IMO device-copy and mount (without changing the UUID) can be allowed
for a single device btrfs volume only. We even have a fstest case
btrfs/219, which tests single device duplicate UUIDs.

Please integrate [1].

-Anand