2015-11-03 07:13:12

by Junxiao Bi

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

Hi Gang,

This is not like a right patch.
First, online file check only checks inode's block number, valid flag,
fs generation value, and meta ecc. I never see a real corruption
happened only on this field, if these fields are corrupted, that means
something bad may happen on other place. So fix this field may not help
and even cause corruption more hard.
Second, the repair way is wrong. In
ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
match the ones in memory, the ones in memory are used to update the disk
fields. The question is how do you know these field in memory are
right(they may be the real corrupted ones)?

Thanks,
Junxiao.
On 10/28/2015 02:26 PM, Gang He wrote:
> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
> + struct buffer_head *bh)
> +{
> + int rc;
> + int changed = 0;
> + struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
> +
> + rc = ocfs2_filecheck_validate_inode_block(sb, bh);
> + /* Can't fix invalid inode block */
> + if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
> + return rc;
> +
> + trace_ocfs2_filecheck_repair_inode_block(
> + (unsigned long long)bh->b_blocknr);
> +
> + if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
> + ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
> + mlog(ML_ERROR,
> + "Filecheck: try to repair dinode #%llu on readonly filesystem\n",
> + (unsigned long long)bh->b_blocknr);
> + return -OCFS2_FILECHECK_ERR_READONLY;
> + }
> +
> + if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
> + di->i_blkno = cpu_to_le64(bh->b_blocknr);
> + changed = 1;
> + mlog(ML_ERROR,
> + "Filecheck: reset dinode #%llu: i_blkno to %llu\n",
> + (unsigned long long)bh->b_blocknr,
> + (unsigned long long)le64_to_cpu(di->i_blkno));
> + }
> +
> + if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
> + di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
> + changed = 1;
> + mlog(ML_ERROR,
> + "Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
> + (unsigned long long)bh->b_blocknr);
> + }
> +
> + if (le32_to_cpu(di->i_fs_generation) !=
> + OCFS2_SB(sb)->fs_generation) {
> + di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
> + changed = 1;
> + mlog(ML_ERROR,
> + "Filecheck: reset dinode #%llu: fs_generation to %u\n",
> + (unsigned long long)bh->b_blocknr,
> + le32_to_cpu(di->i_fs_generation));
> + }
> +
> + if (changed ||
> + ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
> + ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
> + mark_buffer_dirty(bh);
> + mlog(ML_ERROR,
> + "Filecheck: reset dinode #%llu: compute meta ecc\n",
> + (unsigned long long)bh->b_blocknr);
> + }
> +
> + return 0;
> +}


2015-11-03 08:15:25

by Gang He

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

Hello Junxiao,

See my comments inline.


>>>
> Hi Gang,
>
> This is not like a right patch.
> First, online file check only checks inode's block number, valid flag,
> fs generation value, and meta ecc. I never see a real corruption
> happened only on this field, if these fields are corrupted, that means
> something bad may happen on other place. So fix this field may not help
> and even cause corruption more hard.
This online file check/fix feature is used to check/fix some light file meta block corruption, instead of turning a file system off and using fsck.ocfs2.
e.g. meta ecc error, we really need not to use fsck.ocfs2.
of course, this feature does not replace fsck.ocfs2 and touch some complicated meta block problems, if there is some potential problem in some areas, we can discuss them one by one.



> Second, the repair way is wrong. In
> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
> match the ones in memory, the ones in memory are used to update the disk
> fields. The question is how do you know these field in memory are
> right(they may be the real corrupted ones)?
Here, if the inode block was corrupted, the file system is not able to load it into the memory.
ocfs2_filecheck_repair_inode_block() will able to load it into the memory, since it try to fix these light-level problem before loading.
if the fix is OK, the changed meta-block can pass the block-validate function and load into the memory as a inode object.
Since the file system is under a cluster environment, we have to use some existing function and code path to keep these block operation under a cluster lock.


Thanks
Gang

>
> Thanks,
> Junxiao.
> On 10/28/2015 02:26 PM, Gang He wrote:
>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>> + struct buffer_head *bh)
>> +{
>> + int rc;
>> + int changed = 0;
>> + struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>> +
>> + rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>> + /* Can't fix invalid inode block */
>> + if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>> + return rc;
>> +
>> + trace_ocfs2_filecheck_repair_inode_block(
>> + (unsigned long long)bh->b_blocknr);
>> +
>> + if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>> + ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>> + mlog(ML_ERROR,
>> + "Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>> + (unsigned long long)bh->b_blocknr);
>> + return -OCFS2_FILECHECK_ERR_READONLY;
>> + }
>> +
>> + if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>> + di->i_blkno = cpu_to_le64(bh->b_blocknr);
>> + changed = 1;
>> + mlog(ML_ERROR,
>> + "Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>> + (unsigned long long)bh->b_blocknr,
>> + (unsigned long long)le64_to_cpu(di->i_blkno));
>> + }
>> +
>> + if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>> + di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>> + changed = 1;
>> + mlog(ML_ERROR,
>> + "Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>> + (unsigned long long)bh->b_blocknr);
>> + }
>> +
>> + if (le32_to_cpu(di->i_fs_generation) !=
>> + OCFS2_SB(sb)->fs_generation) {
>> + di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>> + changed = 1;
>> + mlog(ML_ERROR,
>> + "Filecheck: reset dinode #%llu: fs_generation to %u\n",
>> + (unsigned long long)bh->b_blocknr,
>> + le32_to_cpu(di->i_fs_generation));
>> + }
>> +
>> + if (changed ||
>> + ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>> + ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>> + mark_buffer_dirty(bh);
>> + mlog(ML_ERROR,
>> + "Filecheck: reset dinode #%llu: compute meta ecc\n",
>> + (unsigned long long)bh->b_blocknr);
>> + }
>> +
>> + return 0;
>> +}

2015-11-03 08:32:44

by Junxiao Bi

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

On 11/03/2015 04:15 PM, Gang He wrote:
> Hello Junxiao,
>
> See my comments inline.
>
>
>>>>
>> Hi Gang,
>>
>> This is not like a right patch.
>> First, online file check only checks inode's block number, valid flag,
>> fs generation value, and meta ecc. I never see a real corruption
>> happened only on this field, if these fields are corrupted, that means
>> something bad may happen on other place. So fix this field may not help
>> and even cause corruption more hard.
> This online file check/fix feature is used to check/fix some light file meta block corruption, instead of turning a file system off and using fsck.ocfs2.
What's light meta block corruption? Do you have a case about it?
> e.g. meta ecc error, we really need not to use fsck.ocfs2.
> of course, this feature does not replace fsck.ocfs2 and touch some complicated meta block problems, if there is some potential problem in some areas, we can discuss them one by one.
>
>
>
>> Second, the repair way is wrong. In
>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>> match the ones in memory, the ones in memory are used to update the disk
>> fields. The question is how do you know these field in memory are
>> right(they may be the real corrupted ones)?
> Here, if the inode block was corrupted, the file system is not able to load it into the memory.
How do you know inode block corrupted? If bh for inode block is
overwritten, i mean bh corrupted, the repair will corrupted a good inode
block.

Thanks,
Junxiao.

> ocfs2_filecheck_repair_inode_block() will able to load it into the memory, since it try to fix these light-level problem before loading.
> if the fix is OK, the changed meta-block can pass the block-validate function and load into the memory as a inode object.
> Since the file system is under a cluster environment, we have to use some existing function and code path to keep these block operation under a cluster lock.
>
>
> Thanks
> Gang
>
>>
>> Thanks,
>> Junxiao.
>> On 10/28/2015 02:26 PM, Gang He wrote:
>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>> + struct buffer_head *bh)
>>> +{
>>> + int rc;
>>> + int changed = 0;
>>> + struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>> +
>>> + rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>> + /* Can't fix invalid inode block */
>>> + if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>> + return rc;
>>> +
>>> + trace_ocfs2_filecheck_repair_inode_block(
>>> + (unsigned long long)bh->b_blocknr);
>>> +
>>> + if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>> + ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>> + mlog(ML_ERROR,
>>> + "Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>> + (unsigned long long)bh->b_blocknr);
>>> + return -OCFS2_FILECHECK_ERR_READONLY;
>>> + }
>>> +
>>> + if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>> + di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>> + changed = 1;
>>> + mlog(ML_ERROR,
>>> + "Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>> + (unsigned long long)bh->b_blocknr,
>>> + (unsigned long long)le64_to_cpu(di->i_blkno));
>>> + }
>>> +
>>> + if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>> + di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>> + changed = 1;
>>> + mlog(ML_ERROR,
>>> + "Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>> + (unsigned long long)bh->b_blocknr);
>>> + }
>>> +
>>> + if (le32_to_cpu(di->i_fs_generation) !=
>>> + OCFS2_SB(sb)->fs_generation) {
>>> + di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>> + changed = 1;
>>> + mlog(ML_ERROR,
>>> + "Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>> + (unsigned long long)bh->b_blocknr,
>>> + le32_to_cpu(di->i_fs_generation));
>>> + }
>>> +
>>> + if (changed ||
>>> + ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>> + ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>> + mark_buffer_dirty(bh);
>>> + mlog(ML_ERROR,
>>> + "Filecheck: reset dinode #%llu: compute meta ecc\n",
>>> + (unsigned long long)bh->b_blocknr);
>>> + }
>>> +
>>> + return 0;
>>> +}
>

2015-11-03 08:48:13

by Gang He

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check




>>>
> On 11/03/2015 04:15 PM, Gang He wrote:
>> Hello Junxiao,
>>
>> See my comments inline.
>>
>>
>>>>>
>>> Hi Gang,
>>>
>>> This is not like a right patch.
>>> First, online file check only checks inode's block number, valid flag,
>>> fs generation value, and meta ecc. I never see a real corruption
>>> happened only on this field, if these fields are corrupted, that means
>>> something bad may happen on other place. So fix this field may not help
>>> and even cause corruption more hard.
>> This online file check/fix feature is used to check/fix some light file meta
> block corruption, instead of turning a file system off and using fsck.ocfs2.
> What's light meta block corruption? Do you have a case about it?
>> e.g. meta ecc error, we really need not to use fsck.ocfs2.
>> of course, this feature does not replace fsck.ocfs2 and touch some
> complicated meta block problems, if there is some potential problem in some
> areas, we can discuss them one by one.
>>
>>
>>
>>> Second, the repair way is wrong. In
>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>> match the ones in memory, the ones in memory are used to update the disk
>>> fields. The question is how do you know these field in memory are
>>> right(they may be the real corrupted ones)?
>> Here, if the inode block was corrupted, the file system is not able to load
> it into the memory.
> How do you know inode block corrupted? If bh for inode block is
> overwritten, i mean bh corrupted, the repair will corrupted a good inode
> block.
You know, the meta block is only validated when the file system loads the block from disk to memory.
If the inode object is in the memory, we consider this inode block is OK.
If the inode is not loaded by the file system via the normal way, the file system will print a kernel error log to tell which ino is corrupted.
we will use ocfs2_filecheck_repair_inode_block() function to fix the inode block before loading.

Thanks
Gang

>
> Thanks,
> Junxiao.
>
>> ocfs2_filecheck_repair_inode_block() will able to load it into the memory,
> since it try to fix these light-level problem before loading.
>> if the fix is OK, the changed meta-block can pass the block-validate function
> and load into the memory as a inode object.
>> Since the file system is under a cluster environment, we have to use some
> existing function and code path to keep these block operation under a cluster
> lock.
>>
>>
>> Thanks
>> Gang
>>
>>>
>>> Thanks,
>>> Junxiao.
>>> On 10/28/2015 02:26 PM, Gang He wrote:
>>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>>> + struct buffer_head *bh)
>>>> +{
>>>> + int rc;
>>>> + int changed = 0;
>>>> + struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>>> +
>>>> + rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>>> + /* Can't fix invalid inode block */
>>>> + if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>>> + return rc;
>>>> +
>>>> + trace_ocfs2_filecheck_repair_inode_block(
>>>> + (unsigned long long)bh->b_blocknr);
>>>> +
>>>> + if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>>> + ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>>> + mlog(ML_ERROR,
>>>> + "Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>>> + (unsigned long long)bh->b_blocknr);
>>>> + return -OCFS2_FILECHECK_ERR_READONLY;
>>>> + }
>>>> +
>>>> + if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>>> + di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>>> + changed = 1;
>>>> + mlog(ML_ERROR,
>>>> + "Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>>> + (unsigned long long)bh->b_blocknr,
>>>> + (unsigned long long)le64_to_cpu(di->i_blkno));
>>>> + }
>>>> +
>>>> + if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>>> + di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>>> + changed = 1;
>>>> + mlog(ML_ERROR,
>>>> + "Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>>> + (unsigned long long)bh->b_blocknr);
>>>> + }
>>>> +
>>>> + if (le32_to_cpu(di->i_fs_generation) !=
>>>> + OCFS2_SB(sb)->fs_generation) {
>>>> + di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>>> + changed = 1;
>>>> + mlog(ML_ERROR,
>>>> + "Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>>> + (unsigned long long)bh->b_blocknr,
>>>> + le32_to_cpu(di->i_fs_generation));
>>>> + }
>>>> +
>>>> + if (changed ||
>>>> + ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>>> + ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>>> + mark_buffer_dirty(bh);
>>>> + mlog(ML_ERROR,
>>>> + "Filecheck: reset dinode #%llu: compute meta ecc\n",
>>>> + (unsigned long long)bh->b_blocknr);
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>

2015-11-03 09:05:17

by Junxiao Bi

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

On 11/03/2015 04:47 PM, Gang He wrote:
>
>
>
>>>>
>> On 11/03/2015 04:15 PM, Gang He wrote:
>>> Hello Junxiao,
>>>
>>> See my comments inline.
>>>
>>>
>>>>>>
>>>> Hi Gang,
>>>>
>>>> This is not like a right patch.
>>>> First, online file check only checks inode's block number, valid flag,
>>>> fs generation value, and meta ecc. I never see a real corruption
>>>> happened only on this field, if these fields are corrupted, that means
>>>> something bad may happen on other place. So fix this field may not help
>>>> and even cause corruption more hard.
>>> This online file check/fix feature is used to check/fix some light file meta
>> block corruption, instead of turning a file system off and using fsck.ocfs2.
>> What's light meta block corruption? Do you have a case about it?
>>> e.g. meta ecc error, we really need not to use fsck.ocfs2.
>>> of course, this feature does not replace fsck.ocfs2 and touch some
>> complicated meta block problems, if there is some potential problem in some
>> areas, we can discuss them one by one.
>>>
>>>
>>>
>>>> Second, the repair way is wrong. In
>>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>>> match the ones in memory, the ones in memory are used to update the disk
>>>> fields. The question is how do you know these field in memory are
>>>> right(they may be the real corrupted ones)?
>>> Here, if the inode block was corrupted, the file system is not able to load
>> it into the memory.
>> How do you know inode block corrupted? If bh for inode block is
>> overwritten, i mean bh corrupted, the repair will corrupted a good inode
>> block.
> You know, the meta block is only validated when the file system loads the block from disk to memory.
> If the inode object is in the memory, we consider this inode block is OK.
This assuming is not true as there are always bugs. Bugs can make inode
object in memory bad and corrupted the fs when repair the inode.

Thanks,
Junxiao.
> If the inode is not loaded by the file system via the normal way, the file system will print a kernel error log to tell which ino is corrupted.
> we will use ocfs2_filecheck_repair_inode_block() function to fix the inode block before loading.
>
> Thanks
> Gang
>
>>
>> Thanks,
>> Junxiao.
>>
>>> ocfs2_filecheck_repair_inode_block() will able to load it into the memory,
>> since it try to fix these light-level problem before loading.
>>> if the fix is OK, the changed meta-block can pass the block-validate function
>> and load into the memory as a inode object.
>>> Since the file system is under a cluster environment, we have to use some
>> existing function and code path to keep these block operation under a cluster
>> lock.
>>>
>>>
>>> Thanks
>>> Gang
>>>
>>>>
>>>> Thanks,
>>>> Junxiao.
>>>> On 10/28/2015 02:26 PM, Gang He wrote:
>>>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>>>> + struct buffer_head *bh)
>>>>> +{
>>>>> + int rc;
>>>>> + int changed = 0;
>>>>> + struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>>>> +
>>>>> + rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>>>> + /* Can't fix invalid inode block */
>>>>> + if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>>>> + return rc;
>>>>> +
>>>>> + trace_ocfs2_filecheck_repair_inode_block(
>>>>> + (unsigned long long)bh->b_blocknr);
>>>>> +
>>>>> + if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>>>> + ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>>>> + mlog(ML_ERROR,
>>>>> + "Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>>>> + (unsigned long long)bh->b_blocknr);
>>>>> + return -OCFS2_FILECHECK_ERR_READONLY;
>>>>> + }
>>>>> +
>>>>> + if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>>>> + di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>>>> + changed = 1;
>>>>> + mlog(ML_ERROR,
>>>>> + "Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>>>> + (unsigned long long)bh->b_blocknr,
>>>>> + (unsigned long long)le64_to_cpu(di->i_blkno));
>>>>> + }
>>>>> +
>>>>> + if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>>>> + di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>>>> + changed = 1;
>>>>> + mlog(ML_ERROR,
>>>>> + "Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>>>> + (unsigned long long)bh->b_blocknr);
>>>>> + }
>>>>> +
>>>>> + if (le32_to_cpu(di->i_fs_generation) !=
>>>>> + OCFS2_SB(sb)->fs_generation) {
>>>>> + di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>>>> + changed = 1;
>>>>> + mlog(ML_ERROR,
>>>>> + "Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>>>> + (unsigned long long)bh->b_blocknr,
>>>>> + le32_to_cpu(di->i_fs_generation));
>>>>> + }
>>>>> +
>>>>> + if (changed ||
>>>>> + ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>>>> + ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>>>> + mark_buffer_dirty(bh);
>>>>> + mlog(ML_ERROR,
>>>>> + "Filecheck: reset dinode #%llu: compute meta ecc\n",
>>>>> + (unsigned long long)bh->b_blocknr);
>>>>> + }
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>

2015-11-03 09:25:36

by Gang He

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check




>>>
> On 11/03/2015 04:47 PM, Gang He wrote:
>>
>>
>>
>>>>>
>>> On 11/03/2015 04:15 PM, Gang He wrote:
>>>> Hello Junxiao,
>>>>
>>>> See my comments inline.
>>>>
>>>>
>>>>>>>
>>>>> Hi Gang,
>>>>>
>>>>> This is not like a right patch.
>>>>> First, online file check only checks inode's block number, valid flag,
>>>>> fs generation value, and meta ecc. I never see a real corruption
>>>>> happened only on this field, if these fields are corrupted, that means
>>>>> something bad may happen on other place. So fix this field may not help
>>>>> and even cause corruption more hard.
>>>> This online file check/fix feature is used to check/fix some light file meta
>
>>> block corruption, instead of turning a file system off and using fsck.ocfs2.
>>> What's light meta block corruption? Do you have a case about it?
>>>> e.g. meta ecc error, we really need not to use fsck.ocfs2.
>>>> of course, this feature does not replace fsck.ocfs2 and touch some
>>> complicated meta block problems, if there is some potential problem in some
>>> areas, we can discuss them one by one.
>>>>
>>>>
>>>>
>>>>> Second, the repair way is wrong. In
>>>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>>>> match the ones in memory, the ones in memory are used to update the disk
>>>>> fields. The question is how do you know these field in memory are
>>>>> right(they may be the real corrupted ones)?
>>>> Here, if the inode block was corrupted, the file system is not able to load
>>> it into the memory.
>>> How do you know inode block corrupted? If bh for inode block is
>>> overwritten, i mean bh corrupted, the repair will corrupted a good inode
>>> block.
>> You know, the meta block is only validated when the file system loads the
> block from disk to memory.
>> If the inode object is in the memory, we consider this inode block is OK.
> This assuming is not true as there are always bugs. Bugs can make inode
> object in memory bad and corrupted the fs when repair the inode.
The inode object in the memory has probably been corrupted in some cases, right.
but in these cases, online file check/fix feature considers this inode object is validated, will not do any further modification and exit.
if the next corruption happens in case this inode is flushed into the disk, the bad thing is not made by online file check/fix code.
Next (maybe next mount), the inode block is reload into the memory by the file system will fail, with a kernel error log printing,
the online file check/fix probably can help to fix at this monent.

Thanks
Gang

>
> Thanks,
> Junxiao.
>> If the inode is not loaded by the file system via the normal way, the file
> system will print a kernel error log to tell which ino is corrupted.
>> we will use ocfs2_filecheck_repair_inode_block() function to fix the inode
> block before loading.
>>
>> Thanks
>> Gang
>>
>>>
>>> Thanks,
>>> Junxiao.
>>>
>>>> ocfs2_filecheck_repair_inode_block() will able to load it into the memory,
>>> since it try to fix these light-level problem before loading.
>>>> if the fix is OK, the changed meta-block can pass the block-validate function
>>> and load into the memory as a inode object.
>>>> Since the file system is under a cluster environment, we have to use some
>>> existing function and code path to keep these block operation under a
> cluster
>>> lock.
>>>>
>>>>
>>>> Thanks
>>>> Gang
>>>>
>>>>>
>>>>> Thanks,
>>>>> Junxiao.
>>>>> On 10/28/2015 02:26 PM, Gang He wrote:
>>>>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>>>>> + struct buffer_head *bh)
>>>>>> +{
>>>>>> + int rc;
>>>>>> + int changed = 0;
>>>>>> + struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>>>>> +
>>>>>> + rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>>>>> + /* Can't fix invalid inode block */
>>>>>> + if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>>>>> + return rc;
>>>>>> +
>>>>>> + trace_ocfs2_filecheck_repair_inode_block(
>>>>>> + (unsigned long long)bh->b_blocknr);
>>>>>> +
>>>>>> + if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>>>>> + ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>>>>> + mlog(ML_ERROR,
>>>>>> + "Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>>>>> + (unsigned long long)bh->b_blocknr);
>>>>>> + return -OCFS2_FILECHECK_ERR_READONLY;
>>>>>> + }
>>>>>> +
>>>>>> + if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>>>>> + di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>>>>> + changed = 1;
>>>>>> + mlog(ML_ERROR,
>>>>>> + "Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>>>>> + (unsigned long long)bh->b_blocknr,
>>>>>> + (unsigned long long)le64_to_cpu(di->i_blkno));
>>>>>> + }
>>>>>> +
>>>>>> + if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>>>>> + di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>>>>> + changed = 1;
>>>>>> + mlog(ML_ERROR,
>>>>>> + "Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>>>>> + (unsigned long long)bh->b_blocknr);
>>>>>> + }
>>>>>> +
>>>>>> + if (le32_to_cpu(di->i_fs_generation) !=
>>>>>> + OCFS2_SB(sb)->fs_generation) {
>>>>>> + di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>>>>> + changed = 1;
>>>>>> + mlog(ML_ERROR,
>>>>>> + "Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>>>>> + (unsigned long long)bh->b_blocknr,
>>>>>> + le32_to_cpu(di->i_fs_generation));
>>>>>> + }
>>>>>> +
>>>>>> + if (changed ||
>>>>>> + ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>>>>> + ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>>>>> + mark_buffer_dirty(bh);
>>>>>> + mlog(ML_ERROR,
>>>>>> + "Filecheck: reset dinode #%llu: compute meta ecc\n",
>>>>>> + (unsigned long long)bh->b_blocknr);
>>>>>> + }
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>

2015-11-24 22:16:11

by Mark Fasheh

[permalink] [raw]
Subject: Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

Hi Junxiao,

On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
> Hi Gang,
>
> This is not like a right patch.
> First, online file check only checks inode's block number, valid flag,
> fs generation value, and meta ecc. I never see a real corruption
> happened only on this field, if these fields are corrupted, that means
> something bad may happen on other place. So fix this field may not help
> and even cause corruption more hard.

I agree that these are rather uncommon, we might even consider removing the
VALID_FL fixup. I definitely don't think we're ready for anything more
complicated than this though either. We kind of have to start somewhere too.


> Second, the repair way is wrong. In
> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
> match the ones in memory, the ones in memory are used to update the disk
> fields. The question is how do you know these field in memory are
> right(they may be the real corrupted ones)?

Your second point (and the last part of your 1st point) makes a good
argument for why this shouldn't happen automatically. Some of these
corruptions might require a human to look at the log and decide what to do.
Especially as you point out, where we might not know where the source of the
corruption is. And if the human can't figure it out, then it's probably time
to unmount and fsck.

Thanks,
--Mark

--
Mark Fasheh

2015-11-25 04:11:43

by Junxiao Bi

[permalink] [raw]
Subject: Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

Hi Mark,

On 11/25/2015 06:16 AM, Mark Fasheh wrote:
> Hi Junxiao,
>
> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>>
>> This is not like a right patch.
>> First, online file check only checks inode's block number, valid flag,
>> fs generation value, and meta ecc. I never see a real corruption
>> happened only on this field, if these fields are corrupted, that means
>> something bad may happen on other place. So fix this field may not help
>> and even cause corruption more hard.
>
> I agree that these are rather uncommon, we might even consider removing the
> VALID_FL fixup. I definitely don't think we're ready for anything more
> complicated than this though either. We kind of have to start somewhere too.
>
Yes, the fix is too simple, and just a start, I think we'd better wait
more useful parts done before merging it.
>
>> Second, the repair way is wrong. In
>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>> match the ones in memory, the ones in memory are used to update the disk
>> fields. The question is how do you know these field in memory are
>> right(they may be the real corrupted ones)?
>
> Your second point (and the last part of your 1st point) makes a good
> argument for why this shouldn't happen automatically. Some of these
> corruptions might require a human to look at the log and decide what to do.
> Especially as you point out, where we might not know where the source of the
> corruption is. And if the human can't figure it out, then it's probably time
> to unmount and fsck.
The point is that the fix way is wrong, just flush memory info to disk
is not right. I agree online fsck is good feature, but need carefully
design, it should not involve more corruptions. A rough idea from mine
is that maybe we need some "frezee" mechanism in fs, which can hung all
fs op and let fs stop at a safe area. After freeze fs, we can do some
fsck work on it and these works should not cost lots time. What's your idea?

Thanks,
Junxiao.

>
> Thanks,
> --Mark
>
> --
> Mark Fasheh
>

2015-11-25 05:04:13

by Gang He

[permalink] [raw]
Subject: Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

Hi Mark and Junxiao,


>>>
> Hi Mark,
>
> On 11/25/2015 06:16 AM, Mark Fasheh wrote:
>> Hi Junxiao,
>>
>> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>>> Hi Gang,
>>>
>>> This is not like a right patch.
>>> First, online file check only checks inode's block number, valid flag,
>>> fs generation value, and meta ecc. I never see a real corruption
>>> happened only on this field, if these fields are corrupted, that means
>>> something bad may happen on other place. So fix this field may not help
>>> and even cause corruption more hard.
>>
>> I agree that these are rather uncommon, we might even consider removing the
>> VALID_FL fixup. I definitely don't think we're ready for anything more
>> complicated than this though either. We kind of have to start somewhere too.
>>
> Yes, the fix is too simple, and just a start, I think we'd better wait
> more useful parts done before merging it.
I agree, just remark VALID_FL flag to fix this field is too simple, we should delay this field fix before
I have a flawless solution, I will remove these lines code in the first version patches. In the future submits,
I also hope your guys to help review the code carefully, shout out your comments when you doubt somewhere.



>>
>>> Second, the repair way is wrong. In
>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>> match the ones in memory, the ones in memory are used to update the disk
>>> fields. The question is how do you know these field in memory are
>>> right(they may be the real corrupted ones)?
>>
>> Your second point (and the last part of your 1st point) makes a good
>> argument for why this shouldn't happen automatically. Some of these
>> corruptions might require a human to look at the log and decide what to do.
>> Especially as you point out, where we might not know where the source of the
>> corruption is. And if the human can't figure it out, then it's probably time
>> to unmount and fsck.
> The point is that the fix way is wrong, just flush memory info to disk
> is not right. I agree online fsck is good feature, but need carefully
> design, it should not involve more corruptions. A rough idea from mine
> is that maybe we need some "frezee" mechanism in fs, which can hung all
> fs op and let fs stop at a safe area. After freeze fs, we can do some
> fsck work on it and these works should not cost lots time. What's your idea?
If we need to touch some global data structures, freezing fs can be considered when we can't
get any way in case using the locks.
If we only handle some independent problem, we just need to lock the related data structures.

>
> Thanks,
> Junxiao.
>
>>
>> Thanks,
>> --Mark
>>
>> --
>> Mark Fasheh
>>

2015-11-25 05:45:11

by Junxiao Bi

[permalink] [raw]
Subject: Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check

On 11/25/2015 01:04 PM, Gang He wrote:
> Hi Mark and Junxiao,
>
>
>>>>
>> Hi Mark,
>>
>> On 11/25/2015 06:16 AM, Mark Fasheh wrote:
>>> Hi Junxiao,
>>>
>>> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>>>> Hi Gang,
>>>>
>>>> This is not like a right patch.
>>>> First, online file check only checks inode's block number, valid flag,
>>>> fs generation value, and meta ecc. I never see a real corruption
>>>> happened only on this field, if these fields are corrupted, that means
>>>> something bad may happen on other place. So fix this field may not help
>>>> and even cause corruption more hard.
>>>
>>> I agree that these are rather uncommon, we might even consider removing the
>>> VALID_FL fixup. I definitely don't think we're ready for anything more
>>> complicated than this though either. We kind of have to start somewhere too.
>>>
>> Yes, the fix is too simple, and just a start, I think we'd better wait
>> more useful parts done before merging it.
> I agree, just remark VALID_FL flag to fix this field is too simple, we should delay this field fix before
> I have a flawless solution, I will remove these lines code in the first version patches. In the future submits,
> I also hope your guys to help review the code carefully, shout out your comments when you doubt somewhere.
Sure.

>
>
>
>>>
>>>> Second, the repair way is wrong. In
>>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>>> match the ones in memory, the ones in memory are used to update the disk
>>>> fields. The question is how do you know these field in memory are
>>>> right(they may be the real corrupted ones)?
>>>
>>> Your second point (and the last part of your 1st point) makes a good
>>> argument for why this shouldn't happen automatically. Some of these
>>> corruptions might require a human to look at the log and decide what to do.
>>> Especially as you point out, where we might not know where the source of the
>>> corruption is. And if the human can't figure it out, then it's probably time
>>> to unmount and fsck.
>> The point is that the fix way is wrong, just flush memory info to disk
>> is not right. I agree online fsck is good feature, but need carefully
>> design, it should not involve more corruptions. A rough idea from mine
>> is that maybe we need some "frezee" mechanism in fs, which can hung all
>> fs op and let fs stop at a safe area. After freeze fs, we can do some
>> fsck work on it and these works should not cost lots time. What's your idea?
> If we need to touch some global data structures, freezing fs can be considered when we can't
> get any way in case using the locks.
> If we only handle some independent problem, we just need to lock the related data structures.
Hmm, I am not sure whether it's hard to decide an independent issue.

Thanks,
Junxiao.
>
>>
>> Thanks,
>> Junxiao.
>>
>>>
>>> Thanks,
>>> --Mark
>>>
>>> --
>>> Mark Fasheh
>>>
>