On Thu, Sep 21, 2023 at 07:24:23AM -0400, Mimi Zohar wrote:
> On Thu, 2023-09-21 at 06:32 -0400, Jeff Layton wrote:
> > On Wed, 2023-09-20 at 17:52 -0700, Casey Schaufler wrote:
> > > On 9/20/2023 5:10 PM, Stefan Berger wrote:
> > > >
> > > > On 9/20/23 18:09, Stefan Berger wrote:
> > > > >
> > > > > On 9/20/23 17:16, Jeff Layton wrote:
> > > > > > On Wed, 2023-09-20 at 16:37 -0400, Stefan Berger wrote:
> > > > > > > On 9/20/23 13:01, Stefan Berger wrote:
> > > > > > > > On 9/17/23 20:04, syzbot wrote:
> > > > > > > > > syzbot has bisected this issue to:
> > > > > > > > >
> > > > > > > > > commit db1d1e8b9867aae5c3e61ad7859abfcc4a6fd6c7
> > > > > > > > > Author: Jeff Layton <[email protected]>
> > > > > > > > > Date: Mon Apr 17 16:55:51 2023 +0000
> > > > > > > > >
> > > > > > > > > IMA: use vfs_getattr_nosec to get the i_version
> > > > > > > > >
> > > > > > > > > bisection log:
> > > > > > > > > https://syzkaller.appspot.com/x/bisect.txt?x=106f7e54680000
> > > > > > > > > start commit: a747acc0b752 Merge tag
> > > > > > > > > 'linux-kselftest-next-6.6-rc2'
> > > > > > > > > of g..
> > > > > > > > > git tree: upstream
> > > > > > > > > final oops:
> > > > > > > > > https://syzkaller.appspot.com/x/report.txt?x=126f7e54680000
> > > > > > > > > console output:
> > > > > > > > > https://syzkaller.appspot.com/x/log.txt?x=146f7e54680000
> > > > > > > > > kernel config:
> > > > > > > > > https://syzkaller.appspot.com/x/.config?x=df91a3034fe3f122
> > > > > > > > > dashboard link:
> > > > > > > > > https://syzkaller.appspot.com/bug?extid=a67fc5321ffb4b311c98
> > > > > > > > > syz repro:
> > > > > > > > > https://syzkaller.appspot.com/x/repro.syz?x=1671b694680000
> > > > > > > > > C reproducer:
> > > > > > > > > https://syzkaller.appspot.com/x/repro.c?x=14ec94d8680000
> > > > > > > > >
> > > > > > > > > Reported-by: [email protected]
> > > > > > > > > Fixes: db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the
> > > > > > > > > i_version")
> > > > > > > > >
> > > > > > > > > For information about bisection process see:
> > > > > > > > > https://goo.gl/tpsmEJ#bisection
> > > > > > > > The final oops shows this here:
> > > > > > > >
> > > > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000058
> > > > > > > > #PF: supervisor read access in kernel mode
> > > > > > > > #PF: error_code(0x0000) - not-present page
> > > > > > > > PGD 0 P4D 0
> > > > > > > > Oops: 0000 [#1] PREEMPT SMP
> > > > > > > > CPU: 0 PID: 3192 Comm: syz-executor.0 Not tainted
> > > > > > > > 6.4.0-rc2-syzkaller #0
> > > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > > > > > > > BIOS Google 08/04/2023
> > > > > > > > RIP: 0010:__lock_acquire+0x35/0x490 kernel/locking/lockdep.c:4946
> > > > > > > > Code: 83 ec 18 65 4c 8b 35 aa 60 f4 7e 83 3d b7 11 e4 02 00 0f 84 05
> > > > > > > > 02 00 00 4c 89 cb 89 cd 41 89 d5 49 89 ff 83 fe 01 77 0c 89 f0
> > > > > > > > <49> 8b
> > > > > > > > 44 c7 08 48 85 c0 75 1b 4c 89 ff 31 d2 45 89 c4 e8 74 f6 ff
> > > > > > > > RSP: 0018:ffffc90002edb840 EFLAGS: 00010097
> > > > > > > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
> > > > > > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000050
> > > > > > > > RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000000
> > > > > > > > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> > > > > > > > R13: 0000000000000000 R14: ffff888102ea5340 R15: 0000000000000050
> > > > > > > > FS: 0000000000000000(0000) GS:ffff88813bc00000(0000)
> > > > > > > > knlGS:0000000000000000
> > > > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > > > CR2: 0000000000000058 CR3: 0000000003aa8000 CR4: 00000000003506f0
> > > > > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > > > > Call Trace:
> > > > > > > > <TASK>
> > > > > > > > lock_acquire+0xd8/0x1f0 kernel/locking/lockdep.c:5691
> > > > > > > > seqcount_lockdep_reader_access include/linux/seqlock.h:102 [inline]
> > > > > > > > get_fs_root_rcu fs/d_path.c:243 [inline]
> > > > > > > > d_path+0xd1/0x1f0 fs/d_path.c:285
> > > > > > > > audit_log_d_path+0x65/0x130 kernel/audit.c:2139
> > > > > > > > dump_common_audit_data security/lsm_audit.c:224 [inline]
> > > > > > > > common_lsm_audit+0x3b3/0x840 security/lsm_audit.c:458
> > > > > > > > smack_log+0xad/0x130 security/smack/smack_access.c:383
> > > > > > > > smk_tskacc+0xb1/0xd0 security/smack/smack_access.c:253
> > > > > > > > smack_inode_getattr+0x8a/0xb0 security/smack/smack_lsm.c:1187
> > > > > > > > security_inode_getattr+0x32/0x50 security/security.c:2114
> > > > > > > > vfs_getattr+0x1b/0x40 fs/stat.c:167
> > > > > > > > ovl_getattr+0xa6/0x3e0 fs/overlayfs/inode.c:173
> > > > > > > > ima_check_last_writer security/integrity/ima/ima_main.c:171
> > > > > > > > [inline]
> > > > > > > > ima_file_free+0xbd/0x130 security/integrity/ima/ima_main.c:203
> > > > > > > > __fput+0xc7/0x220 fs/file_table.c:315
> > > > > > > > task_work_run+0x7d/0xa0 kernel/task_work.c:179
> > > > > > > > exit_task_work include/linux/task_work.h:38 [inline]
> > > > > > > > do_exit+0x2c7/0xa80 kernel/exit.c:871 <-----------------------
> > > > > > > > do_group_exit+0x85/0xa0 kernel/exit.c:1021
> > > > > > > > get_signal+0x73c/0x7f0 kernel/signal.c:2874
> > > > > > > > arch_do_signal_or_restart+0x89/0x290 arch/x86/kernel/signal.c:306
> > > > > > > > exit_to_user_mode_loop+0x61/0xb0 kernel/entry/common.c:168
> > > > > > > > exit_to_user_mode_prepare+0x64/0xb0 kernel/entry/common.c:204
> > > > > > > > __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
> > > > > > > > syscall_exit_to_user_mode+0x2b/0x1d0 kernel/entry/common.c:297
> > > > > > > > do_syscall_64+0x4d/0x90 arch/x86/entry/common.c:86
> > > > > > > > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > > > > > >
> > > > > > > >
> > > > > > > > do_exit has called exit_fs(tsk) [
> > > > > > > > https://elixir.bootlin.com/linux/v6.4-rc2/source/kernel/exit.c#L867 ]
> > > > > > > >
> > > > > > > > exit_fs(tsk) has set tsk->fs = NULL [
> > > > > > > > https://elixir.bootlin.com/linux/v6.4-rc2/source/fs/fs_struct.c#L103
> > > > > > > > ]
> > > > > > > >
> > > > > > > > I think this then bites in d_path() where it calls:
> > > > > > > >
> > > > > > > > get_fs_root_rcu(current->fs, &root); [
> > > > > > > > https://elixir.bootlin.com/linux/v6.4-rc2/source/fs/d_path.c#L285 ]
> > > > > > > >
> > > > > > > > current->fs is likely NULL here.
> > > > > > > >
> > > > > > > > If this was correct it would have nothing to do with the actual
> > > > > > > > patch,
> > > > > > > > though, but rather with the fact that smack logs on process
> > > > > > > > termination. I am not sure what the solution would be other than
> > > > > > > > testing for current->fs == NULL in d_path before using it and
> > > > > > > > returning an error that is not normally returned or trying to
> > > > > > > > intercept this case in smack.
> > > > > > > I have now been able to recreate the syzbot issue with the test
> > > > > > > program
> > > > > > > and the issue is exactly the one described here, current->fs == NULL.
> > > > > > >
> > > > > > Earlier in this thread, Amir had a diagnosis that IMA is
> > > > > > inappropriately
> > > > > > trying to use f_path directly instead of using the helpers that are
> > > > > > friendly for stacking filesystems.
> > > > > >
> > > > > > https://lore.kernel.org/linux-fsdevel/CAOQ4uxgjnYyeQL-LbS5kQ7+C0d6sjzKqMDWAtZW8cAkPaed6=Q@mail.gmail.com/
> > > > > >
> > > > > >
> > > > > > I'm not an IMA hacker so I'm not planning to roll a fix here. Perhaps
> > > > > > someone on the IMA team could try this approach?
> > > > >
> > > > >
> > > > > I have applied this patch here from Amir now and it does NOT resolve
> > > > > the issue:
> > > > >
> > > > > https://lore.kernel.org/linux-integrity/[email protected]/T/#m4ebdb780bf6952e7f210c55e87950d0cfa1d5eb0
> > > > >
> > > > >
> > > >
> > > > This seems to resolve the issue:
> > > >
> > > > diff --git a/security/smack/smack_access.c
> > > > b/security/smack/smack_access.c
> > > > index 585e5e35710b..57afcea1e39b 100644
> > > > --- a/security/smack/smack_access.c
> > > > +++ b/security/smack/smack_access.c
> > > > @@ -347,6 +347,9 @@ void smack_log(char *subject_label, char
> > > > *object_label, int request,
> > > > struct smack_audit_data *sad;
> > > > struct common_audit_data *a = &ad->a;
> > > >
> > > > + if (current->flags & PF_EXITING)
> > > > + return;
> > > > +
> > >
> > > Based on what I see here I can understand that this prevents the panic,
> > > but it isn't so clear what changed that introduced the problem.
> > >
> > > > /* check if we have to log the current event */
> > > > if (result < 0 && (log_policy & SMACK_AUDIT_DENIED) == 0)
> > > > return;
> > > >
> > > >
> >
> > Apparently, it's this patch:
> >
> > db1d1e8b9867 IMA: use vfs_getattr_nosec to get the i_version
>
> Yes, the syzbot was updated with that info.
>
> > At one time, IMA would reach directly into the inode to get the
> > i_version and ctime. That was fine for certain filesystems, but with
> > more recent changes it needs to go through ->getattr instead. Evidently,
> > it's selecting the wrong inode to query when dealing with overlayfs and
> > that's causing panics at times.
> >
> > As to why the above patch helps, I'm not sure, but given that it doesn't
> > seem to change which inode is being queried via getattr, it seems like
> > this is probably papering over the real bug. That said, IMA and
> > overlayfs are not really in my wheelhouse, so I could be very wrong
> > here.
>
> The call to vfs_getattr_nosec() somehow triggers a call to
> security_inode_getattr(). Without the call neither ovl_getattr() nor
> smack_inode_getattr() would be called.
ima_file_free()
-> ima_check_last_writer()
-> vfs_getattr_nosec()
-> i_op->getattr() == ovl_getattr()
-> vfs_getattr()
-> security_inode_getattr()
-> real_i_op->getattr()
is the callchain that triggers this.
ima_file_free() is called in a very sensitive location: __fput() that
can be called from task work when the process is already PF_EXITING.
The ideal solution would be for ima to stop calling back into the
filesystems in this location at all but that's probably not going to
happen because I now realize you also set extended attributes from
__fput():
ima_check_last_writer()
-> ima_update_xatt()
-> ima_fix_xattr()
-> __vfs_setxattr_noperm()
The __vfs_setxattr_noperm() codepath can itself trigger
security_inode_post_setxattr() and security_inode_setsecurity(). So
those hooks are hopefully safe to be called with PF_EXITING tasks as
well...
Imho, this is all very wild but I'm not judging.
Two solutions imho:
(1) teach stacking filesystems like overlayfs and ecryptfs to use
vfs_getattr_nosec() in their ->getattr() implementation when they
are themselves called via vfs_getattr_nosec(). This will fix this by
not triggering another LSM hook.
(2) make all ->getattr() LSM hooks PF_EXITING safe ideally don't do
anything
On 9/21/23 07:48, Christian Brauner wrote:
> On Thu, Sep 21, 2023 at 07:24:23AM -0400, Mimi Zohar wrote:
>> On Thu, 2023-09-21 at 06:32 -0400, Jeff Layton wrote:
>>> On Wed, 2023-09-20 at 17:52 -0700, Casey Schaufler wrote:
>>>> On 9/20/2023 5:10 PM, Stefan Berger wrote:
>>>>> On 9/20/23 18:09, Stefan Berger wrote:
>>>>>> On 9/20/23 17:16, Jeff Layton wrote:
>>>>>>> On Wed, 2023-09-20 at 16:37 -0400, Stefan Berger wrote:
>>>>>>>> On 9/20/23 13:01, Stefan Berger wrote:
>>>>>>>>> On 9/17/23 20:04, syzbot wrote:
>>>>>>>>>> syzbot has bisected this issue to:
>>>>>>>>>>
>>>>>>>>>> commit db1d1e8b9867aae5c3e61ad7859abfcc4a6fd6c7
>>>>>>>>>> Author: Jeff Layton <[email protected]>
>>>>>>>>>> Date: Mon Apr 17 16:55:51 2023 +0000
>>>>>>>>>>
>>>>>>>>>> IMA: use vfs_getattr_nosec to get the i_version
>>>>>>>>>>
>>>>>>>>>> bisection log:
>>>>>>>>>> https://syzkaller.appspot.com/x/bisect.txt?x=106f7e54680000
>>>>>>>>>> start commit: a747acc0b752 Merge tag
>>>>>>>>>> 'linux-kselftest-next-6.6-rc2'
>>>>>>>>>> of g..
>>>>>>>>>> git tree: upstream
>>>>>>>>>> final oops:
>>>>>>>>>> https://syzkaller.appspot.com/x/report.txt?x=126f7e54680000
>>>>>>>>>> console output:
>>>>>>>>>> https://syzkaller.appspot.com/x/log.txt?x=146f7e54680000
>>>>>>>>>> kernel config:
>>>>>>>>>> https://syzkaller.appspot.com/x/.config?x=df91a3034fe3f122
>>>>>>>>>> dashboard link:
>>>>>>>>>> https://syzkaller.appspot.com/bug?extid=a67fc5321ffb4b311c98
>>>>>>>>>> syz repro:
>>>>>>>>>> https://syzkaller.appspot.com/x/repro.syz?x=1671b694680000
>>>>>>>>>> C reproducer:
>>>>>>>>>> https://syzkaller.appspot.com/x/repro.c?x=14ec94d8680000
>>>>>>>>>>
>>>>>>>>>> Reported-by: [email protected]
>>>>>>>>>> Fixes: db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the
>>>>>>>>>> i_version")
>>>>>>>>>>
>>>>>>>>>> For information about bisection process see:
>>>>>>>>>> https://goo.gl/tpsmEJ#bisection
>>>>>>>>> The final oops shows this here:
>>>>>>>>>
>>>>>>>>> BUG: kernel NULL pointer dereference, address: 0000000000000058
>>>>>>>>> #PF: supervisor read access in kernel mode
>>>>>>>>> #PF: error_code(0x0000) - not-present page
>>>>>>>>> PGD 0 P4D 0
>>>>>>>>> Oops: 0000 [#1] PREEMPT SMP
>>>>>>>>> CPU: 0 PID: 3192 Comm: syz-executor.0 Not tainted
>>>>>>>>> 6.4.0-rc2-syzkaller #0
>>>>>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>>>>>>>>> BIOS Google 08/04/2023
>>>>>>>>> RIP: 0010:__lock_acquire+0x35/0x490 kernel/locking/lockdep.c:4946
>>>>>>>>> Code: 83 ec 18 65 4c 8b 35 aa 60 f4 7e 83 3d b7 11 e4 02 00 0f 84 05
>>>>>>>>> 02 00 00 4c 89 cb 89 cd 41 89 d5 49 89 ff 83 fe 01 77 0c 89 f0
>>>>>>>>> <49> 8b
>>>>>>>>> 44 c7 08 48 85 c0 75 1b 4c 89 ff 31 d2 45 89 c4 e8 74 f6 ff
>>>>>>>>> RSP: 0018:ffffc90002edb840 EFLAGS: 00010097
>>>>>>>>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
>>>>>>>>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000050
>>>>>>>>> RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000000
>>>>>>>>> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
>>>>>>>>> R13: 0000000000000000 R14: ffff888102ea5340 R15: 0000000000000050
>>>>>>>>> FS: 0000000000000000(0000) GS:ffff88813bc00000(0000)
>>>>>>>>> knlGS:0000000000000000
>>>>>>>>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>>> CR2: 0000000000000058 CR3: 0000000003aa8000 CR4: 00000000003506f0
>>>>>>>>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>>>>>>>>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>>>>>>>>> Call Trace:
>>>>>>>>> <TASK>
>>>>>>>>> lock_acquire+0xd8/0x1f0 kernel/locking/lockdep.c:5691
>>>>>>>>> seqcount_lockdep_reader_access include/linux/seqlock.h:102 [inline]
>>>>>>>>> get_fs_root_rcu fs/d_path.c:243 [inline]
>>>>>>>>> d_path+0xd1/0x1f0 fs/d_path.c:285
>>>>>>>>> audit_log_d_path+0x65/0x130 kernel/audit.c:2139
>>>>>>>>> dump_common_audit_data security/lsm_audit.c:224 [inline]
>>>>>>>>> common_lsm_audit+0x3b3/0x840 security/lsm_audit.c:458
>>>>>>>>> smack_log+0xad/0x130 security/smack/smack_access.c:383
>>>>>>>>> smk_tskacc+0xb1/0xd0 security/smack/smack_access.c:253
>>>>>>>>> smack_inode_getattr+0x8a/0xb0 security/smack/smack_lsm.c:1187
>>>>>>>>> security_inode_getattr+0x32/0x50 security/security.c:2114
>>>>>>>>> vfs_getattr+0x1b/0x40 fs/stat.c:167
>>>>>>>>> ovl_getattr+0xa6/0x3e0 fs/overlayfs/inode.c:173
>>>>>>>>> ima_check_last_writer security/integrity/ima/ima_main.c:171
>>>>>>>>> [inline]
>>>>>>>>> ima_file_free+0xbd/0x130 security/integrity/ima/ima_main.c:203
>>>>>>>>> __fput+0xc7/0x220 fs/file_table.c:315
>>>>>>>>> task_work_run+0x7d/0xa0 kernel/task_work.c:179
>>>>>>>>> exit_task_work include/linux/task_work.h:38 [inline]
>>>>>>>>> do_exit+0x2c7/0xa80 kernel/exit.c:871 <-----------------------
>>>>>>>>> do_group_exit+0x85/0xa0 kernel/exit.c:1021
>>>>>>>>> get_signal+0x73c/0x7f0 kernel/signal.c:2874
>>>>>>>>> arch_do_signal_or_restart+0x89/0x290 arch/x86/kernel/signal.c:306
>>>>>>>>> exit_to_user_mode_loop+0x61/0xb0 kernel/entry/common.c:168
>>>>>>>>> exit_to_user_mode_prepare+0x64/0xb0 kernel/entry/common.c:204
>>>>>>>>> __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
>>>>>>>>> syscall_exit_to_user_mode+0x2b/0x1d0 kernel/entry/common.c:297
>>>>>>>>> do_syscall_64+0x4d/0x90 arch/x86/entry/common.c:86
>>>>>>>>> entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> do_exit has called exit_fs(tsk) [
>>>>>>>>> https://elixir.bootlin.com/linux/v6.4-rc2/source/kernel/exit.c#L867 ]
>>>>>>>>>
>>>>>>>>> exit_fs(tsk) has set tsk->fs = NULL [
>>>>>>>>> https://elixir.bootlin.com/linux/v6.4-rc2/source/fs/fs_struct.c#L103
>>>>>>>>> ]
>>>>>>>>>
>>>>>>>>> I think this then bites in d_path() where it calls:
>>>>>>>>>
>>>>>>>>> get_fs_root_rcu(current->fs, &root); [
>>>>>>>>> https://elixir.bootlin.com/linux/v6.4-rc2/source/fs/d_path.c#L285 ]
>>>>>>>>>
>>>>>>>>> current->fs is likely NULL here.
>>>>>>>>>
>>>>>>>>> If this was correct it would have nothing to do with the actual
>>>>>>>>> patch,
>>>>>>>>> though, but rather with the fact that smack logs on process
>>>>>>>>> termination. I am not sure what the solution would be other than
>>>>>>>>> testing for current->fs == NULL in d_path before using it and
>>>>>>>>> returning an error that is not normally returned or trying to
>>>>>>>>> intercept this case in smack.
>>>>>>>> I have now been able to recreate the syzbot issue with the test
>>>>>>>> program
>>>>>>>> and the issue is exactly the one described here, current->fs == NULL.
>>>>>>>>
>>>>>>> Earlier in this thread, Amir had a diagnosis that IMA is
>>>>>>> inappropriately
>>>>>>> trying to use f_path directly instead of using the helpers that are
>>>>>>> friendly for stacking filesystems.
>>>>>>>
>>>>>>> https://lore.kernel.org/linux-fsdevel/CAOQ4uxgjnYyeQL-LbS5kQ7+C0d6sjzKqMDWAtZW8cAkPaed6=Q@mail.gmail.com/
>>>>>>>
>>>>>>>
>>>>>>> I'm not an IMA hacker so I'm not planning to roll a fix here. Perhaps
>>>>>>> someone on the IMA team could try this approach?
>>>>>>
>>>>>> I have applied this patch here from Amir now and it does NOT resolve
>>>>>> the issue:
>>>>>>
>>>>>> https://lore.kernel.org/linux-integrity/[email protected]/T/#m4ebdb780bf6952e7f210c55e87950d0cfa1d5eb0
>>>>>>
>>>>>>
>>>>> This seems to resolve the issue:
>>>>>
>>>>> diff --git a/security/smack/smack_access.c
>>>>> b/security/smack/smack_access.c
>>>>> index 585e5e35710b..57afcea1e39b 100644
>>>>> --- a/security/smack/smack_access.c
>>>>> +++ b/security/smack/smack_access.c
>>>>> @@ -347,6 +347,9 @@ void smack_log(char *subject_label, char
>>>>> *object_label, int request,
>>>>> struct smack_audit_data *sad;
>>>>> struct common_audit_data *a = &ad->a;
>>>>>
>>>>> + if (current->flags & PF_EXITING)
>>>>> + return;
>>>>> +
>>>> Based on what I see here I can understand that this prevents the panic,
>>>> but it isn't so clear what changed that introduced the problem.
>>>>
>>>>> /* check if we have to log the current event */
>>>>> if (result < 0 && (log_policy & SMACK_AUDIT_DENIED) == 0)
>>>>> return;
>>>>>
>>>>>
>>> Apparently, it's this patch:
>>>
>>> db1d1e8b9867 IMA: use vfs_getattr_nosec to get the i_version
>> Yes, the syzbot was updated with that info.
>>
>>> At one time, IMA would reach directly into the inode to get the
>>> i_version and ctime. That was fine for certain filesystems, but with
>>> more recent changes it needs to go through ->getattr instead. Evidently,
>>> it's selecting the wrong inode to query when dealing with overlayfs and
>>> that's causing panics at times.
>>>
>>> As to why the above patch helps, I'm not sure, but given that it doesn't
>>> seem to change which inode is being queried via getattr, it seems like
>>> this is probably papering over the real bug. That said, IMA and
>>> overlayfs are not really in my wheelhouse, so I could be very wrong
>>> here.
>> The call to vfs_getattr_nosec() somehow triggers a call to
>> security_inode_getattr(). Without the call neither ovl_getattr() nor
>> smack_inode_getattr() would be called.
> ima_file_free()
> -> ima_check_last_writer()
> -> vfs_getattr_nosec()
> -> i_op->getattr() == ovl_getattr()
> -> vfs_getattr()
> -> security_inode_getattr()
> -> real_i_op->getattr()
>
> is the callchain that triggers this.
>
> ima_file_free() is called in a very sensitive location: __fput() that
> can be called from task work when the process is already PF_EXITING.
>
> The ideal solution would be for ima to stop calling back into the
> filesystems in this location at all but that's probably not going to
> happen because I now realize you also set extended attributes from
> __fput():
>
>
> ima_check_last_writer()
> -> ima_update_xatt()
> -> ima_fix_xattr()
> -> __vfs_setxattr_noperm()
>
> The __vfs_setxattr_noperm() codepath can itself trigger
> security_inode_post_setxattr() and security_inode_setsecurity(). So
> those hooks are hopefully safe to be called with PF_EXITING tasks as
> well...
LSM inode_post_setxattr has two users, Smack and SELinux. Smack does not
call smack_log/audit in this case. SELinux seems safe as well.
LSM inode_setsecurity has two users, Smack and SELinux. Smack does not
call smack_log/audit in this case. SELinux seems safe as well.
>
> Imho, this is all very wild but I'm not judging.
>
> Two solutions imho:
> (1) teach stacking filesystems like overlayfs and ecryptfs to use
> vfs_getattr_nosec() in their ->getattr() implementation when they
> are themselves called via vfs_getattr_nosec(). This will fix this by
> not triggering another LSM hook.
> (2) make all ->getattr() LSM hooks PF_EXITING safe ideally don't do
> anything
Re (2): LSM's getattr should still make policy decision, right? So
callers should be allowed to go to some depth into these functions but
avoid calls deeper into smack_log (for example) that trigger the call path
common_lsm_audit
-> dump_common_audit_data
-> audit_log_d_path via a->type LSM_AUDIT_DATA_PATH or
LSM_AUDIT_DATA_FILE or LSM_AUDIT_DATA_IOCTL_OP and then possibly run
into current->fs == NULL in d_path.
To avoid audit_log_d_path being called the a->type should not be either
one of the 3 mentioned above:
LSM_AUDIT_DATA_PATH: used by Smack & SELinux
LSM_AUDIT_DATA_FILE: used by SELinux
LSM_AUDIT_DATA_IOCTL: used by SELinux
The LSM getattr has users AppArmor, SELinux, Smack and Tomoyo.
Tomoyo: seems safe
SELinux:
selinux_inode_getattr
-> path_has_perm [ sets LSM_AUDIT_DATA_FILE !! ]
-> inode_has_perm
-> avc_has_perm
-> avc_audit
-> slow_avc_audit
-> common_lsm_audit
-> dump_common_audit_data
-> audit_log_d_path
-> d_path
-> get_fs_root_rcu with current->fs = NULL
Smack: (the known path per this syzbot issue)
smack_inode_getattr [ sets LSM_AUDIT_DATA_PATH !! ]
-> smk_curacc
-> smk_tskacc
-> smack_log
-> common_lsm_audit
-> dump_common_audit_data
-> audit_log_d_path
-> d_path
-> get_fs_root_rcu with current->fs = NULL
AppArmor:
apparmor_inode_getattr
-> common_perm_cond
-> common_perm
-> aa_path_perm
-> profile_path_perm
-> aa_path_perm
-> aa_audit_file [ sets LSM_AUDIT_DATA_TASK ]
-> aa_audit
-> aa_audit_msg
-> common_lsm_audit
-> dump_common_audit_data
-> DOES NOT call audit_log_d_path but calls
task_tgid_nr(tsk)
So, SELinux and Smack would be affected. The common code path starts
with common_lsm_audit and either a check for current->flags & PF_EXITING
or a more fine-grained one like this here could be a solution for (2)
diff --git a/security/lsm_audit.c b/security/lsm_audit.c
index 849e832719e2..4f3570322851 100644
--- a/security/lsm_audit.c
+++ b/security/lsm_audit.c
@@ -445,6 +445,18 @@ void common_lsm_audit(struct common_audit_data *a,
if (a == NULL)
return;
+
+ if (current->flags & PF_EXITING) {
+ /*
+ * Avoid running into audit_log_d_path -> d_path
+ * -> get_fs_root_rcu with current->fs = NULL
+ */
+ if (a->type == LSM_AUDIT_DATA_PATH ||
+ a->type == LSM_AUDIT_DATA_FILE ||
+ a->type == LSM_AUDIT_DATA_IOCTL_OP)
+ return;
+ }
+
/* we use GFP_ATOMIC so we won't sleep */
ab = audit_log_start(audit_context(), GFP_ATOMIC | __GFP_NOWARN,
AUDIT_AVC);
On Thu, 2023-09-21 at 13:48 +0200, Christian Brauner wrote:
> On Thu, Sep 21, 2023 at 07:24:23AM -0400, Mimi Zohar wrote:
> > On Thu, 2023-09-21 at 06:32 -0400, Jeff Layton wrote:
> > > On Wed, 2023-09-20 at 17:52 -0700, Casey Schaufler wrote:
> > > > On 9/20/2023 5:10 PM, Stefan Berger wrote:
> > > > >
> > > > > On 9/20/23 18:09, Stefan Berger wrote:
> > > > > >
> > > > > > On 9/20/23 17:16, Jeff Layton wrote:
> > > > > > > On Wed, 2023-09-20 at 16:37 -0400, Stefan Berger wrote:
> > > > > > > > On 9/20/23 13:01, Stefan Berger wrote:
> > > > > > > > > On 9/17/23 20:04, syzbot wrote:
> > > > > > > > > > syzbot has bisected this issue to:
> > > > > > > > > >
> > > > > > > > > > commit db1d1e8b9867aae5c3e61ad7859abfcc4a6fd6c7
> > > > > > > > > > Author: Jeff Layton <[email protected]>
> > > > > > > > > > Date: Mon Apr 17 16:55:51 2023 +0000
> > > > > > > > > >
> > > > > > > > > > IMA: use vfs_getattr_nosec to get the i_version
> > > > > > > > > >
> > > > > > > > > > bisection log:
> > > > > > > > > > https://syzkaller.appspot.com/x/bisect.txt?x=106f7e54680000
> > > > > > > > > > start commit: a747acc0b752 Merge tag
> > > > > > > > > > 'linux-kselftest-next-6.6-rc2'
> > > > > > > > > > of g..
> > > > > > > > > > git tree: upstream
> > > > > > > > > > final oops:
> > > > > > > > > > https://syzkaller.appspot.com/x/report.txt?x=126f7e54680000
> > > > > > > > > > console output:
> > > > > > > > > > https://syzkaller.appspot.com/x/log.txt?x=146f7e54680000
> > > > > > > > > > kernel config:
> > > > > > > > > > https://syzkaller.appspot.com/x/.config?x=df91a3034fe3f122
> > > > > > > > > > dashboard link:
> > > > > > > > > > https://syzkaller.appspot.com/bug?extid=a67fc5321ffb4b311c98
> > > > > > > > > > syz repro:
> > > > > > > > > > https://syzkaller.appspot.com/x/repro.syz?x=1671b694680000
> > > > > > > > > > C reproducer:
> > > > > > > > > > https://syzkaller.appspot.com/x/repro.c?x=14ec94d8680000
> > > > > > > > > >
> > > > > > > > > > Reported-by: [email protected]
> > > > > > > > > > Fixes: db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the
> > > > > > > > > > i_version")
> > > > > > > > > >
> > > > > > > > > > For information about bisection process see:
> > > > > > > > > > https://goo.gl/tpsmEJ#bisection
> > > > > > > > > The final oops shows this here:
> > > > > > > > >
> > > > > > > > > BUG: kernel NULL pointer dereference, address: 0000000000000058
> > > > > > > > > #PF: supervisor read access in kernel mode
> > > > > > > > > #PF: error_code(0x0000) - not-present page
> > > > > > > > > PGD 0 P4D 0
> > > > > > > > > Oops: 0000 [#1] PREEMPT SMP
> > > > > > > > > CPU: 0 PID: 3192 Comm: syz-executor.0 Not tainted
> > > > > > > > > 6.4.0-rc2-syzkaller #0
> > > > > > > > > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > > > > > > > > BIOS Google 08/04/2023
> > > > > > > > > RIP: 0010:__lock_acquire+0x35/0x490 kernel/locking/lockdep.c:4946
> > > > > > > > > Code: 83 ec 18 65 4c 8b 35 aa 60 f4 7e 83 3d b7 11 e4 02 00 0f 84 05
> > > > > > > > > 02 00 00 4c 89 cb 89 cd 41 89 d5 49 89 ff 83 fe 01 77 0c 89 f0
> > > > > > > > > <49> 8b
> > > > > > > > > 44 c7 08 48 85 c0 75 1b 4c 89 ff 31 d2 45 89 c4 e8 74 f6 ff
> > > > > > > > > RSP: 0018:ffffc90002edb840 EFLAGS: 00010097
> > > > > > > > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
> > > > > > > > > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000050
> > > > > > > > > RBP: 0000000000000002 R08: 0000000000000001 R09: 0000000000000000
> > > > > > > > > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> > > > > > > > > R13: 0000000000000000 R14: ffff888102ea5340 R15: 0000000000000050
> > > > > > > > > FS: 0000000000000000(0000) GS:ffff88813bc00000(0000)
> > > > > > > > > knlGS:0000000000000000
> > > > > > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > > > > > CR2: 0000000000000058 CR3: 0000000003aa8000 CR4: 00000000003506f0
> > > > > > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > > > > > > Call Trace:
> > > > > > > > > <TASK>
> > > > > > > > > lock_acquire+0xd8/0x1f0 kernel/locking/lockdep.c:5691
> > > > > > > > > seqcount_lockdep_reader_access include/linux/seqlock.h:102 [inline]
> > > > > > > > > get_fs_root_rcu fs/d_path.c:243 [inline]
> > > > > > > > > d_path+0xd1/0x1f0 fs/d_path.c:285
> > > > > > > > > audit_log_d_path+0x65/0x130 kernel/audit.c:2139
> > > > > > > > > dump_common_audit_data security/lsm_audit.c:224 [inline]
> > > > > > > > > common_lsm_audit+0x3b3/0x840 security/lsm_audit.c:458
> > > > > > > > > smack_log+0xad/0x130 security/smack/smack_access.c:383
> > > > > > > > > smk_tskacc+0xb1/0xd0 security/smack/smack_access.c:253
> > > > > > > > > smack_inode_getattr+0x8a/0xb0 security/smack/smack_lsm.c:1187
> > > > > > > > > security_inode_getattr+0x32/0x50 security/security.c:2114
> > > > > > > > > vfs_getattr+0x1b/0x40 fs/stat.c:167
> > > > > > > > > ovl_getattr+0xa6/0x3e0 fs/overlayfs/inode.c:173
> > > > > > > > > ima_check_last_writer security/integrity/ima/ima_main.c:171
> > > > > > > > > [inline]
> > > > > > > > > ima_file_free+0xbd/0x130 security/integrity/ima/ima_main.c:203
> > > > > > > > > __fput+0xc7/0x220 fs/file_table.c:315
> > > > > > > > > task_work_run+0x7d/0xa0 kernel/task_work.c:179
> > > > > > > > > exit_task_work include/linux/task_work.h:38 [inline]
> > > > > > > > > do_exit+0x2c7/0xa80 kernel/exit.c:871 <-----------------------
> > > > > > > > > do_group_exit+0x85/0xa0 kernel/exit.c:1021
> > > > > > > > > get_signal+0x73c/0x7f0 kernel/signal.c:2874
> > > > > > > > > arch_do_signal_or_restart+0x89/0x290 arch/x86/kernel/signal.c:306
> > > > > > > > > exit_to_user_mode_loop+0x61/0xb0 kernel/entry/common.c:168
> > > > > > > > > exit_to_user_mode_prepare+0x64/0xb0 kernel/entry/common.c:204
> > > > > > > > > __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
> > > > > > > > > syscall_exit_to_user_mode+0x2b/0x1d0 kernel/entry/common.c:297
> > > > > > > > > do_syscall_64+0x4d/0x90 arch/x86/entry/common.c:86
> > > > > > > > > entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > do_exit has called exit_fs(tsk) [
> > > > > > > > > https://elixir.bootlin.com/linux/v6.4-rc2/source/kernel/exit.c#L867 ]
> > > > > > > > >
> > > > > > > > > exit_fs(tsk) has set tsk->fs = NULL [
> > > > > > > > > https://elixir.bootlin.com/linux/v6.4-rc2/source/fs/fs_struct.c#L103
> > > > > > > > > ]
> > > > > > > > >
> > > > > > > > > I think this then bites in d_path() where it calls:
> > > > > > > > >
> > > > > > > > > get_fs_root_rcu(current->fs, &root); [
> > > > > > > > > https://elixir.bootlin.com/linux/v6.4-rc2/source/fs/d_path.c#L285 ]
> > > > > > > > >
> > > > > > > > > current->fs is likely NULL here.
> > > > > > > > >
> > > > > > > > > If this was correct it would have nothing to do with the actual
> > > > > > > > > patch,
> > > > > > > > > though, but rather with the fact that smack logs on process
> > > > > > > > > termination. I am not sure what the solution would be other than
> > > > > > > > > testing for current->fs == NULL in d_path before using it and
> > > > > > > > > returning an error that is not normally returned or trying to
> > > > > > > > > intercept this case in smack.
> > > > > > > > I have now been able to recreate the syzbot issue with the test
> > > > > > > > program
> > > > > > > > and the issue is exactly the one described here, current->fs == NULL.
> > > > > > > >
> > > > > > > Earlier in this thread, Amir had a diagnosis that IMA is
> > > > > > > inappropriately
> > > > > > > trying to use f_path directly instead of using the helpers that are
> > > > > > > friendly for stacking filesystems.
> > > > > > >
> > > > > > > https://lore.kernel.org/linux-fsdevel/CAOQ4uxgjnYyeQL-LbS5kQ7+C0d6sjzKqMDWAtZW8cAkPaed6=Q@mail.gmail.com/
> > > > > > >
> > > > > > >
> > > > > > > I'm not an IMA hacker so I'm not planning to roll a fix here. Perhaps
> > > > > > > someone on the IMA team could try this approach?
> > > > > >
> > > > > >
> > > > > > I have applied this patch here from Amir now and it does NOT resolve
> > > > > > the issue:
> > > > > >
> > > > > > https://lore.kernel.org/linux-integrity/[email protected]/T/#m4ebdb780bf6952e7f210c55e87950d0cfa1d5eb0
> > > > > >
> > > > > >
> > > > >
> > > > > This seems to resolve the issue:
> > > > >
> > > > > diff --git a/security/smack/smack_access.c
> > > > > b/security/smack/smack_access.c
> > > > > index 585e5e35710b..57afcea1e39b 100644
> > > > > --- a/security/smack/smack_access.c
> > > > > +++ b/security/smack/smack_access.c
> > > > > @@ -347,6 +347,9 @@ void smack_log(char *subject_label, char
> > > > > *object_label, int request,
> > > > > struct smack_audit_data *sad;
> > > > > struct common_audit_data *a = &ad->a;
> > > > >
> > > > > + if (current->flags & PF_EXITING)
> > > > > + return;
> > > > > +
> > > >
> > > > Based on what I see here I can understand that this prevents the panic,
> > > > but it isn't so clear what changed that introduced the problem.
> > > >
> > > > > /* check if we have to log the current event */
> > > > > if (result < 0 && (log_policy & SMACK_AUDIT_DENIED) == 0)
> > > > > return;
> > > > >
> > > > >
> > >
> > > Apparently, it's this patch:
> > >
> > > db1d1e8b9867 IMA: use vfs_getattr_nosec to get the i_version
> >
> > Yes, the syzbot was updated with that info.
> >
> > > At one time, IMA would reach directly into the inode to get the
> > > i_version and ctime. That was fine for certain filesystems, but with
> > > more recent changes it needs to go through ->getattr instead. Evidently,
> > > it's selecting the wrong inode to query when dealing with overlayfs and
> > > that's causing panics at times.
> > >
> > > As to why the above patch helps, I'm not sure, but given that it doesn't
> > > seem to change which inode is being queried via getattr, it seems like
> > > this is probably papering over the real bug. That said, IMA and
> > > overlayfs are not really in my wheelhouse, so I could be very wrong
> > > here.
> >
> > The call to vfs_getattr_nosec() somehow triggers a call to
> > security_inode_getattr(). Without the call neither ovl_getattr() nor
> > smack_inode_getattr() would be called.
>
> ima_file_free()
> -> ima_check_last_writer()
> -> vfs_getattr_nosec()
> -> i_op->getattr() == ovl_getattr()
> -> vfs_getattr()
> -> security_inode_getattr()
> -> real_i_op->getattr()
>
> is the callchain that triggers this.
Thank you for the explanation as to why ovl_getattr() and subsequently
smack_inode_getattr() is being called.
>
> ima_file_free() is called in a very sensitive location: __fput() that
> can be called from task work when the process is already PF_EXITING.
>
> The ideal solution would be for ima to stop calling back into the
> filesystems in this location at all but that's probably not going to
> happen because I now realize you also set extended attributes from
> __fput():
>
>
> ima_check_last_writer()
> -> ima_update_xatt()
> -> ima_fix_xattr()
> -> __vfs_setxattr_noperm()
>
> The __vfs_setxattr_noperm() codepath can itself trigger
> security_inode_post_setxattr() and security_inode_setsecurity(). So
> those hooks are hopefully safe to be called with PF_EXITING tasks as
> well...
>
> Imho, this is all very wild but I'm not judging.
Measuring and verifying immutable files is straight forward.
Measuring, verifiying, and updating mutable file hashes is a lot more
complicated. Re-calculating the file hash everytime the file changes
would impact performance. The file hash is currently updated as the
last writer closes the file (__fput). One of the reasons for the wq
was for IMA to safely calculate the file hash and and take the i_mutex
to write the xattr.
IMA support for mutable files makes IMA a lot more complicated. Any
improvement suggestions would be appreciated.
>
> Two solutions imho:
> (1) teach stacking filesystems like overlayfs and ecryptfs to use
> vfs_getattr_nosec() in their ->getattr() implementation when they
> are themselves called via vfs_getattr_nosec(). This will fix this by
> not triggering another LSM hook.
> (2) make all ->getattr() LSM hooks PF_EXITING safe ideally don't do
> anything
The original problem was detecting i_version change on overlayfs.
Amir's proposed patch might resolve it without commit db1d1e8b9867
("IMA: use vfs_getattr_nosec to get the i_version"). However, as Amir
said, it does not address the new problem introduced by it. Assuming
Amir's proposed patch resolves the original problem, an alternative
solution would be to revert commit db1d1e8b9867.
Mimi
On 9/21/23 07:48, Christian Brauner wrote:
>
> Imho, this is all very wild but I'm not judging.
>
> Two solutions imho:
> (1) teach stacking filesystems like overlayfs and ecryptfs to use
> vfs_getattr_nosec() in their ->getattr() implementation when they
> are themselves called via vfs_getattr_nosec(). This will fix this by
> not triggering another LSM hook.
This somewhat lengthy patch I think would be a solution for (1). I don't
think the Fixes tag is correct but IMO it should propagate farther back,
if possible.
From 01467f6e879c4cd757abb4d79cb18bf11150bed8 Mon Sep 17 00:00:00 2001
From: Stefan Berger <[email protected]>
Date: Thu, 28 Sep 2023 14:42:39 -0400
Subject: [PATCH] fs: Enable GETATTR_NOSEC parameter for getattr interface
function
When vfs_getattr_nosec() calls a filesystem's getattr interface function
then the 'nosec' should propagate into this function so that
vfs_getattr_nosec() can again be called from the filesystem's gettattr
rather than vfs_getattr(). The latter would add unnecessary security
checks that the initial vfs_getattr_nosec() call wanted to avoid.
Therefore, introduce the getattr flag GETATTR_NOSEC and allow to pass
with the new getattr_flags parameter to the getattr interface function.
In overlayfs and ecryptfs use this flag to determine which one of the
two functions to call.
In a recent code change introduced to IMA vfs_getattr_nosec() ended up
calling vfs_getattr() in overlayfs, which in turn called
security_inode_getattr() on an exiting process that did not have
current->fs set anymore, which then caused a kernel NULL pointer
dereference. With this change the call to security_inode_getattr() can
be avoided, thus avoiding the NULL pointer dereference.
Reported-by: [email protected]
Fixes: db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the i_version")
Signed-off-by: Stefan Berger <[email protected]>
---
simple_getattr has been adjusted and all files returned by the following
grep have been adjusted as well.
grep -rEI "^[[:space:]]+\.getattr" ./ | \
grep -v simple_getattr | \
cut -d ":" -f1 | sort | uniq
---
fs/9p/vfs_inode.c | 3 ++-
fs/9p/vfs_inode_dotl.c | 3 ++-
fs/afs/inode.c | 3 ++-
fs/afs/internal.h | 2 +-
fs/bad_inode.c | 3 ++-
fs/btrfs/inode.c | 3 ++-
fs/ceph/inode.c | 9 ++++++---
fs/ceph/super.h | 3 ++-
fs/coda/coda_linux.h | 2 +-
fs/coda/inode.c | 3 ++-
fs/ecryptfs/inode.c | 14 ++++++++++----
fs/erofs/inode.c | 2 +-
fs/erofs/internal.h | 2 +-
fs/exfat/exfat_fs.h | 2 +-
fs/exfat/file.c | 2 +-
fs/ext2/ext2.h | 2 +-
fs/ext2/inode.c | 3 ++-
fs/ext4/ext4.h | 6 ++++--
fs/ext4/inode.c | 9 ++++++---
fs/ext4/symlink.c | 6 ++++--
fs/f2fs/f2fs.h | 3 ++-
fs/f2fs/file.c | 3 ++-
fs/f2fs/namei.c | 6 ++++--
fs/fat/fat.h | 3 ++-
fs/fat/file.c | 3 ++-
fs/fuse/dir.c | 3 ++-
fs/gfs2/inode.c | 4 +++-
fs/hfsplus/hfsplus_fs.h | 2 +-
fs/hfsplus/inode.c | 2 +-
fs/kernfs/inode.c | 3 ++-
fs/kernfs/kernfs-internal.h | 3 ++-
fs/libfs.c | 5 +++--
fs/minix/inode.c | 3 ++-
fs/minix/minix.h | 3 ++-
fs/nfs/inode.c | 3 ++-
fs/nfs/namespace.c | 5 +++--
fs/ntfs3/file.c | 3 ++-
fs/ntfs3/ntfs_fs.h | 3 ++-
fs/ocfs2/file.c | 3 ++-
fs/ocfs2/file.h | 3 ++-
fs/orangefs/inode.c | 3 ++-
fs/orangefs/orangefs-kernel.h | 3 ++-
fs/overlayfs/inode.c | 8 ++++++--
fs/overlayfs/overlayfs.h | 3 ++-
fs/proc/base.c | 6 ++++--
fs/proc/fd.c | 3 ++-
fs/proc/generic.c | 3 ++-
fs/proc/internal.h | 3 ++-
fs/proc/proc_net.c | 3 ++-
fs/proc/proc_sysctl.c | 3 ++-
fs/proc/root.c | 3 ++-
fs/smb/client/cifsfs.h | 3 ++-
fs/smb/client/inode.c | 3 ++-
fs/stat.c | 3 ++-
fs/sysv/itree.c | 3 ++-
fs/sysv/sysv.h | 2 +-
fs/ubifs/dir.c | 3 ++-
fs/ubifs/file.c | 6 ++++--
fs/ubifs/ubifs.h | 3 ++-
fs/udf/symlink.c | 3 ++-
fs/vboxsf/utils.c | 3 ++-
fs/vboxsf/vfsmod.h | 2 +-
fs/xfs/xfs_iops.c | 3 ++-
include/linux/fs.h | 10 ++++++++--
include/linux/nfs_fs.h | 2 +-
mm/shmem.c | 3 ++-
66 files changed, 159 insertions(+), 82 deletions(-)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 0d28ecf668d0..9c5a7e653bb1 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -1000,7 +1000,8 @@ v9fs_vfs_rename(struct mnt_idmap *idmap, struct
inode *old_dir,
static int
v9fs_vfs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct dentry *dentry = path->dentry;
struct inode *inode = d_inode(dentry);
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index 1312f68965ac..e4238ee243bf 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -439,7 +439,8 @@ static int v9fs_vfs_mkdir_dotl(struct mnt_idmap *idmap,
static int
v9fs_vfs_getattr_dotl(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags)
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct dentry *dentry = path->dentry;
struct v9fs_session_info *v9ses;
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 1c794a1896aa..8763e6126a8c 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -750,7 +750,8 @@ int afs_validate(struct afs_vnode *vnode, struct key
*key)
* read the attributes of an inode
*/
int afs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int query_flags)
+ struct kstat *stat, u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct afs_vnode *vnode = AFS_FS_I(inode);
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index da73b97e19a9..b8dfb6232086 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -1175,7 +1175,7 @@ extern bool afs_check_validity(struct afs_vnode *);
extern int afs_validate(struct afs_vnode *, struct key *);
bool afs_pagecache_valid(struct afs_vnode *);
extern int afs_getattr(struct mnt_idmap *idmap, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int, unsigned int);
extern int afs_setattr(struct mnt_idmap *idmap, struct dentry *,
struct iattr *);
extern void afs_evict_inode(struct inode *);
extern int afs_drop_inode(struct inode *);
diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 83f9566c973b..22219161382d 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -97,7 +97,8 @@ static int bad_inode_permission(struct mnt_idmap *idmap,
static int bad_inode_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
return -EIO;
}
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 7814b9d654ce..bc9fbaa42b93 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8624,7 +8624,8 @@ int __init btrfs_init_cachep(void)
static int btrfs_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags)
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
u64 delalloc_bytes;
u64 inode_bytes;
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 800ab7920513..a798f0a7238f 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2255,11 +2255,13 @@ static const char
*ceph_encrypted_get_link(struct dentry *dentry,
static int ceph_encrypted_symlink_getattr(struct mnt_idmap *idmap,
const struct path *path,
struct kstat *stat, u32 request_mask,
- unsigned int query_flags)
+ unsigned int query_flags,
+ unsigned int getattr_flags)
{
int ret;
- ret = ceph_getattr(idmap, path, stat, request_mask, query_flags);
+ ret = ceph_getattr(idmap, path, stat, request_mask, query_flags,
+ getattr_flags);
if (ret)
return ret;
return fscrypt_symlink_getattr(path, stat);
@@ -2960,7 +2962,8 @@ static int statx_to_caps(u32 want, umode_t mode)
* then we can avoid talking to the MDS at all.
*/
int ceph_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct super_block *sb = inode->i_sb;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 51c7f2b14f6f..f472cefd21bd 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -1100,7 +1100,8 @@ extern int ceph_setattr(struct mnt_idmap *idmap,
struct dentry *dentry, struct iattr *attr);
extern int ceph_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags);
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags);
void ceph_inode_shutdown(struct inode *inode);
static inline bool ceph_inode_is_shutdown(struct inode *inode)
diff --git a/fs/coda/coda_linux.h b/fs/coda/coda_linux.h
index dd6277d87afb..ec6e30cbb35f 100644
--- a/fs/coda/coda_linux.h
+++ b/fs/coda/coda_linux.h
@@ -50,7 +50,7 @@ int coda_permission(struct mnt_idmap *idmap, struct
inode *inode,
int mask);
int coda_revalidate_inode(struct inode *);
int coda_getattr(struct mnt_idmap *, const struct path *, struct kstat *,
- u32, unsigned int);
+ u32, unsigned int, unsigned int);
int coda_setattr(struct mnt_idmap *, struct dentry *, struct iattr *);
/* this file: helpers */
diff --git a/fs/coda/inode.c b/fs/coda/inode.c
index 0c7c2528791e..52465963c455 100644
--- a/fs/coda/inode.c
+++ b/fs/coda/inode.c
@@ -252,7 +252,8 @@ static void coda_evict_inode(struct inode *inode)
}
int coda_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
int err = coda_revalidate_inode(d_inode(path->dentry));
if (!err)
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 992d9c7e64ae..31173a4534d2 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -974,7 +974,8 @@ static int ecryptfs_setattr(struct mnt_idmap *idmap,
static int ecryptfs_getattr_link(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags)
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct dentry *dentry = path->dentry;
struct ecryptfs_mount_crypt_stat *mount_crypt_stat;
@@ -1000,14 +1001,19 @@ static int ecryptfs_getattr_link(struct
mnt_idmap *idmap,
static int ecryptfs_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags)
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct dentry *dentry = path->dentry;
struct kstat lower_stat;
int rc;
- rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry), &lower_stat,
- request_mask, flags);
+ if (getattr_flags & GETATTR_NOSEC)
+ rc = vfs_getattr_nosec(ecryptfs_dentry_to_lower_path(dentry),
+ &lower_stat, request_mask, flags);
+ else
+ rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry),
+ &lower_stat, request_mask, flags);
if (!rc) {
fsstack_copy_attr_all(d_inode(dentry),
ecryptfs_inode_to_lower(d_inode(dentry)));
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index edc8ec7581b8..f9ac4c001c17 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -358,7 +358,7 @@ struct inode *erofs_iget(struct super_block *sb,
erofs_nid_t nid)
int erofs_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, u32 request_mask,
- unsigned int query_flags)
+ unsigned int query_flags, unsigned int getattr_flags)
{
struct inode *const inode = d_inode(path->dentry);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 4ff88d0dd980..16fbb39d670f 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -422,7 +422,7 @@ int erofs_map_blocks(struct inode *inode, struct
erofs_map_blocks *map);
struct inode *erofs_iget(struct super_block *sb, erofs_nid_t nid);
int erofs_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, u32 request_mask,
- unsigned int query_flags);
+ unsigned int query_flags, unsigned int getattr_flags);
int erofs_namei(struct inode *dir, const struct qstr *name,
erofs_nid_t *nid, unsigned int *d_type);
diff --git a/fs/exfat/exfat_fs.h b/fs/exfat/exfat_fs.h
index f55498e5c23d..5e86c62de12f 100644
--- a/fs/exfat/exfat_fs.h
+++ b/fs/exfat/exfat_fs.h
@@ -454,7 +454,7 @@ int exfat_setattr(struct mnt_idmap *idmap, struct
dentry *dentry,
struct iattr *attr);
int exfat_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, unsigned int request_mask,
- unsigned int query_flags);
+ unsigned int query_flags, unsigned int getattr_flags);
int exfat_file_fsync(struct file *file, loff_t start, loff_t end, int
datasync);
long exfat_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
long exfat_compat_ioctl(struct file *filp, unsigned int cmd,
diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index 32395ef686a2..06cb318bfa0a 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -227,7 +227,7 @@ void exfat_truncate(struct inode *inode)
int exfat_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, unsigned int request_mask,
- unsigned int query_flags)
+ unsigned int query_flags, unsigned int getattr_flags)
{
struct inode *inode = d_backing_inode(path->dentry);
struct exfat_inode_info *ei = EXFAT_I(inode);
diff --git a/fs/ext2/ext2.h b/fs/ext2/ext2.h
index 7fdd685c384d..a2b460786d21 100644
--- a/fs/ext2/ext2.h
+++ b/fs/ext2/ext2.h
@@ -748,7 +748,7 @@ void ext2_write_failed(struct address_space
*mapping, loff_t to);
extern int ext2_get_block(struct inode *, sector_t, struct buffer_head
*, int);
extern int ext2_setattr (struct mnt_idmap *, struct dentry *, struct
iattr *);
extern int ext2_getattr (struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int, unsigned int);
extern void ext2_set_inode_flags(struct inode *inode);
extern int ext2_fiemap(struct inode *inode, struct fiemap_extent_info
*fieinfo,
u64 start, u64 len);
diff --git a/fs/ext2/inode.c b/fs/ext2/inode.c
index 314b415ee518..153738e254ac 100644
--- a/fs/ext2/inode.c
+++ b/fs/ext2/inode.c
@@ -1611,7 +1611,8 @@ int ext2_write_inode(struct inode *inode, struct
writeback_control *wbc)
}
int ext2_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int query_flags)
+ struct kstat *stat, u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct ext2_inode_info *ei = EXT2_I(inode);
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 9418359b1d9d..90b1e08bd89a 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2977,11 +2977,13 @@ extern int ext4_setattr(struct mnt_idmap *,
struct dentry *,
struct iattr *);
extern u32 ext4_dio_alignment(struct inode *inode);
extern int ext4_getattr(struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int,
+ unsigned int);
extern void ext4_evict_inode(struct inode *);
extern void ext4_clear_inode(struct inode *);
extern int ext4_file_getattr(struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int,
+ unsigned int);
extern void ext4_dirty_inode(struct inode *, int);
extern int ext4_change_inode_journal_flag(struct inode *, int);
extern int ext4_get_inode_loc(struct inode *, struct ext4_iloc *);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4ce35f1c8b0a..ede71d313519 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5515,7 +5515,8 @@ u32 ext4_dio_alignment(struct inode *inode)
}
int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int query_flags)
+ struct kstat *stat, u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct ext4_inode *raw_inode;
@@ -5577,12 +5578,14 @@ int ext4_getattr(struct mnt_idmap *idmap, const
struct path *path,
int ext4_file_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
u64 delalloc_blocks;
- ext4_getattr(idmap, path, stat, request_mask, query_flags);
+ ext4_getattr(idmap, path, stat, request_mask, query_flags,
+ getattr_flags);
/*
* If there is inline data in the inode, the inode will normally not
diff --git a/fs/ext4/symlink.c b/fs/ext4/symlink.c
index 75bf1f88843c..abc30ebb2be2 100644
--- a/fs/ext4/symlink.c
+++ b/fs/ext4/symlink.c
@@ -58,9 +58,11 @@ static const char *ext4_encrypted_get_link(struct
dentry *dentry,
static int ext4_encrypted_symlink_getattr(struct mnt_idmap *idmap,
const struct path *path,
struct kstat *stat, u32 request_mask,
- unsigned int query_flags)
+ unsigned int query_flags,
+ unsigned int getattr_flags)
{
- ext4_getattr(idmap, path, stat, request_mask, query_flags);
+ ext4_getattr(idmap, path, stat, request_mask, query_flags,
+ getattr_flags);
return fscrypt_symlink_getattr(path, stat);
}
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 6d688e42d89c..0a8436cd2f5f 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -3477,7 +3477,8 @@ int f2fs_do_truncate_blocks(struct inode *inode,
u64 from, bool lock);
int f2fs_truncate_blocks(struct inode *inode, u64 from, bool lock);
int f2fs_truncate(struct inode *inode);
int f2fs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags);
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags);
int f2fs_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
struct iattr *attr);
int f2fs_truncate_hole(struct inode *inode, pgoff_t pg_start, pgoff_t
pg_end);
diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c
index ca5904129b16..e05bdd318aef 100644
--- a/fs/f2fs/file.c
+++ b/fs/f2fs/file.c
@@ -832,7 +832,8 @@ static bool f2fs_force_buffered_io(struct inode
*inode, int rw)
}
int f2fs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int query_flags)
+ struct kstat *stat, u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct f2fs_inode_info *fi = F2FS_I(inode);
diff --git a/fs/f2fs/namei.c b/fs/f2fs/namei.c
index 193b22a2d6bf..cfdcdeb2c0b0 100644
--- a/fs/f2fs/namei.c
+++ b/fs/f2fs/namei.c
@@ -1356,9 +1356,11 @@ static const char *f2fs_encrypted_get_link(struct
dentry *dentry,
static int f2fs_encrypted_symlink_getattr(struct mnt_idmap *idmap,
const struct path *path,
struct kstat *stat, u32 request_mask,
- unsigned int query_flags)
+ unsigned int query_flags,
+ unsigned int getattr_flags)
{
- f2fs_getattr(idmap, path, stat, request_mask, query_flags);
+ f2fs_getattr(idmap, path, stat, request_mask, query_flags,
+ getattr_flags);
return fscrypt_symlink_getattr(path, stat);
}
diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 66cf4778cf3b..74b57f2e1c36 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -403,7 +403,8 @@ extern int fat_setattr(struct mnt_idmap *idmap,
struct dentry *dentry,
extern void fat_truncate_blocks(struct inode *inode, loff_t offset);
extern int fat_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags);
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags);
extern int fat_file_fsync(struct file *file, loff_t start, loff_t end,
int datasync);
diff --git a/fs/fat/file.c b/fs/fat/file.c
index e887e9ab7472..2177784f54a2 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -396,7 +396,8 @@ void fat_truncate_blocks(struct inode *inode, loff_t
offset)
}
int fat_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb);
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index d707e6987da9..3531d4239635 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -2069,7 +2069,8 @@ static int fuse_setattr(struct mnt_idmap *idmap,
struct dentry *entry,
static int fuse_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags)
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct fuse_conn *fc = get_fuse_conn(inode);
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 0eac04507904..50383ec30573 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -2038,6 +2038,7 @@ static int gfs2_setattr(struct mnt_idmap *idmap,
* @stat: The inode's stats
* @request_mask: Mask of STATX_xxx flags indicating the caller's
interests
* @flags: AT_STATX_xxx setting
+ * @getattr_flags: GETATTR_xxx
*
* This may be called from the VFS directly, or from within GFS2 with the
* inode locked, so we look to see if the glock is already locked and only
@@ -2050,7 +2051,8 @@ static int gfs2_setattr(struct mnt_idmap *idmap,
static int gfs2_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags)
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct gfs2_inode *ip = GFS2_I(inode);
diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index 7ededcb720c1..3da3a79c9742 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -483,7 +483,7 @@ int hfsplus_cat_read_inode(struct inode *inode,
struct hfs_find_data *fd);
int hfsplus_cat_write_inode(struct inode *inode);
int hfsplus_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, u32 request_mask,
- unsigned int query_flags);
+ unsigned int query_flags, unsigned int getattr_flags);
int hfsplus_file_fsync(struct file *file, loff_t start, loff_t end,
int datasync);
int hfsplus_fileattr_get(struct dentry *dentry, struct fileattr *fa);
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index c65c8c4b03dd..afa7e8ee8cb2 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -278,7 +278,7 @@ static int hfsplus_setattr(struct mnt_idmap *idmap,
int hfsplus_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, u32 request_mask,
- unsigned int query_flags)
+ unsigned int query_flags, unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct hfsplus_inode_info *hip = HFSPLUS_I(inode);
diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 922719a343a7..92af4b394274 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -182,7 +182,8 @@ static void kernfs_refresh_inode(struct kernfs_node
*kn, struct inode *inode)
int kernfs_iop_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct kernfs_node *kn = inode->i_private;
diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h
index a9b854cdfdb5..ccf74e08105c 100644
--- a/fs/kernfs/kernfs-internal.h
+++ b/fs/kernfs/kernfs-internal.h
@@ -135,7 +135,8 @@ int kernfs_iop_setattr(struct mnt_idmap *idmap,
struct dentry *dentry,
struct iattr *iattr);
int kernfs_iop_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags);
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags);
ssize_t kernfs_iop_listxattr(struct dentry *dentry, char *buf, size_t
size);
int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr);
diff --git a/fs/libfs.c b/fs/libfs.c
index 37f2d34ee090..757d24b8f4be 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -30,7 +30,7 @@
int simple_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *stat, u32 request_mask,
- unsigned int query_flags)
+ unsigned int query_flags, unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
@@ -1579,7 +1579,8 @@ static struct dentry *empty_dir_lookup(struct
inode *dir, struct dentry *dentry,
static int empty_dir_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
generic_fillattr(&nop_mnt_idmap, request_mask, inode, stat);
diff --git a/fs/minix/inode.c b/fs/minix/inode.c
index df575473c1cc..74ae2c33f5a5 100644
--- a/fs/minix/inode.c
+++ b/fs/minix/inode.c
@@ -651,7 +651,8 @@ static int minix_write_inode(struct inode *inode,
struct writeback_control *wbc)
}
int minix_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct super_block *sb = path->dentry->d_sb;
struct inode *inode = d_inode(path->dentry);
diff --git a/fs/minix/minix.h b/fs/minix/minix.h
index d493507c064f..6ac55d3b649c 100644
--- a/fs/minix/minix.h
+++ b/fs/minix/minix.h
@@ -52,7 +52,8 @@ extern int minix_new_block(struct inode * inode);
extern void minix_free_block(struct inode *inode, unsigned long block);
extern unsigned long minix_count_free_blocks(struct super_block *sb);
extern int minix_getattr(struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int,
+ unsigned int);
extern int minix_prepare_chunk(struct page *page, loff_t pos, unsigned
len);
extern void V1_minix_truncate(struct inode *);
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index e21c073158e5..c5cad2515b37 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -831,7 +831,8 @@ static u32 nfs_get_valid_attrmask(struct inode *inode)
}
int nfs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int query_flags)
+ struct kstat *stat, u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct nfs_server *server = NFS_SERVER(inode);
diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c
index e7494cdd957e..d4c3622ff4ed 100644
--- a/fs/nfs/namespace.c
+++ b/fs/nfs/namespace.c
@@ -210,11 +210,12 @@ struct vfsmount *nfs_d_automount(struct path *path)
static int
nfs_namespace_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
if (NFS_FH(d_inode(path->dentry))->size != 0)
return nfs_getattr(idmap, path, stat, request_mask,
- query_flags);
+ query_flags, getattr_flags);
generic_fillattr(&nop_mnt_idmap, request_mask, d_inode(path->dentry),
stat);
return 0;
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index 962f12ce6c0a..729ca3c09958 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -72,7 +72,8 @@ static long ntfs_compat_ioctl(struct file *filp, u32
cmd, unsigned long arg)
* ntfs_getattr - inode_operations::getattr
*/
int ntfs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, u32 flags)
+ struct kstat *stat, u32 request_mask, u32 flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct ntfs_inode *ni = ntfs_i(inode);
diff --git a/fs/ntfs3/ntfs_fs.h b/fs/ntfs3/ntfs_fs.h
index 629403ede6e5..3c83352883f2 100644
--- a/fs/ntfs3/ntfs_fs.h
+++ b/fs/ntfs3/ntfs_fs.h
@@ -492,7 +492,8 @@ extern const struct file_operations ntfs_dir_operations;
/* Globals from file.c */
int ntfs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, u32 flags);
+ struct kstat *stat, u32 request_mask, u32 flags,
+ unsigned int getattr_flags);
int ntfs3_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
struct iattr *attr);
void ntfs_sparse_cluster(struct inode *inode, struct page *page0, CLST
vcn,
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index c45596c25c66..7e367cd801d3 100644
--- a/fs/ocfs2/file.c
+++ b/fs/ocfs2/file.c
@@ -1300,7 +1300,8 @@ int ocfs2_setattr(struct mnt_idmap *idmap, struct
dentry *dentry,
}
int ocfs2_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct super_block *sb = path->dentry->d_sb;
diff --git a/fs/ocfs2/file.h b/fs/ocfs2/file.h
index 8e53e4ac1120..2dbf4edd94da 100644
--- a/fs/ocfs2/file.h
+++ b/fs/ocfs2/file.h
@@ -52,7 +52,8 @@ int ocfs2_zero_extend(struct inode *inode, struct
buffer_head *di_bh,
int ocfs2_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
struct iattr *attr);
int ocfs2_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags);
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags);
int ocfs2_permission(struct mnt_idmap *idmap,
struct inode *inode,
int mask);
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 085912268442..f7c2c318d392 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -859,7 +859,8 @@ int orangefs_setattr(struct mnt_idmap *idmap, struct
dentry *dentry,
* Obtain attributes of an object given a dentry
*/
int orangefs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
int ret;
struct inode *inode = path->dentry->d_inode;
diff --git a/fs/orangefs/orangefs-kernel.h b/fs/orangefs/orangefs-kernel.h
index b711654ca18a..f0682b51e5e0 100644
--- a/fs/orangefs/orangefs-kernel.h
+++ b/fs/orangefs/orangefs-kernel.h
@@ -365,7 +365,8 @@ int __orangefs_setattr_mode(struct dentry *dentry,
struct iattr *iattr);
int orangefs_setattr(struct mnt_idmap *, struct dentry *, struct iattr *);
int orangefs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags);
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags);
int orangefs_permission(struct mnt_idmap *idmap,
struct inode *inode, int mask);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 83ef66644c21..8c0f3e125d09 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -155,7 +155,8 @@ static void ovl_map_dev_ino(struct dentry *dentry,
struct kstat *stat, int fsid)
}
int ovl_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct dentry *dentry = path->dentry;
enum ovl_path_type type;
@@ -171,7 +172,10 @@ int ovl_getattr(struct mnt_idmap *idmap, const
struct path *path,
type = ovl_path_real(dentry, &realpath);
old_cred = ovl_override_creds(dentry->d_sb);
- err = vfs_getattr(&realpath, stat, request_mask, flags);
+ if (getattr_flags & GETATTR_NOSEC)
+ err = vfs_getattr_nosec(&realpath, stat, request_mask, flags);
+ else
+ err = vfs_getattr(&realpath, stat, request_mask, flags);
if (err)
goto out;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 9817b2dcb132..6f91bfb1bbbd 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -687,7 +687,8 @@ unsigned int ovl_get_nlink(struct ovl_fs *ofs,
struct dentry *lowerdentry,
int ovl_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
struct iattr *attr);
int ovl_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags);
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags);
int ovl_permission(struct mnt_idmap *idmap, struct inode *inode,
int mask);
int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const
char *name,
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ffd54617c354..3e733cd17fba 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1960,7 +1960,8 @@ static struct inode
*proc_pid_make_base_inode(struct super_block *sb,
}
int pid_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int query_flags)
+ struct kstat *stat, u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct proc_fs_info *fs_info = proc_sb_info(inode->i_sb);
@@ -3896,7 +3897,8 @@ static int proc_task_readdir(struct file *file,
struct dir_context *ctx)
static int proc_task_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct task_struct *p = get_proc_task(inode);
diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index 6276b3938842..1ea9eea9953d 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -347,7 +347,8 @@ int proc_fd_permission(struct mnt_idmap *idmap,
static int proc_fd_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
int rv = 0;
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index 775ce0bcf08c..4acb07536329 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -135,7 +135,8 @@ static int proc_notify_change(struct mnt_idmap *idmap,
static int proc_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct proc_dir_entry *de = PDE(inode);
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 9a8f32f21ff5..c3bc8eee4771 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -163,7 +163,8 @@ extern int proc_pid_statm(struct seq_file *, struct
pid_namespace *,
*/
extern const struct dentry_operations pid_dentry_operations;
extern int pid_getattr(struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int,
+ unsigned int);
extern int proc_setattr(struct mnt_idmap *, struct dentry *,
struct iattr *);
extern void proc_pid_evict_inode(struct proc_inode *);
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index 2ba31b6d68c0..e76019d1fe3b 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -301,7 +301,8 @@ static struct dentry *proc_tgid_net_lookup(struct
inode *dir,
static int proc_tgid_net_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct net *net;
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index c88854df0b62..3dc86f02f64e 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -842,7 +842,8 @@ static int proc_sys_setattr(struct mnt_idmap *idmap,
static int proc_sys_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct ctl_table_header *head = grab_header(inode);
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 9191248f2dac..ab9113fc119d 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -312,7 +312,8 @@ void __init proc_root_init(void)
static int proc_root_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
generic_fillattr(&nop_mnt_idmap, request_mask, d_inode(path->dentry),
stat);
diff --git a/fs/smb/client/cifsfs.h b/fs/smb/client/cifsfs.h
index 41daebd220ff..453a3bd7ba65 100644
--- a/fs/smb/client/cifsfs.h
+++ b/fs/smb/client/cifsfs.h
@@ -73,7 +73,8 @@ extern int cifs_invalidate_mapping(struct inode *inode);
extern int cifs_revalidate_mapping(struct inode *inode);
extern int cifs_zap_mapping(struct inode *inode);
extern int cifs_getattr(struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int,
+ unsigned int);
extern int cifs_setattr(struct mnt_idmap *, struct dentry *,
struct iattr *);
extern int cifs_fiemap(struct inode *, struct fiemap_extent_info *,
u64 start,
diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c
index d7c302442c1e..1aa1bd0c85fd 100644
--- a/fs/smb/client/inode.c
+++ b/fs/smb/client/inode.c
@@ -2570,7 +2570,8 @@ int cifs_revalidate_dentry(struct dentry *dentry)
}
int cifs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct dentry *dentry = path->dentry;
struct cifs_sb_info *cifs_sb = CIFS_SB(dentry->d_sb);
diff --git a/fs/stat.c b/fs/stat.c
index d43a5cc1bfa4..fc4f6c0a91a8 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -133,7 +133,8 @@ int vfs_getattr_nosec(const struct path *path,
struct kstat *stat,
idmap = mnt_idmap(path->mnt);
if (inode->i_op->getattr)
return inode->i_op->getattr(idmap, path, stat,
- request_mask, query_flags);
+ request_mask, query_flags,
+ GETATTR_NOSEC);
generic_fillattr(idmap, request_mask, inode, stat);
return 0;
diff --git a/fs/sysv/itree.c b/fs/sysv/itree.c
index edb94e55de8e..3ff96577f0db 100644
--- a/fs/sysv/itree.c
+++ b/fs/sysv/itree.c
@@ -446,7 +446,8 @@ static unsigned sysv_nblocks(struct super_block *s,
loff_t size)
}
int sysv_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct super_block *s = path->dentry->d_sb;
generic_fillattr(&nop_mnt_idmap, request_mask, d_inode(path->dentry),
diff --git a/fs/sysv/sysv.h b/fs/sysv/sysv.h
index e3f988b469ee..1e6e1e2faad4 100644
--- a/fs/sysv/sysv.h
+++ b/fs/sysv/sysv.h
@@ -142,7 +142,7 @@ extern int sysv_write_inode(struct inode *, struct
writeback_control *wbc);
extern int sysv_sync_inode(struct inode *);
extern void sysv_set_inode(struct inode *, dev_t);
extern int sysv_getattr(struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int, unsigned int);
extern int sysv_init_icache(void);
extern void sysv_destroy_icache(void);
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index 2f48c58d47cd..30fe9f64292c 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -1632,7 +1632,8 @@ static int ubifs_rename(struct mnt_idmap *idmap,
}
int ubifs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags)
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
loff_t size;
struct inode *inode = d_inode(path->dentry);
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index e5382f0b2587..ef0dda82adfc 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1622,9 +1622,11 @@ static const char *ubifs_get_link(struct dentry
*dentry,
static int ubifs_symlink_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
- ubifs_getattr(idmap, path, stat, request_mask, query_flags);
+ ubifs_getattr(idmap, path, stat, request_mask, query_flags,
+ getattr_flags);
if (IS_ENCRYPTED(d_inode(path->dentry)))
return fscrypt_symlink_getattr(path, stat);
diff --git a/fs/ubifs/ubifs.h b/fs/ubifs/ubifs.h
index ebb3ad6b5e7e..bf1429096176 100644
--- a/fs/ubifs/ubifs.h
+++ b/fs/ubifs/ubifs.h
@@ -2033,7 +2033,8 @@ int ubifs_update_time(struct inode *inode, int flags);
struct inode *ubifs_new_inode(struct ubifs_info *c, struct inode *dir,
umode_t mode, bool is_xattr);
int ubifs_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *stat, u32 request_mask, unsigned int flags);
+ struct kstat *stat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags);
int ubifs_check_dir_empty(struct inode *dir);
/* xattr.c */
diff --git a/fs/udf/symlink.c b/fs/udf/symlink.c
index f7eaf7b14594..743f5f59e94c 100644
--- a/fs/udf/symlink.c
+++ b/fs/udf/symlink.c
@@ -143,7 +143,8 @@ static int udf_symlink_filler(struct file *file,
struct folio *folio)
static int udf_symlink_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int flags)
+ u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
struct dentry *dentry = path->dentry;
struct inode *inode = d_backing_inode(dentry);
diff --git a/fs/vboxsf/utils.c b/fs/vboxsf/utils.c
index 83f20dd15522..8bbc31c99ede 100644
--- a/fs/vboxsf/utils.c
+++ b/fs/vboxsf/utils.c
@@ -232,7 +232,8 @@ int vboxsf_inode_revalidate(struct dentry *dentry)
}
int vboxsf_getattr(struct mnt_idmap *idmap, const struct path *path,
- struct kstat *kstat, u32 request_mask, unsigned int flags)
+ struct kstat *kstat, u32 request_mask, unsigned int flags,
+ unsigned int getattr_flags)
{
int err;
struct dentry *dentry = path->dentry;
diff --git a/fs/vboxsf/vfsmod.h b/fs/vboxsf/vfsmod.h
index 05973eb89d52..8ec1edc0cbe6 100644
--- a/fs/vboxsf/vfsmod.h
+++ b/fs/vboxsf/vfsmod.h
@@ -99,7 +99,7 @@ int vboxsf_stat_dentry(struct dentry *dentry, struct
shfl_fsobjinfo *info);
int vboxsf_inode_revalidate(struct dentry *dentry);
int vboxsf_getattr(struct mnt_idmap *idmap, const struct path *path,
struct kstat *kstat, u32 request_mask,
- unsigned int query_flags);
+ unsigned int query_flags, unsigned int getattr_flags);
int vboxsf_setattr(struct mnt_idmap *idmap, struct dentry *dentry,
struct iattr *iattr);
struct shfl_string *vboxsf_path_from_dentry(struct vboxsf_sbi *sbi,
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 1c1e6171209d..d70ab9d791a9 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -552,7 +552,8 @@ xfs_vn_getattr(
const struct path *path,
struct kstat *stat,
u32 request_mask,
- unsigned int query_flags)
+ unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = d_inode(path->dentry);
struct xfs_inode *ip = XFS_I(inode);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b528f063e8ff..f2ec5a48e5dc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1924,7 +1924,7 @@ struct inode_operations {
struct inode *, struct dentry *, unsigned int);
int (*setattr) (struct mnt_idmap *, struct dentry *, struct iattr *);
int (*getattr) (struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int, unsigned int);
ssize_t (*listxattr) (struct dentry *, char *, size_t);
int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
u64 len);
@@ -2098,6 +2098,11 @@ static inline bool sb_rdonly(const struct
super_block *sb) { return sb->s_flags
#define IS_WHITEOUT(inode) (S_ISCHR(inode->i_mode) && \
(inode)->i_rdev == WHITEOUT_DEV)
+/*
+ * getattr flags
+ */
+#define GETATTR_NOSEC (1 << 0)
+
static inline bool HAS_UNMAPPED_ID(struct mnt_idmap *idmap,
struct inode *inode)
{
@@ -3066,7 +3071,8 @@ extern int dcache_readdir(struct file *, struct
dir_context *);
extern int simple_setattr(struct mnt_idmap *, struct dentry *,
struct iattr *);
extern int simple_getattr(struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int,
+ unsigned int);
extern int simple_statfs(struct dentry *, struct kstatfs *);
extern int simple_open(struct inode *inode, struct file *file);
extern int simple_link(struct dentry *, struct inode *, struct dentry *);
diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
index 279262057a92..f6cecdbe11ca 100644
--- a/include/linux/nfs_fs.h
+++ b/include/linux/nfs_fs.h
@@ -421,7 +421,7 @@ extern int nfs_post_op_update_inode(struct inode
*inode, struct nfs_fattr *fattr
extern int nfs_post_op_update_inode_force_wcc(struct inode *inode,
struct nfs_fattr *fattr);
extern int nfs_post_op_update_inode_force_wcc_locked(struct inode
*inode, struct nfs_fattr *fattr);
extern int nfs_getattr(struct mnt_idmap *, const struct path *,
- struct kstat *, u32, unsigned int);
+ struct kstat *, u32, unsigned int, unsigned int);
extern void nfs_access_add_cache(struct inode *, struct
nfs_access_entry *, const struct cred *);
extern void nfs_access_set_mask(struct nfs_access_entry *, u32);
extern int nfs_permission(struct mnt_idmap *, struct inode *, int);
diff --git a/mm/shmem.c b/mm/shmem.c
index 69595d341882..ff43a0d248bb 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1119,7 +1119,8 @@ EXPORT_SYMBOL_GPL(shmem_truncate_range);
static int shmem_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
- u32 request_mask, unsigned int query_flags)
+ u32 request_mask, unsigned int query_flags,
+ unsigned int getattr_flags)
{
struct inode *inode = path->dentry->d_inode;
struct shmem_inode_info *info = SHMEM_I(inode);
--
2.40.1
On Fri, Sep 29, 2023 at 3:02 AM Stefan Berger <[email protected]> wrote:
>
>
> On 9/21/23 07:48, Christian Brauner wrote:
> >
> > Imho, this is all very wild but I'm not judging.
> >
> > Two solutions imho:
> > (1) teach stacking filesystems like overlayfs and ecryptfs to use
> > vfs_getattr_nosec() in their ->getattr() implementation when they
> > are themselves called via vfs_getattr_nosec(). This will fix this by
> > not triggering another LSM hook.
>
> This somewhat lengthy patch I think would be a solution for (1). I don't
> think the Fixes tag is correct but IMO it should propagate farther back,
> if possible.
>
>
> From 01467f6e879c4cd757abb4d79cb18bf11150bed8 Mon Sep 17 00:00:00 2001
> From: Stefan Berger <[email protected]>
> Date: Thu, 28 Sep 2023 14:42:39 -0400
> Subject: [PATCH] fs: Enable GETATTR_NOSEC parameter for getattr interface
> function
>
> When vfs_getattr_nosec() calls a filesystem's getattr interface function
> then the 'nosec' should propagate into this function so that
> vfs_getattr_nosec() can again be called from the filesystem's gettattr
> rather than vfs_getattr(). The latter would add unnecessary security
> checks that the initial vfs_getattr_nosec() call wanted to avoid.
> Therefore, introduce the getattr flag GETATTR_NOSEC and allow to pass
> with the new getattr_flags parameter to the getattr interface function.
> In overlayfs and ecryptfs use this flag to determine which one of the
> two functions to call.
>
> In a recent code change introduced to IMA vfs_getattr_nosec() ended up
> calling vfs_getattr() in overlayfs, which in turn called
> security_inode_getattr() on an exiting process that did not have
> current->fs set anymore, which then caused a kernel NULL pointer
> dereference. With this change the call to security_inode_getattr() can
> be avoided, thus avoiding the NULL pointer dereference.
>
> Reported-by: [email protected]
> Fixes: db1d1e8b9867 ("IMA: use vfs_getattr_nosec to get the i_version")
> Signed-off-by: Stefan Berger <[email protected]>
>
> ---
>
> simple_getattr has been adjusted and all files returned by the following
> grep have been adjusted as well.
>
> grep -rEI "^[[:space:]]+\.getattr" ./ | \
> grep -v simple_getattr | \
> cut -d ":" -f1 | sort | uniq
> ---
> fs/9p/vfs_inode.c | 3 ++-
> fs/9p/vfs_inode_dotl.c | 3 ++-
> fs/afs/inode.c | 3 ++-
> fs/afs/internal.h | 2 +-
> fs/bad_inode.c | 3 ++-
> fs/btrfs/inode.c | 3 ++-
> fs/ceph/inode.c | 9 ++++++---
> fs/ceph/super.h | 3 ++-
> fs/coda/coda_linux.h | 2 +-
> fs/coda/inode.c | 3 ++-
> fs/ecryptfs/inode.c | 14 ++++++++++----
> fs/erofs/inode.c | 2 +-
> fs/erofs/internal.h | 2 +-
> fs/exfat/exfat_fs.h | 2 +-
> fs/exfat/file.c | 2 +-
> fs/ext2/ext2.h | 2 +-
> fs/ext2/inode.c | 3 ++-
> fs/ext4/ext4.h | 6 ++++--
> fs/ext4/inode.c | 9 ++++++---
> fs/ext4/symlink.c | 6 ++++--
> fs/f2fs/f2fs.h | 3 ++-
> fs/f2fs/file.c | 3 ++-
> fs/f2fs/namei.c | 6 ++++--
> fs/fat/fat.h | 3 ++-
> fs/fat/file.c | 3 ++-
> fs/fuse/dir.c | 3 ++-
> fs/gfs2/inode.c | 4 +++-
> fs/hfsplus/hfsplus_fs.h | 2 +-
> fs/hfsplus/inode.c | 2 +-
> fs/kernfs/inode.c | 3 ++-
> fs/kernfs/kernfs-internal.h | 3 ++-
> fs/libfs.c | 5 +++--
> fs/minix/inode.c | 3 ++-
> fs/minix/minix.h | 3 ++-
> fs/nfs/inode.c | 3 ++-
> fs/nfs/namespace.c | 5 +++--
> fs/ntfs3/file.c | 3 ++-
> fs/ntfs3/ntfs_fs.h | 3 ++-
> fs/ocfs2/file.c | 3 ++-
> fs/ocfs2/file.h | 3 ++-
> fs/orangefs/inode.c | 3 ++-
> fs/orangefs/orangefs-kernel.h | 3 ++-
> fs/overlayfs/inode.c | 8 ++++++--
> fs/overlayfs/overlayfs.h | 3 ++-
> fs/proc/base.c | 6 ++++--
> fs/proc/fd.c | 3 ++-
> fs/proc/generic.c | 3 ++-
> fs/proc/internal.h | 3 ++-
> fs/proc/proc_net.c | 3 ++-
> fs/proc/proc_sysctl.c | 3 ++-
> fs/proc/root.c | 3 ++-
> fs/smb/client/cifsfs.h | 3 ++-
> fs/smb/client/inode.c | 3 ++-
> fs/stat.c | 3 ++-
> fs/sysv/itree.c | 3 ++-
> fs/sysv/sysv.h | 2 +-
> fs/ubifs/dir.c | 3 ++-
> fs/ubifs/file.c | 6 ++++--
> fs/ubifs/ubifs.h | 3 ++-
> fs/udf/symlink.c | 3 ++-
> fs/vboxsf/utils.c | 3 ++-
> fs/vboxsf/vfsmod.h | 2 +-
> fs/xfs/xfs_iops.c | 3 ++-
> include/linux/fs.h | 10 ++++++++--
> include/linux/nfs_fs.h | 2 +-
> mm/shmem.c | 3 ++-
> 66 files changed, 159 insertions(+), 82 deletions(-)
>
You can avoid all this churn.
Just use the existing query_flags arg.
Nothing outside the AT_STATX_SYNC_TYPE query_flags is
passed into filesystems from userspace.
Mast out AT_STATX_SYNC_TYPE in vfs_getattr()
And allow kernel internal request_flags in vfs_getattr_nosec()
The AT_ flag namespace is already a challenge, but mixing user
flags and kernel-only flags in vfs interfaces has been done before.
...
> @@ -171,7 +172,10 @@ int ovl_getattr(struct mnt_idmap *idmap, const
> struct path *path,
>
> type = ovl_path_real(dentry, &realpath);
> old_cred = ovl_override_creds(dentry->d_sb);
> - err = vfs_getattr(&realpath, stat, request_mask, flags);
> + if (getattr_flags & GETATTR_NOSEC)
> + err = vfs_getattr_nosec(&realpath, stat, request_mask, flags);
> + else
> + err = vfs_getattr(&realpath, stat, request_mask, flags);
> if (err)
> goto out;
>
There are two more vfs_getattr() calls in this function that you missed.
Please create ovl_do_getattr() inline wrapper in overlayfs.h next to all
the other ovl_do_ wrappers and use it here.
Thanks,
Amir.
On 9/29/23 00:25, Amir Goldstein wrote:
> On Fri, Sep 29, 2023 at 3:02 AM Stefan Berger <[email protected]> wrote:
>>
>> On 9/21/23 07:48, Christian Brauner wrote:
>>> Imho, this is all very wild but I'm not judging.
>>>
>>> Two solutions imho:
>>> (1) teach stacking filesystems like overlayfs and ecryptfs to use
>>> vfs_getattr_nosec() in their ->getattr() implementation when they
>>> are themselves called via vfs_getattr_nosec(). This will fix this by
>>> not triggering another LSM hook.
>>
>> You can avoid all this churn.
>> Just use the existing query_flags arg.
>> Nothing outside the AT_STATX_SYNC_TYPE query_flags is
>> passed into filesystems from userspace.
>>
>> Mast out AT_STATX_SYNC_TYPE in vfs_getattr()
>> And allow kernel internal request_flags in vfs_getattr_nosec()
Hm, I thought that vfs_getattr_nosec needs to pass AT_GETATTR_NOSEC into
->getattr().
>>
>> The AT_ flag namespace is already a challenge, but mixing user
>> flags and kernel-only flags in vfs interfaces has been done before.
>>
>> ...
That's what I wanted to avoid since now all filesystems' getattr() may
have the AT_GETATTR_NOSEC mixed into the query_flags.
Anyway, here's what I currently have:
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 992d9c7e64ae..f7b5b1843dcc 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -998,16 +998,28 @@ static int ecryptfs_getattr_link(struct mnt_idmap
*idmap,
return rc;
}
+static int ecryptfs_do_getattr(bool nosec, const struct path *path,
+ struct kstat *stat, u32 request_mask,
+ unsigned int flags)
+{
+ if (nosec)
+ return vfs_getattr_nosec(path, stat, request_mask, flags);
+ return vfs_getattr(path, stat, request_mask, flags);
+}
+
static int ecryptfs_getattr(struct mnt_idmap *idmap,
const struct path *path, struct kstat *stat,
u32 request_mask, unsigned int flags)
{
struct dentry *dentry = path->dentry;
struct kstat lower_stat;
+ bool nosec = flags & AT_GETATTR_NOSEC;
int rc;
- rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry), &lower_stat,
- request_mask, flags);
+ flags &= ~AT_INTERNAL_MASK;
+
+ rc = ecryptfs_do_getattr(nosec,
ecryptfs_dentry_to_lower_path(dentry),
+ &lower_stat, request_mask, flags);
if (!rc) {
fsstack_copy_attr_all(d_inode(dentry),
ecryptfs_inode_to_lower(d_inode(dentry)));
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 83ef66644c21..ec4ceb5b4ebf 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -166,12 +166,15 @@ int ovl_getattr(struct mnt_idmap *idmap, const
struct path *path,
int fsid = 0;
int err;
bool metacopy_blocks = false;
+ bool nosec = flags & AT_GETATTR_NOSEC;
+
+ flags &= ~AT_INTERNAL_MASK;
metacopy_blocks = ovl_is_metacopy_dentry(dentry);
type = ovl_path_real(dentry, &realpath);
old_cred = ovl_override_creds(dentry->d_sb);
- err = vfs_getattr(&realpath, stat, request_mask, flags);
+ err = ovl_do_getattr(nosec, &realpath, stat, request_mask, flags);
if (err)
goto out;
@@ -196,8 +199,8 @@ int ovl_getattr(struct mnt_idmap *idmap, const
struct path *path,
(!is_dir ? STATX_NLINK : 0);
ovl_path_lower(dentry, &realpath);
- err = vfs_getattr(&realpath, &lowerstat,
- lowermask, flags);
+ err = ovl_do_getattr(nosec, &realpath, &lowerstat,
+ lowermask, flags);
if (err)
goto out;
@@ -249,8 +252,9 @@ int ovl_getattr(struct mnt_idmap *idmap, const
struct path *path,
ovl_path_lowerdata(dentry, &realpath);
if (realpath.dentry) {
- err = vfs_getattr(&realpath, &lowerdatastat,
- lowermask, flags);
+ err = ovl_do_getattr(nosec, &realpath,
+ &lowerdatastat, lowermask,
+ flags);
if (err)
goto out;
} else {
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 9817b2dcb132..cbee3ff3bab7 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -397,6 +397,15 @@ static inline bool ovl_open_flags_need_copy_up(int
flags)
return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
}
+static inline int ovl_do_getattr(bool nosec, const struct path *path,
+ struct kstat *stat, u32 request_mask,
+ unsigned int flags)
+{
+ if (nosec)
+ return vfs_getattr_nosec(path, stat, request_mask, flags);
+ return vfs_getattr(path, stat, request_mask, flags);
+}
+
/* util.c */
int ovl_want_write(struct dentry *dentry);
void ovl_drop_write(struct dentry *dentry);
diff --git a/fs/stat.c b/fs/stat.c
index d43a5cc1bfa4..3250e427e1aa 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -133,7 +133,8 @@ int vfs_getattr_nosec(const struct path *path,
struct kstat *stat,
idmap = mnt_idmap(path->mnt);
if (inode->i_op->getattr)
return inode->i_op->getattr(idmap, path, stat,
- request_mask, query_flags);
+ request_mask,
+ query_flags | AT_GETATTR_NOSEC);
generic_fillattr(idmap, request_mask, inode, stat);
return 0;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b528f063e8ff..9069d6a301f0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2027,6 +2027,12 @@ struct super_operations {
void (*shutdown)(struct super_block *sb);
};
+/*
+ * Internal query flags. See fcntl.h AT_xxx flags for the rest.
+ */
+#define AT_GETATTR_NOSEC 0x80000000
+#define AT_INTERNAL_MASK 0x80000000
+
/*
* Inode flags - they have no relation to superblock flags now
*/
On Fri, Sep 29, 2023 at 3:39 PM Stefan Berger <[email protected]> wrote:
>
>
> On 9/29/23 00:25, Amir Goldstein wrote:
> > On Fri, Sep 29, 2023 at 3:02 AM Stefan Berger <[email protected]> wrote:
> >>
> >> On 9/21/23 07:48, Christian Brauner wrote:
> >>> Imho, this is all very wild but I'm not judging.
> >>>
> >>> Two solutions imho:
> >>> (1) teach stacking filesystems like overlayfs and ecryptfs to use
> >>> vfs_getattr_nosec() in their ->getattr() implementation when they
> >>> are themselves called via vfs_getattr_nosec(). This will fix this by
> >>> not triggering another LSM hook.
> >>
> >> You can avoid all this churn.
> >> Just use the existing query_flags arg.
> >> Nothing outside the AT_STATX_SYNC_TYPE query_flags is
> >> passed into filesystems from userspace.
> >>
> >> Mast out AT_STATX_SYNC_TYPE in vfs_getattr()
> >> And allow kernel internal request_flags in vfs_getattr_nosec()
> Hm, I thought that vfs_getattr_nosec needs to pass AT_GETATTR_NOSEC into
> ->getattr().
> >>
> >> The AT_ flag namespace is already a challenge, but mixing user
> >> flags and kernel-only flags in vfs interfaces has been done before.
> >>
> >> ...
>
>
> That's what I wanted to avoid since now all filesystems' getattr() may
> have the AT_GETATTR_NOSEC mixed into the query_flags.
>
> Anyway, here's what I currently have:
>
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index 992d9c7e64ae..f7b5b1843dcc 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -998,16 +998,28 @@ static int ecryptfs_getattr_link(struct mnt_idmap
> *idmap,
> return rc;
> }
>
> +static int ecryptfs_do_getattr(bool nosec, const struct path *path,
> + struct kstat *stat, u32 request_mask,
> + unsigned int flags)
> +{
> + if (nosec)
> + return vfs_getattr_nosec(path, stat, request_mask, flags);
> + return vfs_getattr(path, stat, request_mask, flags);
> +}
> +
> static int ecryptfs_getattr(struct mnt_idmap *idmap,
> const struct path *path, struct kstat *stat,
> u32 request_mask, unsigned int flags)
> {
> struct dentry *dentry = path->dentry;
> struct kstat lower_stat;
> + bool nosec = flags & AT_GETATTR_NOSEC;
> int rc;
>
> - rc = vfs_getattr(ecryptfs_dentry_to_lower_path(dentry), &lower_stat,
> - request_mask, flags);
> + flags &= ~AT_INTERNAL_MASK;
> +
> + rc = ecryptfs_do_getattr(nosec,
> ecryptfs_dentry_to_lower_path(dentry),
> + &lower_stat, request_mask, flags);
> if (!rc) {
> fsstack_copy_attr_all(d_inode(dentry),
> ecryptfs_inode_to_lower(d_inode(dentry)));
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 83ef66644c21..ec4ceb5b4ebf 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -166,12 +166,15 @@ int ovl_getattr(struct mnt_idmap *idmap, const
> struct path *path,
> int fsid = 0;
> int err;
> bool metacopy_blocks = false;
> + bool nosec = flags & AT_GETATTR_NOSEC;
> +
> + flags &= ~AT_INTERNAL_MASK;
I don't understand why you need the nosec helper arg.
What's wrong with:
static int ovl_do_getattr(const struct path *path,
struct kstat *stat, u32 request_mask,
unsigned int flags)
{
if (flags & AT_GETATTR_NOSEC)
return vfs_getattr_nosec(path, stat, request_mask, flags);
return vfs_getattr(path, stat, request_mask, flags);
}
likewise for ecryptfs.
>
> metacopy_blocks = ovl_is_metacopy_dentry(dentry);
>
> type = ovl_path_real(dentry, &realpath);
> old_cred = ovl_override_creds(dentry->d_sb);
> - err = vfs_getattr(&realpath, stat, request_mask, flags);
> + err = ovl_do_getattr(nosec, &realpath, stat, request_mask, flags);
> if (err)
> goto out;
>
> @@ -196,8 +199,8 @@ int ovl_getattr(struct mnt_idmap *idmap, const
> struct path *path,
> (!is_dir ? STATX_NLINK : 0);
>
> ovl_path_lower(dentry, &realpath);
> - err = vfs_getattr(&realpath, &lowerstat,
> - lowermask, flags);
> + err = ovl_do_getattr(nosec, &realpath, &lowerstat,
> + lowermask, flags);
> if (err)
> goto out;
>
> @@ -249,8 +252,9 @@ int ovl_getattr(struct mnt_idmap *idmap, const
> struct path *path,
>
> ovl_path_lowerdata(dentry, &realpath);
> if (realpath.dentry) {
> - err = vfs_getattr(&realpath, &lowerdatastat,
> - lowermask, flags);
> + err = ovl_do_getattr(nosec, &realpath,
> + &lowerdatastat, lowermask,
> + flags);
> if (err)
> goto out;
> } else {
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 9817b2dcb132..cbee3ff3bab7 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -397,6 +397,15 @@ static inline bool ovl_open_flags_need_copy_up(int
> flags)
> return ((OPEN_FMODE(flags) & FMODE_WRITE) || (flags & O_TRUNC));
> }
>
> +static inline int ovl_do_getattr(bool nosec, const struct path *path,
> + struct kstat *stat, u32 request_mask,
> + unsigned int flags)
> +{
> + if (nosec)
> + return vfs_getattr_nosec(path, stat, request_mask, flags);
> + return vfs_getattr(path, stat, request_mask, flags);
> +}
> +
> /* util.c */
> int ovl_want_write(struct dentry *dentry);
> void ovl_drop_write(struct dentry *dentry);
> diff --git a/fs/stat.c b/fs/stat.c
> index d43a5cc1bfa4..3250e427e1aa 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -133,7 +133,8 @@ int vfs_getattr_nosec(const struct path *path,
> struct kstat *stat,
> idmap = mnt_idmap(path->mnt);)
> if (inode->i_op->getattr)
> return inode->i_op->getattr(idmap, path, stat,
> - request_mask, query_flags);
> + request_mask,
> + query_flags | AT_GETATTR_NOSEC);
>
You also need in vfs_getattr():
if (WARN_ON_ONCE(query_flags & AT_GETATTR_NOSEC)
return -EPERM;
> generic_fillattr(idmap, request_mask, inode, stat);
> return 0;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b528f063e8ff..9069d6a301f0 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2027,6 +2027,12 @@ struct super_operations {
> void (*shutdown)(struct super_block *sb);
> };
>
> +/*
> + * Internal query flags. See fcntl.h AT_xxx flags for the rest.
> + */
> +#define AT_GETATTR_NOSEC 0x80000000
> +#define AT_INTERNAL_MASK 0x80000000
> +
Yeh, the problem is that people adding flags to fcntl.h
won't be seeing this comment and we don't want to put those
"expose" those flags in uapi header either.
One possible compromise is to put them in fcntl.h under
#ifdef __KERNEL__
Very controversial, yes, I know.
The whole concept of mixing functional flags (i.e. AT_STATX_*)
with lookup AT_* flags is controversial to begin with, not to
mention flag overload for different syscalls (i.e. AT_EACCESS/
AT_REMOVEDIR/AT_HANDLE_FID).
But since we have accepted this necessary evil, I think that at least
we could explicitly partition the AT_ flags namespace and declare:
#ifdef __KERNEL__
AT_LOOKUP_FLAGS_MASK ...
AT_MOUNT_FLAGS_MASK AT_RECURSIVE
AT_SYSCALL_PRIVATE_MASK AT_EACCESS
AT_SYNC_TYPE_MASK AT_STATX_SYNC_TYPE
AT_KERNEL_INTERNAL_MASK 0x80000000
#endif
Sfefan,
I feel that I have to stress the point that this is only *my* opinion and
I accept that others (like some vfs co-maintains..) may passionately
disagree to further pollute the AT_ flags namespace.
The advantage of the AT_KERNEL_INTERNAL_MASK is that it is
in no way exposed to users via ABI, so if we decide to undo this
decision anytime in the future and reclaim the internal AT_ flags,
we could do that.
IMO, this is a decent compromise compared to the very noisy
patch that adds another flags argument to ->getattr() just to fix
this IMA/overlayfs corner case.
Thanks,
Amir.