task_struct.cred and task_struct.real_cred are annotated by __rcu,
hence use rcu_access_pointer to access them.
Fixes the following sparse errors:
kernel/cred.c:144:9: error: incompatible types in comparison expression
(different address spaces):
kernel/cred.c:144:9: struct cred *
kernel/cred.c:144:9: struct cred const [noderef] <asn:4> *
kernel/cred.c:145:9: error: incompatible types in comparison expression
(different address spaces):
kernel/cred.c:145:9: struct cred *
kernel/cred.c:145:9: struct cred const [noderef] <asn:4> *
Signed-off-by: Amol Grover <[email protected]>
---
kernel/cred.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/cred.c b/kernel/cred.c
index 809a985b1793..3043c8e1544d 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -141,8 +141,8 @@ void __put_cred(struct cred *cred)
cred->magic = CRED_MAGIC_DEAD;
cred->put_addr = __builtin_return_address(0);
#endif
- BUG_ON(cred == current->cred);
- BUG_ON(cred == current->real_cred);
+ BUG_ON(cred == rcu_access_pointer(current->cred));
+ BUG_ON(cred == rcu_access_pointer(current->real_cred));
if (cred->non_rcu)
put_cred_rcu(&cred->rcu);
--
2.24.1
On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> task_struct.cred and task_struct.real_cred are annotated by __rcu,
task_struct.cred doesn't actually have RCU semantics though, see
commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
it would probably be more correct to remove the __rcu annotation?
> hence use rcu_access_pointer to access them.
On 01/28, Jann Horn wrote:
>
> On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > task_struct.cred and task_struct.real_cred are annotated by __rcu,
>
> task_struct.cred doesn't actually have RCU semantics though, see
> commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> it would probably be more correct to remove the __rcu annotation?
Yes, but get_task_cred() assumes it has RCU semantics... To be honest
I didn't know we have this helper. Can't it race with revert_creds() in
do_faccessat() ?
Oleg.
On Tue, Jan 28, 2020 at 12:48 PM Oleg Nesterov <[email protected]> wrote:
> On 01/28, Jann Horn wrote:
> > On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > > task_struct.cred and task_struct.real_cred are annotated by __rcu,
> >
> > task_struct.cred doesn't actually have RCU semantics though, see
> > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > it would probably be more correct to remove the __rcu annotation?
>
> Yes, but get_task_cred() assumes it has RCU semantics...
Oh, huh. AFAICS get_task_cred() makes no sense semantically, and I
think it ought to be deleted.
There are the following users at the moment:
rdtgroup_task_write_permission() - uses it for acting on a task as
object, should be using objective credentials instead.
__cgroup1_procs_write() - same thing.
task_state() - should also be using objective credentials instead, you
wouldn't want a task to show up as belonging to root in there just
because it's in the middle of some overlayfs filesystem method or
something like that.
apparmor_getprocattr() - same thing as for task_state()
rpcauth_bind_root_cred() - should also be using objective credentials
instead, if init is in overlayfs or whatever, you probably wouldn't
want that to have an effect here
prepare_kernel_cred() - most callers pass NULL and don't hit this
codepath, and those that don't pass NULL use `current`, so there can
be no concurrent modification. Maybe this could be rewritten to
something like this:
if (daemon) {
BUG_ON(daemon != current);
old = get_current_cred();
} else {
...
}
or maybe it could just use the objective creds, it shouldn't matter
here, I think.
> To be honest I didn't know we have this helper.
Same here.
> Can't it race with revert_creds() in
> do_faccessat() ?
Yeah, I think you can probably trigger use-after-free reads with that.
I also remember seeing something fishy in Smack at some point that had
similar issues... smack_file_send_sigiotask() does
smk_of_task(smack_cred(tsk->cred)), which looks very wrong.
On 01/28, Jann Horn wrote:
>
> On Tue, Jan 28, 2020 at 12:48 PM Oleg Nesterov <[email protected]> wrote:
> > On 01/28, Jann Horn wrote:
> > > On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > > > task_struct.cred and task_struct.real_cred are annotated by __rcu,
> > >
> > > task_struct.cred doesn't actually have RCU semantics though, see
> > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > it would probably be more correct to remove the __rcu annotation?
> >
> > Yes, but get_task_cred() assumes it has RCU semantics...
>
> Oh, huh. AFAICS get_task_cred() makes no sense semantically, and I
> think it ought to be deleted.
Ah, sorry for noise Jann. Somehow I managed to missread this function
as if it uses task->cred. No, it reads ->real_cred so it is fine.
Oleg.
Amol Grover <[email protected]> wrote:
> task_struct.cred and task_struct.real_cred are annotated by __rcu,
> hence use rcu_access_pointer to access them.
>
> Fixes the following sparse errors:
> kernel/cred.c:144:9: error: incompatible types in comparison expression
> (different address spaces):
> kernel/cred.c:144:9: struct cred *
> kernel/cred.c:144:9: struct cred const [noderef] <asn:4> *
> kernel/cred.c:145:9: error: incompatible types in comparison expression
> (different address spaces):
> kernel/cred.c:145:9: struct cred *
> kernel/cred.c:145:9: struct cred const [noderef] <asn:4> *
>
> Signed-off-by: Amol Grover <[email protected]>
Reviewed-by: David Howells <[email protected]>
On Tue, Jan 28, 2020 at 10:30:19AM +0100, Jann Horn wrote:
> On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > task_struct.cred and task_struct.real_cred are annotated by __rcu,
>
> task_struct.cred doesn't actually have RCU semantics though, see
> commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> it would probably be more correct to remove the __rcu annotation?
>
Hi Jann,
I went through the commit you mentioned. If I understand it correctly,
->cred was not being accessed concurrently (via RCU), hence, a non_rcu
flag was introduced, which determined if the clean-up should wait for
RCU grace-periods or not. And since, the changes were 'thread local'
there was no need to wait for an entire RCU GP to elapse.
The commit too, as you said, mentions the removal of __rcu annotation.
However, simply removing the annotation won't work, as there are quite a
few instances where RCU primitives are used. Even get_current_cred()
uses RCU APIs to get a reference to ->cred. So, currently, maybe we
should continue to use RCU APIs and leave the __rcu annotation in?
(Until someone who takes it on himself to remove __rcu annotation and
fix all the instances). Does that sound good? Or do you want me to
remove __rcu annotation and get the process started?
Thanks
Amol
> > hence use rcu_access_pointer to access them.
On Tue, Jan 28, 2020 at 6:04 PM Amol Grover <[email protected]> wrote:
> On Tue, Jan 28, 2020 at 10:30:19AM +0100, Jann Horn wrote:
> > On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > > task_struct.cred and task_struct.real_cred are annotated by __rcu,
> >
> > task_struct.cred doesn't actually have RCU semantics though, see
> > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > it would probably be more correct to remove the __rcu annotation?
> >
>
> Hi Jann,
>
> I went through the commit you mentioned. If I understand it correctly,
> ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> flag was introduced, which determined if the clean-up should wait for
> RCU grace-periods or not. And since, the changes were 'thread local'
> there was no need to wait for an entire RCU GP to elapse.
Yeah.
> The commit too, as you said, mentions the removal of __rcu annotation.
> However, simply removing the annotation won't work, as there are quite a
> few instances where RCU primitives are used. Even get_current_cred()
> uses RCU APIs to get a reference to ->cred.
Luckily, there aren't too many places that directly access ->cred,
since luckily there are helper functions like get_current_cred() that
will do it for you. Grepping through the kernel, I see:
Places that need adjustment:
include/linux/cred.h: rcu_dereference_protected(current->cred, 1)
kernel/auditsc.c: * the only situations where tsk->cred may be
accessed without an rcu read lock.
kernel/auditsc.c: cred = rcu_dereference_check(tsk->cred, tsk ==
current || task_creation);
kernel/cred.c: rcu_assign_pointer(task->cred, new);
kernel/cred.c: rcu_assign_pointer(current->cred, new);
kernel/cred.c: rcu_assign_pointer(current->cred, old);
Places that already don't use RCU accessors:
drivers/virt/vboxguest/vboxguest_linux.c: if
(from_kuid(current_user_ns(), current->cred->uid) == 0)
kernel/cred.c: BUG_ON(cred == current->cred);
kernel/cred.c: kdebug("exit_creds(%u,%p,%p,{%d,%d})", tsk->pid,
tsk->real_cred, tsk->cred,
kernel/cred.c: atomic_read(&tsk->cred->usage),
kernel/cred.c: read_cred_subscribers(tsk->cred));
kernel/cred.c: cred = (struct cred *) tsk->cred;
kernel/cred.c: tsk->cred = NULL;
kernel/cred.c: old = task->cred;
kernel/cred.c: !p->cred->thread_keyring &&
kernel/cred.c: p->real_cred = get_cred(p->cred);
kernel/cred.c: get_cred(p->cred);
kernel/cred.c: alter_cred_subscribers(p->cred, 2);
kernel/cred.c: p->cred, atomic_read(&p->cred->usage),
kernel/cred.c: read_cred_subscribers(p->cred));
kernel/cred.c: atomic_inc(&p->cred->user->processes);
kernel/cred.c: p->cred = p->real_cred = get_cred(new);
kernel/cred.c: BUG_ON(task->cred != old);
kernel/cred.c: const struct cred *old = current->cred;
kernel/cred.c: * '->cred' pointer, not the '->real_cred' pointer that is
kernel/cred.c: const struct cred *override = current->cred;
kernel/cred.c: cred == tsk->cred ? "[eff]" : "");
kernel/cred.c: if (tsk->cred == tsk->real_cred) {
kernel/cred.c: if (unlikely(read_cred_subscribers(tsk->cred) < 2 ||
kernel/cred.c: creds_are_invalid(tsk->cred)))
kernel/cred.c: read_cred_subscribers(tsk->cred) < 1 ||
kernel/cred.c: creds_are_invalid(tsk->cred)))
kernel/cred.c: if (tsk->cred != tsk->real_cred)
kernel/cred.c: dump_invalid_creds(tsk->cred, "Effective", tsk);
kernel/cred.c: tsk->real_cred, tsk->cred,
kernel/cred.c: atomic_read(&tsk->cred->usage),
kernel/cred.c: read_cred_subscribers(tsk->cred));
kernel/fork.c: atomic_dec(&p->cred->user->processes);
security/security.c: lsm_early_cred((struct cred *) current->cred);
security/smack/smack_lsm.c: struct cred *cred = (struct cred *)
current->cred;
security/tomoyo/common.c: (!uid_eq(task->cred->uid,
GLOBAL_ROOT_UID) ||
security/tomoyo/common.c: !uid_eq(task->cred->euid,
GLOBAL_ROOT_UID)))
Places that don't use RCU and are broken:
security/smack/smack_lsm.c: struct smack_known *tkp =
smk_of_task(smack_cred(tsk->cred));
So actually, the number of places that already don't use RCU accessors
is much higher than the number of places that use them.
> So, currently, maybe we
> should continue to use RCU APIs and leave the __rcu annotation in?
> (Until someone who takes it on himself to remove __rcu annotation and
> fix all the instances). Does that sound good? Or do you want me to
> remove __rcu annotation and get the process started?
I don't think it's a good idea to add more uses of RCU APIs for
->cred; you shouldn't "fix" warnings by making the code more wrong.
If you want to fix this, I think it would be relatively easy to fix
this properly - as far as I can tell, there are only seven places that
you'll have to change, although you may have to split it up into three
patches.
On Tue, Jan 28, 2020 at 08:09:17PM +0100, Jann Horn wrote:
> On Tue, Jan 28, 2020 at 6:04 PM Amol Grover <[email protected]> wrote:
> > On Tue, Jan 28, 2020 at 10:30:19AM +0100, Jann Horn wrote:
> > > On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > > > task_struct.cred and task_struct.real_cred are annotated by __rcu,
> > >
> > > task_struct.cred doesn't actually have RCU semantics though, see
> > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > it would probably be more correct to remove the __rcu annotation?
> > >
> >
> > Hi Jann,
> >
> > I went through the commit you mentioned. If I understand it correctly,
> > ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> > flag was introduced, which determined if the clean-up should wait for
> > RCU grace-periods or not. And since, the changes were 'thread local'
> > there was no need to wait for an entire RCU GP to elapse.
>
> Yeah.
>
> > The commit too, as you said, mentions the removal of __rcu annotation.
> > However, simply removing the annotation won't work, as there are quite a
> > few instances where RCU primitives are used. Even get_current_cred()
> > uses RCU APIs to get a reference to ->cred.
>
> Luckily, there aren't too many places that directly access ->cred,
> since luckily there are helper functions like get_current_cred() that
> will do it for you. Grepping through the kernel, I see:
>
> Places that need adjustment:
>
> include/linux/cred.h: rcu_dereference_protected(current->cred, 1)
> kernel/auditsc.c: * the only situations where tsk->cred may be
> accessed without an rcu read lock.
> kernel/auditsc.c: cred = rcu_dereference_check(tsk->cred, tsk ==
> current || task_creation);
> kernel/cred.c: rcu_assign_pointer(task->cred, new);
> kernel/cred.c: rcu_assign_pointer(current->cred, new);
> kernel/cred.c: rcu_assign_pointer(current->cred, old);
>
>
> Places that already don't use RCU accessors:
>
> drivers/virt/vboxguest/vboxguest_linux.c: if
> (from_kuid(current_user_ns(), current->cred->uid) == 0)
> kernel/cred.c: BUG_ON(cred == current->cred);
> kernel/cred.c: kdebug("exit_creds(%u,%p,%p,{%d,%d})", tsk->pid,
> tsk->real_cred, tsk->cred,
> kernel/cred.c: atomic_read(&tsk->cred->usage),
> kernel/cred.c: read_cred_subscribers(tsk->cred));
> kernel/cred.c: cred = (struct cred *) tsk->cred;
> kernel/cred.c: tsk->cred = NULL;
> kernel/cred.c: old = task->cred;
> kernel/cred.c: !p->cred->thread_keyring &&
> kernel/cred.c: p->real_cred = get_cred(p->cred);
> kernel/cred.c: get_cred(p->cred);
> kernel/cred.c: alter_cred_subscribers(p->cred, 2);
> kernel/cred.c: p->cred, atomic_read(&p->cred->usage),
> kernel/cred.c: read_cred_subscribers(p->cred));
> kernel/cred.c: atomic_inc(&p->cred->user->processes);
> kernel/cred.c: p->cred = p->real_cred = get_cred(new);
> kernel/cred.c: BUG_ON(task->cred != old);
> kernel/cred.c: const struct cred *old = current->cred;
> kernel/cred.c: * '->cred' pointer, not the '->real_cred' pointer that is
> kernel/cred.c: const struct cred *override = current->cred;
> kernel/cred.c: cred == tsk->cred ? "[eff]" : "");
> kernel/cred.c: if (tsk->cred == tsk->real_cred) {
> kernel/cred.c: if (unlikely(read_cred_subscribers(tsk->cred) < 2 ||
> kernel/cred.c: creds_are_invalid(tsk->cred)))
> kernel/cred.c: read_cred_subscribers(tsk->cred) < 1 ||
> kernel/cred.c: creds_are_invalid(tsk->cred)))
> kernel/cred.c: if (tsk->cred != tsk->real_cred)
> kernel/cred.c: dump_invalid_creds(tsk->cred, "Effective", tsk);
> kernel/cred.c: tsk->real_cred, tsk->cred,
> kernel/cred.c: atomic_read(&tsk->cred->usage),
> kernel/cred.c: read_cred_subscribers(tsk->cred));
> kernel/fork.c: atomic_dec(&p->cred->user->processes);
> security/security.c: lsm_early_cred((struct cred *) current->cred);
> security/smack/smack_lsm.c: struct cred *cred = (struct cred *)
> current->cred;
> security/tomoyo/common.c: (!uid_eq(task->cred->uid,
> GLOBAL_ROOT_UID) ||
> security/tomoyo/common.c: !uid_eq(task->cred->euid,
> GLOBAL_ROOT_UID)))
>
>
> Places that don't use RCU and are broken:
>
> security/smack/smack_lsm.c: struct smack_known *tkp =
> smk_of_task(smack_cred(tsk->cred));
>
> So actually, the number of places that already don't use RCU accessors
> is much higher than the number of places that use them.
>
> > So, currently, maybe we
> > should continue to use RCU APIs and leave the __rcu annotation in?
> > (Until someone who takes it on himself to remove __rcu annotation and
> > fix all the instances). Does that sound good? Or do you want me to
> > remove __rcu annotation and get the process started?
>
> I don't think it's a good idea to add more uses of RCU APIs for
> ->cred; you shouldn't "fix" warnings by making the code more wrong.
>
> If you want to fix this, I think it would be relatively easy to fix
> this properly - as far as I can tell, there are only seven places that
> you'll have to change, although you may have to split it up into three
> patches.
Thank you for the detailed analysis. I'll try my best and send you a
patch. But before I start I want to make sure one thing. The changes
done by the commit you mentioned (which introduced non_rcu flag), should
be now reverted, right? Since, prior to the commit RCU semantics were
there and RCU was being used (which was unnecessary) and the fix merely,
removed these (unnecessary) RCU usages (with checks to either use them
or not, but now we actually don't use RCU for subjective credentials).
So, now what's left is the unused RCU code (which needs to be removed)
and the changes done in the temporary fix (which would be reverted since
we don't want to use RCU).
Thanks
Amol
On Wed, Jan 29, 2020 at 7:57 AM Amol Grover <[email protected]> wrote:
> On Tue, Jan 28, 2020 at 08:09:17PM +0100, Jann Horn wrote:
> > On Tue, Jan 28, 2020 at 6:04 PM Amol Grover <[email protected]> wrote:
> > > On Tue, Jan 28, 2020 at 10:30:19AM +0100, Jann Horn wrote:
> > > > On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > > > > task_struct.cred and task_struct.real_cred are annotated by __rcu,
> > > >
> > > > task_struct.cred doesn't actually have RCU semantics though, see
> > > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > > it would probably be more correct to remove the __rcu annotation?
> > >
> > > Hi Jann,
> > >
> > > I went through the commit you mentioned. If I understand it correctly,
> > > ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> > > flag was introduced, which determined if the clean-up should wait for
> > > RCU grace-periods or not. And since, the changes were 'thread local'
> > > there was no need to wait for an entire RCU GP to elapse.
> >
> > Yeah.
> >
> > > The commit too, as you said, mentions the removal of __rcu annotation.
> > > However, simply removing the annotation won't work, as there are quite a
> > > few instances where RCU primitives are used. Even get_current_cred()
> > > uses RCU APIs to get a reference to ->cred.
> >
> > Luckily, there aren't too many places that directly access ->cred,
> > since luckily there are helper functions like get_current_cred() that
> > will do it for you. Grepping through the kernel, I see:
[...]
> > So actually, the number of places that already don't use RCU accessors
> > is much higher than the number of places that use them.
> >
> > > So, currently, maybe we
> > > should continue to use RCU APIs and leave the __rcu annotation in?
> > > (Until someone who takes it on himself to remove __rcu annotation and
> > > fix all the instances). Does that sound good? Or do you want me to
> > > remove __rcu annotation and get the process started?
> >
> > I don't think it's a good idea to add more uses of RCU APIs for
> > ->cred; you shouldn't "fix" warnings by making the code more wrong.
> >
> > If you want to fix this, I think it would be relatively easy to fix
> > this properly - as far as I can tell, there are only seven places that
> > you'll have to change, although you may have to split it up into three
> > patches.
>
> Thank you for the detailed analysis. I'll try my best and send you a
> patch.
While you can CC me on that, I'm not a kernel maintainer; you should
send patches to the people who maintain the areas of kernel code that
you're modifying. (kernel/cred.c has no specific maintainer; for that
file, I'd probably try sending patches to Andrew Morton, Oleg
Nesterov, David Howells and Eric Biederman, as well as the
linux-kernel@ mailinglist.)
> But before I start I want to make sure one thing. The changes
> done by the commit you mentioned (which introduced non_rcu flag), should
> be now reverted, right?
No.
> Since, prior to the commit RCU semantics were
> there and RCU was being used (which was unnecessary) and the fix merely,
> removed these (unnecessary) RCU usages (with checks to either use them
> or not, but now we actually don't use RCU for subjective credentials).
>
> So, now what's left is the unused RCU code (which needs to be removed)
> and the changes done in the temporary fix (which would be reverted since
> we don't want to use RCU).
No. Instances of `struct cred` *can* still have an RCU-protected
lifetime; but only certain references to it have RCU semantics.
{task}->cred doesn't have RCU semantics, but {task}->real_cred does
have RCU semantics, and those two can point to the same object.
__rcu annotations mark that a *reference* has RCU semantics. Lack of
__rcu annotation means that the *reference* does not have RCU
semantics, but the object it points to can still have a RCU-protected
lifetime.
Jann Horn <[email protected]> wrote:
> > task_struct.cred and task_struct.real_cred are annotated by __rcu,
>
> task_struct.cred doesn't actually have RCU semantics though, see
> commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> it would probably be more correct to remove the __rcu annotation?
You're right, I think, there shouldn't be any need for __rcu on
task_struct::cred since it shouldn't be accessed on any task except current.
I've a feeling that there was something at the time, proc perhaps, but I don't
remember.
David
On Wed, Jan 29, 2020 at 03:14:56PM +0100, Jann Horn wrote:
> On Wed, Jan 29, 2020 at 7:57 AM Amol Grover <[email protected]> wrote:
> > On Tue, Jan 28, 2020 at 08:09:17PM +0100, Jann Horn wrote:
> > > On Tue, Jan 28, 2020 at 6:04 PM Amol Grover <[email protected]> wrote:
> > > > On Tue, Jan 28, 2020 at 10:30:19AM +0100, Jann Horn wrote:
> > > > > On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > > > > > task_struct.cred and task_struct.real_cred are annotated by __rcu,
> > > > >
> > > > > task_struct.cred doesn't actually have RCU semantics though, see
> > > > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > > > it would probably be more correct to remove the __rcu annotation?
> > > >
> > > > Hi Jann,
> > > >
> > > > I went through the commit you mentioned. If I understand it correctly,
> > > > ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> > > > flag was introduced, which determined if the clean-up should wait for
> > > > RCU grace-periods or not. And since, the changes were 'thread local'
> > > > there was no need to wait for an entire RCU GP to elapse.
> > >
> > > Yeah.
> > >
> > > > The commit too, as you said, mentions the removal of __rcu annotation.
> > > > However, simply removing the annotation won't work, as there are quite a
> > > > few instances where RCU primitives are used. Even get_current_cred()
> > > > uses RCU APIs to get a reference to ->cred.
> > >
> > > Luckily, there aren't too many places that directly access ->cred,
> > > since luckily there are helper functions like get_current_cred() that
> > > will do it for you. Grepping through the kernel, I see:
> [...]
> > > So actually, the number of places that already don't use RCU accessors
> > > is much higher than the number of places that use them.
> > >
> > > > So, currently, maybe we
> > > > should continue to use RCU APIs and leave the __rcu annotation in?
> > > > (Until someone who takes it on himself to remove __rcu annotation and
> > > > fix all the instances). Does that sound good? Or do you want me to
> > > > remove __rcu annotation and get the process started?
> > >
> > > I don't think it's a good idea to add more uses of RCU APIs for
> > > ->cred; you shouldn't "fix" warnings by making the code more wrong.
> > >
> > > If you want to fix this, I think it would be relatively easy to fix
> > > this properly - as far as I can tell, there are only seven places that
> > > you'll have to change, although you may have to split it up into three
> > > patches.
> >
> > Thank you for the detailed analysis. I'll try my best and send you a
> > patch.
Amol, Jann, if I understand the discussion correctly, objects ->cred
point (the subjective creds) are never (or never need to be) RCU-managed.
This makes sense in light of the commit Jann pointed out
(d7852fbd0f0423937fa287a598bfde188bb68c22).
How about the following diff as a starting point?
1. Remove all ->cred accessing happening through RCU primitive.
2. Remove __rcu from task_struct ->cred
3. Also I removed the whole non_rcu flag, and introduced a new put_cred_non_rcu() API
which places that task-synchronously use ->cred can overwrite. Callers
doing such accesses like access() can use this API instead.
I have only build tested the below diff and it is likely buggy but Amol you
can use it as a starting point, or we can discuss more on this thread.
---8<-----------------------
diff --git a/fs/open.c b/fs/open.c
index 8cdb2b6758678..bbdd1f352e4e3 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -374,25 +374,6 @@ long do_faccessat(int dfd, const char __user *filename, int mode)
override_cred->cap_permitted;
}
- /*
- * The new set of credentials can *only* be used in
- * task-synchronous circumstances, and does not need
- * RCU freeing, unless somebody then takes a separate
- * reference to it.
- *
- * NOTE! This is _only_ true because this credential
- * is used purely for override_creds() that installs
- * it as the subjective cred. Other threads will be
- * accessing ->real_cred, not the subjective cred.
- *
- * If somebody _does_ make a copy of this (using the
- * 'get_current_cred()' function), that will clear the
- * non_rcu field, because now that other user may be
- * expecting RCU freeing. But normal thread-synchronous
- * cred accesses will keep things non-RCY.
- */
- override_cred->non_rcu = 1;
-
old_cred = override_creds(override_cred);
retry:
res = user_path_at(dfd, filename, lookup_flags, &path);
@@ -436,7 +417,13 @@ long do_faccessat(int dfd, const char __user *filename, int mode)
}
out:
revert_creds(old_cred);
- put_cred(override_cred);
+
+ /*
+ * override_cred points to current task's ->cred and will not be used
+ * by anyone but the current task. Since we are done using it, remove it
+ * without waiting for a grace period.
+ */
+ put_cred_non_rcu(override_cred);
return res;
}
diff --git a/include/linux/cred.h b/include/linux/cred.h
index 18639c069263f..2f41a0fa75741 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -145,14 +145,10 @@ struct cred {
struct user_struct *user; /* real user ID subscription */
struct user_namespace *user_ns; /* user_ns the caps and keyrings are relative to. */
struct group_info *group_info; /* supplementary groups for euid/fsgid */
- /* RCU deletion */
- union {
- int non_rcu; /* Can we skip RCU deletion? */
- struct rcu_head rcu; /* RCU deletion hook */
- };
+ struct rcu_head rcu; /* RCU deletion hook if needed*/
} __randomize_layout;
-extern void __put_cred(struct cred *);
+extern void __put_cred(struct cred *, bool rcu);
extern void exit_creds(struct task_struct *);
extern int copy_creds(struct task_struct *, unsigned long);
extern const struct cred *get_task_cred(struct task_struct *);
@@ -250,7 +246,6 @@ static inline const struct cred *get_cred(const struct cred *cred)
if (!cred)
return cred;
validate_creds(cred);
- nonconst_cred->non_rcu = 0;
return get_new_cred(nonconst_cred);
}
@@ -262,7 +257,6 @@ static inline const struct cred *get_cred_rcu(const struct cred *cred)
if (!atomic_inc_not_zero(&nonconst_cred->usage))
return NULL;
validate_creds(cred);
- nonconst_cred->non_rcu = 0;
return cred;
}
@@ -270,8 +264,9 @@ static inline const struct cred *get_cred_rcu(const struct cred *cred)
* put_cred - Release a reference to a set of credentials
* @cred: The credentials to release
*
- * Release a reference to a set of credentials, deleting them when the last ref
- * is released. If %NULL is passed, nothing is done.
+ * Release a reference to a set of credentials, deleting them after an RCU
+ * grace period when the last ref is released.
+ * If %NULL is passed, nothing is done.
*
* This takes a const pointer to a set of credentials because the credentials
* on task_struct are attached by const pointers to prevent accidental
@@ -284,18 +279,35 @@ static inline void put_cred(const struct cred *_cred)
if (cred) {
validate_creds(cred);
if (atomic_dec_and_test(&(cred)->usage))
- __put_cred(cred);
+ __put_cred(cred, true);
+ }
+}
+
+/**
+ * put_cred - Release a reference to a set of credentials
+ * @cred: The credentials to release
+ *
+ * Same as put_cred() except that the freeing of the cred object is done
+ * immediately without waiting for RCU grace period. This is useful when
+ * using a task's temporary subjective credentials (task_struct::cred).
+ */
+static inline void put_cred_non_rcu(const struct cred *_cred)
+{
+ struct cred *cred = (struct cred *) _cred;
+
+ if (cred) {
+ validate_creds(cred);
+ if (atomic_dec_and_test(&(cred)->usage))
+ __put_cred(cred, false);
}
}
/**
* current_cred - Access the current task's subjective credentials
*
- * Access the subjective credentials of the current task. RCU-safe,
- * since nobody else can modify it.
+ * Access the subjective credentials of the current task.
*/
-#define current_cred() \
- rcu_dereference_protected(current->cred, 1)
+#define current_cred() READ_ONCE(current->cred)
/**
* current_real_cred - Access the current task's objective credentials
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 716ad1d8d95e1..227375311d992 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -879,8 +879,9 @@ struct task_struct {
/* Objective and real subjective task credentials (COW): */
const struct cred __rcu *real_cred;
- /* Effective (overridable) subjective task credentials (COW): */
- const struct cred __rcu *cred;
+ /* Effective (overridable) subjective task credentials (COW):
+ * Access is not managed by RCU and is used task-synchronously */
+ const struct cred *cred;
#ifdef CONFIG_KEYS
/* Cached requested key. */
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 4effe01ebbe2b..2e70a73646d53 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -430,24 +430,19 @@ static int audit_field_compare(struct task_struct *tsk,
/* Determine if any context name data matches a rule's watch data */
/* Compare a task_struct with an audit_rule. Return 1 on match, 0
* otherwise.
- *
- * If task_creation is true, this is an explicit indication that we are
- * filtering a task rule at task creation time. This and tsk == current are
- * the only situations where tsk->cred may be accessed without an rcu read lock.
*/
static int audit_filter_rules(struct task_struct *tsk,
struct audit_krule *rule,
struct audit_context *ctx,
struct audit_names *name,
- enum audit_state *state,
- bool task_creation)
+ enum audit_state *state)
{
const struct cred *cred;
int i, need_sid = 1;
u32 sid;
unsigned int sessionid;
- cred = rcu_dereference_check(tsk->cred, tsk == current || task_creation);
+ cred = READ_ONCE(tsk->cred);
for (i = 0; i < rule->field_count; i++) {
struct audit_field *f = &rule->fields[i];
@@ -745,7 +740,7 @@ static enum audit_state audit_filter_task(struct task_struct *tsk, char **key)
rcu_read_lock();
list_for_each_entry_rcu(e, &audit_filter_list[AUDIT_FILTER_TASK], list) {
if (audit_filter_rules(tsk, &e->rule, NULL, NULL,
- &state, true)) {
+ &state)) {
if (state == AUDIT_RECORD_CONTEXT)
*key = kstrdup(e->rule.filterkey, GFP_ATOMIC);
rcu_read_unlock();
@@ -791,7 +786,7 @@ static enum audit_state audit_filter_syscall(struct task_struct *tsk,
list_for_each_entry_rcu(e, list, list) {
if (audit_in_mask(&e->rule, ctx->major) &&
audit_filter_rules(tsk, &e->rule, ctx, NULL,
- &state, false)) {
+ &state)) {
rcu_read_unlock();
ctx->current_state = state;
return state;
@@ -815,7 +810,7 @@ static int audit_filter_inode_name(struct task_struct *tsk,
list_for_each_entry_rcu(e, list, list) {
if (audit_in_mask(&e->rule, ctx->major) &&
- audit_filter_rules(tsk, &e->rule, ctx, n, &state, false)) {
+ audit_filter_rules(tsk, &e->rule, ctx, n, &state)) {
ctx->current_state = state;
return 1;
}
diff --git a/kernel/cred.c b/kernel/cred.c
index 809a985b17934..120258b9d87df 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -129,7 +129,7 @@ static void put_cred_rcu(struct rcu_head *rcu)
*
* Destroy a set of credentials on which no references remain.
*/
-void __put_cred(struct cred *cred)
+void __put_cred(struct cred *cred, bool rcu)
{
kdebug("__put_cred(%p{%d,%d})", cred,
atomic_read(&cred->usage),
@@ -144,10 +144,10 @@ void __put_cred(struct cred *cred)
BUG_ON(cred == current->cred);
BUG_ON(cred == current->real_cred);
- if (cred->non_rcu)
- put_cred_rcu(&cred->rcu);
- else
+ if (rcu)
call_rcu(&cred->rcu, put_cred_rcu);
+ else
+ put_cred_rcu(&cred->rcu);
}
EXPORT_SYMBOL(__put_cred);
@@ -264,7 +264,6 @@ struct cred *prepare_creds(void)
old = task->cred;
memcpy(new, old, sizeof(struct cred));
- new->non_rcu = 0;
atomic_set(&new->usage, 1);
set_cred_subscribers(new, 0);
get_group_info(new->group_info);
@@ -485,7 +484,7 @@ int commit_creds(struct cred *new)
if (new->user != old->user)
atomic_inc(&new->user->processes);
rcu_assign_pointer(task->real_cred, new);
- rcu_assign_pointer(task->cred, new);
+ WRITE_ONCE(task->cred, new);
if (new->user != old->user)
atomic_dec(&old->user->processes);
alter_cred_subscribers(old, -2);
@@ -549,20 +548,9 @@ const struct cred *override_creds(const struct cred *new)
validate_creds(old);
validate_creds(new);
- /*
- * NOTE! This uses 'get_new_cred()' rather than 'get_cred()'.
- *
- * That means that we do not clear the 'non_rcu' flag, since
- * we are only installing the cred into the thread-synchronous
- * '->cred' pointer, not the '->real_cred' pointer that is
- * visible to other threads under RCU.
- *
- * Also note that we did validate_creds() manually, not depending
- * on the validation in 'get_cred()'.
- */
- get_new_cred((struct cred *)new);
+ get_cred(new);
alter_cred_subscribers(new, 1);
- rcu_assign_pointer(current->cred, new);
+ WRITE_ONCE(current->cred, new);
alter_cred_subscribers(old, -1);
kdebug("override_creds() = %p{%d,%d}", old,
@@ -590,7 +578,7 @@ void revert_creds(const struct cred *old)
validate_creds(old);
validate_creds(override);
alter_cred_subscribers(old, 1);
- rcu_assign_pointer(current->cred, old);
+ WRITE_ONCE(current->cred, old);
alter_cred_subscribers(override, -1);
put_cred(override);
}
@@ -697,7 +685,6 @@ struct cred *prepare_kernel_cred(struct task_struct *daemon)
validate_creds(old);
*new = *old;
- new->non_rcu = 0;
atomic_set(&new->usage, 1);
set_cred_subscribers(new, 0);
get_uid(new->user);
On Thu, Feb 6, 2020 at 2:32 AM Joel Fernandes <[email protected]> wrote:
> On Wed, Jan 29, 2020 at 03:14:56PM +0100, Jann Horn wrote:
> > On Wed, Jan 29, 2020 at 7:57 AM Amol Grover <[email protected]> wrote:
> > > On Tue, Jan 28, 2020 at 08:09:17PM +0100, Jann Horn wrote:
> > > > On Tue, Jan 28, 2020 at 6:04 PM Amol Grover <[email protected]> wrote:
> > > > > On Tue, Jan 28, 2020 at 10:30:19AM +0100, Jann Horn wrote:
> > > > > > On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > > > > > > task_struct.cred and task_struct.real_cred are annotated by __rcu,
> > > > > >
> > > > > > task_struct.cred doesn't actually have RCU semantics though, see
> > > > > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > > > > it would probably be more correct to remove the __rcu annotation?
> > > > >
> > > > > Hi Jann,
> > > > >
> > > > > I went through the commit you mentioned. If I understand it correctly,
> > > > > ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> > > > > flag was introduced, which determined if the clean-up should wait for
> > > > > RCU grace-periods or not. And since, the changes were 'thread local'
> > > > > there was no need to wait for an entire RCU GP to elapse.
> > > >
> > > > Yeah.
> > > >
> > > > > The commit too, as you said, mentions the removal of __rcu annotation.
> > > > > However, simply removing the annotation won't work, as there are quite a
> > > > > few instances where RCU primitives are used. Even get_current_cred()
> > > > > uses RCU APIs to get a reference to ->cred.
> > > >
> > > > Luckily, there aren't too many places that directly access ->cred,
> > > > since luckily there are helper functions like get_current_cred() that
> > > > will do it for you. Grepping through the kernel, I see:
> > [...]
> > > > So actually, the number of places that already don't use RCU accessors
> > > > is much higher than the number of places that use them.
> > > >
> > > > > So, currently, maybe we
> > > > > should continue to use RCU APIs and leave the __rcu annotation in?
> > > > > (Until someone who takes it on himself to remove __rcu annotation and
> > > > > fix all the instances). Does that sound good? Or do you want me to
> > > > > remove __rcu annotation and get the process started?
> > > >
> > > > I don't think it's a good idea to add more uses of RCU APIs for
> > > > ->cred; you shouldn't "fix" warnings by making the code more wrong.
> > > >
> > > > If you want to fix this, I think it would be relatively easy to fix
> > > > this properly - as far as I can tell, there are only seven places that
> > > > you'll have to change, although you may have to split it up into three
> > > > patches.
> > >
> > > Thank you for the detailed analysis. I'll try my best and send you a
> > > patch.
>
> Amol, Jann, if I understand the discussion correctly, objects ->cred
> point (the subjective creds) are never (or never need to be) RCU-managed.
> This makes sense in light of the commit Jann pointed out
> (d7852fbd0f0423937fa287a598bfde188bb68c22).
>
> How about the following diff as a starting point?
>
> 1. Remove all ->cred accessing happening through RCU primitive.
Sounds good.
> 2. Remove __rcu from task_struct ->cred
Sounds good.
> 3. Also I removed the whole non_rcu flag, and introduced a new put_cred_non_rcu() API
> which places that task-synchronously use ->cred can overwrite. Callers
> doing such accesses like access() can use this API instead.
That's wrong, don't do that.
->cred is a reference without RCU semantics, ->real_cred is a
reference with RCU semantics. If there have never been any references
with RCU semantics to a specific instance of struct cred, then that
instance can indeed be freed without an RCU grace period. But it would
be possible for some filesystem code to take a reference to
current->cred, and assign it to some pointer with RCU semantics
somewhere, then drop that reference with put_cred() immediately before
you reach put_cred_non_rcu(); with the result that despite using
put_cred(), the other side doesn't get RCU semantics.
Just leave the whole ->non_rcu thing exactly as it was.
On Wed, Feb 05, 2020 at 08:32:51PM -0500, Joel Fernandes wrote:
> On Wed, Jan 29, 2020 at 03:14:56PM +0100, Jann Horn wrote:
> > On Wed, Jan 29, 2020 at 7:57 AM Amol Grover <[email protected]> wrote:
> > > On Tue, Jan 28, 2020 at 08:09:17PM +0100, Jann Horn wrote:
> > > > On Tue, Jan 28, 2020 at 6:04 PM Amol Grover <[email protected]> wrote:
> > > > > On Tue, Jan 28, 2020 at 10:30:19AM +0100, Jann Horn wrote:
> > > > > > On Tue, Jan 28, 2020 at 8:28 AM Amol Grover <[email protected]> wrote:
> > > > > > > task_struct.cred and task_struct.real_cred are annotated by __rcu,
> > > > > >
> > > > > > task_struct.cred doesn't actually have RCU semantics though, see
> > > > > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > > > > it would probably be more correct to remove the __rcu annotation?
> > > > >
> > > > > Hi Jann,
> > > > >
> > > > > I went through the commit you mentioned. If I understand it correctly,
> > > > > ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> > > > > flag was introduced, which determined if the clean-up should wait for
> > > > > RCU grace-periods or not. And since, the changes were 'thread local'
> > > > > there was no need to wait for an entire RCU GP to elapse.
> > > >
> > > > Yeah.
> > > >
> > > > > The commit too, as you said, mentions the removal of __rcu annotation.
> > > > > However, simply removing the annotation won't work, as there are quite a
> > > > > few instances where RCU primitives are used. Even get_current_cred()
> > > > > uses RCU APIs to get a reference to ->cred.
> > > >
> > > > Luckily, there aren't too many places that directly access ->cred,
> > > > since luckily there are helper functions like get_current_cred() that
> > > > will do it for you. Grepping through the kernel, I see:
> > [...]
> > > > So actually, the number of places that already don't use RCU accessors
> > > > is much higher than the number of places that use them.
> > > >
> > > > > So, currently, maybe we
> > > > > should continue to use RCU APIs and leave the __rcu annotation in?
> > > > > (Until someone who takes it on himself to remove __rcu annotation and
> > > > > fix all the instances). Does that sound good? Or do you want me to
> > > > > remove __rcu annotation and get the process started?
> > > >
> > > > I don't think it's a good idea to add more uses of RCU APIs for
> > > > ->cred; you shouldn't "fix" warnings by making the code more wrong.
> > > >
> > > > If you want to fix this, I think it would be relatively easy to fix
> > > > this properly - as far as I can tell, there are only seven places that
> > > > you'll have to change, although you may have to split it up into three
> > > > patches.
> > >
> > > Thank you for the detailed analysis. I'll try my best and send you a
> > > patch.
>
> Amol, Jann, if I understand the discussion correctly, objects ->cred
> point (the subjective creds) are never (or never need to be) RCU-managed.
> This makes sense in light of the commit Jann pointed out
> (d7852fbd0f0423937fa287a598bfde188bb68c22).
>
> How about the following diff as a starting point?
>
> 1. Remove all ->cred accessing happening through RCU primitive.
> 2. Remove __rcu from task_struct ->cred
> 3. Also I removed the whole non_rcu flag, and introduced a new put_cred_non_rcu() API
> which places that task-synchronously use ->cred can overwrite. Callers
> doing such accesses like access() can use this API instead.
>
> I have only build tested the below diff and it is likely buggy but Amol you
> can use it as a starting point, or we can discuss more on this thread.
>
Thank you for starting this Joel! This will make our lives easier! I'll
go through it once and get back to Jann's latest reply.
Thanks
Amol
On Thu, Feb 06, 2020 at 12:28:42PM +0100, Jann Horn wrote:
[snip]
> > > > > > > task_struct.cred doesn't actually have RCU semantics though, see
> > > > > > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > > > > > it would probably be more correct to remove the __rcu annotation?
> > > > > >
> > > > > > Hi Jann,
> > > > > >
> > > > > > I went through the commit you mentioned. If I understand it correctly,
> > > > > > ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> > > > > > flag was introduced, which determined if the clean-up should wait for
> > > > > > RCU grace-periods or not. And since, the changes were 'thread local'
> > > > > > there was no need to wait for an entire RCU GP to elapse.
> > > > >
> > > > > Yeah.
> > > > >
> > > > > > The commit too, as you said, mentions the removal of __rcu annotation.
> > > > > > However, simply removing the annotation won't work, as there are quite a
> > > > > > few instances where RCU primitives are used. Even get_current_cred()
> > > > > > uses RCU APIs to get a reference to ->cred.
> > > > >
> > > > > Luckily, there aren't too many places that directly access ->cred,
> > > > > since luckily there are helper functions like get_current_cred() that
> > > > > will do it for you. Grepping through the kernel, I see:
> > > [...]
> > > > > So actually, the number of places that already don't use RCU accessors
> > > > > is much higher than the number of places that use them.
> > > > >
> > > > > > So, currently, maybe we
> > > > > > should continue to use RCU APIs and leave the __rcu annotation in?
> > > > > > (Until someone who takes it on himself to remove __rcu annotation and
> > > > > > fix all the instances). Does that sound good? Or do you want me to
> > > > > > remove __rcu annotation and get the process started?
> > > > >
> > > > > I don't think it's a good idea to add more uses of RCU APIs for
> > > > > ->cred; you shouldn't "fix" warnings by making the code more wrong.
> > > > >
> > > > > If you want to fix this, I think it would be relatively easy to fix
> > > > > this properly - as far as I can tell, there are only seven places that
> > > > > you'll have to change, although you may have to split it up into three
> > > > > patches.
> > > >
> > > > Thank you for the detailed analysis. I'll try my best and send you a
> > > > patch.
> >
> > Amol, Jann, if I understand the discussion correctly, objects ->cred
> > point (the subjective creds) are never (or never need to be) RCU-managed.
> > This makes sense in light of the commit Jann pointed out
> > (d7852fbd0f0423937fa287a598bfde188bb68c22).
> >
> > How about the following diff as a starting point?
> >
> > 1. Remove all ->cred accessing happening through RCU primitive.
>
> Sounds good.
>
> > 2. Remove __rcu from task_struct ->cred
>
> Sounds good.
>
> > 3. Also I removed the whole non_rcu flag, and introduced a new put_cred_non_rcu() API
> > which places that task-synchronously use ->cred can overwrite. Callers
> > doing such accesses like access() can use this API instead.
>
> That's wrong, don't do that.
>
> ->cred is a reference without RCU semantics, ->real_cred is a
> reference with RCU semantics. If there have never been any references
> with RCU semantics to a specific instance of struct cred, then that
> instance can indeed be freed without an RCU grace period. But it would
> be possible for some filesystem code to take a reference to
> current->cred, and assign it to some pointer with RCU semantics
> somewhere, then drop that reference with put_cred() immediately before
> you reach put_cred_non_rcu(); with the result that despite using
> put_cred(), the other side doesn't get RCU semantics.
>
> Just leave the whole ->non_rcu thing exactly as it was.
Can you point to an example in the kernel that actually uses ->cred this way?
I'm just curious. That is, reads task's ->cred pointer, and assigns it to an
RCU managed pointer?
I think such an example would be the point that the commit you mentioned
addresses. The commit basically says "as long as nobody does get_cred() on
the task_struct ->cred, we are good, but if somebody does do it, then we have
to deferred-free it". But I could not find such an example.
That said, I agree the removal of non_rcu can be considered out of scope for
this patch.
thanks,
- Joel
On Thu, Feb 6, 2020 at 5:49 PM Joel Fernandes <[email protected]> wrote:
> On Thu, Feb 06, 2020 at 12:28:42PM +0100, Jann Horn wrote:
> [snip]
> > > > > > > > task_struct.cred doesn't actually have RCU semantics though, see
> > > > > > > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > > > > > > it would probably be more correct to remove the __rcu annotation?
> > > > > > >
> > > > > > > Hi Jann,
> > > > > > >
> > > > > > > I went through the commit you mentioned. If I understand it correctly,
> > > > > > > ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> > > > > > > flag was introduced, which determined if the clean-up should wait for
> > > > > > > RCU grace-periods or not. And since, the changes were 'thread local'
> > > > > > > there was no need to wait for an entire RCU GP to elapse.
> > > > > >
> > > > > > Yeah.
> > > > > >
> > > > > > > The commit too, as you said, mentions the removal of __rcu annotation.
> > > > > > > However, simply removing the annotation won't work, as there are quite a
> > > > > > > few instances where RCU primitives are used. Even get_current_cred()
> > > > > > > uses RCU APIs to get a reference to ->cred.
> > > > > >
> > > > > > Luckily, there aren't too many places that directly access ->cred,
> > > > > > since luckily there are helper functions like get_current_cred() that
> > > > > > will do it for you. Grepping through the kernel, I see:
> > > > [...]
> > > > > > So actually, the number of places that already don't use RCU accessors
> > > > > > is much higher than the number of places that use them.
> > > > > >
> > > > > > > So, currently, maybe we
> > > > > > > should continue to use RCU APIs and leave the __rcu annotation in?
> > > > > > > (Until someone who takes it on himself to remove __rcu annotation and
> > > > > > > fix all the instances). Does that sound good? Or do you want me to
> > > > > > > remove __rcu annotation and get the process started?
> > > > > >
> > > > > > I don't think it's a good idea to add more uses of RCU APIs for
> > > > > > ->cred; you shouldn't "fix" warnings by making the code more wrong.
> > > > > >
> > > > > > If you want to fix this, I think it would be relatively easy to fix
> > > > > > this properly - as far as I can tell, there are only seven places that
> > > > > > you'll have to change, although you may have to split it up into three
> > > > > > patches.
> > > > >
> > > > > Thank you for the detailed analysis. I'll try my best and send you a
> > > > > patch.
> > >
> > > Amol, Jann, if I understand the discussion correctly, objects ->cred
> > > point (the subjective creds) are never (or never need to be) RCU-managed.
> > > This makes sense in light of the commit Jann pointed out
> > > (d7852fbd0f0423937fa287a598bfde188bb68c22).
[...]
> > > 3. Also I removed the whole non_rcu flag, and introduced a new put_cred_non_rcu() API
> > > which places that task-synchronously use ->cred can overwrite. Callers
> > > doing such accesses like access() can use this API instead.
> >
> > That's wrong, don't do that.
> >
> > ->cred is a reference without RCU semantics, ->real_cred is a
> > reference with RCU semantics. If there have never been any references
> > with RCU semantics to a specific instance of struct cred, then that
> > instance can indeed be freed without an RCU grace period. But it would
> > be possible for some filesystem code to take a reference to
> > current->cred, and assign it to some pointer with RCU semantics
> > somewhere, then drop that reference with put_cred() immediately before
> > you reach put_cred_non_rcu(); with the result that despite using
> > put_cred(), the other side doesn't get RCU semantics.
> >
> > Just leave the whole ->non_rcu thing exactly as it was.
>
> Can you point to an example in the kernel that actually uses ->cred this way?
> I'm just curious. That is, reads task's ->cred pointer, and assigns it to an
> RCU managed pointer?
I'm almost sure that there are no such cases at the moment. However,
from a maintainability standpoint, I'm still very twitchy about this
change; the current API encapsulates the RCU weirdness in the standard
helper functions, but with your proposal, suddenly taking f_cred from
somewhere and using it as a new task's subjective creds, or something
like that, would be unsafe.
> I think such an example would be the point that the commit you mentioned
> addresses. The commit basically says "as long as nobody does get_cred() on
> the task_struct ->cred, we are good, but if somebody does do it, then we have
> to deferred-free it". But I could not find such an example.
>
> That said, I agree the removal of non_rcu can be considered out of scope for
> this patch.
On Thu, Feb 06, 2020 at 06:15:56PM +0100, Jann Horn wrote:
> On Thu, Feb 6, 2020 at 5:49 PM Joel Fernandes <[email protected]> wrote:
> > On Thu, Feb 06, 2020 at 12:28:42PM +0100, Jann Horn wrote:
> > [snip]
> > > > > > > > > task_struct.cred doesn't actually have RCU semantics though, see
> > > > > > > > > commit d7852fbd0f0423937fa287a598bfde188bb68c22. For task_struct.cred,
> > > > > > > > > it would probably be more correct to remove the __rcu annotation?
> > > > > > > >
> > > > > > > > Hi Jann,
> > > > > > > >
> > > > > > > > I went through the commit you mentioned. If I understand it correctly,
> > > > > > > > ->cred was not being accessed concurrently (via RCU), hence, a non_rcu
> > > > > > > > flag was introduced, which determined if the clean-up should wait for
> > > > > > > > RCU grace-periods or not. And since, the changes were 'thread local'
> > > > > > > > there was no need to wait for an entire RCU GP to elapse.
> > > > > > >
> > > > > > > Yeah.
> > > > > > >
> > > > > > > > The commit too, as you said, mentions the removal of __rcu annotation.
> > > > > > > > However, simply removing the annotation won't work, as there are quite a
> > > > > > > > few instances where RCU primitives are used. Even get_current_cred()
> > > > > > > > uses RCU APIs to get a reference to ->cred.
> > > > > > >
> > > > > > > Luckily, there aren't too many places that directly access ->cred,
> > > > > > > since luckily there are helper functions like get_current_cred() that
> > > > > > > will do it for you. Grepping through the kernel, I see:
> > > > > [...]
> > > > > > > So actually, the number of places that already don't use RCU accessors
> > > > > > > is much higher than the number of places that use them.
> > > > > > >
> > > > > > > > So, currently, maybe we
> > > > > > > > should continue to use RCU APIs and leave the __rcu annotation in?
> > > > > > > > (Until someone who takes it on himself to remove __rcu annotation and
> > > > > > > > fix all the instances). Does that sound good? Or do you want me to
> > > > > > > > remove __rcu annotation and get the process started?
> > > > > > >
> > > > > > > I don't think it's a good idea to add more uses of RCU APIs for
> > > > > > > ->cred; you shouldn't "fix" warnings by making the code more wrong.
> > > > > > >
> > > > > > > If you want to fix this, I think it would be relatively easy to fix
> > > > > > > this properly - as far as I can tell, there are only seven places that
> > > > > > > you'll have to change, although you may have to split it up into three
> > > > > > > patches.
> > > > > >
> > > > > > Thank you for the detailed analysis. I'll try my best and send you a
> > > > > > patch.
> > > >
> > > > Amol, Jann, if I understand the discussion correctly, objects ->cred
> > > > point (the subjective creds) are never (or never need to be) RCU-managed.
> > > > This makes sense in light of the commit Jann pointed out
> > > > (d7852fbd0f0423937fa287a598bfde188bb68c22).
> [...]
> > > > 3. Also I removed the whole non_rcu flag, and introduced a new put_cred_non_rcu() API
> > > > which places that task-synchronously use ->cred can overwrite. Callers
> > > > doing such accesses like access() can use this API instead.
> > >
> > > That's wrong, don't do that.
> > >
> > > ->cred is a reference without RCU semantics, ->real_cred is a
> > > reference with RCU semantics. If there have never been any references
> > > with RCU semantics to a specific instance of struct cred, then that
> > > instance can indeed be freed without an RCU grace period. But it would
> > > be possible for some filesystem code to take a reference to
> > > current->cred, and assign it to some pointer with RCU semantics
> > > somewhere, then drop that reference with put_cred() immediately before
> > > you reach put_cred_non_rcu(); with the result that despite using
> > > put_cred(), the other side doesn't get RCU semantics.
> > >
> > > Just leave the whole ->non_rcu thing exactly as it was.
> >
> > Can you point to an example in the kernel that actually uses ->cred this way?
> > I'm just curious. That is, reads task's ->cred pointer, and assigns it to an
> > RCU managed pointer?
>
> I'm almost sure that there are no such cases at the moment. However,
> from a maintainability standpoint, I'm still very twitchy about this
> change; the current API encapsulates the RCU weirdness in the standard
> helper functions, but with your proposal, suddenly taking f_cred from
> somewhere and using it as a new task's subjective creds, or something
> like that, would be unsafe.
I agree with you. I talked to Amol and he will remove that part of the diff
when he sends patches. I believe he needs to also split into separate patches
as needed.
thanks,
- Joel