2023-12-06 14:58:03

by Luis Henriques

[permalink] [raw]
Subject: [RFC PATCH] keys: flush work when accessing /proc/key-users

Make sure the garbage collector has been run before cycling through all
the user keys.

Signed-off-by: Luis Henriques <[email protected]>
---
Hi!

This patch is mostly for getting some feedback on how to fix an fstest
failing for ext4/fscrypt (generic/581). Basically, the test relies on the
data read from /proc/key-users to be up-to-date regarding the number of
keys a given user currently has. However, this file can't be trusted
because it races against the keys GC.

Using flush_work() seems to work (I can't reproduce the failure), but it
may be overkill. Or simply not acceptable. Maybe, as Eric suggested
elsewhere [1], there could be a synchronous key_put/revoke/invalidate/...,
which would wait for the key GC to do its work, although that probably
would require some more code re-work.

[1] https://lore.kernel.org/all/[email protected]/

security/keys/gc.c | 6 ++++++
security/keys/internal.h | 1 +
security/keys/proc.c | 1 +
3 files changed, 8 insertions(+)

diff --git a/security/keys/gc.c b/security/keys/gc.c
index 3c90807476eb..57b5a54490a0 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -44,6 +44,12 @@ struct key_type key_type_dead = {
.name = ".dead",
};

+void key_flush_gc(void)
+{
+ kenter("");
+ flush_work(&key_gc_work);
+}
+
/*
* Schedule a garbage collection run.
* - time precision isn't particularly important
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 471cf36dedc0..fee1d0025d96 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -170,6 +170,7 @@ extern void keyring_restriction_gc(struct key *keyring,
extern void key_schedule_gc(time64_t gc_at);
extern void key_schedule_gc_links(void);
extern void key_gc_keytype(struct key_type *ktype);
+extern void key_flush_gc(void);

extern int key_task_permission(const key_ref_t key_ref,
const struct cred *cred,
diff --git a/security/keys/proc.c b/security/keys/proc.c
index d0cde6685627..2837e00a240a 100644
--- a/security/keys/proc.c
+++ b/security/keys/proc.c
@@ -277,6 +277,7 @@ static void *proc_key_users_start(struct seq_file *p, loff_t *_pos)
struct rb_node *_p;
loff_t pos = *_pos;

+ key_flush_gc();
spin_lock(&key_user_lock);

_p = key_user_first(seq_user_ns(p), &key_user_tree);


2023-12-06 16:13:34

by David Howells

[permalink] [raw]
Subject: Re: [RFC PATCH] keys: flush work when accessing /proc/key-users

Luis Henriques <[email protected]> wrote:

> This patch is mostly for getting some feedback on how to fix an fstest
> failing for ext4/fscrypt (generic/581). Basically, the test relies on the
> data read from /proc/key-users to be up-to-date regarding the number of
> keys a given user currently has. However, this file can't be trusted
> because it races against the keys GC.

Unfortunately, I don't think your patch helps. If the GC hasn't started yet,
it won't achieve anything and the GC can still be triggered at any time after
the flush and thus race.

What is it you're actually trying to determine?

And is it only for doing the test?

David

2023-12-06 17:56:18

by Luis Henriques

[permalink] [raw]
Subject: Re: [RFC PATCH] keys: flush work when accessing /proc/key-users

David Howells <[email protected]> writes:

> Luis Henriques <[email protected]> wrote:
>
>> This patch is mostly for getting some feedback on how to fix an fstest
>> failing for ext4/fscrypt (generic/581). Basically, the test relies on the
>> data read from /proc/key-users to be up-to-date regarding the number of
>> keys a given user currently has. However, this file can't be trusted
>> because it races against the keys GC.
>
> Unfortunately, I don't think your patch helps. If the GC hasn't started yet,
> it won't achieve anything and the GC can still be triggered at any time after
> the flush and thus race.
>
> What is it you're actually trying to determine?
>
> And is it only for doing the test?

OK, let me try to describe what the generic/581 fstest does.

After doing a few fscrypt related things, which involve adding and
removing keys, the test will:

1. Get the number of keys for user 'fsgqa' from '/proc/key-users'
2. Set the maxkeys to 5 + <keys the user had in 1.>
3. In a loop, try to add 6 new keys, to confirm the last one will fail

Most of the time the test passes, i.e., the 6th key fails to be added.
However, if, for example, the test is executed in a loop, it is possible
to have it fail because the 6th key was successfully added. The reason
is, obviously, because the test is racy: the GC can kick-in too late,
after the maxkeys is set in step 2.

So, this is mostly an issue with the test itself, but I couldn't figure
out a way to work around it.

Another solution I thought but I didn't look too deep into was to try to
move the

atomic_dec(&key->user->nkeys);

out of the GC, in function key_gc_unused_keys(). Decrementing it
synchronously in key_put() (or whatever other functions could schedule GC)
should solve the problem with this test. But as I said I didn't went too
far looking into that, so I don't really know if that's feasible.

Finally, the test itself could be hacked so that the loop in step 3. would
update the maxkeys value if needed, i.e. if the current number of keys for
the user isn't what was expected in each loop iteration. But even that
would still be racy.

Cheers,
--
Luís

2023-12-07 02:43:37

by Eric Biggers

[permalink] [raw]
Subject: Re: [RFC PATCH] keys: flush work when accessing /proc/key-users

On Wed, Dec 06, 2023 at 05:55:52PM +0000, Luis Henriques wrote:
> David Howells <[email protected]> writes:
>
> > Luis Henriques <[email protected]> wrote:
> >
> >> This patch is mostly for getting some feedback on how to fix an fstest
> >> failing for ext4/fscrypt (generic/581). Basically, the test relies on the
> >> data read from /proc/key-users to be up-to-date regarding the number of
> >> keys a given user currently has. However, this file can't be trusted
> >> because it races against the keys GC.
> >
> > Unfortunately, I don't think your patch helps. If the GC hasn't started yet,
> > it won't achieve anything and the GC can still be triggered at any time after
> > the flush and thus race.
> >
> > What is it you're actually trying to determine?
> >
> > And is it only for doing the test?
>
> OK, let me try to describe what the generic/581 fstest does.
>
> After doing a few fscrypt related things, which involve adding and
> removing keys, the test will:
>
> 1. Get the number of keys for user 'fsgqa' from '/proc/key-users'
> 2. Set the maxkeys to 5 + <keys the user had in 1.>
> 3. In a loop, try to add 6 new keys, to confirm the last one will fail
>
> Most of the time the test passes, i.e., the 6th key fails to be added.
> However, if, for example, the test is executed in a loop, it is possible
> to have it fail because the 6th key was successfully added. The reason
> is, obviously, because the test is racy: the GC can kick-in too late,
> after the maxkeys is set in step 2.
>
> So, this is mostly an issue with the test itself, but I couldn't figure
> out a way to work around it.
>
> Another solution I thought but I didn't look too deep into was to try to
> move the
>
> atomic_dec(&key->user->nkeys);
>
> out of the GC, in function key_gc_unused_keys(). Decrementing it
> synchronously in key_put() (or whatever other functions could schedule GC)
> should solve the problem with this test. But as I said I didn't went too
> far looking into that, so I don't really know if that's feasible.
>
> Finally, the test itself could be hacked so that the loop in step 3. would
> update the maxkeys value if needed, i.e. if the current number of keys for
> the user isn't what was expected in each loop iteration. But even that
> would still be racy.

If there was a function that fully and synchronously releases a key's quota,
fs/crypto/ could call it before unlinking the key. key_payload_reserve(key, 0)
almost does the trick, but it would release the key's bytes, not the key itself.

However, that would only fix the flakiness of the key quota for fs/crypto/, not
for other users of the keyrings service. Maybe this suggests that key_put()
should release the key's quota right away if the key's refcount drops to 0?

Either way, note that where fs/crypto/ does key_put() on a whole keyring at
once, it would first need to call keyring_clear() to clear it synchronously.

A third solution would be to make fs/crypto/ completely stop using 'struct key',
and handle quotas itself. It would do it correctly, i.e. synchronously so that
the results are predictable. This would likely mean separate accounting, where
adding an fscrypt key counts against an fscrypt key quota, not the regular
keyrings service quota as it does now. That should be fine, though the same
limits might still need to be used, in case users are relying on the sysctls...

The last solution seems quite attractive at this point, given the number of
times that issues in the keyrings service have caused problems for fs/crypto/.
Any thoughts are appreciated, though.

- Eric

2023-12-07 04:33:38

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [RFC PATCH] keys: flush work when accessing /proc/key-users

David, this really needs your feedback.

BR, Jarkko

On Wed, 2023-12-06 at 14:57 +0000, Luis Henriques wrote:
> Make sure the garbage collector has been run before cycling through
> all
> the user keys.
>
> Signed-off-by: Luis Henriques <[email protected]>
> ---
> Hi!
>
> This patch is mostly for getting some feedback on how to fix an
> fstest
> failing for ext4/fscrypt (generic/581).  Basically, the test relies
> on the
> data read from /proc/key-users to be up-to-date regarding the number
> of
> keys a given user currently has.  However, this file can't be trusted
> because it races against the keys GC.
>
> Using flush_work() seems to work (I can't reproduce the failure), but
> it
> may be overkill.  Or simply not acceptable.  Maybe, as Eric suggested
> elsewhere [1], there could be a synchronous
> key_put/revoke/invalidate/...,
> which would wait for the key GC to do its work, although that
> probably
> would require some more code re-work.
>
> [1]
> https://lore.kernel.org/all/[email protected]/
>
>  security/keys/gc.c       | 6 ++++++
>  security/keys/internal.h | 1 +
>  security/keys/proc.c     | 1 +
>  3 files changed, 8 insertions(+)
>
> diff --git a/security/keys/gc.c b/security/keys/gc.c
> index 3c90807476eb..57b5a54490a0 100644
> --- a/security/keys/gc.c
> +++ b/security/keys/gc.c
> @@ -44,6 +44,12 @@ struct key_type key_type_dead = {
>   .name = ".dead",
>  };
>  
> +void key_flush_gc(void)
> +{
> + kenter("");
> + flush_work(&key_gc_work);
> +}
> +
>  /*
>   * Schedule a garbage collection run.
>   * - time precision isn't particularly important
> diff --git a/security/keys/internal.h b/security/keys/internal.h
> index 471cf36dedc0..fee1d0025d96 100644
> --- a/security/keys/internal.h
> +++ b/security/keys/internal.h
> @@ -170,6 +170,7 @@ extern void keyring_restriction_gc(struct key
> *keyring,
>  extern void key_schedule_gc(time64_t gc_at);
>  extern void key_schedule_gc_links(void);
>  extern void key_gc_keytype(struct key_type *ktype);
> +extern void key_flush_gc(void);
>  
>  extern int key_task_permission(const key_ref_t key_ref,
>          const struct cred *cred,
> diff --git a/security/keys/proc.c b/security/keys/proc.c
> index d0cde6685627..2837e00a240a 100644
> --- a/security/keys/proc.c
> +++ b/security/keys/proc.c
> @@ -277,6 +277,7 @@ static void *proc_key_users_start(struct seq_file
> *p, loff_t *_pos)
>   struct rb_node *_p;
>   loff_t pos = *_pos;
>  
> + key_flush_gc();
>   spin_lock(&key_user_lock);
>  
>   _p = key_user_first(seq_user_ns(p), &key_user_tree);

2023-12-11 14:06:57

by David Howells

[permalink] [raw]
Subject: Re: [RFC PATCH] keys: flush work when accessing /proc/key-users

Eric Biggers <[email protected]> wrote:

> If there was a function that fully and synchronously releases a key's quota,
> fs/crypto/ could call it before unlinking the key. key_payload_reserve(key,
> 0) almost does the trick, but it would release the key's bytes, not the key
> itself.

Umm... The point of the quota is that the key is occupying unswappable kernel
memory (partly true in the case of big_key) and we need to limit that.
Further, the key is not released until it is unlinked.

> However, that would only fix the flakiness of the key quota for fs/crypto/,
> not for other users of the keyrings service. Maybe this suggests that
> key_put() should release the key's quota right away if the key's refcount
> drops to 0?

That I would be okay with as the key should be removed in short order.

Note that you'd have to change the spinlocks on key->user->lock to irq-locking
types. Or maybe we can do without them, at least for key gc, and use atomic
counters. key_invalidate() should probably drop the quota also.

I'm also working up a patch so that key types can be marked for immediate gc
if they expire, rather than there being a period (key_gc_delay) in which they
cause EKEYEXPIRED rather than ENOKEY to be returned for better indication to
userspace as to what's happened when a filesystem op fails to to key problems.

> Either way, note that where fs/crypto/ does key_put() on a whole keyring at
> once, it would first need to call keyring_clear() to clear it synchronously.

What if there's another link on the keyring? Should it still be cleared?

Do we need faster disposal of keys? Perhaps keeping a list of keys that need
destroying rather than scanning the entire key set for them. We still need to
scan non-destroyed keyrings, though, to find the pointers to defunct keys
unless I have some sort of backpointer list.

David

2023-12-12 03:03:21

by Eric Biggers

[permalink] [raw]
Subject: Re: [RFC PATCH] keys: flush work when accessing /proc/key-users

On Mon, Dec 11, 2023 at 02:02:47PM +0000, David Howells wrote:
> Eric Biggers <[email protected]> wrote:
>
> > If there was a function that fully and synchronously releases a key's quota,
> > fs/crypto/ could call it before unlinking the key. key_payload_reserve(key,
> > 0) almost does the trick, but it would release the key's bytes, not the key
> > itself.
>
> Umm... The point of the quota is that the key is occupying unswappable kernel
> memory (partly true in the case of big_key) and we need to limit that.
> Further, the key is not released until it is unlinked.

Well, fs/crypto/ no longer uses the keyrings subsystem for the actual keys, as
that was far too broken. It just ties into the quota now. So what's needed is
a way to release quota synchronously.

That might just mean not using the keyrings subsystem at all anymore.

> Do we need faster disposal of keys? Perhaps keeping a list of keys that need
> destroying rather than scanning the entire key set for them. We still need to
> scan non-destroyed keyrings, though, to find the pointers to defunct keys
> unless I have some sort of backpointer list.

If it's still asynchronous, that doesn't solve the problem.

- Eric

2023-12-14 14:44:44

by Luis Henriques

[permalink] [raw]
Subject: Re: [RFC PATCH] keys: flush work when accessing /proc/key-users

Hi David,

On Mon, Dec 11, 2023 at 02:02:47PM +0000, David Howells wrote:
<snip>
> > However, that would only fix the flakiness of the key quota for fs/crypto/,
> > not for other users of the keyrings service. Maybe this suggests that
> > key_put() should release the key's quota right away if the key's refcount
> > drops to 0?
>
> That I would be okay with as the key should be removed in short order.
>
> Note that you'd have to change the spinlocks on key->user->lock to irq-locking
> types. Or maybe we can do without them, at least for key gc, and use atomic
> counters. key_invalidate() should probably drop the quota also.

I was trying to help with this but, first, I don't think atomic counters
would actually be a solution. For example, we have the following in
key_alloc():

spin_lock(&user->lock);
if (!(flags & KEY_ALLOC_QUOTA_OVERRUN)) {
if (user->qnkeys + 1 > maxkeys ||
user->qnbytes + quotalen > maxbytes ||
user->qnbytes + quotalen < user->qnbytes)
goto no_quota;
}
user->qnkeys++;
user->qnbytes += quotalen;
spin_unlock(&user->lock);

Thus, I don't think it's really possible to simply stop using a lock
without making these checks+changes non-atomic.

As for using spin_lock_irq() or spin_lock_irqsave(), my understanding is
that the only places where this could be necessary is in key_put() and,
possibly, key_payload_reserve(). key_alloc() shouldn't need that.

Finally, why would key_invalidate() require handling quotas? I'm probably
just missing some subtlety, but I don't see the user->usage refcount being
decremented anywhere in that path (or anywhere else, really).

Cheers,
--
Lu?s

2024-01-15 12:03:21

by Luis Henriques

[permalink] [raw]
Subject: [RFC PATCH v2] keys: update key quotas in key_put()

Delaying key quotas update when key's refcount reaches 0 in key_put() has
been causing some issues in fscrypt testing. This patches fixes this test
flakiness by dealing with the quotas immediately, but leaving all the other
clean-ups to the key garbage collector. Unfortunately, this means that we
also need to switch to the irq-version of the spinlock that protects quota.

Signed-off-by: Luis Henriques <[email protected]>
---
Hi David!

I have these changes in my local disk for a while; I wanted to send them
before EOY break but... yeah, it didn't happen. Anyway, I'm still sending
it as an RFC as I'm probably missing something.

security/keys/gc.c | 8 --------
security/keys/key.c | 32 ++++++++++++++++++++++----------
security/keys/keyctl.c | 11 ++++++-----
3 files changed, 28 insertions(+), 23 deletions(-)

diff --git a/security/keys/gc.c b/security/keys/gc.c
index eaddaceda14e..7d687b0962b1 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -155,14 +155,6 @@ static noinline void key_gc_unused_keys(struct list_head *keys)

security_key_free(key);

- /* deal with the user's key tracking and quota */
- if (test_bit(KEY_FLAG_IN_QUOTA, &key->flags)) {
- spin_lock(&key->user->lock);
- key->user->qnkeys--;
- key->user->qnbytes -= key->quotalen;
- spin_unlock(&key->user->lock);
- }
-
atomic_dec(&key->user->nkeys);
if (state != KEY_IS_UNINSTANTIATED)
atomic_dec(&key->user->nikeys);
diff --git a/security/keys/key.c b/security/keys/key.c
index 5b10641debd5..ec155cfaae38 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -231,6 +231,7 @@ struct key *key_alloc(struct key_type *type, const char *desc,
struct key *key;
size_t desclen, quotalen;
int ret;
+ unsigned long irqflags;

key = ERR_PTR(-EINVAL);
if (!desc || !*desc)
@@ -260,7 +261,7 @@ struct key *key_alloc(struct key_type *type, const char *desc,
unsigned maxbytes = uid_eq(uid, GLOBAL_ROOT_UID) ?
key_quota_root_maxbytes : key_quota_maxbytes;

- spin_lock(&user->lock);
+ spin_lock_irqsave(&user->lock, irqflags);
if (!(flags & KEY_ALLOC_QUOTA_OVERRUN)) {
if (user->qnkeys + 1 > maxkeys ||
user->qnbytes + quotalen > maxbytes ||
@@ -270,7 +271,7 @@ struct key *key_alloc(struct key_type *type, const char *desc,

user->qnkeys++;
user->qnbytes += quotalen;
- spin_unlock(&user->lock);
+ spin_unlock_irqrestore(&user->lock, irqflags);
}

/* allocate and initialise the key and its description */
@@ -328,10 +329,10 @@ struct key *key_alloc(struct key_type *type, const char *desc,
kfree(key->description);
kmem_cache_free(key_jar, key);
if (!(flags & KEY_ALLOC_NOT_IN_QUOTA)) {
- spin_lock(&user->lock);
+ spin_lock_irqsave(&user->lock, irqflags);
user->qnkeys--;
user->qnbytes -= quotalen;
- spin_unlock(&user->lock);
+ spin_unlock_irqrestore(&user->lock, irqflags);
}
key_user_put(user);
key = ERR_PTR(ret);
@@ -341,10 +342,10 @@ struct key *key_alloc(struct key_type *type, const char *desc,
kmem_cache_free(key_jar, key);
no_memory_2:
if (!(flags & KEY_ALLOC_NOT_IN_QUOTA)) {
- spin_lock(&user->lock);
+ spin_lock_irqsave(&user->lock, irqflags);
user->qnkeys--;
user->qnbytes -= quotalen;
- spin_unlock(&user->lock);
+ spin_unlock_irqrestore(&user->lock, irqflags);
}
key_user_put(user);
no_memory_1:
@@ -352,7 +353,7 @@ struct key *key_alloc(struct key_type *type, const char *desc,
goto error;

no_quota:
- spin_unlock(&user->lock);
+ spin_unlock_irqrestore(&user->lock, irqflags);
key_user_put(user);
key = ERR_PTR(-EDQUOT);
goto error;
@@ -381,8 +382,9 @@ int key_payload_reserve(struct key *key, size_t datalen)
if (delta != 0 && test_bit(KEY_FLAG_IN_QUOTA, &key->flags)) {
unsigned maxbytes = uid_eq(key->user->uid, GLOBAL_ROOT_UID) ?
key_quota_root_maxbytes : key_quota_maxbytes;
+ unsigned long flags;

- spin_lock(&key->user->lock);
+ spin_lock_irqsave(&key->user->lock, flags);

if (delta > 0 &&
(key->user->qnbytes + delta > maxbytes ||
@@ -393,7 +395,7 @@ int key_payload_reserve(struct key *key, size_t datalen)
key->user->qnbytes += delta;
key->quotalen += delta;
}
- spin_unlock(&key->user->lock);
+ spin_unlock_irqrestore(&key->user->lock, flags);
}

/* change the recorded data length if that didn't generate an error */
@@ -646,8 +648,18 @@ void key_put(struct key *key)
if (key) {
key_check(key);

- if (refcount_dec_and_test(&key->usage))
+ if (refcount_dec_and_test(&key->usage)) {
+ unsigned long flags;
+
+ /* deal with the user's key tracking and quota */
+ if (test_bit(KEY_FLAG_IN_QUOTA, &key->flags)) {
+ spin_lock_irqsave(&key->user->lock, flags);
+ key->user->qnkeys--;
+ key->user->qnbytes -= key->quotalen;
+ spin_unlock_irqrestore(&key->user->lock, flags);
+ }
schedule_work(&key_gc_work);
+ }
}
}
EXPORT_SYMBOL(key_put);
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 10ba439968f7..4bc3e9398ee3 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -954,6 +954,7 @@ long keyctl_chown_key(key_serial_t id, uid_t user, gid_t group)
long ret;
kuid_t uid;
kgid_t gid;
+ unsigned long flags;

uid = make_kuid(current_user_ns(), user);
gid = make_kgid(current_user_ns(), group);
@@ -1010,7 +1011,7 @@ long keyctl_chown_key(key_serial_t id, uid_t user, gid_t group)
unsigned maxbytes = uid_eq(uid, GLOBAL_ROOT_UID) ?
key_quota_root_maxbytes : key_quota_maxbytes;

- spin_lock(&newowner->lock);
+ spin_lock_irqsave(&newowner->lock, flags);
if (newowner->qnkeys + 1 > maxkeys ||
newowner->qnbytes + key->quotalen > maxbytes ||
newowner->qnbytes + key->quotalen <
@@ -1019,12 +1020,12 @@ long keyctl_chown_key(key_serial_t id, uid_t user, gid_t group)

newowner->qnkeys++;
newowner->qnbytes += key->quotalen;
- spin_unlock(&newowner->lock);
+ spin_unlock_irqrestore(&newowner->lock, flags);

- spin_lock(&key->user->lock);
+ spin_lock_irqsave(&key->user->lock, flags);
key->user->qnkeys--;
key->user->qnbytes -= key->quotalen;
- spin_unlock(&key->user->lock);
+ spin_unlock_irqrestore(&key->user->lock, flags);
}

atomic_dec(&key->user->nkeys);
@@ -1056,7 +1057,7 @@ long keyctl_chown_key(key_serial_t id, uid_t user, gid_t group)
return ret;

quota_overrun:
- spin_unlock(&newowner->lock);
+ spin_unlock_irqrestore(&newowner->lock, flags);
zapowner = newowner;
ret = -EDQUOT;
goto error_put;

2024-01-19 21:10:29

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [RFC PATCH v2] keys: update key quotas in key_put()

On Mon Jan 15, 2024 at 12:03 PM UTC, Luis Henriques wrote:
> Delaying key quotas update when key's refcount reaches 0 in key_put() has
> been causing some issues in fscrypt testing. This patches fixes this test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This commit fixes the test

> flakiness by dealing with the quotas immediately, but leaving all the other
> clean-ups to the key garbage collector. Unfortunately, this means that we
> also need to switch to the irq-version of the spinlock that protects quota.

The commit message fails to describe the implementation changes.

You have a motivation paragraph but you also need to add implementation
paragraph, which describes how the code changes reflect the motivation.

BR, Jarkko

2024-01-22 12:03:37

by Luis Henriques

[permalink] [raw]
Subject: Re: [RFC PATCH v2] keys: update key quotas in key_put()

"Jarkko Sakkinen" <[email protected]> writes:

> On Mon Jan 15, 2024 at 12:03 PM UTC, Luis Henriques wrote:
>> Delaying key quotas update when key's refcount reaches 0 in key_put() has
>> been causing some issues in fscrypt testing. This patches fixes this test
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> This commit fixes the test
>
>> flakiness by dealing with the quotas immediately, but leaving all the other
>> clean-ups to the key garbage collector. Unfortunately, this means that we
>> also need to switch to the irq-version of the spinlock that protects quota.
>
> The commit message fails to describe the implementation changes.
>
> You have a motivation paragraph but you also need to add implementation
> paragraph, which describes how the code changes reflect the motivation.

Thank you for your feedback, Jarkko. I'll address your comments in v3.

But before sending another rev, I'll wait a bit more, maybe David also has
some feedback on the implementation.

Cheers,
--
Luís

2024-01-22 19:47:49

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [RFC PATCH v2] keys: update key quotas in key_put()

On Mon Jan 22, 2024 at 1:50 PM EET, Luis Henriques wrote:
> "Jarkko Sakkinen" <[email protected]> writes:
>
> > On Mon Jan 15, 2024 at 12:03 PM UTC, Luis Henriques wrote:
> >> Delaying key quotas update when key's refcount reaches 0 in key_put() has
> >> been causing some issues in fscrypt testing. This patches fixes this test
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > This commit fixes the test
> >
> >> flakiness by dealing with the quotas immediately, but leaving all the other
> >> clean-ups to the key garbage collector. Unfortunately, this means that we
> >> also need to switch to the irq-version of the spinlock that protects quota.
> >
> > The commit message fails to describe the implementation changes.
> >
> > You have a motivation paragraph but you also need to add implementation
> > paragraph, which describes how the code changes reflect the motivation.
>
> Thank you for your feedback, Jarkko. I'll address your comments in v3.
>
> But before sending another rev, I'll wait a bit more, maybe David also has
> some feedback on the implementation.
>
> Cheers,

Take you time :-)

BR, Jarkko

2024-01-24 22:12:35

by Eric Biggers

[permalink] [raw]
Subject: Re: [RFC PATCH v2] keys: update key quotas in key_put()

On Mon, Jan 15, 2024 at 12:03:00PM +0000, Luis Henriques wrote:
> Delaying key quotas update when key's refcount reaches 0 in key_put() has
> been causing some issues in fscrypt testing. This patches fixes this test
> flakiness by dealing with the quotas immediately, but leaving all the other
> clean-ups to the key garbage collector. Unfortunately, this means that we
> also need to switch to the irq-version of the spinlock that protects quota.
>
> Signed-off-by: Luis Henriques <[email protected]>
> ---
> Hi David!
>
> I have these changes in my local disk for a while; I wanted to send them
> before EOY break but... yeah, it didn't happen. Anyway, I'm still sending
> it as an RFC as I'm probably missing something.
>
> security/keys/gc.c | 8 --------
> security/keys/key.c | 32 ++++++++++++++++++++++----------
> security/keys/keyctl.c | 11 ++++++-----
> 3 files changed, 28 insertions(+), 23 deletions(-)

This patch seems reasonable to me, though I'm still thinking about changing
fs/crypto/ to manage its key quotas itself which would avoid the issue entirely.

Note that as I said before, fs/crypto/ does key_put() on a whole keyring at
once, in order to release the quota of the keys in the keyring. Do you plan to
also change fs/crypto/ to keyring_clear() the keyring before putting it?
Without that, I don't think the problem is solved, as the quota release will
still happen asynchronously due to the keyring being cleared asynchronously.

- Eric

2024-01-26 16:14:37

by Luis Henriques

[permalink] [raw]
Subject: Re: [RFC PATCH v2] keys: update key quotas in key_put()

Eric Biggers <[email protected]> writes:

> On Mon, Jan 15, 2024 at 12:03:00PM +0000, Luis Henriques wrote:
>> Delaying key quotas update when key's refcount reaches 0 in key_put() has
>> been causing some issues in fscrypt testing. This patches fixes this test
>> flakiness by dealing with the quotas immediately, but leaving all the other
>> clean-ups to the key garbage collector. Unfortunately, this means that we
>> also need to switch to the irq-version of the spinlock that protects quota.
>>
>> Signed-off-by: Luis Henriques <[email protected]>
>> ---
>> Hi David!
>>
>> I have these changes in my local disk for a while; I wanted to send them
>> before EOY break but... yeah, it didn't happen. Anyway, I'm still sending
>> it as an RFC as I'm probably missing something.
>>
>> security/keys/gc.c | 8 --------
>> security/keys/key.c | 32 ++++++++++++++++++++++----------
>> security/keys/keyctl.c | 11 ++++++-----
>> 3 files changed, 28 insertions(+), 23 deletions(-)
>
> This patch seems reasonable to me, though I'm still thinking about changing
> fs/crypto/ to manage its key quotas itself which would avoid the issue entirely.
>
> Note that as I said before, fs/crypto/ does key_put() on a whole keyring at
> once, in order to release the quota of the keys in the keyring. Do you plan to
> also change fs/crypto/ to keyring_clear() the keyring before putting it?
> Without that, I don't think the problem is solved, as the quota release will
> still happen asynchronously due to the keyring being cleared asynchronously.

Ah, good point. In the meantime I had forgotten everything about this
code and missed that. So, I can send another patch to fs/crypto to add
that extra call once (or if) this patch is accepted.

If I'm reading the code correctly, the only place where this extra call is
required is on fscrypt_put_master_key():

diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
index 0edf0b58daa7..4afd32f1aed9 100644
--- a/fs/crypto/keyring.c
+++ b/fs/crypto/keyring.c
@@ -74,6 +74,7 @@ void fscrypt_put_master_key(struct fscrypt_master_key *mk)
* that concurrent keyring lookups can no longer find it.
*/
WARN_ON_ONCE(refcount_read(&mk->mk_active_refs) != 0);
+ keyring_clear(mk->mk_users);
key_put(mk->mk_users);
mk->mk_users = NULL;
call_rcu(&mk->mk_rcu_head, fscrypt_free_master_key);

On the other hand, if you're really working towards dropping this code
entirely, maybe there's not point doing that.

Cheers,
--
Luís


2024-01-27 06:42:31

by Eric Biggers

[permalink] [raw]
Subject: Re: [RFC PATCH v2] keys: update key quotas in key_put()

On Fri, Jan 26, 2024 at 04:12:59PM +0000, Luis Henriques wrote:
> Eric Biggers <[email protected]> writes:
>
> > On Mon, Jan 15, 2024 at 12:03:00PM +0000, Luis Henriques wrote:
> >> Delaying key quotas update when key's refcount reaches 0 in key_put() has
> >> been causing some issues in fscrypt testing. This patches fixes this test
> >> flakiness by dealing with the quotas immediately, but leaving all the other
> >> clean-ups to the key garbage collector. Unfortunately, this means that we
> >> also need to switch to the irq-version of the spinlock that protects quota.
> >>
> >> Signed-off-by: Luis Henriques <[email protected]>
> >> ---
> >> Hi David!
> >>
> >> I have these changes in my local disk for a while; I wanted to send them
> >> before EOY break but... yeah, it didn't happen. Anyway, I'm still sending
> >> it as an RFC as I'm probably missing something.
> >>
> >> security/keys/gc.c | 8 --------
> >> security/keys/key.c | 32 ++++++++++++++++++++++----------
> >> security/keys/keyctl.c | 11 ++++++-----
> >> 3 files changed, 28 insertions(+), 23 deletions(-)
> >
> > This patch seems reasonable to me, though I'm still thinking about changing
> > fs/crypto/ to manage its key quotas itself which would avoid the issue entirely.
> >
> > Note that as I said before, fs/crypto/ does key_put() on a whole keyring at
> > once, in order to release the quota of the keys in the keyring. Do you plan to
> > also change fs/crypto/ to keyring_clear() the keyring before putting it?
> > Without that, I don't think the problem is solved, as the quota release will
> > still happen asynchronously due to the keyring being cleared asynchronously.
>
> Ah, good point. In the meantime I had forgotten everything about this
> code and missed that. So, I can send another patch to fs/crypto to add
> that extra call once (or if) this patch is accepted.
>
> If I'm reading the code correctly, the only place where this extra call is
> required is on fscrypt_put_master_key():
>
> diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
> index 0edf0b58daa7..4afd32f1aed9 100644
> --- a/fs/crypto/keyring.c
> +++ b/fs/crypto/keyring.c
> @@ -74,6 +74,7 @@ void fscrypt_put_master_key(struct fscrypt_master_key *mk)
> * that concurrent keyring lookups can no longer find it.
> */
> WARN_ON_ONCE(refcount_read(&mk->mk_active_refs) != 0);
> + keyring_clear(mk->mk_users);
> key_put(mk->mk_users);
> mk->mk_users = NULL;

This will need a NULL check, since keyring_clear() doesn't accept NULL:

if (mk->mk_users) {
keyring_clear(mk->mk_users);
key_put(mk->mk_users);
mk->mk_users = NULL;
}

> call_rcu(&mk->mk_rcu_head, fscrypt_free_master_key);
>
> On the other hand, if you're really working towards dropping this code
> entirely, maybe there's not point doing that.

Well, it depends whether this patch (the one for security/keys/) is accepted or
not. If it's accepted, I think it makes sense to add the keyring_clear() and
otherwise keep the fs/crypto/ code as-is. If it's not accepted, I think I'll
have to make fs/crypto/ handle the quotas itself.

- Eric

2024-01-29 11:24:13

by Luis Henriques

[permalink] [raw]
Subject: Re: [RFC PATCH v2] keys: update key quotas in key_put()

Eric Biggers <[email protected]> writes:

> On Fri, Jan 26, 2024 at 04:12:59PM +0000, Luis Henriques wrote:
>> Eric Biggers <[email protected]> writes:
>>
>> > On Mon, Jan 15, 2024 at 12:03:00PM +0000, Luis Henriques wrote:
>> >> Delaying key quotas update when key's refcount reaches 0 in key_put() has
>> >> been causing some issues in fscrypt testing. This patches fixes this test
>> >> flakiness by dealing with the quotas immediately, but leaving all the other
>> >> clean-ups to the key garbage collector. Unfortunately, this means that we
>> >> also need to switch to the irq-version of the spinlock that protects quota.
>> >>
>> >> Signed-off-by: Luis Henriques <[email protected]>
>> >> ---
>> >> Hi David!
>> >>
>> >> I have these changes in my local disk for a while; I wanted to send them
>> >> before EOY break but... yeah, it didn't happen. Anyway, I'm still sending
>> >> it as an RFC as I'm probably missing something.
>> >>
>> >> security/keys/gc.c | 8 --------
>> >> security/keys/key.c | 32 ++++++++++++++++++++++----------
>> >> security/keys/keyctl.c | 11 ++++++-----
>> >> 3 files changed, 28 insertions(+), 23 deletions(-)
>> >
>> > This patch seems reasonable to me, though I'm still thinking about changing
>> > fs/crypto/ to manage its key quotas itself which would avoid the issue entirely.
>> >
>> > Note that as I said before, fs/crypto/ does key_put() on a whole keyring at
>> > once, in order to release the quota of the keys in the keyring. Do you plan to
>> > also change fs/crypto/ to keyring_clear() the keyring before putting it?
>> > Without that, I don't think the problem is solved, as the quota release will
>> > still happen asynchronously due to the keyring being cleared asynchronously.
>>
>> Ah, good point. In the meantime I had forgotten everything about this
>> code and missed that. So, I can send another patch to fs/crypto to add
>> that extra call once (or if) this patch is accepted.
>>
>> If I'm reading the code correctly, the only place where this extra call is
>> required is on fscrypt_put_master_key():
>>
>> diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
>> index 0edf0b58daa7..4afd32f1aed9 100644
>> --- a/fs/crypto/keyring.c
>> +++ b/fs/crypto/keyring.c
>> @@ -74,6 +74,7 @@ void fscrypt_put_master_key(struct fscrypt_master_key *mk)
>> * that concurrent keyring lookups can no longer find it.
>> */
>> WARN_ON_ONCE(refcount_read(&mk->mk_active_refs) != 0);
>> + keyring_clear(mk->mk_users);
>> key_put(mk->mk_users);
>> mk->mk_users = NULL;
>
> This will need a NULL check, since keyring_clear() doesn't accept NULL:
>
> if (mk->mk_users) {
> keyring_clear(mk->mk_users);
> key_put(mk->mk_users);
> mk->mk_users = NULL;
> }
>

Ah, good point. Makes sense.

>> call_rcu(&mk->mk_rcu_head, fscrypt_free_master_key);
>>
>> On the other hand, if you're really working towards dropping this code
>> entirely, maybe there's not point doing that.
>
> Well, it depends whether this patch (the one for security/keys/) is accepted or
> not. If it's accepted, I think it makes sense to add the keyring_clear() and
> otherwise keep the fs/crypto/ code as-is. If it's not accepted, I think I'll
> have to make fs/crypto/ handle the quotas itself.

Awesome, thank.

David, do you have any feedback on this patch or would you rather have me
send v3, addressing Jarkko comments regarding the commit message?

Cheers,
--
Luís