2021-02-05 18:46:12

by Jarkko Sakkinen

[permalink] [raw]
Subject: [PATCH 1/2] MAINTAINERS: Add Dave Hansen as reviewer for INTEL SGX

Add Dave as reviewer for INTEL SGX patches.

Cc: Borislav Petkov <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
---
MAINTAINERS | 1 +
1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5b66de2097d6..41b78e20bd1f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9227,6 +9227,7 @@ F: include/linux/tboot.h

INTEL SGX
M: Jarkko Sakkinen <[email protected]>
+R: Dave Hansen <[email protected]>
L: [email protected]
S: Supported
Q: https://patchwork.kernel.org/project/intel-sgx/list/
--
2.30.0


2021-02-05 18:46:21

by Jarkko Sakkinen

[permalink] [raw]
Subject: [PATCH 2/2] x86/sgx: Maintain encl->refcount for each encl->mm_list entry

This has been shown in tests:

[ +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100

There are two functions that drain encl->mm_list:

- sgx_release() (i.e. VFS release) removes the remaining mm_list entries.
- sgx_mmu_notifier_release() removes mm_list entry for the registered
process, if it still exists.

If encl->refcount is taken only for VFS, this can lead to
sgx_encl_release() being executed before sgx_mmu_notifier_release()
completes, which is exactly what happens in the above klog entry.

Each process also needs its own enclave reference.

In order to fix the race condition, increase encl->refcount when an
entry to encl->mm_list added for a process. Release this reference
when the mm_list entry is cleaned up, either in
sgx_mmu_notifier_release() or sgx_release().

Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer")
Cc: Dave Hansen <[email protected]
Reported-by: Haitao Huang <[email protected]>
Signed-off-by: Jarkko Sakkinen <[email protected]>
---
v7:
- No changes from v6. Resend of
https://patchwork.kernel.org/project/intel-sgx/patch/[email protected]/
v6:
- Maintain refcount for each encl->mm_list entry.
v5:
- To make sure that the instance does not get deleted use kref_get()
kref_put(). This also removes the need for additional
synchronize_srcu().
v4:
- Rewrite the commit message.
- Just change the call order. *_expedited() is out of scope for this
bug fix.
v3: Fine-tuned tags, and added missing change log for v2.
v2: Switch to synchronize_srcu_expedited().
arch/x86/kernel/cpu/sgx/driver.c | 6 ++++++
arch/x86/kernel/cpu/sgx/encl.c | 8 ++++++++
2 files changed, 14 insertions(+)

diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
index f2eac41bb4ff..8d8fcc91c0d6 100644
--- a/arch/x86/kernel/cpu/sgx/driver.c
+++ b/arch/x86/kernel/cpu/sgx/driver.c
@@ -72,6 +72,12 @@ static int sgx_release(struct inode *inode, struct file *file)
synchronize_srcu(&encl->srcu);
mmu_notifier_unregister(&encl_mm->mmu_notifier, encl_mm->mm);
kfree(encl_mm);
+
+ /*
+ * Release the mm_list reference, as sgx_mmu_notifier_release()
+ * will only do this only, when it grabs encl_mm.
+ */
+ kref_put(&encl->refcount, sgx_encl_release);
}

kref_put(&encl->refcount, sgx_encl_release);
diff --git a/arch/x86/kernel/cpu/sgx/encl.c b/arch/x86/kernel/cpu/sgx/encl.c
index ee50a5010277..c1d9c86c0265 100644
--- a/arch/x86/kernel/cpu/sgx/encl.c
+++ b/arch/x86/kernel/cpu/sgx/encl.c
@@ -474,6 +474,7 @@ static void sgx_mmu_notifier_release(struct mmu_notifier *mn,
if (tmp == encl_mm) {
synchronize_srcu(&encl_mm->encl->srcu);
mmu_notifier_put(mn);
+ kref_put(&encl_mm->encl->refcount, sgx_encl_release);
}
}

@@ -545,6 +546,13 @@ int sgx_encl_mm_add(struct sgx_encl *encl, struct mm_struct *mm)
}

spin_lock(&encl->mm_lock);
+
+ /*
+ * Take a reference to guarantee that the enclave is not destroyed,
+ * while sgx_mmu_notifier_release() is active.
+ */
+ kref_get(&encl->refcount);
+
list_add_rcu(&encl_mm->list, &encl->mm_list);
/* Pairs with smp_rmb() in sgx_reclaimer_block(). */
smp_wmb();
--
2.30.0

2021-02-05 19:42:40

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86/sgx: Maintain encl->refcount for each encl->mm_list entry

On 2/5/21 10:28 AM, Jarkko Sakkinen wrote:
> This has been shown in tests:
>
> [ +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100
>
> There are two functions that drain encl->mm_list:
>
> - sgx_release() (i.e. VFS release) removes the remaining mm_list entries.
> - sgx_mmu_notifier_release() removes mm_list entry for the registered
> process, if it still exists.

Jarkko, I like your approach. This actually has the potential to be a
lot more understandable than the fix we settled on before.

But I think the explanation needs some tweaking, and I think I can take
it a step further to make it even more straightforward. The issue here
isn't *really* mm_list, it's this:

encl_mm->encl = encl;

That literally establishes a encl_mm to encl reference and needs a
reference count. That reference remains until 'encl_mm' is freed. I
don't think mm_list needs to even be taken into account.

The most straightforward way to fix this is to take a refcount at
"encl_mm->encl = encl" and release it at kfree(encl_mm). That makes a
*lot* of logical sense to me, and it's also trivial to audit.

Totally untested patch attached (adapted directly from yours).


Attachments:
raw.patch (2.93 kB)

2021-02-07 21:34:19

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86/sgx: Maintain encl->refcount for each encl->mm_list entry

On Fri, Feb 05, 2021 at 11:36:57AM -0800, Dave Hansen wrote:
> On 2/5/21 10:28 AM, Jarkko Sakkinen wrote:
> > This has been shown in tests:
> >
> > [ +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100
> >
> > There are two functions that drain encl->mm_list:
> >
> > - sgx_release() (i.e. VFS release) removes the remaining mm_list entries.
> > - sgx_mmu_notifier_release() removes mm_list entry for the registered
> > process, if it still exists.
>
> Jarkko, I like your approach. This actually has the potential to be a
> lot more understandable than the fix we settled on before.

Yeah, it's more like by-the-book use of refcount, each processs gets
a reference. This way things should be always serialized correctly.

> But I think the explanation needs some tweaking, and I think I can take
> it a step further to make it even more straightforward. The issue here
> isn't *really* mm_list, it's this:
>
> encl_mm->encl = encl;

Agreed.

This was also in center of thinking when I did this new patch.

> That literally establishes a encl_mm to encl reference and needs a
> reference count. That reference remains until 'encl_mm' is freed. I
> don't think mm_list needs to even be taken into account.
>
> The most straightforward way to fix this is to take a refcount at
> "encl_mm->encl = encl" and release it at kfree(encl_mm). That makes a
> *lot* of logical sense to me, and it's also trivial to audit.
>
> Totally untested patch attached (adapted directly from yours).

I tested this version, and it also seems to work. Boris, can you
pick this refined version from Dave's attachment or do you prefer
that I do a re-send?

/Jarkko

2021-02-07 21:35:35

by Jarkko Sakkinen

[permalink] [raw]
Subject: Re: [PATCH 2/2] x86/sgx: Maintain encl->refcount for each encl->mm_list entry

On Sun, Feb 07, 2021 at 11:29:49PM +0200, Jarkko Sakkinen wrote:
> On Fri, Feb 05, 2021 at 11:36:57AM -0800, Dave Hansen wrote:
> > On 2/5/21 10:28 AM, Jarkko Sakkinen wrote:
> > > This has been shown in tests:
> > >
> > > [ +0.000008] WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100
> > >
> > > There are two functions that drain encl->mm_list:
> > >
> > > - sgx_release() (i.e. VFS release) removes the remaining mm_list entries.
> > > - sgx_mmu_notifier_release() removes mm_list entry for the registered
> > > process, if it still exists.
> >
> > Jarkko, I like your approach. This actually has the potential to be a
> > lot more understandable than the fix we settled on before.
>
> Yeah, it's more like by-the-book use of refcount, each processs gets
> a reference. This way things should be always serialized correctly.
>
> > But I think the explanation needs some tweaking, and I think I can take
> > it a step further to make it even more straightforward. The issue here
> > isn't *really* mm_list, it's this:
> >
> > encl_mm->encl = encl;
>
> Agreed.
>
> This was also in center of thinking when I did this new patch.
>
> > That literally establishes a encl_mm to encl reference and needs a
> > reference count. That reference remains until 'encl_mm' is freed. I
> > don't think mm_list needs to even be taken into account.
> >
> > The most straightforward way to fix this is to take a refcount at
> > "encl_mm->encl = encl" and release it at kfree(encl_mm). That makes a
> > *lot* of logical sense to me, and it's also trivial to audit.
> >
> > Totally untested patch attached (adapted directly from yours).
>
> I tested this version, and it also seems to work. Boris, can you
> pick this refined version from Dave's attachment or do you prefer
> that I do a re-send?

Nevermind. I'll send a proper patch (just noticed that the attachment
did have short summary).

/Jarkko