2019-05-31 23:34:05

Subject: [RFC PATCH 4/9] mm: Introduce vm_ops->mprotect()

SGX will use the mprotect() hook to prevent userspace from circumventing
various security checks, i.e. those enforced by Linux Security Modules.

Enclaves are built by copying data from normal memory into the Enclave
Page Cache (EPC). Due to the nature of SGX, the EPC is represented by a
single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
single MAP_SHARED vm_file. Furthermore, all enclaves will need read,
write and execute pages in the EPC.

As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
can deny access to the EPC as a whole, but can't deny PROT_EXEC on a page
that originated in a non-EXECUTE file (which is long gone by the time
mprotect() is called).

By hooking mprotect(), SGX can make explicit LSM upcalls while an
enclave is being built, i.e. when the kernel has a handle to the origin of
each enclave page, and enforce the result of the LSM policy whenever
userspace maps the enclave page in the future.

Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
that approach is quite ugly, e.g. would require userspace to call an
SGX ioctl() prior to using mprotect() to extend a page's protections.

Signed-off-by: Sean Christopherson <[email protected]>
---
 include/linux/mm.h |  2 ++
 mm/mprotect.c      | 15 +++++++++++----
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0e8834ac32b7..50a42364a885 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -458,6 +458,8 @@ struct vm_operations_struct {
 	void (*close)(struct vm_area_struct * area);
 	int (*split)(struct vm_area_struct * area, unsigned long addr);
 	int (*mremap)(struct vm_area_struct * area);
+	int (*mprotect)(struct vm_area_struct * area, unsigned long start,
+			unsigned long end, unsigned long prot);
 	vm_fault_t (*fault)(struct vm_fault *vmf);
 	vm_fault_t (*huge_fault)(struct vm_fault *vmf,
 			enum page_entry_size pe_size);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index bf38dfbbb4b4..e466ca5e4fe0 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -547,13 +547,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 			goto out;
 		}
 
-		error = security_file_mprotect(vma, reqprot, prot);
-		if (error)
-			goto out;
-
 		tmp = vma->vm_end;
 		if (tmp > end)
 			tmp = end;
+
+		if (vma->vm_ops && vma->vm_ops->mprotect) {
+			error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
+			if (error)
+				goto out;
+		}
+
+		error = security_file_mprotect(vma, reqprot, prot);
+		if (error)
+			goto out;
+
 		error = mprotect_fixup(vma, &prev, nstart, tmp, newflags);
 		if (error)
 			goto out;
--
2.21.0
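
For readers following the thread, the enforcement idea in the commit message can be modelled in plain userspace C. This is only a sketch under assumptions: `struct demo_enclave`, `max_prot[]` and `demo_enclave_mprotect()` are hypothetical names, not part of this patch or of the SGX driver. It assumes the driver recorded, at build time, the maximal PROT_* bits approved for each page, and that the hook rejects any request exceeding them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>	/* PROT_READ, PROT_WRITE, PROT_EXEC */

#define DEMO_PAGE_SIZE	4096UL
#define DEMO_NR_PAGES	4

/*
 * Hypothetical per-enclave state: the maximal protections approved for
 * each page while the enclave was built, i.e. while the origin of the
 * page was still known and an LSM decision could be made.
 */
struct demo_enclave {
	unsigned long base;			/* first address of the enclave */
	unsigned long max_prot[DEMO_NR_PAGES];	/* PROT_* mask per page */
};

/*
 * Model of a vm_ops->mprotect() style callback: fail if any page in
 * [start, end) would gain a protection that was not approved earlier.
 */
static int demo_enclave_mprotect(const struct demo_enclave *encl,
				 unsigned long start, unsigned long end,
				 unsigned long prot)
{
	unsigned long addr;

	/* The caller is assumed to pass a range inside the enclave. */
	assert(start >= encl->base && start <= end);
	assert(end <= encl->base + DEMO_NR_PAGES * DEMO_PAGE_SIZE);

	for (addr = start; addr < end; addr += DEMO_PAGE_SIZE) {
		size_t idx = (addr - encl->base) / DEMO_PAGE_SIZE;

		/* Any requested bit outside the approved mask is denied. */
		if (prot & ~encl->max_prot[idx])
			return -EACCES;
	}
	return 0;
}
```

With such a callback, a page approved only for PROT_READ at build time could not later be remapped PROT_EXEC, even though the file it came from is no longer reachable from mprotect().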

2019-06-03 07:13:26

by Xing, Cedric

Subject: RE: [RFC PATCH 4/9] mm: Introduce vm_ops->mprotect()

> From: Christopherson, Sean J
> Sent: Friday, May 31, 2019 4:32 PM
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e8834ac32b7..50a42364a885 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -458,6 +458,8 @@ struct vm_operations_struct {
> void (*close)(struct vm_area_struct * area);
> int (*split)(struct vm_area_struct * area, unsigned long addr);
> int (*mremap)(struct vm_area_struct * area);
> + int (*mprotect)(struct vm_area_struct * area, unsigned long start,
> + unsigned long end, unsigned long prot);

As I commented in my reply to the cover letter, the SGX driver doesn't need
to intercept mprotect() if the ALLOW_* flags are not spilled into it.

> vm_fault_t (*fault)(struct vm_fault *vmf);
> vm_fault_t (*huge_fault)(struct vm_fault *vmf,
> enum page_entry_size pe_size);

2019-06-04 12:26:31

by Jarkko Sakkinen

Subject: Re: [RFC PATCH 4/9] mm: Introduce vm_ops->mprotect()

On Fri, May 31, 2019 at 04:31:54PM -0700, Sean Christopherson wrote:
> SGX will use the mprotect() hook to prevent userspace from circumventing
> various security checks, i.e. Linux Security Modules.
>
> Enclaves are built by copying data from normal memory into the Enclave
> Page Cache (EPC). Due to the nature of SGX, the EPC is represented by a
> single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
> single MAP_SHARED vm_file. Furthermore, all enclaves will need read,
> write and execute pages in the EPC.

What is the last sentence pointing out? Enclaves read, write and
execute pages, so?

> As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
> can deny access to the EPC as a whole, but can't deny PROT_EXEC on page
> that originated in a non-EXECUTE file (which is long gone by the time
> mprotect() is called).

I'm not sure what kind of scenario this is describing where some LSM
can't deny PROT_EXEC. Kind of a cryptic paragraph, I have to say.

> By hooking mprotect(), SGX can make explicit LSM upcalls while an
> enclave is being built, i.e. when the kernel has a handle to origin of
> each enclave page, and enforce the result of the LSM policy whenever
> userspace maps the enclave page in the future.

How does hooking mprotect() enable adding new LSM hooks?

> Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
> that approach is quite ugly, e.g. would require userspace to call an
> SGX ioctl() prior to using mprotect() to extend a page's protections.

Not really sure I got this. SGX gets page permissions in SECINFO.
Also recurring comment about MAY_* constants.

> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> include/linux/mm.h | 2 ++
> mm/mprotect.c | 15 +++++++++++----
> 2 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e8834ac32b7..50a42364a885 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -458,6 +458,8 @@ struct vm_operations_struct {
> void (*close)(struct vm_area_struct * area);
> int (*split)(struct vm_area_struct * area, unsigned long addr);
> int (*mremap)(struct vm_area_struct * area);
> + int (*mprotect)(struct vm_area_struct * area, unsigned long start,
> + unsigned long end, unsigned long prot);

Right, the hook must be here obviously because mprotect() can be called
when /dev/sgx/enclave is closed. Can you describe start and end i.e.
what range they are in?

> vm_fault_t (*fault)(struct vm_fault *vmf);
> vm_fault_t (*huge_fault)(struct vm_fault *vmf,
> enum page_entry_size pe_size);
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index bf38dfbbb4b4..e466ca5e4fe0 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -547,13 +547,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
> goto out;
> }
>
> - error = security_file_mprotect(vma, reqprot, prot);
> - if (error)
> - goto out;
> -
> tmp = vma->vm_end;
> if (tmp > end)
> tmp = end;
> +
> + if (vma->vm_ops && vma->vm_ops->mprotect) {
> + error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> + if (error)
> + goto out;
> + }
> +
> + error = security_file_mprotect(vma, reqprot, prot);

Why is mprotect callback called post the LSM hook?

> + if (error)
> + goto out;

/Jarkko
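
On the question of what range `start` and `end` cover: going only by the diff, the hook receives `nstart` and `tmp`, where `tmp` is `vma->vm_end` clamped to the end of the requested range. A small userspace model of that clamping follows. The `demo_*` names are hypothetical, and the precondition that `nstart` already lies inside the VMA is my reading of the surrounding loop in do_mprotect_pkey(), not something stated in the thread.

```c
#include <assert.h>

/* Minimal stand-in for the two vm_area_struct fields used here. */
struct demo_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

struct demo_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Compute the [start, end) pair the hook would see for one VMA:
 * start is nstart, end is min(vma->vm_end, end), as in the diff.
 */
static struct demo_range demo_hook_range(const struct demo_vma *vma,
					 unsigned long nstart,
					 unsigned long end)
{
	struct demo_range r;

	/* Assumption: the caller has already advanced nstart into this VMA. */
	assert(nstart >= vma->vm_start && nstart < vma->vm_end);

	r.start = nstart;
	r.end = vma->vm_end;
	if (r.end > end)
		r.end = end;
	return r;
}
```

Under that reading, the hook never sees addresses outside the VMA it is called for; a request spanning several VMAs would result in one call per VMA.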

2019-06-04 14:54:19

by Andy Lutomirski

Subject: Re: [RFC PATCH 4/9] mm: Introduce vm_ops->mprotect()

On Fri, May 31, 2019 at 4:32 PM Sean Christopherson
<[email protected]> wrote:
>
> SGX will use the mprotect() hook to prevent userspace from circumventing
> various security checks, i.e. Linux Security Modules.
>
> Enclaves are built by copying data from normal memory into the Enclave
> Page Cache (EPC). Due to the nature of SGX, the EPC is represented by a
> single file that must be MAP_SHARED, i.e. mprotect() only ever sees a
> single MAP_SHARED vm_file. Furthermore, all enclaves will need read,
> write and execute pages in the EPC.
>
> As a result, LSM policies cannot be meaningfully applied, e.g. an LSM
> can deny access to the EPC as a whole, but can't deny PROT_EXEC on page
> that originated in a non-EXECUTE file (which is long gone by the time
> mprotect() is called).
>
> By hooking mprotect(), SGX can make explicit LSM upcalls while an
> enclave is being built, i.e. when the kernel has a handle to origin of
> each enclave page, and enforce the result of the LSM policy whenever
> userspace maps the enclave page in the future.
>
> Alternatively, SGX could play games with MAY_{READ,WRITE,EXEC}, but
> that approach is quite ugly, e.g. would require userspace to call an
> SGX ioctl() prior to using mprotect() to extend a page's protections.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> include/linux/mm.h | 2 ++
> mm/mprotect.c | 15 +++++++++++----
> 2 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0e8834ac32b7..50a42364a885 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -458,6 +458,8 @@ struct vm_operations_struct {
> void (*close)(struct vm_area_struct * area);
> int (*split)(struct vm_area_struct * area, unsigned long addr);
> int (*mremap)(struct vm_area_struct * area);
> + int (*mprotect)(struct vm_area_struct * area, unsigned long start,
> + unsigned long end, unsigned long prot);
> vm_fault_t (*fault)(struct vm_fault *vmf);
> vm_fault_t (*huge_fault)(struct vm_fault *vmf,
> enum page_entry_size pe_size);
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index bf38dfbbb4b4..e466ca5e4fe0 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -547,13 +547,20 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
> goto out;
> }
>
> - error = security_file_mprotect(vma, reqprot, prot);
> - if (error)
> - goto out;
> -
> tmp = vma->vm_end;
> if (tmp > end)
> tmp = end;
> +
> + if (vma->vm_ops && vma->vm_ops->mprotect) {
> + error = vma->vm_ops->mprotect(vma, nstart, tmp, prot);
> + if (error)
> + goto out;
> + }
> +
> + error = security_file_mprotect(vma, reqprot, prot);
> + if (error)
> + goto out;
> +

I think that, if you're going to do it like this, you need to call it
mprotect_and_check_security or something. Or you could just add
.may_mprotect, which is allowed to fail but, on success, falls through
to call security_file_mprotect and mprotect_fixup().

--Andy
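
The `.may_mprotect` variant suggested above could look roughly like the following userspace model. Everything here is a sketch: the `demo_*` types and functions are hypothetical stand-ins for the VMA, security_file_mprotect() and mprotect_fixup(), and the counters exist only to make the control flow observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_vma;

/* Hypothetical ops table with only the proposed veto callback. */
struct demo_vm_ops {
	int (*may_mprotect)(struct demo_vma *vma, unsigned long start,
			    unsigned long end, unsigned long prot);
};

struct demo_vma {
	const struct demo_vm_ops *vm_ops;
	int lsm_calls;		/* times the LSM stand-in ran */
	int fixup_calls;	/* times the fixup stand-in ran */
};

/* Stand-in for security_file_mprotect(); always allows in this model. */
static int demo_security_file_mprotect(struct demo_vma *vma)
{
	vma->lsm_calls++;
	return 0;
}

/* Stand-in for mprotect_fixup(); always succeeds in this model. */
static int demo_mprotect_fixup(struct demo_vma *vma)
{
	vma->fixup_calls++;
	return 0;
}

/* The callback may veto; on success, fall through to the LSM and fixup. */
static int demo_mprotect_one_vma(struct demo_vma *vma, unsigned long start,
				 unsigned long end, unsigned long prot)
{
	int error;

	if (vma->vm_ops && vma->vm_ops->may_mprotect) {
		error = vma->vm_ops->may_mprotect(vma, start, end, prot);
		if (error)
			return error;
	}

	error = demo_security_file_mprotect(vma);
	if (error)
		return error;

	return demo_mprotect_fixup(vma);
}

/* Example callback that rejects every request. */
static int demo_deny_all(struct demo_vma *vma, unsigned long start,
			 unsigned long end, unsigned long prot)
{
	(void)vma; (void)start; (void)end; (void)prot;
	return -EACCES;
}
```

The point of the naming is that the callback only answers "may this proceed?"; the generic code still owns the LSM check and the actual protection change.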