2018-11-01 21:20:42

by Michael S. Tsirkin

[permalink] [raw]
Subject: [PULL] vhost: cleanups and fixes

The following changes since commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d:

Linux 4.19 (2018-10-22 07:37:37 +0100)

are available in the Git repository at:

git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

for you to fetch changes up to 79f800b2e76923cd8ce0aa659cb5c019d9643bc9:

MAINTAINERS: remove reference to bogus vsock file (2018-10-24 21:16:14 -0400)

----------------------------------------------------------------
virtio, vhost: fixes, tweaks

virtio balloon page hinting support
vhost scsi control queue

misc fixes.

Signed-off-by: Michael S. Tsirkin <[email protected]>

----------------------------------------------------------------
Bijan Mottahedeh (3):
vhost/scsi: Respond to control queue operations
vhost/scsi: Extract common handling code from control queue handler
vhost/scsi: Use common handling code in request queue handler

Greg Edwards (1):
vhost/scsi: truncate T10 PI iov_iter to prot_bytes

Lénaïc Huard (1):
kvm_config: add CONFIG_VIRTIO_MENU

Stefan Hajnoczi (1):
MAINTAINERS: remove reference to bogus vsock file

Wei Wang (3):
virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
mm/page_poison: expose page_poisoning_enabled to kernel modules
virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON

MAINTAINERS | 1 -
drivers/vhost/scsi.c | 426 ++++++++++++++++++++++++++++--------
drivers/virtio/virtio_balloon.c | 380 +++++++++++++++++++++++++++++---
include/uapi/linux/virtio_balloon.h | 8 +
kernel/configs/kvm_guest.config | 1 +
mm/page_poison.c | 6 +
6 files changed, 688 insertions(+), 134 deletions(-)


2018-11-01 21:45:59

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Thu, Nov 1, 2018 at 2:19 PM Michael S. Tsirkin <[email protected]> wrote:
>
> virtio, vhost: fixes, tweaks

Pulled.

Linus

2018-11-01 23:09:00

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <[email protected]> wrote:
>
> + memset(&rsp, 0, sizeof(rsp));
> + rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> + resp = vq->iov[out].iov_base;
> + ret = __copy_to_user(resp, &rsp, sizeof(rsp));
>
> Is it actually safe to trust that iov_base has passed an earlier
> access_ok() check here? Why not just use copy_to_user() instead?

Good point.

We really should have removed those double-underscore things ages ago.

Also, apart from the address, what about the size? Wouldn't it be
better to use copy_to_iter() rather than implement it badly by hand?

Linus

2018-11-01 23:09:29

by Kees Cook

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Thu, Nov 1, 2018 at 2:19 PM, Michael S. Tsirkin <[email protected]> wrote:
> The following changes since commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d:
>
> Linux 4.19 (2018-10-22 07:37:37 +0100)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
>
> for you to fetch changes up to 79f800b2e76923cd8ce0aa659cb5c019d9643bc9:
>
> MAINTAINERS: remove reference to bogus vsock file (2018-10-24 21:16:14 -0400)
>
> ----------------------------------------------------------------
> virtio, vhost: fixes, tweaks
>
> virtio balloon page hinting support
> vhost scsi control queue
>
> misc fixes.
>
> Signed-off-by: Michael S. Tsirkin <[email protected]>
>
> ----------------------------------------------------------------
> Bijan Mottahedeh (3):
> vhost/scsi: Respond to control queue operations

+static void
+vhost_scsi_send_tmf_resp(struct vhost_scsi *vs,
+ struct vhost_virtqueue *vq,
+ int head, unsigned int out)
+{
+ struct virtio_scsi_ctrl_tmf_resp __user *resp;
+ struct virtio_scsi_ctrl_tmf_resp rsp;
+ int ret;
+
+ pr_debug("%s\n", __func__);
+ memset(&rsp, 0, sizeof(rsp));
+ rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
+ resp = vq->iov[out].iov_base;
+ ret = __copy_to_user(resp, &rsp, sizeof(rsp));

Is it actually safe to trust that iov_base has passed an earlier
access_ok() check here? Why not just use copy_to_user() instead?

-Kees

> vhost/scsi: Extract common handling code from control queue handler
> vhost/scsi: Use common handling code in request queue handler
>
> Greg Edwards (1):
> vhost/scsi: truncate T10 PI iov_iter to prot_bytes
>
> Lénaïc Huard (1):
> kvm_config: add CONFIG_VIRTIO_MENU
>
> Stefan Hajnoczi (1):
> MAINTAINERS: remove reference to bogus vsock file
>
> Wei Wang (3):
> virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
> mm/page_poison: expose page_poisoning_enabled to kernel modules
> virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON
>
> MAINTAINERS | 1 -
> drivers/vhost/scsi.c | 426 ++++++++++++++++++++++++++++--------
> drivers/virtio/virtio_balloon.c | 380 +++++++++++++++++++++++++++++---
> include/uapi/linux/virtio_balloon.h | 8 +
> kernel/configs/kvm_guest.config | 1 +
> mm/page_poison.c | 6 +
> 6 files changed, 688 insertions(+), 134 deletions(-)



--
Kees Cook

2018-11-01 23:38:54

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Thu, Nov 01, 2018 at 04:00:23PM -0700, Kees Cook wrote:
> On Thu, Nov 1, 2018 at 2:19 PM, Michael S. Tsirkin <[email protected]> wrote:
> > The following changes since commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d:
> >
> > Linux 4.19 (2018-10-22 07:37:37 +0100)
> >
> > are available in the Git repository at:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus
> >
> > for you to fetch changes up to 79f800b2e76923cd8ce0aa659cb5c019d9643bc9:
> >
> > MAINTAINERS: remove reference to bogus vsock file (2018-10-24 21:16:14 -0400)
> >
> > ----------------------------------------------------------------
> > virtio, vhost: fixes, tweaks
> >
> > virtio balloon page hinting support
> > vhost scsi control queue
> >
> > misc fixes.
> >
> > Signed-off-by: Michael S. Tsirkin <[email protected]>
> >
> > ----------------------------------------------------------------
> > Bijan Mottahedeh (3):
> > vhost/scsi: Respond to control queue operations
>
> +static void
> +vhost_scsi_send_tmf_resp(struct vhost_scsi *vs,
> + struct vhost_virtqueue *vq,
> + int head, unsigned int out)
> +{
> + struct virtio_scsi_ctrl_tmf_resp __user *resp;
> + struct virtio_scsi_ctrl_tmf_resp rsp;
> + int ret;
> +
> + pr_debug("%s\n", __func__);
> + memset(&rsp, 0, sizeof(rsp));
> + rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> + resp = vq->iov[out].iov_base;
> + ret = __copy_to_user(resp, &rsp, sizeof(rsp));
>
> Is it actually safe to trust that iov_base has passed an earlier
> access_ok() check here? Why not just use copy_to_user() instead?
>
> -Kees

I am not sure copy_to_user will do the right thing here, because all
this runs in context of a kernel thread. We do need access_ok which
takes place way earlier in context of the task.

Another reason it is safe is because the address is not
coming from userspace at all.




> > vhost/scsi: Extract common handling code from control queue handler
> > vhost/scsi: Use common handling code in request queue handler
> >
> > Greg Edwards (1):
> > vhost/scsi: truncate T10 PI iov_iter to prot_bytes
> >
> > L?na?c Huard (1):
> > kvm_config: add CONFIG_VIRTIO_MENU
> >
> > Stefan Hajnoczi (1):
> > MAINTAINERS: remove reference to bogus vsock file
> >
> > Wei Wang (3):
> > virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
> > mm/page_poison: expose page_poisoning_enabled to kernel modules
> > virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON
> >
> > MAINTAINERS | 1 -
> > drivers/vhost/scsi.c | 426 ++++++++++++++++++++++++++++--------
> > drivers/virtio/virtio_balloon.c | 380 +++++++++++++++++++++++++++++---
> > include/uapi/linux/virtio_balloon.h | 8 +
> > kernel/configs/kvm_guest.config | 1 +
> > mm/page_poison.c | 6 +
> > 6 files changed, 688 insertions(+), 134 deletions(-)
>
>
>
> --
> Kees Cook

2018-11-01 23:57:49

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Thu, Nov 01, 2018 at 04:06:19PM -0700, Linus Torvalds wrote:
> On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <[email protected]> wrote:
> >
> > + memset(&rsp, 0, sizeof(rsp));
> > + rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> > + resp = vq->iov[out].iov_base;
> > + ret = __copy_to_user(resp, &rsp, sizeof(rsp));
> >
> > Is it actually safe to trust that iov_base has passed an earlier
> > access_ok() check here? Why not just use copy_to_user() instead?
>
> Good point.
>
> We really should have removed those double-underscore things ages ago.

Well in case of vhost there are a bunch of reasons to keep them.

One is that all access_ok checks take place
way earlier in context of the owner task. Result is saved
and then used for access repeatedly. Skipping reding access_ok twice
did seem to give a small speedup in the past.
In fact I am looking
into switching some of the uses to unsafe_put_user/unsafe_get_user
after doing something like barrier_nospec after the
access_ok checks. Seems to give a measureable speedup.


Another is that the double underscore things actually can be done
in softirq context if you do preempt_disable/preempt_enable.
IIUC Jason's looking into using that to cut down the latency
for when the access is very small.

> Also, apart from the address, what about the size? Wouldn't it be
> better to use copy_to_iter() rather than implement it badly by hand?
>
> Linus

Generally size is checked when we retrieve the iov but I will recheck
this case and reply here.

2018-11-02 11:47:25

by Mark Rutland

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Thu, Nov 01, 2018 at 04:06:19PM -0700, Linus Torvalds wrote:
> On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <[email protected]> wrote:
> >
> > + memset(&rsp, 0, sizeof(rsp));
> > + rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> > + resp = vq->iov[out].iov_base;
> > + ret = __copy_to_user(resp, &rsp, sizeof(rsp));
> >
> > Is it actually safe to trust that iov_base has passed an earlier
> > access_ok() check here? Why not just use copy_to_user() instead?
>
> Good point.
>
> We really should have removed those double-underscore things ages ago.

FWIW, on arm64 we always check/sanitize the user address as a result of
our sanitization of speculated values. Almost all of our uaccess
routines have an explicit access_ok().

All our uaccess routines mask the user pointer based on addr_limit,
which prevents speculative or architectural uaccess to kernel addresses
when addr_limit it USER_DS:

4d8efc2d5ee4c9cc ("arm64: Use pointer masking to limit uaccess speculation")

We also inhibit speculative stores to addr_limit being forwarded under
speculation:

c2f0ad4fc089cff8 ("arm64: uaccess: Prevent speculative use of the current addr_limit")

... and given all that, we folded explicit access_ok() checks into
__{get,put}_user():

84624087dd7e3b48 ("arm64: uaccess: Don't bother eliding access_ok checks in __{get, put}_user")

IMO we could/should do the same for __copy_{to,from}_user().

Thanks,
Mark.

2018-11-02 13:05:35

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 02, 2018 at 11:46:36AM +0000, Mark Rutland wrote:
> On Thu, Nov 01, 2018 at 04:06:19PM -0700, Linus Torvalds wrote:
> > On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <[email protected]> wrote:
> > >
> > > + memset(&rsp, 0, sizeof(rsp));
> > > + rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> > > + resp = vq->iov[out].iov_base;
> > > + ret = __copy_to_user(resp, &rsp, sizeof(rsp));
> > >
> > > Is it actually safe to trust that iov_base has passed an earlier
> > > access_ok() check here? Why not just use copy_to_user() instead?
> >
> > Good point.
> >
> > We really should have removed those double-underscore things ages ago.
>
> FWIW, on arm64 we always check/sanitize the user address as a result of
> our sanitization of speculated values. Almost all of our uaccess
> routines have an explicit access_ok().
>
> All our uaccess routines mask the user pointer based on addr_limit,
> which prevents speculative or architectural uaccess to kernel addresses
> when addr_limit it USER_DS:
>
> 4d8efc2d5ee4c9cc ("arm64: Use pointer masking to limit uaccess speculation")
>
> We also inhibit speculative stores to addr_limit being forwarded under
> speculation:
>
> c2f0ad4fc089cff8 ("arm64: uaccess: Prevent speculative use of the current addr_limit")
>
> ... and given all that, we folded explicit access_ok() checks into
> __{get,put}_user():
>
> 84624087dd7e3b48 ("arm64: uaccess: Don't bother eliding access_ok checks in __{get, put}_user")
>
> IMO we could/should do the same for __copy_{to,from}_user().
>
> Thanks,
> Mark.

I've tried making access_ok mask the parameter it gets. Works because
access_ok is a macro. Most users pass in a variable so that will block
attempts to use speculation to bypass the access_ok checks.

Not 100% as someone can copy the value before access_ok, but
then it's all mitigation anyway.

Places which call access_ok on a non-lvalue need to be fixed then but
there are not too many of these.

The advantage here is that a code like this:

access_ok
for(...)
__get_user

isn't slowed down as the masking is outside the loop.

OTOH macros changing their arguments are kind of ugly.
What do others think?

Just to show what I mean:

Signed-off-by: Michael S. Tsirkin <[email protected]>

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index aae77eb8491c..c4d12c8f47d7 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -7,6 +7,7 @@
#include <linux/compiler.h>
#include <linux/kasan-checks.h>
#include <linux/string.h>
+#include <linux/nospec.h>
#include <asm/asm.h>
#include <asm/page.h>
#include <asm/smap.h>
@@ -69,6 +70,33 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
__chk_range_not_ok((unsigned long __force)(addr), size, limit); \
})

+/*
+ * Test whether a block of memory is a valid user space address.
+ * Returns 0 if the range is valid, address itself otherwise.
+ */
+static inline unsigned long __verify_range_nospec(unsigned long addr,
+ unsigned long size,
+ unsigned long limit)
+{
+ /* Be careful about overflow */
+ limit = array_index_nospec(limit, size);
+
+ /*
+ * If we have used "sizeof()" for the size,
+ * we know it won't overflow the limit (but
+ * it might overflow the 'addr', so it's
+ * important to subtract the size from the
+ * limit, not add it to the address).
+ */
+ if (__builtin_constant_p(size)) {
+ return array_index_nospec(addr, limit - size + 1);
+ }
+
+ /* Arbitrary sizes? Be careful about overflow */
+ return array_index_mask_nospec(limit, size) &
+ array_index_nospec(addr, limit - size + 1);
+}
+
#ifdef CONFIG_DEBUG_ATOMIC_SLEEP
# define WARN_ON_IN_IRQ() WARN_ON_ONCE(!in_task())
#else
@@ -95,12 +123,46 @@ static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, un
* checks that the pointer is in the user space range - after calling
* this function, memory access functions may still return -EFAULT.
*/
-#define access_ok(type, addr, size) \
+#define unsafe_access_ok(type, addr, size) \
({ \
WARN_ON_IN_IRQ(); \
likely(!__range_not_ok(addr, size, user_addr_max())); \
})

+/**
+ * access_ok_nospec: - Checks if a user space pointer is valid
+ * @type: Type of access: %VERIFY_READ or %VERIFY_WRITE. Note that
+ * %VERIFY_WRITE is a superset of %VERIFY_READ - if it is safe
+ * to write to a block, it is always safe to read from it.
+ * @addr: User space pointer to start of block to check
+ * @size: Size of block to check
+ *
+ * Context: User context only. This function may sleep if pagefaults are
+ * enabled.
+ *
+ * Checks if a pointer to a block of memory in user space is valid.
+ *
+ * Returns address itself (nonzero) if the memory block may be valid,
+ * zero if it is definitely invalid.
+ *
+ * To prevent speculation, the returned value must then be used
+ * for accesses.
+ *
+ * Note that, depending on architecture, this function probably just
+ * checks that the pointer is in the user space range - after calling
+ * this function, memory access functions may still return -EFAULT.
+ */
+#define access_ok_nospec(type, addr, size) \
+({ \
+ WARN_ON_IN_IRQ(); \
+ __chk_user_ptr(addr); \
+ addr = (typeof(addr) __force) \
+ __verify_range_nospec((unsigned long __force)(addr), \
+ size, user_addr_max()); \
+})
+
+#define access_ok(type, addr, size) access_ok_nospec(type, addr, size)
+
/*
* These are the main single-value transfer routines. They automatically
* use the right size if we just have the right pointer type.
--
MST

2018-11-02 16:17:11

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 2, 2018 at 6:04 AM Michael S. Tsirkin <[email protected]> wrote:
>
> I've tried making access_ok mask the parameter it gets.

PLEASE don't do this.

Just use "copy_to/from_user()".

We have had lots of bugs because code bitrots.

And no, the access_ok() checks aren't expensive, not even in a loop.
They *used* to be somewhat expensive compared to the access, but that
simply isn't true any more. The real expense in copy_to_user and
friends are in the user access bit setting (STAC and CLAC on x86),
which easily an order of magnitude more expensive than access_ok().

So just get rid of the double-underscore version. It's basically
always a mis-optimization due to entirely historical reasons. I can
pretty much guarantee that it's not visible in profiles.

Linus

2018-11-02 17:00:30

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 02, 2018 at 09:14:51AM -0700, Linus Torvalds wrote:
> On Fri, Nov 2, 2018 at 6:04 AM Michael S. Tsirkin <[email protected]> wrote:
> >
> > I've tried making access_ok mask the parameter it gets.
>
> PLEASE don't do this.

Okay.

> Just use "copy_to/from_user()".

Just for completeness I'd like to point out for vhost the copies are
done from the kernel thread. So yes we can switch to copy_to/from_user
but for e.g. 32-bit userspace running on top of a 64 bit kernel it is
IIUC not sufficient - we must *also* do access_ok checks on control path
when addresses are passed to the kernel and when current points to the
correct task struct.

> We have had lots of bugs because code bitrots.

Yes, I wish we did not need these access_ok checks and could just rely
on copy_to/from_user.

> And no, the access_ok() checks aren't expensive, not even in a loop.
> They *used* to be somewhat expensive compared to the access, but that
> simply isn't true any more. The real expense in copy_to_user and
> friends are in the user access bit setting (STAC and CLAC on x86),
> which easily an order of magnitude more expensive than access_ok().
>
> So just get rid of the double-underscore version. It's basically
> always a mis-optimization due to entirely historical reasons. I can
> pretty much guarantee that it's not visible in profiles.
>
> Linus

OK. So maybe we should focus on switching to user_access_begin/end +
unsafe_get_user/unsafe_put_user in a loop which does seem to be
measureable. That moves the barrier out of the loop, which seems to be
consistent with what you would expect.

--
MST

2018-11-02 17:17:00

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 2, 2018 at 10:10 AM Linus Torvalds
<[email protected]> wrote:
>
> Don't you take over the VM with "use_mm()" when you do the copies? So
> yes, it's a kernel thread, but it has a user VM, and though that
> should have the user limits.

Oooh. *Just* as I sent this, I realized that "use_mm()" doesn't update
the thread addr_limit.

That actually looks like a bug to me - although one that you've
apparently been aware of and worked around.

Wouldn't it be nicer to just make "use_mm()" do

set_fs(USER_DS);

instead? And undo it on unuse_mm()?

And, in fact, maybe we should default kernel threads to have a zero
address limit, so that they can't do any user accesses at all without
doing this?

Adding Al to the cc, because I think he's been looking at set_fs() in general.

Linus

2018-11-02 17:21:32

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 2, 2018 at 9:59 AM Michael S. Tsirkin <[email protected]> wrote:
>
> Just for completeness I'd like to point out for vhost the copies are
> done from the kernel thread. So yes we can switch to copy_to/from_user
> but for e.g. 32-bit userspace running on top of a 64 bit kernel it is
> IIUC not sufficient - we must *also* do access_ok checks on control path
> when addresses are passed to the kernel and when current points to the
> correct task struct.

Don't you take over the VM with "use_mm()" when you do the copies? So
yes, it's a kernel thread, but it has a user VM, and though that
should have the user limits.

No?

Linus

2018-11-02 17:21:56

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 02, 2018 at 10:10:45AM -0700, Linus Torvalds wrote:
> On Fri, Nov 2, 2018 at 9:59 AM Michael S. Tsirkin <[email protected]> wrote:
> >
> > Just for completeness I'd like to point out for vhost the copies are
> > done from the kernel thread. So yes we can switch to copy_to/from_user
> > but for e.g. 32-bit userspace running on top of a 64 bit kernel it is
> > IIUC not sufficient - we must *also* do access_ok checks on control path
> > when addresses are passed to the kernel and when current points to the
> > correct task struct.
>
> Don't you take over the VM with "use_mm()" when you do the copies?

Yes we do.

> So
> yes, it's a kernel thread, but it has a user VM, and though that
> should have the user limits.
>
> No?
>
> Linus

Here's what I meant: we have

#define access_ok(type, addr, size) \
({ \
WARN_ON_IN_IRQ(); \
likely(!__range_not_ok(addr, size, user_addr_max())); \
})

and

#define user_addr_max() (current->thread.addr_limit.seg)

it seems that it depends on current not on the active mm.

get_user and friends are similar:

ENTRY(__get_user_1)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
and %_ASM_DX, %_ASM_AX
ASM_STAC
1: movzbl (%_ASM_AX),%edx
xor %eax,%eax
ASM_CLAC
ret
ENDPROC(__get_user_1)
EXPORT_SYMBOL(__get_user_1)

--
MST

2018-11-02 18:11:42

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 2, 2018 at 10:21 AM Michael S. Tsirkin <[email protected]> wrote:
>
> it seems that it depends on current not on the active mm.

Yes, see my other suggestion to just fix "use_mm()" to update the address range.

Would you mind testing that?

Because that would seem to be the *much* less error-prone model..

Linus

2018-11-02 18:13:16

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 02, 2018 at 11:02:35AM -0700, Linus Torvalds wrote:
> On Fri, Nov 2, 2018 at 10:21 AM Michael S. Tsirkin <[email protected]> wrote:
> >
> > it seems that it depends on current not on the active mm.
>
> Yes, see my other suggestion to just fix "use_mm()" to update the address range.
>
> Would you mind testing that?

Sure, I'll test it.

> Because that would seem to be the *much* less error-prone model..
>
> Linus

I agree, it's always been bothering me.

--
MST

2018-11-02 19:02:41

by Al Viro

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 02, 2018 at 10:15:56AM -0700, Linus Torvalds wrote:
> On Fri, Nov 2, 2018 at 10:10 AM Linus Torvalds
> <[email protected]> wrote:
> >
> > Don't you take over the VM with "use_mm()" when you do the copies? So
> > yes, it's a kernel thread, but it has a user VM, and though that
> > should have the user limits.
>
> Oooh. *Just* as I sent this, I realized that "use_mm()" doesn't update
> the thread addr_limit.
>
> That actually looks like a bug to me - although one that you've
> apparently been aware of and worked around.
>
> Wouldn't it be nicer to just make "use_mm()" do
>
> set_fs(USER_DS);
>
> instead? And undo it on unuse_mm()?
>
> And, in fact, maybe we should default kernel threads to have a zero
> address limit, so that they can't do any user accesses at all without
> doing this?

Try it and watch it fail to set initramfs up, let alone exec the init...

> Adding Al to the cc, because I think he's been looking at set_fs() in general.

It would be the right thing (with return to KERNEL_DS), but I'm not certain
if GPU users will survive - these two
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h:157: use_mm(mmptr); \
drivers/gpu/drm/i915/gvt/kvmgt.c:1799: use_mm(kvm->mm);
I don't understand the call chains there (especially for the first one) well
enough to tell.

2018-11-30 13:45:35

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Thu, Nov 01, 2018 at 04:06:19PM -0700, Linus Torvalds wrote:
> On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <[email protected]> wrote:
> >
> > + memset(&rsp, 0, sizeof(rsp));
> > + rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> > + resp = vq->iov[out].iov_base;
> > + ret = __copy_to_user(resp, &rsp, sizeof(rsp));
> >
> > Is it actually safe to trust that iov_base has passed an earlier
> > access_ok() check here? Why not just use copy_to_user() instead?
>
> Good point.
>
> We really should have removed those double-underscore things ages ago.
>
> Also, apart from the address, what about the size? Wouldn't it be
> better to use copy_to_iter() rather than implement it badly by hand?
>
> Linus

Bijan can you respond please?
Are you going to look into this and convert code to copy_to_iter?
I don't think we should release Linux like this, so if you don't
have the time I'd rather revert for now and you can look
into reposting for the next release.

Thanks,

--
MST


2018-11-30 19:03:04

by Bijan Mottahedeh

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On 11/30/2018 5:44 AM, Michael S. Tsirkin wrote:
> On Thu, Nov 01, 2018 at 04:06:19PM -0700, Linus Torvalds wrote:
>> On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <[email protected]> wrote:
>>> + memset(&rsp, 0, sizeof(rsp));
>>> + rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
>>> + resp = vq->iov[out].iov_base;
>>> + ret = __copy_to_user(resp, &rsp, sizeof(rsp));
>>>
>>> Is it actually safe to trust that iov_base has passed an earlier
>>> access_ok() check here? Why not just use copy_to_user() instead?
>> Good point.
>>
>> We really should have removed those double-underscore things ages ago.
>>
>> Also, apart from the address, what about the size? Wouldn't it be
>> better to use copy_to_iter() rather than implement it badly by hand?
>>
>> Linus
> Bijan can you respond please?
> Are you going to look into this and convert code to copy_to_iter?
> I don't think we should release Linux like this, so if you don't
> have the time I'd rather revert for now and you can look
> into reposting for the next release.
>
> Thanks,
>

Sure, will do.  Can I send an individual patch for the fix to
vhost_scsi_send_tmf_reject()?

Thanks.

--bijan

2018-11-30 19:56:38

by Michael S. Tsirkin

[permalink] [raw]
Subject: Re: [PULL] vhost: cleanups and fixes

On Fri, Nov 30, 2018 at 11:01:03AM -0800, Bijan Mottahedeh wrote:
> On 11/30/2018 5:44 AM, Michael S. Tsirkin wrote:
> > On Thu, Nov 01, 2018 at 04:06:19PM -0700, Linus Torvalds wrote:
> > > On Thu, Nov 1, 2018 at 4:00 PM Kees Cook <[email protected]> wrote:
> > > > + memset(&rsp, 0, sizeof(rsp));
> > > > + rsp.response = VIRTIO_SCSI_S_FUNCTION_REJECTED;
> > > > + resp = vq->iov[out].iov_base;
> > > > + ret = __copy_to_user(resp, &rsp, sizeof(rsp));
> > > >
> > > > Is it actually safe to trust that iov_base has passed an earlier
> > > > access_ok() check here? Why not just use copy_to_user() instead?
> > > Good point.
> > >
> > > We really should have removed those double-underscore things ages ago.
> > >
> > > Also, apart from the address, what about the size? Wouldn't it be
> > > better to use copy_to_iter() rather than implement it badly by hand?
> > >
> > > Linus
> > Bijan can you respond please?
> > Are you going to look into this and convert code to copy_to_iter?
> > I don't think we should release Linux like this, so if you don't
> > have the time I'd rather revert for now and you can look
> > into reposting for the next release.
> >
> > Thanks,
> >
>
> Sure, will do.? Can I send an individual patch for the fix to
> vhost_scsi_send_tmf_reject()?
>
> Thanks.
>
> --bijan

Please go ahead.

--
MST