Subject: Re: [RFC] vhost: introduce mdev based hardware vhost backend
From: Jason Wang
To: "Michael S. Tsirkin"
Cc: Tiwei Bie, alex.williamson@redhat.com, ddutile@redhat.com, alexander.h.duyck@intel.com, virtio-dev@lists.oasis-open.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, dan.daly@intel.com, cunming.liang@intel.com, zhihong.wang@intel.com, jianfeng.tan@intel.com, xiao.w.wang@intel.com
Date: Fri, 20 Apr 2018 11:52:47 +0800
Message-ID: <060e2b5f-2e93-c53f-387b-5baaa33e87cd@redhat.com>
In-Reply-To: <20180419212911-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018-04-20 02:40, Michael S. Tsirkin wrote:
> On Tue, Apr 10, 2018 at 03:25:45PM +0800, Jason Wang wrote:
>>>>> One problem is that, different virtio ring compatible devices
>>>>> may have different device interfaces. That is to say, we will
>>>>> need different drivers in QEMU. It could be troublesome. And
>>>>> that's what this patch trying to fix.
>>>>> The idea behind this patch is very simple: mdev is a standard way
>>>>> to emulate device in kernel.
>>>> So you just move the abstraction layer from qemu to kernel, and you still
>>>> need different drivers in kernel for different device interfaces of
>>>> accelerators. This looks even more complex than leaving it in qemu. As you
>>>> said, another idea is to implement userspace vhost backend for accelerators
>>>> which seems easier and could co-work with other parts of qemu without
>>>> inventing new type of messages.
>>> I'm not quite sure. Do you think it's acceptable to
>>> add various vendor specific hardware drivers in QEMU?
>>>
>> I don't object but we need to figure out the advantages of doing it in qemu
>> too.
>>
>> Thanks
> To be frank kernel is exactly where device drivers belong. DPDK did
> move them to userspace but that's merely a requirement for data path.
> *If* you can have them in kernel that is best:
> - update kernel and there's no need to rebuild userspace

Well, you still need to rebuild userspace, since a new vhost backend is
required which relies on the vhost protocol through the mdev API. And I
believe upgrading a userspace package is considered more lightweight than
upgrading the kernel. With mdev, we're likely to repeat the story of the
vhost API: dealing with features/versions and endlessly inventing new APIs
for new features. And you will still need to rebuild the userspace.

> - apps can be written in any language no need to maintain multiple
> libraries or add wrappers

This is not a big issue, considering it's not a generic network driver but
an mdev driver; the only possible user is a VM.

> - security concerns are much smaller (ok people are trying to
> raise the bar with IOMMUs and such, but it's already pretty
> good even without)

Well, I think not; kernel bugs are much more serious than userspace ones.
And I bet the kernel driver itself won't be small.
>
> The biggest issue is that you let userspace poke at the
> device which is also allowed by the IOMMU to poke at
> kernel memory (needed for kernel driver to work).

I don't quite get it. The userspace driver could certainly be built on top
of VFIO, so kernel memory would be perfectly isolated in this case.

>
> Yes, maybe if device is not buggy it's all fine, but
> it's better if we do not have to trust the device
> otherwise the security picture becomes more murky.
>
> I suggested attaching a PASID to (some) queues - see my old post "using
> PASIDs to enable a safe variant of direct ring access".
>
> Then using IOMMU with VFIO to limit access through queue to correct
> ranges of memory.

Well, a userspace driver could benefit from this too. And we can go even
further by using nested I/O page tables to share the IOVA address space
between devices and a VM.

Thanks