From: Yongji Xie
Date: Fri, 14 May 2021 21:58:32 +0800
Subject: Re: Re: Re: [RFC PATCH V2 0/7] Do not read from descripto ring
To: "Michael S. Tsirkin"
Cc: Stefan Hajnoczi, Jason Wang, virtualization, linux-kernel, file@sect.tu-berlin.de, ashish.kalra@amd.com, konrad.wilk@oracle.com, kvm, Christoph Hellwig
In-Reply-To: <20210514073452-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, May 14, 2021 at 7:36 PM Michael S. Tsirkin wrote:
>
> On Fri, May 14, 2021 at 07:27:22PM +0800, Yongji Xie wrote:
> > On Fri, May 14, 2021 at 7:17 PM Stefan Hajnoczi wrote:
> > >
> > > On Fri, May 14, 2021 at 03:29:20PM +0800, Jason Wang wrote:
> > > > On Fri, May 14, 2021 at 12:27 AM Stefan Hajnoczi wrote:
> > > > >
> > > > > On Fri, Apr 23, 2021 at 04:09:35PM +0800, Jason Wang wrote:
> > > > > > Sometimes, the driver doesn't trust the device. This usually
> > > > > > happens for an encrypted VM or VDUSE[1].
> > > > >
> > > > > Thanks for doing this.
> > > > >
> > > > > Can you describe the overall memory safety model that virtio drivers
> > > > > must follow?
> > > >
> > > > My understanding is that, basically, the driver should not trust the
> > > > device (since the driver doesn't know what kind of device it
> > > > tries to drive):
> > > >
> > > > 1) For any read-only metadata (required at the spec level) which is
> > > > mapped as coherent, the driver should not depend on metadata that is
> > > > stored in a place that could be written by the device. This is what
> > > > this series tries to achieve.
> > > > 2) For other metadata that is produced by the device, we need to make
> > > > sure there's no malicious device-triggered behavior. This is somewhat
> > > > similar to what vhost did: no DoS, loops, kernel bugs and so on.
> > > > 3) swiotlb is a must to enforce memory access isolation (VDUSE or
> > > > encrypted VM).
> > > >
> > > > > For example:
> > > > >
> > > > > - Driver-to-device buffers must be on dedicated pages to avoid
> > > > > information leaks.
> > > >
> > > > It looks to me that if swiotlb is used, we don't need this, since the
> > > > bouncing is done per byte, not per page.
> > > >
> > > > But if swiotlb is not used, we need to enforce this.
> > > >
> > > > >
> > > > > - Driver-to-device buffers must be on dedicated pages to avoid memory
> > > > > corruption.
> > > >
> > > > Similar to the above.
> > > >
> > > > >
> > > > > When I say "pages" I guess it's the IOMMU page size that matters?
> > > > >
> > > >
> > > > And the IOTLB page size.
> > > >
> > > > > What is the memory access granularity of VDUSE?
> > > >
> > > > It has an swiotlb, but the access and bouncing are done per byte.
> > > >
> > > > >
> > > > > I'm asking these questions because there is driver code that exposes
> > > > > kernel memory to the device and I'm not sure it's safe.
> > > > > For example:
> > > > >
> > > > > static int virtblk_add_req(struct virtqueue *vq, struct virtblk_req *vbr,
> > > > >                 struct scatterlist *data_sg, bool have_data)
> > > > > {
> > > > >     struct scatterlist hdr, status, *sgs[3];
> > > > >     unsigned int num_out = 0, num_in = 0;
> > > > >
> > > > >     sg_init_one(&hdr, &vbr->out_hdr, sizeof(vbr->out_hdr));
> > > > >                       ^^^^^^^^^^^^^
> > > > >     sgs[num_out++] = &hdr;
> > > > >
> > > > >     if (have_data) {
> > > > >         if (vbr->out_hdr.type & cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_OUT))
> > > > >             sgs[num_out++] = data_sg;
> > > > >         else
> > > > >             sgs[num_out + num_in++] = data_sg;
> > > > >     }
> > > > >
> > > > >     sg_init_one(&status, &vbr->status, sizeof(vbr->status));
> > > > >                          ^^^^^^^^^^^^
> > > > >     sgs[num_out + num_in++] = &status;
> > > > >
> > > > >     return virtqueue_add_sgs(vq, sgs, num_out, num_in, vbr, GFP_ATOMIC);
> > > > > }
> > > > >
> > > > > I guess the drivers don't need to be modified as long as swiotlb is used
> > > > > to bounce the buffers through "insecure" memory so that the memory
> > > > > surrounding the buffers is not exposed?
> > > >
> > > > Yes, swiotlb won't bounce the whole page. So I think it's safe.
> > >
> > > Thanks Jason and Yongji Xie for clarifying. Seems like swiotlb or a
> > > similar mechanism can handle byte-granularity isolation, so the drivers
> > > do not need to worry about information leaks or memory corruption outside
> > > the mapped byte range.
> > >
> > > We still need to audit virtio guest drivers to ensure they don't trust
> > > data that can be modified by the device. I will look at virtio-blk and
> > > virtio-fs next week.
> > >
> >
> > Oh, that's great. Thank you!
> >
> > I also did some audit work these days and will send a new version for
> > review next Monday.
> >
> > Thanks,
> > Yongji
>
> Doing it in a way that won't hurt performance for simple
> configs that trust the device is a challenge though.
> Pls take a look at the discussion with Christoph for some ideas
> on how to do this.
>

I see. Thanks for the reminder.

Thanks,
Yongji
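
To illustrate the byte-granularity point from the thread, here is a minimal user-space sketch. It is not swiotlb and not kernel code. It only models the idea that a bounce buffer copies exactly the mapped byte range, so data next to out_hdr or status in the same structure is neither exposed to the device nor writable by it. The names demo_req, bounce_slot, bounce_map_out, bounce_map_in, bounce_unmap_in and slot_contains are all hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request layout: private data sits right next to the
 * fields handed to the device, loosely modelled on struct virtblk_req. */
struct demo_req {
	char private_before[8];   /* must never reach the device */
	char out_hdr[16];         /* driver-to-device */
	unsigned char status;     /* device-to-driver */
	char private_after[8];    /* must never be written by the device */
};

/* Hypothetical bounce slot standing in for device-accessible memory. */
struct bounce_slot {
	unsigned char buf[64];
	size_t len;               /* number of bytes actually mapped */
};

/* Driver-to-device map: copy exactly len bytes, nothing around them. */
static int bounce_map_out(struct bounce_slot *slot, const void *src, size_t len)
{
	assert(slot != NULL && src != NULL);
	if (len > sizeof(slot->buf))
		return -1;
	memset(slot->buf, 0, sizeof(slot->buf));  /* clear stale contents */
	memcpy(slot->buf, src, len);
	slot->len = len;
	return 0;
}

/* Device-to-driver map: reserve len bytes, copy nothing out. */
static int bounce_map_in(struct bounce_slot *slot, size_t len)
{
	assert(slot != NULL);
	if (len > sizeof(slot->buf))
		return -1;
	memset(slot->buf, 0, sizeof(slot->buf));
	slot->len = len;
	return 0;
}

/* Device-to-driver unmap: copy back only the mapped bytes, so whatever
 * the device wrote elsewhere in the slot never reaches driver memory. */
static void bounce_unmap_in(const struct bounce_slot *slot, void *dst)
{
	assert(slot != NULL && dst != NULL);
	memcpy(dst, slot->buf, slot->len);
}

/* Return 1 if the n bytes at pat occur anywhere in the slot buffer. */
static int slot_contains(const struct bounce_slot *slot, const void *pat, size_t n)
{
	size_t i;

	if (n == 0 || n > sizeof(slot->buf))
		return 0;
	for (i = 0; i + n <= sizeof(slot->buf); i++)
		if (memcmp(slot->buf + i, pat, n) == 0)
			return 1;
	return 0;
}
```

In this model, mapping req.out_hdr with bounce_map_out() places only those 16 bytes in the slot, and bounce_unmap_in() for req.status writes back a single byte even if the device has overwritten the whole slot. Mapping the whole page instead would carry private_before and private_after along with it, which is the case the dedicated-pages rule is meant to cover when swiotlb is not used.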