Date: Fri, 14 Dec 2018 14:06:47 +0000
From: "Dr. David Alan Gilbert"
To: Cornelia Huck
Cc: Stefan Hajnoczi, David Hildenbrand, Vivek Goyal, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, miklos@szeredi.hu, sweil@redhat.com, swhiteho@redhat.com
Subject: Re: [PATCH 18/52] virtio-fs: Map cache using the values from the capabilities
Message-ID: <20181214140646.GG2454@work-vm>
In-Reply-To: <20181214145058.6071bdac.cohuck@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Cornelia Huck (cohuck@redhat.com) wrote:
> On Fri, 14 Dec 2018 13:44:34 +0000
> Stefan Hajnoczi wrote:
>
> > On Thu, Dec 13, 2018 at 01:38:23PM +0100, Cornelia Huck wrote:
> > > On Thu, 13 Dec 2018 13:24:31 +0100
> > > David Hildenbrand wrote:
> > >
> > > > On 13.12.18 13:15, Dr. David Alan Gilbert wrote:
> > > > > * David Hildenbrand (david@redhat.com) wrote:
> > > > >> On 13.12.18 11:00, Dr. David Alan Gilbert wrote:
> > > > >>> * David Hildenbrand (david@redhat.com) wrote:
> > > > >>>> On 13.12.18 10:13, Dr. David Alan Gilbert wrote:
> > > > >>>>> * David Hildenbrand (david@redhat.com) wrote:
> > > > >>>>>> On 10.12.18 18:12, Vivek Goyal wrote:
> > > > >>>>>>> Instead of assuming we had the fixed bar for the cache, use the
> > > > >>>>>>> value from the capabilities.
> > > > >>>>>>>
> > > > >>>>>>> Signed-off-by: Dr. David Alan Gilbert
> > > > >>>>>>> ---
> > > > >>>>>>>  fs/fuse/virtio_fs.c | 32 +++++++++++++++++---------------
> > > > >>>>>>>  1 file changed, 17 insertions(+), 15 deletions(-)
> > > > >>>>>>>
> > > > >>>>>>> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> > > > >>>>>>> index 60d496c16841..55bac1465536 100644
> > > > >>>>>>> --- a/fs/fuse/virtio_fs.c
> > > > >>>>>>> +++ b/fs/fuse/virtio_fs.c
> > > > >>>>>>> @@ -14,11 +14,6 @@
> > > > >>>>>>>  #include
> > > > >>>>>>>  #include "fuse_i.h"
> > > > >>>>>>>
> > > > >>>>>>> -enum {
> > > > >>>>>>> -	/* PCI BAR number of the virtio-fs DAX window */
> > > > >>>>>>> -	VIRTIO_FS_WINDOW_BAR = 2,
> > > > >>>>>>> -};
> > > > >>>>>>> -
> > > > >>>>>>>  /* List of virtio-fs device instances and a lock for the list */
> > > > >>>>>>>  static DEFINE_MUTEX(virtio_fs_mutex);
> > > > >>>>>>>  static LIST_HEAD(virtio_fs_instances);
> > > > >>>>>>> @@ -518,7 +513,7 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
> > > > >>>>>>>  	struct dev_pagemap *pgmap;
> > > > >>>>>>>  	struct pci_dev *pci_dev;
> > > > >>>>>>>  	phys_addr_t phys_addr;
> > > > >>>>>>> -	size_t len;
> > > > >>>>>>> +	size_t bar_len;
> > > > >>>>>>>  	int ret;
> > > > >>>>>>>  	u8 have_cache, cache_bar;
> > > > >>>>>>>  	u64 cache_offset, cache_len;
> > > > >>>>>>> @@ -551,17 +546,13 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
> > > > >>>>>>>  	}
> > > > >>>>>>>
> > > > >>>>>>>  	/* TODO handle case where device doesn't expose BAR? */
> > > > >>>>>>
> > > > >>>>>> For virtio-pmem we decided to not go via BARs as this would effectively
> > > > >>>>>> make it only usable for virtio-pci implementers. Instead, we are going
> > > > >>>>>> to export the applicable physical device region directly (e.g.
> > > > >>>>>> phys_start, phys_size in virtio config), so it is decoupled from PCI
> > > > >>>>>> details. Doing the same for virtio-fs would allow e.g. also virtio-ccw
> > > > >>>>>> to make eventually use of this.
> > > > >>>>>
> > > > >>>>> That makes it a very odd looking PCI device; I can see that with
> > > > >>>>> virtio-pmem it makes some sense, given that its job is to expose
> > > > >>>>> arbitrary chunks of memory.
> > > > >>>>>
> > > > >>>>> Dave
> > > > >>>>
> > > > >>>> Well, the fact that you are
> > > > >>>>
> > > > >>>> - including
> > > > >>>> - adding pci related code
> > > > >>>>
> > > > >>>> in/to fs/fuse/virtio_fs.c
> > > > >>>>
> > > > >>>> tells me that these properties might be better communicated on the
> > > > >>>> virtio layer, not on the PCI layer.
> > > > >>>>
> > > > >>>> Or do you really want to glue virtio-fs to virtio-pci for all eternity?
> > > > >>>
> > > > >>> No, these need cleaning up; and the split within the bar
> > > > >>> is probably going to change to be communicated via virtio layer
> > > > >>> rather than pci capabilities. However, I don't want to make our PCI
> > > > >>> device look odd, just to make portability to non-PCI devices - so it's
> > > > >>> right to make the split appropriately, but still to use PCI bars
> > > > >>> for what they were designed for.
> > > > >>>
> > > > >>> Dave
> > > > >>
> > > > >> Let's discuss after the cleanup. In general I am not convinced this is
> > > > >> the right thing to do. Using virtio-pci for anything else than pure
> > > > >> transport smells like bad design to me (well, I am no virtio expert
> > > > >> after all ;) ). No matter what PCI bars were designed for. If we can't
> > > > >> get the same running with e.g. virtio-ccw or virtio-whatever, it is
> > > > >> broken by design (or an addon that is tightly glued to virtio-pci, if
> > > > >> that is the general idea).
> > > > >
> > > > > I'm sure we can find alternatives for virtio-*, so I wouldn't expect
> > > > > it to be glued to virtio-pci.
> > > > >
> > > > > Dave
> > > >
> > > > As s390x does not have the concept of memory mapped io (RAM is RAM,
> > > > nothing else), this is not architectured. virtio-ccw can therefore not
> > > > define anything similar like that. However, in virtual environments we
> > > > can do whatever we want on top of the pure transport (e.g. on the virtio
> > > > layer).
> > > >
> > > > Conny can correct me if I am wrong.
> > >
> > > I don't think you're wrong, but I haven't read the code yet and I'm
> > > therefore not aware of the purpose of this BAR.
> > >
> > > Generally, if there is a memory location shared between host and guest,
> > > we need a way to communicate its location, which will likely differ
> > > between transports. For ccw, I could imagine a new channel command
> > > dedicated to exchanging configuration information (similar to what
> > > exists today to communicate the locations of virtqueues), but I'd
> > > rather not go down this path.
> > >
> > > Without reading the code/design further, can we use one of the
> > > following instead of a BAR:
> > > - a virtqueue;
> > > - something in config space?
> > > That would be implementable by any virtio transport.
> >
> > The way I think about this is that we wish to extend the VIRTIO device
> > model with the concept of shared memory. virtio-fs, virtio-gpu, and
> > virtio-vhost-user all have requirements for shared memory.
> >
> > This seems like a transport-level issue to me. PCI supports
> > memory-mapped I/O and that's the right place to do it. If you try to
> > put it into config space or the virtqueue, you'll end up with something
> > that cannot be realized as a PCI device because it bypasses PCI bus
> > address translation.
> >
> > If CCW needs a side-channel, that's fine. But that side-channel is a
> > CCW-specific mechanism and probably doesn't apply to all other
> > transports.
>
> But virtio-gpu works with ccw right now (I haven't checked what it
> uses); can virtio-fs use an equivalent method?
>
> If there's a more generic case to be made for extending virtio devices
> with a way to handle shared memory, a ccw for that would be fine. I
> just want to avoid adding new ccws for everything as the namespace is
> not infinite.

In our case we've got somewhere between 0..3 ranges of memory, and I was
specifying them as PCI capabilities; however Gerd's suggestion was that
it would be better to just use 1 bar and then have something as part of
virtio or the like to split them up.

If we do that, then we could have something of the form
(index, base, length) for each of the regions, where in the PCI case
'index' means BAR and in CCW it means something else. (For mmio it's
probably irrelevant and the base is probably a physical address).

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
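
Editorial sketch of the (index, base, length) idea from the mail, in plain C. Everything named here (struct shm_region_desc, enum shm_transport, shm_region_resolve) is hypothetical and is not an existing virtio or kernel interface. It also assumes that for PCI 'base' is an offset inside the BAR selected by 'index', which the mail does not spell out; for mmio 'base' is taken as a physical address, as the mail suggests. CCW is left out because the mail only says 'index' would mean "something else" there.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one shared-memory region, in the
 * (index, base, length) form suggested in the mail. */
struct shm_region_desc {
	uint8_t  index;   /* PCI: BAR number; mmio: ignored */
	uint64_t base;    /* PCI: offset within the BAR (assumption);
	                   * mmio: physical address */
	uint64_t length;  /* region size in bytes */
};

/* Hypothetical transport tag; CCW is intentionally not modelled. */
enum shm_transport {
	SHM_TRANSPORT_PCI,
	SHM_TRANSPORT_MMIO,
};

/*
 * Translate a descriptor into a physical start address.
 * bar_start[] and bar_len[] describe the device's BARs and are only
 * consulted for PCI. Returns false if the region is empty, names a
 * BAR that does not exist, or does not fit.
 */
bool shm_region_resolve(enum shm_transport transport,
			const struct shm_region_desc *desc,
			const uint64_t *bar_start,
			const uint64_t *bar_len,
			size_t nbars,
			uint64_t *phys)
{
	if (desc->length == 0)
		return false;

	switch (transport) {
	case SHM_TRANSPORT_PCI:
		if (desc->index >= nbars)
			return false;
		/* Overflow-safe check that [base, base + length) lies in the BAR. */
		if (desc->base > bar_len[desc->index] ||
		    desc->length > bar_len[desc->index] - desc->base)
			return false;
		*phys = bar_start[desc->index] + desc->base;
		return true;
	case SHM_TRANSPORT_MMIO:
		/* 'index' is irrelevant; reject ranges that wrap around. */
		if (desc->length > UINT64_MAX - desc->base)
			return false;
		*phys = desc->base;
		return true;
	}
	return false;
}
```

With this shape, a device exposing up to three regions would publish up to three descriptors, and only the resolve step would differ per transport; the filesystem code would see just a physical address and a length.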