Date: Mon, 6 Sep 2021 06:43:15 -0400
From: "Michael S. Tsirkin"
To: Yongji Xie
Cc: Jason Wang, Stefan Hajnoczi, Stefano Garzarella, Parav Pandit,
 Christoph Hellwig, Christian Brauner, Randy Dunlap, Matthew Wilcox,
 Al Viro, Jens Axboe, bcrl@kvack.org, Jonathan Corbet, Mika Penttilä,
 Dan Carpenter, joro@8bytes.org, Greg KH, He Zhe, Liu Xiaodong,
 Joe Perches, Robin Murphy, Will Deacon, John Garry,
 songmuchun@bytedance.com, virtualization, netdev@vger.kernel.org, kvm,
 linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org,
 linux-kernel
Subject: Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
Message-ID: <20210906053210-mutt-send-email-mst@kernel.org>
References: <20210831103634.33-1-xieyongji@bytedance.com>
 <20210831103634.33-6-xieyongji@bytedance.com>
 <20210906015524-mutt-send-email-mst@kernel.org>
 <20210906023131-mutt-send-email-mst@kernel.org>
 <20210906035338-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 06, 2021 at 04:45:55PM +0800, Yongji Xie wrote:
> On Mon, Sep 6, 2021 at 4:01 PM Michael S. Tsirkin wrote:
> >
> > On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote:
> > > On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin wrote:
> > > >
> > > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> > > > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin wrote:
> > > > > >
> > > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > > > > > This adds a new callback to support device specific reset
> > > > > > > behavior. The vdpa bus driver will call the reset function
> > > > > > > instead of setting status to zero during resetting.
> > > > > > >
> > > > > > > Signed-off-by: Xie Yongji
> > > > > >
> > > > > >
> > > > > > This does gloss over a significant change though:
> > > > > >
> > > > > >
> > > > > > > ---
> > > > > > > @@ -348,12 +352,12 @@ static inline struct device *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > > > > > >          return vdev->dma_dev;
> > > > > > >  }
> > > > > > >
> > > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > > > > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > > > > > >  {
> > > > > > >          const struct vdpa_config_ops *ops = vdev->config;
> > > > > > >
> > > > > > >          vdev->features_valid = false;
> > > > > > > -        ops->set_status(vdev, 0);
> > > > > > > +        return ops->reset(vdev);
> > > > > > >  }
> > > > > > >
> > > > > > >  static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)
> > > > > >
> > > > > >
> > > > > > Unfortunately this breaks virtio_vdpa:
> > > > > >
> > > > > >
> > > > > > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > > > > > {
> > > > > >         struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> > > > > >
> > > > > >         vdpa_reset(vdpa);
> > > > > > }
> > > > > >
> > > > > >
> > > > > > and there's no easy way to fix this, kernel can't recover
> > > > > > from a reset failure e.g. during driver unbind.
> > > > > >
> > > > >
> > > > > Yes, but it should be safe with the protection of software IOTLB even
> > > > > if the reset() fails during driver unbind.
> > > > >
> > > > > Thanks,
> > > > > Yongji
> > > >
> > > > Hmm. I don't see it.
> > > > What exactly will happen? What prevents device from poking at
> > > > memory after reset? Note that dma unmap in e.g. del_vqs happens
> > > > too late.
> > >
> > > But I didn't see any problems with touching the memory for virtqueues.
> >
> > Drivers make the assumption that after reset returns no new
> > buffers will be consumed. For example a bunch of drivers
> > call virtqueue_detach_unused_buf.
>
> I'm not sure if I get your point.
> But it looks like
> virtqueue_detach_unused_buf() will check the driver's metadata first
> rather than read the memory from virtqueue.
>
> > I can't say whether block makes this assumption anywhere.
> > Needs careful auditing.
> >
> > > The memory should not be freed after dma unmap?
> >
> > But unmap does not happen until after the reset.
> >
>
> I mean the memory is totally allocated and controlled by the VDUSE
> driver. The VDUSE driver will not return them to the buddy system
> unless userspace unmap it.

Right. But what stops VDUSE from poking at memory after reset failed?

> >
> > > And the memory for the bounce buffer should also be safe to be
> > > accessed by userspace in this case.
> > >
> > > > And what about e.g. interrupts?
> > > > E.g. we have this:
> > > >
> > > >         /* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
> > > >         vblk->vdev = NULL;
> > > >
> > > > and this is no longer true at this point.
> > > >
> > >
> > > You're right. But I didn't see where the interrupt handler will use
> > > the vblk->vdev.
> >
> > static void virtblk_done(struct virtqueue *vq)
> > {
> >         struct virtio_blk *vblk = vq->vdev->priv;
> >
> > vq->vdev is the same as vblk->vdev.
> >
>
> We will test the vq->ready (will be set to false in del_vqs()) before
> injecting an interrupt in the VDUSE driver. So it should be OK?

Maybe not ... It's not designed for such asynchronous access, so e.g.
there's no locking or memory ordering around accesses.
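
To make the ordering concern concrete, here is a minimal userspace sketch.
It is not taken from the patch series. It assumes C11 atomics as a stand-in
for kernel primitives, and struct mock_vq and all mock_* helpers are
hypothetical names. It shows a ready flag that is published with release
semantics and tested with acquire semantics before the callback is invoked.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a virtqueue; not the kernel's struct virtqueue. */
struct mock_vq {
    atomic_bool ready;       /* written with release, read with acquire */
    void (*cb)(void *priv);  /* "interrupt" callback */
    void *priv;              /* callback argument */
};

void mock_vq_init(struct mock_vq *vq)
{
    vq->cb = NULL;
    vq->priv = NULL;
    atomic_init(&vq->ready, false);
}

/* Publish the callback first, then mark the queue ready (release). */
void mock_vq_start(struct mock_vq *vq, void (*cb)(void *), void *priv)
{
    vq->cb = cb;
    vq->priv = priv;
    atomic_store_explicit(&vq->ready, true, memory_order_release);
}

/* Mark the queue not ready; later injections are refused. */
void mock_vq_stop(struct mock_vq *vq)
{
    atomic_store_explicit(&vq->ready, false, memory_order_release);
}

/* Returns true if the callback was invoked, false if the queue was not ready. */
bool mock_vq_inject_irq(struct mock_vq *vq)
{
    if (!atomic_load_explicit(&vq->ready, memory_order_acquire))
        return false;
    vq->cb(vq->priv);
    return true;
}

/* Example callback: counts invocations through the int that priv points to. */
void mock_count_cb(void *priv)
{
    ++*(int *)priv;
}
```

Even with this ordering, an injector that has already passed the ready check
can still run the callback after mock_vq_stop() returns, so a real teardown
would also have to wait for in-flight callbacks.
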
> >
> > > So it seems to be not too late to fix it:
> > >
> > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > index 5c25ff6483ad..ea41a7389a26 100644
> > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct
> > > vdpa_device *vdpa, unsigned int offset,
> > >  static int vduse_vdpa_reset(struct vdpa_device *vdpa)
> > >  {
> > >         struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > +       int ret;
> > >
> > > -       if (vduse_dev_set_status(dev, 0))
> > > -               return -EIO;
> > > +       ret = vduse_dev_set_status(dev, 0);
> > >
> > >         vduse_dev_reset(dev);
> > >
> > > -       return 0;
> > > +       return ret;
> > >  }
> > >
> > >  static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)
> > >
> > > Thanks,
> > > Yongji
> >
> > Needs some comments to explain why it's done like this.
> >
>
> This is used to make sure the userspace can't inject the interrupt
> any more after reset. The vduse_dev_reset() will clear the interrupt
> callback and flush the irq kworker.
>
> > BTW device is generally wedged at this point right?
> > E.g. if reset during initialization fails, userspace
> > will still get the reset at some later point and be
> > confused ...
> >
>
> Sorry, I don't get why userspace will get the reset at some later point?
>
> Thanks,
> Yongji

I am generally a bit confused about how reset works with vduse.
We clearly want device to get back to its original state.
How is that supposed to be achieved?

--
MST
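
The shape of the vduse_vdpa_reset() change quoted above can be sketched in
plain userspace C. This is a mock under stated assumptions, not the VDUSE
code: struct mock_dev, its fields, and the mock_* functions are hypothetical,
and the userspace daemon is reduced to a boolean that says whether it
acknowledges a status change. The local teardown always runs, and the error
from the status message is still reported to the caller.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; none of these fields come from the VDUSE driver. */
struct mock_dev {
    bool userspace_alive;        /* does the simulated daemon ack messages? */
    unsigned char status;        /* last status the daemon acknowledged */
    void (*irq_cb)(void *priv);  /* interrupt callback, cleared on reset */
    void *irq_priv;
};

/* Ask the simulated daemon to apply a status; -EIO if it does not answer. */
int mock_dev_set_status(struct mock_dev *dev, unsigned char status)
{
    if (!dev->userspace_alive)
        return -EIO;
    dev->status = status;
    return 0;
}

/* Local teardown that cannot fail: drop the interrupt callback. */
void mock_dev_reset(struct mock_dev *dev)
{
    dev->irq_cb = NULL;
    dev->irq_priv = NULL;
}

/* Always tear down locally, but still report the daemon's failure. */
int mock_vdpa_reset(struct mock_dev *dev)
{
    int ret = mock_dev_set_status(dev, 0);

    mock_dev_reset(dev);
    return ret;
}

/* Placeholder interrupt callback used only to populate irq_cb. */
void mock_irq(void *priv)
{
    (void)priv;
}
```

In this mock a failed reset leaves status at its old value while the
interrupt callback is already cleared, which is one way to picture the
question of how the device gets back to its original state.
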