Date: Tue, 22 Aug 2017 08:32:28 +0100
From: Roger Pau =?iso-8859-1?Q?Monn=E9?= <roger.pau@citrix.com>
To: annie li <annie.li@oracle.com>, <xen-devel@lists.xenproject.org>,
        <linux-kernel@vger.kernel.org>
Subject: Re: [Xen-devel] [PATCH 1/1] xen-blkback: stop blkback thread of
 every queue in xen_blkif_disconnect
Message-ID: <20170822073228.wanqwaqkb5edfvwh@MacBook-Pro-de-Roger.local>
References: <1503009826-3363-1-git-send-email-annie.li@oracle.com>
 <20170818091411.cl2drb5mofmo3oav@MacBook-Pro-de-Roger.local>
 <f750d078-26a7-dd43-4c0a-2506c04300a0@oracle.com>
 <20170818172406.yupdjusjxx2mhu6d@MacBook-Pro-de-Roger.local>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20170818172406.yupdjusjxx2mhu6d@MacBook-Pro-de-Roger.local>
User-Agent: NeoMutt/20170714 (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 3468
Lines: 71

On Fri, Aug 18, 2017 at 06:24:06PM +0100, Roger Pau Monné wrote:
> On Fri, Aug 18, 2017 at 10:29:15AM -0400, annie li wrote:
> > 
> > On 8/18/2017 5:14 AM, Roger Pau Monné wrote:
> > > On Thu, Aug 17, 2017 at 06:43:46PM -0400, Annie Li wrote:
> > > > If there is inflight I/O in any queue other than the last one, blkback
> > > > returns -EBUSY directly and never stops the threads of the remaining
> > > > queues or processes them. When removing a vbd device under heavy disk I/O
> > > > load, some queues with inflight I/O still have a blkback thread running
> > > > even though the corresponding vbd device or guest is gone.
> > > > This can cause problems. For example, if the backend device type is file,
> > > > some loop devices and blkback threads linger forever after the guest is
> > > > destroyed, and unmounting the repositories fails until dom0 is rebooted.
> > > > So stop all threads properly and return -EBUSY if any queue has inflight
> > > > I/O.
> > > > 
> > > > Signed-off-by: Annie Li <annie.li@oracle.com>
> > > > Reviewed-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
> > > > Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com>
> > > > Reviewed-by: Adnan Misherfi <adnan.misherfi@oracle.com>
> > > > ---
> > > >   drivers/block/xen-blkback/xenbus.c | 10 ++++++++--
> > > >   1 file changed, 8 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
> > > > index 792da68..2adb859 100644
> > > > --- a/drivers/block/xen-blkback/xenbus.c
> > > > +++ b/drivers/block/xen-blkback/xenbus.c
> > > > @@ -244,6 +244,7 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
> > > >   {
> > > >   	struct pending_req *req, *n;
> > > >   	unsigned int j, r;
> > > > +	bool busy = false;
> > > >   	for (r = 0; r < blkif->nr_rings; r++) {
> > > >   		struct xen_blkif_ring *ring = &blkif->rings[r];
> > > > @@ -261,8 +262,10 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
> > > >   		 * don't have any discard_io or other_io requests. So, checking
> > > >   		 * for inflight IO is enough.
> > > >   		 */
> > > > -		if (atomic_read(&ring->inflight) > 0)
> > > > -			return -EBUSY;
> > > > +		if (atomic_read(&ring->inflight) > 0) {
> > > > +			busy = true;
> > > > +			continue;
> > > > +		}
> > > I guess I'm missing something, but I don't see how this is solving the
> > > problem described in the description.
> > > 
> > > If the problem is that xen_blkif_disconnect returns without cleaning
> > > all the queues, this patch keeps the current behavior, just that it
> > > will try to remove more queues before returning, as opposed to
> > > returning when finding the first busy queue.
> > Before checking inflight, the following code stops the blkback thread:
> >                 if (ring->xenblkd) {
> >                         kthread_stop(ring->xenblkd);
> >                         wake_up(&ring->shutdown_wq);
> >                 }
> > This patch gives the thread of every queue the chance to be stopped.
> > Otherwise, only the threads of the queues up to and including the first busy
> > one are stopped; the threads of the remaining queues keep running, and these
> > blkback threads and the corresponding loop devices linger forever, even after
> > the guest is destroyed.
> 
> Thanks for the explanation:
> 
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Forgot to add, this needs to be backported to stable branches, so:

Cc: stable@vger.kernel.org

Roger.
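
For anyone reading the thread later, the difference discussed above can be
sketched as a small userspace model. This is only an illustration, not the
kernel code. The sketch_* names and the two-field ring struct are hypothetical
stand-ins, and the final "return busy ? -EBUSY : 0" is an assumption taken
from the commit message, since the quoted hunk is cut off before the end of
the function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xen_blkif_ring: only the two pieces
 * of state that matter for this illustration. */
struct sketch_ring {
	int inflight;        /* models atomic_read(&ring->inflight) */
	bool thread_running; /* models ring->xenblkd being a live kthread */
};

/* Models the kthread_stop() call made at the top of each loop iteration. */
static void sketch_stop_thread(struct sketch_ring *ring)
{
	ring->thread_running = false;
}

/* Behaviour before the patch: bail out at the first busy ring, so the
 * threads of all later rings are never stopped. */
static int sketch_disconnect_old(struct sketch_ring *rings, size_t nr_rings)
{
	for (size_t r = 0; r < nr_rings; r++) {
		sketch_stop_thread(&rings[r]);
		if (rings[r].inflight > 0)
			return -EBUSY;
		/* per-ring cleanup would happen here */
	}
	return 0;
}

/* Behaviour with the patch: remember that a ring was busy, skip its
 * cleanup, but still visit (and stop the thread of) every other ring. */
static int sketch_disconnect_new(struct sketch_ring *rings, size_t nr_rings)
{
	bool busy = false;

	for (size_t r = 0; r < nr_rings; r++) {
		sketch_stop_thread(&rings[r]);
		if (rings[r].inflight > 0) {
			busy = true;
			continue;
		}
		/* per-ring cleanup would happen here */
	}
	/* Assumed from the commit message: report -EBUSY if any ring was busy. */
	return busy ? -EBUSY : 0;
}

/* Counts the rings whose thread is still running after a disconnect. */
static size_t sketch_count_running(const struct sketch_ring *rings,
				   size_t nr_rings)
{
	size_t n = 0;

	for (size_t r = 0; r < nr_rings; r++)
		if (rings[r].thread_running)
			n++;
	return n;
}
```

With three rings where only the middle one has inflight I/O, both variants
return -EBUSY, but the old loop leaves the third ring's thread running while
the new loop stops all three.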