From: Martin Steigerwald
To: Tejun Heo
Cc: Sitsofe Wheeler, Jens Axboe, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, Stefan Haberland, Jan Hoeppner, Bart Van Assche
Subject: Re: [PATCH] blk-mq: Directly schedule q->timeout_work when aborting a request
Date: Tue, 10 Apr 2018 20:43:34 +0200
Message-ID: <2699418.6CV4rbzP8d@merkaba>
In-Reply-To: <20180402220458.GJ388343@devbig577.frc2.facebook.com>
References: <20180402220458.GJ388343@devbig577.frc2.facebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Tejun Heo - 03.04.18, 00:04:
> Request abortion is performed by overriding deadline to now and
> scheduling timeout handling immediately.  For the latter part, the
> code was using mod_timer(timeout, 0) which can't guarantee that the
> timer runs afterwards.  Let's schedule the underlying work item
> directly instead.
>
> This fixes the hangs during probing reported by Sitsofe but it isn't
> yet clear to me how the failure can happen reliably if it's just the
> above described race condition.

Compiling a 4.16.1 kernel with that patch to test whether this fixes the boot
hang I reported in:

[Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and
boot failures with blk_mq_terminate_expired in backtrace
https://bugzilla.kernel.org/show_bug.cgi?id=199077

The "Error updating SMART data during runtime" thing I reported there as well
may still be another (independent) issue.

> Signed-off-by: Tejun Heo
> Reported-by: Sitsofe Wheeler
> Reported-by: Meelis Roos
> Fixes: 358f70da49d7 ("blk-mq: make blk_abort_request() trigger timeout path")
> Cc: stable@vger.kernel.org # v4.16
> Link: http://lkml.kernel.org/r/CALjAwxh-PVYFnYFCJpGOja+m5SzZ8Sa4J7ohxdK=r8NyOF-EMA@mail.gmail.com
> Link: http://lkml.kernel.org/r/alpine.LRH.2.21.1802261049140.4893@math.ut.ee
> ---
> Hello,
>
> I don't have the full explanation yet but here's a preliminary patch.
>
> Thanks.
>
>  block/blk-timeout.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/blk-timeout.c b/block/blk-timeout.c
> index a05e367..f0e6e41 100644
> --- a/block/blk-timeout.c
> +++ b/block/blk-timeout.c
> @@ -165,7 +165,7 @@ void blk_abort_request(struct request *req)
>  		 * No need for fancy synchronizations.
>  		 */
>  		blk_rq_set_deadline(req, jiffies);
> -		mod_timer(&req->q->timeout, 0);
> +		kblockd_schedule_work(&req->q->timeout_work);
>  	} else {
>  		if (blk_mark_rq_complete(req))
>  			return;

-- 
Martin
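
A side note on the mod_timer(timeout, 0) call that the quoted patch removes. mod_timer() takes an absolute expiry in jiffies, so passing 0 means "tick number 0", not "now". The userspace sketch below mimics the signed-difference comparison used by the kernel's time_after() macro, with a 32-bit counter, to show that tick 0 can compare as lying in the future when the counter is close to wrapping. The sketch_* names are made up for this illustration and are not kernel functions. This is only one way an expiry of 0 can differ from "now"; it is not claimed to be the race Tejun describes, whose full explanation is still open in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's time_after(a, b): true if tick a is
 * later than tick b. The signed difference tolerates counter wraparound.
 * (Hypothetical helper, 32-bit counter assumed for the illustration.)
 */
static int sketch_time_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/* Does an absolute expiry still lie in the future at tick `now`? */
static int sketch_expiry_in_future(uint32_t expires, uint32_t now)
{
	return sketch_time_after(expires, now);
}

/* Self-check covering both halves of the counter range. */
static void sketch_self_check(void)
{
	/* Counter in the lower half: tick 0 is in the past. */
	assert(!sketch_expiry_in_future(0, 1000));

	/* Counter close to wrapping: tick 0 is still ahead. */
	assert(sketch_expiry_in_future(0, UINT32_MAX - 1000));

	/* An expiry equal to the current tick is never "after" it. */
	assert(!sketch_expiry_in_future(1000, 1000));
}
```

Calling sketch_self_check() from a small main() exercises the three cases. Queueing the work item directly, as the patch does with kblockd_schedule_work(), does not depend on any such comparison of the expiry against the current tick.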