Received: by 2002:a25:4158:0:0:0:0:0 with SMTP id o85csp3682947yba; Tue, 9 Apr 2019 02:31:15 -0700 (PDT) X-Google-Smtp-Source: APXvYqxC8w/zJ+47zVsxuuEyDCOEd3XtpX+qTdzpNrxLe8ZAp4GSzp0qUo4BqatwsPawQBG+OHHY X-Received: by 2002:a63:9246:: with SMTP id s6mr34353300pgn.316.1554802275107; Tue, 09 Apr 2019 02:31:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1554802275; cv=none; d=google.com; s=arc-20160816; b=KNOjCpdI5djlES3eyz2FZkS9aHF8Z7w964d3qNSHFUo+KvONkbNF831BbIMN6KHne4 9Z8jO+wt5CghPNBSSo76dBssNlmIhoFie0J61nSA1uhV7DhuZHEdoXVqJ3Ey3iJMJhYn 0LVJGzvrpY3+w9pItR01GZUzm7PEEiuzOQPxQwnBmTH505HcMJAk0OYYIwGMS7zEZFe9 moarettcK+Zuv84Ex5M5aAda6xEvricoQphuyq2MidhtDPlJ3w6SFfc2FxkR+9PX0kIm Jyl+Ubuv8nmpuWXJ3LfLva8kYhJp1PSnLyt0uiPVNP5cm6qF+mvYON0h62bxQ4NAeB/j Grgg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:cc:to:subject :message-id:date:from:in-reply-to:references:mime-version :dkim-signature; bh=/lPkJNKAbh4YluUe8wkzivO7jBarzDZkEmu1FMVG3GQ=; b=OHC85pfkAv91p1neiKJNmHN6Ygn3sLjlIMg0pNYFTZLYqH2ESAx4KC+ZHYv/TS32gF Ip+rDjCZvPOQKefJ9kApVN9YdcC8hFFUBeUy2B6w8GzBvvwkQoYKWakZLAMvadcOfEgA wtSdfs0Q/43A/V/j83N4HmU4bzYilDEe1Tb+mIXJY4WKa2jfSYEb8ahyAO80N+K5N7/s MszrhiEqYJA6IqOc6Rit8fyyJNyBbnxG2Ee41AuSLIs0HSJnkKPB/HMqF04kySV5SCxE M+penlbufI1/sb/RByVX7BHuZd3SOzp0aAG5VO0xO8lG+B7d6JDBRpEr++cgNKcUxjez 41nA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=aGnDMBWt; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id r24si4819155pls.398.2019.04.09.02.30.58; Tue, 09 Apr 2019 02:31:15 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=aGnDMBWt; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727020AbfDIJ34 (ORCPT + 99 others); Tue, 9 Apr 2019 05:29:56 -0400 Received: from mail-ot1-f66.google.com ([209.85.210.66]:33378 "EHLO mail-ot1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726981AbfDIJ3z (ORCPT ); Tue, 9 Apr 2019 05:29:55 -0400 Received: by mail-ot1-f66.google.com with SMTP id j10so14955419otq.0; Tue, 09 Apr 2019 02:29:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=/lPkJNKAbh4YluUe8wkzivO7jBarzDZkEmu1FMVG3GQ=; b=aGnDMBWtmpLTdQTm9E6aK+e7QRyTXyCpmqfp9w1FOho/D+h9pMERi+bbOX1sRHSHtg SRGD80dLCHWLD18WCy//UP4tepcwUJrf9p/1maM74Uz1zQJvlz/GRz3nrG6lWGeQp8Jv BsA89mVR3O7O/o/SqiQkzn3pzWVZO7GcCJQoR6f3pt14w7UHH/l01bGlolaq21QtmB7G fa/hIKlQL50cf3jd6ZKlsqOMhx00zpYsHrpGbN8RZJeLyzfWcMUwW9dWAwU0DGz61HU7 PukIMJK1/mMeLluV4DbJYkaM2fQ/QSnGtsbc0N8oy/OpvN2okLB2NSfrDnNCt6DKWst/ NxQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=/lPkJNKAbh4YluUe8wkzivO7jBarzDZkEmu1FMVG3GQ=; b=FozpherMoKIkVp3xRzLOKJQveJdMv32NoKBLb4nYv7XLFKv6FqmDHgdwSmLK2G++o1 LEXjW2V7IjKJGdUIRRS4Rr7IZfU0O1NbKP7Be9/BTRXYK8avNJWgppMHzEAZanPnJLtQ jrURbZZHq5DvUpdlqhlDHeeP/8f831KCzoTJRwR8KnoK/vpqm3x5DKK+q9u9gDdK8sLD 67DQTC2iGiN45gBOed9q+CrowtWlQM7+Yi5lslC5NubV9k5ul1CESQP+nNIzz+tWu7MZ yEeS7hgoI70+EI6Icxrdw34itEzs06m5E7xiug81FKRvIA+us8+G2xcapDEQpJVCrji9 Ex4w== X-Gm-Message-State: APjAAAVSFB+TZXZAaYp0LUEQBC4eDmv0glWGuhTFl98iO4uB4lHhnMOL Hr1TqyhOFsuf16xEsfWsqj1ZshqHZFFkIjyDeNo= X-Received: by 2002:a05:6830:183:: with SMTP id q3mr23049815ota.204.1554802194207; Tue, 09 Apr 2019 02:29:54 -0700 (PDT) MIME-Version: 1.0 References: <20190409090828.16282-1-bob.liu@oracle.com> In-Reply-To: <20190409090828.16282-1-bob.liu@oracle.com> From: Jinpu Wang Date: Tue, 9 Apr 2019 11:29:42 +0200 Message-ID: Subject: Re: [RESEND PATCH] blk-mq: fix hang caused by freeze/unfreeze sequence To: Bob Liu , r.peniaev@gmail.com Cc: linux-block@vger.kernel.org, shirley.ma@oracle.com, "Martin K. Petersen" , Akinobu Mita , Tejun Heo , Jens Axboe , Christoph Hellwig , LKML Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Bob Liu =E4=BA=8E2019=E5=B9=B44=E6=9C=889=E6=97=A5=E5= =91=A8=E4=BA=8C =E4=B8=8A=E5=8D=8811:11=E5=86=99=E9=81=93=EF=BC=9A > > This patch was proposed by Roman Pen[3] years ago. > Recently we hit a bug which is likely caused by the same reason,so rebase= d his > fix to v5.1 and resend. > Below is almost copied from that patch[3]. > > ------ > Long time ago there was a similar fix proposed by Akinobu Mita[1], > but it seems that time everyone decided to fix this subtle race in > percpu-refcount and Tejun Heo[2] did an attempt (as I can see that > patchset was not applied). > > The following is a description of a hang in blk_mq_freeze_queue_wait() - > same fix but a bug from another angle. > > The hang happens on attempt to freeze a queue while another task does > queue unfreeze. > > The root cause is an incorrect sequence of percpu_ref_reinit() and > percpu_ref_kill() and as a result those two can be swapped: > > CPU#0 CPU#1 > ---------------- ----------------- > percpu_ref_kill() > > percpu_ref_kill() << atomic reference does > percpu_ref_reinit() << not guarantee the order > > blk_mq_freeze_queue_wait() << HANG HERE > > percpu_ref_reinit() > > Firstly this wrong sequence raises two kernel warnings: > > 1st. WARNING at lib/percpu-recount.c:309 > percpu_ref_kill_and_confirm called more than once > > 2nd. WARNING at lib/percpu-refcount.c:331 > > But the most unpleasant effect is a hang of a blk_mq_freeze_queue_wait(), > which waits for a zero of a q_usage_counter, which never happens > because percpu-ref was reinited (instead of being killed) and stays in > PERCPU state forever. > > The simplified sequence above can be reproduced on shared tags, when > queue A is going to die meanwhile another queue B is in init state and > is trying to freeze the queue A, which shares the same tags set: > > CPU#0 CPU#1 > ------------------------------- ------------------------------------ > q1 =3D blk_mq_init_queue(shared_tags) > > q2 =3D blk_mq_init_queue(shared_tags): > blk_mq_add_queue_tag_set(shared_tags): > blk_mq_update_tag_set_depth(shared_ta= gs): > blk_mq_freeze_queue(q1) > blk_cleanup_queue(q1) ... > blk_mq_freeze_queue(q1) <<<->>> blk_mq_unfreeze_queue(q1) > > [1] Message id: 1443287365-4244-7-git-send-email-akinobu.mita@gmail.com > [2] Message id: 1443563240-29306-6-git-send-email-tj@kernel.org > [3] https://patchwork.kernel.org/patch/9268199/ > > Signed-off-by: Roman Pen > Signed-off-by: Bob Liu > Cc: Akinobu Mita > Cc: Tejun Heo > Cc: Jens Axboe > Cc: Christoph Hellwig > Cc: linux-block@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Replaced Roman's email address. We at 1 & 1 IONOS (former ProfitBricks) have been carried this patch for some years, it has been running in production for some years too, would be good to see it in upstream :) Thanks, Jack Wang Linux Kernel Developer @ 1 & 1 IONOS