From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Paolo Valente, Douglas Anderson, Jens Axboe
Subject: [PATCH 5.1 095/138] block, bfq: NULL out the bic when its no longer valid
Date: Fri, 12 Jul 2019 14:19:19 +0200
Message-Id: <20190712121632.409883135@linuxfoundation.org>
X-Mailer: git-send-email 2.22.0
In-Reply-To:
 <20190712121628.731888964@linuxfoundation.org>
References: <20190712121628.731888964@linuxfoundation.org>
User-Agent: quilt/0.66
X-Mailing-List: linux-kernel@vger.kernel.org

From: Douglas Anderson

commit dbc3117d4ca9e17819ac73501e914b8422686750 upstream.

In reboot tests on several devices we were seeing a "use after free"
when slub_debug or KASAN was enabled.  The kernel complained about:

  Unable to handle kernel paging request at virtual address 6b6b6c2b

...which is a classic sign of use after free under slub_debug.  The
stack crawl in kgdb looked like:

 0  test_bit (addr=, nr=)
 1  bfq_bfqq_busy (bfqq=)
 2  bfq_select_queue (bfqd=)
 3  __bfq_dispatch_request (hctx=)
 4  bfq_dispatch_request (hctx=)
 5  0xc056ef00 in blk_mq_do_dispatch_sched (hctx=0xed249440)
 6  0xc056f728 in blk_mq_sched_dispatch_requests (hctx=0xed249440)
 7  0xc0568d24 in __blk_mq_run_hw_queue (hctx=0xed249440)
 8  0xc0568d94 in blk_mq_run_work_fn (work=)
 9  0xc024c5c4 in process_one_work (worker=0xec6d4640, work=0xed249480)
 10 0xc024cff4 in worker_thread (__worker=0xec6d4640)

Digging in kgdb, it could be found that, though bfqq looked fine,
bfqq->bic had been freed.

Through further digging, I postulated that perhaps it is illegal to
access a "bic" (AKA an "icq") after bfq_exit_icq() had been called
because the "bic" can be freed at some point in time after this call
is made.  I confirmed that there certainly were cases where the exact
crashing code path would access the "bic" after bfq_exit_icq() had
been called.  Specifically I set the "bfqq->bic" to (void *)0x7 and
saw that the bic was 0x7 at the time of the crash.

To understand a bit more about why this crash was fairly uncommon (I
saw it only once in a few hundred reboots), you can see that much of
the time bfq_exit_icq_bfqq() fully frees the bfqq and thus it can't
access the ->bic anymore.
The only case it doesn't is if bfq_put_queue() sees a reference
still held.

However, even in the case when bfqq isn't freed, the crash is still
rare.  Why?  I tracked what happened to the "bic" after the exit
routine.  It doesn't get freed right away.  Rather,
put_io_context_active() eventually called put_io_context() which
queued up freeing on a workqueue.  The freeing then actually happened
later than that through call_rcu().  Despite all these delays, some
extra debugging showed that all the hoops could be jumped through in
time and the memory could be freed causing the original crash.  Phew!

To make a long story short, assuming it truly is illegal to access an
icq after the "exit_icq" callback is finished, this patch is needed.

Cc: stable@vger.kernel.org
Reviewed-by: Paolo Valente
Signed-off-by: Douglas Anderson
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman
---
 block/bfq-iosched.c |    1 +
 1 file changed, 1 insertion(+)

--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4265,6 +4265,7 @@ static void bfq_exit_icq_bfqq(struct bfq
 		unsigned long flags;
 
 		spin_lock_irqsave(&bfqd->lock, flags);
+		bfqq->bic = NULL;
 		bfq_exit_bfqq(bfqd, bfqq);
 		bic_set_bfqq(bic, NULL, is_sync);
 		spin_unlock_irqrestore(&bfqd->lock, flags);
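
For readers less familiar with the pattern, the one-line change above can
be pictured with a small userspace sketch.  This is not the kernel code:
the struct and function names (io_ctx, io_queue, ctx_exit,
queue_think_time) are hypothetical stand-ins for the "bic" and the
"bfqq", the locking that the real code does under bfqd->lock is left
out, and it assumes that later readers check the back-pointer before
using it.

```c
/*
 * Userspace sketch only.  All names here are hypothetical and are not
 * the kernel's bfq types; locking is omitted on purpose.
 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct io_ctx {                 /* stands in for the "bic" */
	int think_time;
};

struct io_queue {               /* stands in for the "bfqq" */
	struct io_ctx *ctx;     /* back-pointer; the queue may outlive *ctx */
	int refs;
};

/*
 * Exit callback.  Once this returns, the owner may free *ctx at any
 * later time, so the queue must stop pointing at it.
 */
static void ctx_exit(struct io_queue *q)
{
	assert(q != NULL);
	q->ctx = NULL;          /* drop the back-pointer before ctx goes away */
}

/* Reader that tolerates a queue whose context has already exited. */
static int queue_think_time(const struct io_queue *q)
{
	return q->ctx ? q->ctx->think_time : -1;
}
```

With the back-pointer cleared in the exit callback, a queue that is kept
alive by a remaining reference sees NULL rather than a pointer to memory
that may since have been freed, so a late caller of queue_think_time()
takes the fallback path and never reads the stale object.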