Subject: Re: bcache: kernel NULL pointer dereference since 6.1.39
From: Coly Li
Date: Fri, 24 Nov 2023 22:17:46 +0800
To: Markus Weippert
Cc: Thorsten Leemhuis, Zheng Wang, linux-kernel@vger.kernel.org,
    Stefan Förster, Greg Kroah-Hartman, "stable@vger.kernel.org",
    Jens Axboe, Linux kernel regressions list, Bcache Linux
Message-Id: <3DF4A87A-2AC1-4893-AE5F-E921478419A9@suse.de>
In-Reply-To: <1c2a1f362d667d36d83a5ba43218bad199855b11.camel@gekmihesg.de>
References: <71576a9ff7398bfa4b8c0a1a1a2523383b056168.camel@gekmihesg.de>
    <989C39B9-A05D-4E4F-A842-A4943A29FFD6@suse.de>
    <1c2a1f362d667d36d83a5ba43218bad199855b11.camel@gekmihesg.de>
X-Mailing-List: linux-kernel@vger.kernel.org

> On Nov 24, 2023, at 21:55, Markus Weippert wrote:
>
> On Fri, 2023-11-24 at 21:46 +0800, Coly Li wrote:
>>
>>
>>> On Nov 24, 2023, at 21:29, Markus Weippert wrote:
>>>
>>>> On 23.11.23 14:53, Stefan Förster wrote:
>>>>>
>>>>> starting with kernel 6.1.39, we see the following error message
>>>>> with heavy I/O loads. We needed to revert
>>>>
>>>> Thx for the report. I assume that problem still occurs with the
>>>> latest 6.1.y kernel?
>>>>
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.1.39&id=68118c339c6e1e16ae017bef160dbe28a27ae9c8
>>>>
>>>> FWIW, that is mainline commit 028ddcac477b69 ("bcache: Remove
>>>> unnecessary NULL point check in node allocations") [v6.5-rc1].
>>>>
>>>> Did a quick check and noticed a fix for that change was recently
>>>> mainlined as f72f4312d43883 ("bcache: replace a mistaken IS_ERR()
>>>> by IS_ERR_OR_NULL() in btree_gc_coalesce()") [v6.7-rc2-post]:
>>>> https://lore.kernel.org/all/20231118163852.9692-1-colyli@suse.de/
>>>>
>>>> It is expected to soon be integrated into a 6.1.y kernel.
>>>>
>>>> But maybe it's something else. I CCed the involved people, they
>>>> might know.
>>>
>>> We applied f72f4312d43883 to the current Debian kernel (based on
>>> 6.1.55) but it didn't help, same stack trace.
>>> Looking at the description, __bch_btree_node_alloc() should never
>>> be able to return NULL anyway after
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.1.39&id=7ecea5ce3dc17339c280c75b58ac93d8c8620d9f
>>> But I didn't verify all callers, so this might still be correct, if
>>> it's not always initialized with the return value of
>>> __bch_btree_node_alloc().
>>>
>>> Anyway, I think we fixed it by applying this:
>>>
>>> diff -Naurp a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
>>> --- a/drivers/md/bcache/btree.c 2023-09-23 11:11:13.000000000 +0200
>>> +++ b/drivers/md/bcache/btree.c 2023-11-24 13:13:09.840013759 +0100
>>> @@ -1489,7 +1489,7 @@ out_nocoalesce:
>>>  	bch_keylist_free(&keylist);
>>>
>>>  	for (i = 0; i < nodes; i++)
>>> -		if (!IS_ERR(new_nodes[i])) {
>>> +		if (!IS_ERR_OR_NULL(new_nodes[i])) {
>>>  			btree_node_free(new_nodes[i]);
>>>  			rw_unlock(true, new_nodes[i]);
>>>  		}
>>>
>>
>> The above change is what commit f72f4312d43883 ("bcache: replace a
>> mistaken IS_ERR() by IS_ERR_OR_NULL() in btree_gc_coalesce()") does.
>
> But f72f4312d43883 reverts @@ -1340,7 +1340,7 @@, while the patch we
> applied reverts @@ -1487,7 +1487,7 @@ instead.
> Applying f72f4312d43883 didn't help for us.
>

OK, I know what you mean. Yes, your fix is necessary too. Would you like
to post a patch for your fix?

Thanks.

Coly Li

>>
>> Although the above patch is suggested to go into 6.5+ kernel, for
>> this condition it should go into all stable kernels where commit
>> 028ddcac477b69 ("bcache: Remove unnecessary NULL point check in node
>> allocations") was merged.

[snipped]
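
For readers following the thread, the distinction both hunks turn on can be modeled in userspace. This is only a sketch, not kernel code: it re-declares simplified stand-ins for ERR_PTR(), IS_ERR() and IS_ERR_OR_NULL() (the real helpers live in include/linux/err.h), and struct node, cleanup() and null_slots_passing_is_err() are hypothetical names made up for illustration. It also assumes, as the discussion above suggests but does not prove, that some new_nodes[] slots can still be NULL when the cleanup loop runs.

```c
/* Userspace model only; the kernel's real helpers are in <linux/err.h>. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, in the style of ERR_PTR(). */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* True only for the top MAX_ERRNO addresses. NULL is NOT an error here. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* True for NULL as well as for encoded errors. */
static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

/* Hypothetical stand-in for struct btree. */
struct node {
	int freed;
};

/*
 * Cleanup loop shaped like the patched hunk: only valid pointers are
 * dereferenced. Returns how many nodes were released.
 */
static inline int cleanup(struct node **new_nodes, int nodes)
{
	int released = 0;

	assert(nodes >= 0);
	for (int i = 0; i < nodes; i++)
		if (!IS_ERR_OR_NULL(new_nodes[i])) {
			new_nodes[i]->freed = 1; /* models btree_node_free() */
			released++;
		}
	return released;
}

/*
 * Counts the slots that a bare !IS_ERR() check would let through even
 * though they are NULL, i.e. the slots that would be dereferenced.
 */
static inline int null_slots_passing_is_err(struct node **new_nodes, int nodes)
{
	int count = 0;

	for (int i = 0; i < nodes; i++)
		if (!IS_ERR(new_nodes[i]) && new_nodes[i] == NULL)
			count++;
	return count;
}
```

With slots { &a, NULL, ERR_PTR(-12), &b }, cleanup() touches only the two valid nodes, whereas a bare !IS_ERR() check would also pass the NULL slot on to the dereference.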