From: Daniel Dao
Date: Mon, 24 Jul 2023 23:04:13 +0100
Subject: Re: Kernel NULL pointer deref and data corruptions with xfs on 6.1
To: Dave Chinner
Cc: linux-fsdevel@vger.kernel.org, Matthew Wilcox, kernel-team, linux-kernel, djwong@kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 24, 2023 at 10:45 PM Dave Chinner wrote:
>
> On Mon, Jul 24, 2023 at 12:23:31PM +0100, Daniel Dao wrote:
> > Hi again,
> >
> > We had another example of xarray corruption involving xfs and zsmalloc. We are
> > running zram as swap. We have 2 tasks deadlock waiting for page to be released
>
> Do your problems on 6.1 go away if you stop using zram as swap?

We had xarray corruptions even on nodes without swap, so I'm not sure
whether swap matters. The corruptions on those nodes were noted in the
first email, with the following trace:

BUG: kernel NULL pointer dereference, address: 0000000000000036
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 18806c5067 P4D 18806c5067 PUD 188ed48067 PMD 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 73 PID: 3579408 Comm: prometheus Tainted: G O 6.1.34-cloudflare-2023.6.7 #1
Hardware name: GIGABYTE R162-Z12-CD1/MZ12-HD4-CD, BIOS M03 11/19/2021
RIP: 0010:__filemap_get_folio (arch/x86/include/asm/atomic.h:29 include/linux/atomic/atomic-arch-fallback.h:1242 include/linux/atomic/atomic-arch-fallback.h:1267 include/linux/atomic/atomic-instrumented.h:608 include/linux/page_ref.h:238 include/linux/page_ref.h:247 include/linux/page_ref.h:280 include/linux/page_ref.h:313 mm/filemap.c:1863 mm/filemap.c:1915)

It's hard for us to run tests without zram swap at scale, since the
benefits are significant for a lot of workloads.

Daniel.