From: "Huang, Ying"
To: Rafael Aquini
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org
Subject: Re: [PATCH] mm: swapfile: avoid split_swap_cluster() NULL pointer dereference
In-Reply-To: <20200924020928.GC1023012@optiplex-lnx> (Rafael Aquini's message of "Wed, 23 Sep 2020 22:09:28 -0400")
References:
 <20200922184838.978540-1-aquini@redhat.com>
 <878sd1qllb.fsf@yhuang-dev.intel.com>
 <20200923043459.GL795820@optiplex-lnx>
 <87sgb9oz1u.fsf@yhuang-dev.intel.com>
 <20200923130138.GM795820@optiplex-lnx>
 <87blhwng5f.fsf@yhuang-dev.intel.com>
 <20200924020928.GC1023012@optiplex-lnx>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
Date: Thu, 24 Sep 2020 11:51:17 +0800
Message-ID: <877dsjessq.fsf@yhuang-dev.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Rafael Aquini writes:

> The bug here is quite simple: split_swap_cluster() misses checking for
> lock_cluster() returning NULL before committing to change cluster_info->flags.

I don't think so.  We shouldn't run into this situation in the first
place.  So the "fix" hides the real bug instead of fixing it.  It is
just like how we call VM_BUG_ON_PAGE(!PageLocked(head), head) in
split_huge_page_to_list() instead of silently returning if
!PageLocked(head).

> The fundamental problem has nothing to do with allocating, or not allocating
> a swap cluster, but it has to do with the fact that the THP deferred split scan
> can transiently race with swapcache insertion, and the fact that when you run
> your swap area on rotational storage cluster_info is _always_ NULL.
> split_swap_cluster() needs to check for lock_cluster() returning NULL because
> that's one possible case, and it clearly fails to do so.

If there's a race, we should fix the race.  But the code path for
swapcache insertion is,

add_to_swap()
  get_swap_page() /* Return if fails to allocate */
  add_to_swap_cache()
    SetPageSwapCache()

While the code path to split THP is,

split_huge_page_to_list()
  if PageSwapCache()
    split_swap_cluster()

Both code paths are protected by the page lock.  So there must be some
other reason that triggers the bug.

And again, for HDD, a THP shouldn't have PageSwapCache() set in the
first place.  If it does, the bug is that the flag is set, and we
should fix the code that sets it.
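
To make the argument above concrete, here is a minimal userspace sketch
of the invariant the two code paths are expected to maintain.  It is
not kernel code: every toy_* name is hypothetical, the page lock is
reduced to a boolean the caller must have set, and the assumption that
allocation fails for a THP when there is no cluster_info is a modelling
choice that follows the expectation stated above, not something taken
from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
struct toy_page {
	bool locked;     /* stands in for PageLocked() */
	bool swapcache;  /* stands in for PageSwapCache() */
	bool is_thp;     /* page is a transparent huge page */
};

struct toy_swap_dev {
	bool has_cluster_info; /* false models cluster_info == NULL (HDD) */
};

/* Models get_swap_page(); ASSUMED to fail for a THP without cluster_info. */
static bool toy_get_swap_page(const struct toy_swap_dev *dev,
			      const struct toy_page *page)
{
	if (page->is_thp && !dev->has_cluster_info)
		return false;
	return true;
}

/* Models add_to_swap(): the flag is set only after allocation succeeds. */
static bool toy_add_to_swap(const struct toy_swap_dev *dev,
			    struct toy_page *page)
{
	assert(page->locked);           /* caller holds the page lock */
	if (!toy_get_swap_page(dev, page))
		return false;           /* return if fails to allocate */
	page->swapcache = true;         /* SetPageSwapCache() */
	return true;
}

/*
 * Models the split path.  Returns true if split_swap_cluster() would be
 * reached.  The invariant is asserted rather than silently tolerated.
 */
static bool toy_split_reaches_swap_cluster(const struct toy_swap_dev *dev,
					   const struct toy_page *page)
{
	assert(page->locked);           /* same lock as the insertion path */
	if (!page->swapcache)
		return false;
	assert(dev->has_cluster_info);  /* swapcache THP implies clusters */
	return true;
}
```

In this model, a locked THP on a device without cluster_info never gets
the swapcache flag, so the split path never reaches the point where a
NULL cluster would be dereferenced.  If the second assert ever fired,
the thing to fix would be whatever set the flag, which is the point
made above.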

> Run a workload that causes multiple THP COW, and add a memory hogger to create
> memory pressure so you'll force the reclaimers to kick the registered
> shrinkers. The trigger is not heavy swapping, and that's probably why
> most swap test cases don't hit it. The window is tight, but you will get the
> NULL pointer dereference.

Do you have a script to reproduce the bug?

> Regardless of whether you find further bugs or not, this patch is needed to
> correct a blunt coding mistake.

As above, I don't agree with that.

Best Regards,
Huang, Ying