Subject: Re: [mm 4.15-rc8] Random oopses under memory pressure.
To: Linus Torvalds, Tetsuo Handa
References: <201801160115.w0G1FOIG057203@www262.sakura.ne.jp>
 <201801170233.JDG21842.OFOJMQSHtOFFLV@I-love.SAKURA.ne.jp>
 <201801172008.CHH39543.FFtMHOOVSQJLFO@I-love.SAKURA.ne.jp>
Cc: "Kirill A. Shutemov", Andrew Morton, Johannes Weiner, Joonsoo Kim,
 Mel Gorman, Tony Luck, Vlastimil Babka, Michal Hocko, Ingo Molnar,
 Linux Kernel Mailing List, linux-mm, the arch/x86 maintainers
From: Dave Hansen
Message-ID: <4fe52147-b6a1-83a7-ee4b-104846ddb919@linux.intel.com>
Date: Wed, 17 Jan 2018 14:04:51 -0800

On 01/17/2018 01:51 PM, Linus Torvalds wrote:
> In fact, it seems to be such a fundamental bug that I suspect I'm
> entirely wrong, and full of shit. So it's an interesting and not
> _obviously_ incorrect theory, but I suspect I must be missing
> something.

I'll just note that a few of the pfns I decoded were smack in the
middle of the zone, not near either the high or low end of ZONE_NORMAL
where we would expect this cross-zone stuff to happen.

But I guess we could get similar wonkiness where 'struct page' is
screwed up in so many different ways if during buddy joining you do:

	list_del(&buddy->lru);

and 'buddy' is off in another zone for which you do not hold the
spinlock.

If we are somehow missing some locking, or double-allocating a page,
something like this would help:

 static inline void rmv_page_order(struct page *page)
 {
+	WARN_ON_ONCE(!PageBuddy(page));
 	__ClearPageBuddy(page);
 	set_page_private(page, 0);
 }
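
As a rough illustration of what that check would catch, here is a
userspace toy model, not kernel code. Everything prefixed with toy_ is
a hypothetical stand-in invented for this sketch: toy_page models only
the PageBuddy flag and the private field, and toy_warn_on_once() only
approximates WARN_ON_ONCE() for a single call site. It assumes the
failure looks like a page being pulled off the free list twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for 'struct page': only the two fields this sketch needs. */
struct toy_page {
	bool buddy;               /* models the PageBuddy flag */
	unsigned long order_priv; /* models page_private(), i.e. the buddy order */
};

/* Number of warnings actually emitted, so callers can observe the check. */
static int toy_warn_count;

/*
 * Approximates WARN_ON_ONCE() for one call site: report the first time
 * 'cond' is true, stay quiet afterwards. Returns 'cond' unchanged.
 */
static bool toy_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		toy_warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Mark a page as a free buddy of the given order. */
static void toy_set_page_order(struct toy_page *page, unsigned long order)
{
	assert(page != NULL);
	page->order_priv = order;
	page->buddy = true;
}

/* Same shape as the patched rmv_page_order(): warn, then clear the state. */
static void toy_rmv_page_order(struct toy_page *page)
{
	assert(page != NULL);
	toy_warn_on_once(!page->buddy, "rmv_page_order() on a non-buddy page");
	page->buddy = false;
	page->order_priv = 0;
}
```

Calling toy_set_page_order() and then toy_rmv_page_order() once is
silent. A second toy_rmv_page_order() on the same page emits the
warning, and later repeats stay quiet, which is the property that makes
a check like this cheap enough to leave in a hot path.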