Date: Tue, 5 Feb 2019 02:14:34 -0500 (EST)
From: Jan Stancek
To: Lars Persson
Cc: linux-mm@kvack.org, lersek@redhat.com, alex williamson,
    aarcange@redhat.com, rientjes@google.com, kirill@shutemov.name,
    mgorman@techsingularity.net, mhocko@suse.com,
    linux-kernel@vger.kernel.org
Message-ID: <997509746.100933786.1549350874925.JavaMail.zimbra@redhat.com>
Subject: Re: [PATCH v2] mm: page_mapped: don't assume compound page is huge or THP

----- Original Message -----
> On Fri, Nov 30, 2018 at 1:07 PM Jan Stancek wrote:
> >
> > LTP proc01 testcase has been observed to rarely trigger crashes
> > on arm64:
> >     page_mapped+0x78/0xb4
> >     stable_page_flags+0x27c/0x338
> >     kpageflags_read+0xfc/0x164
> >     proc_reg_read+0x7c/0xb8
> >     __vfs_read+0x58/0x178
> >     vfs_read+0x90/0x14c
> >     SyS_read+0x60/0xc0
> >
> > Issue is that page_mapped() assumes that if compound page is not
> > huge, then it must be THP.
> > But if this is 'normal' compound page
> > (COMPOUND_PAGE_DTOR), then following loop can keep running
> > (for HPAGE_PMD_NR iterations) until it tries to read from memory
> > that isn't mapped and triggers a panic:
> >         for (i = 0; i < hpage_nr_pages(page); i++) {
> >                 if (atomic_read(&page[i]._mapcount) >= 0)
> >                         return true;
> >         }
> >
> > I could replicate this on x86 (v4.20-rc4-98-g60b548237fed) only
> > with a custom kernel module [1] which:
> > - allocates compound page (PAGEC) of order 1
> > - allocates 2 normal pages (COPY), which are initialized to 0xff
> >   (to satisfy _mapcount >= 0)
> > - 2 PAGEC page structs are copied to address of first COPY page
> > - second page of COPY is marked as not present
> > - call to page_mapped(COPY) now triggers fault on access to 2nd
> >   COPY page at offset 0x30 (_mapcount)
> >
> > [1]
> > https://github.com/jstancek/reproducers/blob/master/kernel/page_mapped_crash/repro.c
> >
> > Fix the loop to iterate for "1 << compound_order" pages.
> >
> > Debugged-by: Laszlo Ersek
> > Suggested-by: "Kirill A. Shutemov"
> > Signed-off-by: Jan Stancek
> > ---
> >  mm/util.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Changes in v2:
> > - change the loop instead so we check also mapcount of subpages
> >
> > diff --git a/mm/util.c b/mm/util.c
> > index 8bf08b5b5760..5c9c7359ee8a 100644
> > --- a/mm/util.c
> > +++ b/mm/util.c
> > @@ -478,7 +478,7 @@ bool page_mapped(struct page *page)
> >  		return true;
> >  	if (PageHuge(page))
> >  		return false;
> > -	for (i = 0; i < hpage_nr_pages(page); i++) {
> > +	for (i = 0; i < (1 << compound_order(page)); i++) {
> >  		if (atomic_read(&page[i]._mapcount) >= 0)
> >  			return true;
> >  	}
> > --
> > 1.8.3.1
>
> Hi all
>
> This patch landed in the 4.9-stable tree starting from 4.9.151 and it
> broke our MIPS1004kc system with CONFIG_HIGHMEM=y.

Hi,

are you using THP (CONFIG_TRANSPARENT_HUGEPAGE)?
The changed line should affect only THP and normal compound pages,
so a test with THP disabled might be interesting.

>
> The breakage consists of random processes dying with SIGILL or SIGSEGV
> when we stress test the system with high memory pressure and explicit
> memory compaction requested through /proc/sys/vm/compact_memory.
> Reverting this patch fixes the crashes.
>
> We can put some effort on debugging if there are no obvious
> explanations for this. Keep in mind that this is 32-bit system with
> HIGHMEM.

Nothing obvious that I can see. I've been trying to reproduce on
32-bit x86 Fedora with no luck so far.

Regards,
Jan
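
For anyone following the thread without the kernel source at hand, here is a rough user-space sketch of the loop-bound change in the quoted patch. It is not kernel code: struct mock_page, MOCK_HPAGE_PMD_NR (assumed to be 512, i.e. a 2 MiB huge page made of 4 KiB base pages), and the three helper functions are hypothetical stand-ins. It models only the subpage loop and leaves out the compound mapcount and PageHuge() checks that come before it in page_mapped().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 512 base pages per PMD-sized huge page (2 MiB / 4 KiB). */
#define MOCK_HPAGE_PMD_NR 512u

/* Hypothetical stand-in for struct page; only the field the loop reads. */
struct mock_page {
	int mapcount;	/* -1 means "not mapped", like _mapcount in the kernel */
};

/* Loop bound after the patch: one iteration per subpage of the compound page. */
static unsigned int fixed_bound(unsigned int order)
{
	assert(order < 31);	/* keep the shift well defined */
	return 1u << order;
}

/* Subpage scan with the patched bound: true if any subpage is mapped. */
static bool mock_page_mapped(const struct mock_page *head, unsigned int order)
{
	unsigned int nr = fixed_bound(order);

	for (unsigned int i = 0; i < nr; i++) {
		if (head[i].mapcount >= 0)
			return true;
	}
	return false;
}

/*
 * Number of entries past the end of the compound page that a fixed
 * MOCK_HPAGE_PMD_NR bound would read if no earlier entry returned true.
 */
static unsigned int old_bound_overrun(unsigned int order)
{
	unsigned int nr = fixed_bound(order);

	return MOCK_HPAGE_PMD_NR > nr ? MOCK_HPAGE_PMD_NR - nr : 0;
}
```

Under these assumptions, an order-1 compound page has 2 subpages, so a bound fixed at 512 would walk up to 510 entries past the end, while a PMD-sized page (order 9) has no overrun.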