Date: Fri, 7 Jun 2019 22:13:49 +0300
From: "Kirill A. Shutemov"
To: Andrea Arcangeli
Cc: Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Oleg Nesterov, Jann Horn, Hugh Dickins, Mike Rapoport, Mike Kravetz,
    Peter Xu, Jason Gunthorpe, Michal Hocko
Subject: Re: [PATCH 1/1] coredump: fix race condition between collapse_huge_page() and core dumping
Message-ID: <20190607191349.wvhhnnsd63vrz7xo@black.fi.intel.com>
References: <20190607161558.32104-1-aarcange@redhat.com>
In-Reply-To: <20190607161558.32104-1-aarcange@redhat.com>

On Fri, Jun 07, 2019 at 04:15:58PM +0000, Andrea Arcangeli wrote:
> When fixing the race conditions between the coredump and the mmap_sem
> holders outside the context of the process, we focused on
> mmget_not_zero()/get_task_mm() callers in commit
> 04f5866e41fb70690e28397487d8bd8eea7d712a, but those aren't the only
> cases where the mmap_sem can be taken outside of the context of the
> process as Michal Hocko noticed while backporting that commit to
> older -stable kernels.
>
> If mmgrab() is called in the context of the process, but then the
> mm_count reference is transferred outside the context of the process,
> that can also be a problem if the mmap_sem has to be taken for writing
> through that mm_count reference.
>
> khugepaged registration calls mmgrab() in the context of the process,
> but the mmap_sem for writing is taken later in the context of the
> khugepaged kernel thread.
>
> collapse_huge_page() after taking the mmap_sem for writing doesn't
> modify any vma, so it's not obvious that it could cause a problem to
> the coredump, but it happens to modify the pmd in a way that breaks an
> invariant that pmd_trans_huge_lock() relies upon.
> collapse_huge_page() needs the mmap_sem for writing just to block
> concurrent page faults that call pmd_trans_huge_lock().
>
> Specifically the invariant that "!pmd_trans_huge()" cannot become
> a "pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
>
> The coredump will call __get_user_pages() without mmap_sem for
> reading, which eventually can invoke a lockless page fault which will
> need a functional pmd_trans_huge_lock().
>
> So collapse_huge_page() needs to use mmget_still_valid() to check it's
> not running concurrently with the coredump... as long as the coredump
> can invoke page faults without holding the mmap_sem for reading.
>
> This has "Fixes: khugepaged" to facilitate backporting, but in my view
> it's more a bug in the coredump code that will eventually have to be
> rewritten to stop invoking page faults without the mmap_sem for
> reading. So the long term plan is still to drop all
> mmget_still_valid().
>
> Cc:
> Fixes: ba76149f47d8 ("thp: khugepaged")
> Reported-by: Michal Hocko
> Acked-by: Michal Hocko
> Signed-off-by: Andrea Arcangeli

Acked-by: Kirill A. Shutemov

--
Kirill A. Shutemov
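
For readers following the thread, the check described in the quoted
commit message can be sketched roughly as below. This is a userspace
model only, not the kernel patch: struct mm_model, struct
core_state_model, mm_still_valid_model() and collapse_model() are
hypothetical names made up for illustration, and the boolean fields
merely stand in for the mmap_sem write lock and the pmd state. The
point it shows is the ordering: take the lock for writing, re-check
that no core dump has started, and only then change the pmd.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-mm core dump state. */
struct core_state_model {
    int dumper_count;
};

/* Hypothetical stand-in for struct mm_struct; fields are illustrative. */
struct mm_model {
    struct core_state_model *core_state; /* non-NULL once a dump has started */
    bool write_locked;                   /* models mmap_sem held for writing */
    bool pmd_is_huge;                    /* models pmd_trans_huge() on one pmd */
};

/* Models the validity check: the mm is usable only if no dump is running. */
bool mm_still_valid_model(const struct mm_model *mm)
{
    return mm->core_state == NULL;
}

/*
 * Models the collapse path: lock for writing, re-check validity under
 * the lock, and only then flip the pmd to huge. Returns true if the
 * collapse happened, false if it was skipped because of a core dump.
 */
bool collapse_model(struct mm_model *mm)
{
    bool collapsed = false;

    mm->write_locked = true;            /* assumed: down_write() equivalent */
    if (mm_still_valid_model(mm)) {
        mm->pmd_is_huge = true;         /* safe: no lockless fault can race */
        collapsed = true;
    }
    mm->write_locked = false;           /* assumed: up_write() equivalent */

    return collapsed;
}
```

In this model a dumping mm is left untouched, so a fault that runs
without the lock never sees a pmd change from non-huge to huge, which
is the invariant the commit message says pmd_trans_huge_lock() relies
upon.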